From: Cyrill Gorcunov <gorcunov@gmail.com>
To: Andy Lutomirski <luto@amacapital.net>
Cc: X86 ML <x86@kernel.org>,
	Andrew Morton <akpm@linux-foundation.org>,
	Sasha Levin <sasha.levin@oracle.com>,
	"linux-mm@kvack.org" <linux-mm@kvack.org>,
	Dave Jones <davej@redhat.com>,
	LKML <linux-kernel@vger.kernel.org>,
	Pavel Emelyanov <xemul@parallels.com>,
	"H. Peter Anvin" <hpa@zytor.com>
Subject: Re: [PATCH 3/4] x86,mm: Improve _install_special_mapping and fix x86 vdso naming
Date: Tue, 20 May 2014 21:47:59 +0400	[thread overview]
Message-ID: <20140520174759.GK2185@moon> (raw)
In-Reply-To: <CALCETrWSgjc+iymPrvC9xiz1z4PqQS9e9F5mRLNnuabWTjQGQQ@mail.gmail.com>

On Tue, May 20, 2014 at 10:24:49AM -0700, Andy Lutomirski wrote:
> On Tue, May 20, 2014 at 10:21 AM, Cyrill Gorcunov <gorcunov@gmail.com> wrote:
> > On Mon, May 19, 2014 at 03:58:33PM -0700, Andy Lutomirski wrote:
> >> Using arch_vma_name to give special mappings a name is awkward.  x86
> >> currently implements it by comparing the start address of the vma to
> >> the expected address of the vdso.  This requires tracking the start
> >> address of special mappings and is probably buggy if a special vma
> >> is split or moved.
> >>
> >> Improve _install_special_mapping to just name the vma directly.  Use
> >> it to give the x86 vvar area a name, which should make CRIU's life
> >> easier.
> >>
> >> As a side effect, the vvar area will show up in core dumps.  This
> >> could be considered weird and is fixable.  Thoughts?
> >>
> >> Cc: Cyrill Gorcunov <gorcunov@openvz.org>
> >> Cc: Pavel Emelyanov <xemul@parallels.com>
> >> Signed-off-by: Andy Lutomirski <luto@amacapital.net>
> >
> > Hi Andy, thanks a lot for this! I must confess I don't yet know how
> > we would deal with compat tasks, but this is a 'must have' mark which
> > allows us to detect the vvar area!
> 
> Out of curiosity, how does CRIU currently handle checkpointing a
> restored task?  In current kernels, the "[vdso]" name in maps goes
> away after mremapping the vdso.

  We use not only the [vdso] mark to detect the vdso area but also the
page frame number of the live vdso. If the mark is not present in the
procfs output, we examine the executable areas and check whether
pfn == vdso_pfn. It's a slow path, because there might be a bunch of
executable areas and touching every one of them is not fast, but we
simply have no choice.
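
  To make the two detection paths concrete, here is a minimal sketch in C.
It is not CRIU's code: the helper names (maps_line_is_vdso, pme_to_pfn,
area_is_vdso_by_pfn) are made up for illustration, and the caller is
assumed to have already read one line of /proc/PID/maps and the 64-bit
/proc/PID/pagemap entry for the first page of each executable area. The
bit layout (bit 63 = page present, bits 0-54 = PFN) follows the kernel's
pagemap documentation.

```c
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

/* Layout of one 64-bit /proc/PID/pagemap entry. */
#define PME_PRESENT  (1ULL << 63)        /* page is present in RAM */
#define PME_PFN_MASK ((1ULL << 55) - 1)  /* bits 0-54 hold the PFN */

/*
 * Fast path: does this /proc/PID/maps line carry the "[vdso]" name?
 * A plain substring match is a simplification; a stricter check would
 * look only at the pathname column.
 */
static bool maps_line_is_vdso(const char *line)
{
	return strstr(line, "[vdso]") != NULL;
}

/* Extract the PFN from a pagemap entry; returns 0 if the page is not present. */
static uint64_t pme_to_pfn(uint64_t entry)
{
	return (entry & PME_PRESENT) ? (entry & PME_PFN_MASK) : 0;
}

/*
 * Slow path: compare the PFN behind the first page of an executable
 * area with the PFN of the known live vdso (vdso_pfn is an assumption:
 * the caller obtained it beforehand, e.g. from its own vdso).
 */
static bool area_is_vdso_by_pfn(uint64_t entry, uint64_t vdso_pfn)
{
	uint64_t pfn = pme_to_pfn(entry);

	return pfn != 0 && pfn == vdso_pfn;
}
```

  The slow path costs one pagemap read per executable area, which is why
the name in maps is so much cheaper when it is available.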

  The situation gets worse when a task was dumped on one kernel and
then restored on another kernel where the vdso content differs from
the one saved in the image. In such a case, as I mentioned, we need
that named vdso proxy which redirects calls to the vdso of the system
where the task is being restored. And when such a "restored" task gets
checkpointed a second time, we don't dump the new live vdso but save
only the old vdso proxy on disk (detecting it is a different story; in
short, we inject a unique mark into the ELF header).
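
  As a rough illustration of the "unique mark in the ELF header" idea,
the sketch below tags and later recognizes a proxy by writing a
signature into the padding bytes of e_ident. Everything specific here
is an assumption for the example: the mark value, its placement in the
EI_PAD bytes, and the helper names are hypothetical and are not taken
from CRIU.

```c
#include <elf.h>
#include <stdbool.h>
#include <string.h>

/* Hypothetical 7-byte tag filling e_ident[EI_PAD..EI_NIDENT-1]; not CRIU's value. */
static const unsigned char proxy_mark[EI_NIDENT - EI_PAD] = {
	'V', 'P', 'R', 'O', 'X', 'Y', '1'
};

/* Check the standard ELF magic at the start of the header. */
static bool ehdr_is_elf(const Elf64_Ehdr *eh)
{
	return memcmp(eh->e_ident, ELFMAG, SELFMAG) == 0;
}

/* Stamp the header so that a later checkpoint can tell it is a proxy. */
static void ehdr_mark_proxy(Elf64_Ehdr *eh)
{
	memcpy(&eh->e_ident[EI_PAD], proxy_mark, sizeof(proxy_mark));
}

/* A header counts as a proxy only if it is valid ELF and carries the tag. */
static bool ehdr_is_proxy(const Elf64_Ehdr *eh)
{
	return ehdr_is_elf(eh) &&
	       memcmp(&eh->e_ident[EI_PAD], proxy_mark, sizeof(proxy_mark)) == 0;
}
```

  On a second checkpoint, a check like ehdr_is_proxy() on the first
page of a candidate area would be enough to decide that the area is the
old proxy rather than the live vdso.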

> 
> I suspect that you'll need kernel changes for compat tasks, since I
> think that mremapping the vdso on any reasonably modern hardware in a
> 32-bit task will cause sigreturn to blow up.  This could be fixed by
> making mremap magical, although adding a new prctl or arch_prctl to
> reliably move the vdso might be a better bet.

Well, as far as I understand, compat code uses absolute addressing for
vvar data, and if the vvar data position doesn't change we're safe.
At the same time, because vvar addresses are not ABI, I fear that one
day we will indeed hit problems and the only solution will be to use
the kernel's help. But again, Andy, I haven't thought much about
implementing compat mode in CRIU yet, so I might be missing some
details.
