All of lore.kernel.org
 help / color / mirror / Atom feed
* vvar, gup && coredump
       [not found]       ` <20150311200052.GA22654@redhat.com>
@ 2015-03-12 14:34         ` Oleg Nesterov
  2015-03-12 16:29           ` Andy Lutomirski
  0 siblings, 1 reply; 31+ messages in thread
From: Oleg Nesterov @ 2015-03-12 14:34 UTC (permalink / raw)
  To: Jan Kratochvil, Andy Lutomirski
  Cc: Sergio Durigan Junior, GDB Patches, Pedro Alves, linux-kernel

Add cc's, change subject.

On 03/11, Oleg Nesterov wrote:
>
> On 03/05, Jan Kratochvil wrote:
> >
> > On Thu, 05 Mar 2015 21:52:56 +0100, Sergio Durigan Junior wrote:
> > > On Thursday, March 05 2015, Jan Kratochvil wrote:
> > > > On Thu, 05 Mar 2015 04:48:09 +0100, Sergio Durigan Junior wrote:
> > > > Currently it also tries to dump [vvar] (by default rules) but that is
> > > > unreadable for some reason, causing:
> > > > warning: Memory read failed for corefile section, 8192 bytes at 0x7ffff6ceb000.
> > > >                                                                 ^^^^^^^^^^^^^^
>
> > It would be good to get a reply from a kernel aware person what does it mean
> > before such patch gets accepted.  It can be also just a Linux kernel bug.
>
> _So far_ this doesn't look like a kernel bug to me.
>
> But! I need to recheck. In fact, it seems to me that I should discuss
> this on lkml. I have some concerns, but most probably this is only my
> misunderstanding, I need to read this (new to me) code more carefully.

Hi Andy, we need your help ;)

So, the problem is that gdb can't access the "vvar" mapping which looks
like the "normal" vma from user-space pov.

Technically this is clear. vvar_mapping->pages is the "dummy" no_pages[]
array, get_user_pages() can't succeed. In fact even follow_page() can't
work because of VM_PFNMAP/_PAGE_SPECIAL set by remap_pfn_range().

What is not clear: do we really want gup() to fail? Or it is not trivial
to turn __vvar_page into the "normal" page? (to simplify the discussion,
lets ignore hpet mapping for now).

Because this doesn't look consistent. gdb tries to "coredump" the live
process like the kernel does, but fails to dump the "r--p ... [vvar]"
region.


OK, gdb can look at VM_DONTDUMP bit in "VmFlags:" field in /proc/pid/smaps
and skip this vma. But, why (afaics) the kernel dumps this vma then? Lets
look at vma_dump_size(),

	/* always dump the vdso and vsyscall sections */
	if (always_dump_vma(vma))
		goto whole;

	if (vma->vm_flags & VM_DONTDUMP)
		return 0;

so the kernel ignores VM_DONTDUMP in this case, always_dump_vma() returns
true because of special_mapping_name(). Perhaps we should check VM_DONTDUMP
before always_dump_vma() ?


Or. We can teach gdb to read and dump its own "vvar" mapping to mimic the
kernel behaviour, this is the same read-only memory. But this hack doesn't
look nice, gdb should not know "too much" about the kernel internals.

Oleg.


^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: vvar, gup && coredump
  2015-03-12 14:34         ` vvar, gup && coredump Oleg Nesterov
@ 2015-03-12 16:29           ` Andy Lutomirski
  2015-03-12 16:54             ` Oleg Nesterov
  0 siblings, 1 reply; 31+ messages in thread
From: Andy Lutomirski @ 2015-03-12 16:29 UTC (permalink / raw)
  To: Oleg Nesterov
  Cc: Jan Kratochvil, Sergio Durigan Junior, GDB Patches, Pedro Alves,
	linux-kernel

On Thu, Mar 12, 2015 at 7:34 AM, Oleg Nesterov <oleg@redhat.com> wrote:
> Add cc's, change subject.
>
> On 03/11, Oleg Nesterov wrote:
>>
>> On 03/05, Jan Kratochvil wrote:
>> >
>> > On Thu, 05 Mar 2015 21:52:56 +0100, Sergio Durigan Junior wrote:
>> > > On Thursday, March 05 2015, Jan Kratochvil wrote:
>> > > > On Thu, 05 Mar 2015 04:48:09 +0100, Sergio Durigan Junior wrote:
>> > > > Currently it also tries to dump [vvar] (by default rules) but that is
>> > > > unreadable for some reason, causing:
>> > > > warning: Memory read failed for corefile section, 8192 bytes at 0x7ffff6ceb000.
>> > > >                                                                 ^^^^^^^^^^^^^^
>>
>> > It would be good to get a reply from a kernel aware person what does it mean
>> > before such patch gets accepted.  It can be also just a Linux kernel bug.
>>
>> _So far_ this doesn't look like a kernel bug to me.
>>
>> But! I need to recheck. In fact, it seems to me that I should discuss
>> this on lkml. I have some concerns, but most probably this is only my
>> misunderstanding, I need to read this (new to me) code more carefully.
>
> Hi Andy, we need your help ;)
>
> So, the problem is that gdb can't access the "vvar" mapping which looks
> like the "normal" vma from user-space pov.
>
> Technically this is clear. vvar_mapping->pages is the "dummy" no_pages[]
> array, get_user_pages() can't succeed. In fact even follow_page() can't
> work because of VM_PFNMAP/_PAGE_SPECIAL set by remap_pfn_range().
>
> What is not clear: do we really want gup() to fail? Or it is not trivial
> to turn __vvar_page into the "normal" page? (to simplify the discussion,
> lets ignore hpet mapping for now).

We could presumably fiddle with the vma to allow get_user_pages to
work on at least the first vvar page.  There are some decently large
caveats, though:

 - We don't want to COW it.  If someone pokes at that page with
ptrace, for example, and it gets COWed, everything will stop working
because the offending process will no longer see updates.  That way
lies infinite loops.

 - The implementation could be odd.  The vma is either VM_MIXEDMAP or
VM_PFNMAP, and I don't see any practical way to change that.

 - The HPET and perhaps pvclock stuff.  The HPET probably doesn't have
a struct page at all, so you can't possibly get_user_pages it.

>
> Because this doesn't look consistent. gdb tries to "coredump" the live
> process like the kernel does, but fails to dump the "r--p ... [vvar]"
> region.
>
>
> OK, gdb can look at VM_DONTDUMP bit in "VmFlags:" field in /proc/pid/smaps
> and skip this vma. But, why (afaics) the kernel dumps this vma then? Lets
> look at vma_dump_size(),
>
>         /* always dump the vdso and vsyscall sections */
>         if (always_dump_vma(vma))
>                 goto whole;
>
>         if (vma->vm_flags & VM_DONTDUMP)
>                 return 0;
>
> so the kernel ignores VM_DONTDUMP in this case, always_dump_vma() returns
> true because of special_mapping_name(). Perhaps we should check VM_DONTDUMP
> before always_dump_vma() ?
>

That sounds reasonable to me.  I'll write the patch later today.  gdb
will still need changes, though, right?

--Andy

>
> Or. We can teach gdb to read and dump its own "vvar" mapping to mimic the
> kernel behaviour, this is the same read-only memory. But this hack doesn't
> look nice, gdb should not know "too much" about the kernel internals.
>
> Oleg.
>



-- 
Andy Lutomirski
AMA Capital Management, LLC

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: vvar, gup && coredump
  2015-03-12 16:29           ` Andy Lutomirski
@ 2015-03-12 16:54             ` Oleg Nesterov
  2015-03-12 17:17               ` Andy Lutomirski
  2015-03-12 17:46               ` Oleg Nesterov
  0 siblings, 2 replies; 31+ messages in thread
From: Oleg Nesterov @ 2015-03-12 16:54 UTC (permalink / raw)
  To: Andy Lutomirski
  Cc: Jan Kratochvil, Sergio Durigan Junior, GDB Patches, Pedro Alves,
	linux-kernel

On 03/12, Andy Lutomirski wrote:
>
> On Thu, Mar 12, 2015 at 7:34 AM, Oleg Nesterov <oleg@redhat.com> wrote:
> >
> > What is not clear: do we really want gup() to fail? Or it is not trivial
> > to turn __vvar_page into the "normal" page? (to simplify the discussion,
> > lets ignore hpet mapping for now).
>
> We could presumably fiddle with the vma to allow get_user_pages to
> work on at least the first vvar page.  There are some decently large
> caveats, though:
>
>  - We don't want to COW it.  If someone pokes at that page with
> ptrace, for example, and it gets COWed, everything will stop working
> because the offending process will no longer see updates.  That way
> lies infinite loops.

Of course, but this looks simple... is_cow_mapping() == F so FOLL_FORCE
won't work anyway?

>  - The implementation could be odd.  The vma is either VM_MIXEDMAP or
> VM_PFNMAP, and I don't see any practical way to change that.
>
>  - The HPET and perhaps pvclock stuff.  The HPET probably doesn't have
> a struct page at all, so you can't possibly get_user_pages it.

Yes, this is true. OK, lets not dump it. I'll probably send a patch which
changes vma_dump_size() to check VM_DONTDUMP first...

But this leads to another question: why do we want to expose this
"vvar" vma at all?

For the moment, forget about compat 32-bit applications running under
64-bit kernel.

Can't we simply add FIX_VVAR_PAGE into fixed_addresses{}, map it into
init_mm via set_fixmap(FIX_VVAR_PAGE, __PAGE_USER) and change __vdso.*
functions to use fix_to_virt() address?

I don't really understand the low-level details, I'd like to understand
if this can work or not. And if it can work, why this is undesirable.

As for 32-bit applications. Yes, this can't work because 32-bit simply
can't access this "high" memory. But you know, it would be very nice to
have the fixmap-like "global" area in init_mm which is also visible to
compat applications. If we had it, uprobes could work without xol vma's.

Oleg.


^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: vvar, gup && coredump
  2015-03-12 16:54             ` Oleg Nesterov
@ 2015-03-12 17:17               ` Andy Lutomirski
  2015-03-12 17:39                 ` Oleg Nesterov
  2015-03-12 17:46               ` Oleg Nesterov
  1 sibling, 1 reply; 31+ messages in thread
From: Andy Lutomirski @ 2015-03-12 17:17 UTC (permalink / raw)
  To: Oleg Nesterov
  Cc: Jan Kratochvil, Sergio Durigan Junior, GDB Patches, Pedro Alves,
	linux-kernel

On Thu, Mar 12, 2015 at 9:54 AM, Oleg Nesterov <oleg@redhat.com> wrote:
> On 03/12, Andy Lutomirski wrote:
>>
>> On Thu, Mar 12, 2015 at 7:34 AM, Oleg Nesterov <oleg@redhat.com> wrote:
>> >
>> > What is not clear: do we really want gup() to fail? Or it is not trivial
>> > to turn __vvar_page into the "normal" page? (to simplify the discussion,
>> > lets ignore hpet mapping for now).
>>
>> We could presumably fiddle with the vma to allow get_user_pages to
>> work on at least the first vvar page.  There are some decently large
>> caveats, though:
>>
>>  - We don't want to COW it.  If someone pokes at that page with
>> ptrace, for example, and it gets COWed, everything will stop working
>> because the offending process will no longer see updates.  That way
>> lies infinite loops.
>
> Of course, but this looks simple... is_cow_mapping() == F so FOLL_FORCE
> won't work anyway?
>
>>  - The implementation could be odd.  The vma is either VM_MIXEDMAP or
>> VM_PFNMAP, and I don't see any practical way to change that.
>>
>>  - The HPET and perhaps pvclock stuff.  The HPET probably doesn't have
>> a struct page at all, so you can't possibly get_user_pages it.
>
> Yes, this is true. OK, lets not dump it. I'll probably send a patch which
> changes vma_dump_size() to check VM_DONTDUMP first...
>
> But this leads to another question: why do we want to expose this
> "vvar" vma at all?
>
> For the moment, forget about compat 32-bit applications running under
> 64-bit kernel.
>
> Can't we simply add FIX_VVAR_PAGE into fixed_addresses{}, map it into
> init_mm via set_fixmap(FIX_VVAR_PAGE, __PAGE_USER) and change __vdso.*
> functions to use fix_to_virt() address?
>
> I don't really understand the low-level details, I'd like to understand
> if this can work or not. And if it can work, why this is undesirable.
>
> As for 32-bit applications. Yes, this can't work because 32-bit simply
> can't access this "high" memory. But you know, it would be very nice to
> have the fixmap-like "global" area in init_mm which is also visible to
> compat applications. If we had it, uprobes could work without xol vma's.
>

It could work for 32-bit native, but not for 32-bit compat.  Also, I
have grand plans to add per-task vvar overrides for seccomp and such.
And RIP-relative addressing is a bit nicer than absolute :)

It used to work that way, but we changed it in 3.15 IIRC.

On a related note, I'm hoping to rework the mm part pretty heavily:

http://lkml.kernel.org/r/cover.1414629045.git.luto@amacapital.net

--Andy

> Oleg.
>



-- 
Andy Lutomirski
AMA Capital Management, LLC

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: vvar, gup && coredump
  2015-03-12 17:17               ` Andy Lutomirski
@ 2015-03-12 17:39                 ` Oleg Nesterov
  2015-03-12 17:45                   ` Sergio Durigan Junior
  2015-03-12 17:55                   ` Andy Lutomirski
  0 siblings, 2 replies; 31+ messages in thread
From: Oleg Nesterov @ 2015-03-12 17:39 UTC (permalink / raw)
  To: Andy Lutomirski
  Cc: Jan Kratochvil, Sergio Durigan Junior, GDB Patches, Pedro Alves,
	linux-kernel

On 03/12, Andy Lutomirski wrote:
>
> On Thu, Mar 12, 2015 at 9:54 AM, Oleg Nesterov <oleg@redhat.com> wrote:
> > On 03/12, Andy Lutomirski wrote:
> >>
> > As for 32-bit applications. Yes, this can't work because 32-bit simply
> > can't access this "high" memory. But you know, it would be very nice to
> > have the fixmap-like "global" area in init_mm which is also visible to
> > compat applications. If we had it, uprobes could work without xol vma's.
> >
> It could work for 32-bit native, but not for 32-bit compat.

Yes, yes, I meant 32-bit compat apps. Once again, it would be nice if we
had the "low" fixmaps in init_mm. But unlikely this is possible...

> On a related note, I'm hoping to rework the mm part pretty heavily:
>
> http://lkml.kernel.org/r/cover.1414629045.git.luto@amacapital.net

OK... not that I really understand this email.

Well. Speaking of vdso. I understand that unlikely we can do this, but
for uprobes it would be nice to have a anon-inode file behind this mapping,
so that vma_interval_tree_foreach() could work, etc. OK, this is completely
off-topic, please forget.



And I noticed that I didn't read your previous email carefully enough...

> That sounds reasonable to me.  I'll write the patch later today.

Sure, please send a patch if you want to do this.

> gdb will still need changes, though, right?

This is up to gdb developers. To me, it should simply skip this
VM_DONTDUMP vma.

Oleg.


^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: vvar, gup && coredump
  2015-03-12 17:39                 ` Oleg Nesterov
@ 2015-03-12 17:45                   ` Sergio Durigan Junior
  2015-03-12 18:02                     ` Oleg Nesterov
  2015-03-12 17:55                   ` Andy Lutomirski
  1 sibling, 1 reply; 31+ messages in thread
From: Sergio Durigan Junior @ 2015-03-12 17:45 UTC (permalink / raw)
  To: Oleg Nesterov
  Cc: Andy Lutomirski, Jan Kratochvil, GDB Patches, Pedro Alves, linux-kernel

On Thursday, March 12 2015, Oleg Nesterov wrote:

>> gdb will still need changes, though, right?
>
> This is up to gdb developers. To me, it should simply skip this
> VM_DONTDUMP vma.

If I understood this discussion correctly (and thanks Andy and Oleg for,
*ahem*, dumping all this useful information for us!), GDB will not need
modifications in the Linux kernel in this area.  In fact, my patch
already implements the "ignore VM_DONTDUMP mappings" part, so we're
pretty much covered.

Thanks,

-- 
Sergio
GPG key ID: 0x65FC5E36
Please send encrypted e-mail if possible
http://sergiodj.net/

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: vvar, gup && coredump
  2015-03-12 16:54             ` Oleg Nesterov
  2015-03-12 17:17               ` Andy Lutomirski
@ 2015-03-12 17:46               ` Oleg Nesterov
  2015-03-12 17:54                 ` Andy Lutomirski
                                   ` (2 more replies)
  1 sibling, 3 replies; 31+ messages in thread
From: Oleg Nesterov @ 2015-03-12 17:46 UTC (permalink / raw)
  To: Andy Lutomirski
  Cc: Jan Kratochvil, Sergio Durigan Junior, GDB Patches, Pedro Alves,
	linux-kernel

On 03/12, Oleg Nesterov wrote:
>
> Yes, this is true. OK, lets not dump it.

OTOH. We can probably add ->access() into special_mapping_vmops, this
way __access_remote_vm() could work even if gup() fails ?

Jan, Sergio. How much do we want do dump this area ? The change above
should be justified.

Oleg.


^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: vvar, gup && coredump
  2015-03-12 17:46               ` Oleg Nesterov
@ 2015-03-12 17:54                 ` Andy Lutomirski
  2015-03-12 18:05                   ` Oleg Nesterov
  2015-03-12 18:19                 ` Pedro Alves
  2015-03-16 19:01                 ` install_special_mapping && vm_pgoff (Was: vvar, gup && coredump) Oleg Nesterov
  2 siblings, 1 reply; 31+ messages in thread
From: Andy Lutomirski @ 2015-03-12 17:54 UTC (permalink / raw)
  To: Oleg Nesterov
  Cc: Jan Kratochvil, Sergio Durigan Junior, GDB Patches, Pedro Alves,
	linux-kernel

On Thu, Mar 12, 2015 at 10:46 AM, Oleg Nesterov <oleg@redhat.com> wrote:
> On 03/12, Oleg Nesterov wrote:
>>
>> Yes, this is true. OK, lets not dump it.
>
> OTOH. We can probably add ->access() into special_mapping_vmops, this
> way __access_remote_vm() could work even if gup() fails ?

Let's wait until my special_mapping vmops rework lands to do that.
I'll dust it off and resubmit it.

>
> Jan, Sergio. How much do we want do dump this area ? The change above
> should be justified.
>
> Oleg.
>



-- 
Andy Lutomirski
AMA Capital Management, LLC

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: vvar, gup && coredump
  2015-03-12 17:39                 ` Oleg Nesterov
  2015-03-12 17:45                   ` Sergio Durigan Junior
@ 2015-03-12 17:55                   ` Andy Lutomirski
  2015-03-12 18:08                     ` Oleg Nesterov
  1 sibling, 1 reply; 31+ messages in thread
From: Andy Lutomirski @ 2015-03-12 17:55 UTC (permalink / raw)
  To: Oleg Nesterov
  Cc: Jan Kratochvil, Sergio Durigan Junior, GDB Patches, Pedro Alves,
	linux-kernel

On Thu, Mar 12, 2015 at 10:39 AM, Oleg Nesterov <oleg@redhat.com> wrote:
> On 03/12, Andy Lutomirski wrote:
>>
>> On Thu, Mar 12, 2015 at 9:54 AM, Oleg Nesterov <oleg@redhat.com> wrote:
>> > On 03/12, Andy Lutomirski wrote:
>> >>
>> > As for 32-bit applications. Yes, this can't work because 32-bit simply
>> > can't access this "high" memory. But you know, it would be very nice to
>> > have the fixmap-like "global" area in init_mm which is also visible to
>> > compat applications. If we had it, uprobes could work without xol vma's.
>> >
>> It could work for 32-bit native, but not for 32-bit compat.
>
> Yes, yes, I meant 32-bit compat apps. Once again, it would be nice if we
> had the "low" fixmaps in init_mm. But unlikely this is possible...
>
>> On a related note, I'm hoping to rework the mm part pretty heavily:
>>
>> http://lkml.kernel.org/r/cover.1414629045.git.luto@amacapital.net
>
> OK... not that I really understand this email.
>
> Well. Speaking of vdso. I understand that unlikely we can do this, but
> for uprobes it would be nice to have a anon-inode file behind this mapping,
> so that vma_interval_tree_foreach() could work, etc. OK, this is completely
> off-topic, please forget.

Couldn't you do that directly in the uprobes code?  That is, create an
anon_inode file and just map it the old-fashioned way?

--Andy

>
>
>
> And I noticed that I didn't read your previous email carefully enough...
>
>> That sounds reasonable to me.  I'll write the patch later today.
>
> Sure, please send a patch if you want to do this.
>
>> gdb will still need changes, though, right?
>
> This is up to gdb developers. To me, it should simply skip this
> VM_DONTDUMP vma.
>
> Oleg.
>



-- 
Andy Lutomirski
AMA Capital Management, LLC

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: vvar, gup && coredump
  2015-03-12 17:45                   ` Sergio Durigan Junior
@ 2015-03-12 18:02                     ` Oleg Nesterov
  2015-03-13  4:50                       ` Sergio Durigan Junior
  0 siblings, 1 reply; 31+ messages in thread
From: Oleg Nesterov @ 2015-03-12 18:02 UTC (permalink / raw)
  To: Sergio Durigan Junior
  Cc: Andy Lutomirski, Jan Kratochvil, GDB Patches, Pedro Alves, linux-kernel

On 03/12, Sergio Durigan Junior wrote:
>
> On Thursday, March 12 2015, Oleg Nesterov wrote:
>
> >> gdb will still need changes, though, right?
> >
> > This is up to gdb developers. To me, it should simply skip this
> > VM_DONTDUMP vma.
>
> If I understood this discussion correctly (and thanks Andy and Oleg for,
> *ahem*, dumping all this useful information for us!), GDB will not need
> modifications in the Linux kernel in this area.  In fact, my patch
> already implements the "ignore VM_DONTDUMP mappings" part, so we're
> pretty much covered.

OK, thanks.

And it seems that we all agree that the kernel should not dump this vma
too. Could you confirm that this is fine from gdb pov just in case?


However. Even if we do not want it in the coredump, this can confuse gdb
users which might want to read this memory during debugging. So perhaps
we still can add ->access() to "fix" PTRACE_PEEK/access_remote_vm later.

But I see another email from Andy, so lets forget about this for now.

Oleg.


^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: vvar, gup && coredump
  2015-03-12 17:54                 ` Andy Lutomirski
@ 2015-03-12 18:05                   ` Oleg Nesterov
  2015-03-12 18:23                     ` Sergio Durigan Junior
  0 siblings, 1 reply; 31+ messages in thread
From: Oleg Nesterov @ 2015-03-12 18:05 UTC (permalink / raw)
  To: Andy Lutomirski
  Cc: Jan Kratochvil, Sergio Durigan Junior, GDB Patches, Pedro Alves,
	linux-kernel

On 03/12, Andy Lutomirski wrote:
>
> On Thu, Mar 12, 2015 at 10:46 AM, Oleg Nesterov <oleg@redhat.com> wrote:
> > On 03/12, Oleg Nesterov wrote:
> >>
> >> Yes, this is true. OK, lets not dump it.
> >
> > OTOH. We can probably add ->access() into special_mapping_vmops, this
> > way __access_remote_vm() could work even if gup() fails ?
>
> Let's wait until my special_mapping vmops rework lands to do that.
> I'll dust it off and resubmit it.

OK. Please CC me. Not that I think I can help, just I want to understand
what you are going to do.

Although currently I do not even read most of emails I get ;)

Oleg.


^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: vvar, gup && coredump
  2015-03-12 17:55                   ` Andy Lutomirski
@ 2015-03-12 18:08                     ` Oleg Nesterov
  2015-03-12 18:33                       ` Andy Lutomirski
  0 siblings, 1 reply; 31+ messages in thread
From: Oleg Nesterov @ 2015-03-12 18:08 UTC (permalink / raw)
  To: Andy Lutomirski
  Cc: Jan Kratochvil, Sergio Durigan Junior, GDB Patches, Pedro Alves,
	linux-kernel

On 03/12, Andy Lutomirski wrote:
>
> On Thu, Mar 12, 2015 at 10:39 AM, Oleg Nesterov <oleg@redhat.com> wrote:
> > Well. Speaking of vdso. I understand that unlikely we can do this, but
> > for uprobes it would be nice to have a anon-inode file behind this mapping,
> > so that vma_interval_tree_foreach() could work, etc. OK, this is completely
> > off-topic, please forget.
>
> Couldn't you do that directly in the uprobes code?  That is, create an
> anon_inode file and just map it the old-fashioned way?

This won't help. Uprobes wants this file mmaped by all applications, so
that build_map_info() can find mm's, vma's, etc to install the system-wide
breakpoint. But again, this is off-topic and unlikely possible.

Oleg.


^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: vvar, gup && coredump
  2015-03-12 17:46               ` Oleg Nesterov
  2015-03-12 17:54                 ` Andy Lutomirski
@ 2015-03-12 18:19                 ` Pedro Alves
  2015-03-12 18:25                   ` Andy Lutomirski
  2015-03-16 19:01                 ` install_special_mapping && vm_pgoff (Was: vvar, gup && coredump) Oleg Nesterov
  2 siblings, 1 reply; 31+ messages in thread
From: Pedro Alves @ 2015-03-12 18:19 UTC (permalink / raw)
  To: Oleg Nesterov, Andy Lutomirski
  Cc: Jan Kratochvil, Sergio Durigan Junior, GDB Patches, linux-kernel

On 03/12/2015 05:46 PM, Oleg Nesterov wrote:
> On 03/12, Oleg Nesterov wrote:
>>
>> Yes, this is true. OK, lets not dump it.
> 
> OTOH. We can probably add ->access() into special_mapping_vmops, this
> way __access_remote_vm() could work even if gup() fails ?
> 
> Jan, Sergio. How much do we want do dump this area ? The change above
> should be justified.

Memory mappings that weren't touched since they were initially mapped can
be retrieved from the program binary and the shared libraries, even if
the core dump is moved to another machine.  However, in vvar case,
sounds like  there's nowhere to read it from offline?  In that case,
it could be justified to dump it.

Thanks,
Pedro Alves


^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: vvar, gup && coredump
  2015-03-12 18:05                   ` Oleg Nesterov
@ 2015-03-12 18:23                     ` Sergio Durigan Junior
  0 siblings, 0 replies; 31+ messages in thread
From: Sergio Durigan Junior @ 2015-03-12 18:23 UTC (permalink / raw)
  To: Oleg Nesterov
  Cc: Andy Lutomirski, Jan Kratochvil, GDB Patches, Pedro Alves, linux-kernel

On Thursday, March 12 2015, Oleg Nesterov wrote:

>> Let's wait until my special_mapping vmops rework lands to do that.
>> I'll dust it off and resubmit it.
>
> OK. Please CC me. Not that I think I can help, just I want to understand
> what you are going to do.

If you think your work will impact the core dumping part, please include
me in the Cc as well.

-- 
Sergio
GPG key ID: 0x65FC5E36
Please send encrypted e-mail if possible
http://sergiodj.net/

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: vvar, gup && coredump
  2015-03-12 18:19                 ` Pedro Alves
@ 2015-03-12 18:25                   ` Andy Lutomirski
  0 siblings, 0 replies; 31+ messages in thread
From: Andy Lutomirski @ 2015-03-12 18:25 UTC (permalink / raw)
  To: Pedro Alves
  Cc: Oleg Nesterov, Jan Kratochvil, Sergio Durigan Junior,
	GDB Patches, linux-kernel

On Thu, Mar 12, 2015 at 11:19 AM, Pedro Alves <palves@redhat.com> wrote:
> On 03/12/2015 05:46 PM, Oleg Nesterov wrote:
>> On 03/12, Oleg Nesterov wrote:
>>>
>>> Yes, this is true. OK, lets not dump it.
>>
>> OTOH. We can probably add ->access() into special_mapping_vmops, this
>> way __access_remote_vm() could work even if gup() fails ?
>>
>> Jan, Sergio. How much do we want do dump this area ? The change above
>> should be justified.
>
> Memory mappings that weren't touched since they were initially mapped can
> be retrieved from the program binary and the shared libraries, even if
> the core dump is moved to another machine.  However, in vvar case,
> sounds like  there's nowhere to read it from offline?  In that case,
> it could be justified to dump it.

This is why we currently dump the vdso text.  On arm64 (the only other
architecture that uses a real vma for vvar data IIRC), we use a more
normal vma and we dump it.  x86 is the odd one out here.

We could just leave the kernel alone.  The data that gets dumped is of
dubious value, but it could be slightly helpful when debugging vdso
crashes*, but, of course, dumping it is inherently racy.


* The vdso never crashes :)

--Andy

>
> Thanks,
> Pedro Alves
>



-- 
Andy Lutomirski
AMA Capital Management, LLC

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: vvar, gup && coredump
  2015-03-12 18:08                     ` Oleg Nesterov
@ 2015-03-12 18:33                       ` Andy Lutomirski
  2015-03-13 15:22                         ` Oleg Nesterov
  0 siblings, 1 reply; 31+ messages in thread
From: Andy Lutomirski @ 2015-03-12 18:33 UTC (permalink / raw)
  To: Oleg Nesterov; +Cc: linux-kernel

[removed people who will be bored]

On Thu, Mar 12, 2015 at 11:08 AM, Oleg Nesterov <oleg@redhat.com> wrote:
> On 03/12, Andy Lutomirski wrote:
>>
>> On Thu, Mar 12, 2015 at 10:39 AM, Oleg Nesterov <oleg@redhat.com> wrote:
>> > Well. Speaking of vdso. I understand that unlikely we can do this, but
>> > for uprobes it would be nice to have a anon-inode file behind this mapping,
>> > so that vma_interval_tree_foreach() could work, etc. OK, this is completely
>> > off-topic, please forget.
>>
>> Couldn't you do that directly in the uprobes code?  That is, create an
>> anon_inode file and just map it the old-fashioned way?
>
> This won't help. Uprobes wants this file mmaped by all applications, so
> that build_map_info() can find mm's, vma's, etc to install the system-wide
> breakpoint. But again, this is off-topic and unlikely possible.
>

What's wrong with off-topic? :)

As vvar demonstrates, it's possible to add new per-process vmas that
show up in all processes automatically.  The trickiest part is making
it work with CRIU.

--Andy

> Oleg.
>



-- 
Andy Lutomirski
AMA Capital Management, LLC

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: vvar, gup && coredump
  2015-03-12 18:02                     ` Oleg Nesterov
@ 2015-03-13  4:50                       ` Sergio Durigan Junior
  2015-03-13 15:04                         ` Oleg Nesterov
  0 siblings, 1 reply; 31+ messages in thread
From: Sergio Durigan Junior @ 2015-03-13  4:50 UTC (permalink / raw)
  To: Oleg Nesterov
  Cc: Andy Lutomirski, Jan Kratochvil, GDB Patches, Pedro Alves, linux-kernel

On Thursday, March 12 2015, Oleg Nesterov wrote:

>> If I understood this discussion correctly (and thanks Andy and Oleg for,
>> *ahem*, dumping all this useful information for us!), GDB will not need
>> modifications in the Linux kernel in this area.  In fact, my patch
>> already implements the "ignore VM_DONTDUMP mappings" part, so we're
>> pretty much covered.
>
> OK, thanks.
>
> And it seems that we all agree that the kernel should not dump this vma
> too. Could you confirm that this is fine from gdb pov just in case?

Yes, this is what we expect from the GDB side.  This mapping is marked
as "dd", so it does not make sense to dump it.

While I have you guys, would it be possible for the Linux kernel to
include a new flag on VmFlags to uniquely identify an anonymous mapping?
Currently, there is no easy way to do that from userspace.  My patch
implements the following heuristic on GDB:

  if (pathname == "/dev/zero (deleted)"
      || pathname == "/SYSV%08x (deleted)"
      || pathname == "<file> (deleted)"
      || "Anonymous:" field is > 0 kB
      || "AnonHugePages:" field is > 0 kB)
    mapping is anonymous;

However, this can be fragile.  The Linux kernel checks for i_nlink == 0,
but there is no easy way for GDB to check this on userspace (as Jan
mentioned, one could look at /proc/PID/map_files/, but they are only
accessible by root).  That's why I think it would be good to provide
this info right away in /proc/PID/smaps...

Thanks,

-- 
Sergio
GPG key ID: 0x65FC5E36
Please send encrypted e-mail if possible
http://sergiodj.net/

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: vvar, gup && coredump
  2015-03-13  4:50                       ` Sergio Durigan Junior
@ 2015-03-13 15:04                         ` Oleg Nesterov
  0 siblings, 0 replies; 31+ messages in thread
From: Oleg Nesterov @ 2015-03-13 15:04 UTC (permalink / raw)
  To: Sergio Durigan Junior
  Cc: Andy Lutomirski, Jan Kratochvil, GDB Patches, Pedro Alves, linux-kernel

On 03/13, Sergio Durigan Junior wrote:
>
> On Thursday, March 12 2015, Oleg Nesterov wrote:
>
> > And it seems that we all agree that the kernel should not dump this vma
> > too. Could you confirm that this is fine from gdb pov just in case?
>
> Yes, this is what we expect from the GDB side.  This mapping is marked
> as "dd", so it does not make sense to dump it.

OK.

> While I have you guys, would it be possible for the Linux kernel to
> include a new flag on VmFlags to uniquely identify an anonymous mapping?

Note that "anonymous" is not the right term here... I mean it is a bit
confusing. Lets discuss this again on debug-list, then we will see if
gdb needs more info from kernel.

> Currently, there is no easy way to do that from userspace.  My patch
> implements the following heuristic on GDB:
>
>   if (pathname == "/dev/zero (deleted)"
>       || pathname == "/SYSV%08x (deleted)"
>       || pathname == "<file> (deleted)"

And for example, this is not anonymous mapping. But,

>     mapping is anonymous;

I agree, gdb should treat it as anonymous.

> However, this can be fragile.  The Linux kernel checks for i_nlink == 0,

Yes, as we already disccussed, I think the kernel should be changed.

It should do something like shmem_mapping() || d_unlinked(), I think.
But this needs another discussion on lkml, and in another thread.

Oleg.


^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: vvar, gup && coredump
  2015-03-12 18:33                       ` Andy Lutomirski
@ 2015-03-13 15:22                         ` Oleg Nesterov
  0 siblings, 0 replies; 31+ messages in thread
From: Oleg Nesterov @ 2015-03-13 15:22 UTC (permalink / raw)
  To: Andy Lutomirski; +Cc: linux-kernel

On 03/12, Andy Lutomirski wrote:
>
> What's wrong with off-topic? :)

OK ;)

> As vvar demonstrates, it's possible to add new per-process vmas that
> show up in all processes automatically.  The trickiest part is making
> it work with CRIU.

Ah... I seem to finaly understand why exactly you added this "vvar" vma...

To make the life of c/r developers more interesting, right?

Oleg.


^ permalink raw reply	[flat|nested] 31+ messages in thread

* install_special_mapping && vm_pgoff (Was: vvar, gup && coredump)
  2015-03-12 17:46               ` Oleg Nesterov
  2015-03-12 17:54                 ` Andy Lutomirski
  2015-03-12 18:19                 ` Pedro Alves
@ 2015-03-16 19:01                 ` Oleg Nesterov
  2015-03-16 19:20                     ` Andy Lutomirski
  2015-03-16 19:40                   ` Pedro Alves
  2 siblings, 2 replies; 31+ messages in thread
From: Oleg Nesterov @ 2015-03-16 19:01 UTC (permalink / raw)
  To: Andy Lutomirski, Hugh Dickins, Linus Torvalds
  Cc: Jan Kratochvil, Sergio Durigan Junior, GDB Patches, Pedro Alves,
	linux-kernel

On 03/12, Oleg Nesterov wrote:
>
> OTOH. We can probably add ->access() into special_mapping_vmops, this
> way __access_remote_vm() could work even if gup() fails ?

So I tried to think how special_mapping_vmops->access() can work, it
needs to rely on ->vm_pgoff.

But afaics this logic is just broken. Lets even forget about vvar vma
which uses remap_file_pages(). Lets look at "[vdso]" which uses the
"normal" pages.

The comment in special_mapping_fault() says

	 * special mappings have no vm_file, and in that case, the mm
	 * uses vm_pgoff internally.

Yes. But afaics mm/ doesn't do this correctly. So

	 * do not copy this code into drivers!

looks like a good recommendation ;)

I think that this logic is wrong even if ARRAY_SIZE(pages) == 1, but I am
not sure. But since vdso use 2 pages, it is trivial to show that this logic
is wrong. To verify, I changed show_map_vma() to expose pgoff even if !file,
but this test-case can show the problem too:

	#include <stdio.h>
	#include <unistd.h>
	#include <stdlib.h>
	#include <string.h>
	#include <sys/mman.h>
	#include <assert.h>

	void *find_vdso_vaddr(void)
	{
		FILE *perl;
		char buf[32] = {};

		perl = popen("perl -e 'open STDIN,qq|/proc/@{[getppid]}/maps|;"
				"/^(.*?)-.*vdso/ && print hex $1 while <>'", "r");
		fread(buf, sizeof(buf), 1, perl);
		fclose(perl);

		return (void *)atol(buf);
	}

	#define PAGE_SIZE	4096

	int main(void)
	{
		void *vdso = find_vdso_vaddr();
		assert(vdso);

		// of course they should differ, and they do so far
		printf("vdso pages differ: %d\n",
			!!memcmp(vdso, vdso + PAGE_SIZE, PAGE_SIZE));

		// split into 2 vma's
		assert(mprotect(vdso, PAGE_SIZE, PROT_READ) == 0);

		// force another fault on the next check
		assert(madvise(vdso, 2 * PAGE_SIZE, MADV_DONTNEED) == 0);

		// now they no longer differ, the 2nd vm_pgoff is wrong
		printf("vdso pages differ: %d\n",
			!!memcmp(vdso, vdso + PAGE_SIZE, PAGE_SIZE));

		return 0;
	}

output:

	vdso pages differ: 1
	vdso pages differ: 0

And not only "split_vma" is wrong, I think that "move_vma" is not right too.
Note this check in copy_vma(),

	/*
	 * If anonymous vma has not yet been faulted, update new pgoff
	 * to match new location, to increase its chance of merging.
	 */
	if (unlikely(!vma->vm_file && !vma->anon_vma)) {
		pgoff = addr >> PAGE_SHIFT;
		faulted_in_anon_vma = false;
	}

I can easily misread this code. But it doesn't look right too. If vdso was cow'ed
(breakpoint installed by gdb) and sys_nremap()'ed, then the new pgoff will be wrong
too after, say, MADV_DONTNEED.

Or I am totally confused?

Oleg.


^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: install_special_mapping && vm_pgoff (Was: vvar, gup && coredump)
  2015-03-16 19:01                 ` install_special_mapping && vm_pgoff (Was: vvar, gup && coredump) Oleg Nesterov
@ 2015-03-16 19:20                     ` Andy Lutomirski
  2015-03-16 19:40                   ` Pedro Alves
  1 sibling, 0 replies; 31+ messages in thread
From: Andy Lutomirski @ 2015-03-16 19:20 UTC (permalink / raw)
  To: Oleg Nesterov
  Cc: Hugh Dickins, Linus Torvalds, Jan Kratochvil,
	Sergio Durigan Junior, GDB Patches, Pedro Alves, linux-kernel,
	linux-mm

[cc: linux-mm]

On Mon, Mar 16, 2015 at 12:01 PM, Oleg Nesterov <oleg@redhat.com> wrote:
> On 03/12, Oleg Nesterov wrote:
>>
>> OTOH. We can probably add ->access() into special_mapping_vmops, this
>> way __access_remote_vm() could work even if gup() fails ?
>
> So I tried to think how special_mapping_vmops->access() can work, it
> needs to rely on ->vm_pgoff.
>
> But afaics this logic is just broken. Lets even forget about vvar vma
> which uses remap_file_pages(). Lets look at "[vdso]" which uses the
> "normal" pages.
>
> The comment in special_mapping_fault() says
>
>          * special mappings have no vm_file, and in that case, the mm
>          * uses vm_pgoff internally.
>
> Yes. But afaics mm/ doesn't do this correctly. So
>
>          * do not copy this code into drivers!
>
> looks like a good recommendation ;)
>
> I think that this logic is wrong even if ARRAY_SIZE(pages) == 1, but I am
> not sure. But since vdso use 2 pages, it is trivial to show that this logic
> is wrong. To verify, I changed show_map_vma() to expose pgoff even if !file,
> but this test-case can show the problem too:
>
>         #include <stdio.h>
>         #include <unistd.h>
>         #include <stdlib.h>
>         #include <string.h>
>         #include <sys/mman.h>
>         #include <assert.h>
>
>         void *find_vdso_vaddr(void)
>         {
>                 FILE *perl;
>                 char buf[32] = {};
>
>                 perl = popen("perl -e 'open STDIN,qq|/proc/@{[getppid]}/maps|;"
>                                 "/^(.*?)-.*vdso/ && print hex $1 while <>'", "r");
>                 fread(buf, sizeof(buf), 1, perl);
>                 fclose(perl);
>
>                 return (void *)atol(buf);
>         }
>
>         #define PAGE_SIZE       4096
>
>         int main(void)
>         {
>                 void *vdso = find_vdso_vaddr();
>                 assert(vdso);
>
>                 // of course they should differ, and they do so far
>                 printf("vdso pages differ: %d\n",
>                         !!memcmp(vdso, vdso + PAGE_SIZE, PAGE_SIZE));
>
>                 // split into 2 vma's
>                 assert(mprotect(vdso, PAGE_SIZE, PROT_READ) == 0);
>
>                 // force another fault on the next check
>                 assert(madvise(vdso, 2 * PAGE_SIZE, MADV_DONTNEED) == 0);

I really hope this doesn't do anything (or fails) on the vvar page,
which is a pfnmap.

>
>                 // now they no longer differ, the 2nd vm_pgoff is wrong
>                 printf("vdso pages differ: %d\n",
>                         !!memcmp(vdso, vdso + PAGE_SIZE, PAGE_SIZE));
>
>                 return 0;
>         }
>
> output:
>
>         vdso pages differ: 1
>         vdso pages differ: 0
>
> And not only "split_vma" is wrong, I think that "move_vma" is not right too.
> Note this check in copy_vma(),
>
>         /*
>          * If anonymous vma has not yet been faulted, update new pgoff
>          * to match new location, to increase its chance of merging.
>          */
>         if (unlikely(!vma->vm_file && !vma->anon_vma)) {
>                 pgoff = addr >> PAGE_SHIFT;
>                 faulted_in_anon_vma = false;
>         }
>
> I can easily misread this code. But it doesn't look right too. If vdso was cow'ed
> (breakpoint installed by gdb) and sys_nremap()'ed, then the new pgoff will be wrong
> too after, say, MADV_DONTNEED.
>
> Or I am totally confused?

Ick, you're probably right.  For what it's worth, the vdso *seems* to
be okay (on 64-bit only, and only if you don't poke at it too hard) if
you mremap it in one piece.  CRIU does that.

What does the mm code do with vm_pgoff for vmas with no vm_file?  I'm
mystified.  There's this comment:

 * The way we recognize COWed pages within VM_PFNMAP mappings is through the
 * rules set up by "remap_pfn_range()": the vma will have the VM_PFNMAP bit
 * set, and the vm_pgoff will point to the first PFN mapped: thus every special
 * mapping will always honor the rule
 *
 *    pfn_of_page == vma->vm_pgoff + ((addr - vma->vm_start) >> PAGE_SHIFT)

Is that referring to special mappings in the install_special_mapping
sense or to something else.  FWIW, the vdso ins't a VM_PFNMAP at all.

--Andy

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: install_special_mapping && vm_pgoff (Was: vvar, gup && coredump)
@ 2015-03-16 19:20                     ` Andy Lutomirski
  0 siblings, 0 replies; 31+ messages in thread
From: Andy Lutomirski @ 2015-03-16 19:20 UTC (permalink / raw)
  To: Oleg Nesterov
  Cc: Hugh Dickins, Linus Torvalds, Jan Kratochvil,
	Sergio Durigan Junior, GDB Patches, Pedro Alves, linux-kernel,
	linux-mm

[cc: linux-mm]

On Mon, Mar 16, 2015 at 12:01 PM, Oleg Nesterov <oleg@redhat.com> wrote:
> On 03/12, Oleg Nesterov wrote:
>>
>> OTOH. We can probably add ->access() into special_mapping_vmops, this
>> way __access_remote_vm() could work even if gup() fails ?
>
> So I tried to think how special_mapping_vmops->access() can work, it
> needs to rely on ->vm_pgoff.
>
> But afaics this logic is just broken. Lets even forget about vvar vma
> which uses remap_file_pages(). Lets look at "[vdso]" which uses the
> "normal" pages.
>
> The comment in special_mapping_fault() says
>
>          * special mappings have no vm_file, and in that case, the mm
>          * uses vm_pgoff internally.
>
> Yes. But afaics mm/ doesn't do this correctly. So
>
>          * do not copy this code into drivers!
>
> looks like a good recommendation ;)
>
> I think that this logic is wrong even if ARRAY_SIZE(pages) == 1, but I am
> not sure. But since vdso use 2 pages, it is trivial to show that this logic
> is wrong. To verify, I changed show_map_vma() to expose pgoff even if !file,
> but this test-case can show the problem too:
>
>         #include <stdio.h>
>         #include <unistd.h>
>         #include <stdlib.h>
>         #include <string.h>
>         #include <sys/mman.h>
>         #include <assert.h>
>
>         void *find_vdso_vaddr(void)
>         {
>                 FILE *perl;
>                 char buf[32] = {};
>
>                 perl = popen("perl -e 'open STDIN,qq|/proc/@{[getppid]}/maps|;"
>                                 "/^(.*?)-.*vdso/ && print hex $1 while <>'", "r");
>                 fread(buf, sizeof(buf), 1, perl);
>                 fclose(perl);
>
>                 return (void *)atol(buf);
>         }
>
>         #define PAGE_SIZE       4096
>
>         int main(void)
>         {
>                 void *vdso = find_vdso_vaddr();
>                 assert(vdso);
>
>                 // of course they should differ, and they do so far
>                 printf("vdso pages differ: %d\n",
>                         !!memcmp(vdso, vdso + PAGE_SIZE, PAGE_SIZE));
>
>                 // split into 2 vma's
>                 assert(mprotect(vdso, PAGE_SIZE, PROT_READ) == 0);
>
>                 // force another fault on the next check
>                 assert(madvise(vdso, 2 * PAGE_SIZE, MADV_DONTNEED) == 0);

I really hope this doesn't do anything (or fails) on the vvar page,
which is a pfnmap.

>
>                 // now they no longer differ, the 2nd vm_pgoff is wrong
>                 printf("vdso pages differ: %d\n",
>                         !!memcmp(vdso, vdso + PAGE_SIZE, PAGE_SIZE));
>
>                 return 0;
>         }
>
> output:
>
>         vdso pages differ: 1
>         vdso pages differ: 0
>
> And not only "split_vma" is wrong, I think that "move_vma" is not right too.
> Note this check in copy_vma(),
>
>         /*
>          * If anonymous vma has not yet been faulted, update new pgoff
>          * to match new location, to increase its chance of merging.
>          */
>         if (unlikely(!vma->vm_file && !vma->anon_vma)) {
>                 pgoff = addr >> PAGE_SHIFT;
>                 faulted_in_anon_vma = false;
>         }
>
> I can easily misread this code. But it doesn't look right too. If vdso was cow'ed
> (breakpoint installed by gdb) and sys_nremap()'ed, then the new pgoff will be wrong
> too after, say, MADV_DONTNEED.
>
> Or I am totally confused?

Ick, you're probably right.  For what it's worth, the vdso *seems* to
be okay (on 64-bit only, and only if you don't poke at it too hard) if
you mremap it in one piece.  CRIU does that.

What does the mm code do with vm_pgoff for vmas with no vm_file?  I'm
mystified.  There's this comment:

 * The way we recognize COWed pages within VM_PFNMAP mappings is through the
 * rules set up by "remap_pfn_range()": the vma will have the VM_PFNMAP bit
 * set, and the vm_pgoff will point to the first PFN mapped: thus every special
 * mapping will always honor the rule
 *
 *    pfn_of_page == vma->vm_pgoff + ((addr - vma->vm_start) >> PAGE_SHIFT)

Is that referring to special mappings in the install_special_mapping
sense or to something else.  FWIW, the vdso ins't a VM_PFNMAP at all.

--Andy

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: install_special_mapping && vm_pgoff (Was: vvar, gup && coredump)
  2015-03-16 19:01                 ` install_special_mapping && vm_pgoff (Was: vvar, gup && coredump) Oleg Nesterov
  2015-03-16 19:20                     ` Andy Lutomirski
@ 2015-03-16 19:40                   ` Pedro Alves
  1 sibling, 0 replies; 31+ messages in thread
From: Pedro Alves @ 2015-03-16 19:40 UTC (permalink / raw)
  To: Oleg Nesterov, Andy Lutomirski, Hugh Dickins, Linus Torvalds
  Cc: Jan Kratochvil, Sergio Durigan Junior, GDB Patches, linux-kernel

Thanks for looking over all this, guys.  Really appreciated.

On 03/16/2015 07:01 PM, Oleg Nesterov wrote:
> is wrong. To verify, I changed show_map_vma() to expose pgoff even if !file,
> but this test-case can show the problem too:

Might be good to add tests like this to selftests/ once all
this is sorted.

Pedro Alves


^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: install_special_mapping && vm_pgoff (Was: vvar, gup && coredump)
  2015-03-16 19:20                     ` Andy Lutomirski
@ 2015-03-16 19:44                       ` Oleg Nesterov
  -1 siblings, 0 replies; 31+ messages in thread
From: Oleg Nesterov @ 2015-03-16 19:44 UTC (permalink / raw)
  To: Andy Lutomirski
  Cc: Hugh Dickins, Linus Torvalds, Jan Kratochvil,
	Sergio Durigan Junior, GDB Patches, Pedro Alves, linux-kernel,
	linux-mm

On 03/16, Andy Lutomirski wrote:
>
> Ick, you're probably right.  For what it's worth, the vdso *seems* to
> be okay (on 64-bit only, and only if you don't poke at it too hard) if
> you mremap it in one piece.  CRIU does that.

I need to run away till tomorrow, but looking at this code even if "one piece"
case doesn't look right if it was cow'ed. I'll verify tomorrow.

Oleg.


^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: install_special_mapping && vm_pgoff (Was: vvar, gup && coredump)
@ 2015-03-16 19:44                       ` Oleg Nesterov
  0 siblings, 0 replies; 31+ messages in thread
From: Oleg Nesterov @ 2015-03-16 19:44 UTC (permalink / raw)
  To: Andy Lutomirski
  Cc: Hugh Dickins, Linus Torvalds, Jan Kratochvil,
	Sergio Durigan Junior, GDB Patches, Pedro Alves, linux-kernel,
	linux-mm

On 03/16, Andy Lutomirski wrote:
>
> Ick, you're probably right.  For what it's worth, the vdso *seems* to
> be okay (on 64-bit only, and only if you don't poke at it too hard) if
> you mremap it in one piece.  CRIU does that.

I need to run away till tomorrow, but looking at this code even if "one piece"
case doesn't look right if it was cow'ed. I'll verify tomorrow.

Oleg.

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: install_special_mapping && vm_pgoff (Was: vvar, gup && coredump)
  2015-03-16 19:44                       ` Oleg Nesterov
@ 2015-03-17 13:43                         ` Oleg Nesterov
  -1 siblings, 0 replies; 31+ messages in thread
From: Oleg Nesterov @ 2015-03-17 13:43 UTC (permalink / raw)
  To: Andy Lutomirski
  Cc: Hugh Dickins, Linus Torvalds, Jan Kratochvil,
	Sergio Durigan Junior, GDB Patches, Pedro Alves, linux-kernel,
	linux-mm

On 03/16, Oleg Nesterov wrote:
>
> On 03/16, Andy Lutomirski wrote:
> >
> > Ick, you're probably right.  For what it's worth, the vdso *seems* to
> > be okay (on 64-bit only, and only if you don't poke at it too hard) if
> > you mremap it in one piece.  CRIU does that.
>
> I need to run away till tomorrow, but looking at this code even if "one piece"
> case doesn't look right if it was cow'ed. I'll verify tomorrow.

And I am still not sure this all is 100% correct, but I got lost in this code.
Probably this is fine...

But at least the bug exposed by the test-case looks clear:

	do_linear_fault:

		vmf->pgoff = (((address & PAGE_MASK) - vma->vm_start) >> PAGE_SHIFT)
				+ vma->vm_pgoff;
		...

		special_mapping_fault:

			pgoff = vmf->pgoff - vma->vm_pgoff;


So special_mapping_fault() can only work if this mapping starts from the
first page in ->pages[].

So perhaps we need _something like_ the (wrong/incomplete) patch below...

Or, really, perhaps we can create vdso_mapping ? So that map_vdso() could
simply mmap the anon_inode file...

Oleg.

--- x/mm/mmap.c
+++ x/mm/mmap.c
@@ -2832,6 +2832,8 @@ int insert_vm_struct(struct mm_struct *mm, struct vm_area_struct *vma)
 	return 0;
 }
 
+bool is_special_vma(struct vm_area_struct *vma);
+
 /*
  * Copy the vma structure to a new location in the same mm,
  * prior to moving page table entries, to effect an mremap move.
@@ -2851,7 +2853,7 @@ struct vm_area_struct *copy_vma(struct vm_area_struct **vmap,
 	 * If anonymous vma has not yet been faulted, update new pgoff
 	 * to match new location, to increase its chance of merging.
 	 */
-	if (unlikely(!vma->vm_file && !vma->anon_vma)) {
+	if (unlikely(!vma->vm_file && !is_special_vma(vma) && !vma->anon_vma)) {
 		pgoff = addr >> PAGE_SHIFT;
 		faulted_in_anon_vma = false;
 	}
@@ -2953,6 +2955,11 @@ static const struct vm_operations_struct legacy_special_mapping_vmops = {
 	.fault = special_mapping_fault,
 };
 
+bool is_special_vma(struct vm_area_struct *vma)
+{
+	return vma->vm_ops == &special_mapping_vmops;
+}
+
 static int special_mapping_fault(struct vm_area_struct *vma,
 				struct vm_fault *vmf)
 {
@@ -2965,7 +2972,7 @@ static int special_mapping_fault(struct vm_area_struct *vma,
 	 * We are allowed to do this because we are the mm; do not copy
 	 * this code into drivers!
 	 */
-	pgoff = vmf->pgoff - vma->vm_pgoff;
+	pgoff = vmf->pgoff;
 
 	if (vma->vm_ops == &legacy_special_mapping_vmops)
 		pages = vma->vm_private_data;
@@ -3014,6 +3021,7 @@ static struct vm_area_struct *__install_special_mapping(
 	if (ret)
 		goto out;
 
+	vma->vm_pgoff = 0;
 	mm->total_vm += len >> PAGE_SHIFT;
 
 	perf_event_mmap(vma);


^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: install_special_mapping && vm_pgoff (Was: vvar, gup && coredump)
@ 2015-03-17 13:43                         ` Oleg Nesterov
  0 siblings, 0 replies; 31+ messages in thread
From: Oleg Nesterov @ 2015-03-17 13:43 UTC (permalink / raw)
  To: Andy Lutomirski
  Cc: Hugh Dickins, Linus Torvalds, Jan Kratochvil,
	Sergio Durigan Junior, GDB Patches, Pedro Alves, linux-kernel,
	linux-mm

On 03/16, Oleg Nesterov wrote:
>
> On 03/16, Andy Lutomirski wrote:
> >
> > Ick, you're probably right.  For what it's worth, the vdso *seems* to
> > be okay (on 64-bit only, and only if you don't poke at it too hard) if
> > you mremap it in one piece.  CRIU does that.
>
> I need to run away till tomorrow, but looking at this code even if "one piece"
> case doesn't look right if it was cow'ed. I'll verify tomorrow.

And I am still not sure this all is 100% correct, but I got lost in this code.
Probably this is fine...

But at least the bug exposed by the test-case looks clear:

	do_linear_fault:

		vmf->pgoff = (((address & PAGE_MASK) - vma->vm_start) >> PAGE_SHIFT)
				+ vma->vm_pgoff;
		...

		special_mapping_fault:

			pgoff = vmf->pgoff - vma->vm_pgoff;


So special_mapping_fault() can only work if this mapping starts from the
first page in ->pages[].

So perhaps we need _something like_ the (wrong/incomplete) patch below...

Or, really, perhaps we can create vdso_mapping ? So that map_vdso() could
simply mmap the anon_inode file...

Oleg.

--- x/mm/mmap.c
+++ x/mm/mmap.c
@@ -2832,6 +2832,8 @@ int insert_vm_struct(struct mm_struct *mm, struct vm_area_struct *vma)
 	return 0;
 }
 
+bool is_special_vma(struct vm_area_struct *vma);
+
 /*
  * Copy the vma structure to a new location in the same mm,
  * prior to moving page table entries, to effect an mremap move.
@@ -2851,7 +2853,7 @@ struct vm_area_struct *copy_vma(struct vm_area_struct **vmap,
 	 * If anonymous vma has not yet been faulted, update new pgoff
 	 * to match new location, to increase its chance of merging.
 	 */
-	if (unlikely(!vma->vm_file && !vma->anon_vma)) {
+	if (unlikely(!vma->vm_file && !is_special_vma(vma) && !vma->anon_vma)) {
 		pgoff = addr >> PAGE_SHIFT;
 		faulted_in_anon_vma = false;
 	}
@@ -2953,6 +2955,11 @@ static const struct vm_operations_struct legacy_special_mapping_vmops = {
 	.fault = special_mapping_fault,
 };
 
+bool is_special_vma(struct vm_area_struct *vma)
+{
+	return vma->vm_ops == &special_mapping_vmops;
+}
+
 static int special_mapping_fault(struct vm_area_struct *vma,
 				struct vm_fault *vmf)
 {
@@ -2965,7 +2972,7 @@ static int special_mapping_fault(struct vm_area_struct *vma,
 	 * We are allowed to do this because we are the mm; do not copy
 	 * this code into drivers!
 	 */
-	pgoff = vmf->pgoff - vma->vm_pgoff;
+	pgoff = vmf->pgoff;
 
 	if (vma->vm_ops == &legacy_special_mapping_vmops)
 		pages = vma->vm_private_data;
@@ -3014,6 +3021,7 @@ static struct vm_area_struct *__install_special_mapping(
 	if (ret)
 		goto out;
 
+	vma->vm_pgoff = 0;
 	mm->total_vm += len >> PAGE_SHIFT;
 
 	perf_event_mmap(vma);

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: install_special_mapping && vm_pgoff (Was: vvar, gup && coredump)
  2015-03-17 13:43                         ` Oleg Nesterov
@ 2015-03-18  1:44                           ` Andy Lutomirski
  -1 siblings, 0 replies; 31+ messages in thread
From: Andy Lutomirski @ 2015-03-18  1:44 UTC (permalink / raw)
  To: Oleg Nesterov
  Cc: Hugh Dickins, Linus Torvalds, Jan Kratochvil,
	Sergio Durigan Junior, GDB Patches, Pedro Alves, linux-kernel,
	linux-mm

On Tue, Mar 17, 2015 at 6:43 AM, Oleg Nesterov <oleg@redhat.com> wrote:
> On 03/16, Oleg Nesterov wrote:
>>
>> On 03/16, Andy Lutomirski wrote:
>> >
>> > Ick, you're probably right.  For what it's worth, the vdso *seems* to
>> > be okay (on 64-bit only, and only if you don't poke at it too hard) if
>> > you mremap it in one piece.  CRIU does that.
>>
>> I need to run away till tomorrow, but looking at this code even if "one piece"
>> case doesn't look right if it was cow'ed. I'll verify tomorrow.
>
> And I am still not sure this all is 100% correct, but I got lost in this code.
> Probably this is fine...
>
> But at least the bug exposed by the test-case looks clear:
>
>         do_linear_fault:
>
>                 vmf->pgoff = (((address & PAGE_MASK) - vma->vm_start) >> PAGE_SHIFT)
>                                 + vma->vm_pgoff;
>                 ...
>
>                 special_mapping_fault:
>
>                         pgoff = vmf->pgoff - vma->vm_pgoff;
>
>
> So special_mapping_fault() can only work if this mapping starts from the
> first page in ->pages[].
>
> So perhaps we need _something like_ the (wrong/incomplete) patch below...
>
> Or, really, perhaps we can create vdso_mapping ? So that map_vdso() could
> simply mmap the anon_inode file...

That's slightly tricky, I think, because it could start showing up in
/proc/PID/map_files or whatever it's called, and I don't think we want
that.  I also don't want to commit to all special mappings everywhere
being semantically identical (there are already two kinds on both x86
and arm64, and I'd eventually like to have them vary per-process as
well).  None of that precludes using non-null vm_file, but it's a
complication.

Your patch does look like a considerable improvement, though.  Let me
see if I can find some time to fold it in with the rest of my special
mapping rework over the next few days.

--Andy

>
> Oleg.
>
> --- x/mm/mmap.c
> +++ x/mm/mmap.c
> @@ -2832,6 +2832,8 @@ int insert_vm_struct(struct mm_struct *mm, struct vm_area_struct *vma)
>         return 0;
>  }
>
> +bool is_special_vma(struct vm_area_struct *vma);
> +
>  /*
>   * Copy the vma structure to a new location in the same mm,
>   * prior to moving page table entries, to effect an mremap move.
> @@ -2851,7 +2853,7 @@ struct vm_area_struct *copy_vma(struct vm_area_struct **vmap,
>          * If anonymous vma has not yet been faulted, update new pgoff
>          * to match new location, to increase its chance of merging.
>          */
> -       if (unlikely(!vma->vm_file && !vma->anon_vma)) {
> +       if (unlikely(!vma->vm_file && !is_special_vma(vma) && !vma->anon_vma)) {
>                 pgoff = addr >> PAGE_SHIFT;
>                 faulted_in_anon_vma = false;
>         }
> @@ -2953,6 +2955,11 @@ static const struct vm_operations_struct legacy_special_mapping_vmops = {
>         .fault = special_mapping_fault,
>  };
>
> +bool is_special_vma(struct vm_area_struct *vma)
> +{
> +       return vma->vm_ops == &special_mapping_vmops;
> +}
> +
>  static int special_mapping_fault(struct vm_area_struct *vma,
>                                 struct vm_fault *vmf)
>  {
> @@ -2965,7 +2972,7 @@ static int special_mapping_fault(struct vm_area_struct *vma,
>          * We are allowed to do this because we are the mm; do not copy
>          * this code into drivers!
>          */
> -       pgoff = vmf->pgoff - vma->vm_pgoff;
> +       pgoff = vmf->pgoff;
>
>         if (vma->vm_ops == &legacy_special_mapping_vmops)
>                 pages = vma->vm_private_data;
> @@ -3014,6 +3021,7 @@ static struct vm_area_struct *__install_special_mapping(
>         if (ret)
>                 goto out;
>
> +       vma->vm_pgoff = 0;
>         mm->total_vm += len >> PAGE_SHIFT;
>
>         perf_event_mmap(vma);
>



-- 
Andy Lutomirski
AMA Capital Management, LLC

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: install_special_mapping && vm_pgoff (Was: vvar, gup && coredump)
@ 2015-03-18  1:44                           ` Andy Lutomirski
  0 siblings, 0 replies; 31+ messages in thread
From: Andy Lutomirski @ 2015-03-18  1:44 UTC (permalink / raw)
  To: Oleg Nesterov
  Cc: Hugh Dickins, Linus Torvalds, Jan Kratochvil,
	Sergio Durigan Junior, GDB Patches, Pedro Alves, linux-kernel,
	linux-mm

On Tue, Mar 17, 2015 at 6:43 AM, Oleg Nesterov <oleg@redhat.com> wrote:
> On 03/16, Oleg Nesterov wrote:
>>
>> On 03/16, Andy Lutomirski wrote:
>> >
>> > Ick, you're probably right.  For what it's worth, the vdso *seems* to
>> > be okay (on 64-bit only, and only if you don't poke at it too hard) if
>> > you mremap it in one piece.  CRIU does that.
>>
>> I need to run away till tomorrow, but looking at this code even if "one piece"
>> case doesn't look right if it was cow'ed. I'll verify tomorrow.
>
> And I am still not sure this all is 100% correct, but I got lost in this code.
> Probably this is fine...
>
> But at least the bug exposed by the test-case looks clear:
>
>         do_linear_fault:
>
>                 vmf->pgoff = (((address & PAGE_MASK) - vma->vm_start) >> PAGE_SHIFT)
>                                 + vma->vm_pgoff;
>                 ...
>
>                 special_mapping_fault:
>
>                         pgoff = vmf->pgoff - vma->vm_pgoff;
>
>
> So special_mapping_fault() can only work if this mapping starts from the
> first page in ->pages[].
>
> So perhaps we need _something like_ the (wrong/incomplete) patch below...
>
> Or, really, perhaps we can create vdso_mapping ? So that map_vdso() could
> simply mmap the anon_inode file...

That's slightly tricky, I think, because it could start showing up in
/proc/PID/map_files or whatever it's called, and I don't think we want
that.  I also don't want to commit to all special mappings everywhere
being semantically identical (there are already two kinds on both x86
and arm64, and I'd eventually like to have them vary per-process as
well).  None of that precludes using non-null vm_file, but it's a
complication.

Your patch does look like a considerable improvement, though.  Let me
see if I can find some time to fold it in with the rest of my special
mapping rework over the next few days.

--Andy

>
> Oleg.
>
> --- x/mm/mmap.c
> +++ x/mm/mmap.c
> @@ -2832,6 +2832,8 @@ int insert_vm_struct(struct mm_struct *mm, struct vm_area_struct *vma)
>         return 0;
>  }
>
> +bool is_special_vma(struct vm_area_struct *vma);
> +
>  /*
>   * Copy the vma structure to a new location in the same mm,
>   * prior to moving page table entries, to effect an mremap move.
> @@ -2851,7 +2853,7 @@ struct vm_area_struct *copy_vma(struct vm_area_struct **vmap,
>          * If anonymous vma has not yet been faulted, update new pgoff
>          * to match new location, to increase its chance of merging.
>          */
> -       if (unlikely(!vma->vm_file && !vma->anon_vma)) {
> +       if (unlikely(!vma->vm_file && !is_special_vma(vma) && !vma->anon_vma)) {
>                 pgoff = addr >> PAGE_SHIFT;
>                 faulted_in_anon_vma = false;
>         }
> @@ -2953,6 +2955,11 @@ static const struct vm_operations_struct legacy_special_mapping_vmops = {
>         .fault = special_mapping_fault,
>  };
>
> +bool is_special_vma(struct vm_area_struct *vma)
> +{
> +       return vma->vm_ops == &special_mapping_vmops;
> +}
> +
>  static int special_mapping_fault(struct vm_area_struct *vma,
>                                 struct vm_fault *vmf)
>  {
> @@ -2965,7 +2972,7 @@ static int special_mapping_fault(struct vm_area_struct *vma,
>          * We are allowed to do this because we are the mm; do not copy
>          * this code into drivers!
>          */
> -       pgoff = vmf->pgoff - vma->vm_pgoff;
> +       pgoff = vmf->pgoff;
>
>         if (vma->vm_ops == &legacy_special_mapping_vmops)
>                 pages = vma->vm_private_data;
> @@ -3014,6 +3021,7 @@ static struct vm_area_struct *__install_special_mapping(
>         if (ret)
>                 goto out;
>
> +       vma->vm_pgoff = 0;
>         mm->total_vm += len >> PAGE_SHIFT;
>
>         perf_event_mmap(vma);
>



-- 
Andy Lutomirski
AMA Capital Management, LLC

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: install_special_mapping && vm_pgoff (Was: vvar, gup && coredump)
  2015-03-18  1:44                           ` Andy Lutomirski
@ 2015-03-18 18:06                             ` Oleg Nesterov
  -1 siblings, 0 replies; 31+ messages in thread
From: Oleg Nesterov @ 2015-03-18 18:06 UTC (permalink / raw)
  To: Andy Lutomirski
  Cc: Hugh Dickins, Linus Torvalds, Jan Kratochvil,
	Sergio Durigan Junior, GDB Patches, Pedro Alves, linux-kernel,
	linux-mm

On 03/17, Andy Lutomirski wrote:
>
> On Tue, Mar 17, 2015 at 6:43 AM, Oleg Nesterov <oleg@redhat.com> wrote:
> >
> > But at least the bug exposed by the test-case looks clear:
> >
> >         do_linear_fault:
> >
> >                 vmf->pgoff = (((address & PAGE_MASK) - vma->vm_start) >> PAGE_SHIFT)
> >                                 + vma->vm_pgoff;
> >                 ...
> >
> >                 special_mapping_fault:
> >
> >                         pgoff = vmf->pgoff - vma->vm_pgoff;
> >
> >
> > So special_mapping_fault() can only work if this mapping starts from the
> > first page in ->pages[].
> >
> > So perhaps we need _something like_ the (wrong/incomplete) patch below...
> >
> > Or, really, perhaps we can create vdso_mapping ? So that map_vdso() could
> > simply mmap the anon_inode file...
>
> That's slightly tricky, I think, because it could start showing up in
> /proc/PID/map_files or whatever it's called, and I don't think we want
> that.

Hmm. To me this looke liks improvement. And again, with this change
uprobe-in-vdso can work.

OK, this is off-topic right now, lets forget this for the moment.

> Your patch does look like a considerable improvement, though.  Let me
> see if I can find some time to fold it in with the rest of my special
> mapping rework over the next few days.

I'll try to recheck... Perhaps I'll send this (changed) patch for review.
This is a bugfix, even if the bug is minor.

And note that with this change vvar->access() becomes trivial. I think it
makes sense to fix "gup() fails in vvar" too. Gdb developers have enough
other problems with the poor kernel interfaces ;)

Oleg.


^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: install_special_mapping && vm_pgoff (Was: vvar, gup && coredump)
@ 2015-03-18 18:06                             ` Oleg Nesterov
  0 siblings, 0 replies; 31+ messages in thread
From: Oleg Nesterov @ 2015-03-18 18:06 UTC (permalink / raw)
  To: Andy Lutomirski
  Cc: Hugh Dickins, Linus Torvalds, Jan Kratochvil,
	Sergio Durigan Junior, GDB Patches, Pedro Alves, linux-kernel,
	linux-mm

On 03/17, Andy Lutomirski wrote:
>
> On Tue, Mar 17, 2015 at 6:43 AM, Oleg Nesterov <oleg@redhat.com> wrote:
> >
> > But at least the bug exposed by the test-case looks clear:
> >
> >         do_linear_fault:
> >
> >                 vmf->pgoff = (((address & PAGE_MASK) - vma->vm_start) >> PAGE_SHIFT)
> >                                 + vma->vm_pgoff;
> >                 ...
> >
> >                 special_mapping_fault:
> >
> >                         pgoff = vmf->pgoff - vma->vm_pgoff;
> >
> >
> > So special_mapping_fault() can only work if this mapping starts from the
> > first page in ->pages[].
> >
> > So perhaps we need _something like_ the (wrong/incomplete) patch below...
> >
> > Or, really, perhaps we can create vdso_mapping ? So that map_vdso() could
> > simply mmap the anon_inode file...
>
> That's slightly tricky, I think, because it could start showing up in
> /proc/PID/map_files or whatever it's called, and I don't think we want
> that.

Hmm. To me this looke liks improvement. And again, with this change
uprobe-in-vdso can work.

OK, this is off-topic right now, lets forget this for the moment.

> Your patch does look like a considerable improvement, though.  Let me
> see if I can find some time to fold it in with the rest of my special
> mapping rework over the next few days.

I'll try to recheck... Perhaps I'll send this (changed) patch for review.
This is a bugfix, even if the bug is minor.

And note that with this change vvar->access() becomes trivial. I think it
makes sense to fix "gup() fails in vvar" too. Gdb developers have enough
other problems with the poor kernel interfaces ;)

Oleg.

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 31+ messages in thread

end of thread, other threads:[~2015-03-18 18:51 UTC | newest]

Thread overview: 31+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
     [not found] <878ufc9kau.fsf@redhat.com>
     [not found] ` <20150305154827.GA9441@host1.jankratochvil.net>
     [not found]   ` <87zj7r5fpz.fsf@redhat.com>
     [not found]     ` <20150305205744.GA13165@host1.jankratochvil.net>
     [not found]       ` <20150311200052.GA22654@redhat.com>
2015-03-12 14:34         ` vvar, gup && coredump Oleg Nesterov
2015-03-12 16:29           ` Andy Lutomirski
2015-03-12 16:54             ` Oleg Nesterov
2015-03-12 17:17               ` Andy Lutomirski
2015-03-12 17:39                 ` Oleg Nesterov
2015-03-12 17:45                   ` Sergio Durigan Junior
2015-03-12 18:02                     ` Oleg Nesterov
2015-03-13  4:50                       ` Sergio Durigan Junior
2015-03-13 15:04                         ` Oleg Nesterov
2015-03-12 17:55                   ` Andy Lutomirski
2015-03-12 18:08                     ` Oleg Nesterov
2015-03-12 18:33                       ` Andy Lutomirski
2015-03-13 15:22                         ` Oleg Nesterov
2015-03-12 17:46               ` Oleg Nesterov
2015-03-12 17:54                 ` Andy Lutomirski
2015-03-12 18:05                   ` Oleg Nesterov
2015-03-12 18:23                     ` Sergio Durigan Junior
2015-03-12 18:19                 ` Pedro Alves
2015-03-12 18:25                   ` Andy Lutomirski
2015-03-16 19:01                 ` install_special_mapping && vm_pgoff (Was: vvar, gup && coredump) Oleg Nesterov
2015-03-16 19:20                   ` Andy Lutomirski
2015-03-16 19:20                     ` Andy Lutomirski
2015-03-16 19:44                     ` Oleg Nesterov
2015-03-16 19:44                       ` Oleg Nesterov
2015-03-17 13:43                       ` Oleg Nesterov
2015-03-17 13:43                         ` Oleg Nesterov
2015-03-18  1:44                         ` Andy Lutomirski
2015-03-18  1:44                           ` Andy Lutomirski
2015-03-18 18:06                           ` Oleg Nesterov
2015-03-18 18:06                             ` Oleg Nesterov
2015-03-16 19:40                   ` Pedro Alves

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.