linux-media.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH] mm: get_user_pages(write,force) refuse to COW in shared areas
@ 2014-04-04  8:28 Hugh Dickins
  2014-04-04 12:32 ` Kirill A. Shutemov
  2014-04-24 13:30 ` Oleg Nesterov
  0 siblings, 2 replies; 7+ messages in thread
From: Hugh Dickins @ 2014-04-04  8:28 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Andrew Morton, Jan Kara, Roland Dreier, Oleg Nesterov,
	Konstantin Khlebnikov, Kirill A. Shutemov, Mauro Carvalho Chehab,
	Omar Ramirez Luna, Inki Dae, linux-kernel, linux-mm, linux-rdma,
	linux-media

get_user_pages(write=1, force=1) has always had odd behaviour on write-
protected shared mappings: although it demands FMODE_WRITE-access to the
underlying object (do_mmap_pgoff sets neither VM_SHARED nor VM_MAYWRITE
without that), it ends up with do_wp_page substituting private anonymous
Copied-On-Write pages for the shared file pages in the area.

That was long ago intentional, as a safety measure to prevent ptrace
setting a breakpoint (or POKETEXT or POKEDATA) from inadvertently
corrupting the underlying executable.  Yet exec and dynamic loaders
open the file read-only, and use MAP_PRIVATE rather than MAP_SHARED.

The traditional odd behaviour still causes surprises and bugs in mm,
and is probably not what any caller wants - even the comment on the flag
says "You do not want this" (although it's undoubtedly necessary for
overriding userspace protections in some contexts, and good when !write).

Let's stop doing that.  But it would be dangerous to remove the long-
standing safety at this stage, so just make get_user_pages(write,force)
fail with EFAULT when applied to a write-protected shared area.
Infiniband may in future want to force write through to underlying
object: we can add another FOLL_flag later to enable that if required.

Odd though the old behaviour was, there is no doubt that we may turn
out to break userspace with this change, and have to revert it quickly.
Issue a WARN_ON_ONCE to help debug the changed case (easily triggered
by userspace, so only once to prevent spamming the logs); and delay a
few associated cleanups until this change is proved.

get_user_pages callers who might see trouble from this change:
  ptrace poking, or writing to /proc/<pid>/mem
  drivers/infiniband/
  drivers/media/v4l2-core/
  drivers/gpu/drm/exynos/exynos_drm_gem.c
  drivers/staging/tidspbridge/core/tiomap3430.c
if they ever apply get_user_pages to write-protected shared mappings
of an object which was opened for writing.

I went to apply the same change to mm/nommu.c, but retreated.  NOMMU
has no place for COW, and its VM_flags conventions are not the same:
I'd be more likely to screw up NOMMU than make an improvement there.

Suggested-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Hugh Dickins <hughd@google.com>
---

You suggested something like this in the LKML discussion of 887843961c4b
and "bad rss-counter message in 3.14rc5" on March 18th, and I agreed to
remind you "early in the 3.15 merge window".

Sorry, this comes a few days later than intended: and it's the first
time I've posted it, so it's not seen exposure in mmotm or linux-next;
nor any approval from those Cc'ed, though I did mention it to a few.
Up to you: you may prefer to hold it over, or give it exposure soonest;
I've seen no problem from it, but then I'm not likely to.

I may have exaggerated the accounting difficulties of the present
behaviour: even with this change, write-force still violates the
Committed_AS accounting which Konstantin had patches to fix; but
I hope we can do that more simply now, with some kind of cmpxchg
setting VM_ACCOUNT here in vm_flags, without mmap_sem for writing.

 mm/memory.c |   66 ++++++++++++++++++++++++++++++++++----------------
 1 file changed, 45 insertions(+), 21 deletions(-)

--- 3.14/mm/memory.c	2014-03-30 20:40:15.000000000 -0700
+++ linux/mm/memory.c	2014-04-03 15:26:41.884372480 -0700
@@ -1705,15 +1705,6 @@ long __get_user_pages(struct task_struct
 
 	VM_BUG_ON(!!pages != !!(gup_flags & FOLL_GET));
 
-	/* 
-	 * Require read or write permissions.
-	 * If FOLL_FORCE is set, we only require the "MAY" flags.
-	 */
-	vm_flags  = (gup_flags & FOLL_WRITE) ?
-			(VM_WRITE | VM_MAYWRITE) : (VM_READ | VM_MAYREAD);
-	vm_flags &= (gup_flags & FOLL_FORCE) ?
-			(VM_MAYREAD | VM_MAYWRITE) : (VM_READ | VM_WRITE);
-
 	/*
 	 * If FOLL_FORCE and FOLL_NUMA are both set, handle_mm_fault
 	 * would be called on PROT_NONE ranges. We must never invoke
@@ -1741,7 +1732,7 @@ long __get_user_pages(struct task_struct
 
 			/* user gate pages are read-only */
 			if (gup_flags & FOLL_WRITE)
-				return i ? : -EFAULT;
+				goto efault;
 			if (pg > TASK_SIZE)
 				pgd = pgd_offset_k(pg);
 			else
@@ -1751,12 +1742,12 @@ long __get_user_pages(struct task_struct
 			BUG_ON(pud_none(*pud));
 			pmd = pmd_offset(pud, pg);
 			if (pmd_none(*pmd))
-				return i ? : -EFAULT;
+				goto efault;
 			VM_BUG_ON(pmd_trans_huge(*pmd));
 			pte = pte_offset_map(pmd, pg);
 			if (pte_none(*pte)) {
 				pte_unmap(pte);
-				return i ? : -EFAULT;
+				goto efault;
 			}
 			vma = get_gate_vma(mm);
 			if (pages) {
@@ -1769,7 +1760,7 @@ long __get_user_pages(struct task_struct
 						page = pte_page(*pte);
 					else {
 						pte_unmap(pte);
-						return i ? : -EFAULT;
+						goto efault;
 					}
 				}
 				pages[i] = page;
@@ -1780,10 +1771,42 @@ long __get_user_pages(struct task_struct
 			goto next_page;
 		}
 
-		if (!vma ||
-		    (vma->vm_flags & (VM_IO | VM_PFNMAP)) ||
-		    !(vm_flags & vma->vm_flags))
-			return i ? : -EFAULT;
+		if (!vma)
+			goto efault;
+		vm_flags = vma->vm_flags;
+		if (vm_flags & (VM_IO | VM_PFNMAP))
+			goto efault;
+
+		if (gup_flags & FOLL_WRITE) {
+			if (!(vm_flags & VM_WRITE)) {
+				if (!(gup_flags & FOLL_FORCE))
+					goto efault;
+				/*
+				 * We used to let the write,force case do COW
+				 * in a VM_MAYWRITE VM_SHARED !VM_WRITE vma, so
+				 * ptrace could set a breakpoint in a read-only
+				 * mapping of an executable, without corrupting
+				 * the file (yet only when that file had been
+				 * opened for writing!).  Anon pages in shared
+				 * mappings are surprising: now just reject it.
+				 */
+				if (!is_cow_mapping(vm_flags)) {
+					WARN_ON_ONCE(vm_flags & VM_MAYWRITE);
+					goto efault;
+				}
+			}
+		} else {
+			if (!(vm_flags & VM_READ)) {
+				if (!(gup_flags & FOLL_FORCE))
+					goto efault;
+				/*
+				 * Is there actually any vma we can reach here
+				 * which does not have VM_MAYREAD set?
+				 */
+				if (!(vm_flags & VM_MAYREAD))
+					goto efault;
+			}
+		}
 
 		if (is_vm_hugetlb_page(vma)) {
 			i = follow_hugetlb_page(mm, vma, pages, vmas,
@@ -1837,7 +1860,7 @@ long __get_user_pages(struct task_struct
 							return -EFAULT;
 					}
 					if (ret & VM_FAULT_SIGBUS)
-						return i ? i : -EFAULT;
+						goto efault;
 					BUG();
 				}
 
@@ -1895,6 +1918,8 @@ next_page:
 		} while (nr_pages && start < vma->vm_end);
 	} while (nr_pages);
 	return i;
+efault:
+	return i ? : -EFAULT;
 }
 EXPORT_SYMBOL(__get_user_pages);
 
@@ -1962,9 +1987,8 @@ int fixup_user_fault(struct task_struct
  * @start:	starting user address
  * @nr_pages:	number of pages from start to pin
  * @write:	whether pages will be written to by the caller
- * @force:	whether to force write access even if user mapping is
- *		readonly. This will result in the page being COWed even
- *		in MAP_SHARED mappings. You do not want this.
+ * @force:	whether to force access even when user mapping is currently
+ *		protected (but never forces write access to shared mapping).
  * @pages:	array that receives pointers to the pages pinned.
  *		Should be at least nr_pages long. Or NULL, if caller
  *		only intends to ensure the pages are faulted in.

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH] mm: get_user_pages(write,force) refuse to COW in shared areas
  2014-04-04  8:28 [PATCH] mm: get_user_pages(write,force) refuse to COW in shared areas Hugh Dickins
@ 2014-04-04 12:32 ` Kirill A. Shutemov
  2014-04-04 14:52   ` Hugh Dickins
  2014-04-24 13:30 ` Oleg Nesterov
  1 sibling, 1 reply; 7+ messages in thread
From: Kirill A. Shutemov @ 2014-04-04 12:32 UTC (permalink / raw)
  To: Hugh Dickins
  Cc: Linus Torvalds, Andrew Morton, Jan Kara, Roland Dreier,
	Oleg Nesterov, Konstantin Khlebnikov, Kirill A. Shutemov,
	Mauro Carvalho Chehab, Omar Ramirez Luna, Inki Dae, linux-kernel,
	linux-mm, linux-rdma, linux-media

On Fri, Apr 04, 2014 at 01:28:22AM -0700, Hugh Dickins wrote:
> get_user_pages(write=1, force=1) has always had odd behaviour on write-
> protected shared mappings: although it demands FMODE_WRITE-access to the
> underlying object (do_mmap_pgoff sets neither VM_SHARED nor VM_MAYWRITE
> without that), it ends up with do_wp_page substituting private anonymous
> Copied-On-Write pages for the shared file pages in the area.
> 
> That was long ago intentional, as a safety measure to prevent ptrace
> setting a breakpoint (or POKETEXT or POKEDATA) from inadvertently
> corrupting the underlying executable.  Yet exec and dynamic loaders
> open the file read-only, and use MAP_PRIVATE rather than MAP_SHARED.
> 
> The traditional odd behaviour still causes surprises and bugs in mm,
> and is probably not what any caller wants - even the comment on the flag
> says "You do not want this" (although it's undoubtedly necessary for
> overriding userspace protections in some contexts, and good when !write).
> 
> Let's stop doing that.  But it would be dangerous to remove the long-
> standing safety at this stage, so just make get_user_pages(write,force)
> fail with EFAULT when applied to a write-protected shared area.
> Infiniband may in future want to force write through to underlying
> object: we can add another FOLL_flag later to enable that if required.
> 
> Odd though the old behaviour was, there is no doubt that we may turn
> out to break userspace with this change, and have to revert it quickly.
> Issue a WARN_ON_ONCE to help debug the changed case (easily triggered
> by userspace, so only once to prevent spamming the logs); and delay a
> few associated cleanups until this change is proved.
> 
> get_user_pages callers who might see trouble from this change:
>   ptrace poking, or writing to /proc/<pid>/mem
>   drivers/infiniband/
>   drivers/media/v4l2-core/
>   drivers/gpu/drm/exynos/exynos_drm_gem.c
>   drivers/staging/tidspbridge/core/tiomap3430.c
> if they ever apply get_user_pages to write-protected shared mappings
> of an object which was opened for writing.
> 
> I went to apply the same change to mm/nommu.c, but retreated.  NOMMU
> has no place for COW, and its VM_flags conventions are not the same:
> I'd be more likely to screw up NOMMU than make an improvement there.
> 
> Suggested-by: Linus Torvalds <torvalds@linux-foundation.org>
> Signed-off-by: Hugh Dickins <hughd@google.com>

There's comment in do_wp_page() which is not true anymore with patch
applied. It should be fixed.

Otherwise, looks good to me:

Acked-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>

-- 
 Kirill A. Shutemov

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH] mm: get_user_pages(write,force) refuse to COW in shared areas
  2014-04-04 12:32 ` Kirill A. Shutemov
@ 2014-04-04 14:52   ` Hugh Dickins
  0 siblings, 0 replies; 7+ messages in thread
From: Hugh Dickins @ 2014-04-04 14:52 UTC (permalink / raw)
  To: Kirill A. Shutemov
  Cc: Hugh Dickins, Linus Torvalds, Andrew Morton, Jan Kara,
	Roland Dreier, Oleg Nesterov, Konstantin Khlebnikov,
	Kirill A. Shutemov, Mauro Carvalho Chehab, Omar Ramirez Luna,
	Inki Dae, linux-kernel, linux-mm, linux-rdma, linux-media

On Fri, 4 Apr 2014, Kirill A. Shutemov wrote:
> 
> There's comment in do_wp_page() which is not true anymore with patch
> applied. It should be fixed.

The * Only catch write-faults on shared writable pages,
    * read-only shared pages can get COWed by
    * get_user_pages(.write=1, .force=1).

Yes, I went back and forth on that: I found it difficult to remove that
comment without also simplifying the VM_WRITE|VM_SHARED test immediately
above it, possibly even looking again at the ordering of those tests.

In the end I decided to leave changing it to when we do the other
little cleanups outside get_user_pages(), after it's become clear
whether the new EFAULT is troublesome or not.  Most of my testing
had been without any change in do_wp_page(), so I left that out.

> 
> Otherwise, looks good to me:
> 
> Acked-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>

Thanks,
Hugh

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH] mm: get_user_pages(write,force) refuse to COW in shared areas
  2014-04-04  8:28 [PATCH] mm: get_user_pages(write,force) refuse to COW in shared areas Hugh Dickins
  2014-04-04 12:32 ` Kirill A. Shutemov
@ 2014-04-24 13:30 ` Oleg Nesterov
  2014-04-24 23:14   ` Hugh Dickins
  1 sibling, 1 reply; 7+ messages in thread
From: Oleg Nesterov @ 2014-04-24 13:30 UTC (permalink / raw)
  To: Hugh Dickins
  Cc: Linus Torvalds, Andrew Morton, Jan Kara, Roland Dreier,
	Konstantin Khlebnikov, Kirill A. Shutemov, Mauro Carvalho Chehab,
	Omar Ramirez Luna, Inki Dae, linux-kernel, linux-mm, linux-rdma,
	linux-media

Hi Hugh,

Sorry for late reply. First of all, to avoid the confusion, I think the
patch is fine.

When I saw this patch I decided that uprobes should be updated accordingly,
but I just realized that I do not understand what should I write in the
changelog.

On 04/04, Hugh Dickins wrote:
>
> +		if (gup_flags & FOLL_WRITE) {
> +			if (!(vm_flags & VM_WRITE)) {
> +				if (!(gup_flags & FOLL_FORCE))
> +					goto efault;
> +				/*
> +				 * We used to let the write,force case do COW
> +				 * in a VM_MAYWRITE VM_SHARED !VM_WRITE vma, so
> +				 * ptrace could set a breakpoint in a read-only
> +				 * mapping of an executable, without corrupting
> +				 * the file (yet only when that file had been
> +				 * opened for writing!).  Anon pages in shared
> +				 * mappings are surprising: now just reject it.
> +				 */
> +				if (!is_cow_mapping(vm_flags)) {
> +					WARN_ON_ONCE(vm_flags & VM_MAYWRITE);
> +					goto efault;
> +				}

OK. But could you please clarify "Anon pages in shared mappings are surprising" ?
I mean, does this only apply to "VM_MAYWRITE VM_SHARED !VM_WRITE vma" mentioned
above or this is bad even if a !FMODE_WRITE file was mmaped as MAP_SHARED ?

Yes, in this case this vma is not VM_SHARED and it is not VM_MAYWRITE, it is only
VM_MAYSHARE. This is in fact private mapping except mprotect(PROT_WRITE) will not
work.

But with or without this patch gup(FOLL_WRITE | FOLL_FORCE) won't work in this case,
(although perhaps it could ?), is_cow_mapping() == F because of !VM_MAYWRITE.

However, currently uprobes assumes that a cowed anon page is fine in this case, and
this differs from gup().

So, what do you think about the patch below? It is probably fine in any case,
but is there any "strong" reason to follow the gup's behaviour and forbid the
anon page in VM_MAYSHARE && !VM_MAYWRITE vma?

Oleg.

--- x/kernel/events/uprobes.c
+++ x/kernel/events/uprobes.c
@@ -127,12 +127,13 @@ struct xol_area {
  */
 static bool valid_vma(struct vm_area_struct *vma, bool is_register)
 {
-	vm_flags_t flags = VM_HUGETLB | VM_MAYEXEC | VM_SHARED;
+	vm_flags_t flags = VM_HUGETLB | VM_MAYEXEC;
 
 	if (is_register)
 		flags |= VM_WRITE;
 
-	return vma->vm_file && (vma->vm_flags & flags) == VM_MAYEXEC;
+	return 	vma->vm_file && is_cow_mapping(vma->vm_flags) &&
+		(vma->vm_flags & flags) == VM_MAYEXEC;
 }
 
 static unsigned long offset_to_vaddr(struct vm_area_struct *vma, loff_t offset)


^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH] mm: get_user_pages(write,force) refuse to COW in shared areas
  2014-04-24 13:30 ` Oleg Nesterov
@ 2014-04-24 23:14   ` Hugh Dickins
  2014-04-25 19:09     ` Oleg Nesterov
  0 siblings, 1 reply; 7+ messages in thread
From: Hugh Dickins @ 2014-04-24 23:14 UTC (permalink / raw)
  To: Oleg Nesterov
  Cc: Linus Torvalds, Andrew Morton, Jan Kara, Roland Dreier,
	Konstantin Khlebnikov, Kirill A. Shutemov, Mauro Carvalho Chehab,
	Omar Ramirez Luna, Inki Dae, linux-kernel, linux-mm, linux-rdma,
	linux-media

On Thu, 24 Apr 2014, Oleg Nesterov wrote:

> Hi Hugh,
> 
> Sorry for late reply. First of all, to avoid the confusion, I think the
> patch is fine.
> 
> When I saw this patch I decided that uprobes should be updated accordingly,
> but I just realized that I do not understand what should I write in the
> changelog.

Thanks a lot for considering similar issues in uprobes, Oleg: I merely
checked that its uses of get_user_pages() would not be problematic,
and didn't look around to rediscover the worrying mm business that
goes on down there in kernel/events.

> 
> On 04/04, Hugh Dickins wrote:
> >
> > +		if (gup_flags & FOLL_WRITE) {
> > +			if (!(vm_flags & VM_WRITE)) {
> > +				if (!(gup_flags & FOLL_FORCE))
> > +					goto efault;
> > +				/*
> > +				 * We used to let the write,force case do COW
> > +				 * in a VM_MAYWRITE VM_SHARED !VM_WRITE vma, so
> > +				 * ptrace could set a breakpoint in a read-only
> > +				 * mapping of an executable, without corrupting
> > +				 * the file (yet only when that file had been
> > +				 * opened for writing!).  Anon pages in shared
> > +				 * mappings are surprising: now just reject it.
> > +				 */
> > +				if (!is_cow_mapping(vm_flags)) {
> > +					WARN_ON_ONCE(vm_flags & VM_MAYWRITE);
> > +					goto efault;
> > +				}
> 
> OK. But could you please clarify "Anon pages in shared mappings are surprising" ?
> I mean, does this only apply to "VM_MAYWRITE VM_SHARED !VM_WRITE vma" mentioned
> above or this is bad even if a !FMODE_WRITE file was mmaped as MAP_SHARED ?

Good question. I simply didn't consider that - and (as you have realized)
didn't need to consider it, because I was just stopping the problematic
behaviour in gup(), and didn't need to consider whether other behaviour
prohibited by gup() was actually unproblematic.

> 
> Yes, in this case this vma is not VM_SHARED and it is not VM_MAYWRITE, it is only
> VM_MAYSHARE. This is in fact private mapping except mprotect(PROT_WRITE) will not
> work.
> 
> But with or without this patch gup(FOLL_WRITE | FOLL_FORCE) won't work in this case,
                     "this" meaning my patch rather than yours below
> (although perhaps it could ?), is_cow_mapping() == F because of !VM_MAYWRITE.
> 
> However, currently uprobes assumes that a cowed anon page is fine in this case, and
> this differs from gup().
> 
> So, what do you think about the patch below? It is probably fine in any case,
> but is there any "strong" reason to follow the gup's behaviour and forbid the
> anon page in VM_MAYSHARE && !VM_MAYWRITE vma?

I don't think there is a "strong" reason to forbid it.

The strongest reason is simply that it's much safer if uprobes follows
the same conventions as mm, and get_user_pages() happens to have
forbidden that all along.

The philosophical reason to forbid it is that the user mmapped with
MAP_SHARED, and it's merely a kernel-internal detail that we flip off
VM_SHARED and treat these read-only shared mappings very much like
private mappings.  The user asked for MAP_SHARED, and we prefer to
respect that by not letting private COWs creep in.

We could treat those mappings even more like private mappings, and
allow the COWs; but better to be strict about it, so long as doing
so doesn't give you regressions.

> 
> Oleg.
> 
> --- x/kernel/events/uprobes.c
> +++ x/kernel/events/uprobes.c
> @@ -127,12 +127,13 @@ struct xol_area {
>   */
>  static bool valid_vma(struct vm_area_struct *vma, bool is_register)
>  {
> -	vm_flags_t flags = VM_HUGETLB | VM_MAYEXEC | VM_SHARED;
> +	vm_flags_t flags = VM_HUGETLB | VM_MAYEXEC;

I think a one-line patch changing VM_SHARED to VM_MAYSHARE would do it,
wouldn't it?  And save you from having to export is_cow_mapping()
from mm/memory.c.  (I used is_cow_mapping() because I had to make the
test more complex anyway, just to exclude the case which had been
oddly handled before.)

Hugh

>  
>  	if (is_register)
>  		flags |= VM_WRITE;
>  
> -	return vma->vm_file && (vma->vm_flags & flags) == VM_MAYEXEC;
> +	return 	vma->vm_file && is_cow_mapping(vma->vm_flags) &&
> +		(vma->vm_flags & flags) == VM_MAYEXEC;
>  }
>  
>  static unsigned long offset_to_vaddr(struct vm_area_struct *vma, loff_t offset)

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH] mm: get_user_pages(write,force) refuse to COW in shared areas
  2014-04-24 23:14   ` Hugh Dickins
@ 2014-04-25 19:09     ` Oleg Nesterov
  2014-04-25 20:07       ` Hugh Dickins
  0 siblings, 1 reply; 7+ messages in thread
From: Oleg Nesterov @ 2014-04-25 19:09 UTC (permalink / raw)
  To: Hugh Dickins
  Cc: Linus Torvalds, Andrew Morton, Jan Kara, Roland Dreier,
	Konstantin Khlebnikov, Kirill A. Shutemov, Mauro Carvalho Chehab,
	Omar Ramirez Luna, Inki Dae, linux-kernel, linux-mm, linux-rdma,
	linux-media

On 04/24, Hugh Dickins wrote:
>
> On Thu, 24 Apr 2014, Oleg Nesterov wrote:
>
> > So, what do you think about the patch below? It is probably fine in any case,
> > but is there any "strong" reason to follow the gup's behaviour and forbid the
> > anon page in VM_MAYSHARE && !VM_MAYWRITE vma?
>
> I don't think there is a "strong" reason to forbid it.
>
> The strongest reason is simply that it's much safer if uprobes follows
> the same conventions as mm, and get_user_pages() happens to have
> forbidden that all along.
>
> The philosophical reason to forbid it is that the user mmapped with
> MAP_SHARED, and it's merely a kernel-internal detail that we flip off
> VM_SHARED and treat these read-only shared mappings very much like
> private mappings.  The user asked for MAP_SHARED, and we prefer to
> respect that by not letting private COWs creep in.
>
> We could treat those mappings even more like private mappings, and
> allow the COWs; but better to be strict about it, so long as doing
> so doesn't give you regressions.

Great, thanks a lot! I was worried I missed something subtle.

And I forgot to mention, there is another reason why I would like to
change uprobes to follow the same convention. I still think it would
be better to kill __replace_page() and use gup(FOLL_WRITE | FORCE)
in uprobe_write_opcode().


> > --- x/kernel/events/uprobes.c
> > +++ x/kernel/events/uprobes.c
> > @@ -127,12 +127,13 @@ struct xol_area {
> >   */
> >  static bool valid_vma(struct vm_area_struct *vma, bool is_register)
> >  {
> > -	vm_flags_t flags = VM_HUGETLB | VM_MAYEXEC | VM_SHARED;
> > +	vm_flags_t flags = VM_HUGETLB | VM_MAYEXEC;
>
> I think a one-line patch changing VM_SHARED to VM_MAYSHARE would do it,
> wouldn't it?  And save you from having to export is_cow_mapping()
> from mm/memory.c.  (I used is_cow_mapping() because I had to make the
> test more complex anyway, just to exclude the case which had been
> oddly handled before.)

Indeed, thanks!

Oleg.


^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH] mm: get_user_pages(write,force) refuse to COW in shared areas
  2014-04-25 19:09     ` Oleg Nesterov
@ 2014-04-25 20:07       ` Hugh Dickins
  0 siblings, 0 replies; 7+ messages in thread
From: Hugh Dickins @ 2014-04-25 20:07 UTC (permalink / raw)
  To: Oleg Nesterov
  Cc: Linus Torvalds, Andrew Morton, Jan Kara, Roland Dreier,
	Konstantin Khlebnikov, Kirill A. Shutemov, Mauro Carvalho Chehab,
	Omar Ramirez Luna, Inki Dae, linux-kernel, linux-mm, linux-rdma,
	linux-media

On Fri, 25 Apr 2014, Oleg Nesterov wrote:
> 
> And I forgot to mention, there is another reason why I would like to
> change uprobes to follow the same convention. I still think it would
> be better to kill __replace_page() and use gup(FOLL_WRITE | FORCE)
> in uprobe_write_opcode().

Oh, please please do!  I assumed there was a good atomicity reason
for doing it that way, but if you can do it with gup() please do so.

I went off into a little rant about __replace_page() in my reply to you;
but then felt that you had approached with an honest enquiry, and didn't
deserve a rant in response, so I deleted it.

And, of course, I'm conscious that I have from start to finish withheld
my involvement from uprobes, and refused to review that __replace_page()
(beyond insisting that it not be put in a common place for sharing with
KSM, because I just couldn't guarantee it for uprobes).  I'm afraid that
it's much like HWPoison to me, a complexity (nastiness?) way beyond what
I have time to support myself.

Hugh

^ permalink raw reply	[flat|nested] 7+ messages in thread

end of thread, other threads:[~2014-04-25 20:08 UTC | newest]

Thread overview: 7+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2014-04-04  8:28 [PATCH] mm: get_user_pages(write,force) refuse to COW in shared areas Hugh Dickins
2014-04-04 12:32 ` Kirill A. Shutemov
2014-04-04 14:52   ` Hugh Dickins
2014-04-24 13:30 ` Oleg Nesterov
2014-04-24 23:14   ` Hugh Dickins
2014-04-25 19:09     ` Oleg Nesterov
2014-04-25 20:07       ` Hugh Dickins

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).