linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* ENOSPC returned by handle_mm_fault()
@ 2011-06-05 13:43 Al Viro
  2011-06-05 19:16 ` Hugh Dickins
  0 siblings, 1 reply; 7+ messages in thread
From: Al Viro @ 2011-06-05 13:43 UTC (permalink / raw)
  To: linux-mm; +Cc: Mel Gorman, linux-kernel

	When alloc_huge_page() runs afoul of quota, it returns ERR_PTR(-ENOSPC).
Callers do not expect that - hugetlb_cow() returns ENOSPC if it gets that
and so does hugetlb_no_page().  Eventually the thing propagates back to
hugetlb_fault() and is returned by it.

	Callers of hugetlb_fault() clearly expect a bitmap of VM_... and
not something from errno.h: one place is 
                        ret = hugetlb_fault(mm, vma, vaddr,
                                (flags & FOLL_WRITE) ? FAULT_FLAG_WRITE : 0);
                        spin_lock(&mm->page_table_lock);
                        if (!(ret & VM_FAULT_ERROR))
                                continue;
and another is handle_mm_fault(), which ends up returning ENOSPC and *its*
callers are definitely not ready to deal with that.

ENOSPC is 28, i.e. VM_FAULT_MAJOR | VM_FAULT_WRITE | VM_FAULT_HWPOISON;
it's also theoretically possible to get ENOMEM if region_chg() ends up
hitting
                nrg = kmalloc(sizeof(*nrg), GFP_KERNEL);
                if (!nrg)
                        return -ENOMEM;
region_chg() <- vma_needs_reservation() <- alloc_huge_page() and from that
point as with ENOSPC.  ENOMEM is 12, i.e. VM_FAULT_MAJOR | VM_FAULT_WRITE...

Am I right assuming that we want VM_FAULT_OOM in both cases?

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: ENOSPC returned by handle_mm_fault()
  2011-06-05 13:43 ENOSPC returned by handle_mm_fault() Al Viro
@ 2011-06-05 19:16 ` Hugh Dickins
  2011-06-05 19:50   ` Al Viro
  0 siblings, 1 reply; 7+ messages in thread
From: Hugh Dickins @ 2011-06-05 19:16 UTC (permalink / raw)
  To: Al Viro; +Cc: linux-mm, Mel Gorman, linux-kernel

On Sun, 5 Jun 2011, Al Viro wrote:
> 	When alloc_huge_page() runs afoul of quota, it returns ERR_PTR(-ENOSPC).
> Callers do not expect that - hugetlb_cow() returns ENOSPC if it gets that
> and so does hugetlb_no_page().  Eventually the thing propagates back to
> hugetlb_fault() and is returned by it.
> 
> 	Callers of hugetlb_fault() clearly expect a bitmap of VM_... and
> not something from errno.h: one place is 
>                         ret = hugetlb_fault(mm, vma, vaddr,
>                                 (flags & FOLL_WRITE) ? FAULT_FLAG_WRITE : 0);
>                         spin_lock(&mm->page_table_lock);
>                         if (!(ret & VM_FAULT_ERROR))
>                                 continue;
> and another is handle_mm_fault(), which ends up returning ENOSPC and *its*
> callers are definitely not ready to deal with that.
> 
> ENOSPC is 28, i.e. VM_FAULT_MAJOR | VM_FAULT_WRITE | VM_FAULT_HWPOISON;
> it's also theoretically possible to get ENOMEM if region_chg() ends up
> hitting
>                 nrg = kmalloc(sizeof(*nrg), GFP_KERNEL);
>                 if (!nrg)
>                         return -ENOMEM;
> region_chg() <- vma_needs_reservation() <- alloc_huge_page() and from that
> point as with ENOSPC.  ENOMEM is 12, i.e. VM_FAULT_MAJOR | VM_FAULT_WRITE...

Good find, news to me.  Interesting uses of -PTR_ERR()!
Looks like we'd better not have more than 12 VM_FAULT_ flags.

> 
> Am I right assuming that we want VM_FAULT_OOM in both cases?

No, where hugetlb_get_quota() fails it should be VM_FAULT_SIGBUS:
there's no excuse to go on an OOM-killing spree just because hugetlb
quota is exhausted.

VM_FAULT_OOM is appropriate where vma_needs_reservation() fails,
because region_chg() couldn't kmalloc a structure, as you point out.

(Though that doesn't matter much, since the only way the kmalloc can
fail is when this task is already selected for OOM-kill - I think.)

Hugh

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: ENOSPC returned by handle_mm_fault()
  2011-06-05 19:16 ` Hugh Dickins
@ 2011-06-05 19:50   ` Al Viro
  2011-06-05 20:48     ` Hugh Dickins
  0 siblings, 1 reply; 7+ messages in thread
From: Al Viro @ 2011-06-05 19:50 UTC (permalink / raw)
  To: Hugh Dickins; +Cc: linux-mm, Mel Gorman, linux-kernel

On Sun, Jun 05, 2011 at 12:16:08PM -0700, Hugh Dickins wrote:

> Good find, news to me.  Interesting uses of -PTR_ERR()!

*snerk*

I've run into a bug where ->open() returned -PTR_ERR(...) on one of the failure
exits and went grepping.  Caught so far:
	* l2tp_debugfs - originally found bug
	* xfs mknod() returning the error with wrong sign if xfs_get_acl()
fails
	* jfs lmLogOpen() - positive error value returned (and propagated
all way be to userland if we'd been doing remount) if block device can't be
opened
	* sunrpc - two bugs of the same kind
	* this one, where the *sign* is right, but mixing E.. with VM_FAULT_..
is not.

Bugs are like mushrooms - found one, look around for more...

> Looks like we'd better not have more than 12 VM_FAULT_ flags.

> > Am I right assuming that we want VM_FAULT_OOM in both cases?
> 
> No, where hugetlb_get_quota() fails it should be VM_FAULT_SIGBUS:
> there's no excuse to go on an OOM-killing spree just because hugetlb
> quota is exhausted.

Good point...

> VM_FAULT_OOM is appropriate where vma_needs_reservation() fails,
> because region_chg() couldn't kmalloc a structure, as you point out.
> 
> (Though that doesn't matter much, since the only way the kmalloc can
> fail is when this task is already selected for OOM-kill - I think.)

You mean, something like the diff below?

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
---
diff --git a/mm/hugetlb.c b/mm/hugetlb.c
index f33bb31..3de23f0 100644
--- a/mm/hugetlb.c
+++ b/mm/hugetlb.c
@@ -125,7 +125,7 @@ static long region_chg(struct list_head *head, long f, long t)
 	if (&rg->link == head || t < rg->from) {
 		nrg = kmalloc(sizeof(*nrg), GFP_KERNEL);
 		if (!nrg)
-			return -ENOMEM;
+			return -VM_FAULT_OOM;
 		nrg->from = f;
 		nrg->to   = f;
 		INIT_LIST_HEAD(&nrg->link);
@@ -1036,7 +1036,7 @@ static struct page *alloc_huge_page(struct vm_area_struct *vma,
 		return ERR_PTR(chg);
 	if (chg)
 		if (hugetlb_get_quota(inode->i_mapping, chg))
-			return ERR_PTR(-ENOSPC);
+			return ERR_PTR(-VM_FAULT_SIGBUS);
 
 	spin_lock(&hugetlb_lock);
 	page = dequeue_huge_page_vma(h, vma, addr, avoid_reserve);

^ permalink raw reply related	[flat|nested] 7+ messages in thread

* Re: ENOSPC returned by handle_mm_fault()
  2011-06-05 19:50   ` Al Viro
@ 2011-06-05 20:48     ` Hugh Dickins
  2011-06-05 22:13       ` Al Viro
  0 siblings, 1 reply; 7+ messages in thread
From: Hugh Dickins @ 2011-06-05 20:48 UTC (permalink / raw)
  To: Al Viro; +Cc: linux-mm, Mel Gorman, linux-kernel

On Sun, 5 Jun 2011, Al Viro wrote:
> On Sun, Jun 05, 2011 at 12:16:08PM -0700, Hugh Dickins wrote:
> 
> > Good find, news to me.  Interesting uses of -PTR_ERR()!
> 
> You mean, something like the diff below?

Second hunk yes, but first hunk no: there's at least one other place
(hugetlb_reserve_pages) which calls region_chg(), and expects a
conventional -errno return from it; and even if there weren't,
I'd rather not spread these unconventional return values any deeper.

Something more like the one at the bottom I think: okay, it's slightly
tacky to assume the nature of the failure from vma_needs_reservation(),
but we already have two places which do make that assumption.

> 
> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
> ---
> diff --git a/mm/hugetlb.c b/mm/hugetlb.c
> index f33bb31..3de23f0 100644
> --- a/mm/hugetlb.c
> +++ b/mm/hugetlb.c
> @@ -125,7 +125,7 @@ static long region_chg(struct list_head *head, long f, long t)
>  	if (&rg->link == head || t < rg->from) {
>  		nrg = kmalloc(sizeof(*nrg), GFP_KERNEL);
>  		if (!nrg)
> -			return -ENOMEM;
> +			return -VM_FAULT_OOM;
>  		nrg->from = f;
>  		nrg->to   = f;
>  		INIT_LIST_HEAD(&nrg->link);
> @@ -1036,7 +1036,7 @@ static struct page *alloc_huge_page(struct vm_area_struct *vma,
>  		return ERR_PTR(chg);
>  	if (chg)
>  		if (hugetlb_get_quota(inode->i_mapping, chg))
> -			return ERR_PTR(-ENOSPC);
> +			return ERR_PTR(-VM_FAULT_SIGBUS);
>  
>  	spin_lock(&hugetlb_lock);
>  	page = dequeue_huge_page_vma(h, vma, addr, avoid_reserve);

Signed-off-by: Hugh Dickins <hughd@google.com
---

 mm/hugetlb.c |    4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

--- 3.0-rc1/mm/hugetlb.c	2011-05-29 18:42:37.425882575 -0700
+++ linux/mm/hugetlb.c	2011-06-05 13:33:22.795341004 -0700
@@ -1033,10 +1033,10 @@ static struct page *alloc_huge_page(stru
 	 */
 	chg = vma_needs_reservation(h, vma, addr);
 	if (chg < 0)
-		return ERR_PTR(chg);
+		return ERR_PTR(-VM_FAULT_OOM);
 	if (chg)
 		if (hugetlb_get_quota(inode->i_mapping, chg))
-			return ERR_PTR(-ENOSPC);
+			return ERR_PTR(-VM_FAULT_SIGBUS);
 
 	spin_lock(&hugetlb_lock);
 	page = dequeue_huge_page_vma(h, vma, addr, avoid_reserve);

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: ENOSPC returned by handle_mm_fault()
  2011-06-05 20:48     ` Hugh Dickins
@ 2011-06-05 22:13       ` Al Viro
  2011-06-06  5:03         ` [PATCH] mm: fix " Hugh Dickins
  0 siblings, 1 reply; 7+ messages in thread
From: Al Viro @ 2011-06-05 22:13 UTC (permalink / raw)
  To: Hugh Dickins; +Cc: linux-mm, Mel Gorman, linux-kernel

On Sun, Jun 05, 2011 at 01:48:55PM -0700, Hugh Dickins wrote:
> On Sun, 5 Jun 2011, Al Viro wrote:
> > On Sun, Jun 05, 2011 at 12:16:08PM -0700, Hugh Dickins wrote:
> > 
> > > Good find, news to me.  Interesting uses of -PTR_ERR()!
> > 
> > You mean, something like the diff below?
> 
> Second hunk yes, but first hunk no: there's at least one other place
> (hugetlb_reserve_pages) which calls region_chg(), and expects a
> conventional -errno return from it; and even if there weren't,
> I'd rather not spread these unconventional return values any deeper.

Umm...  FWIW, callers of hugetlb_reserve_pages() only check if it's 0;
exact value is lost.  But yes, I agree that your variant makes more
sense - they might start caring at some point.

> Signed-off-by: Hugh Dickins <hughd@google.com
Acked-by: Al Viro <viro@zeniv.linux.org.uk>

>  mm/hugetlb.c |    4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
> 
> --- 3.0-rc1/mm/hugetlb.c	2011-05-29 18:42:37.425882575 -0700
> +++ linux/mm/hugetlb.c	2011-06-05 13:33:22.795341004 -0700
> @@ -1033,10 +1033,10 @@ static struct page *alloc_huge_page(stru
>  	 */
>  	chg = vma_needs_reservation(h, vma, addr);
>  	if (chg < 0)
> -		return ERR_PTR(chg);
> +		return ERR_PTR(-VM_FAULT_OOM);
>  	if (chg)
>  		if (hugetlb_get_quota(inode->i_mapping, chg))
> -			return ERR_PTR(-ENOSPC);
> +			return ERR_PTR(-VM_FAULT_SIGBUS);
>  
>  	spin_lock(&hugetlb_lock);
>  	page = dequeue_huge_page_vma(h, vma, addr, avoid_reserve);

^ permalink raw reply	[flat|nested] 7+ messages in thread

* [PATCH] mm: fix ENOSPC returned by handle_mm_fault()
  2011-06-05 22:13       ` Al Viro
@ 2011-06-06  5:03         ` Hugh Dickins
  2011-06-07  9:57           ` Mel Gorman
  0 siblings, 1 reply; 7+ messages in thread
From: Hugh Dickins @ 2011-06-06  5:03 UTC (permalink / raw)
  To: Andrew Morton; +Cc: Linus Torvalds, Al Viro, Mel Gorman, linux-kernel, linux-mm

Al Viro observes that in the hugetlb case, handle_mm_fault() may return
a value of the kind ENOSPC when its caller is expecting a value of the
kind VM_FAULT_SIGBUS: fix alloc_huge_page()'s failure returns.

Signed-off-by: Hugh Dickins <hughd@google.com>
Acked-by: Al Viro <viro@zeniv.linux.org.uk>
Cc: stable@kernel.org
---

 mm/hugetlb.c |    4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

--- 3.0-rc1/mm/hugetlb.c	2011-05-29 18:42:37.425882575 -0700
+++ linux/mm/hugetlb.c	2011-06-05 13:33:22.795341004 -0700
@@ -1033,10 +1033,10 @@ static struct page *alloc_huge_page(stru
 	 */
 	chg = vma_needs_reservation(h, vma, addr);
 	if (chg < 0)
-		return ERR_PTR(chg);
+		return ERR_PTR(-VM_FAULT_OOM);
 	if (chg)
 		if (hugetlb_get_quota(inode->i_mapping, chg))
-			return ERR_PTR(-ENOSPC);
+			return ERR_PTR(-VM_FAULT_SIGBUS);
 
 	spin_lock(&hugetlb_lock);
 	page = dequeue_huge_page_vma(h, vma, addr, avoid_reserve);

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH] mm: fix ENOSPC returned by handle_mm_fault()
  2011-06-06  5:03         ` [PATCH] mm: fix " Hugh Dickins
@ 2011-06-07  9:57           ` Mel Gorman
  0 siblings, 0 replies; 7+ messages in thread
From: Mel Gorman @ 2011-06-07  9:57 UTC (permalink / raw)
  To: Hugh Dickins
  Cc: Andrew Morton, Linus Torvalds, Al Viro, linux-kernel, linux-mm

On Sun, Jun 05, 2011 at 10:03:13PM -0700, Hugh Dickins wrote:
> Al Viro observes that in the hugetlb case, handle_mm_fault() may return
> a value of the kind ENOSPC when its caller is expecting a value of the
> kind VM_FAULT_SIGBUS: fix alloc_huge_page()'s failure returns.
> 
> Signed-off-by: Hugh Dickins <hughd@google.com>
> Acked-by: Al Viro <viro@zeniv.linux.org.uk>
> Cc: stable@kernel.org

Nicely spotted!

Acked-by: Mel Gorman <mel@csn.ul.ie>

-- 
Mel Gorman
SUSE Labs

^ permalink raw reply	[flat|nested] 7+ messages in thread

end of thread, other threads:[~2011-06-07  9:57 UTC | newest]

Thread overview: 7+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2011-06-05 13:43 ENOSPC returned by handle_mm_fault() Al Viro
2011-06-05 19:16 ` Hugh Dickins
2011-06-05 19:50   ` Al Viro
2011-06-05 20:48     ` Hugh Dickins
2011-06-05 22:13       ` Al Viro
2011-06-06  5:03         ` [PATCH] mm: fix " Hugh Dickins
2011-06-07  9:57           ` Mel Gorman

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).