linux-mm.kvack.org archive mirror
 help / color / mirror / Atom feed
* [PATCH] mm/vmalloc: Check absolute error return from vmap_[p4d|pud|pmd|pte]_range()
@ 2019-06-13  8:12 Anshuman Khandual
  2019-06-13  9:33 ` Roman Penyaev
  0 siblings, 1 reply; 6+ messages in thread
From: Anshuman Khandual @ 2019-06-13  8:12 UTC (permalink / raw)
  To: linux-mm, linux-kernel
  Cc: Anshuman Khandual, Rick Edgecombe, Andrey Ryabinin,
	Mike Rapoport, Roman Gushchin, Michal Hocko, Roman Penyaev,
	Uladzislau Rezki (Sony),
	Andrew Morton

vmap_pte_range() returns an -EBUSY when it encounters a non-empty PTE. But
currently vmap_pmd_range() unifies both -EBUSY and -ENOMEM return code as
-ENOMEM and send it up the call chain which is wrong. Interestingly enough
vmap_page_range_noflush() tests for the absolute error return value from
vmap_p4d_range() but it does not help because -EBUSY has been merged with
-ENOMEM. So all it can return is -ENOMEM. Fix this by testing for absolute
error return from vmap_pmd_range() all the way up to vmap_p4d_range().

Cc: Rick Edgecombe <rick.p.edgecombe@intel.com>
Cc: Andrey Ryabinin <aryabinin@virtuozzo.com>
Cc: Mike Rapoport <rppt@linux.ibm.com>
Cc: Roman Gushchin <guro@fb.com>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Roman Penyaev <rpenyaev@suse.de>
Cc: "Uladzislau Rezki (Sony)" <urezki@gmail.com>
Cc: Andrew Morton <akpm@linux-foundation.org>

Signed-off-by: Anshuman Khandual <anshuman.khandual@arm.com>
---
 mm/vmalloc.c | 18 ++++++++++++------
 1 file changed, 12 insertions(+), 6 deletions(-)

diff --git a/mm/vmalloc.c b/mm/vmalloc.c
index 7350a124524b..6c7dd8df23c3 100644
--- a/mm/vmalloc.c
+++ b/mm/vmalloc.c
@@ -165,14 +165,16 @@ static int vmap_pmd_range(pud_t *pud, unsigned long addr,
 {
 	pmd_t *pmd;
 	unsigned long next;
+	int err = 0;
 
 	pmd = pmd_alloc(&init_mm, pud, addr);
 	if (!pmd)
 		return -ENOMEM;
 	do {
 		next = pmd_addr_end(addr, end);
-		if (vmap_pte_range(pmd, addr, next, prot, pages, nr))
-			return -ENOMEM;
+		err = vmap_pte_range(pmd, addr, next, prot, pages, nr);
+		if (err)
+			return err;
 	} while (pmd++, addr = next, addr != end);
 	return 0;
 }
@@ -182,14 +184,16 @@ static int vmap_pud_range(p4d_t *p4d, unsigned long addr,
 {
 	pud_t *pud;
 	unsigned long next;
+	int err = 0;
 
 	pud = pud_alloc(&init_mm, p4d, addr);
 	if (!pud)
 		return -ENOMEM;
 	do {
 		next = pud_addr_end(addr, end);
-		if (vmap_pmd_range(pud, addr, next, prot, pages, nr))
-			return -ENOMEM;
+		err = vmap_pmd_range(pud, addr, next, prot, pages, nr);
+		if (err)
+			return err;
 	} while (pud++, addr = next, addr != end);
 	return 0;
 }
@@ -199,14 +203,16 @@ static int vmap_p4d_range(pgd_t *pgd, unsigned long addr,
 {
 	p4d_t *p4d;
 	unsigned long next;
+	int err = 0;
 
 	p4d = p4d_alloc(&init_mm, pgd, addr);
 	if (!p4d)
 		return -ENOMEM;
 	do {
 		next = p4d_addr_end(addr, end);
-		if (vmap_pud_range(p4d, addr, next, prot, pages, nr))
-			return -ENOMEM;
+		err = vmap_pud_range(p4d, addr, next, prot, pages, nr);
+		if (err)
+			return err;
 	} while (p4d++, addr = next, addr != end);
 	return 0;
 }
-- 
2.20.1


^ permalink raw reply related	[flat|nested] 6+ messages in thread

* Re: [PATCH] mm/vmalloc: Check absolute error return from vmap_[p4d|pud|pmd|pte]_range()
  2019-06-13  8:12 [PATCH] mm/vmalloc: Check absolute error return from vmap_[p4d|pud|pmd|pte]_range() Anshuman Khandual
@ 2019-06-13  9:33 ` Roman Penyaev
  2019-06-13 15:21   ` Anshuman Khandual
  0 siblings, 1 reply; 6+ messages in thread
From: Roman Penyaev @ 2019-06-13  9:33 UTC (permalink / raw)
  To: Anshuman Khandual
  Cc: linux-mm, linux-kernel, Rick Edgecombe, Andrey Ryabinin,
	Mike Rapoport, Roman Gushchin, Michal Hocko,
	Uladzislau Rezki (Sony),
	Andrew Morton

On 2019-06-13 10:12, Anshuman Khandual wrote:
> vmap_pte_range() returns an -EBUSY when it encounters a non-empty PTE. 
> But
> currently vmap_pmd_range() unifies both -EBUSY and -ENOMEM return code 
> as
> -ENOMEM and send it up the call chain which is wrong. Interestingly 
> enough
> vmap_page_range_noflush() tests for the absolute error return value 
> from
> vmap_p4d_range() but it does not help because -EBUSY has been merged 
> with
> -ENOMEM. So all it can return is -ENOMEM. Fix this by testing for 
> absolute
> error return from vmap_pmd_range() all the way up to vmap_p4d_range().

I could not find any real external caller of vmap API who really cares
about the errno, and frankly why they should?  This is allocation path,
allocation failed - game over.  When you step on -EBUSY case something
has gone completely wrong in your kernel, you get a big warning in
your dmesg and it is already does not matter what errno you get.

--
Roman


^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [PATCH] mm/vmalloc: Check absolute error return from vmap_[p4d|pud|pmd|pte]_range()
  2019-06-13  9:33 ` Roman Penyaev
@ 2019-06-13 15:21   ` Anshuman Khandual
  2019-06-13 15:31     ` Matthew Wilcox
  0 siblings, 1 reply; 6+ messages in thread
From: Anshuman Khandual @ 2019-06-13 15:21 UTC (permalink / raw)
  To: Roman Penyaev
  Cc: linux-mm, linux-kernel, Rick Edgecombe, Andrey Ryabinin,
	Mike Rapoport, Roman Gushchin, Michal Hocko,
	Uladzislau Rezki (Sony),
	Andrew Morton



On 06/13/2019 03:03 PM, Roman Penyaev wrote:
> On 2019-06-13 10:12, Anshuman Khandual wrote:
>> vmap_pte_range() returns an -EBUSY when it encounters a non-empty PTE. But
>> currently vmap_pmd_range() unifies both -EBUSY and -ENOMEM return code as
>> -ENOMEM and send it up the call chain which is wrong. Interestingly enough
>> vmap_page_range_noflush() tests for the absolute error return value from
>> vmap_p4d_range() but it does not help because -EBUSY has been merged with
>> -ENOMEM. So all it can return is -ENOMEM. Fix this by testing for absolute
>> error return from vmap_pmd_range() all the way up to vmap_p4d_range().
> 
> I could not find any real external caller of vmap API who really cares
> about the errno, and frankly why they should?  This is allocation path,

map_vm_area() which is an exported symbol suppose to provide the right
error code regardless whether it's current users care for it or not.

> allocation failed - game over.  When you step on -EBUSY case something
> has gone completely wrong in your kernel, you get a big warning in
> your dmesg and it is already does not matter what errno you get.

Its true that vmap_pte_range() does warn during error conditions. But if
we really dont care about error return code then we should just remove
specific error details (ENOMEM/EBUSY) and instead replace them with simple
boolean false/true or (0/1/-1) return values at each level. Will that be
acceptable ? What we have currently is wrong where vmap_pmd_range() could
just wrap EBUSY as ENOMEM and send up the call chain.


^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [PATCH] mm/vmalloc: Check absolute error return from vmap_[p4d|pud|pmd|pte]_range()
  2019-06-13 15:21   ` Anshuman Khandual
@ 2019-06-13 15:31     ` Matthew Wilcox
  2019-06-14  5:27       ` Anshuman Khandual
  0 siblings, 1 reply; 6+ messages in thread
From: Matthew Wilcox @ 2019-06-13 15:31 UTC (permalink / raw)
  To: Anshuman Khandual
  Cc: Roman Penyaev, linux-mm, linux-kernel, Rick Edgecombe,
	Andrey Ryabinin, Mike Rapoport, Roman Gushchin, Michal Hocko,
	Uladzislau Rezki (Sony),
	Andrew Morton

On Thu, Jun 13, 2019 at 08:51:17PM +0530, Anshuman Khandual wrote:
> acceptable ? What we have currently is wrong where vmap_pmd_range() could
> just wrap EBUSY as ENOMEM and send up the call chain.

It's not wrong.  We do it in lots of places.  Unless there's a caller
which really needs to know the difference, it's often better than
returning the "real error".


^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [PATCH] mm/vmalloc: Check absolute error return from vmap_[p4d|pud|pmd|pte]_range()
  2019-06-13 15:31     ` Matthew Wilcox
@ 2019-06-14  5:27       ` Anshuman Khandual
  2019-06-14 13:42         ` Matthew Wilcox
  0 siblings, 1 reply; 6+ messages in thread
From: Anshuman Khandual @ 2019-06-14  5:27 UTC (permalink / raw)
  To: Matthew Wilcox
  Cc: Roman Penyaev, linux-mm, linux-kernel, Rick Edgecombe,
	Andrey Ryabinin, Mike Rapoport, Roman Gushchin, Michal Hocko,
	Uladzislau Rezki (Sony),
	Andrew Morton



On 06/13/2019 09:01 PM, Matthew Wilcox wrote:
> On Thu, Jun 13, 2019 at 08:51:17PM +0530, Anshuman Khandual wrote:
>> acceptable ? What we have currently is wrong where vmap_pmd_range() could
>> just wrap EBUSY as ENOMEM and send up the call chain.
> 
> It's not wrong.  We do it in lots of places.  Unless there's a caller
> which really needs to know the difference, it's often better than
> returning the "real error".

I can understand the fact that because there are no active users of this
return code, the current situation has been alright. But then I fail to
understand how can EBUSY be made ENOMEM and let the caller to think that
vmap_page_rage() failed because of lack of memory when it is clearly not
the case. It is really surprising how it can be acceptable inside kernel
(init_mm) page table functions which need to be thorough enough.


^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [PATCH] mm/vmalloc: Check absolute error return from vmap_[p4d|pud|pmd|pte]_range()
  2019-06-14  5:27       ` Anshuman Khandual
@ 2019-06-14 13:42         ` Matthew Wilcox
  0 siblings, 0 replies; 6+ messages in thread
From: Matthew Wilcox @ 2019-06-14 13:42 UTC (permalink / raw)
  To: Anshuman Khandual
  Cc: Roman Penyaev, linux-mm, linux-kernel, Rick Edgecombe,
	Andrey Ryabinin, Mike Rapoport, Roman Gushchin, Michal Hocko,
	Uladzislau Rezki (Sony),
	Andrew Morton

On Fri, Jun 14, 2019 at 10:57:42AM +0530, Anshuman Khandual wrote:
> 
> 
> On 06/13/2019 09:01 PM, Matthew Wilcox wrote:
> > On Thu, Jun 13, 2019 at 08:51:17PM +0530, Anshuman Khandual wrote:
> >> acceptable ? What we have currently is wrong where vmap_pmd_range() could
> >> just wrap EBUSY as ENOMEM and send up the call chain.
> > 
> > It's not wrong.  We do it in lots of places.  Unless there's a caller
> > which really needs to know the difference, it's often better than
> > returning the "real error".
> 
> I can understand the fact that because there are no active users of this
> return code, the current situation has been alright. But then I fail to
> understand how can EBUSY be made ENOMEM and let the caller to think that
> vmap_page_rage() failed because of lack of memory when it is clearly not
> the case. It is really surprising how it can be acceptable inside kernel
> (init_mm) page table functions which need to be thorough enough.

It's a corollary of Steinbach's Guideline for Systems Programming.
There is no possible way to handle this error because this error is
never supposed to happen.  So we may as well return a different error
that will still lead to the caller doing the right thing.


^ permalink raw reply	[flat|nested] 6+ messages in thread

end of thread, other threads:[~2019-06-14 13:42 UTC | newest]

Thread overview: 6+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2019-06-13  8:12 [PATCH] mm/vmalloc: Check absolute error return from vmap_[p4d|pud|pmd|pte]_range() Anshuman Khandual
2019-06-13  9:33 ` Roman Penyaev
2019-06-13 15:21   ` Anshuman Khandual
2019-06-13 15:31     ` Matthew Wilcox
2019-06-14  5:27       ` Anshuman Khandual
2019-06-14 13:42         ` Matthew Wilcox

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).