All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH] mm,hwpoison: Return -EBUSY when migration fails
@ 2020-12-09  9:28 Oscar Salvador
  2020-12-09  9:57 ` HORIGUCHI NAOYA(堀口 直也)
                   ` (2 more replies)
  0 siblings, 3 replies; 6+ messages in thread
From: Oscar Salvador @ 2020-12-09  9:28 UTC (permalink / raw)
  To: akpm; +Cc: n-horiguchi, vbabka, linux-mm, linux-kernel, Oscar Salvador

Currently, we return -EIO when we fail to migrate the page.

Migrations' failures are rather transient as they can happen due to
several reasons, e.g: high page refcount bump, mapping->migrate_page
failing etc.
All meaning that at that time the page could not be migrated, but
that has nothing to do with an EIO error.

Let us return -EBUSY instead, as we do in case we failed to isolate
the page.

While are it, let us remove the "ret" print as its value does not change.

Signed-off-by: Oscar Salvador <osalvador@suse.de>
---
 mm/memory-failure.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/mm/memory-failure.c b/mm/memory-failure.c
index 428991e297e2..1942fb83ac64 100644
--- a/mm/memory-failure.c
+++ b/mm/memory-failure.c
@@ -1849,11 +1849,11 @@ static int __soft_offline_page(struct page *page)
 			pr_info("soft offline: %#lx: %s migration failed %d, type %lx (%pGp)\n",
 				pfn, msg_page[huge], ret, page->flags, &page->flags);
 			if (ret > 0)
-				ret = -EIO;
+				ret = -EBUSY;
 		}
 	} else {
-		pr_info("soft offline: %#lx: %s isolation failed: %d, page count %d, type %lx (%pGp)\n",
-			pfn, msg_page[huge], ret, page_count(page), page->flags, &page->flags);
+		pr_info("soft offline: %#lx: %s isolation failed, page count %d, type %lx (%pGp)\n",
+			pfn, msg_page[huge], page_count(page), page->flags, &page->flags);
 		ret = -EBUSY;
 	}
 	return ret;
-- 
2.26.2


^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [PATCH] mm,hwpoison: Return -EBUSY when migration fails
  2020-12-09  9:28 [PATCH] mm,hwpoison: Return -EBUSY when migration fails Oscar Salvador
@ 2020-12-09  9:57 ` HORIGUCHI NAOYA(堀口 直也)
  2020-12-09  9:59 ` David Hildenbrand
  2020-12-09 10:25 ` Vlastimil Babka
  2 siblings, 0 replies; 6+ messages in thread
From: HORIGUCHI NAOYA(堀口 直也) @ 2020-12-09  9:57 UTC (permalink / raw)
  To: Oscar Salvador; +Cc: akpm, n-horiguchi, vbabka, linux-mm, linux-kernel

On Wed, Dec 09, 2020 at 10:28:18AM +0100, Oscar Salvador wrote:
> Currently, we return -EIO when we fail to migrate the page.
> 
> Migrations' failures are rather transient as they can happen due to
> several reasons, e.g: high page refcount bump, mapping->migrate_page
> failing etc.
> All meaning that at that time the page could not be migrated, but
> that has nothing to do with an EIO error.
> 
> Let us return -EBUSY instead, as we do in case we failed to isolate
> the page.
> 
> While are it, let us remove the "ret" print as its value does not change.
> 
> Signed-off-by: Oscar Salvador <osalvador@suse.de>

Acked-by: Naoya Horiguchi <naoya.horiguchi@nec.com>

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [PATCH] mm,hwpoison: Return -EBUSY when migration fails
  2020-12-09  9:28 [PATCH] mm,hwpoison: Return -EBUSY when migration fails Oscar Salvador
  2020-12-09  9:57 ` HORIGUCHI NAOYA(堀口 直也)
@ 2020-12-09  9:59 ` David Hildenbrand
  2020-12-09 10:36   ` Oscar Salvador
  2020-12-09 10:25 ` Vlastimil Babka
  2 siblings, 1 reply; 6+ messages in thread
From: David Hildenbrand @ 2020-12-09  9:59 UTC (permalink / raw)
  To: Oscar Salvador, akpm; +Cc: n-horiguchi, vbabka, linux-mm, linux-kernel

On 09.12.20 10:28, Oscar Salvador wrote:
> Currently, we return -EIO when we fail to migrate the page.
> 
> Migrations' failures are rather transient as they can happen due to
> several reasons, e.g: high page refcount bump, mapping->migrate_page
> failing etc.
> All meaning that at that time the page could not be migrated, but
> that has nothing to do with an EIO error.
> 
> Let us return -EBUSY instead, as we do in case we failed to isolate
> the page.
> 
> While are it, let us remove the "ret" print as its value does not change.
> 
> Signed-off-by: Oscar Salvador <osalvador@suse.de>
> ---
>  mm/memory-failure.c | 6 +++---
>  1 file changed, 3 insertions(+), 3 deletions(-)
> 
> diff --git a/mm/memory-failure.c b/mm/memory-failure.c
> index 428991e297e2..1942fb83ac64 100644
> --- a/mm/memory-failure.c
> +++ b/mm/memory-failure.c
> @@ -1849,11 +1849,11 @@ static int __soft_offline_page(struct page *page)
>  			pr_info("soft offline: %#lx: %s migration failed %d, type %lx (%pGp)\n",
>  				pfn, msg_page[huge], ret, page->flags, &page->flags);
>  			if (ret > 0)
> -				ret = -EIO;
> +				ret = -EBUSY;

Do we expect callers to retry immediately? -EAGAIN might make also
sense. But -EBUSY is an obvious improvement. Do we have callers relying
on this behavior?


-- 
Thanks,

David / dhildenb


^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [PATCH] mm,hwpoison: Return -EBUSY when migration fails
  2020-12-09  9:28 [PATCH] mm,hwpoison: Return -EBUSY when migration fails Oscar Salvador
  2020-12-09  9:57 ` HORIGUCHI NAOYA(堀口 直也)
  2020-12-09  9:59 ` David Hildenbrand
@ 2020-12-09 10:25 ` Vlastimil Babka
  2020-12-09 10:38   ` Oscar Salvador
  2 siblings, 1 reply; 6+ messages in thread
From: Vlastimil Babka @ 2020-12-09 10:25 UTC (permalink / raw)
  To: Oscar Salvador, akpm; +Cc: n-horiguchi, linux-mm, linux-kernel, Linux API

On 12/9/20 10:28 AM, Oscar Salvador wrote:
> Currently, we return -EIO when we fail to migrate the page.
> 
> Migrations' failures are rather transient as they can happen due to
> several reasons, e.g: high page refcount bump, mapping->migrate_page
> failing etc.
> All meaning that at that time the page could not be migrated, but
> that has nothing to do with an EIO error.
> 
> Let us return -EBUSY instead, as we do in case we failed to isolate
> the page.
> 
> While are it, let us remove the "ret" print as its value does not change.
> 
> Signed-off-by: Oscar Salvador <osalvador@suse.de>

Acked-by: Vlastimil Babka <vbabka@suse.cz>

Technically this affects madvise(2) so let's cc linux-api. The manpage doesn't
document error codes of MADV_HWPOISON and MADV_SOFT_OFFLINE (besides EPERM)
though so nothing to adjust there. It's meant only for the hwpoison testing
suite anyway.

> ---
>  mm/memory-failure.c | 6 +++---
>  1 file changed, 3 insertions(+), 3 deletions(-)
> 
> diff --git a/mm/memory-failure.c b/mm/memory-failure.c
> index 428991e297e2..1942fb83ac64 100644
> --- a/mm/memory-failure.c
> +++ b/mm/memory-failure.c
> @@ -1849,11 +1849,11 @@ static int __soft_offline_page(struct page *page)
>  			pr_info("soft offline: %#lx: %s migration failed %d, type %lx (%pGp)\n",
>  				pfn, msg_page[huge], ret, page->flags, &page->flags);
>  			if (ret > 0)
> -				ret = -EIO;
> +				ret = -EBUSY;
>  		}
>  	} else {
> -		pr_info("soft offline: %#lx: %s isolation failed: %d, page count %d, type %lx (%pGp)\n",
> -			pfn, msg_page[huge], ret, page_count(page), page->flags, &page->flags);
> +		pr_info("soft offline: %#lx: %s isolation failed, page count %d, type %lx (%pGp)\n",
> +			pfn, msg_page[huge], page_count(page), page->flags, &page->flags);
>  		ret = -EBUSY;
>  	}
>  	return ret;
> 


^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [PATCH] mm,hwpoison: Return -EBUSY when migration fails
  2020-12-09  9:59 ` David Hildenbrand
@ 2020-12-09 10:36   ` Oscar Salvador
  0 siblings, 0 replies; 6+ messages in thread
From: Oscar Salvador @ 2020-12-09 10:36 UTC (permalink / raw)
  To: David Hildenbrand; +Cc: akpm, n-horiguchi, vbabka, linux-mm, linux-kernel

On Wed, Dec 09, 2020 at 10:59:04AM +0100, David Hildenbrand wrote:
> On 09.12.20 10:28, Oscar Salvador wrote:
> Do we expect callers to retry immediately? -EAGAIN might make also
> sense. But -EBUSY is an obvious improvement. Do we have callers relying
> on this behavior?

Not really, unless something LTP takes a look at the error code in retries
in case EBUSY.

Take into account that most of the callers do not even really check the
return code (GHES, RAS/CEC, etc.)

-- 
Oscar Salvador
SUSE L3

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [PATCH] mm,hwpoison: Return -EBUSY when migration fails
  2020-12-09 10:25 ` Vlastimil Babka
@ 2020-12-09 10:38   ` Oscar Salvador
  0 siblings, 0 replies; 6+ messages in thread
From: Oscar Salvador @ 2020-12-09 10:38 UTC (permalink / raw)
  To: Vlastimil Babka; +Cc: akpm, n-horiguchi, linux-mm, linux-kernel, Linux API

On Wed, Dec 09, 2020 at 11:25:31AM +0100, Vlastimil Babka wrote:
> On 12/9/20 10:28 AM, Oscar Salvador wrote:
> > Currently, we return -EIO when we fail to migrate the page.
> > 
> > Migrations' failures are rather transient as they can happen due to
> > several reasons, e.g: high page refcount bump, mapping->migrate_page
> > failing etc.
> > All meaning that at that time the page could not be migrated, but
> > that has nothing to do with an EIO error.
> > 
> > Let us return -EBUSY instead, as we do in case we failed to isolate
> > the page.
> > 
> > While are it, let us remove the "ret" print as its value does not change.
> > 
> > Signed-off-by: Oscar Salvador <osalvador@suse.de>
> 
> Acked-by: Vlastimil Babka <vbabka@suse.cz>
> 
> Technically this affects madvise(2) so let's cc linux-api. The manpage doesn't
> document error codes of MADV_HWPOISON and MADV_SOFT_OFFLINE (besides EPERM)
> though so nothing to adjust there. It's meant only for the hwpoison testing
> suite anyway.

Well, not only for hwpoison testing suite.

RAS/CEC and GHES also use soft_offline_page by means of memory_failure_queue
in case a page cec count goes beyond a certain thereshold, but they do not
really check the return code.

Only madvise does.


-- 
Oscar Salvador
SUSE L3

^ permalink raw reply	[flat|nested] 6+ messages in thread

end of thread, other threads:[~2020-12-09 10:39 UTC | newest]

Thread overview: 6+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2020-12-09  9:28 [PATCH] mm,hwpoison: Return -EBUSY when migration fails Oscar Salvador
2020-12-09  9:57 ` HORIGUCHI NAOYA(堀口 直也)
2020-12-09  9:59 ` David Hildenbrand
2020-12-09 10:36   ` Oscar Salvador
2020-12-09 10:25 ` Vlastimil Babka
2020-12-09 10:38   ` Oscar Salvador

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.