linux-mm.kvack.org archive mirror
 help / color / mirror / Atom feed
* [PATCH] mm/hwpoison: In-use hugepage need filter judgement and filter return value should be consistent
@ 2022-02-10  8:58 luofei
  2022-02-16  1:44 ` HORIGUCHI NAOYA(堀口 直也)
  0 siblings, 1 reply; 3+ messages in thread
From: luofei @ 2022-02-10  8:58 UTC (permalink / raw)
  To: naoya.horiguchi, akpm; +Cc: linux-mm, linux-kernel, luofei

After successfully obtaining the reference count of the huge
page, it is still necessary to call hwpoison_filter() to make a
filter judgement, otherwise the filter page will be unmaped and
the related process may be killed.

Also when the huge page meets the filter conditions, it should
not be regarded as successful memory_failure() processing, but
should return an error code to inform the caller, otherwise the
caller regards the error page has been identified and isolated,
which may lead to calling set_mce_nospec() to change page attribute.

Meanwhile, when calling hwpoison_filter() to determine that a page
needs to be filtered, it is better to keep the same EBUSY return
value, so that the caller can recognize that memory_failure() did
not successfully process the error page.

Signed-off-by: luofei <luofei@unicloud.com>
---
 mm/hwpoison-inject.c |  2 +-
 mm/memory-failure.c  | 12 ++++++++++--
 2 files changed, 11 insertions(+), 3 deletions(-)

diff --git a/mm/hwpoison-inject.c b/mm/hwpoison-inject.c
index aff4d27ec235..4ecd77cd1ded 100644
--- a/mm/hwpoison-inject.c
+++ b/mm/hwpoison-inject.c
@@ -44,7 +44,7 @@ static int hwpoison_inject(void *data, u64 val)
 	 */
 	err = hwpoison_filter(hpage);
 	if (err)
-		return 0;
+		return -EBUSY;
 
 inject:
 	pr_info("Injecting memory failure at pfn %#lx\n", pfn);
diff --git a/mm/memory-failure.c b/mm/memory-failure.c
index 97a9ed8f87a9..acf97fb2659a 100644
--- a/mm/memory-failure.c
+++ b/mm/memory-failure.c
@@ -1526,7 +1526,7 @@ static int memory_failure_hugetlb(unsigned long pfn, int flags)
 				if (TestClearPageHWPoison(head))
 					num_poisoned_pages_dec();
 				unlock_page(head);
-				return 0;
+				return -EBUSY;
 			}
 			unlock_page(head);
 			res = MF_FAILED;
@@ -1545,6 +1545,13 @@ static int memory_failure_hugetlb(unsigned long pfn, int flags)
 	lock_page(head);
 	page_flags = head->flags;
 
+	if (hwpoison_filter(p)) {
+		if (TestClearPageHWPoison(head))
+			num_poisoned_pages_dec();
+		unlock_page(head);
+		res = -EBUSY
+		goto out;
+	}
 	/*
 	 * TODO: hwpoison for pud-sized hugetlb doesn't work right now, so
 	 * simply disable it. In order to make it work properly, we need
@@ -1613,7 +1620,7 @@ static int memory_failure_dev_pagemap(unsigned long pfn, int flags,
 		goto out;
 
 	if (hwpoison_filter(page)) {
-		rc = 0;
+		rc = -EBUSY;
 		goto unlock;
 	}
 
@@ -1837,6 +1844,7 @@ int memory_failure(unsigned long pfn, int flags)
 			num_poisoned_pages_dec();
 		unlock_page(p);
 		put_page(p);
+		res = -EBUSY;
 		goto unlock_mutex;
 	}
 
-- 
2.27.0



^ permalink raw reply related	[flat|nested] 3+ messages in thread

* Re: [PATCH] mm/hwpoison: In-use hugepage need filter judgement and filter return value should be consistent
  2022-02-10  8:58 [PATCH] mm/hwpoison: In-use hugepage need filter judgement and filter return value should be consistent luofei
@ 2022-02-16  1:44 ` HORIGUCHI NAOYA(堀口 直也)
  2022-02-16  8:48   ` 答复: " 罗飞
  0 siblings, 1 reply; 3+ messages in thread
From: HORIGUCHI NAOYA(堀口 直也) @ 2022-02-16  1:44 UTC (permalink / raw)
  To: luofei; +Cc: akpm, linux-mm, linux-kernel

On Thu, Feb 10, 2022 at 03:58:46AM -0500, luofei wrote:
> After successfully obtaining the reference count of the huge
> page, it is still necessary to call hwpoison_filter() to make a
> filter judgement, otherwise the filter page will be unmaped and
> the related process may be killed.
> 
> Also when the huge page meets the filter conditions, it should
> not be regarded as successful memory_failure() processing, but
> should return an error code to inform the caller, otherwise the
> caller regards the error page has been identified and isolated,
> which may lead to calling set_mce_nospec() to change page attribute.

Hi luofei,

I agree with inserting another hwpoison_filter() block which was
simply missed. Thanks for pointing it out.

> 
> Meanwhile, when calling hwpoison_filter() to determine that a page
> needs to be filtered, it is better to keep the same EBUSY return
> value, so that the caller can recognize that memory_failure() did
> not successfully process the error page.

IIRC, hwpoison_filter() is expected to ignore error events on
irrelevant target (page|device|cgroup) for testers to focus on
non-filtered events.  So filtered events should not be considered
as error-handling failures (otherwise testers can't easily
distinguish them from real error-handling failures).

But yes, the MCE handler depends on the return value of memory_failure(),
so filtered events shouldn't be considered as error-handling's successes.
So one possible approach is that we introduce a new mf_flags to show that
memory_failure() is called from MCE handler (not from hwpoison injector),
and we switch return code when hwpoison_filter failed by this flag.
Could you try it if this approach works for you?  Or if you have any better
idea, feel free to share it.

Thanks,
Naoya Horiguchi

> 
> Signed-off-by: luofei <luofei@unicloud.com>
> ---
>  mm/hwpoison-inject.c |  2 +-
>  mm/memory-failure.c  | 12 ++++++++++--
>  2 files changed, 11 insertions(+), 3 deletions(-)
> 
> diff --git a/mm/hwpoison-inject.c b/mm/hwpoison-inject.c
> index aff4d27ec235..4ecd77cd1ded 100644
> --- a/mm/hwpoison-inject.c
> +++ b/mm/hwpoison-inject.c
> @@ -44,7 +44,7 @@ static int hwpoison_inject(void *data, u64 val)
>  	 */
>  	err = hwpoison_filter(hpage);
>  	if (err)
> -		return 0;
> +		return -EBUSY;
>  
>  inject:
>  	pr_info("Injecting memory failure at pfn %#lx\n", pfn);
> diff --git a/mm/memory-failure.c b/mm/memory-failure.c
> index 97a9ed8f87a9..acf97fb2659a 100644
> --- a/mm/memory-failure.c
> +++ b/mm/memory-failure.c
> @@ -1526,7 +1526,7 @@ static int memory_failure_hugetlb(unsigned long pfn, int flags)
>  				if (TestClearPageHWPoison(head))
>  					num_poisoned_pages_dec();
>  				unlock_page(head);
> -				return 0;
> +				return -EBUSY;
>  			}
>  			unlock_page(head);
>  			res = MF_FAILED;
> @@ -1545,6 +1545,13 @@ static int memory_failure_hugetlb(unsigned long pfn, int flags)
>  	lock_page(head);
>  	page_flags = head->flags;
>  
> +	if (hwpoison_filter(p)) {
> +		if (TestClearPageHWPoison(head))
> +			num_poisoned_pages_dec();
> +		unlock_page(head);
> +		res = -EBUSY
> +		goto out;
> +	}
>  	/*
>  	 * TODO: hwpoison for pud-sized hugetlb doesn't work right now, so
>  	 * simply disable it. In order to make it work properly, we need
> @@ -1613,7 +1620,7 @@ static int memory_failure_dev_pagemap(unsigned long pfn, int flags,
>  		goto out;
>  
>  	if (hwpoison_filter(page)) {
> -		rc = 0;
> +		rc = -EBUSY;
>  		goto unlock;
>  	}
>  
> @@ -1837,6 +1844,7 @@ int memory_failure(unsigned long pfn, int flags)
>  			num_poisoned_pages_dec();
>  		unlock_page(p);
>  		put_page(p);
> +		res = -EBUSY;
>  		goto unlock_mutex;
>  	}
>  
> -- 
> 2.27.0

^ permalink raw reply	[flat|nested] 3+ messages in thread

* 答复: [PATCH] mm/hwpoison: In-use hugepage need filter judgement and filter return value should be consistent
  2022-02-16  1:44 ` HORIGUCHI NAOYA(堀口 直也)
@ 2022-02-16  8:48   ` 罗飞
  0 siblings, 0 replies; 3+ messages in thread
From: 罗飞 @ 2022-02-16  8:48 UTC (permalink / raw)
  To: HORIGUCHI NAOYA(堀口 直也)
  Cc: akpm, linux-mm, linux-kernel

[-- Attachment #1: Type: text/plain, Size: 5657 bytes --]

>IIRC, hwpoison_filter() is expected to ignore error events on
>irrelevant target (page|device|cgroup) for testers to focus on
>non-filtered events.  So filtered events should not be considered
>as error-handling failures (otherwise testers can't easily
>distinguish them from real error-handling failures).
>
>But yes, the MCE handler depends on the return value of memory_failure(),
>so filtered events shouldn't be considered as error-handling's successes.
>So one possible approach is that we introduce a new mf_flags to show that
>memory_failure() is called from MCE handler (not from hwpoison injector),
>and we switch return code when hwpoison_filter failed by this flag.
>Could you try it if this approach works for you?  Or if you have any better
>idea, feel free to share it.


Since hwpoison_filter() is expected to ignore error events on irrelevant taregt,

introduce a new mf_flags to differentiate between mce handler and hwpoison

injector is a good way, this can avoid the impact of hwpoison_filter() on the mce

handler. I will resubmit a patch

________________________________
发件人: HORIGUCHI NAOYA(堀口 直也) <naoya.horiguchi@nec.com>
发送时间: 2022年2月16日 9:44:13
收件人: 罗飞
抄送: akpm@linux-foundation.org; linux-mm@kvack.org; linux-kernel@vger.kernel.org
主题: Re: [PATCH] mm/hwpoison: In-use hugepage need filter judgement and filter return value should be consistent

On Thu, Feb 10, 2022 at 03:58:46AM -0500, luofei wrote:
> After successfully obtaining the reference count of the huge
> page, it is still necessary to call hwpoison_filter() to make a
> filter judgement, otherwise the filter page will be unmaped and
> the related process may be killed.
>
> Also when the huge page meets the filter conditions, it should
> not be regarded as successful memory_failure() processing, but
> should return an error code to inform the caller, otherwise the
> caller regards the error page has been identified and isolated,
> which may lead to calling set_mce_nospec() to change page attribute.

Hi luofei,

I agree with inserting another hwpoison_filter() block which was
simply missed. Thanks for pointing it out.

>
> Meanwhile, when calling hwpoison_filter() to determine that a page
> needs to be filtered, it is better to keep the same EBUSY return
> value, so that the caller can recognize that memory_failure() did
> not successfully process the error page.

IIRC, hwpoison_filter() is expected to ignore error events on
irrelevant target (page|device|cgroup) for testers to focus on
non-filtered events.  So filtered events should not be considered
as error-handling failures (otherwise testers can't easily
distinguish them from real error-handling failures).

But yes, the MCE handler depends on the return value of memory_failure(),
so filtered events shouldn't be considered as error-handling's successes.
So one possible approach is that we introduce a new mf_flags to show that
memory_failure() is called from MCE handler (not from hwpoison injector),
and we switch return code when hwpoison_filter failed by this flag.
Could you try it if this approach works for you?  Or if you have any better
idea, feel free to share it.

Thanks,
Naoya Horiguchi

>
> Signed-off-by: luofei <luofei@unicloud.com>
> ---
>  mm/hwpoison-inject.c |  2 +-
>  mm/memory-failure.c  | 12 ++++++++++--
>  2 files changed, 11 insertions(+), 3 deletions(-)
>
> diff --git a/mm/hwpoison-inject.c b/mm/hwpoison-inject.c
> index aff4d27ec235..4ecd77cd1ded 100644
> --- a/mm/hwpoison-inject.c
> +++ b/mm/hwpoison-inject.c
> @@ -44,7 +44,7 @@ static int hwpoison_inject(void *data, u64 val)
>         */
>        err = hwpoison_filter(hpage);
>        if (err)
> -             return 0;
> +             return -EBUSY;
>
>  inject:
>        pr_info("Injecting memory failure at pfn %#lx\n", pfn);
> diff --git a/mm/memory-failure.c b/mm/memory-failure.c
> index 97a9ed8f87a9..acf97fb2659a 100644
> --- a/mm/memory-failure.c
> +++ b/mm/memory-failure.c
> @@ -1526,7 +1526,7 @@ static int memory_failure_hugetlb(unsigned long pfn, int flags)
>                                if (TestClearPageHWPoison(head))
>                                        num_poisoned_pages_dec();
>                                unlock_page(head);
> -                             return 0;
> +                             return -EBUSY;
>                        }
>                        unlock_page(head);
>                        res = MF_FAILED;
> @@ -1545,6 +1545,13 @@ static int memory_failure_hugetlb(unsigned long pfn, int flags)
>        lock_page(head);
>        page_flags = head->flags;
>
> +     if (hwpoison_filter(p)) {
> +             if (TestClearPageHWPoison(head))
> +                     num_poisoned_pages_dec();
> +             unlock_page(head);
> +             res = -EBUSY
> +             goto out;
> +     }
>        /*
>         * TODO: hwpoison for pud-sized hugetlb doesn't work right now, so
>         * simply disable it. In order to make it work properly, we need
> @@ -1613,7 +1620,7 @@ static int memory_failure_dev_pagemap(unsigned long pfn, int flags,
>                goto out;
>
>        if (hwpoison_filter(page)) {
> -             rc = 0;
> +             rc = -EBUSY;
>                goto unlock;
>        }
>
> @@ -1837,6 +1844,7 @@ int memory_failure(unsigned long pfn, int flags)
>                        num_poisoned_pages_dec();
>                unlock_page(p);
>                put_page(p);
> +             res = -EBUSY;
>                goto unlock_mutex;
>        }
>
> --
> 2.27.0

[-- Attachment #2: Type: text/html, Size: 16235 bytes --]

^ permalink raw reply	[flat|nested] 3+ messages in thread

end of thread, other threads:[~2022-02-16  8:48 UTC | newest]

Thread overview: 3+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2022-02-10  8:58 [PATCH] mm/hwpoison: In-use hugepage need filter judgement and filter return value should be consistent luofei
2022-02-16  1:44 ` HORIGUCHI NAOYA(堀口 直也)
2022-02-16  8:48   ` 答复: " 罗飞

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).