>IIRC, hwpoison_filter() is expected to ignore error events on >irrelevant target (page|device|cgroup) for testers to focus on >non-filtered events. So filtered events should not be considered >as error-handling failures (otherwise testers can't easily >distinguish them from real error-handling failures). > >But yes, the MCE handler depends on the return value of memory_failure(), >so filtered events shouldn't be considered as error-handling's successes. >So one possible approach is that we introduce a new mf_flags to show that >memory_failure() is called from MCE handler (not from hwpoison injector), >and we switch return code when hwpoison_filter failed by this flag. >Could you try it if this approach works for you? Or if you have any better >idea, feel free to share it. Since hwpoison_filter() is expected to ignore error events on irrelevant taregt, introduce a new mf_flags to differentiate between mce handler and hwpoison injector is a good way, this can avoid the impact of hwpoison_filter() on the mce handler. I will resubmit a patch ________________________________ 发件人: HORIGUCHI NAOYA(堀口 直也) 发送时间: 2022年2月16日 9:44:13 收件人: 罗飞 抄送: akpm@linux-foundation.org; linux-mm@kvack.org; linux-kernel@vger.kernel.org 主题: Re: [PATCH] mm/hwpoison: In-use hugepage need filter judgement and filter return value should be consistent On Thu, Feb 10, 2022 at 03:58:46AM -0500, luofei wrote: > After successfully obtaining the reference count of the huge > page, it is still necessary to call hwpoison_filter() to make a > filter judgement, otherwise the filter page will be unmaped and > the related process may be killed. > > Also when the huge page meets the filter conditions, it should > not be regarded as successful memory_failure() processing, but > should return an error code to inform the caller, otherwise the > caller regards the error page has been identified and isolated, > which may lead to calling set_mce_nospec() to change page attribute. Hi luofei, I agree with inserting another hwpoison_filter() block which was simply missed. Thanks for pointing it out. > > Meanwhile, when calling hwpoison_filter() to determine that a page > needs to be filtered, it is better to keep the same EBUSY return > value, so that the caller can recognize that memory_failure() did > not successfully process the error page. IIRC, hwpoison_filter() is expected to ignore error events on irrelevant target (page|device|cgroup) for testers to focus on non-filtered events. So filtered events should not be considered as error-handling failures (otherwise testers can't easily distinguish them from real error-handling failures). But yes, the MCE handler depends on the return value of memory_failure(), so filtered events shouldn't be considered as error-handling's successes. So one possible approach is that we introduce a new mf_flags to show that memory_failure() is called from MCE handler (not from hwpoison injector), and we switch return code when hwpoison_filter failed by this flag. Could you try it if this approach works for you? Or if you have any better idea, feel free to share it. Thanks, Naoya Horiguchi > > Signed-off-by: luofei > --- > mm/hwpoison-inject.c | 2 +- > mm/memory-failure.c | 12 ++++++++++-- > 2 files changed, 11 insertions(+), 3 deletions(-) > > diff --git a/mm/hwpoison-inject.c b/mm/hwpoison-inject.c > index aff4d27ec235..4ecd77cd1ded 100644 > --- a/mm/hwpoison-inject.c > +++ b/mm/hwpoison-inject.c > @@ -44,7 +44,7 @@ static int hwpoison_inject(void *data, u64 val) > */ > err = hwpoison_filter(hpage); > if (err) > - return 0; > + return -EBUSY; > > inject: > pr_info("Injecting memory failure at pfn %#lx\n", pfn); > diff --git a/mm/memory-failure.c b/mm/memory-failure.c > index 97a9ed8f87a9..acf97fb2659a 100644 > --- a/mm/memory-failure.c > +++ b/mm/memory-failure.c > @@ -1526,7 +1526,7 @@ static int memory_failure_hugetlb(unsigned long pfn, int flags) > if (TestClearPageHWPoison(head)) > num_poisoned_pages_dec(); > unlock_page(head); > - return 0; > + return -EBUSY; > } > unlock_page(head); > res = MF_FAILED; > @@ -1545,6 +1545,13 @@ static int memory_failure_hugetlb(unsigned long pfn, int flags) > lock_page(head); > page_flags = head->flags; > > + if (hwpoison_filter(p)) { > + if (TestClearPageHWPoison(head)) > + num_poisoned_pages_dec(); > + unlock_page(head); > + res = -EBUSY > + goto out; > + } > /* > * TODO: hwpoison for pud-sized hugetlb doesn't work right now, so > * simply disable it. In order to make it work properly, we need > @@ -1613,7 +1620,7 @@ static int memory_failure_dev_pagemap(unsigned long pfn, int flags, > goto out; > > if (hwpoison_filter(page)) { > - rc = 0; > + rc = -EBUSY; > goto unlock; > } > > @@ -1837,6 +1844,7 @@ int memory_failure(unsigned long pfn, int flags) > num_poisoned_pages_dec(); > unlock_page(p); > put_page(p); > + res = -EBUSY; > goto unlock_mutex; > } > > -- > 2.27.0