* [PATCH v1] mm, hwpoison: add is_free_buddy_page() in HWPoisonHandlable()
@ 2021-09-09 0:41 Naoya Horiguchi
2021-09-09 3:22 ` Yang Shi
2021-09-10 11:16 ` David Hildenbrand
0 siblings, 2 replies; 4+ messages in thread
From: Naoya Horiguchi @ 2021-09-09 0:41 UTC (permalink / raw)
To: linux-mm
Cc: Andrew Morton, Tony Luck, Oscar Salvador, Mike Kravetz,
Michal Hocko, Yang Shi, David Hildenbrand, Naoya Horiguchi,
linux-kernel
From: Naoya Horiguchi <naoya.horiguchi@nec.com>
commit fcc00621d88b ("mm/hwpoison: retry with shake_page() for
unhandlable pages") changes the return value of __get_hwpoison_page() to
retry for transiently unhandlable cases. However, __get_hwpoison_page()
currently fails to properly judge buddy pages as handlable, so hard/soft
offline for buddy pages always fail as "unhandlable page". This is
totally regrettable.
So let's add is_free_buddy_page() in HWPoisonHandlable(), so that
__get_hwpoison_page() returns different return values between buddy
pages and unhandlable pages as intended.
Fixes: fcc00621d88b ("mm/hwpoison: retry with shake_page() for unhandlable pages")
Cc: <stable@vger.kernel.org>
Signed-off-by: Naoya Horiguchi <naoya.horiguchi@nec.com>
---
mm/memory-failure.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git v5.14-rc7-mmotm-2021-08-23-16-42/mm/memory-failure.c v5.14-rc7-mmotm-2021-08-23-16-42_patched/mm/memory-failure.c
index 60df8fcd0444..3416c55be810 100644
--- v5.14-rc7-mmotm-2021-08-23-16-42/mm/memory-failure.c
+++ v5.14-rc7-mmotm-2021-08-23-16-42_patched/mm/memory-failure.c
@@ -1126,7 +1126,7 @@ static int page_action(struct page_state *ps, struct page *p,
*/
static inline bool HWPoisonHandlable(struct page *page)
{
- return PageLRU(page) || __PageMovable(page);
+ return PageLRU(page) || __PageMovable(page) || is_free_buddy_page(page);
}
static int __get_hwpoison_page(struct page *page)
--
2.25.1
^ permalink raw reply related [flat|nested] 4+ messages in thread
* Re: [PATCH v1] mm, hwpoison: add is_free_buddy_page() in HWPoisonHandlable()
2021-09-09 0:41 [PATCH v1] mm, hwpoison: add is_free_buddy_page() in HWPoisonHandlable() Naoya Horiguchi
@ 2021-09-09 3:22 ` Yang Shi
2021-09-09 3:58 ` HORIGUCHI NAOYA(堀口 直也)
2021-09-10 11:16 ` David Hildenbrand
1 sibling, 1 reply; 4+ messages in thread
From: Yang Shi @ 2021-09-09 3:22 UTC (permalink / raw)
To: Naoya Horiguchi
Cc: Linux MM, Andrew Morton, Tony Luck, Oscar Salvador, Mike Kravetz,
Michal Hocko, David Hildenbrand, Naoya Horiguchi,
Linux Kernel Mailing List
On Wed, Sep 8, 2021 at 5:41 PM Naoya Horiguchi
<naoya.horiguchi@linux.dev> wrote:
>
> From: Naoya Horiguchi <naoya.horiguchi@nec.com>
>
> commit fcc00621d88b ("mm/hwpoison: retry with shake_page() for
> unhandlable pages") changes the return value of __get_hwpoison_page() to
> retry for transiently unhandlable cases. However, __get_hwpoison_page()
> currently fails to properly judge buddy pages as handlable, so hard/soft
> offline for buddy pages always fail as "unhandlable page". This is
> totally regrettable.
>
> So let's add is_free_buddy_page() in HWPoisonHandlable(), so that
> __get_hwpoison_page() returns different return values between buddy
> pages and unhandlable pages as intended.
>
> Fixes: fcc00621d88b ("mm/hwpoison: retry with shake_page() for unhandlable pages")
> Cc: <stable@vger.kernel.org>
> Signed-off-by: Naoya Horiguchi <naoya.horiguchi@nec.com>
> ---
> mm/memory-failure.c | 2 +-
> 1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git v5.14-rc7-mmotm-2021-08-23-16-42/mm/memory-failure.c v5.14-rc7-mmotm-2021-08-23-16-42_patched/mm/memory-failure.c
> index 60df8fcd0444..3416c55be810 100644
> --- v5.14-rc7-mmotm-2021-08-23-16-42/mm/memory-failure.c
> +++ v5.14-rc7-mmotm-2021-08-23-16-42_patched/mm/memory-failure.c
> @@ -1126,7 +1126,7 @@ static int page_action(struct page_state *ps, struct page *p,
> */
> static inline bool HWPoisonHandlable(struct page *page)
> {
> - return PageLRU(page) || __PageMovable(page);
> + return PageLRU(page) || __PageMovable(page) || is_free_buddy_page(page);
It seems to work. Although this may change the return value of
get_any_page() to 1 when MF_COUNT_INCREASED is set. This may cause
soft offline to mishandle free buddy page, but MF_COUNT_INCREASED is
only set when madvise is used, and madvise definitely can't soft
offline free buddy page. It did take me a while to figure out this
trick. Maybe need some refactor?
Anyway this patch looks fine to me. Reviewed-by: Yang Shi <shy828301@gmail.com>
> }
>
> static int __get_hwpoison_page(struct page *page)
> --
> 2.25.1
>
^ permalink raw reply [flat|nested] 4+ messages in thread
* Re: [PATCH v1] mm, hwpoison: add is_free_buddy_page() in HWPoisonHandlable()
2021-09-09 3:22 ` Yang Shi
@ 2021-09-09 3:58 ` HORIGUCHI NAOYA(堀口 直也)
0 siblings, 0 replies; 4+ messages in thread
From: HORIGUCHI NAOYA(堀口 直也) @ 2021-09-09 3:58 UTC (permalink / raw)
To: Yang Shi
Cc: Naoya Horiguchi, Linux MM, Andrew Morton, Tony Luck,
Oscar Salvador, Mike Kravetz, Michal Hocko, David Hildenbrand,
Linux Kernel Mailing List
On Wed, Sep 08, 2021 at 08:22:14PM -0700, Yang Shi wrote:
> On Wed, Sep 8, 2021 at 5:41 PM Naoya Horiguchi
> <naoya.horiguchi@linux.dev> wrote:
> >
> > From: Naoya Horiguchi <naoya.horiguchi@nec.com>
> >
> > commit fcc00621d88b ("mm/hwpoison: retry with shake_page() for
> > unhandlable pages") changes the return value of __get_hwpoison_page() to
> > retry for transiently unhandlable cases. However, __get_hwpoison_page()
> > currently fails to properly judge buddy pages as handlable, so hard/soft
> > offline for buddy pages always fail as "unhandlable page". This is
> > totally regrettable.
> >
> > So let's add is_free_buddy_page() in HWPoisonHandlable(), so that
> > __get_hwpoison_page() returns different return values between buddy
> > pages and unhandlable pages as intended.
> >
> > Fixes: fcc00621d88b ("mm/hwpoison: retry with shake_page() for unhandlable pages")
> > Cc: <stable@vger.kernel.org>
> > Signed-off-by: Naoya Horiguchi <naoya.horiguchi@nec.com>
> > ---
> > mm/memory-failure.c | 2 +-
> > 1 file changed, 1 insertion(+), 1 deletion(-)
> >
> > diff --git v5.14-rc7-mmotm-2021-08-23-16-42/mm/memory-failure.c v5.14-rc7-mmotm-2021-08-23-16-42_patched/mm/memory-failure.c
> > index 60df8fcd0444..3416c55be810 100644
> > --- v5.14-rc7-mmotm-2021-08-23-16-42/mm/memory-failure.c
> > +++ v5.14-rc7-mmotm-2021-08-23-16-42_patched/mm/memory-failure.c
> > @@ -1126,7 +1126,7 @@ static int page_action(struct page_state *ps, struct page *p,
> > */
> > static inline bool HWPoisonHandlable(struct page *page)
> > {
> > - return PageLRU(page) || __PageMovable(page);
> > + return PageLRU(page) || __PageMovable(page) || is_free_buddy_page(page);
>
> It seems to work. Although this may change the return value of
> get_any_page() to 1 when MF_COUNT_INCREASED is set. This may cause
> soft offline to mishandle free buddy page, but MF_COUNT_INCREASED is
> only set when madvise is used, and madvise definitely can't soft
> offline free buddy page.
Yes, madvise(MADV_SOFT_OFFLINE) should not be called for free buddy pages,
because the interface covers only user-mapped pages.
> It did take me a while to figure out this
> trick. Maybe need some refactor?
I think so, the current code looks to me fragile for future change, but
I'm not sure how we can improve this.
One (maybe not related to MF_COUNT_INCREASED) room for refactoring is
"if (PageTransHuge(head))" block in __get_hwpoison_page(), which might
be old and useless code for now.
>
> Anyway this patch looks fine to me. Reviewed-by: Yang Shi <shy828301@gmail.com>
Thank you for the review.
- Naoya
^ permalink raw reply [flat|nested] 4+ messages in thread
* Re: [PATCH v1] mm, hwpoison: add is_free_buddy_page() in HWPoisonHandlable()
2021-09-09 0:41 [PATCH v1] mm, hwpoison: add is_free_buddy_page() in HWPoisonHandlable() Naoya Horiguchi
2021-09-09 3:22 ` Yang Shi
@ 2021-09-10 11:16 ` David Hildenbrand
1 sibling, 0 replies; 4+ messages in thread
From: David Hildenbrand @ 2021-09-10 11:16 UTC (permalink / raw)
To: Naoya Horiguchi, linux-mm
Cc: Andrew Morton, Tony Luck, Oscar Salvador, Mike Kravetz,
Michal Hocko, Yang Shi, Naoya Horiguchi, linux-kernel
On 09.09.21 02:41, Naoya Horiguchi wrote:
> From: Naoya Horiguchi <naoya.horiguchi@nec.com>
>
> commit fcc00621d88b ("mm/hwpoison: retry with shake_page() for
> unhandlable pages") changes the return value of __get_hwpoison_page() to
> retry for transiently unhandlable cases. However, __get_hwpoison_page()
> currently fails to properly judge buddy pages as handlable, so hard/soft
> offline for buddy pages always fail as "unhandlable page". This is
> totally regrettable.
>
> So let's add is_free_buddy_page() in HWPoisonHandlable(), so that
> __get_hwpoison_page() returns different return values between buddy
> pages and unhandlable pages as intended.
>
> Fixes: fcc00621d88b ("mm/hwpoison: retry with shake_page() for unhandlable pages")
> Cc: <stable@vger.kernel.org>
> Signed-off-by: Naoya Horiguchi <naoya.horiguchi@nec.com>
> ---
> mm/memory-failure.c | 2 +-
> 1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git v5.14-rc7-mmotm-2021-08-23-16-42/mm/memory-failure.c v5.14-rc7-mmotm-2021-08-23-16-42_patched/mm/memory-failure.c
> index 60df8fcd0444..3416c55be810 100644
> --- v5.14-rc7-mmotm-2021-08-23-16-42/mm/memory-failure.c
> +++ v5.14-rc7-mmotm-2021-08-23-16-42_patched/mm/memory-failure.c
> @@ -1126,7 +1126,7 @@ static int page_action(struct page_state *ps, struct page *p,
> */
> static inline bool HWPoisonHandlable(struct page *page)
> {
> - return PageLRU(page) || __PageMovable(page);
> + return PageLRU(page) || __PageMovable(page) || is_free_buddy_page(page);
> }
>
> static int __get_hwpoison_page(struct page *page)
>
LGTM and I agree that code might desrve some cleanups.
Acked-by: David Hildenbrand <david@redhat.com>
--
Thanks,
David / dhildenb
^ permalink raw reply [flat|nested] 4+ messages in thread
end of thread, other threads:[~2021-09-10 11:17 UTC | newest]
Thread overview: 4+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2021-09-09 0:41 [PATCH v1] mm, hwpoison: add is_free_buddy_page() in HWPoisonHandlable() Naoya Horiguchi
2021-09-09 3:22 ` Yang Shi
2021-09-09 3:58 ` HORIGUCHI NAOYA(堀口 直也)
2021-09-10 11:16 ` David Hildenbrand
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).