* Re: [PATCH v2] mm: clean up hwpoison page cache page in fault path
From: Andrew Morton @ 2022-02-14 23:24 UTC
To: Rik van Riel
Cc: linux-kernel, kernel-team, linux-mm, Miaohe Lin, Mel Gorman,
Johannes Weiner, Matthew Wilcox, Naoya Horiguchi
> Subject: [PATCH v2] mm: clean up hwpoison page cache page in fault path
At first scan I thought this was a code cleanup.
I think I'll do s/clean up/invalidate/.
On Sat, 12 Feb 2022 21:37:40 -0500 Rik van Riel <riel@surriel.com> wrote:
> Sometimes the page offlining code can leave behind a hwpoisoned clean
> page cache page.
Is this correct behaviour?
> This can lead to programs being killed over and over
> and over again as they fault in the hwpoisoned page, get killed, and
> then get re-spawned by whatever wanted to run them.
>
> This is particularly embarrassing when the page was offlined due to
> having too many corrected memory errors. Now we are killing tasks
> due to them trying to access memory that probably isn't even corrupted.
>
> This problem can be avoided by invalidating the page from the page
> fault handler, which already has a branch for dealing with these
> kinds of pages. With this patch we simply pretend the page fault
> was successful if the page was invalidated, return to userspace,
> incur another page fault, read in the file from disk (to a new
> memory page), and then everything works again.
Is this worth a cc:stable?
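For context, the fix described in the changelog above lands in the existing
PageHWPoison branch of __do_fault() in mm/memory.c. A rough sketch of the
idea (paraphrased from the changelog, not the literal diff; the poisonret
variable name is illustrative):

	static vm_fault_t __do_fault(struct vm_fault *vmf)
	{
		vm_fault_t ret;
		...
		ret = vmf->vma->vm_ops->fault(vmf);
		...
		if (unlikely(PageHWPoison(vmf->page))) {
			vm_fault_t poisonret = VM_FAULT_HWPOISON;

			if (ret & VM_FAULT_LOCKED) {
				/* Retry the fault if the clean page can be
				 * dropped from the page cache. */
				if (invalidate_inode_page(vmf->page))
					poisonret = VM_FAULT_NOPAGE;
				unlock_page(vmf->page);
			}
			put_page(vmf->page);
			vmf->page = NULL;
			return poisonret;
		}
		...
	}

Returning VM_FAULT_NOPAGE instead of VM_FAULT_HWPOISON means no SIGBUS is
delivered; the faulting instruction is simply restarted, and the next fault
reads the file into a fresh page.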
* Re: [PATCH v2] mm: clean up hwpoison page cache page in fault path
From: Rik van Riel @ 2022-02-15 1:37 UTC
To: Andrew Morton
Cc: linux-kernel, kernel-team, linux-mm, Miaohe Lin, Mel Gorman,
Johannes Weiner, Matthew Wilcox, Naoya Horiguchi
On Mon, 2022-02-14 at 15:24 -0800, Andrew Morton wrote:
>
> > Subject: [PATCH v2] mm: clean up hwpoison page cache page in fault
> > path
>
> At first scan I thought this was a code cleanup.
>
> I think I'll do s/clean up/invalidate/.
>
OK, that sounds good.
> On Sat, 12 Feb 2022 21:37:40 -0500 Rik van Riel <riel@surriel.com>
> wrote:
>
> > Sometimes the page offlining code can leave behind a hwpoisoned
> > clean
> > page cache page.
>
> Is this correct behaviour?
It is not desirable, and the soft page offlining code
tries to invalidate the page, but I don't think overhauling
the way we lock and refcount page cache pages just to make
offlining them more reliable would be worthwhile, when we
already have a branch in the page fault handler to deal with
these pages, anyway.
> > This can lead to programs being killed over and over
> > and over again as they fault in the hwpoisoned page, get killed,
> > and
> > then get re-spawned by whatever wanted to run them.
> >
> > This is particularly embarrassing when the page was offlined due to
> > having too many corrected memory errors. Now we are killing tasks
> > due to them trying to access memory that probably isn't even
> > corrupted.
> >
> > This problem can be avoided by invalidating the page from the page
> > fault handler, which already has a branch for dealing with these
> > kinds of pages. With this patch we simply pretend the page fault
> > was successful if the page was invalidated, return to userspace,
> > incur another page fault, read in the file from disk (to a new
> > memory page), and then everything works again.
>
> Is this worth a cc:stable?
Maybe. I don't know how far back this issue goes...
--
All Rights Reversed.
* Re: [PATCH v2] mm: clean up hwpoison page cache page in fault path
From: HORIGUCHI NAOYA(堀口 直也) @ 2022-02-15 5:15 UTC
To: Rik van Riel
Cc: Andrew Morton, linux-kernel, kernel-team, linux-mm, Miaohe Lin,
Mel Gorman, Johannes Weiner, Matthew Wilcox, Naoya Horiguchi
On Mon, Feb 14, 2022 at 08:37:26PM -0500, Rik van Riel wrote:
> On Mon, 2022-02-14 at 15:24 -0800, Andrew Morton wrote:
> >
> > > Subject: [PATCH v2] mm: clean up hwpoison page cache page in fault
> > > path
> >
> > At first scan I thought this was a code cleanup.
> >
> > I think I'll do s/clean up/invalidate/.
> >
> OK, that sounds good.
>
> > On Sat, 12 Feb 2022 21:37:40 -0500 Rik van Riel <riel@surriel.com>
> > wrote:
> >
> > > Sometimes the page offlining code can leave behind a hwpoisoned
> > > clean
> > > page cache page.
> >
> > Is this correct behaviour?
>
> It is not desirable, and the soft page offlining code
> tries to invalidate the page, but I don't think overhauling
> the way we lock and refcount page cache pages just to make
> offlining them more reliable would be worthwhile, when we
> already have a branch in the page fault handler to deal with
> these pages, anyway.
I don't have any idea how this kind of page is left in the page
cache after page offlining, but I agree with the suggested change.
>
> > > This can lead to programs being killed over and over
> > > and over again as they fault in the hwpoisoned page, get killed,
> > > and
> > > then get re-spawned by whatever wanted to run them.
> > >
> > > This is particularly embarrassing when the page was offlined due to
> > > having too many corrected memory errors. Now we are killing tasks
> > > due to them trying to access memory that probably isn't even
> > > corrupted.
> > >
> > > This problem can be avoided by invalidating the page from the page
> > > fault handler, which already has a branch for dealing with these
> > > kinds of pages. With this patch we simply pretend the page fault
> > > was successful if the page was invalidated, return to userspace,
> > > incur another page fault, read in the file from disk (to a new
> > > memory page), and then everything works again.
> >
> > Is this worth a cc:stable?
>
> Maybe. I don't know how far back this issue goes...
This issue should be orthogonal to recent changes to hwpoison, and
the base code targeted by this patch has been unchanged since 2016 (4.10-rc1),
so this patch applies cleanly to most of the maintained stable trees
(except maybe 4.9.z).
Acked-by: Naoya Horiguchi <naoya.horiguchi@nec.com>
Thanks,
Naoya Horiguchi
* Re: [PATCH v2] mm: clean up hwpoison page cache page in fault path
From: David Hildenbrand @ 2022-02-15 11:22 UTC
To: Rik van Riel, linux-kernel
Cc: kernel-team, linux-mm, Miaohe Lin, Andrew Morton, Mel Gorman,
Johannes Weiner, Matthew Wilcox
On 13.02.22 03:37, Rik van Riel wrote:
> Sometimes the page offlining code can leave behind a hwpoisoned clean
> page cache page. This can lead to programs being killed over and over
> and over again as they fault in the hwpoisoned page, get killed, and
> then get re-spawned by whatever wanted to run them.
Hi Rik,
am I correct that you are only talking about soft offlining as triggered
from mm/memory-failure.c, not page offlining as in memory offlining
(mm/memory_hotplug.c)?
Maybe you can clarify that in the patch description; it made me nervous
for a second :)
--
Thanks,
David / dhildenb
* Re: [PATCH v2] mm: clean up hwpoison page cache page in fault path
From: Rik van Riel @ 2022-02-15 15:01 UTC
To: David Hildenbrand, linux-kernel
Cc: kernel-team, linux-mm, Miaohe Lin, Andrew Morton, Mel Gorman,
Johannes Weiner, Matthew Wilcox
On Tue, 2022-02-15 at 12:22 +0100, David Hildenbrand wrote:
> On 13.02.22 03:37, Rik van Riel wrote:
> > Sometimes the page offlining code can leave behind a hwpoisoned
> > clean
> > page cache page. This can lead to programs being killed over and
> > over
> > and over again as they fault in the hwpoisoned page, get killed,
> > and
> > then get re-spawned by whatever wanted to run them.
>
> Hi Rik,
>
> am I correct that you are only talking about soft offlining as
> triggered from mm/memory-failure.c, not page offlining as in memory
> offlining (mm/memory_hotplug.c)?
That is correct in that I am talking only about memory-failure.c;
however, the code in memory-failure.c has both hard and soft
offlining modes, and I suppose this patch covers both?
--
All Rights Reversed.
* Re: [PATCH v2] mm: clean up hwpoison page cache page in fault path
From: David Hildenbrand @ 2022-02-15 15:06 UTC
To: Rik van Riel, linux-kernel
Cc: kernel-team, linux-mm, Miaohe Lin, Andrew Morton, Mel Gorman,
Johannes Weiner, Matthew Wilcox
On 15.02.22 16:01, Rik van Riel wrote:
> On Tue, 2022-02-15 at 12:22 +0100, David Hildenbrand wrote:
>> On 13.02.22 03:37, Rik van Riel wrote:
>>> Sometimes the page offlining code can leave behind a hwpoisoned
>>> clean
>>> page cache page. This can lead to programs being killed over and
>>> over
>>> and over again as they fault in the hwpoisoned page, get killed,
>>> and
>>> then get re-spawned by whatever wanted to run them.
>>
>> Hi Rik,
>>
>> am I correct that you are only talking about soft offlining as
>> triggered from mm/memory-failure.c, not page offlining as in memory
>> offlining (mm/memory_hotplug.c)?
>
> That is correct in that I am talking only about memory-failure.c;
> however, the code in memory-failure.c has both hard and soft
> offlining modes, and I suppose this patch covers both?
>
Right, for hwpoison handling there is hard and soft offlining of pages
... maybe "hwpoison page offlining" would be clearer, not sure.
Thanks for clarifying!
--
Thanks,
David / dhildenb
* Re: [PATCH v2] mm: clean up hwpoison page cache page in fault path
From: Oscar Salvador @ 2022-02-15 12:51 UTC
To: Rik van Riel
Cc: linux-kernel, kernel-team, linux-mm, Miaohe Lin, Andrew Morton,
Mel Gorman, Johannes Weiner, Matthew Wilcox
On Sat, Feb 12, 2022 at 09:37:40PM -0500, Rik van Riel wrote:
> Sometimes the page offlining code can leave behind a hwpoisoned clean
> page cache page. This can lead to programs being killed over and over
> and over again as they fault in the hwpoisoned page, get killed, and
> then get re-spawned by whatever wanted to run them.
Hi Rik,
Do you know how exactly that happens? We should not really be leaving
anything behind, and the soft-offline (not hard) code works on the premise
of only poisoning a page once it has been contained, so I am wondering
what is going on here.
In-use pagecache pages are migrated away and the actual page is
contained; for clean ones, we already do the invalidate_inode_page()
and then contain the page if we succeed.
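For reference, the flow I mean looks roughly like this (paraphrased from
mm/memory-failure.c, not quoted verbatim):

	static int __soft_offline_page(struct page *page)
	{
		int ret = 0;
		unsigned long pfn = page_to_pfn(page);
		...
		lock_page(page);
		if (!PageHuge(page))
			/* Clean, unmapped page cache pages can simply be
			 * dropped instead of migrated. */
			ret = invalidate_inode_page(page);
		unlock_page(page);

		if (ret) {
			/* Invalidation succeeded: poison the now-unused page. */
			pr_info("soft_offline: %#lx: invalidated\n", pfn);
			page_handle_poison(page, false, true);
			return 0;
		}
		/* Otherwise migrate the contents away before poisoning. */
		...
	}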
One scenario where I can imagine this happening is if, by the time we call
page_handle_poison(), someone has taken another refcount on the page,
so the put_page() does not really free it, but I am not sure that
can happen.
--
Oscar Salvador
SUSE Labs
* Re: [PATCH v2] mm: clean up hwpoison page cache page in fault path
From: Rik van Riel @ 2022-02-15 15:04 UTC
To: Oscar Salvador
Cc: linux-kernel, kernel-team, linux-mm, Miaohe Lin, Andrew Morton,
Mel Gorman, Johannes Weiner, Matthew Wilcox
On Tue, 2022-02-15 at 13:51 +0100, Oscar Salvador wrote:
> On Sat, Feb 12, 2022 at 09:37:40PM -0500, Rik van Riel wrote:
> > Sometimes the page offlining code can leave behind a hwpoisoned
> > clean
> > page cache page. This can lead to programs being killed over and
> > over
> > and over again as they fault in the hwpoisoned page, get killed,
> > and
> > then get re-spawned by whatever wanted to run them.
>
> Hi Rik,
>
> Do you know how exactly that happens? We should not really be leaving
> anything behind, and the soft-offline (not hard) code works on the
> premise of only poisoning a page once it has been contained, so I am
> wondering what is going on here.
>
> In-use pagecache pages are migrated away and the actual page is
> contained; for clean ones, we already do the invalidate_inode_page()
> and then contain the page if we succeed.
I do not know the exact failure case, since I have never
caught a system in the act of leaking one of these pages.
I just know I have seen this issue on systems where the
"soft_offline: %#lx: invalidated\n" printk was the only
offline method leaving any message in the kernel log.
However, there are a few paths through the soft
offlining code that don't seem to have any printks,
so I am not sure exactly where things went wrong.
I only really found the aftermath, and tested this patch
by loading it as a kernel live patch module on some of
those systems.
--
All Rights Reversed.
* Re: [PATCH v2] mm: clean up hwpoison page cache page in fault path
From: Miaohe Lin @ 2022-02-16 2:13 UTC
To: Oscar Salvador, Rik van Riel
Cc: linux-kernel, kernel-team, linux-mm, Andrew Morton, Mel Gorman,
Johannes Weiner, Matthew Wilcox
On 2022/2/15 20:51, Oscar Salvador wrote:
> On Sat, Feb 12, 2022 at 09:37:40PM -0500, Rik van Riel wrote:
>> Sometimes the page offlining code can leave behind a hwpoisoned clean
>> page cache page. This can lead to programs being killed over and over
>> and over again as they fault in the hwpoisoned page, get killed, and
>> then get re-spawned by whatever wanted to run them.
>
> Hi Rik,
>
> Do you know how exactly that happens? We should not really be leaving
> anything behind, and the soft-offline (not hard) code works on the premise
> of only poisoning a page once it has been contained, so I am wondering
> what is going on here.
>
> In-use pagecache pages are migrated away and the actual page is
> contained; for clean ones, we already do the invalidate_inode_page()
> and then contain the page if we succeed.
>
IIUC, this could not happen when soft-offlining a pagecache page: such pages are either
invalidated or migrated away before we set PageHWPoison.
I think this may instead happen on a clean pagecache page while it's isolated, so it's !PageLRU,
and identify_page_state() treats it as me_unknown because it's not reserved, slab, swapcache,
and so on (see error_states for details). Or am I missing anything?
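Roughly, sketching identify_page_state() from memory (details elided, not
verbatim):

	static int identify_page_state(unsigned long pfn, struct page *p,
				       unsigned long page_flags)
	{
		struct page_state *ps;

		/*
		 * The first error_states entry whose flag mask matches wins.
		 * An isolated clean pagecache page is !PageLRU, so it matches
		 * none of the reserved/slab/swapcache/lru entries and falls
		 * through to the terminating catch-all, handled by me_unknown().
		 */
		for (ps = error_states; ps->mask; ps++)
			if ((page_flags & ps->mask) == ps->res)
				break;
		...
		return page_action(ps, p, pfn);
	}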
Thanks.
> One scenario where I can imagine this happening is if, by the time we call
> page_handle_poison(), someone has taken another refcount on the page,
> so the put_page() does not really free it, but I am not sure that
> can happen.
>
* Re: [PATCH v2] mm: clean up hwpoison page cache page in fault path
From: Oscar Salvador @ 2022-02-21 12:54 UTC
To: Miaohe Lin
Cc: Rik van Riel, linux-kernel, kernel-team, linux-mm, Andrew Morton,
Mel Gorman, Johannes Weiner, Matthew Wilcox
On Wed, Feb 16, 2022 at 10:13:14AM +0800, Miaohe Lin wrote:
> IIUC, this could not happen when soft-offlining a pagecache page: such pages are either
> invalidated or migrated away before we set PageHWPoison.
> I think this may instead happen on a clean pagecache page while it's isolated, so it's !PageLRU,
> and identify_page_state() treats it as me_unknown because it's not reserved, slab, swapcache,
> and so on (see error_states for details). Or am I missing anything?
But the path you are talking about is the one where we have a non-recoverable
error, i.e. the memory_failure() path.
AFAIU, Rik is talking about pages with corrected errors, and that is
soft_offline().
--
Oscar Salvador
SUSE Labs
* Re: [PATCH v2] mm: clean up hwpoison page cache page in fault path
From: Miaohe Lin @ 2022-02-21 13:19 UTC
To: Oscar Salvador
Cc: Rik van Riel, linux-kernel, kernel-team, linux-mm, Andrew Morton,
Mel Gorman, Johannes Weiner, Matthew Wilcox
On 2022/2/21 20:54, Oscar Salvador wrote:
> On Wed, Feb 16, 2022 at 10:13:14AM +0800, Miaohe Lin wrote:
>> IIUC, this could not happen when soft-offlining a pagecache page: such pages are either
>> invalidated or migrated away before we set PageHWPoison.
>> I think this may instead happen on a clean pagecache page while it's isolated, so it's !PageLRU,
>> and identify_page_state() treats it as me_unknown because it's not reserved, slab, swapcache,
>> and so on (see error_states for details). Or am I missing anything?
>
> But the path you are talking about is the one where we have a non-recoverable
> error, i.e. the memory_failure() path.
> AFAIU, Rik is talking about pages with corrected errors, and that is
> soft_offline().
Oh, yes, Rik is talking about pages with corrected errors. My mistake. Then I really
want to understand how we got there too. :)
Thanks.
>
* Re: [PATCH v2] mm: clean up hwpoison page cache page in fault path
From: Oscar Salvador @ 2022-02-21 12:55 UTC
To: Rik van Riel
Cc: linux-kernel, kernel-team, linux-mm, Miaohe Lin, Andrew Morton,
Mel Gorman, Johannes Weiner, Matthew Wilcox
On Sat, Feb 12, 2022 at 09:37:40PM -0500, Rik van Riel wrote:
> Sometimes the page offlining code can leave behind a hwpoisoned clean
> page cache page. This can lead to programs being killed over and over
> and over again as they fault in the hwpoisoned page, get killed, and
> then get re-spawned by whatever wanted to run them.
>
> This is particularly embarrassing when the page was offlined due to
> having too many corrected memory errors. Now we are killing tasks
> due to them trying to access memory that probably isn't even corrupted.
>
> This problem can be avoided by invalidating the page from the page
> fault handler, which already has a branch for dealing with these
> kinds of pages. With this patch we simply pretend the page fault
> was successful if the page was invalidated, return to userspace,
> incur another page fault, read in the file from disk (to a new
> memory page), and then everything works again.
>
> Signed-off-by: Rik van Riel <riel@surriel.com>
> Reviewed-by: Miaohe Lin <linmiaohe@huawei.com>
Although I would really love to understand how we got there,
it fixes the problem, so:
Reviewed-by: Oscar Salvador <osalvador@suse.de>
--
Oscar Salvador
SUSE Labs