* [PATCH 0/2] mm/hugetlb: a few fixup patches for hugetlb
@ 2024-04-18 2:19 Miaohe Lin
2024-04-18 2:19 ` [PATCH 1/2] mm/hugetlb: fix DEBUG_LOCKS_WARN_ON(1) when dissolve_free_hugetlb_folio() Miaohe Lin
2024-04-18 2:20 ` [PATCH 2/2] mm/hugetlb: fix unable to handle page fault for address dead000000000108 Miaohe Lin
0 siblings, 2 replies; 10+ messages in thread
From: Miaohe Lin @ 2024-04-18 2:19 UTC (permalink / raw)
To: akpm, muchun.song; +Cc: david, vbabka, willy, linmiaohe, linux-mm, linux-kernel
This series contains fixup patches for issues I observed while running
memory failure tests.
Thanks!
Miaohe Lin (2):
mm/hugetlb: fix DEBUG_LOCKS_WARN_ON(1) when
dissolve_free_hugetlb_folio()
mm/hugetlb: fix unable to handle page fault for address
dead000000000108
mm/hugetlb.c | 5 +++--
1 file changed, 3 insertions(+), 2 deletions(-)
--
2.33.0
^ permalink raw reply [flat|nested] 10+ messages in thread
* [PATCH 1/2] mm/hugetlb: fix DEBUG_LOCKS_WARN_ON(1) when dissolve_free_hugetlb_folio()
2024-04-18 2:19 [PATCH 0/2] mm/hugetlb: a few fixup patches for hugetlb Miaohe Lin
@ 2024-04-18 2:19 ` Miaohe Lin
2024-04-18 4:05 ` Oscar Salvador
2024-04-18 2:20 ` [PATCH 2/2] mm/hugetlb: fix unable to handle page fault for address dead000000000108 Miaohe Lin
1 sibling, 1 reply; 10+ messages in thread
From: Miaohe Lin @ 2024-04-18 2:19 UTC (permalink / raw)
To: akpm, muchun.song; +Cc: david, vbabka, willy, linmiaohe, linux-mm, linux-kernel
When I did memory failure tests recently, the warning below occurred:
DEBUG_LOCKS_WARN_ON(1)
WARNING: CPU: 8 PID: 1011 at kernel/locking/lockdep.c:232 __lock_acquire+0xccb/0x1ca0
Modules linked in: mce_inject hwpoison_inject
CPU: 8 PID: 1011 Comm: bash Kdump: loaded Not tainted 6.9.0-rc3-next-20240410-00012-gdb69f219f4be #3
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.14.0-0-g155821a1990b-prebuilt.qemu.org 04/01/2014
RIP: 0010:__lock_acquire+0xccb/0x1ca0
RSP: 0018:ffffa7a1c7fe3bd0 EFLAGS: 00000082
RAX: 0000000000000000 RBX: eb851eb853975fcf RCX: ffffa1ce5fc1c9c8
RDX: 00000000ffffffd8 RSI: 0000000000000027 RDI: ffffa1ce5fc1c9c0
RBP: ffffa1c6865d3280 R08: ffffffffb0f570a8 R09: 0000000000009ffb
R10: 0000000000000286 R11: ffffffffb0f2ad50 R12: ffffa1c6865d3d10
R13: ffffa1c6865d3c70 R14: 0000000000000000 R15: 0000000000000004
FS: 00007ff9f32aa740(0000) GS:ffffa1ce5fc00000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00007ff9f3134ba0 CR3: 00000008484e4000 CR4: 00000000000006f0
Call Trace:
<TASK>
lock_acquire+0xbe/0x2d0
_raw_spin_lock_irqsave+0x3a/0x60
hugepage_subpool_put_pages.part.0+0xe/0xc0
free_huge_folio+0x253/0x3f0
dissolve_free_huge_page+0x147/0x210
__page_handle_poison+0x9/0x70
memory_failure+0x4e6/0x8c0
hard_offline_page_store+0x55/0xa0
kernfs_fop_write_iter+0x12c/0x1d0
vfs_write+0x380/0x540
ksys_write+0x64/0xe0
do_syscall_64+0xbc/0x1d0
entry_SYSCALL_64_after_hwframe+0x77/0x7f
RIP: 0033:0x7ff9f3114887
RSP: 002b:00007ffecbacb458 EFLAGS: 00000246 ORIG_RAX: 0000000000000001
RAX: ffffffffffffffda RBX: 000000000000000c RCX: 00007ff9f3114887
RDX: 000000000000000c RSI: 0000564494164e10 RDI: 0000000000000001
RBP: 0000564494164e10 R08: 00007ff9f31d1460 R09: 000000007fffffff
R10: 0000000000000000 R11: 0000000000000246 R12: 000000000000000c
R13: 00007ff9f321b780 R14: 00007ff9f3217600 R15: 00007ff9f3216a00
</TASK>
Kernel panic - not syncing: kernel: panic_on_warn set ...
CPU: 8 PID: 1011 Comm: bash Kdump: loaded Not tainted 6.9.0-rc3-next-20240410-00012-gdb69f219f4be #3
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.14.0-0-g155821a1990b-prebuilt.qemu.org 04/01/2014
Call Trace:
<TASK>
panic+0x326/0x350
check_panic_on_warn+0x4f/0x50
__warn+0x98/0x190
report_bug+0x18e/0x1a0
handle_bug+0x3d/0x70
exc_invalid_op+0x18/0x70
asm_exc_invalid_op+0x1a/0x20
RIP: 0010:__lock_acquire+0xccb/0x1ca0
RSP: 0018:ffffa7a1c7fe3bd0 EFLAGS: 00000082
RAX: 0000000000000000 RBX: eb851eb853975fcf RCX: ffffa1ce5fc1c9c8
RDX: 00000000ffffffd8 RSI: 0000000000000027 RDI: ffffa1ce5fc1c9c0
RBP: ffffa1c6865d3280 R08: ffffffffb0f570a8 R09: 0000000000009ffb
R10: 0000000000000286 R11: ffffffffb0f2ad50 R12: ffffa1c6865d3d10
R13: ffffa1c6865d3c70 R14: 0000000000000000 R15: 0000000000000004
lock_acquire+0xbe/0x2d0
_raw_spin_lock_irqsave+0x3a/0x60
hugepage_subpool_put_pages.part.0+0xe/0xc0
free_huge_folio+0x253/0x3f0
dissolve_free_huge_page+0x147/0x210
__page_handle_poison+0x9/0x70
memory_failure+0x4e6/0x8c0
hard_offline_page_store+0x55/0xa0
kernfs_fop_write_iter+0x12c/0x1d0
vfs_write+0x380/0x540
ksys_write+0x64/0xe0
do_syscall_64+0xbc/0x1d0
entry_SYSCALL_64_after_hwframe+0x77/0x7f
RIP: 0033:0x7ff9f3114887
RSP: 002b:00007ffecbacb458 EFLAGS: 00000246 ORIG_RAX: 0000000000000001
RAX: ffffffffffffffda RBX: 000000000000000c RCX: 00007ff9f3114887
RDX: 000000000000000c RSI: 0000564494164e10 RDI: 0000000000000001
RBP: 0000564494164e10 R08: 00007ff9f31d1460 R09: 000000007fffffff
R10: 0000000000000000 R11: 0000000000000246 R12: 000000000000000c
R13: 00007ff9f321b780 R14: 00007ff9f3217600 R15: 00007ff9f3216a00
</TASK>
After bisecting with git and digging into the code, I believe the root cause
is that the _deferred_list field of struct folio is unioned with the
_hugetlb_subpool field. In __update_and_free_hugetlb_folio(),
folio->_deferred_list is always initialized, which corrupts
folio->_hugetlb_subpool when the folio is a hugetlb folio. Later,
free_huge_folio() uses the corrupted _hugetlb_subpool and the above
warning triggers. Fix this by initializing folio->_deferred_list only
if the folio is not a hugetlb folio.
Fixes: b6952b6272dd ("mm: always initialise folio->_deferred_list")
CC: stable@vger.kernel.org
Signed-off-by: Miaohe Lin <linmiaohe@huawei.com>
---
mm/hugetlb.c | 3 ++-
1 file changed, 2 insertions(+), 1 deletion(-)
diff --git a/mm/hugetlb.c b/mm/hugetlb.c
index 26ab9dfc7d63..1da9a14a5513 100644
--- a/mm/hugetlb.c
+++ b/mm/hugetlb.c
@@ -1788,7 +1788,8 @@ static void __update_and_free_hugetlb_folio(struct hstate *h,
destroy_compound_gigantic_folio(folio, huge_page_order(h));
free_gigantic_folio(folio, huge_page_order(h));
} else {
- INIT_LIST_HEAD(&folio->_deferred_list);
+ if (!folio_test_hugetlb(folio))
+ INIT_LIST_HEAD(&folio->_deferred_list);
folio_put(folio);
}
}
--
2.33.0
* [PATCH 2/2] mm/hugetlb: fix unable to handle page fault for address dead000000000108
2024-04-18 2:19 [PATCH 0/2] mm/hugetlb: a few fixup patches for hugetlb Miaohe Lin
2024-04-18 2:19 ` [PATCH 1/2] mm/hugetlb: fix DEBUG_LOCKS_WARN_ON(1) when dissolve_free_hugetlb_folio() Miaohe Lin
@ 2024-04-18 2:20 ` Miaohe Lin
2024-04-18 20:38 ` Andrew Morton
1 sibling, 1 reply; 10+ messages in thread
From: Miaohe Lin @ 2024-04-18 2:20 UTC (permalink / raw)
To: akpm, muchun.song; +Cc: david, vbabka, willy, linmiaohe, linux-mm, linux-kernel
The panic below occurred when I did a memory failure test:
BUG: unable to handle page fault for address: dead000000000108
PGD 0 P4D 0
Oops: Oops: 0001 [#1] PREEMPT SMP NOPTI
CPU: 0 PID: 1073 Comm: bash Not tainted 6.9.0-rc4-next-20240417-dirty #52
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.14.0-0-g155821a1990b-prebuilt.qemu.org 04/01/2014
RIP: 0010:enqueue_hugetlb_folio+0x46/0xe0
RSP: 0018:ffff9e0207f03d10 EFLAGS: 00000046
RAX: 0000000000000000 RBX: 0000000000000000 RCX: dead000000000122
RDX: ffffcbb244460008 RSI: dead000000000100 RDI: ffff976a09da6f90
RBP: ffffcbb244460000 R08: 0000000000000001 R09: 0000000000000001
R10: 0000000000000001 R11: 7a088d6100000000 R12: ffffffffbcc93160
R13: 0000000000000246 R14: 0000000000000000 R15: 0000000000000000
FS: 00007fdb749b1740(0000) GS:ffff97711fc00000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: dead000000000108 CR3: 00000001078ac000 CR4: 00000000000006f0
Call Trace:
<TASK>
free_huge_folio+0x28d/0x420
dissolve_free_hugetlb_folio+0x135/0x1d0
__page_handle_poison+0x18/0xb0
memory_failure+0x712/0xd30
hard_offline_page_store+0x55/0xa0
kernfs_fop_write_iter+0x12c/0x1d0
vfs_write+0x380/0x540
ksys_write+0x64/0xe0
do_syscall_64+0xbc/0x1d0
entry_SYSCALL_64_after_hwframe+0x77/0x7f
RIP: 0033:0x7fdb74714887
RSP: 002b:00007ffdfc7074e8 EFLAGS: 00000246 ORIG_RAX: 0000000000000001
RAX: ffffffffffffffda RBX: 000000000000000c RCX: 00007fdb74714887
RDX: 000000000000000c RSI: 00005653ec7c0e10 RDI: 0000000000000001
RBP: 00005653ec7c0e10 R08: 00007fdb747d1460 R09: 000000007fffffff
R10: 0000000000000000 R11: 0000000000000246 R12: 000000000000000c
R13: 00007fdb7481b780 R14: 00007fdb74817600 R15: 00007fdb74816a00
</TASK>
Modules linked in: mce_inject hwpoison_inject
CR2: dead000000000108
---[ end trace 0000000000000000 ]---
RIP: 0010:enqueue_hugetlb_folio+0x46/0xe0
RSP: 0018:ffff9e0207f03d10 EFLAGS: 00000046
RAX: 0000000000000000 RBX: 0000000000000000 RCX: dead000000000122
RDX: ffffcbb244460008 RSI: dead000000000100 RDI: ffff976a09da6f90
RBP: ffffcbb244460000 R08: 0000000000000001 R09: 0000000000000001
R10: 0000000000000001 R11: 7a088d6100000000 R12: ffffffffbcc93160
R13: 0000000000000246 R14: 0000000000000000 R15: 0000000000000000
FS: 00007fdb749b1740(0000) GS:ffff97711fc00000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: dead000000000108 CR3: 00000001078ac000 CR4: 00000000000006f0
Kernel panic - not syncing: Fatal exception
Kernel Offset: 0x38a00000 from 0xffffffff81000000 (relocation range: 0xffffffff80000000-0xffffffffbfffffff)
---[ end Kernel panic - not syncing: Fatal exception ]---
The root cause is that list_del() is used to remove the folio from its list
in dissolve_free_hugetlb_folio(), which leaves LIST_POISON1/LIST_POISON2 in
its list pointers. But free_huge_folio() may later use list_move() to
re-enqueue the hugetlb folio, dereferencing the poisoned pointers
(0xdead000000000100 + 8 is exactly the faulting address dead000000000108)
and causing the above panic. Fix this issue by using list_del_init() to
remove the folio, so that its list_head stays valid for re-enqueueing.
Signed-off-by: Miaohe Lin <linmiaohe@huawei.com>
---
mm/hugetlb.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/mm/hugetlb.c b/mm/hugetlb.c
index 1da9a14a5513..08634732dca4 100644
--- a/mm/hugetlb.c
+++ b/mm/hugetlb.c
@@ -1642,7 +1642,7 @@ static void __remove_hugetlb_folio(struct hstate *h, struct folio *folio,
if (hstate_is_gigantic(h) && !gigantic_page_runtime_supported())
return;
- list_del(&folio->lru);
+ list_del_init(&folio->lru);
if (folio_test_hugetlb_freed(folio)) {
h->free_huge_pages--;
--
2.33.0
* Re: [PATCH 1/2] mm/hugetlb: fix DEBUG_LOCKS_WARN_ON(1) when dissolve_free_hugetlb_folio()
2024-04-18 2:19 ` [PATCH 1/2] mm/hugetlb: fix DEBUG_LOCKS_WARN_ON(1) when dissolve_free_hugetlb_folio() Miaohe Lin
@ 2024-04-18 4:05 ` Oscar Salvador
2024-04-18 8:00 ` Miaohe Lin
0 siblings, 1 reply; 10+ messages in thread
From: Oscar Salvador @ 2024-04-18 4:05 UTC (permalink / raw)
To: Miaohe Lin
Cc: akpm, muchun.song, david, vbabka, willy, linux-mm, linux-kernel
On Thu, Apr 18, 2024 at 10:19:59AM +0800, Miaohe Lin wrote:
> diff --git a/mm/hugetlb.c b/mm/hugetlb.c
> index 26ab9dfc7d63..1da9a14a5513 100644
> --- a/mm/hugetlb.c
> +++ b/mm/hugetlb.c
> @@ -1788,7 +1788,8 @@ static void __update_and_free_hugetlb_folio(struct hstate *h,
> destroy_compound_gigantic_folio(folio, huge_page_order(h));
> free_gigantic_folio(folio, huge_page_order(h));
> } else {
> - INIT_LIST_HEAD(&folio->_deferred_list);
> + if (!folio_test_hugetlb(folio))
> + INIT_LIST_HEAD(&folio->_deferred_list);
Ok, it took me a bit to figure this out.
So we basically init _deferred_list when we know that
folio_put() will not end up calling free_huge_folio(),
because a previous call to remove_hugetlb_folio() has already cleared the
hugetlb flag.
Maybe Matthew thought that any folio ending up here would not end up in
free_huge_folio() (which is the one fiddling with the subpool).
I mean, the fix looks good, because if the hugetlb flag is cleared,
destroy_large_folio() will go straight to free_the_page(), but the
whole thing is a bit subtle.
And if we decide to go with this, I think we are going to need a comment
in there explaining what is going on, like "only init _deferred_list if
free_huge_folio() cannot be called".
--
Oscar Salvador
SUSE Labs
* Re: [PATCH 1/2] mm/hugetlb: fix DEBUG_LOCKS_WARN_ON(1) when dissolve_free_hugetlb_folio()
2024-04-18 4:05 ` Oscar Salvador
@ 2024-04-18 8:00 ` Miaohe Lin
2024-04-18 12:41 ` Oscar Salvador
0 siblings, 1 reply; 10+ messages in thread
From: Miaohe Lin @ 2024-04-18 8:00 UTC (permalink / raw)
To: Oscar Salvador
Cc: akpm, muchun.song, david, vbabka, willy, linux-mm, linux-kernel
On 2024/4/18 12:05, Oscar Salvador wrote:
> On Thu, Apr 18, 2024 at 10:19:59AM +0800, Miaohe Lin wrote:
>> diff --git a/mm/hugetlb.c b/mm/hugetlb.c
>> index 26ab9dfc7d63..1da9a14a5513 100644
>> --- a/mm/hugetlb.c
>> +++ b/mm/hugetlb.c
>> @@ -1788,7 +1788,8 @@ static void __update_and_free_hugetlb_folio(struct hstate *h,
>> destroy_compound_gigantic_folio(folio, huge_page_order(h));
>> free_gigantic_folio(folio, huge_page_order(h));
>> } else {
>> - INIT_LIST_HEAD(&folio->_deferred_list);
>> + if (!folio_test_hugetlb(folio))
>> + INIT_LIST_HEAD(&folio->_deferred_list);
>
> Ok, it took me a bit to figure this out.
>
> So we basically init __deferred_list when we know that
> folio_put will not end up calling free_huge_folio
> because a previous call to remove_hugetlb_folio has already cleared the
> bit.
>
> Maybe Matthew thought that any folio ending here would not end up in
> free_huge_folio (which is the one fiddling subpool).
>
> I mean, fix looks good because if hugetlb flag is cleared,
> destroy_large_folio will go straight to free_the_page, but the
> whole thing is a bit subtle.
AFAICS, this is the most straightforward way to fix the issue. Do you have any suggestions
on how to fix this in a more graceful way?
>
> And if we decide to go with this, I think we are going to need a comment
> in there explaining what is going on like "only init _deferred_list if
> free_huge_folio cannot be call".
Yes, this comment will help.
Thanks.
>
>
* Re: [PATCH 1/2] mm/hugetlb: fix DEBUG_LOCKS_WARN_ON(1) when dissolve_free_hugetlb_folio()
2024-04-18 8:00 ` Miaohe Lin
@ 2024-04-18 12:41 ` Oscar Salvador
2024-04-19 2:00 ` Miaohe Lin
0 siblings, 1 reply; 10+ messages in thread
From: Oscar Salvador @ 2024-04-18 12:41 UTC (permalink / raw)
To: Miaohe Lin
Cc: akpm, muchun.song, david, vbabka, willy, linux-mm, linux-kernel
On Thu, Apr 18, 2024 at 04:00:42PM +0800, Miaohe Lin wrote:
> On 2024/4/18 12:05, Oscar Salvador wrote:
> > On Thu, Apr 18, 2024 at 10:19:59AM +0800, Miaohe Lin wrote:
> >> diff --git a/mm/hugetlb.c b/mm/hugetlb.c
> >> index 26ab9dfc7d63..1da9a14a5513 100644
> >> --- a/mm/hugetlb.c
> >> +++ b/mm/hugetlb.c
> >> @@ -1788,7 +1788,8 @@ static void __update_and_free_hugetlb_folio(struct hstate *h,
> >> destroy_compound_gigantic_folio(folio, huge_page_order(h));
> >> free_gigantic_folio(folio, huge_page_order(h));
> >> } else {
> >> - INIT_LIST_HEAD(&folio->_deferred_list);
> >> + if (!folio_test_hugetlb(folio))
> >> + INIT_LIST_HEAD(&folio->_deferred_list);
> >
> > Ok, it took me a bit to figure this out.
> >
> > So we basically init __deferred_list when we know that
> > folio_put will not end up calling free_huge_folio
> > because a previous call to remove_hugetlb_folio has already cleared the
> > bit.
> >
> > Maybe Matthew thought that any folio ending here would not end up in
> > free_huge_folio (which is the one fiddling subpool).
> >
> > I mean, fix looks good because if hugetlb flag is cleared,
> > destroy_large_folio will go straight to free_the_page, but the
> > whole thing is a bit subtle.
>
> AFAICS, this is the most straightforward way to fix the issue. Do you have any suggestions
> on how to fix this in a more graceful way?
Not off the top of my head.
Anyway, I have been thinking for a while that this code needs some love,
so I will check how this can be untangled.
--
Oscar Salvador
SUSE Labs
* Re: [PATCH 2/2] mm/hugetlb: fix unable to handle page fault for address dead000000000108
2024-04-18 2:20 ` [PATCH 2/2] mm/hugetlb: fix unable to handle page fault for address dead000000000108 Miaohe Lin
@ 2024-04-18 20:38 ` Andrew Morton
2024-04-19 2:07 ` Miaohe Lin
2024-04-19 9:07 ` Miaohe Lin
0 siblings, 2 replies; 10+ messages in thread
From: Andrew Morton @ 2024-04-18 20:38 UTC (permalink / raw)
To: Miaohe Lin; +Cc: muchun.song, david, vbabka, willy, linux-mm, linux-kernel
On Thu, 18 Apr 2024 10:20:00 +0800 Miaohe Lin <linmiaohe@huawei.com> wrote:
> Below panic occurs when I did memory failure test:
>
> BUG: unable to handle page fault for address: dead000000000108
>
> ...
>
> The root cause is that list_del() is used to remove folio from list when
> dissolve_free_hugetlb_folio(). But list_move() might be used to reenqueue
> hugetlb folio when free_huge_folio() leading to above panic. Fix this
> issue by using list_del_init() to remove folio.
>
> ...
>
> --- a/mm/hugetlb.c
> +++ b/mm/hugetlb.c
> @@ -1642,7 +1642,7 @@ static void __remove_hugetlb_folio(struct hstate *h, struct folio *folio,
> if (hstate_is_gigantic(h) && !gigantic_page_runtime_supported())
> return;
>
> - list_del(&folio->lru);
> + list_del_init(&folio->lru);
>
> if (folio_test_hugetlb_freed(folio)) {
> h->free_huge_pages--;
We should cc:stable and find a Fixes:. This appears to predate
6eb4e88a6d27022ea8aff424d47a0a5dfc9fcb34, after which I got lost.
* Re: [PATCH 1/2] mm/hugetlb: fix DEBUG_LOCKS_WARN_ON(1) when dissolve_free_hugetlb_folio()
2024-04-18 12:41 ` Oscar Salvador
@ 2024-04-19 2:00 ` Miaohe Lin
0 siblings, 0 replies; 10+ messages in thread
From: Miaohe Lin @ 2024-04-19 2:00 UTC (permalink / raw)
To: Oscar Salvador
Cc: akpm, muchun.song, david, vbabka, willy, linux-mm, linux-kernel
On 2024/4/18 20:41, Oscar Salvador wrote:
> On Thu, Apr 18, 2024 at 04:00:42PM +0800, Miaohe Lin wrote:
>> On 2024/4/18 12:05, Oscar Salvador wrote:
>>> On Thu, Apr 18, 2024 at 10:19:59AM +0800, Miaohe Lin wrote:
>>>> diff --git a/mm/hugetlb.c b/mm/hugetlb.c
>>>> index 26ab9dfc7d63..1da9a14a5513 100644
>>>> --- a/mm/hugetlb.c
>>>> +++ b/mm/hugetlb.c
>>>> @@ -1788,7 +1788,8 @@ static void __update_and_free_hugetlb_folio(struct hstate *h,
>>>> destroy_compound_gigantic_folio(folio, huge_page_order(h));
>>>> free_gigantic_folio(folio, huge_page_order(h));
>>>> } else {
>>>> - INIT_LIST_HEAD(&folio->_deferred_list);
>>>> + if (!folio_test_hugetlb(folio))
>>>> + INIT_LIST_HEAD(&folio->_deferred_list);
>>>
>>> Ok, it took me a bit to figure this out.
>>>
>>> So we basically init __deferred_list when we know that
>>> folio_put will not end up calling free_huge_folio
>>> because a previous call to remove_hugetlb_folio has already cleared the
>>> bit.
>>>
>>> Maybe Matthew thought that any folio ending here would not end up in
>>> free_huge_folio (which is the one fiddling subpool).
>>>
>>> I mean, fix looks good because if hugetlb flag is cleared,
>>> destroy_large_folio will go straight to free_the_page, but the
>>> whole thing is a bit subtle.
>>
>> AFAICS, this is the most straightforward way to fix the issue. Do you have any suggestions
>> on how to fix this in a more graceful way?
>
> Not from the top of my head.
> Anyway, I have been thinking for a while that this code needs some love,
> so I will check how this can be untangled.
That would be really nice. Thanks Oscar.
>
>
* Re: [PATCH 2/2] mm/hugetlb: fix unable to handle page fault for address dead000000000108
2024-04-18 20:38 ` Andrew Morton
@ 2024-04-19 2:07 ` Miaohe Lin
2024-04-19 9:07 ` Miaohe Lin
1 sibling, 0 replies; 10+ messages in thread
From: Miaohe Lin @ 2024-04-19 2:07 UTC (permalink / raw)
To: Andrew Morton; +Cc: muchun.song, david, vbabka, willy, linux-mm, linux-kernel
On 2024/4/19 4:38, Andrew Morton wrote:
> On Thu, 18 Apr 2024 10:20:00 +0800 Miaohe Lin <linmiaohe@huawei.com> wrote:
>
>> Below panic occurs when I did memory failure test:
>>
>> BUG: unable to handle page fault for address: dead000000000108
>>
>> ...
>>
>> The root cause is that list_del() is used to remove folio from list when
>> dissolve_free_hugetlb_folio(). But list_move() might be used to reenqueue
>> hugetlb folio when free_huge_folio() leading to above panic. Fix this
>> issue by using list_del_init() to remove folio.
>>
>> ...
>>
>> --- a/mm/hugetlb.c
>> +++ b/mm/hugetlb.c
>> @@ -1642,7 +1642,7 @@ static void __remove_hugetlb_folio(struct hstate *h, struct folio *folio,
>> if (hstate_is_gigantic(h) && !gigantic_page_runtime_supported())
>> return;
>>
>> - list_del(&folio->lru);
>> + list_del_init(&folio->lru);
>>
>> if (folio_test_hugetlb_freed(folio)) {
>> h->free_huge_pages--;
>
> We should cc:stable and find a Fixes:. This appears to predate
> 6eb4e88a6d27022ea8aff424d47a0a5dfc9fcb34, after which I got lost.
It's weird that I didn't observe this issue before the last merge window,
while the corresponding code logic seems unchanged. I will try again to
find a Fixes tag.
Thanks.
> .
>
* Re: [PATCH 2/2] mm/hugetlb: fix unable to handle page fault for address dead000000000108
2024-04-18 20:38 ` Andrew Morton
2024-04-19 2:07 ` Miaohe Lin
@ 2024-04-19 9:07 ` Miaohe Lin
1 sibling, 0 replies; 10+ messages in thread
From: Miaohe Lin @ 2024-04-19 9:07 UTC (permalink / raw)
To: Andrew Morton; +Cc: muchun.song, david, vbabka, willy, linux-mm, linux-kernel
On 2024/4/19 4:38, Andrew Morton wrote:
> On Thu, 18 Apr 2024 10:20:00 +0800 Miaohe Lin <linmiaohe@huawei.com> wrote:
>
>> Below panic occurs when I did memory failure test:
>>
>> BUG: unable to handle page fault for address: dead000000000108
>>
>> ...
>>
>> The root cause is that list_del() is used to remove folio from list when
>> dissolve_free_hugetlb_folio(). But list_move() might be used to reenqueue
>> hugetlb folio when free_huge_folio() leading to above panic. Fix this
>> issue by using list_del_init() to remove folio.
>>
>> ...
>>
>> --- a/mm/hugetlb.c
>> +++ b/mm/hugetlb.c
>> @@ -1642,7 +1642,7 @@ static void __remove_hugetlb_folio(struct hstate *h, struct folio *folio,
>> if (hstate_is_gigantic(h) && !gigantic_page_runtime_supported())
>> return;
>>
>> - list_del(&folio->lru);
>> + list_del_init(&folio->lru);
>>
>> if (folio_test_hugetlb_freed(folio)) {
>> h->free_huge_pages--;
>
> We should cc:stable and find a Fixes:. This appears to predate
> 6eb4e88a6d27022ea8aff424d47a0a5dfc9fcb34, after which I got lost.
I think this series can be dropped, as it didn't fix the root cause.
Please see my v2 patch for details. So the Fixes tag isn't needed anymore.
Thanks.
> .
>