* [PATCH 0/2] mm/hugetlb: a few fixup patches for hugetlb
@ 2024-04-18  2:19 Miaohe Lin
  2024-04-18  2:19 ` [PATCH 1/2] mm/hugetlb: fix DEBUG_LOCKS_WARN_ON(1) when dissolve_free_hugetlb_folio() Miaohe Lin
  2024-04-18  2:20 ` [PATCH 2/2] mm/hugetlb: fix unable to handle page fault for address dead000000000108 Miaohe Lin
  0 siblings, 2 replies; 10+ messages in thread
From: Miaohe Lin @ 2024-04-18  2:19 UTC (permalink / raw)
  To: akpm, muchun.song; +Cc: david, vbabka, willy, linmiaohe, linux-mm, linux-kernel

This series contains fixup patches for issues I observed while running
memory failure tests.
Thanks!

Miaohe Lin (2):
  mm/hugetlb: fix DEBUG_LOCKS_WARN_ON(1) when
    dissolve_free_hugetlb_folio()
  mm/hugetlb: fix unable to handle page fault for address
    dead000000000108

 mm/hugetlb.c | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

-- 
2.33.0



* [PATCH 1/2] mm/hugetlb: fix DEBUG_LOCKS_WARN_ON(1) when dissolve_free_hugetlb_folio()
  2024-04-18  2:19 [PATCH 0/2] mm/hugetlb: a few fixup patches for hugetlb Miaohe Lin
@ 2024-04-18  2:19 ` Miaohe Lin
  2024-04-18  4:05   ` Oscar Salvador
  2024-04-18  2:20 ` [PATCH 2/2] mm/hugetlb: fix unable to handle page fault for address dead000000000108 Miaohe Lin
  1 sibling, 1 reply; 10+ messages in thread
From: Miaohe Lin @ 2024-04-18  2:19 UTC (permalink / raw)
  To: akpm, muchun.song; +Cc: david, vbabka, willy, linmiaohe, linux-mm, linux-kernel

While running memory failure tests recently, I hit the following warning:

DEBUG_LOCKS_WARN_ON(1)
WARNING: CPU: 8 PID: 1011 at kernel/locking/lockdep.c:232 __lock_acquire+0xccb/0x1ca0
Modules linked in: mce_inject hwpoison_inject
CPU: 8 PID: 1011 Comm: bash Kdump: loaded Not tainted 6.9.0-rc3-next-20240410-00012-gdb69f219f4be #3
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.14.0-0-g155821a1990b-prebuilt.qemu.org 04/01/2014
RIP: 0010:__lock_acquire+0xccb/0x1ca0
RSP: 0018:ffffa7a1c7fe3bd0 EFLAGS: 00000082
RAX: 0000000000000000 RBX: eb851eb853975fcf RCX: ffffa1ce5fc1c9c8
RDX: 00000000ffffffd8 RSI: 0000000000000027 RDI: ffffa1ce5fc1c9c0
RBP: ffffa1c6865d3280 R08: ffffffffb0f570a8 R09: 0000000000009ffb
R10: 0000000000000286 R11: ffffffffb0f2ad50 R12: ffffa1c6865d3d10
R13: ffffa1c6865d3c70 R14: 0000000000000000 R15: 0000000000000004
FS:  00007ff9f32aa740(0000) GS:ffffa1ce5fc00000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00007ff9f3134ba0 CR3: 00000008484e4000 CR4: 00000000000006f0
Call Trace:
 <TASK>
 lock_acquire+0xbe/0x2d0
 _raw_spin_lock_irqsave+0x3a/0x60
 hugepage_subpool_put_pages.part.0+0xe/0xc0
 free_huge_folio+0x253/0x3f0
 dissolve_free_huge_page+0x147/0x210
 __page_handle_poison+0x9/0x70
 memory_failure+0x4e6/0x8c0
 hard_offline_page_store+0x55/0xa0
 kernfs_fop_write_iter+0x12c/0x1d0
 vfs_write+0x380/0x540
 ksys_write+0x64/0xe0
 do_syscall_64+0xbc/0x1d0
 entry_SYSCALL_64_after_hwframe+0x77/0x7f
RIP: 0033:0x7ff9f3114887
RSP: 002b:00007ffecbacb458 EFLAGS: 00000246 ORIG_RAX: 0000000000000001
RAX: ffffffffffffffda RBX: 000000000000000c RCX: 00007ff9f3114887
RDX: 000000000000000c RSI: 0000564494164e10 RDI: 0000000000000001
RBP: 0000564494164e10 R08: 00007ff9f31d1460 R09: 000000007fffffff
R10: 0000000000000000 R11: 0000000000000246 R12: 000000000000000c
R13: 00007ff9f321b780 R14: 00007ff9f3217600 R15: 00007ff9f3216a00
 </TASK>
Kernel panic - not syncing: kernel: panic_on_warn set ...
CPU: 8 PID: 1011 Comm: bash Kdump: loaded Not tainted 6.9.0-rc3-next-20240410-00012-gdb69f219f4be #3
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.14.0-0-g155821a1990b-prebuilt.qemu.org 04/01/2014
Call Trace:
 <TASK>
 panic+0x326/0x350
 check_panic_on_warn+0x4f/0x50
 __warn+0x98/0x190
 report_bug+0x18e/0x1a0
 handle_bug+0x3d/0x70
 exc_invalid_op+0x18/0x70
 asm_exc_invalid_op+0x1a/0x20
RIP: 0010:__lock_acquire+0xccb/0x1ca0
RSP: 0018:ffffa7a1c7fe3bd0 EFLAGS: 00000082
RAX: 0000000000000000 RBX: eb851eb853975fcf RCX: ffffa1ce5fc1c9c8
RDX: 00000000ffffffd8 RSI: 0000000000000027 RDI: ffffa1ce5fc1c9c0
RBP: ffffa1c6865d3280 R08: ffffffffb0f570a8 R09: 0000000000009ffb
R10: 0000000000000286 R11: ffffffffb0f2ad50 R12: ffffa1c6865d3d10
R13: ffffa1c6865d3c70 R14: 0000000000000000 R15: 0000000000000004
 lock_acquire+0xbe/0x2d0
 _raw_spin_lock_irqsave+0x3a/0x60
 hugepage_subpool_put_pages.part.0+0xe/0xc0
 free_huge_folio+0x253/0x3f0
 dissolve_free_huge_page+0x147/0x210
 __page_handle_poison+0x9/0x70
 memory_failure+0x4e6/0x8c0
 hard_offline_page_store+0x55/0xa0
 kernfs_fop_write_iter+0x12c/0x1d0
 vfs_write+0x380/0x540
 ksys_write+0x64/0xe0
 do_syscall_64+0xbc/0x1d0
 entry_SYSCALL_64_after_hwframe+0x77/0x7f
RIP: 0033:0x7ff9f3114887
RSP: 002b:00007ffecbacb458 EFLAGS: 00000246 ORIG_RAX: 0000000000000001
RAX: ffffffffffffffda RBX: 000000000000000c RCX: 00007ff9f3114887
RDX: 000000000000000c RSI: 0000564494164e10 RDI: 0000000000000001
RBP: 0000564494164e10 R08: 00007ff9f31d1460 R09: 000000007fffffff
R10: 0000000000000000 R11: 0000000000000246 R12: 000000000000000c
R13: 00007ff9f321b780 R14: 00007ff9f3217600 R15: 00007ff9f3216a00
 </TASK>

After git bisecting and digging into the code, I believe the root cause is
that the _deferred_list field of struct folio is unioned with the
_hugetlb_subpool field. In __update_and_free_hugetlb_folio(),
folio->_deferred_list is always initialized, which corrupts
folio->_hugetlb_subpool while the folio is still a hugetlb folio. Later,
free_huge_folio() uses _hugetlb_subpool and the above warning triggers.
Fix this by initializing folio->_deferred_list only when the folio is not
hugetlb.
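
A simplified sketch of the overlapping layout (not the actual struct folio
definition; see include/linux/mm_types.h for the real layout):

	/*
	 * Sketch only: the point is that _deferred_list shares storage
	 * with the hugetlb-only fields, so INIT_LIST_HEAD() on
	 * folio->_deferred_list overwrites folio->_hugetlb_subpool while
	 * the folio is still a hugetlb folio.
	 */
	union {
		struct {
			void *_hugetlb_subpool;		/* read by free_huge_folio() */
			void *_hugetlb_cgroup;
		};
		struct list_head _deferred_list;	/* two pointers, same storage */
	};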

Fixes: b6952b6272dd ("mm: always initialise folio->_deferred_list")
CC: stable@vger.kernel.org
Signed-off-by: Miaohe Lin <linmiaohe@huawei.com>
---
 mm/hugetlb.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/mm/hugetlb.c b/mm/hugetlb.c
index 26ab9dfc7d63..1da9a14a5513 100644
--- a/mm/hugetlb.c
+++ b/mm/hugetlb.c
@@ -1788,7 +1788,8 @@ static void __update_and_free_hugetlb_folio(struct hstate *h,
 		destroy_compound_gigantic_folio(folio, huge_page_order(h));
 		free_gigantic_folio(folio, huge_page_order(h));
 	} else {
-		INIT_LIST_HEAD(&folio->_deferred_list);
+		if (!folio_test_hugetlb(folio))
+			INIT_LIST_HEAD(&folio->_deferred_list);
 		folio_put(folio);
 	}
 }
-- 
2.33.0



* [PATCH 2/2] mm/hugetlb: fix unable to handle page fault for address dead000000000108
  2024-04-18  2:19 [PATCH 0/2] mm/hugetlb: a few fixup patches for hugetlb Miaohe Lin
  2024-04-18  2:19 ` [PATCH 1/2] mm/hugetlb: fix DEBUG_LOCKS_WARN_ON(1) when dissolve_free_hugetlb_folio() Miaohe Lin
@ 2024-04-18  2:20 ` Miaohe Lin
  2024-04-18 20:38   ` Andrew Morton
  1 sibling, 1 reply; 10+ messages in thread
From: Miaohe Lin @ 2024-04-18  2:20 UTC (permalink / raw)
  To: akpm, muchun.song; +Cc: david, vbabka, willy, linmiaohe, linux-mm, linux-kernel

The following panic occurred while I was running memory failure tests:

BUG: unable to handle page fault for address: dead000000000108
PGD 0 P4D 0
Oops: Oops: 0001 [#1] PREEMPT SMP NOPTI
CPU: 0 PID: 1073 Comm: bash Not tainted 6.9.0-rc4-next-20240417-dirty #52
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.14.0-0-g155821a1990b-prebuilt.qemu.org 04/01/2014
RIP: 0010:enqueue_hugetlb_folio+0x46/0xe0
RSP: 0018:ffff9e0207f03d10 EFLAGS: 00000046
RAX: 0000000000000000 RBX: 0000000000000000 RCX: dead000000000122
RDX: ffffcbb244460008 RSI: dead000000000100 RDI: ffff976a09da6f90
RBP: ffffcbb244460000 R08: 0000000000000001 R09: 0000000000000001
R10: 0000000000000001 R11: 7a088d6100000000 R12: ffffffffbcc93160
R13: 0000000000000246 R14: 0000000000000000 R15: 0000000000000000
FS:  00007fdb749b1740(0000) GS:ffff97711fc00000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: dead000000000108 CR3: 00000001078ac000 CR4: 00000000000006f0
Call Trace:
 <TASK>
 free_huge_folio+0x28d/0x420
 dissolve_free_hugetlb_folio+0x135/0x1d0
 __page_handle_poison+0x18/0xb0
 memory_failure+0x712/0xd30
 hard_offline_page_store+0x55/0xa0
 kernfs_fop_write_iter+0x12c/0x1d0
 vfs_write+0x380/0x540
 ksys_write+0x64/0xe0
 do_syscall_64+0xbc/0x1d0
 entry_SYSCALL_64_after_hwframe+0x77/0x7f
RIP: 0033:0x7fdb74714887
RSP: 002b:00007ffdfc7074e8 EFLAGS: 00000246 ORIG_RAX: 0000000000000001
RAX: ffffffffffffffda RBX: 000000000000000c RCX: 00007fdb74714887
RDX: 000000000000000c RSI: 00005653ec7c0e10 RDI: 0000000000000001
RBP: 00005653ec7c0e10 R08: 00007fdb747d1460 R09: 000000007fffffff
R10: 0000000000000000 R11: 0000000000000246 R12: 000000000000000c
R13: 00007fdb7481b780 R14: 00007fdb74817600 R15: 00007fdb74816a00
 </TASK>
Modules linked in: mce_inject hwpoison_inject
CR2: dead000000000108
---[ end trace 0000000000000000 ]---
RIP: 0010:enqueue_hugetlb_folio+0x46/0xe0
RSP: 0018:ffff9e0207f03d10 EFLAGS: 00000046
RAX: 0000000000000000 RBX: 0000000000000000 RCX: dead000000000122
RDX: ffffcbb244460008 RSI: dead000000000100 RDI: ffff976a09da6f90
RBP: ffffcbb244460000 R08: 0000000000000001 R09: 0000000000000001
R10: 0000000000000001 R11: 7a088d6100000000 R12: ffffffffbcc93160
R13: 0000000000000246 R14: 0000000000000000 R15: 0000000000000000
FS:  00007fdb749b1740(0000) GS:ffff97711fc00000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: dead000000000108 CR3: 00000001078ac000 CR4: 00000000000006f0
Kernel panic - not syncing: Fatal exception
Kernel Offset: 0x38a00000 from 0xffffffff81000000 (relocation range: 0xffffffff80000000-0xffffffffbfffffff)
---[ end Kernel panic - not syncing: Fatal exception ]---

The root cause is that list_del() is used to remove the folio from its list
in dissolve_free_hugetlb_folio(), which poisons folio->lru. But list_move()
might later be used to re-enqueue the hugetlb folio in free_huge_folio(),
dereferencing the poisoned pointers and leading to the above panic. Fix this
issue by using list_del_init() to remove the folio instead.
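
As a rough sketch of the failure mode, modelled on the helpers in
include/linux/list.h (poison values assume x86_64's pointer poison offset):

	struct list_head { struct list_head *next, *prev; };

	#define LIST_POISON1 ((struct list_head *)0xdead000000000100UL)
	#define LIST_POISON2 ((struct list_head *)0xdead000000000122UL)

	static void list_del(struct list_head *entry)
	{
		entry->prev->next = entry->next;
		entry->next->prev = entry->prev;
		entry->next = LIST_POISON1;	/* entry must not be reused */
		entry->prev = LIST_POISON2;
	}

	static void list_del_init(struct list_head *entry)
	{
		entry->prev->next = entry->next;
		entry->next->prev = entry->prev;
		entry->next = entry;		/* entry is now an empty list */
		entry->prev = entry;
	}

	/*
	 * list_move() starts by unlinking the entry, i.e. it performs
	 * "next->prev = prev" on the entry being moved.  After a plain
	 * list_del() that is a write to LIST_POISON1 + 8 =
	 * 0xdead000000000108, which is the faulting address above.
	 * After list_del_init() the entry is a valid empty list and
	 * list_move() simply re-enqueues it.
	 */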

Signed-off-by: Miaohe Lin <linmiaohe@huawei.com>
---
 mm/hugetlb.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/mm/hugetlb.c b/mm/hugetlb.c
index 1da9a14a5513..08634732dca4 100644
--- a/mm/hugetlb.c
+++ b/mm/hugetlb.c
@@ -1642,7 +1642,7 @@ static void __remove_hugetlb_folio(struct hstate *h, struct folio *folio,
 	if (hstate_is_gigantic(h) && !gigantic_page_runtime_supported())
 		return;
 
-	list_del(&folio->lru);
+	list_del_init(&folio->lru);
 
 	if (folio_test_hugetlb_freed(folio)) {
 		h->free_huge_pages--;
-- 
2.33.0



* Re: [PATCH 1/2] mm/hugetlb: fix DEBUG_LOCKS_WARN_ON(1) when dissolve_free_hugetlb_folio()
  2024-04-18  2:19 ` [PATCH 1/2] mm/hugetlb: fix DEBUG_LOCKS_WARN_ON(1) when dissolve_free_hugetlb_folio() Miaohe Lin
@ 2024-04-18  4:05   ` Oscar Salvador
  2024-04-18  8:00     ` Miaohe Lin
  0 siblings, 1 reply; 10+ messages in thread
From: Oscar Salvador @ 2024-04-18  4:05 UTC (permalink / raw)
  To: Miaohe Lin
  Cc: akpm, muchun.song, david, vbabka, willy, linux-mm, linux-kernel

On Thu, Apr 18, 2024 at 10:19:59AM +0800, Miaohe Lin wrote:
> diff --git a/mm/hugetlb.c b/mm/hugetlb.c
> index 26ab9dfc7d63..1da9a14a5513 100644
> --- a/mm/hugetlb.c
> +++ b/mm/hugetlb.c
> @@ -1788,7 +1788,8 @@ static void __update_and_free_hugetlb_folio(struct hstate *h,
>  		destroy_compound_gigantic_folio(folio, huge_page_order(h));
>  		free_gigantic_folio(folio, huge_page_order(h));
>  	} else {
> -		INIT_LIST_HEAD(&folio->_deferred_list);
> +		if (!folio_test_hugetlb(folio))
> +			INIT_LIST_HEAD(&folio->_deferred_list);

Ok, it took me a bit to figure this out.

So we basically init _deferred_list when we know that folio_put will not
end up calling free_huge_folio, because a previous call to
remove_hugetlb_folio has already cleared the hugetlb flag.

Maybe Matthew thought that any folio ending up here would not end up in
free_huge_folio (which is the one fiddling with the subpool).

I mean, the fix looks good because if the hugetlb flag is cleared,
destroy_large_folio will go straight to free_the_page, but the whole
thing is a bit subtle.

And if we decide to go with this, I think we are going to need a comment
in there explaining what is going on, like "only init _deferred_list if
free_huge_folio cannot be called".
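
Purely as a sketch of where such a comment could live and roughly what it
could say (wording is only an example, not a proposed hunk):

	} else {
		/*
		 * Only init _deferred_list when free_huge_folio() can no
		 * longer be reached via folio_put(): the hugetlb flag has
		 * already been cleared, so the unioned _hugetlb_subpool
		 * field is no longer in use.
		 */
		if (!folio_test_hugetlb(folio))
			INIT_LIST_HEAD(&folio->_deferred_list);
		folio_put(folio);
	}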


-- 
Oscar Salvador
SUSE Labs


* Re: [PATCH 1/2] mm/hugetlb: fix DEBUG_LOCKS_WARN_ON(1) when dissolve_free_hugetlb_folio()
  2024-04-18  4:05   ` Oscar Salvador
@ 2024-04-18  8:00     ` Miaohe Lin
  2024-04-18 12:41       ` Oscar Salvador
  0 siblings, 1 reply; 10+ messages in thread
From: Miaohe Lin @ 2024-04-18  8:00 UTC (permalink / raw)
  To: Oscar Salvador
  Cc: akpm, muchun.song, david, vbabka, willy, linux-mm, linux-kernel

On 2024/4/18 12:05, Oscar Salvador wrote:
> On Thu, Apr 18, 2024 at 10:19:59AM +0800, Miaohe Lin wrote:
>> diff --git a/mm/hugetlb.c b/mm/hugetlb.c
>> index 26ab9dfc7d63..1da9a14a5513 100644
>> --- a/mm/hugetlb.c
>> +++ b/mm/hugetlb.c
>> @@ -1788,7 +1788,8 @@ static void __update_and_free_hugetlb_folio(struct hstate *h,
>>  		destroy_compound_gigantic_folio(folio, huge_page_order(h));
>>  		free_gigantic_folio(folio, huge_page_order(h));
>>  	} else {
>> -		INIT_LIST_HEAD(&folio->_deferred_list);
>> +		if (!folio_test_hugetlb(folio))
>> +			INIT_LIST_HEAD(&folio->_deferred_list);
> 
> Ok, it took me a bit to figure this out.
> 
> So we basically init __deferred_list when we know that
> folio_put will not end up calling free_huge_folio
> because a previous call to remove_hugetlb_folio has already cleared the
> bit.
> 
> Maybe Matthew thought that any folio ending here would not end up in
> free_huge_folio (which is the one fiddling subpool).
> 
> I mean, fix looks good because if hugetlb flag is cleared,
> destroy_large_folio will go straight to free_the_page, but the
> whole thing is a bit subtle.

AFAICS, this is the most straightforward way to fix the issue. Do you have any suggestions
on how to fix this in a more graceful way?

> 
> And if we decide to go with this, I think we are going to need a comment
> in there explaining what is going on like "only init _deferred_list if
> free_huge_folio cannot be call".

Yes, this comment will help.
Thanks.
.

> 
> 



* Re: [PATCH 1/2] mm/hugetlb: fix DEBUG_LOCKS_WARN_ON(1) when dissolve_free_hugetlb_folio()
  2024-04-18  8:00     ` Miaohe Lin
@ 2024-04-18 12:41       ` Oscar Salvador
  2024-04-19  2:00         ` Miaohe Lin
  0 siblings, 1 reply; 10+ messages in thread
From: Oscar Salvador @ 2024-04-18 12:41 UTC (permalink / raw)
  To: Miaohe Lin
  Cc: akpm, muchun.song, david, vbabka, willy, linux-mm, linux-kernel

On Thu, Apr 18, 2024 at 04:00:42PM +0800, Miaohe Lin wrote:
> On 2024/4/18 12:05, Oscar Salvador wrote:
> > On Thu, Apr 18, 2024 at 10:19:59AM +0800, Miaohe Lin wrote:
> >> diff --git a/mm/hugetlb.c b/mm/hugetlb.c
> >> index 26ab9dfc7d63..1da9a14a5513 100644
> >> --- a/mm/hugetlb.c
> >> +++ b/mm/hugetlb.c
> >> @@ -1788,7 +1788,8 @@ static void __update_and_free_hugetlb_folio(struct hstate *h,
> >>  		destroy_compound_gigantic_folio(folio, huge_page_order(h));
> >>  		free_gigantic_folio(folio, huge_page_order(h));
> >>  	} else {
> >> -		INIT_LIST_HEAD(&folio->_deferred_list);
> >> +		if (!folio_test_hugetlb(folio))
> >> +			INIT_LIST_HEAD(&folio->_deferred_list);
> > 
> > Ok, it took me a bit to figure this out.
> > 
> > So we basically init __deferred_list when we know that
> > folio_put will not end up calling free_huge_folio
> > because a previous call to remove_hugetlb_folio has already cleared the
> > bit.
> > 
> > Maybe Matthew thought that any folio ending here would not end up in
> > free_huge_folio (which is the one fiddling subpool).
> > 
> > I mean, fix looks good because if hugetlb flag is cleared,
> > destroy_large_folio will go straight to free_the_page, but the
> > whole thing is a bit subtle.
> 
> AFAICS, this is the most straightforward way to fix the issue. Do you have any suggestions
> on how to fix this in a more graceful way?

Not off the top of my head.
Anyway, I have been thinking for a while that this code needs some love,
so I will check how this can be untangled.


-- 
Oscar Salvador
SUSE Labs


* Re: [PATCH 2/2] mm/hugetlb: fix unable to handle page fault for address dead000000000108
  2024-04-18  2:20 ` [PATCH 2/2] mm/hugetlb: fix unable to handle page fault for address dead000000000108 Miaohe Lin
@ 2024-04-18 20:38   ` Andrew Morton
  2024-04-19  2:07     ` Miaohe Lin
  2024-04-19  9:07     ` Miaohe Lin
  0 siblings, 2 replies; 10+ messages in thread
From: Andrew Morton @ 2024-04-18 20:38 UTC (permalink / raw)
  To: Miaohe Lin; +Cc: muchun.song, david, vbabka, willy, linux-mm, linux-kernel

On Thu, 18 Apr 2024 10:20:00 +0800 Miaohe Lin <linmiaohe@huawei.com> wrote:

> Below panic occurs when I did memory failure test:
> 
> BUG: unable to handle page fault for address: dead000000000108
> 
> ...
>
> The root cause is that list_del() is used to remove folio from list when
> dissolve_free_hugetlb_folio(). But list_move() might be used to reenqueue
> hugetlb folio when free_huge_folio() leading to above panic. Fix this
> issue by using list_del_init() to remove folio.
> 
> ...
>
> --- a/mm/hugetlb.c
> +++ b/mm/hugetlb.c
> @@ -1642,7 +1642,7 @@ static void __remove_hugetlb_folio(struct hstate *h, struct folio *folio,
>  	if (hstate_is_gigantic(h) && !gigantic_page_runtime_supported())
>  		return;
>  
> -	list_del(&folio->lru);
> +	list_del_init(&folio->lru);
>  
>  	if (folio_test_hugetlb_freed(folio)) {
>  		h->free_huge_pages--;

We should cc:stable and find a Fixes:.  This appears to predate
6eb4e88a6d27022ea8aff424d47a0a5dfc9fcb34, after which I got lost.


* Re: [PATCH 1/2] mm/hugetlb: fix DEBUG_LOCKS_WARN_ON(1) when dissolve_free_hugetlb_folio()
  2024-04-18 12:41       ` Oscar Salvador
@ 2024-04-19  2:00         ` Miaohe Lin
  0 siblings, 0 replies; 10+ messages in thread
From: Miaohe Lin @ 2024-04-19  2:00 UTC (permalink / raw)
  To: Oscar Salvador
  Cc: akpm, muchun.song, david, vbabka, willy, linux-mm, linux-kernel

On 2024/4/18 20:41, Oscar Salvador wrote:
> On Thu, Apr 18, 2024 at 04:00:42PM +0800, Miaohe Lin wrote:
>> On 2024/4/18 12:05, Oscar Salvador wrote:
>>> On Thu, Apr 18, 2024 at 10:19:59AM +0800, Miaohe Lin wrote:
>>>> diff --git a/mm/hugetlb.c b/mm/hugetlb.c
>>>> index 26ab9dfc7d63..1da9a14a5513 100644
>>>> --- a/mm/hugetlb.c
>>>> +++ b/mm/hugetlb.c
>>>> @@ -1788,7 +1788,8 @@ static void __update_and_free_hugetlb_folio(struct hstate *h,
>>>>  		destroy_compound_gigantic_folio(folio, huge_page_order(h));
>>>>  		free_gigantic_folio(folio, huge_page_order(h));
>>>>  	} else {
>>>> -		INIT_LIST_HEAD(&folio->_deferred_list);
>>>> +		if (!folio_test_hugetlb(folio))
>>>> +			INIT_LIST_HEAD(&folio->_deferred_list);
>>>
>>> Ok, it took me a bit to figure this out.
>>>
>>> So we basically init __deferred_list when we know that
>>> folio_put will not end up calling free_huge_folio
>>> because a previous call to remove_hugetlb_folio has already cleared the
>>> bit.
>>>
>>> Maybe Matthew thought that any folio ending here would not end up in
>>> free_huge_folio (which is the one fiddling subpool).
>>>
>>> I mean, fix looks good because if hugetlb flag is cleared,
>>> destroy_large_folio will go straight to free_the_page, but the
>>> whole thing is a bit subtle.
>>
>> AFAICS, this is the most straightforward way to fix the issue. Do you have any suggestions
>> on how to fix this in a more graceful way?
> 
> Not from the top of my head.
> Anyway, I have been thinking for a while that this code needs some love,
> so I will check how this can be untangled.

That would be really nice. Thanks Oscar.
.

> 
> 



* Re: [PATCH 2/2] mm/hugetlb: fix unable to handle page fault for address dead000000000108
  2024-04-18 20:38   ` Andrew Morton
@ 2024-04-19  2:07     ` Miaohe Lin
  2024-04-19  9:07     ` Miaohe Lin
  1 sibling, 0 replies; 10+ messages in thread
From: Miaohe Lin @ 2024-04-19  2:07 UTC (permalink / raw)
  To: Andrew Morton; +Cc: muchun.song, david, vbabka, willy, linux-mm, linux-kernel

On 2024/4/19 4:38, Andrew Morton wrote:
> On Thu, 18 Apr 2024 10:20:00 +0800 Miaohe Lin <linmiaohe@huawei.com> wrote:
> 
>> Below panic occurs when I did memory failure test:
>>
>> BUG: unable to handle page fault for address: dead000000000108
>>
>> ...
>>
>> The root cause is that list_del() is used to remove folio from list when
>> dissolve_free_hugetlb_folio(). But list_move() might be used to reenqueue
>> hugetlb folio when free_huge_folio() leading to above panic. Fix this
>> issue by using list_del_init() to remove folio.
>>
>> ...
>>
>> --- a/mm/hugetlb.c
>> +++ b/mm/hugetlb.c
>> @@ -1642,7 +1642,7 @@ static void __remove_hugetlb_folio(struct hstate *h, struct folio *folio,
>>  	if (hstate_is_gigantic(h) && !gigantic_page_runtime_supported())
>>  		return;
>>  
>> -	list_del(&folio->lru);
>> +	list_del_init(&folio->lru);
>>  
>>  	if (folio_test_hugetlb_freed(folio)) {
>>  		h->free_huge_pages--;
> 
> We should cc:stable and find a Fixes:.  This appears to predate
> 6eb4e88a6d27022ea8aff424d47a0a5dfc9fcb34, after which I got lost.

It's weird that I didn't observe this issue before the last merge window,
while the corresponding code logic seems unchanged. I will try again to
find a Fixes tag.
Thanks.
.

> .
> 



* Re: [PATCH 2/2] mm/hugetlb: fix unable to handle page fault for address dead000000000108
  2024-04-18 20:38   ` Andrew Morton
  2024-04-19  2:07     ` Miaohe Lin
@ 2024-04-19  9:07     ` Miaohe Lin
  1 sibling, 0 replies; 10+ messages in thread
From: Miaohe Lin @ 2024-04-19  9:07 UTC (permalink / raw)
  To: Andrew Morton; +Cc: muchun.song, david, vbabka, willy, linux-mm, linux-kernel

On 2024/4/19 4:38, Andrew Morton wrote:
> On Thu, 18 Apr 2024 10:20:00 +0800 Miaohe Lin <linmiaohe@huawei.com> wrote:
> 
>> Below panic occurs when I did memory failure test:
>>
>> BUG: unable to handle page fault for address: dead000000000108
>>
>> ...
>>
>> The root cause is that list_del() is used to remove folio from list when
>> dissolve_free_hugetlb_folio(). But list_move() might be used to reenqueue
>> hugetlb folio when free_huge_folio() leading to above panic. Fix this
>> issue by using list_del_init() to remove folio.
>>
>> ...
>>
>> --- a/mm/hugetlb.c
>> +++ b/mm/hugetlb.c
>> @@ -1642,7 +1642,7 @@ static void __remove_hugetlb_folio(struct hstate *h, struct folio *folio,
>>  	if (hstate_is_gigantic(h) && !gigantic_page_runtime_supported())
>>  		return;
>>  
>> -	list_del(&folio->lru);
>> +	list_del_init(&folio->lru);
>>  
>>  	if (folio_test_hugetlb_freed(folio)) {
>>  		h->free_huge_pages--;
> 
> We should cc:stable and find a Fixes:.  This appears to predate
> 6eb4e88a6d27022ea8aff424d47a0a5dfc9fcb34, after which I got lost.

I think this series can be dropped because it didn't fix the root cause.
Please see my v2 patch for details, so this Fixes tag isn't needed anymore.
Thanks.
.

> .
> 


