linux-mm.kvack.org archive mirror
 help / color / mirror / Atom feed
* [PATCH v2] mm/hwpoison: use pr_err() instead of dump_page() in get_any_page()
@ 2022-04-27  5:32 Naoya Horiguchi
  2022-04-27 19:15 ` Andrew Morton
                   ` (4 more replies)
  0 siblings, 5 replies; 10+ messages in thread
From: Naoya Horiguchi @ 2022-04-27  5:32 UTC (permalink / raw)
  To: linux-mm
  Cc: Matthew Wilcox, Miaohe Lin, Christoph Hellwig, Andrew Morton,
	John Hubbard, Jason Gunthorpe, William Kucharski,
	Naoya Horiguchi, linux-kernel

From: Naoya Horiguchi <naoya.horiguchi@nec.com>

The following VM_BUG_ON_FOLIO() is triggered when memory error event
happens on the (thp/folio) pages which are about to be freed:

  [ 1160.232771] page:00000000b36a8a0f refcount:1 mapcount:0 mapping:0000000000000000 index:0x1 pfn:0x16a000
  [ 1160.236916] page:00000000b36a8a0f refcount:0 mapcount:0 mapping:0000000000000000 index:0x1 pfn:0x16a000
  [ 1160.240684] flags: 0x57ffffc0800000(hwpoison|node=1|zone=2|lastcpupid=0x1fffff)
  [ 1160.243458] raw: 0057ffffc0800000 dead000000000100 dead000000000122 0000000000000000
  [ 1160.246268] raw: 0000000000000001 0000000000000000 00000000ffffffff 0000000000000000
  [ 1160.249197] page dumped because: VM_BUG_ON_FOLIO(!folio_test_large(folio))
  [ 1160.251815] ------------[ cut here ]------------
  [ 1160.253438] kernel BUG at include/linux/mm.h:788!
  [ 1160.256162] invalid opcode: 0000 [#1] PREEMPT SMP PTI
  [ 1160.258172] CPU: 2 PID: 115368 Comm: mceinj.sh Tainted: G            E     5.18.0-rc1-v5.18-rc1-220404-2353-005-g83111+ #3
  [ 1160.262049] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.15.0-1.fc35 04/01/2014
  [ 1160.265103] RIP: 0010:dump_page.cold+0x27e/0x2bd
  [ 1160.266757] Code: fe ff ff 48 c7 c6 81 f1 5a 98 e9 4c fe ff ff 48 c7 c6 a1 95 59 98 e9 40 fe ff ff 48 c7 c6 50 bf 5a 98 48 89 ef e8 9d 04 6d ff <0f> 0b 41 f7 c4 ff 0f 00 00 0f 85 9f fd ff ff 49 8b 04 24 a9 00 00
  [ 1160.273180] RSP: 0018:ffffaa2c4d59fd18 EFLAGS: 00010292
  [ 1160.274969] RAX: 000000000000003e RBX: 0000000000000001 RCX: 0000000000000000
  [ 1160.277263] RDX: 0000000000000001 RSI: ffffffff985995a1 RDI: 00000000ffffffff
  [ 1160.279571] RBP: ffffdc9c45a80000 R08: 0000000000000000 R09: 00000000ffffdfff
  [ 1160.281794] R10: ffffaa2c4d59fb08 R11: ffffffff98940d08 R12: ffffdc9c45a80000
  [ 1160.283920] R13: ffffffff985b6f94 R14: 0000000000000000 R15: ffffdc9c45a80000
  [ 1160.286641] FS:  00007eff54ce1740(0000) GS:ffff99c67bd00000(0000) knlGS:0000000000000000
  [ 1160.289498] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
  [ 1160.291106] CR2: 00005628381a5f68 CR3: 0000000104712003 CR4: 0000000000170ee0
  [ 1160.293031] Call Trace:
  [ 1160.293724]  <TASK>
  [ 1160.294334]  get_hwpoison_page+0x47d/0x570
  [ 1160.295474]  memory_failure+0x106/0xaa0
  [ 1160.296474]  ? security_capable+0x36/0x50
  [ 1160.297524]  hard_offline_page_store+0x43/0x80
  [ 1160.298684]  kernfs_fop_write_iter+0x11c/0x1b0
  [ 1160.299829]  new_sync_write+0xf9/0x160
  [ 1160.300810]  vfs_write+0x209/0x290
  [ 1160.301835]  ksys_write+0x4f/0xc0
  [ 1160.302718]  do_syscall_64+0x3b/0x90
  [ 1160.303664]  entry_SYSCALL_64_after_hwframe+0x44/0xae
  [ 1160.304981] RIP: 0033:0x7eff54b018b7

As shown in the RIP address, this VM_BUG_ON in folio_entire_mapcount() is
called from dump_page("hwpoison: unhandlable page") in get_any_page().
The below explains the mechanism of the race:

  CPU 0                                       CPU 1

    memory_failure
      get_hwpoison_page
        get_any_page
          dump_page
            compound = PageCompound
                                                free_pages_prepare
                                                  page->flags &= ~PAGE_FLAGS_CHECK_AT_PREP
            folio_entire_mapcount
              VM_BUG_ON_FOLIO(!folio_test_large(folio))

So replace dump_page() with safer one, pr_err().

Fixes: 74e8ee4708a8 ("mm: Turn head_compound_mapcount() into folio_entire_mapcount()")
Signed-off-by: Naoya Horiguchi <naoya.horiguchi@nec.com>
---
ChangeLog v1 -> v2:
- v1: https://lore.kernel.org/linux-mm/20220414235950.840409-1-naoya.horiguchi@linux.dev/T/#u
- update caller side instead of changing dump_page().
---
 mm/memory-failure.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/mm/memory-failure.c b/mm/memory-failure.c
index 35e11d6bea4a..0e1453514a2b 100644
--- a/mm/memory-failure.c
+++ b/mm/memory-failure.c
@@ -1270,7 +1270,7 @@ static int get_any_page(struct page *p, unsigned long flags)
 	}
 out:
 	if (ret == -EIO)
-		dump_page(p, "hwpoison: unhandlable page");
+		pr_err("Memory failure: %#lx: unhandlable page.\n", page_to_pfn(p));
 
 	return ret;
 }
-- 
2.25.1



^ permalink raw reply related	[flat|nested] 10+ messages in thread

* Re: [PATCH v2] mm/hwpoison: use pr_err() instead of dump_page() in get_any_page()
  2022-04-27  5:32 [PATCH v2] mm/hwpoison: use pr_err() instead of dump_page() in get_any_page() Naoya Horiguchi
@ 2022-04-27 19:15 ` Andrew Morton
  2022-04-27 22:37   ` HORIGUCHI NAOYA(堀口 直也)
  2022-04-27 23:48 ` John Hubbard
                   ` (3 subsequent siblings)
  4 siblings, 1 reply; 10+ messages in thread
From: Andrew Morton @ 2022-04-27 19:15 UTC (permalink / raw)
  To: Naoya Horiguchi
  Cc: linux-mm, Matthew Wilcox, Miaohe Lin, Christoph Hellwig,
	John Hubbard, Jason Gunthorpe, William Kucharski,
	Naoya Horiguchi, linux-kernel

On Wed, 27 Apr 2022 14:32:20 +0900 Naoya Horiguchi <naoya.horiguchi@linux.dev> wrote:

> From: Naoya Horiguchi <naoya.horiguchi@nec.com>
> 
> The following VM_BUG_ON_FOLIO() is triggered when memory error event
> happens on the (thp/folio) pages which are about to be freed:
> 
> ...
>
> Fixes: 74e8ee4708a8 ("mm: Turn head_compound_mapcount() into folio_entire_mapcount()")

I added cc:stable to this, OK?




^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH v2] mm/hwpoison: use pr_err() instead of dump_page() in get_any_page()
  2022-04-27 19:15 ` Andrew Morton
@ 2022-04-27 22:37   ` HORIGUCHI NAOYA(堀口 直也)
  0 siblings, 0 replies; 10+ messages in thread
From: HORIGUCHI NAOYA(堀口 直也) @ 2022-04-27 22:37 UTC (permalink / raw)
  To: Andrew Morton
  Cc: Naoya Horiguchi, linux-mm, Matthew Wilcox, Miaohe Lin,
	Christoph Hellwig, John Hubbard, Jason Gunthorpe,
	William Kucharski, linux-kernel

On Wed, Apr 27, 2022 at 12:15:17PM -0700, Andrew Morton wrote:
> On Wed, 27 Apr 2022 14:32:20 +0900 Naoya Horiguchi <naoya.horiguchi@linux.dev> wrote:
> 
> > From: Naoya Horiguchi <naoya.horiguchi@nec.com>
> > 
> > The following VM_BUG_ON_FOLIO() is triggered when memory error event
> > happens on the (thp/folio) pages which are about to be freed:
> > 
> > ...
> >
> > Fixes: 74e8ee4708a8 ("mm: Turn head_compound_mapcount() into folio_entire_mapcount()")
> 
> I added cc:stable to this, OK?

Yes, that's fine.  74e8ee4708a8 is new, and only v5.17.z is the target tree.
Thank you.

- Naoya Horiguchi

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH v2] mm/hwpoison: use pr_err() instead of dump_page() in get_any_page()
  2022-04-27  5:32 [PATCH v2] mm/hwpoison: use pr_err() instead of dump_page() in get_any_page() Naoya Horiguchi
  2022-04-27 19:15 ` Andrew Morton
@ 2022-04-27 23:48 ` John Hubbard
  2022-04-28  1:46 ` Miaohe Lin
                   ` (2 subsequent siblings)
  4 siblings, 0 replies; 10+ messages in thread
From: John Hubbard @ 2022-04-27 23:48 UTC (permalink / raw)
  To: Naoya Horiguchi, linux-mm
  Cc: Matthew Wilcox, Miaohe Lin, Christoph Hellwig, Andrew Morton,
	Jason Gunthorpe, William Kucharski, Naoya Horiguchi,
	linux-kernel

On 4/26/22 22:32, Naoya Horiguchi wrote:
> From: Naoya Horiguchi <naoya.horiguchi@nec.com>
> 
> The following VM_BUG_ON_FOLIO() is triggered when memory error event
> happens on the (thp/folio) pages which are about to be freed:
> 
>    [ 1160.232771] page:00000000b36a8a0f refcount:1 mapcount:0 mapping:0000000000000000 index:0x1 pfn:0x16a000
>    [ 1160.236916] page:00000000b36a8a0f refcount:0 mapcount:0 mapping:0000000000000000 index:0x1 pfn:0x16a000
>    [ 1160.240684] flags: 0x57ffffc0800000(hwpoison|node=1|zone=2|lastcpupid=0x1fffff)
>    [ 1160.243458] raw: 0057ffffc0800000 dead000000000100 dead000000000122 0000000000000000
>    [ 1160.246268] raw: 0000000000000001 0000000000000000 00000000ffffffff 0000000000000000
>    [ 1160.249197] page dumped because: VM_BUG_ON_FOLIO(!folio_test_large(folio))
>    [ 1160.251815] ------------[ cut here ]------------
>    [ 1160.253438] kernel BUG at include/linux/mm.h:788!
>    [ 1160.256162] invalid opcode: 0000 [#1] PREEMPT SMP PTI
>    [ 1160.258172] CPU: 2 PID: 115368 Comm: mceinj.sh Tainted: G            E     5.18.0-rc1-v5.18-rc1-220404-2353-005-g83111+ #3
>    [ 1160.262049] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.15.0-1.fc35 04/01/2014
>    [ 1160.265103] RIP: 0010:dump_page.cold+0x27e/0x2bd
>    [ 1160.266757] Code: fe ff ff 48 c7 c6 81 f1 5a 98 e9 4c fe ff ff 48 c7 c6 a1 95 59 98 e9 40 fe ff ff 48 c7 c6 50 bf 5a 98 48 89 ef e8 9d 04 6d ff <0f> 0b 41 f7 c4 ff 0f 00 00 0f 85 9f fd ff ff 49 8b 04 24 a9 00 00
>    [ 1160.273180] RSP: 0018:ffffaa2c4d59fd18 EFLAGS: 00010292
>    [ 1160.274969] RAX: 000000000000003e RBX: 0000000000000001 RCX: 0000000000000000
>    [ 1160.277263] RDX: 0000000000000001 RSI: ffffffff985995a1 RDI: 00000000ffffffff
>    [ 1160.279571] RBP: ffffdc9c45a80000 R08: 0000000000000000 R09: 00000000ffffdfff
>    [ 1160.281794] R10: ffffaa2c4d59fb08 R11: ffffffff98940d08 R12: ffffdc9c45a80000
>    [ 1160.283920] R13: ffffffff985b6f94 R14: 0000000000000000 R15: ffffdc9c45a80000
>    [ 1160.286641] FS:  00007eff54ce1740(0000) GS:ffff99c67bd00000(0000) knlGS:0000000000000000
>    [ 1160.289498] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
>    [ 1160.291106] CR2: 00005628381a5f68 CR3: 0000000104712003 CR4: 0000000000170ee0
>    [ 1160.293031] Call Trace:
>    [ 1160.293724]  <TASK>
>    [ 1160.294334]  get_hwpoison_page+0x47d/0x570
>    [ 1160.295474]  memory_failure+0x106/0xaa0
>    [ 1160.296474]  ? security_capable+0x36/0x50
>    [ 1160.297524]  hard_offline_page_store+0x43/0x80
>    [ 1160.298684]  kernfs_fop_write_iter+0x11c/0x1b0
>    [ 1160.299829]  new_sync_write+0xf9/0x160
>    [ 1160.300810]  vfs_write+0x209/0x290
>    [ 1160.301835]  ksys_write+0x4f/0xc0
>    [ 1160.302718]  do_syscall_64+0x3b/0x90
>    [ 1160.303664]  entry_SYSCALL_64_after_hwframe+0x44/0xae
>    [ 1160.304981] RIP: 0033:0x7eff54b018b7
> 
> As shown in the RIP address, this VM_BUG_ON in folio_entire_mapcount() is
> called from dump_page("hwpoison: unhandlable page") in get_any_page().
> The below explains the mechanism of the race:
> 
>    CPU 0                                       CPU 1
> 
>      memory_failure
>        get_hwpoison_page
>          get_any_page
>            dump_page
>              compound = PageCompound
>                                                  free_pages_prepare
>                                                    page->flags &= ~PAGE_FLAGS_CHECK_AT_PREP
>              folio_entire_mapcount
>                VM_BUG_ON_FOLIO(!folio_test_large(folio))
> 
> So replace dump_page() with safer one, pr_err().
> 
> Fixes: 74e8ee4708a8 ("mm: Turn head_compound_mapcount() into folio_entire_mapcount()")
> Signed-off-by: Naoya Horiguchi <naoya.horiguchi@nec.com>
> ---
> ChangeLog v1 -> v2:
> - v1: https://lore.kernel.org/linux-mm/20220414235950.840409-1-naoya.horiguchi@linux.dev/T/#u
> - update caller side instead of changing dump_page().
> ---
>   mm/memory-failure.c | 2 +-
>   1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/mm/memory-failure.c b/mm/memory-failure.c
> index 35e11d6bea4a..0e1453514a2b 100644
> --- a/mm/memory-failure.c
> +++ b/mm/memory-failure.c
> @@ -1270,7 +1270,7 @@ static int get_any_page(struct page *p, unsigned long flags)
>   	}
>   out:
>   	if (ret == -EIO)
> -		dump_page(p, "hwpoison: unhandlable page");
> +		pr_err("Memory failure: %#lx: unhandlable page.\n", page_to_pfn(p));
>   
>   	return ret;
>   }

Reviewed-by: John Hubbard <jhubbard@nvidia.com>

thanks,
-- 
John Hubbard
NVIDIA


^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH v2] mm/hwpoison: use pr_err() instead of dump_page() in get_any_page()
  2022-04-27  5:32 [PATCH v2] mm/hwpoison: use pr_err() instead of dump_page() in get_any_page() Naoya Horiguchi
  2022-04-27 19:15 ` Andrew Morton
  2022-04-27 23:48 ` John Hubbard
@ 2022-04-28  1:46 ` Miaohe Lin
  2022-04-28 17:25 ` Yang Shi
  2022-05-10  2:28 ` Matthew Wilcox
  4 siblings, 0 replies; 10+ messages in thread
From: Miaohe Lin @ 2022-04-28  1:46 UTC (permalink / raw)
  To: Naoya Horiguchi
  Cc: Matthew Wilcox, Christoph Hellwig, Andrew Morton, John Hubbard,
	Jason Gunthorpe, William Kucharski, Naoya Horiguchi,
	linux-kernel, Linux-MM

On 2022/4/27 13:32, Naoya Horiguchi wrote:
> From: Naoya Horiguchi <naoya.horiguchi@nec.com>
> 
> The following VM_BUG_ON_FOLIO() is triggered when memory error event
> happens on the (thp/folio) pages which are about to be freed:
> 
>   [ 1160.232771] page:00000000b36a8a0f refcount:1 mapcount:0 mapping:0000000000000000 index:0x1 pfn:0x16a000
>   [ 1160.236916] page:00000000b36a8a0f refcount:0 mapcount:0 mapping:0000000000000000 index:0x1 pfn:0x16a000
>   [ 1160.240684] flags: 0x57ffffc0800000(hwpoison|node=1|zone=2|lastcpupid=0x1fffff)
>   [ 1160.243458] raw: 0057ffffc0800000 dead000000000100 dead000000000122 0000000000000000
>   [ 1160.246268] raw: 0000000000000001 0000000000000000 00000000ffffffff 0000000000000000
>   [ 1160.249197] page dumped because: VM_BUG_ON_FOLIO(!folio_test_large(folio))
>   [ 1160.251815] ------------[ cut here ]------------
>   [ 1160.253438] kernel BUG at include/linux/mm.h:788!
>   [ 1160.256162] invalid opcode: 0000 [#1] PREEMPT SMP PTI
>   [ 1160.258172] CPU: 2 PID: 115368 Comm: mceinj.sh Tainted: G            E     5.18.0-rc1-v5.18-rc1-220404-2353-005-g83111+ #3
>   [ 1160.262049] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.15.0-1.fc35 04/01/2014
>   [ 1160.265103] RIP: 0010:dump_page.cold+0x27e/0x2bd
>   [ 1160.266757] Code: fe ff ff 48 c7 c6 81 f1 5a 98 e9 4c fe ff ff 48 c7 c6 a1 95 59 98 e9 40 fe ff ff 48 c7 c6 50 bf 5a 98 48 89 ef e8 9d 04 6d ff <0f> 0b 41 f7 c4 ff 0f 00 00 0f 85 9f fd ff ff 49 8b 04 24 a9 00 00
>   [ 1160.273180] RSP: 0018:ffffaa2c4d59fd18 EFLAGS: 00010292
>   [ 1160.274969] RAX: 000000000000003e RBX: 0000000000000001 RCX: 0000000000000000
>   [ 1160.277263] RDX: 0000000000000001 RSI: ffffffff985995a1 RDI: 00000000ffffffff
>   [ 1160.279571] RBP: ffffdc9c45a80000 R08: 0000000000000000 R09: 00000000ffffdfff
>   [ 1160.281794] R10: ffffaa2c4d59fb08 R11: ffffffff98940d08 R12: ffffdc9c45a80000
>   [ 1160.283920] R13: ffffffff985b6f94 R14: 0000000000000000 R15: ffffdc9c45a80000
>   [ 1160.286641] FS:  00007eff54ce1740(0000) GS:ffff99c67bd00000(0000) knlGS:0000000000000000
>   [ 1160.289498] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
>   [ 1160.291106] CR2: 00005628381a5f68 CR3: 0000000104712003 CR4: 0000000000170ee0
>   [ 1160.293031] Call Trace:
>   [ 1160.293724]  <TASK>
>   [ 1160.294334]  get_hwpoison_page+0x47d/0x570
>   [ 1160.295474]  memory_failure+0x106/0xaa0
>   [ 1160.296474]  ? security_capable+0x36/0x50
>   [ 1160.297524]  hard_offline_page_store+0x43/0x80
>   [ 1160.298684]  kernfs_fop_write_iter+0x11c/0x1b0
>   [ 1160.299829]  new_sync_write+0xf9/0x160
>   [ 1160.300810]  vfs_write+0x209/0x290
>   [ 1160.301835]  ksys_write+0x4f/0xc0
>   [ 1160.302718]  do_syscall_64+0x3b/0x90
>   [ 1160.303664]  entry_SYSCALL_64_after_hwframe+0x44/0xae
>   [ 1160.304981] RIP: 0033:0x7eff54b018b7
> 
> As shown in the RIP address, this VM_BUG_ON in folio_entire_mapcount() is
> called from dump_page("hwpoison: unhandlable page") in get_any_page().
> The below explains the mechanism of the race:
> 
>   CPU 0                                       CPU 1
> 
>     memory_failure
>       get_hwpoison_page
>         get_any_page
>           dump_page
>             compound = PageCompound
>                                                 free_pages_prepare
>                                                   page->flags &= ~PAGE_FLAGS_CHECK_AT_PREP
>             folio_entire_mapcount
>               VM_BUG_ON_FOLIO(!folio_test_large(folio))
> 
> So replace dump_page() with safer one, pr_err().
> 
> Fixes: 74e8ee4708a8 ("mm: Turn head_compound_mapcount() into folio_entire_mapcount()")
> Signed-off-by: Naoya Horiguchi <naoya.horiguchi@nec.com>

LGTM. Thanks!

Reviewed-by: Miaohe Lin <linmiaohe@huawei.com>

> ---
> ChangeLog v1 -> v2:
> - v1: https://lore.kernel.org/linux-mm/20220414235950.840409-1-naoya.horiguchi@linux.dev/T/#u
> - update caller side instead of changing dump_page().
> ---
>  mm/memory-failure.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/mm/memory-failure.c b/mm/memory-failure.c
> index 35e11d6bea4a..0e1453514a2b 100644
> --- a/mm/memory-failure.c
> +++ b/mm/memory-failure.c
> @@ -1270,7 +1270,7 @@ static int get_any_page(struct page *p, unsigned long flags)
>  	}
>  out:
>  	if (ret == -EIO)
> -		dump_page(p, "hwpoison: unhandlable page");
> +		pr_err("Memory failure: %#lx: unhandlable page.\n", page_to_pfn(p));
>  
>  	return ret;
>  }
> 



^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH v2] mm/hwpoison: use pr_err() instead of dump_page() in get_any_page()
  2022-04-27  5:32 [PATCH v2] mm/hwpoison: use pr_err() instead of dump_page() in get_any_page() Naoya Horiguchi
                   ` (2 preceding siblings ...)
  2022-04-28  1:46 ` Miaohe Lin
@ 2022-04-28 17:25 ` Yang Shi
  2022-05-09  7:14   ` ##freemail## " HORIGUCHI NAOYA(堀口 直也)
  2022-05-10  2:28 ` Matthew Wilcox
  4 siblings, 1 reply; 10+ messages in thread
From: Yang Shi @ 2022-04-28 17:25 UTC (permalink / raw)
  To: Naoya Horiguchi
  Cc: Linux MM, Matthew Wilcox, Miaohe Lin, Christoph Hellwig,
	Andrew Morton, John Hubbard, Jason Gunthorpe, William Kucharski,
	Naoya Horiguchi, Linux Kernel Mailing List

On Tue, Apr 26, 2022 at 10:32 PM Naoya Horiguchi
<naoya.horiguchi@linux.dev> wrote:
>
> From: Naoya Horiguchi <naoya.horiguchi@nec.com>
>
> The following VM_BUG_ON_FOLIO() is triggered when memory error event
> happens on the (thp/folio) pages which are about to be freed:
>
>   [ 1160.232771] page:00000000b36a8a0f refcount:1 mapcount:0 mapping:0000000000000000 index:0x1 pfn:0x16a000
>   [ 1160.236916] page:00000000b36a8a0f refcount:0 mapcount:0 mapping:0000000000000000 index:0x1 pfn:0x16a000
>   [ 1160.240684] flags: 0x57ffffc0800000(hwpoison|node=1|zone=2|lastcpupid=0x1fffff)
>   [ 1160.243458] raw: 0057ffffc0800000 dead000000000100 dead000000000122 0000000000000000
>   [ 1160.246268] raw: 0000000000000001 0000000000000000 00000000ffffffff 0000000000000000
>   [ 1160.249197] page dumped because: VM_BUG_ON_FOLIO(!folio_test_large(folio))
>   [ 1160.251815] ------------[ cut here ]------------
>   [ 1160.253438] kernel BUG at include/linux/mm.h:788!
>   [ 1160.256162] invalid opcode: 0000 [#1] PREEMPT SMP PTI
>   [ 1160.258172] CPU: 2 PID: 115368 Comm: mceinj.sh Tainted: G            E     5.18.0-rc1-v5.18-rc1-220404-2353-005-g83111+ #3
>   [ 1160.262049] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.15.0-1.fc35 04/01/2014
>   [ 1160.265103] RIP: 0010:dump_page.cold+0x27e/0x2bd
>   [ 1160.266757] Code: fe ff ff 48 c7 c6 81 f1 5a 98 e9 4c fe ff ff 48 c7 c6 a1 95 59 98 e9 40 fe ff ff 48 c7 c6 50 bf 5a 98 48 89 ef e8 9d 04 6d ff <0f> 0b 41 f7 c4 ff 0f 00 00 0f 85 9f fd ff ff 49 8b 04 24 a9 00 00
>   [ 1160.273180] RSP: 0018:ffffaa2c4d59fd18 EFLAGS: 00010292
>   [ 1160.274969] RAX: 000000000000003e RBX: 0000000000000001 RCX: 0000000000000000
>   [ 1160.277263] RDX: 0000000000000001 RSI: ffffffff985995a1 RDI: 00000000ffffffff
>   [ 1160.279571] RBP: ffffdc9c45a80000 R08: 0000000000000000 R09: 00000000ffffdfff
>   [ 1160.281794] R10: ffffaa2c4d59fb08 R11: ffffffff98940d08 R12: ffffdc9c45a80000
>   [ 1160.283920] R13: ffffffff985b6f94 R14: 0000000000000000 R15: ffffdc9c45a80000
>   [ 1160.286641] FS:  00007eff54ce1740(0000) GS:ffff99c67bd00000(0000) knlGS:0000000000000000
>   [ 1160.289498] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
>   [ 1160.291106] CR2: 00005628381a5f68 CR3: 0000000104712003 CR4: 0000000000170ee0
>   [ 1160.293031] Call Trace:
>   [ 1160.293724]  <TASK>
>   [ 1160.294334]  get_hwpoison_page+0x47d/0x570
>   [ 1160.295474]  memory_failure+0x106/0xaa0
>   [ 1160.296474]  ? security_capable+0x36/0x50
>   [ 1160.297524]  hard_offline_page_store+0x43/0x80
>   [ 1160.298684]  kernfs_fop_write_iter+0x11c/0x1b0
>   [ 1160.299829]  new_sync_write+0xf9/0x160
>   [ 1160.300810]  vfs_write+0x209/0x290
>   [ 1160.301835]  ksys_write+0x4f/0xc0
>   [ 1160.302718]  do_syscall_64+0x3b/0x90
>   [ 1160.303664]  entry_SYSCALL_64_after_hwframe+0x44/0xae
>   [ 1160.304981] RIP: 0033:0x7eff54b018b7
>
> As shown in the RIP address, this VM_BUG_ON in folio_entire_mapcount() is
> called from dump_page("hwpoison: unhandlable page") in get_any_page().
> The below explains the mechanism of the race:
>
>   CPU 0                                       CPU 1
>
>     memory_failure
>       get_hwpoison_page
>         get_any_page
>           dump_page
>             compound = PageCompound
>                                                 free_pages_prepare
>                                                   page->flags &= ~PAGE_FLAGS_CHECK_AT_PREP
>             folio_entire_mapcount
>               VM_BUG_ON_FOLIO(!folio_test_large(folio))
>
> So replace dump_page() with safer one, pr_err().
>
> Fixes: 74e8ee4708a8 ("mm: Turn head_compound_mapcount() into folio_entire_mapcount()")
> Signed-off-by: Naoya Horiguchi <naoya.horiguchi@nec.com>
> ---
> ChangeLog v1 -> v2:
> - v1: https://lore.kernel.org/linux-mm/20220414235950.840409-1-naoya.horiguchi@linux.dev/T/#u
> - update caller side instead of changing dump_page().
> ---
>  mm/memory-failure.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/mm/memory-failure.c b/mm/memory-failure.c
> index 35e11d6bea4a..0e1453514a2b 100644
> --- a/mm/memory-failure.c
> +++ b/mm/memory-failure.c
> @@ -1270,7 +1270,7 @@ static int get_any_page(struct page *p, unsigned long flags)
>         }
>  out:
>         if (ret == -EIO)
> -               dump_page(p, "hwpoison: unhandlable page");
> +               pr_err("Memory failure: %#lx: unhandlable page.\n", page_to_pfn(p));

I think dump_page() is helpful to tell the users more information
about the unhandlable page, I'm ok with this fix for now, but should
we consider having a memory failure safe dump_page() in the future?

>
>         return ret;
>  }
> --
> 2.25.1
>
>


^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: ##freemail## Re: [PATCH v2] mm/hwpoison: use pr_err() instead of dump_page() in get_any_page()
  2022-04-28 17:25 ` Yang Shi
@ 2022-05-09  7:14   ` HORIGUCHI NAOYA(堀口 直也)
  2022-05-10  0:16     ` Yang Shi
  0 siblings, 1 reply; 10+ messages in thread
From: HORIGUCHI NAOYA(堀口 直也) @ 2022-05-09  7:14 UTC (permalink / raw)
  To: Yang Shi
  Cc: Naoya Horiguchi, Linux MM, Matthew Wilcox, Miaohe Lin,
	Christoph Hellwig, Andrew Morton, John Hubbard, Jason Gunthorpe,
	William Kucharski, Linux Kernel Mailing List

On Thu, Apr 28, 2022 at 10:25:33AM -0700, Yang Shi wrote:
> On Tue, Apr 26, 2022 at 10:32 PM Naoya Horiguchi
> <naoya.horiguchi@linux.dev> wrote:
> >
> > From: Naoya Horiguchi <naoya.horiguchi@nec.com>
> >
> > The following VM_BUG_ON_FOLIO() is triggered when memory error event
> > happens on the (thp/folio) pages which are about to be freed:
> >
> >   [ 1160.232771] page:00000000b36a8a0f refcount:1 mapcount:0 mapping:0000000000000000 index:0x1 pfn:0x16a000
> >   [ 1160.236916] page:00000000b36a8a0f refcount:0 mapcount:0 mapping:0000000000000000 index:0x1 pfn:0x16a000
> >   [ 1160.240684] flags: 0x57ffffc0800000(hwpoison|node=1|zone=2|lastcpupid=0x1fffff)
> >   [ 1160.243458] raw: 0057ffffc0800000 dead000000000100 dead000000000122 0000000000000000
> >   [ 1160.246268] raw: 0000000000000001 0000000000000000 00000000ffffffff 0000000000000000
> >   [ 1160.249197] page dumped because: VM_BUG_ON_FOLIO(!folio_test_large(folio))
> >   [ 1160.251815] ------------[ cut here ]------------
> >   [ 1160.253438] kernel BUG at include/linux/mm.h:788!
> >   [ 1160.256162] invalid opcode: 0000 [#1] PREEMPT SMP PTI
> >   [ 1160.258172] CPU: 2 PID: 115368 Comm: mceinj.sh Tainted: G            E     5.18.0-rc1-v5.18-rc1-220404-2353-005-g83111+ #3
> >   [ 1160.262049] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.15.0-1.fc35 04/01/2014
> >   [ 1160.265103] RIP: 0010:dump_page.cold+0x27e/0x2bd
> >   [ 1160.266757] Code: fe ff ff 48 c7 c6 81 f1 5a 98 e9 4c fe ff ff 48 c7 c6 a1 95 59 98 e9 40 fe ff ff 48 c7 c6 50 bf 5a 98 48 89 ef e8 9d 04 6d ff <0f> 0b 41 f7 c4 ff 0f 00 00 0f 85 9f fd ff ff 49 8b 04 24 a9 00 00
> >   [ 1160.273180] RSP: 0018:ffffaa2c4d59fd18 EFLAGS: 00010292
> >   [ 1160.274969] RAX: 000000000000003e RBX: 0000000000000001 RCX: 0000000000000000
> >   [ 1160.277263] RDX: 0000000000000001 RSI: ffffffff985995a1 RDI: 00000000ffffffff
> >   [ 1160.279571] RBP: ffffdc9c45a80000 R08: 0000000000000000 R09: 00000000ffffdfff
> >   [ 1160.281794] R10: ffffaa2c4d59fb08 R11: ffffffff98940d08 R12: ffffdc9c45a80000
> >   [ 1160.283920] R13: ffffffff985b6f94 R14: 0000000000000000 R15: ffffdc9c45a80000
> >   [ 1160.286641] FS:  00007eff54ce1740(0000) GS:ffff99c67bd00000(0000) knlGS:0000000000000000
> >   [ 1160.289498] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> >   [ 1160.291106] CR2: 00005628381a5f68 CR3: 0000000104712003 CR4: 0000000000170ee0
> >   [ 1160.293031] Call Trace:
> >   [ 1160.293724]  <TASK>
> >   [ 1160.294334]  get_hwpoison_page+0x47d/0x570
> >   [ 1160.295474]  memory_failure+0x106/0xaa0
> >   [ 1160.296474]  ? security_capable+0x36/0x50
> >   [ 1160.297524]  hard_offline_page_store+0x43/0x80
> >   [ 1160.298684]  kernfs_fop_write_iter+0x11c/0x1b0
> >   [ 1160.299829]  new_sync_write+0xf9/0x160
> >   [ 1160.300810]  vfs_write+0x209/0x290
> >   [ 1160.301835]  ksys_write+0x4f/0xc0
> >   [ 1160.302718]  do_syscall_64+0x3b/0x90
> >   [ 1160.303664]  entry_SYSCALL_64_after_hwframe+0x44/0xae
> >   [ 1160.304981] RIP: 0033:0x7eff54b018b7
> >
> > As shown in the RIP address, this VM_BUG_ON in folio_entire_mapcount() is
> > called from dump_page("hwpoison: unhandlable page") in get_any_page().
> > The below explains the mechanism of the race:
> >
> >   CPU 0                                       CPU 1
> >
> >     memory_failure
> >       get_hwpoison_page
> >         get_any_page
> >           dump_page
> >             compound = PageCompound
> >                                                 free_pages_prepare
> >                                                   page->flags &= ~PAGE_FLAGS_CHECK_AT_PREP
> >             folio_entire_mapcount
> >               VM_BUG_ON_FOLIO(!folio_test_large(folio))
> >
> > So replace dump_page() with safer one, pr_err().
> >
> > Fixes: 74e8ee4708a8 ("mm: Turn head_compound_mapcount() into folio_entire_mapcount()")
> > Signed-off-by: Naoya Horiguchi <naoya.horiguchi@nec.com>
> > ---
> > ChangeLog v1 -> v2:
> > - v1: https://lore.kernel.org/linux-mm/20220414235950.840409-1-naoya.horiguchi@linux.dev/T/#u
> > - update caller side instead of changing dump_page().
> > ---
> >  mm/memory-failure.c | 2 +-
> >  1 file changed, 1 insertion(+), 1 deletion(-)
> >
> > diff --git a/mm/memory-failure.c b/mm/memory-failure.c
> > index 35e11d6bea4a..0e1453514a2b 100644
> > --- a/mm/memory-failure.c
> > +++ b/mm/memory-failure.c
> > @@ -1270,7 +1270,7 @@ static int get_any_page(struct page *p, unsigned long flags)
> >         }
> >  out:
> >         if (ret == -EIO)
> > -               dump_page(p, "hwpoison: unhandlable page");
> > +               pr_err("Memory failure: %#lx: unhandlable page.\n", page_to_pfn(p));
> 
> I think dump_page() is helpful to tell the users more information
> about the unhandlable page, I'm ok with this fix for now, but should
> we consider having a memory failure safe dump_page() in the future?

Yes, maybe that would be helpful not only in this unhandlable case, so sounds
good to me.  But how do we handle folio's case?  And I'm not sure that the full
info in dump_page() is needed in a memory_failure-specific variant.

Thanks,
Naoya Horiguchi

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: ##freemail## Re: [PATCH v2] mm/hwpoison: use pr_err() instead of dump_page() in get_any_page()
  2022-05-09  7:14   ` ##freemail## " HORIGUCHI NAOYA(堀口 直也)
@ 2022-05-10  0:16     ` Yang Shi
  0 siblings, 0 replies; 10+ messages in thread
From: Yang Shi @ 2022-05-10  0:16 UTC (permalink / raw)
  To: HORIGUCHI NAOYA(堀口 直也)
  Cc: Naoya Horiguchi, Linux MM, Matthew Wilcox, Miaohe Lin,
	Christoph Hellwig, Andrew Morton, John Hubbard, Jason Gunthorpe,
	William Kucharski, Linux Kernel Mailing List

On Mon, May 9, 2022 at 12:14 AM HORIGUCHI NAOYA(堀口 直也)
<naoya.horiguchi@nec.com> wrote:
>
> On Thu, Apr 28, 2022 at 10:25:33AM -0700, Yang Shi wrote:
> > On Tue, Apr 26, 2022 at 10:32 PM Naoya Horiguchi
> > <naoya.horiguchi@linux.dev> wrote:
> > >
> > > From: Naoya Horiguchi <naoya.horiguchi@nec.com>
> > >
> > > The following VM_BUG_ON_FOLIO() is triggered when memory error event
> > > happens on the (thp/folio) pages which are about to be freed:
> > >
> > >   [ 1160.232771] page:00000000b36a8a0f refcount:1 mapcount:0 mapping:0000000000000000 index:0x1 pfn:0x16a000
> > >   [ 1160.236916] page:00000000b36a8a0f refcount:0 mapcount:0 mapping:0000000000000000 index:0x1 pfn:0x16a000
> > >   [ 1160.240684] flags: 0x57ffffc0800000(hwpoison|node=1|zone=2|lastcpupid=0x1fffff)
> > >   [ 1160.243458] raw: 0057ffffc0800000 dead000000000100 dead000000000122 0000000000000000
> > >   [ 1160.246268] raw: 0000000000000001 0000000000000000 00000000ffffffff 0000000000000000
> > >   [ 1160.249197] page dumped because: VM_BUG_ON_FOLIO(!folio_test_large(folio))
> > >   [ 1160.251815] ------------[ cut here ]------------
> > >   [ 1160.253438] kernel BUG at include/linux/mm.h:788!
> > >   [ 1160.256162] invalid opcode: 0000 [#1] PREEMPT SMP PTI
> > >   [ 1160.258172] CPU: 2 PID: 115368 Comm: mceinj.sh Tainted: G            E     5.18.0-rc1-v5.18-rc1-220404-2353-005-g83111+ #3
> > >   [ 1160.262049] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.15.0-1.fc35 04/01/2014
> > >   [ 1160.265103] RIP: 0010:dump_page.cold+0x27e/0x2bd
> > >   [ 1160.266757] Code: fe ff ff 48 c7 c6 81 f1 5a 98 e9 4c fe ff ff 48 c7 c6 a1 95 59 98 e9 40 fe ff ff 48 c7 c6 50 bf 5a 98 48 89 ef e8 9d 04 6d ff <0f> 0b 41 f7 c4 ff 0f 00 00 0f 85 9f fd ff ff 49 8b 04 24 a9 00 00
> > >   [ 1160.273180] RSP: 0018:ffffaa2c4d59fd18 EFLAGS: 00010292
> > >   [ 1160.274969] RAX: 000000000000003e RBX: 0000000000000001 RCX: 0000000000000000
> > >   [ 1160.277263] RDX: 0000000000000001 RSI: ffffffff985995a1 RDI: 00000000ffffffff
> > >   [ 1160.279571] RBP: ffffdc9c45a80000 R08: 0000000000000000 R09: 00000000ffffdfff
> > >   [ 1160.281794] R10: ffffaa2c4d59fb08 R11: ffffffff98940d08 R12: ffffdc9c45a80000
> > >   [ 1160.283920] R13: ffffffff985b6f94 R14: 0000000000000000 R15: ffffdc9c45a80000
> > >   [ 1160.286641] FS:  00007eff54ce1740(0000) GS:ffff99c67bd00000(0000) knlGS:0000000000000000
> > >   [ 1160.289498] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> > >   [ 1160.291106] CR2: 00005628381a5f68 CR3: 0000000104712003 CR4: 0000000000170ee0
> > >   [ 1160.293031] Call Trace:
> > >   [ 1160.293724]  <TASK>
> > >   [ 1160.294334]  get_hwpoison_page+0x47d/0x570
> > >   [ 1160.295474]  memory_failure+0x106/0xaa0
> > >   [ 1160.296474]  ? security_capable+0x36/0x50
> > >   [ 1160.297524]  hard_offline_page_store+0x43/0x80
> > >   [ 1160.298684]  kernfs_fop_write_iter+0x11c/0x1b0
> > >   [ 1160.299829]  new_sync_write+0xf9/0x160
> > >   [ 1160.300810]  vfs_write+0x209/0x290
> > >   [ 1160.301835]  ksys_write+0x4f/0xc0
> > >   [ 1160.302718]  do_syscall_64+0x3b/0x90
> > >   [ 1160.303664]  entry_SYSCALL_64_after_hwframe+0x44/0xae
> > >   [ 1160.304981] RIP: 0033:0x7eff54b018b7
> > >
> > > As shown in the RIP address, this VM_BUG_ON in folio_entire_mapcount() is
> > > called from dump_page("hwpoison: unhandlable page") in get_any_page().
> > > The below explains the mechanism of the race:
> > >
> > >   CPU 0                                       CPU 1
> > >
> > >     memory_failure
> > >       get_hwpoison_page
> > >         get_any_page
> > >           dump_page
> > >             compound = PageCompound
> > >                                                 free_pages_prepare
> > >                                                   page->flags &= ~PAGE_FLAGS_CHECK_AT_PREP
> > >             folio_entire_mapcount
> > >               VM_BUG_ON_FOLIO(!folio_test_large(folio))
> > >
> > > So replace dump_page() with safer one, pr_err().
> > >
> > > Fixes: 74e8ee4708a8 ("mm: Turn head_compound_mapcount() into folio_entire_mapcount()")
> > > Signed-off-by: Naoya Horiguchi <naoya.horiguchi@nec.com>
> > > ---
> > > ChangeLog v1 -> v2:
> > > - v1: https://lore.kernel.org/linux-mm/20220414235950.840409-1-naoya.horiguchi@linux.dev/T/#u
> > > - update caller side instead of changing dump_page().
> > > ---
> > >  mm/memory-failure.c | 2 +-
> > >  1 file changed, 1 insertion(+), 1 deletion(-)
> > >
> > > diff --git a/mm/memory-failure.c b/mm/memory-failure.c
> > > index 35e11d6bea4a..0e1453514a2b 100644
> > > --- a/mm/memory-failure.c
> > > +++ b/mm/memory-failure.c
> > > @@ -1270,7 +1270,7 @@ static int get_any_page(struct page *p, unsigned long flags)
> > >         }
> > >  out:
> > >         if (ret == -EIO)
> > > -               dump_page(p, "hwpoison: unhandlable page");
> > > +               pr_err("Memory failure: %#lx: unhandlable page.\n", page_to_pfn(p));
> >
> > I think dump_page() is helpful to tell the users more information
> > about the unhandlable page, I'm ok with this fix for now, but should
> > we consider having a memory failure safe dump_page() in the future?
>
> Yes, maybe that would be helpful not only in this unhandlable case, so sounds
> good to me.  But how do we handle folio's case?  And I'm not sure that the full
> info in dump_page() is needed in a memory_failure-specific variant.

Off the top of my head, we just dump the raw information of the page
and the head page if it belongs to a large folio at the moment? This
may be just slightly different from dump_page().

>
> Thanks,
> Naoya Horiguchi


^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH v2] mm/hwpoison: use pr_err() instead of dump_page() in get_any_page()
  2022-04-27  5:32 [PATCH v2] mm/hwpoison: use pr_err() instead of dump_page() in get_any_page() Naoya Horiguchi
                   ` (3 preceding siblings ...)
  2022-04-28 17:25 ` Yang Shi
@ 2022-05-10  2:28 ` Matthew Wilcox
  2022-05-10 10:31   ` HORIGUCHI NAOYA(堀口 直也)
  4 siblings, 1 reply; 10+ messages in thread
From: Matthew Wilcox @ 2022-05-10  2:28 UTC (permalink / raw)
  To: Naoya Horiguchi
  Cc: linux-mm, Miaohe Lin, Christoph Hellwig, Andrew Morton,
	John Hubbard, Jason Gunthorpe, William Kucharski,
	Naoya Horiguchi, linux-kernel

On Wed, Apr 27, 2022 at 02:32:20PM +0900, Naoya Horiguchi wrote:
> From: Naoya Horiguchi <naoya.horiguchi@nec.com>
> 
> The following VM_BUG_ON_FOLIO() is triggered when memory error event
> happens on the (thp/folio) pages which are about to be freed:

So the real problem is that we're calling dump_page() when we don't
have a reference to the page, right?  Otherwise it wouldn't be freed.

>  out:
>  	if (ret == -EIO)
> -		dump_page(p, "hwpoison: unhandlable page");
> +		pr_err("Memory failure: %#lx: unhandlable page.\n", page_to_pfn(p));

It would be nice to get some more information out of the page than that
,.. but taking a refcount inside dump_page() conflicts with the other
"would be nice", which is for dump_page() to take a const struct page *
so we can (eg) make folio_test_uptodate() take a const struct folio *.

We've had some other problems with inconsistent pages being printed in
dump_page().  It can be quite confusing when debugging.  I still don't
have a good solution to that either.

I do have a proposal for reforming mapcount which will solve this
particular problem, but I'm not quite sure when I'll get to it.
This patch is probably the best thing to do for now.


^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH v2] mm/hwpoison: use pr_err() instead of dump_page() in get_any_page()
  2022-05-10  2:28 ` Matthew Wilcox
@ 2022-05-10 10:31   ` HORIGUCHI NAOYA(堀口 直也)
  0 siblings, 0 replies; 10+ messages in thread
From: HORIGUCHI NAOYA(堀口 直也) @ 2022-05-10 10:31 UTC (permalink / raw)
  To: Matthew Wilcox
  Cc: Naoya Horiguchi, linux-mm, Miaohe Lin, Christoph Hellwig,
	Andrew Morton, John Hubbard, Jason Gunthorpe, William Kucharski,
	linux-kernel

On Tue, May 10, 2022 at 03:28:05AM +0100, Matthew Wilcox wrote:
> On Wed, Apr 27, 2022 at 02:32:20PM +0900, Naoya Horiguchi wrote:
> > From: Naoya Horiguchi <naoya.horiguchi@nec.com>
> > 
> > The following VM_BUG_ON_FOLIO() is triggered when memory error event
> > happens on the (thp/folio) pages which are about to be freed:
> 
> So the real problem is that we're calling dump_page() when we don't
> have a reference to the page, right?  Otherwise it wouldn't be freed.

Right, that's the problem.

> 
> >  out:
> >  	if (ret == -EIO)
> > -		dump_page(p, "hwpoison: unhandlable page");
> > +		pr_err("Memory failure: %#lx: unhandlable page.\n", page_to_pfn(p));
> 
> It would be nice to get some more information out of the page than that
> ,.. but taking a refcount inside dump_page() conflicts with the other
> "would be nice", which is for dump_page() to take a const struct page *
> so we can (eg) make folio_test_uptodate() take a const struct folio *.
> 
> We've had some other problems with inconsistent pages being printed in
> dump_page().  It can be quite confusing when debugging.  I still don't
> have a good solution to that either.
> 
> I do have a proposal for reforming mapcount which will solve this
> particular problem, but I'm not quite sure when I'll get to it.
> This patch is probably the best thing to do for now.

Thank you for the comment.  So I leave this code simple (or some
hwpoison specific dump_page() if someone have good ideas) until some
ultimate solution is ready for upstream.

Thanks,
Naoya Horiguchi

^ permalink raw reply	[flat|nested] 10+ messages in thread

end of thread, other threads:[~2022-05-10 10:32 UTC | newest]

Thread overview: 10+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2022-04-27  5:32 [PATCH v2] mm/hwpoison: use pr_err() instead of dump_page() in get_any_page() Naoya Horiguchi
2022-04-27 19:15 ` Andrew Morton
2022-04-27 22:37   ` HORIGUCHI NAOYA(堀口 直也)
2022-04-27 23:48 ` John Hubbard
2022-04-28  1:46 ` Miaohe Lin
2022-04-28 17:25 ` Yang Shi
2022-05-09  7:14   ` ##freemail## " HORIGUCHI NAOYA(堀口 直也)
2022-05-10  0:16     ` Yang Shi
2022-05-10  2:28 ` Matthew Wilcox
2022-05-10 10:31   ` HORIGUCHI NAOYA(堀口 直也)

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).