linux-kernel.vger.kernel.org archive mirror
* [PATCH] fs: improve dump_mapping() robustness
@ 2024-01-16  7:53 Baolin Wang
  2024-01-16 11:16 ` Christian Brauner
  2024-01-18  1:38 ` Al Viro
  0 siblings, 2 replies; 10+ messages in thread
From: Baolin Wang @ 2024-01-16  7:53 UTC (permalink / raw)
  To: akpm
  Cc: willy, viro, brauner, jack, baolin.wang, linux-mm, linux-fsdevel,
	linux-kernel

We hit a kernel crash while running stress-ng tests: the system crashes
when printing the dentry name in dump_mapping().

Unable to handle kernel NULL pointer dereference at virtual address 0000000000000000
pc : dentry_name+0xd8/0x224
lr : pointer+0x22c/0x370
sp : ffff800025f134c0
......
Call trace:
  dentry_name+0xd8/0x224
  pointer+0x22c/0x370
  vsnprintf+0x1ec/0x730
  vscnprintf+0x2c/0x60
  vprintk_store+0x70/0x234
  vprintk_emit+0xe0/0x24c
  vprintk_default+0x3c/0x44
  vprintk_func+0x84/0x2d0
  printk+0x64/0x88
  __dump_page+0x52c/0x530
  dump_page+0x14/0x20
  set_migratetype_isolate+0x110/0x224
  start_isolate_page_range+0xc4/0x20c
  offline_pages+0x124/0x474
  memory_block_offline+0x44/0xf4
  memory_subsys_offline+0x3c/0x70
  device_offline+0xf0/0x120
  ......

The root cause is that one thread is doing page migration, which uses the
target page's ->mapping field to save the 'anon_vma' pointer between page
unmap and page move. At that point the target page is locked and its
refcount is 1.

Concurrently, another stress-ng thread is performing memory hotplug and
attempts to offline the target page that is being migrated. It finds that
the refcount of the target page is 1, which prevents the offline operation,
so it proceeds to dump the page. However, page_mapping() of the target
page may return an incorrect file mapping and crash the system in
dump_mapping(), because the target page->mapping only holds the 'anon_vma'
pointer without the PAGE_MAPPING_ANON flag set.
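
As an illustration of the paragraph above, here is a minimal userspace
sketch of how an untagged 'anon_vma' pointer can be mistaken for a file
mapping. It is a model under stated assumptions, not the kernel code: the
model_* names and MODEL_MAPPING_ANON are hypothetical stand-ins for
struct page, page_mapping() and PAGE_MAPPING_ANON.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Userspace stand-ins only; these are not the kernel's definitions. */
struct address_space { int dummy; };
struct anon_vma { int dummy; };

#define MODEL_MAPPING_ANON 0x1UL	/* models PAGE_MAPPING_ANON */

struct model_page {
	uintptr_t mapping;	/* pointer value plus optional low tag bit */
};

/* Normal case: store the anon_vma with the tag bit set. */
static void model_set_anon(struct model_page *page, struct anon_vma *av)
{
	/* The tag lives in the low bit, so the pointer must be aligned. */
	assert(((uintptr_t)av & MODEL_MAPPING_ANON) == 0);
	page->mapping = (uintptr_t)av | MODEL_MAPPING_ANON;
}

/* Migration window described above: anon_vma stored without the tag. */
static void model_stash_untagged(struct model_page *page, struct anon_vma *av)
{
	page->mapping = (uintptr_t)av;
}

/* Models page_mapping(): NULL if tagged, else treated as a file mapping. */
static struct address_space *model_page_mapping(const struct model_page *page)
{
	if (page->mapping & MODEL_MAPPING_ANON)
		return NULL;
	return (struct address_space *)page->mapping;
}
```

In this model a tagged page yields NULL, while the untagged stash returns
the anon_vma address reinterpreted as an address_space, which is the value
a caller such as dump_mapping() would then walk.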

The page migration issue has been fixed by commit d1adb25df711 ("mm: migrate:
fix getting incorrect page mapping during page migration"). In addition,
Matthew suggested that we also make dump_mapping() more robust against
such kernel crashes [1].

By checking the 'dentry.d_parent' and 'dentry.d_name.name' fields used by
dentry_name(), dump_mapping() now reports an invalid dentry instead of
crashing the system when this issue is reproduced.

[12211.189128] page:fffff7de047741c0 refcount:1 mapcount:0 mapping:ffff989117f55ea0 index:0x1 pfn:0x211dd07
[12211.189144] aops:0x0 ino:1 invalid dentry:74786574206e6870
[12211.189148] flags: 0x57ffffc0000001(locked|node=1|zone=2|lastcpupid=0x1fffff)
[12211.189150] page_type: 0xffffffff()
[12211.189153] raw: 0057ffffc0000001 0000000000000000 dead000000000122 ffff989117f55ea0
[12211.189154] raw: 0000000000000001 0000000000000001 00000001ffffffff 0000000000000000
[12211.189155] page dumped because: unmovable page

[1] https://lore.kernel.org/all/ZXxn%2F0oixJxxAnpF@casper.infradead.org/
Suggested-by: Matthew Wilcox <willy@infradead.org>
Signed-off-by: Baolin Wang <baolin.wang@linux.alibaba.com>
---
 fs/inode.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/fs/inode.c b/fs/inode.c
index 99d8754a74a3..3093e3b3fd12 100644
--- a/fs/inode.c
+++ b/fs/inode.c
@@ -589,7 +589,8 @@ void dump_mapping(const struct address_space *mapping)
 	}
 
 	dentry_ptr = container_of(dentry_first, struct dentry, d_u.d_alias);
-	if (get_kernel_nofault(dentry, dentry_ptr)) {
+	if (get_kernel_nofault(dentry, dentry_ptr) ||
+	    !dentry.d_parent || !dentry.d_name.name) {
 		pr_warn("aops:%ps ino:%lx invalid dentry:%px\n",
 				a_ops, ino, dentry_ptr);
 		return;
-- 
2.39.3
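
The check added by the hunk above can be modelled in userspace as follows.
This is only a sketch: guard_dentry, guard_qstr and guard_describe() are
hypothetical names, the get_kernel_nofault() copy is assumed to have
already succeeded, and plain snprintf() stands in for pr_warn() with %pd.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Userspace stand-ins for the two fields that the patch inspects. */
struct guard_qstr {
	const char *name;
};

struct guard_dentry {
	struct guard_dentry *d_parent;
	struct guard_qstr d_name;
};

/*
 * Describe a dentry snapshot in buf. Returns 1 if the name was printed and
 * 0 if the snapshot was rejected, mirroring the early return in the patch.
 */
static int guard_describe(const struct guard_dentry *snap,
			  const void *dentry_ptr, char *buf, size_t len)
{
	assert(snap && buf && len > 0);

	/* Same condition as the patch: refuse NULL parent or NULL name. */
	if (!snap->d_parent || !snap->d_name.name) {
		snprintf(buf, len, "invalid dentry:%p", dentry_ptr);
		return 0;
	}

	snprintf(buf, len, "dentry name:\"%s\"", snap->d_name.name);
	return 1;
}
```

Note that the model only rejects NULL fields in the snapshot; it says
nothing about whether a non-NULL name pointer is still valid.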



* Re: [PATCH] fs: improve dump_mapping() robustness
  2024-01-16  7:53 [PATCH] fs: improve dump_mapping() robustness Baolin Wang
@ 2024-01-16 11:16 ` Christian Brauner
  2024-01-18  1:27   ` Baolin Wang
  2024-01-18  1:38 ` Al Viro
  1 sibling, 1 reply; 10+ messages in thread
From: Christian Brauner @ 2024-01-16 11:16 UTC (permalink / raw)
  To: Baolin Wang
  Cc: Christian Brauner, willy, viro, jack, linux-mm, linux-fsdevel,
	linux-kernel, akpm

On Tue, 16 Jan 2024 15:53:35 +0800, Baolin Wang wrote:
> We met a kernel crash issue when running stress-ng testing, and the
> system crashes when printing the dentry name in dump_mapping().
> 
> Unable to handle kernel NULL pointer dereference at virtual address 0000000000000000
> pc : dentry_name+0xd8/0x224
> lr : pointer+0x22c/0x370
> sp : ffff800025f134c0
> ......
> Call trace:
>   dentry_name+0xd8/0x224
>   pointer+0x22c/0x370
>   vsnprintf+0x1ec/0x730
>   vscnprintf+0x2c/0x60
>   vprintk_store+0x70/0x234
>   vprintk_emit+0xe0/0x24c
>   vprintk_default+0x3c/0x44
>   vprintk_func+0x84/0x2d0
>   printk+0x64/0x88
>   __dump_page+0x52c/0x530
>   dump_page+0x14/0x20
>   set_migratetype_isolate+0x110/0x224
>   start_isolate_page_range+0xc4/0x20c
>   offline_pages+0x124/0x474
>   memory_block_offline+0x44/0xf4
>   memory_subsys_offline+0x3c/0x70
>   device_offline+0xf0/0x120
>   ......
> 
> [...]

Seems fine for debugging purposes. Let me know if this needs to go through
somewhere else.

---

Applied to the vfs.misc branch of the vfs/vfs.git tree.
Patches in the vfs.misc branch should appear in linux-next soon.

Please report any outstanding bugs that were missed during review in a
new review to the original patch series allowing us to drop it.

It's encouraged to provide Acked-bys and Reviewed-bys even though the
patch has now been applied. If possible patch trailers will be updated.

Note that commit hashes shown below are subject to change due to rebase,
trailer updates or similar. If in doubt, please check the listed branch.

tree:   https://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs.git
branch: vfs.misc

[1/1] fs: improve dump_mapping() robustness
      https://git.kernel.org/vfs/vfs/c/30a1b9d12728


* Re: [PATCH] fs: improve dump_mapping() robustness
  2024-01-16 11:16 ` Christian Brauner
@ 2024-01-18  1:27   ` Baolin Wang
  0 siblings, 0 replies; 10+ messages in thread
From: Baolin Wang @ 2024-01-18  1:27 UTC (permalink / raw)
  To: Christian Brauner
  Cc: willy, viro, jack, linux-mm, linux-fsdevel, linux-kernel, akpm



On 1/16/2024 7:16 PM, Christian Brauner wrote:
> On Tue, 16 Jan 2024 15:53:35 +0800, Baolin Wang wrote:
>> We met a kernel crash issue when running stress-ng testing, and the
>> system crashes when printing the dentry name in dump_mapping().
>>
>> Unable to handle kernel NULL pointer dereference at virtual address 0000000000000000
>> pc : dentry_name+0xd8/0x224
>> lr : pointer+0x22c/0x370
>> sp : ffff800025f134c0
>> ......
>> Call trace:
>>    dentry_name+0xd8/0x224
>>    pointer+0x22c/0x370
>>    vsnprintf+0x1ec/0x730
>>    vscnprintf+0x2c/0x60
>>    vprintk_store+0x70/0x234
>>    vprintk_emit+0xe0/0x24c
>>    vprintk_default+0x3c/0x44
>>    vprintk_func+0x84/0x2d0
>>    printk+0x64/0x88
>>    __dump_page+0x52c/0x530
>>    dump_page+0x14/0x20
>>    set_migratetype_isolate+0x110/0x224
>>    start_isolate_page_range+0xc4/0x20c
>>    offline_pages+0x124/0x474
>>    memory_block_offline+0x44/0xf4
>>    memory_subsys_offline+0x3c/0x70
>>    device_offline+0xf0/0x120
>>    ......
>>
>> [...]
> 
> Seems fine for debugging purposes. Let me know if this needs to go through
> somewhere else.

Going through the VFS tree is fine with me. Thanks.


* Re: [PATCH] fs: improve dump_mapping() robustness
  2024-01-16  7:53 [PATCH] fs: improve dump_mapping() robustness Baolin Wang
  2024-01-16 11:16 ` Christian Brauner
@ 2024-01-18  1:38 ` Al Viro
  2024-01-18  2:12   ` Matthew Wilcox
  2024-01-18  2:43   ` Baolin Wang
  1 sibling, 2 replies; 10+ messages in thread
From: Al Viro @ 2024-01-18  1:38 UTC (permalink / raw)
  To: Baolin Wang
  Cc: akpm, willy, brauner, jack, linux-mm, linux-fsdevel, linux-kernel

On Tue, Jan 16, 2024 at 03:53:35PM +0800, Baolin Wang wrote:

> With checking the 'dentry.parent' and 'dentry.d_name.name' used by
> dentry_name(), I can see dump_mapping() will output the invalid dentry
> instead of crashing the system when this issue is reproduced again.

>  	dentry_ptr = container_of(dentry_first, struct dentry, d_u.d_alias);
> -	if (get_kernel_nofault(dentry, dentry_ptr)) {
> +	if (get_kernel_nofault(dentry, dentry_ptr) ||
> +	    !dentry.d_parent || !dentry.d_name.name) {
>  		pr_warn("aops:%ps ino:%lx invalid dentry:%px\n",
>  				a_ops, ino, dentry_ptr);
>  		return;

That's nowhere near enough.  Your ->d_name.name can bloody well be pointing
to an external name that gets freed right under you.  Legitimately so.

Think what happens if dentry has a long name (longer than would fit into
the embedded array) and gets renamed just after you copy it into
a local variable.  Old name will get freed.  Yes, freeing is RCU-delayed,
but I don't see anything that would prevent your thread losing CPU
and not getting it back until after the sucker's been freed.
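
The scenario described above can be sketched with a small userspace model.
Everything here is an assumption for illustration: race_name, race_dentry
and the 'live' flag are hypothetical, and clearing the flag stands in for
the RCU-delayed free of an external name, so no freed memory is touched.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* A name buffer that records whether it has been "freed". */
struct race_name {
	char text[64];
	bool live;
};

/* Models a dentry whose ->d_name.name points at an external name. */
struct race_dentry {
	struct race_name *name;
};

/* Models rename: switch to the new name, then release the old one. */
static void race_rename(struct race_dentry *d, struct race_name *new_name)
{
	struct race_name *old = d->name;

	assert(new_name && new_name->live);
	d->name = new_name;
	old->live = false;	/* stands in for the delayed free */
}

/* True if a previously copied snapshot still points at a live name. */
static bool race_snapshot_name_valid(const struct race_dentry *snap)
{
	return snap->name && snap->name->live;
}
```

Copying a race_dentry into a local variable and then calling race_rename()
on the original leaves the copy with a non-NULL but stale name pointer, so
a NULL check on the copy would still pass.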


* Re: [PATCH] fs: improve dump_mapping() robustness
  2024-01-18  1:38 ` Al Viro
@ 2024-01-18  2:12   ` Matthew Wilcox
  2024-01-18 14:04     ` Christian Brauner
  2024-01-18  2:43   ` Baolin Wang
  1 sibling, 1 reply; 10+ messages in thread
From: Matthew Wilcox @ 2024-01-18  2:12 UTC (permalink / raw)
  To: Al Viro
  Cc: Baolin Wang, akpm, brauner, jack, linux-mm, linux-fsdevel, linux-kernel

On Thu, Jan 18, 2024 at 01:38:57AM +0000, Al Viro wrote:
> On Tue, Jan 16, 2024 at 03:53:35PM +0800, Baolin Wang wrote:
> 
> > With checking the 'dentry.parent' and 'dentry.d_name.name' used by
> > dentry_name(), I can see dump_mapping() will output the invalid dentry
> > instead of crashing the system when this issue is reproduced again.
> 
> >  	dentry_ptr = container_of(dentry_first, struct dentry, d_u.d_alias);
> > -	if (get_kernel_nofault(dentry, dentry_ptr)) {
> > +	if (get_kernel_nofault(dentry, dentry_ptr) ||
> > +	    !dentry.d_parent || !dentry.d_name.name) {
> >  		pr_warn("aops:%ps ino:%lx invalid dentry:%px\n",
> >  				a_ops, ino, dentry_ptr);
> >  		return;
> 
> That's nowhere near enough.  Your ->d_name.name can bloody well be pointing
> to an external name that gets freed right under you.  Legitimately so.
> 
> Think what happens if dentry has a long name (longer than would fit into
> the embedded array) and gets renamed name just after you copy it into
> a local variable.  Old name will get freed.  Yes, freeing is RCU-delayed,
> but I don't see anything that would prevent your thread losing CPU
> and not getting it back until after the sucker's been freed.

Agreed that it's not enough.  It does usually work, and it's very
helpful when it does.  We've had it since 2018 (1c6fb1d89e73) and we've
been gradually making it more robust over time.  Part of my reason for
splitting dump_mapping() out of dump_page() was so that it would get
more review from people who understand the fs side of things ... and
that seems to have worked.

Can I trouble you to suggest a more robust solution?  Bear in mind that
dump_page() does get called on pointers which turn out not to even be
pointers to struct page so this is all very much best-effort, and giving
up and printing "this is not a dentry" is always an option.


* Re: [PATCH] fs: improve dump_mapping() robustness
  2024-01-18  1:38 ` Al Viro
  2024-01-18  2:12   ` Matthew Wilcox
@ 2024-01-18  2:43   ` Baolin Wang
  2024-01-19 15:48     ` Charan Teja Kalla
  1 sibling, 1 reply; 10+ messages in thread
From: Baolin Wang @ 2024-01-18  2:43 UTC (permalink / raw)
  To: Al Viro; +Cc: akpm, willy, brauner, jack, linux-mm, linux-fsdevel, linux-kernel



On 1/18/2024 9:38 AM, Al Viro wrote:
> On Tue, Jan 16, 2024 at 03:53:35PM +0800, Baolin Wang wrote:
> 
>> With checking the 'dentry.parent' and 'dentry.d_name.name' used by
>> dentry_name(), I can see dump_mapping() will output the invalid dentry
>> instead of crashing the system when this issue is reproduced again.
> 
>>   	dentry_ptr = container_of(dentry_first, struct dentry, d_u.d_alias);
>> -	if (get_kernel_nofault(dentry, dentry_ptr)) {
>> +	if (get_kernel_nofault(dentry, dentry_ptr) ||
>> +	    !dentry.d_parent || !dentry.d_name.name) {
>>   		pr_warn("aops:%ps ino:%lx invalid dentry:%px\n",
>>   				a_ops, ino, dentry_ptr);
>>   		return;
> 
> That's nowhere near enough.  Your ->d_name.name can bloody well be pointing
> to an external name that gets freed right under you.  Legitimately so.
> 
> Think what happens if dentry has a long name (longer than would fit into
> the embedded array) and gets renamed name just after you copy it into
> a local variable.  Old name will get freed.  Yes, freeing is RCU-delayed,
> but I don't see anything that would prevent your thread losing CPU
> and not getting it back until after the sucker's been freed.

Yes, that's possible. And this appears to be a use-after-free issue in 
the existing code, which is different from the issue that my patch 
addressed.

So how about adding a rcu_read_lock() before copying the dentry to a 
local variable in case the old name is freed?


* Re: [PATCH] fs: improve dump_mapping() robustness
  2024-01-18  2:12   ` Matthew Wilcox
@ 2024-01-18 14:04     ` Christian Brauner
  0 siblings, 0 replies; 10+ messages in thread
From: Christian Brauner @ 2024-01-18 14:04 UTC (permalink / raw)
  To: Matthew Wilcox
  Cc: Al Viro, Baolin Wang, akpm, jack, linux-mm, linux-fsdevel, linux-kernel

> Agreed that it's not enough.  It does usually work, and it's very

Wasn't the intent to just get somewhat further and accept that this
might crash no matter what?


* Re: [PATCH] fs: improve dump_mapping() robustness
  2024-01-18  2:43   ` Baolin Wang
@ 2024-01-19 15:48     ` Charan Teja Kalla
  2024-01-22  7:17       ` Baolin Wang
  0 siblings, 1 reply; 10+ messages in thread
From: Charan Teja Kalla @ 2024-01-19 15:48 UTC (permalink / raw)
  To: Baolin Wang, Al Viro
  Cc: akpm, willy, brauner, jack, linux-mm, linux-fsdevel, linux-kernel

Hi Matthew/Baolin,

On 1/18/2024 8:13 AM, Baolin Wang wrote:
> 
> 
> On 1/18/2024 9:38 AM, Al Viro wrote:
>> On Tue, Jan 16, 2024 at 03:53:35PM +0800, Baolin Wang wrote:
>>
>>> With checking the 'dentry.parent' and 'dentry.d_name.name' used by
>>> dentry_name(), I can see dump_mapping() will output the invalid dentry
>>> instead of crashing the system when this issue is reproduced again.
>>
>>>       dentry_ptr = container_of(dentry_first, struct dentry,
>>> d_u.d_alias);
>>> -    if (get_kernel_nofault(dentry, dentry_ptr)) {
>>> +    if (get_kernel_nofault(dentry, dentry_ptr) ||
>>> +        !dentry.d_parent || !dentry.d_name.name) {
>>>           pr_warn("aops:%ps ino:%lx invalid dentry:%px\n",
>>>                   a_ops, ino, dentry_ptr);
>>>           return;
>>
>> That's nowhere near enough.  Your ->d_name.name can bloody well be
>> pointing
>> to an external name that gets freed right under you.  Legitimately so.
>>
>> Think what happens if dentry has a long name (longer than would fit into
>> the embedded array) and gets renamed name just after you copy it into
>> a local variable.  Old name will get freed.  Yes, freeing is RCU-delayed,
>> but I don't see anything that would prevent your thread losing CPU
>> and not getting it back until after the sucker's been freed.
> 
> Yes, that's possible. And this appears to be a use-after-free issue in
> the existing code, which is different from the issue that my patch
> addressed.
> 
> So how about adding a rcu_read_lock() before copying the dentry to a
> local variable in case the old name is freed?
> 

We have also seen the crash below while printing the dentry name.

aops:shmem_aops ino:5e029 dentry name:"dev/zero"
flags:
0x8000000000080006(referenced|uptodate|swapbacked|zone=2|kasantag=0x0)
raw: 8000000000080006 ffffffc033b1bb60 ffffffc033b1bb60 ffffff8862537600
raw: 0000000000000001 0000000000000000 00000003ffffffff ffffff807fe64000
page dumped because: migration failure
migrating pfn aef223 failed ret:1
page:000000009e72a120 refcount:3 mapcount:0 mapping:000000003325dda1
index:0x1 pfn:0xaef223
memcg:ffffff807fe64000
Unable to handle kernel NULL pointer dereference at virtual address
0000000000000000
Mem abort info:
  ESR = 0x0000000096000005
  EC = 0x25: DABT (current EL), IL = 32 bits
  SET = 0, FnV = 0
  EA = 0, S1PTW = 0
  FSC = 0x05: level 1 translation fault
Data abort info:
  ISV = 0, ISS = 0x00000005
  CM = 0, WnR = 0
user pgtable: 4k pages, 39-bit VAs, pgdp=000000090c12d000
[0000000000000000] pgd=0000000000000000, p4d=0000000000000000,
pud=0000000000000000
Internal error: Oops: 0000000096000005 [#1] PREEMPT SMP

dentry_name+0x1f8/0x3a8
pointer+0x3b0/0x6b8
vsnprintf+0x4a4/0x65c
vprintk_store+0x168/0x4a8
vprintk_emit+0x98/0x218
vprintk_default+0x44/0x70
vprintk+0xf0/0x138
_printk+0x54/0x80
dump_mapping+0x17c/0x188
dump_page+0x1d0/0x2e8
offline_pages+0x67c/0x898



I am not very familiar with the block layer internals but, to my
knowledge, this is what happens in my case:

memoffline                      dput()
(offline_pages)                 (as part of closing of the shmem file)
---------------                 --------------------------------------
                                .......
                                1) dentry_unlink_inode()
                                     hlist_del_init(&dentry->d_u.d_alias);

                                2) iput():
                                   a) inode->i_state |= I_FREEING
                                      .....
                                   b) evict_inode()->..->shmem_undo_range
                                      1) get the folios with elevated refcount
3) do_migrate_range():
   a) Because of the elevated
      refcount in 2.b.1, the
      migration of this page
      will fail.

                                      2) truncate_inode_folio() ->
                                           filemap_remove_folio():
                                         (deletes from the page cache,
                                          sets page->mapping = NULL,
                                          decrements the refcount on folio)
   b) Call dump_page():
      1) mapping = page_mapping(page);
      2) dump_mapping(mapping)
         a) We unlinked the dentry in 1),
            thus dentry_ptr from host->i_dentry.first
            is not a proper one.

         b) Printing the dentry name with %pd results in
            the crash mentioned above.


At least in this case, I think __this patchset in its current form can
help us__.

Thanks,
Charan


* Re: [PATCH] fs: improve dump_mapping() robustness
  2024-01-19 15:48     ` Charan Teja Kalla
@ 2024-01-22  7:17       ` Baolin Wang
  2024-01-22 16:28         ` Charan Teja Kalla
  0 siblings, 1 reply; 10+ messages in thread
From: Baolin Wang @ 2024-01-22  7:17 UTC (permalink / raw)
  To: Charan Teja Kalla, Al Viro
  Cc: akpm, willy, brauner, jack, linux-mm, linux-fsdevel, linux-kernel



On 1/19/2024 11:48 PM, Charan Teja Kalla wrote:
> Hi Matthew/Baolin,
> 
> On 1/18/2024 8:13 AM, Baolin Wang wrote:
>>
>>
>> On 1/18/2024 9:38 AM, Al Viro wrote:
>>> On Tue, Jan 16, 2024 at 03:53:35PM +0800, Baolin Wang wrote:
>>>
>>>> With checking the 'dentry.parent' and 'dentry.d_name.name' used by
>>>> dentry_name(), I can see dump_mapping() will output the invalid dentry
>>>> instead of crashing the system when this issue is reproduced again.
>>>
>>>>        dentry_ptr = container_of(dentry_first, struct dentry,
>>>> d_u.d_alias);
>>>> -    if (get_kernel_nofault(dentry, dentry_ptr)) {
>>>> +    if (get_kernel_nofault(dentry, dentry_ptr) ||
>>>> +        !dentry.d_parent || !dentry.d_name.name) {
>>>>            pr_warn("aops:%ps ino:%lx invalid dentry:%px\n",
>>>>                    a_ops, ino, dentry_ptr);
>>>>            return;
>>>
>>> That's nowhere near enough.  Your ->d_name.name can bloody well be
>>> pointing
>>> to an external name that gets freed right under you.  Legitimately so.
>>>
>>> Think what happens if dentry has a long name (longer than would fit into
>>> the embedded array) and gets renamed name just after you copy it into
>>> a local variable.  Old name will get freed.  Yes, freeing is RCU-delayed,
>>> but I don't see anything that would prevent your thread losing CPU
>>> and not getting it back until after the sucker's been freed.
>>
>> Yes, that's possible. And this appears to be a use-after-free issue in
>> the existing code, which is different from the issue that my patch
>> addressed.
>>
>> So how about adding a rcu_read_lock() before copying the dentry to a
>> local variable in case the old name is freed?
>>
> 
> We too seen the below crash while printing the dentry name.
> 
> aops:shmem_aops ino:5e029 dentry name:"dev/zero"
> flags:
> 0x8000000000080006(referenced|uptodate|swapbacked|zone=2|kasantag=0x0)
> raw: 8000000000080006 ffffffc033b1bb60 ffffffc033b1bb60 ffffff8862537600
> raw: 0000000000000001 0000000000000000 00000003ffffffff ffffff807fe64000
> page dumped because: migration failure
> migrating pfn aef223 failed ret:1
> page:000000009e72a120 refcount:3 mapcount:0 mapping:000000003325dda1
> index:0x1 pfn:0xaef223
> memcg:ffffff807fe64000
> Unable to handle kernel NULL pointer dereference at virtual address
> 0000000000000000
> Mem abort info:
>    ESR = 0x0000000096000005
>    EC = 0x25: DABT (current EL), IL = 32 bits
>    SET = 0, FnV = 0
>    EA = 0, S1PTW = 0
>    FSC = 0x05: level 1 translation fault
> Data abort info:
>    ISV = 0, ISS = 0x00000005
>    CM = 0, WnR = 0
> user pgtable: 4k pages, 39-bit VAs, pgdp=000000090c12d000
> [0000000000000000] pgd=0000000000000000, p4d=0000000000000000,
> pud=0000000000000000
> Internal error: Oops: 0000000096000005 [#1] PREEMPT SMP
> 
> dentry_name+0x1f8/0x3a8
> pointer+0x3b0/0x6b8
> vsnprintf+0x4a4/0x65c
> vprintk_store+0x168/0x4a8
> vprintk_emit+0x98/0x218
> vprintk_default+0x44/0x70
> vprintk+0xf0/0x138
> _printk+0x54/0x80
> dump_mapping+0x17c/0x188
> dump_page+0x1d0/0x2e8
> offline_pages+0x67c/0x898
> 
> 
> 
> Not much comfortable with block layer internals, TMK, the below is what
> happening in the my case:
> memoffline	     		dput()
> (offline_pages)		 (as part of closing of the shmem file)
> ------------		 --------------------------------------
> 					.......
> 			1) dentry_unlink_inode()
> 			      hlist_del_init(&dentry->d_u.d_alias);
> 
> 			2) iput():
> 			    a) inode->i_state |= I_FREEING
> 				.....
> 			    b) evict_inode()->..->shmem_undo_range
> 			       1) get the folios with elevated refcount
> 3) do_migrate_range():
>     a) Because of the elevated
>     refcount in 2.b.1, the
>     migration of this page will
>     be failed.
> 
> 			       2) truncate_inode_folio() ->
> 				     filemap_remove_folio():
>   				(deletes from the page cache,
> 				 set page->mapping=NULL,
> 				 decrement the refcount on folio)
>    b) Call dump_page():
>       1) mapping = page_mapping(page);
>       2) dump_mapping(mapping)
> 	  a) We unlinked the dentry in 1)
>             thus dentry_ptr from host->i_dentry.first
>             is not a proper one.
> 
>           b) dentry name print with %pd is resulting into
> 	   the mentioned crash.
> 
> 
> At least in this case, I think __this patchset in its current form can
> help us__.

This looks like another case of NULL pointer access. Thanks for the
detailed analysis. Could you provide a Tested-by or Reviewed-by tag if it
solves your problem?


* Re: [PATCH] fs: improve dump_mapping() robustness
  2024-01-22  7:17       ` Baolin Wang
@ 2024-01-22 16:28         ` Charan Teja Kalla
  0 siblings, 0 replies; 10+ messages in thread
From: Charan Teja Kalla @ 2024-01-22 16:28 UTC (permalink / raw)
  To: Baolin Wang, Al Viro
  Cc: akpm, willy, brauner, jack, linux-mm, linux-fsdevel, linux-kernel



On 1/22/2024 12:47 PM, Baolin Wang wrote:
>>
>> We too seen the below crash while printing the dentry name.
>>
>> aops:shmem_aops ino:5e029 dentry name:"dev/zero"
>> flags:
>> 0x8000000000080006(referenced|uptodate|swapbacked|zone=2|kasantag=0x0)
>> raw: 8000000000080006 ffffffc033b1bb60 ffffffc033b1bb60 ffffff8862537600
>> raw: 0000000000000001 0000000000000000 00000003ffffffff ffffff807fe64000
>> page dumped because: migration failure
>> migrating pfn aef223 failed ret:1
>> page:000000009e72a120 refcount:3 mapcount:0 mapping:000000003325dda1
>> index:0x1 pfn:0xaef223
>> memcg:ffffff807fe64000
>> Unable to handle kernel NULL pointer dereference at virtual address
>> 0000000000000000
>> Mem abort info:
>>    ESR = 0x0000000096000005
>>    EC = 0x25: DABT (current EL), IL = 32 bits
>>    SET = 0, FnV = 0
>>    EA = 0, S1PTW = 0
>>    FSC = 0x05: level 1 translation fault
>> Data abort info:
>>    ISV = 0, ISS = 0x00000005
>>    CM = 0, WnR = 0
>> user pgtable: 4k pages, 39-bit VAs, pgdp=000000090c12d000
>> [0000000000000000] pgd=0000000000000000, p4d=0000000000000000,
>> pud=0000000000000000
>> Internal error: Oops: 0000000096000005 [#1] PREEMPT SMP
>>
>> dentry_name+0x1f8/0x3a8
>> pointer+0x3b0/0x6b8
>> vsnprintf+0x4a4/0x65c
>> vprintk_store+0x168/0x4a8
>> vprintk_emit+0x98/0x218
>> vprintk_default+0x44/0x70
>> vprintk+0xf0/0x138
>> _printk+0x54/0x80
>> dump_mapping+0x17c/0x188
>> dump_page+0x1d0/0x2e8
>> offline_pages+0x67c/0x898
>>
>>
>>
>> Not much comfortable with block layer internals, TMK, the below is what
>> happening in the my case:
>> memoffline                 dput()
>> (offline_pages)         (as part of closing of the shmem file)
>> ------------         --------------------------------------
>>                     .......
>>             1) dentry_unlink_inode()
>>                   hlist_del_init(&dentry->d_u.d_alias);
>>
>>             2) iput():
>>                 a) inode->i_state |= I_FREEING
>>                 .....
>>                 b) evict_inode()->..->shmem_undo_range
>>                    1) get the folios with elevated refcount
>> 3) do_migrate_range():
>>     a) Because of the elevated
>>     refcount in 2.b.1, the
>>     migration of this page will
>>     be failed.
>>
>>                    2) truncate_inode_folio() ->
>>                      filemap_remove_folio():
>>                   (deletes from the page cache,
>>                  set page->mapping=NULL,
>>                  decrement the refcount on folio)
>>    b) Call dump_page():
>>       1) mapping = page_mapping(page);
>>       2) dump_mapping(mapping)
>>       a) We unlinked the dentry in 1)
>>             thus dentry_ptr from host->i_dentry.first
>>             is not a proper one.
>>
>>           b) dentry name print with %pd is resulting into
>>        the mentioned crash.
>>
>>
>> At least in this case, I think __this patchset in its current form can
>> help us__.
> 
> This looks another case of NULL pointer access. Thanks for the detailed
> analysis. Could you provide a Tested-by or Reviewed-by tag if it can
> solve your problem?
We saw this issue a couple of times, more than 3 months ago, and I am not
sure we will ever hit it again. Still, I will pick up this patch and let
you know about any side effects after thorough testing.

Thanks.
