All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH] nilfs2: fix NULL pointer dereference in nilfs_palloc_commit_free_entry()
@ 2022-11-14  4:04 ` Peng Zhang
  0 siblings, 0 replies; 10+ messages in thread
From: Peng Zhang @ 2022-11-14  4:04 UTC (permalink / raw)
  To: konishi.ryusuke, ye.xingchen, chi.minghao, vishal.moola, linux-nilfs
  Cc: linux-kernel, ZhangPeng, syzbot+ebe05ee8e98f755f61d0

From: ZhangPeng <zhangpeng362@huawei.com>

Syzbot reported a null-ptr-deref bug:

NILFS (loop0): segctord starting. Construction interval = 5 seconds, CP
frequency < 30 seconds
general protection fault, probably for non-canonical address
0xdffffc0000000002: 0000 [#1] PREEMPT SMP KASAN
KASAN: null-ptr-deref in range [0x0000000000000010-0x0000000000000017]
CPU: 1 PID: 3603 Comm: segctord Not tainted
6.1.0-rc2-syzkaller-00105-gb229b6ca5abb #0
Hardware name: Google Compute Engine/Google Compute Engine, BIOS Google
10/11/2022
RIP: 0010:nilfs_palloc_commit_free_entry+0xe5/0x6b0
fs/nilfs2/alloc.c:608
Code: 00 00 00 00 fc ff df 80 3c 02 00 0f 85 cd 05 00 00 48 b8 00 00 00
00 00 fc ff df 4c 8b 73 08 49 8d 7e 10 48 89 fa 48 c1 ea 03 <80> 3c 02
00 0f 85 26 05 00 00 49 8b 46 10 be a6 00 00 00 48 c7 c7
RSP: 0018:ffffc90003dff830 EFLAGS: 00010212
RAX: dffffc0000000000 RBX: ffff88802594e218 RCX: 000000000000000d
RDX: 0000000000000002 RSI: 0000000000002000 RDI: 0000000000000010
RBP: ffff888071880222 R08: 0000000000000005 R09: 000000000000003f
R10: 000000000000000d R11: 0000000000000000 R12: ffff888071880158
R13: ffff88802594e220 R14: 0000000000000000 R15: 0000000000000004
FS:  0000000000000000(0000) GS:ffff8880b9b00000(0000)
knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00007fb1c08316a8 CR3: 0000000018560000 CR4: 0000000000350ee0
Call Trace:
 <TASK>
 nilfs_dat_commit_free fs/nilfs2/dat.c:114 [inline]
 nilfs_dat_commit_end+0x464/0x5f0 fs/nilfs2/dat.c:193
 nilfs_dat_commit_update+0x26/0x40 fs/nilfs2/dat.c:236
 nilfs_btree_commit_update_v+0x87/0x4a0 fs/nilfs2/btree.c:1940
 nilfs_btree_commit_propagate_v fs/nilfs2/btree.c:2016 [inline]
 nilfs_btree_propagate_v fs/nilfs2/btree.c:2046 [inline]
 nilfs_btree_propagate+0xa00/0xd60 fs/nilfs2/btree.c:2088
 nilfs_bmap_propagate+0x73/0x170 fs/nilfs2/bmap.c:337
 nilfs_collect_file_data+0x45/0xd0 fs/nilfs2/segment.c:568
 nilfs_segctor_apply_buffers+0x14a/0x470 fs/nilfs2/segment.c:1018
 nilfs_segctor_scan_file+0x3f4/0x6f0 fs/nilfs2/segment.c:1067
 nilfs_segctor_collect_blocks fs/nilfs2/segment.c:1197 [inline]
 nilfs_segctor_collect fs/nilfs2/segment.c:1503 [inline]
 nilfs_segctor_do_construct+0x12fc/0x6af0 fs/nilfs2/segment.c:2045
 nilfs_segctor_construct+0x8e3/0xb30 fs/nilfs2/segment.c:2379
 nilfs_segctor_thread_construct fs/nilfs2/segment.c:2487 [inline]
 nilfs_segctor_thread+0x3c3/0xf30 fs/nilfs2/segment.c:2570
 kthread+0x2e4/0x3a0 kernel/kthread.c:376
 ret_from_fork+0x1f/0x30 arch/x86/entry/entry_64.S:306
 </TASK>
Modules linked in:
---[ end trace 0000000000000000 ]---
RIP: 0010:nilfs_palloc_commit_free_entry+0xe5/0x6b0
fs/nilfs2/alloc.c:608
Code: 00 00 00 00 fc ff df 80 3c 02 00 0f 85 cd 05 00 00 48 b8 00 00 00
00 00 fc ff df 4c 8b 73 08 49 8d 7e 10 48 89 fa 48 c1 ea 03 <80> 3c 02
00 0f 85 26 05 00 00 49 8b 46 10 be a6 00 00 00 48 c7 c7
RSP: 0018:ffffc90003dff830 EFLAGS: 00010212
RAX: dffffc0000000000 RBX: ffff88802594e218 RCX: 000000000000000d
RDX: 0000000000000002 RSI: 0000000000002000 RDI: 0000000000000010
RBP: ffff888071880222 R08: 0000000000000005 R09: 000000000000003f
R10: 000000000000000d R11: 0000000000000000 R12: ffff888071880158
R13: ffff88802594e220 R14: 0000000000000000 R15: 0000000000000004
FS:  0000000000000000(0000) GS:ffff8880b9b00000(0000)
knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00007fb1c08316a8 CR3: 0000000018560000 CR4: 0000000000350ee0
----------------
Code disassembly (best guess), 7 bytes skipped:
   0:	80 3c 02 00          	cmpb   $0x0,(%rdx,%rax,1)
   4:	0f 85 cd 05 00 00    	jne    0x5d7
   a:	48 b8 00 00 00 00 00 	movabs $0xdffffc0000000000,%rax
  11:	fc ff df
  14:	4c 8b 73 08          	mov    0x8(%rbx),%r14
  18:	49 8d 7e 10          	lea    0x10(%r14),%rdi
  1c:	48 89 fa             	mov    %rdi,%rdx
  1f:	48 c1 ea 03          	shr    $0x3,%rdx
* 23:	80 3c 02 00          	cmpb   $0x0,(%rdx,%rax,1) <-- trapping
instruction
  27:	0f 85 26 05 00 00    	jne    0x553
  2d:	49 8b 46 10          	mov    0x10(%r14),%rax
  31:	be a6 00 00 00       	mov    $0xa6,%esi
  36:	48                   	rex.W
  37:	c7                   	.byte 0xc7
  38:	c7                   	.byte 0xc7

When maxlevelp is 1, there is a case where req->pr_desc_bh is NULL and
blocknr is 0, because nilfs_dat_commit_alloc() will modify the blocknr
of oldreq at one level higher to 0. And we don't have a NULL check on
req->pr_desc_bh and req->pr_bitmap_bh in
nilfs_palloc_commit_free_entry() function, so when req->pr_desc_bh is
NULL and kmap() dereferences a NULL pointer, it leads to above crash.
Fix this by adding a NULL check on req->pr_desc_bh and req->pr_bitmap_bh
before nilfs_palloc_commit_free_entry() in nilfs_dat_commit_free().

Reported-by: syzbot+ebe05ee8e98f755f61d0@syzkaller.appspotmail.com
Fixes: bd8169efae8b ("nilfs2: add update functions of virtual block address to dat")
Signed-off-by: ZhangPeng <zhangpeng362@huawei.com>
---
 fs/nilfs2/dat.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/fs/nilfs2/dat.c b/fs/nilfs2/dat.c
index 3b55e239705f..84ce050a8fa3 100644
--- a/fs/nilfs2/dat.c
+++ b/fs/nilfs2/dat.c
@@ -111,6 +111,9 @@ static void nilfs_dat_commit_free(struct inode *dat,
 	kunmap_atomic(kaddr);
 
 	nilfs_dat_commit_entry(dat, req);
+
+	if (req->pr_desc_bh == NULL || req->pr_bitmap_bh == NULL)
+		return;
 	nilfs_palloc_commit_free_entry(dat, req);
 }
 
-- 
2.25.1


^ permalink raw reply related	[flat|nested] 10+ messages in thread

* [PATCH] nilfs2: fix NULL pointer dereference in nilfs_palloc_commit_free_entry()
@ 2022-11-14  4:04 ` Peng Zhang
  0 siblings, 0 replies; 10+ messages in thread
From: Peng Zhang @ 2022-11-14  4:04 UTC (permalink / raw)
  To: konishi.ryusuke-Re5JQEeQqe8AvxtiuMwx3w,
	ye.xingchen-Th6q7B73Y6EnDS1+zs4M5A,
	chi.minghao-Th6q7B73Y6EnDS1+zs4M5A,
	vishal.moola-Re5JQEeQqe8AvxtiuMwx3w,
	linux-nilfs-u79uwXL29TY76Z2rM5mHXA
  Cc: linux-kernel-u79uwXL29TY76Z2rM5mHXA, ZhangPeng,
	syzbot+ebe05ee8e98f755f61d0-Pl5Pbv+GP7P466ipTTIvnc23WoclnBCfAL8bYrjMMd8

From: ZhangPeng <zhangpeng362-hv44wF8Li93QT0dZR+AlfA@public.gmane.org>

Syzbot reported a null-ptr-deref bug:

NILFS (loop0): segctord starting. Construction interval = 5 seconds, CP
frequency < 30 seconds
general protection fault, probably for non-canonical address
0xdffffc0000000002: 0000 [#1] PREEMPT SMP KASAN
KASAN: null-ptr-deref in range [0x0000000000000010-0x0000000000000017]
CPU: 1 PID: 3603 Comm: segctord Not tainted
6.1.0-rc2-syzkaller-00105-gb229b6ca5abb #0
Hardware name: Google Compute Engine/Google Compute Engine, BIOS Google
10/11/2022
RIP: 0010:nilfs_palloc_commit_free_entry+0xe5/0x6b0
fs/nilfs2/alloc.c:608
Code: 00 00 00 00 fc ff df 80 3c 02 00 0f 85 cd 05 00 00 48 b8 00 00 00
00 00 fc ff df 4c 8b 73 08 49 8d 7e 10 48 89 fa 48 c1 ea 03 <80> 3c 02
00 0f 85 26 05 00 00 49 8b 46 10 be a6 00 00 00 48 c7 c7
RSP: 0018:ffffc90003dff830 EFLAGS: 00010212
RAX: dffffc0000000000 RBX: ffff88802594e218 RCX: 000000000000000d
RDX: 0000000000000002 RSI: 0000000000002000 RDI: 0000000000000010
RBP: ffff888071880222 R08: 0000000000000005 R09: 000000000000003f
R10: 000000000000000d R11: 0000000000000000 R12: ffff888071880158
R13: ffff88802594e220 R14: 0000000000000000 R15: 0000000000000004
FS:  0000000000000000(0000) GS:ffff8880b9b00000(0000)
knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00007fb1c08316a8 CR3: 0000000018560000 CR4: 0000000000350ee0
Call Trace:
 <TASK>
 nilfs_dat_commit_free fs/nilfs2/dat.c:114 [inline]
 nilfs_dat_commit_end+0x464/0x5f0 fs/nilfs2/dat.c:193
 nilfs_dat_commit_update+0x26/0x40 fs/nilfs2/dat.c:236
 nilfs_btree_commit_update_v+0x87/0x4a0 fs/nilfs2/btree.c:1940
 nilfs_btree_commit_propagate_v fs/nilfs2/btree.c:2016 [inline]
 nilfs_btree_propagate_v fs/nilfs2/btree.c:2046 [inline]
 nilfs_btree_propagate+0xa00/0xd60 fs/nilfs2/btree.c:2088
 nilfs_bmap_propagate+0x73/0x170 fs/nilfs2/bmap.c:337
 nilfs_collect_file_data+0x45/0xd0 fs/nilfs2/segment.c:568
 nilfs_segctor_apply_buffers+0x14a/0x470 fs/nilfs2/segment.c:1018
 nilfs_segctor_scan_file+0x3f4/0x6f0 fs/nilfs2/segment.c:1067
 nilfs_segctor_collect_blocks fs/nilfs2/segment.c:1197 [inline]
 nilfs_segctor_collect fs/nilfs2/segment.c:1503 [inline]
 nilfs_segctor_do_construct+0x12fc/0x6af0 fs/nilfs2/segment.c:2045
 nilfs_segctor_construct+0x8e3/0xb30 fs/nilfs2/segment.c:2379
 nilfs_segctor_thread_construct fs/nilfs2/segment.c:2487 [inline]
 nilfs_segctor_thread+0x3c3/0xf30 fs/nilfs2/segment.c:2570
 kthread+0x2e4/0x3a0 kernel/kthread.c:376
 ret_from_fork+0x1f/0x30 arch/x86/entry/entry_64.S:306
 </TASK>
Modules linked in:
---[ end trace 0000000000000000 ]---
RIP: 0010:nilfs_palloc_commit_free_entry+0xe5/0x6b0
fs/nilfs2/alloc.c:608
Code: 00 00 00 00 fc ff df 80 3c 02 00 0f 85 cd 05 00 00 48 b8 00 00 00
00 00 fc ff df 4c 8b 73 08 49 8d 7e 10 48 89 fa 48 c1 ea 03 <80> 3c 02
00 0f 85 26 05 00 00 49 8b 46 10 be a6 00 00 00 48 c7 c7
RSP: 0018:ffffc90003dff830 EFLAGS: 00010212
RAX: dffffc0000000000 RBX: ffff88802594e218 RCX: 000000000000000d
RDX: 0000000000000002 RSI: 0000000000002000 RDI: 0000000000000010
RBP: ffff888071880222 R08: 0000000000000005 R09: 000000000000003f
R10: 000000000000000d R11: 0000000000000000 R12: ffff888071880158
R13: ffff88802594e220 R14: 0000000000000000 R15: 0000000000000004
FS:  0000000000000000(0000) GS:ffff8880b9b00000(0000)
knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00007fb1c08316a8 CR3: 0000000018560000 CR4: 0000000000350ee0
----------------
Code disassembly (best guess), 7 bytes skipped:
   0:	80 3c 02 00          	cmpb   $0x0,(%rdx,%rax,1)
   4:	0f 85 cd 05 00 00    	jne    0x5d7
   a:	48 b8 00 00 00 00 00 	movabs $0xdffffc0000000000,%rax
  11:	fc ff df
  14:	4c 8b 73 08          	mov    0x8(%rbx),%r14
  18:	49 8d 7e 10          	lea    0x10(%r14),%rdi
  1c:	48 89 fa             	mov    %rdi,%rdx
  1f:	48 c1 ea 03          	shr    $0x3,%rdx
* 23:	80 3c 02 00          	cmpb   $0x0,(%rdx,%rax,1) <-- trapping
instruction
  27:	0f 85 26 05 00 00    	jne    0x553
  2d:	49 8b 46 10          	mov    0x10(%r14),%rax
  31:	be a6 00 00 00       	mov    $0xa6,%esi
  36:	48                   	rex.W
  37:	c7                   	.byte 0xc7
  38:	c7                   	.byte 0xc7

When maxlevelp is 1, there is a case where req->pr_desc_bh is NULL and
blocknr is 0, because nilfs_dat_commit_alloc() will modify the blocknr
of oldreq at one level higher to 0. And we don't have a NULL check on
req->pr_desc_bh and req->pr_bitmap_bh in
nilfs_palloc_commit_free_entry() function, so when req->pr_desc_bh is
NULL and kmap() dereferences a NULL pointer, it leads to above crash.
Fix this by adding a NULL check on req->pr_desc_bh and req->pr_bitmap_bh
before nilfs_palloc_commit_free_entry() in nilfs_dat_commit_free().

Reported-by: syzbot+ebe05ee8e98f755f61d0-Pl5Pbv+GP7P466ipTTIvnc23WoclnBCfAL8bYrjMMd8@public.gmane.org
Fixes: bd8169efae8b ("nilfs2: add update functions of virtual block address to dat")
Signed-off-by: ZhangPeng <zhangpeng362-hv44wF8Li93QT0dZR+AlfA@public.gmane.org>
---
 fs/nilfs2/dat.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/fs/nilfs2/dat.c b/fs/nilfs2/dat.c
index 3b55e239705f..84ce050a8fa3 100644
--- a/fs/nilfs2/dat.c
+++ b/fs/nilfs2/dat.c
@@ -111,6 +111,9 @@ static void nilfs_dat_commit_free(struct inode *dat,
 	kunmap_atomic(kaddr);
 
 	nilfs_dat_commit_entry(dat, req);
+
+	if (req->pr_desc_bh == NULL || req->pr_bitmap_bh == NULL)
+		return;
 	nilfs_palloc_commit_free_entry(dat, req);
 }
 
-- 
2.25.1


^ permalink raw reply related	[flat|nested] 10+ messages in thread

* Re: [PATCH] nilfs2: fix NULL pointer dereference in nilfs_palloc_commit_free_entry()
@ 2022-11-14 18:39   ` Ryusuke Konishi
  0 siblings, 0 replies; 10+ messages in thread
From: Ryusuke Konishi @ 2022-11-14 18:39 UTC (permalink / raw)
  To: Peng Zhang
  Cc: ye.xingchen, chi.minghao, vishal.moola, linux-nilfs,
	linux-kernel, syzbot+ebe05ee8e98f755f61d0

On Mon, Nov 14, 2022 at 12:39 PM Peng Zhang wrote:
>
> From: ZhangPeng <zhangpeng362@huawei.com>
>
> Syzbot reported a null-ptr-deref bug:
>
> NILFS (loop0): segctord starting. Construction interval = 5 seconds, CP
> frequency < 30 seconds
> general protection fault, probably for non-canonical address
> 0xdffffc0000000002: 0000 [#1] PREEMPT SMP KASAN
> KASAN: null-ptr-deref in range [0x0000000000000010-0x0000000000000017]
> CPU: 1 PID: 3603 Comm: segctord Not tainted
> 6.1.0-rc2-syzkaller-00105-gb229b6ca5abb #0
> Hardware name: Google Compute Engine/Google Compute Engine, BIOS Google
> 10/11/2022
> RIP: 0010:nilfs_palloc_commit_free_entry+0xe5/0x6b0
> fs/nilfs2/alloc.c:608
> Code: 00 00 00 00 fc ff df 80 3c 02 00 0f 85 cd 05 00 00 48 b8 00 00 00
> 00 00 fc ff df 4c 8b 73 08 49 8d 7e 10 48 89 fa 48 c1 ea 03 <80> 3c 02
> 00 0f 85 26 05 00 00 49 8b 46 10 be a6 00 00 00 48 c7 c7
> RSP: 0018:ffffc90003dff830 EFLAGS: 00010212
> RAX: dffffc0000000000 RBX: ffff88802594e218 RCX: 000000000000000d
> RDX: 0000000000000002 RSI: 0000000000002000 RDI: 0000000000000010
> RBP: ffff888071880222 R08: 0000000000000005 R09: 000000000000003f
> R10: 000000000000000d R11: 0000000000000000 R12: ffff888071880158
> R13: ffff88802594e220 R14: 0000000000000000 R15: 0000000000000004
> FS:  0000000000000000(0000) GS:ffff8880b9b00000(0000)
> knlGS:0000000000000000
> CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> CR2: 00007fb1c08316a8 CR3: 0000000018560000 CR4: 0000000000350ee0
> Call Trace:
>  <TASK>
>  nilfs_dat_commit_free fs/nilfs2/dat.c:114 [inline]
>  nilfs_dat_commit_end+0x464/0x5f0 fs/nilfs2/dat.c:193
>  nilfs_dat_commit_update+0x26/0x40 fs/nilfs2/dat.c:236
>  nilfs_btree_commit_update_v+0x87/0x4a0 fs/nilfs2/btree.c:1940
>  nilfs_btree_commit_propagate_v fs/nilfs2/btree.c:2016 [inline]
>  nilfs_btree_propagate_v fs/nilfs2/btree.c:2046 [inline]
>  nilfs_btree_propagate+0xa00/0xd60 fs/nilfs2/btree.c:2088
>  nilfs_bmap_propagate+0x73/0x170 fs/nilfs2/bmap.c:337
>  nilfs_collect_file_data+0x45/0xd0 fs/nilfs2/segment.c:568
>  nilfs_segctor_apply_buffers+0x14a/0x470 fs/nilfs2/segment.c:1018
>  nilfs_segctor_scan_file+0x3f4/0x6f0 fs/nilfs2/segment.c:1067
>  nilfs_segctor_collect_blocks fs/nilfs2/segment.c:1197 [inline]
>  nilfs_segctor_collect fs/nilfs2/segment.c:1503 [inline]
>  nilfs_segctor_do_construct+0x12fc/0x6af0 fs/nilfs2/segment.c:2045
>  nilfs_segctor_construct+0x8e3/0xb30 fs/nilfs2/segment.c:2379
>  nilfs_segctor_thread_construct fs/nilfs2/segment.c:2487 [inline]
>  nilfs_segctor_thread+0x3c3/0xf30 fs/nilfs2/segment.c:2570
>  kthread+0x2e4/0x3a0 kernel/kthread.c:376
>  ret_from_fork+0x1f/0x30 arch/x86/entry/entry_64.S:306
>  </TASK>
> Modules linked in:
> ---[ end trace 0000000000000000 ]---
> RIP: 0010:nilfs_palloc_commit_free_entry+0xe5/0x6b0
> fs/nilfs2/alloc.c:608
> Code: 00 00 00 00 fc ff df 80 3c 02 00 0f 85 cd 05 00 00 48 b8 00 00 00
> 00 00 fc ff df 4c 8b 73 08 49 8d 7e 10 48 89 fa 48 c1 ea 03 <80> 3c 02
> 00 0f 85 26 05 00 00 49 8b 46 10 be a6 00 00 00 48 c7 c7
> RSP: 0018:ffffc90003dff830 EFLAGS: 00010212
> RAX: dffffc0000000000 RBX: ffff88802594e218 RCX: 000000000000000d
> RDX: 0000000000000002 RSI: 0000000000002000 RDI: 0000000000000010
> RBP: ffff888071880222 R08: 0000000000000005 R09: 000000000000003f
> R10: 000000000000000d R11: 0000000000000000 R12: ffff888071880158
> R13: ffff88802594e220 R14: 0000000000000000 R15: 0000000000000004
> FS:  0000000000000000(0000) GS:ffff8880b9b00000(0000)
> knlGS:0000000000000000
> CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> CR2: 00007fb1c08316a8 CR3: 0000000018560000 CR4: 0000000000350ee0
> ----------------
> Code disassembly (best guess), 7 bytes skipped:
>    0:   80 3c 02 00             cmpb   $0x0,(%rdx,%rax,1)
>    4:   0f 85 cd 05 00 00       jne    0x5d7
>    a:   48 b8 00 00 00 00 00    movabs $0xdffffc0000000000,%rax
>   11:   fc ff df
>   14:   4c 8b 73 08             mov    0x8(%rbx),%r14
>   18:   49 8d 7e 10             lea    0x10(%r14),%rdi
>   1c:   48 89 fa                mov    %rdi,%rdx
>   1f:   48 c1 ea 03             shr    $0x3,%rdx
> * 23:   80 3c 02 00             cmpb   $0x0,(%rdx,%rax,1) <-- trapping
> instruction
>   27:   0f 85 26 05 00 00       jne    0x553
>   2d:   49 8b 46 10             mov    0x10(%r14),%rax
>   31:   be a6 00 00 00          mov    $0xa6,%esi
>   36:   48                      rex.W
>   37:   c7                      .byte 0xc7
>   38:   c7                      .byte 0xc7
>
> When maxlevelp is 1, there is a case where req->pr_desc_bh is NULL and
> blocknr is 0, because nilfs_dat_commit_alloc() will modify the blocknr
> of oldreq at one level higher to 0. And we don't have a NULL check on
> req->pr_desc_bh and req->pr_bitmap_bh in
> nilfs_palloc_commit_free_entry() function, so when req->pr_desc_bh is
> NULL and kmap() dereferences a NULL pointer, it leads to above crash.
> Fix this by adding a NULL check on req->pr_desc_bh and req->pr_bitmap_bh
> before nilfs_palloc_commit_free_entry() in nilfs_dat_commit_free().
>
> Reported-by: syzbot+ebe05ee8e98f755f61d0@syzkaller.appspotmail.com
> Fixes: bd8169efae8b ("nilfs2: add update functions of virtual block address to dat")
> Signed-off-by: ZhangPeng <zhangpeng362@huawei.com>
> ---
>  fs/nilfs2/dat.c | 3 +++
>  1 file changed, 3 insertions(+)
>
> diff --git a/fs/nilfs2/dat.c b/fs/nilfs2/dat.c
> index 3b55e239705f..84ce050a8fa3 100644
> --- a/fs/nilfs2/dat.c
> +++ b/fs/nilfs2/dat.c
> @@ -111,6 +111,9 @@ static void nilfs_dat_commit_free(struct inode *dat,
>         kunmap_atomic(kaddr);
>
>         nilfs_dat_commit_entry(dat, req);
> +
> +       if (req->pr_desc_bh == NULL || req->pr_bitmap_bh == NULL)
> +               return;
>         nilfs_palloc_commit_free_entry(dat, req);
>  }
>
> --
> 2.25.1
>

Thank you for your help.

This patch actually fixes the NULL pointer dereference, and doesn't
seem to cause any regressions so far.
But, if de_blocknr is 0 in the first place, it assumes that
req->pr_desc_bh and req->pr_bitmap_bh are set properly in the
corresponding "prepare" operation, nilfs_dat_prepare_end(), so I'd
like to review in a little more detail why that is broken.

Thanks,
Ryusuke Konishi

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH] nilfs2: fix NULL pointer dereference in nilfs_palloc_commit_free_entry()
@ 2022-11-14 18:39   ` Ryusuke Konishi
  0 siblings, 0 replies; 10+ messages in thread
From: Ryusuke Konishi @ 2022-11-14 18:39 UTC (permalink / raw)
  To: Peng Zhang
  Cc: ye.xingchen-Th6q7B73Y6EnDS1+zs4M5A,
	chi.minghao-Th6q7B73Y6EnDS1+zs4M5A,
	vishal.moola-Re5JQEeQqe8AvxtiuMwx3w,
	linux-nilfs-u79uwXL29TY76Z2rM5mHXA,
	linux-kernel-u79uwXL29TY76Z2rM5mHXA,
	syzbot+ebe05ee8e98f755f61d0-Pl5Pbv+GP7P466ipTTIvnc23WoclnBCfAL8bYrjMMd8

On Mon, Nov 14, 2022 at 12:39 PM Peng Zhang wrote:
>
> From: ZhangPeng <zhangpeng362-hv44wF8Li93QT0dZR+AlfA@public.gmane.org>
>
> Syzbot reported a null-ptr-deref bug:
>
> NILFS (loop0): segctord starting. Construction interval = 5 seconds, CP
> frequency < 30 seconds
> general protection fault, probably for non-canonical address
> 0xdffffc0000000002: 0000 [#1] PREEMPT SMP KASAN
> KASAN: null-ptr-deref in range [0x0000000000000010-0x0000000000000017]
> CPU: 1 PID: 3603 Comm: segctord Not tainted
> 6.1.0-rc2-syzkaller-00105-gb229b6ca5abb #0
> Hardware name: Google Compute Engine/Google Compute Engine, BIOS Google
> 10/11/2022
> RIP: 0010:nilfs_palloc_commit_free_entry+0xe5/0x6b0
> fs/nilfs2/alloc.c:608
> Code: 00 00 00 00 fc ff df 80 3c 02 00 0f 85 cd 05 00 00 48 b8 00 00 00
> 00 00 fc ff df 4c 8b 73 08 49 8d 7e 10 48 89 fa 48 c1 ea 03 <80> 3c 02
> 00 0f 85 26 05 00 00 49 8b 46 10 be a6 00 00 00 48 c7 c7
> RSP: 0018:ffffc90003dff830 EFLAGS: 00010212
> RAX: dffffc0000000000 RBX: ffff88802594e218 RCX: 000000000000000d
> RDX: 0000000000000002 RSI: 0000000000002000 RDI: 0000000000000010
> RBP: ffff888071880222 R08: 0000000000000005 R09: 000000000000003f
> R10: 000000000000000d R11: 0000000000000000 R12: ffff888071880158
> R13: ffff88802594e220 R14: 0000000000000000 R15: 0000000000000004
> FS:  0000000000000000(0000) GS:ffff8880b9b00000(0000)
> knlGS:0000000000000000
> CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> CR2: 00007fb1c08316a8 CR3: 0000000018560000 CR4: 0000000000350ee0
> Call Trace:
>  <TASK>
>  nilfs_dat_commit_free fs/nilfs2/dat.c:114 [inline]
>  nilfs_dat_commit_end+0x464/0x5f0 fs/nilfs2/dat.c:193
>  nilfs_dat_commit_update+0x26/0x40 fs/nilfs2/dat.c:236
>  nilfs_btree_commit_update_v+0x87/0x4a0 fs/nilfs2/btree.c:1940
>  nilfs_btree_commit_propagate_v fs/nilfs2/btree.c:2016 [inline]
>  nilfs_btree_propagate_v fs/nilfs2/btree.c:2046 [inline]
>  nilfs_btree_propagate+0xa00/0xd60 fs/nilfs2/btree.c:2088
>  nilfs_bmap_propagate+0x73/0x170 fs/nilfs2/bmap.c:337
>  nilfs_collect_file_data+0x45/0xd0 fs/nilfs2/segment.c:568
>  nilfs_segctor_apply_buffers+0x14a/0x470 fs/nilfs2/segment.c:1018
>  nilfs_segctor_scan_file+0x3f4/0x6f0 fs/nilfs2/segment.c:1067
>  nilfs_segctor_collect_blocks fs/nilfs2/segment.c:1197 [inline]
>  nilfs_segctor_collect fs/nilfs2/segment.c:1503 [inline]
>  nilfs_segctor_do_construct+0x12fc/0x6af0 fs/nilfs2/segment.c:2045
>  nilfs_segctor_construct+0x8e3/0xb30 fs/nilfs2/segment.c:2379
>  nilfs_segctor_thread_construct fs/nilfs2/segment.c:2487 [inline]
>  nilfs_segctor_thread+0x3c3/0xf30 fs/nilfs2/segment.c:2570
>  kthread+0x2e4/0x3a0 kernel/kthread.c:376
>  ret_from_fork+0x1f/0x30 arch/x86/entry/entry_64.S:306
>  </TASK>
> Modules linked in:
> ---[ end trace 0000000000000000 ]---
> RIP: 0010:nilfs_palloc_commit_free_entry+0xe5/0x6b0
> fs/nilfs2/alloc.c:608
> Code: 00 00 00 00 fc ff df 80 3c 02 00 0f 85 cd 05 00 00 48 b8 00 00 00
> 00 00 fc ff df 4c 8b 73 08 49 8d 7e 10 48 89 fa 48 c1 ea 03 <80> 3c 02
> 00 0f 85 26 05 00 00 49 8b 46 10 be a6 00 00 00 48 c7 c7
> RSP: 0018:ffffc90003dff830 EFLAGS: 00010212
> RAX: dffffc0000000000 RBX: ffff88802594e218 RCX: 000000000000000d
> RDX: 0000000000000002 RSI: 0000000000002000 RDI: 0000000000000010
> RBP: ffff888071880222 R08: 0000000000000005 R09: 000000000000003f
> R10: 000000000000000d R11: 0000000000000000 R12: ffff888071880158
> R13: ffff88802594e220 R14: 0000000000000000 R15: 0000000000000004
> FS:  0000000000000000(0000) GS:ffff8880b9b00000(0000)
> knlGS:0000000000000000
> CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> CR2: 00007fb1c08316a8 CR3: 0000000018560000 CR4: 0000000000350ee0
> ----------------
> Code disassembly (best guess), 7 bytes skipped:
>    0:   80 3c 02 00             cmpb   $0x0,(%rdx,%rax,1)
>    4:   0f 85 cd 05 00 00       jne    0x5d7
>    a:   48 b8 00 00 00 00 00    movabs $0xdffffc0000000000,%rax
>   11:   fc ff df
>   14:   4c 8b 73 08             mov    0x8(%rbx),%r14
>   18:   49 8d 7e 10             lea    0x10(%r14),%rdi
>   1c:   48 89 fa                mov    %rdi,%rdx
>   1f:   48 c1 ea 03             shr    $0x3,%rdx
> * 23:   80 3c 02 00             cmpb   $0x0,(%rdx,%rax,1) <-- trapping
> instruction
>   27:   0f 85 26 05 00 00       jne    0x553
>   2d:   49 8b 46 10             mov    0x10(%r14),%rax
>   31:   be a6 00 00 00          mov    $0xa6,%esi
>   36:   48                      rex.W
>   37:   c7                      .byte 0xc7
>   38:   c7                      .byte 0xc7
>
> When maxlevelp is 1, there is a case where req->pr_desc_bh is NULL and
> blocknr is 0, because nilfs_dat_commit_alloc() will modify the blocknr
> of oldreq at one level higher to 0. And we don't have a NULL check on
> req->pr_desc_bh and req->pr_bitmap_bh in
> nilfs_palloc_commit_free_entry() function, so when req->pr_desc_bh is
> NULL and kmap() dereferences a NULL pointer, it leads to above crash.
> Fix this by adding a NULL check on req->pr_desc_bh and req->pr_bitmap_bh
> before nilfs_palloc_commit_free_entry() in nilfs_dat_commit_free().
>
> Reported-by: syzbot+ebe05ee8e98f755f61d0-Pl5Pbv+GP7P466ipTTIvnc23WoclnBCfAL8bYrjMMd8@public.gmane.org
> Fixes: bd8169efae8b ("nilfs2: add update functions of virtual block address to dat")
> Signed-off-by: ZhangPeng <zhangpeng362-hv44wF8Li93QT0dZR+AlfA@public.gmane.org>
> ---
>  fs/nilfs2/dat.c | 3 +++
>  1 file changed, 3 insertions(+)
>
> diff --git a/fs/nilfs2/dat.c b/fs/nilfs2/dat.c
> index 3b55e239705f..84ce050a8fa3 100644
> --- a/fs/nilfs2/dat.c
> +++ b/fs/nilfs2/dat.c
> @@ -111,6 +111,9 @@ static void nilfs_dat_commit_free(struct inode *dat,
>         kunmap_atomic(kaddr);
>
>         nilfs_dat_commit_entry(dat, req);
> +
> +       if (req->pr_desc_bh == NULL || req->pr_bitmap_bh == NULL)
> +               return;
>         nilfs_palloc_commit_free_entry(dat, req);
>  }
>
> --
> 2.25.1
>

Thank you for your help.

This patch actually fixes the NULL pointer dereference, and doesn't
seem to cause any regressions so far.
But, if de_blocknr is 0 in the first place, it assumes that
req->pr_desc_bh and req->pr_bitmap_bh are set properly in the
corresponding "prepare" operation, nilfs_dat_prepare_end(), so I'd
like to review in a little more detail why that is broken.

Thanks,
Ryusuke Konishi

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH] nilfs2: fix NULL pointer dereference in nilfs_palloc_commit_free_entry()
@ 2022-11-19  0:38     ` Ryusuke Konishi
  0 siblings, 0 replies; 10+ messages in thread
From: Ryusuke Konishi @ 2022-11-19  0:38 UTC (permalink / raw)
  To: Peng Zhang
  Cc: ye.xingchen, chi.minghao, vishal.moola, linux-nilfs,
	linux-kernel, syzbot+ebe05ee8e98f755f61d0

Hi, ZhangPeng,

On Tue, Nov 15, 2022 at 3:39 AM Ryusuke Konishi  wrote:
>
> On Mon, Nov 14, 2022 at 12:39 PM Peng Zhang wrote:
> >
> > From: ZhangPeng <zhangpeng362@huawei.com>
> >
> > Syzbot reported a null-ptr-deref bug:
> >
> > NILFS (loop0): segctord starting. Construction interval = 5 seconds, CP
> > frequency < 30 seconds
> > general protection fault, probably for non-canonical address
> > 0xdffffc0000000002: 0000 [#1] PREEMPT SMP KASAN
> > KASAN: null-ptr-deref in range [0x0000000000000010-0x0000000000000017]
> > CPU: 1 PID: 3603 Comm: segctord Not tainted
> > 6.1.0-rc2-syzkaller-00105-gb229b6ca5abb #0
> > Hardware name: Google Compute Engine/Google Compute Engine, BIOS Google
> > 10/11/2022
> > RIP: 0010:nilfs_palloc_commit_free_entry+0xe5/0x6b0
> > fs/nilfs2/alloc.c:608
> > Code: 00 00 00 00 fc ff df 80 3c 02 00 0f 85 cd 05 00 00 48 b8 00 00 00
> > 00 00 fc ff df 4c 8b 73 08 49 8d 7e 10 48 89 fa 48 c1 ea 03 <80> 3c 02
> > 00 0f 85 26 05 00 00 49 8b 46 10 be a6 00 00 00 48 c7 c7
> > RSP: 0018:ffffc90003dff830 EFLAGS: 00010212
> > RAX: dffffc0000000000 RBX: ffff88802594e218 RCX: 000000000000000d
> > RDX: 0000000000000002 RSI: 0000000000002000 RDI: 0000000000000010
> > RBP: ffff888071880222 R08: 0000000000000005 R09: 000000000000003f
> > R10: 000000000000000d R11: 0000000000000000 R12: ffff888071880158
> > R13: ffff88802594e220 R14: 0000000000000000 R15: 0000000000000004
> > FS:  0000000000000000(0000) GS:ffff8880b9b00000(0000)
> > knlGS:0000000000000000
> > CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> > CR2: 00007fb1c08316a8 CR3: 0000000018560000 CR4: 0000000000350ee0
> > Call Trace:
> >  <TASK>
> >  nilfs_dat_commit_free fs/nilfs2/dat.c:114 [inline]
> >  nilfs_dat_commit_end+0x464/0x5f0 fs/nilfs2/dat.c:193
> >  nilfs_dat_commit_update+0x26/0x40 fs/nilfs2/dat.c:236
> >  nilfs_btree_commit_update_v+0x87/0x4a0 fs/nilfs2/btree.c:1940
> >  nilfs_btree_commit_propagate_v fs/nilfs2/btree.c:2016 [inline]
> >  nilfs_btree_propagate_v fs/nilfs2/btree.c:2046 [inline]
> >  nilfs_btree_propagate+0xa00/0xd60 fs/nilfs2/btree.c:2088
> >  nilfs_bmap_propagate+0x73/0x170 fs/nilfs2/bmap.c:337
> >  nilfs_collect_file_data+0x45/0xd0 fs/nilfs2/segment.c:568
> >  nilfs_segctor_apply_buffers+0x14a/0x470 fs/nilfs2/segment.c:1018
> >  nilfs_segctor_scan_file+0x3f4/0x6f0 fs/nilfs2/segment.c:1067
> >  nilfs_segctor_collect_blocks fs/nilfs2/segment.c:1197 [inline]
> >  nilfs_segctor_collect fs/nilfs2/segment.c:1503 [inline]
> >  nilfs_segctor_do_construct+0x12fc/0x6af0 fs/nilfs2/segment.c:2045
> >  nilfs_segctor_construct+0x8e3/0xb30 fs/nilfs2/segment.c:2379
> >  nilfs_segctor_thread_construct fs/nilfs2/segment.c:2487 [inline]
> >  nilfs_segctor_thread+0x3c3/0xf30 fs/nilfs2/segment.c:2570
> >  kthread+0x2e4/0x3a0 kernel/kthread.c:376
> >  ret_from_fork+0x1f/0x30 arch/x86/entry/entry_64.S:306
> >  </TASK>
> > Modules linked in:
> > ---[ end trace 0000000000000000 ]---
> > RIP: 0010:nilfs_palloc_commit_free_entry+0xe5/0x6b0
> > fs/nilfs2/alloc.c:608
> > Code: 00 00 00 00 fc ff df 80 3c 02 00 0f 85 cd 05 00 00 48 b8 00 00 00
> > 00 00 fc ff df 4c 8b 73 08 49 8d 7e 10 48 89 fa 48 c1 ea 03 <80> 3c 02
> > 00 0f 85 26 05 00 00 49 8b 46 10 be a6 00 00 00 48 c7 c7
> > RSP: 0018:ffffc90003dff830 EFLAGS: 00010212
> > RAX: dffffc0000000000 RBX: ffff88802594e218 RCX: 000000000000000d
> > RDX: 0000000000000002 RSI: 0000000000002000 RDI: 0000000000000010
> > RBP: ffff888071880222 R08: 0000000000000005 R09: 000000000000003f
> > R10: 000000000000000d R11: 0000000000000000 R12: ffff888071880158
> > R13: ffff88802594e220 R14: 0000000000000000 R15: 0000000000000004
> > FS:  0000000000000000(0000) GS:ffff8880b9b00000(0000)
> > knlGS:0000000000000000
> > CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> > CR2: 00007fb1c08316a8 CR3: 0000000018560000 CR4: 0000000000350ee0
> > ----------------
> > Code disassembly (best guess), 7 bytes skipped:
> >    0:   80 3c 02 00             cmpb   $0x0,(%rdx,%rax,1)
> >    4:   0f 85 cd 05 00 00       jne    0x5d7
> >    a:   48 b8 00 00 00 00 00    movabs $0xdffffc0000000000,%rax
> >   11:   fc ff df
> >   14:   4c 8b 73 08             mov    0x8(%rbx),%r14
> >   18:   49 8d 7e 10             lea    0x10(%r14),%rdi
> >   1c:   48 89 fa                mov    %rdi,%rdx
> >   1f:   48 c1 ea 03             shr    $0x3,%rdx
> > * 23:   80 3c 02 00             cmpb   $0x0,(%rdx,%rax,1) <-- trapping
> > instruction
> >   27:   0f 85 26 05 00 00       jne    0x553
> >   2d:   49 8b 46 10             mov    0x10(%r14),%rax
> >   31:   be a6 00 00 00          mov    $0xa6,%esi
> >   36:   48                      rex.W
> >   37:   c7                      .byte 0xc7
> >   38:   c7                      .byte 0xc7
> >
> > When maxlevelp is 1, there is a case where req->pr_desc_bh is NULL and
> > blocknr is 0, because nilfs_dat_commit_alloc() will modify the blocknr
> > of oldreq at one level higher to 0. And we don't have a NULL check on
> > req->pr_desc_bh and req->pr_bitmap_bh in
> > nilfs_palloc_commit_free_entry() function, so when req->pr_desc_bh is
> > NULL and kmap() dereferences a NULL pointer, it leads to above crash.
> > Fix this by adding a NULL check on req->pr_desc_bh and req->pr_bitmap_bh
> > before nilfs_palloc_commit_free_entry() in nilfs_dat_commit_free().
> >
> > Reported-by: syzbot+ebe05ee8e98f755f61d0@syzkaller.appspotmail.com
> > Fixes: bd8169efae8b ("nilfs2: add update functions of virtual block address to dat")
> > Signed-off-by: ZhangPeng <zhangpeng362@huawei.com>
> > ---
> >  fs/nilfs2/dat.c | 3 +++
> >  1 file changed, 3 insertions(+)
> >
> > diff --git a/fs/nilfs2/dat.c b/fs/nilfs2/dat.c
> > index 3b55e239705f..84ce050a8fa3 100644
> > --- a/fs/nilfs2/dat.c
> > +++ b/fs/nilfs2/dat.c
> > @@ -111,6 +111,9 @@ static void nilfs_dat_commit_free(struct inode *dat,
> >         kunmap_atomic(kaddr);
> >
> >         nilfs_dat_commit_entry(dat, req);
> > +
> > +       if (req->pr_desc_bh == NULL || req->pr_bitmap_bh == NULL)
> > +               return;
> >         nilfs_palloc_commit_free_entry(dat, req);
> >  }
> >
> > --
> > 2.25.1
> >
>
> Thank you for your help.
>
> This patch actually fixes the NULL pointer dereference, and doesn't
> seem to cause any regressions so far.
> But, if de_blocknr is 0 in the first place, it assumes that
> req->pr_desc_bh and req->pr_bitmap_bh are set properly in the
> corresponding "prepare" operation, nilfs_dat_prepare_end(), so I'd
> like to review in a little more detail why that is broken.
>
> Thanks,
> Ryusuke Konishi

What was happening here was complicated.
There were two state inconsistencies in DAT:
1) A virtual block number (vblocknr == 6145) in the report, was used
twice in a btree.
    Because of this, nilfs_dat_commit_end() on that address was called twice.
2) For the virtual block number (vblocknr == 6145), the DAT bitmap
status was free (second anomaly), and nilfs_dat_commit_alloc() was
performed at the same time as the second nilfs_dat_commit_end() above.

As I mentioned, when nilfs_dat_commit_end() frees a virtual block
number in the bitmap of DAT, nilfs_dat_prepare_end() usually allocates
buffer heads of the bitmap block and the descriptor block, and sets
them to the request struct.  But, the above two anomalies overlap, as
you pointed out, de_block is initialized to 0 by
nilfs_dat_prepare_alloc() in the middle of "prepare" and "commit", and
nilfs_dat_commit_end() tries to use unallocated buffer heads.

As a workaround, your patch only needs two minor fixes that I can handle:
1) The following part should use unlikely() annotation
 +       if (req->pr_desc_bh == NULL || req->pr_bitmap_bh == NULL)
    like
 +       if (unlikely(req->pr_desc_bh == NULL || req->pr_bitmap_bh == NULL))
    because this usually doesn't happen.
2) The "Fixes" tag should be removed, because it's not the cause of this bug.

The problem is that this fix doesn't eliminate the double usage of the
virtual block number, and it leaves de_start and de_end of the virtual
block number broken, which leads to disk GC failure.

So, at least, we should call nilfs_error() to signal that the
filesystem is fatally flawed and prevent further operations.

Please give me a few more days to consider how to deal with it. I
would like to think about whether there is a good solution including
the GC problem.

As for the current outlook, it seems difficult to eliminate
duplication of the virtual block number, so I think we will call
nilfs_error() as follows:

        nilfs_dat_commit_entry(dat, req);
+        if (unlikely(req->pr_desc_bh == NULL || req->pr_bitmap_bh == NULL)) {
+                nilfs_error(dat->i_sb,
+                            "state inconsistency due to duplicate use
of vblocknr = %llu",
+                            (unsigned long long)req->pr_entry_nr);
+                return;
+       }
        nilfs_palloc_commit_free_entry(dat, req);

In that case, I would like to modify your patch and send it upstream
as your patch.  Is that OK ?
Or do you want to fix it more by yourself ?

Thanks,
Ryusuke Konishi

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH] nilfs2: fix NULL pointer dereference in nilfs_palloc_commit_free_entry()
@ 2022-11-19  0:38     ` Ryusuke Konishi
  0 siblings, 0 replies; 10+ messages in thread
From: Ryusuke Konishi @ 2022-11-19  0:38 UTC (permalink / raw)
  To: Peng Zhang
  Cc: ye.xingchen-Th6q7B73Y6EnDS1+zs4M5A,
	chi.minghao-Th6q7B73Y6EnDS1+zs4M5A,
	vishal.moola-Re5JQEeQqe8AvxtiuMwx3w,
	linux-nilfs-u79uwXL29TY76Z2rM5mHXA,
	linux-kernel-u79uwXL29TY76Z2rM5mHXA,
	syzbot+ebe05ee8e98f755f61d0-Pl5Pbv+GP7P466ipTTIvnc23WoclnBCfAL8bYrjMMd8

Hi, ZhangPeng,

On Tue, Nov 15, 2022 at 3:39 AM Ryusuke Konishi  wrote:
>
> On Mon, Nov 14, 2022 at 12:39 PM Peng Zhang wrote:
> >
> > From: ZhangPeng <zhangpeng362-hv44wF8Li93QT0dZR+AlfA@public.gmane.org>
> >
> > Syzbot reported a null-ptr-deref bug:
> >
> > NILFS (loop0): segctord starting. Construction interval = 5 seconds, CP
> > frequency < 30 seconds
> > general protection fault, probably for non-canonical address
> > 0xdffffc0000000002: 0000 [#1] PREEMPT SMP KASAN
> > KASAN: null-ptr-deref in range [0x0000000000000010-0x0000000000000017]
> > CPU: 1 PID: 3603 Comm: segctord Not tainted
> > 6.1.0-rc2-syzkaller-00105-gb229b6ca5abb #0
> > Hardware name: Google Compute Engine/Google Compute Engine, BIOS Google
> > 10/11/2022
> > RIP: 0010:nilfs_palloc_commit_free_entry+0xe5/0x6b0
> > fs/nilfs2/alloc.c:608
> > Code: 00 00 00 00 fc ff df 80 3c 02 00 0f 85 cd 05 00 00 48 b8 00 00 00
> > 00 00 fc ff df 4c 8b 73 08 49 8d 7e 10 48 89 fa 48 c1 ea 03 <80> 3c 02
> > 00 0f 85 26 05 00 00 49 8b 46 10 be a6 00 00 00 48 c7 c7
> > RSP: 0018:ffffc90003dff830 EFLAGS: 00010212
> > RAX: dffffc0000000000 RBX: ffff88802594e218 RCX: 000000000000000d
> > RDX: 0000000000000002 RSI: 0000000000002000 RDI: 0000000000000010
> > RBP: ffff888071880222 R08: 0000000000000005 R09: 000000000000003f
> > R10: 000000000000000d R11: 0000000000000000 R12: ffff888071880158
> > R13: ffff88802594e220 R14: 0000000000000000 R15: 0000000000000004
> > FS:  0000000000000000(0000) GS:ffff8880b9b00000(0000)
> > knlGS:0000000000000000
> > CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> > CR2: 00007fb1c08316a8 CR3: 0000000018560000 CR4: 0000000000350ee0
> > Call Trace:
> >  <TASK>
> >  nilfs_dat_commit_free fs/nilfs2/dat.c:114 [inline]
> >  nilfs_dat_commit_end+0x464/0x5f0 fs/nilfs2/dat.c:193
> >  nilfs_dat_commit_update+0x26/0x40 fs/nilfs2/dat.c:236
> >  nilfs_btree_commit_update_v+0x87/0x4a0 fs/nilfs2/btree.c:1940
> >  nilfs_btree_commit_propagate_v fs/nilfs2/btree.c:2016 [inline]
> >  nilfs_btree_propagate_v fs/nilfs2/btree.c:2046 [inline]
> >  nilfs_btree_propagate+0xa00/0xd60 fs/nilfs2/btree.c:2088
> >  nilfs_bmap_propagate+0x73/0x170 fs/nilfs2/bmap.c:337
> >  nilfs_collect_file_data+0x45/0xd0 fs/nilfs2/segment.c:568
> >  nilfs_segctor_apply_buffers+0x14a/0x470 fs/nilfs2/segment.c:1018
> >  nilfs_segctor_scan_file+0x3f4/0x6f0 fs/nilfs2/segment.c:1067
> >  nilfs_segctor_collect_blocks fs/nilfs2/segment.c:1197 [inline]
> >  nilfs_segctor_collect fs/nilfs2/segment.c:1503 [inline]
> >  nilfs_segctor_do_construct+0x12fc/0x6af0 fs/nilfs2/segment.c:2045
> >  nilfs_segctor_construct+0x8e3/0xb30 fs/nilfs2/segment.c:2379
> >  nilfs_segctor_thread_construct fs/nilfs2/segment.c:2487 [inline]
> >  nilfs_segctor_thread+0x3c3/0xf30 fs/nilfs2/segment.c:2570
> >  kthread+0x2e4/0x3a0 kernel/kthread.c:376
> >  ret_from_fork+0x1f/0x30 arch/x86/entry/entry_64.S:306
> >  </TASK>
> > Modules linked in:
> > ---[ end trace 0000000000000000 ]---
> > RIP: 0010:nilfs_palloc_commit_free_entry+0xe5/0x6b0
> > fs/nilfs2/alloc.c:608
> > Code: 00 00 00 00 fc ff df 80 3c 02 00 0f 85 cd 05 00 00 48 b8 00 00 00
> > 00 00 fc ff df 4c 8b 73 08 49 8d 7e 10 48 89 fa 48 c1 ea 03 <80> 3c 02
> > 00 0f 85 26 05 00 00 49 8b 46 10 be a6 00 00 00 48 c7 c7
> > RSP: 0018:ffffc90003dff830 EFLAGS: 00010212
> > RAX: dffffc0000000000 RBX: ffff88802594e218 RCX: 000000000000000d
> > RDX: 0000000000000002 RSI: 0000000000002000 RDI: 0000000000000010
> > RBP: ffff888071880222 R08: 0000000000000005 R09: 000000000000003f
> > R10: 000000000000000d R11: 0000000000000000 R12: ffff888071880158
> > R13: ffff88802594e220 R14: 0000000000000000 R15: 0000000000000004
> > FS:  0000000000000000(0000) GS:ffff8880b9b00000(0000)
> > knlGS:0000000000000000
> > CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> > CR2: 00007fb1c08316a8 CR3: 0000000018560000 CR4: 0000000000350ee0
> > ----------------
> > Code disassembly (best guess), 7 bytes skipped:
> >    0:   80 3c 02 00             cmpb   $0x0,(%rdx,%rax,1)
> >    4:   0f 85 cd 05 00 00       jne    0x5d7
> >    a:   48 b8 00 00 00 00 00    movabs $0xdffffc0000000000,%rax
> >   11:   fc ff df
> >   14:   4c 8b 73 08             mov    0x8(%rbx),%r14
> >   18:   49 8d 7e 10             lea    0x10(%r14),%rdi
> >   1c:   48 89 fa                mov    %rdi,%rdx
> >   1f:   48 c1 ea 03             shr    $0x3,%rdx
> > * 23:   80 3c 02 00             cmpb   $0x0,(%rdx,%rax,1) <-- trapping
> > instruction
> >   27:   0f 85 26 05 00 00       jne    0x553
> >   2d:   49 8b 46 10             mov    0x10(%r14),%rax
> >   31:   be a6 00 00 00          mov    $0xa6,%esi
> >   36:   48                      rex.W
> >   37:   c7                      .byte 0xc7
> >   38:   c7                      .byte 0xc7
> >
> > When maxlevelp is 1, there is a case where req->pr_desc_bh is NULL and
> > blocknr is 0, because nilfs_dat_commit_alloc() will modify the blocknr
> > of oldreq at one level higher to 0. And we don't have a NULL check on
> > req->pr_desc_bh and req->pr_bitmap_bh in
> > nilfs_palloc_commit_free_entry() function, so when req->pr_desc_bh is
> > NULL and kmap() dereferences a NULL pointer, it leads to above crash.
> > Fix this by adding a NULL check on req->pr_desc_bh and req->pr_bitmap_bh
> > before nilfs_palloc_commit_free_entry() in nilfs_dat_commit_free().
> >
> > Reported-by: syzbot+ebe05ee8e98f755f61d0-Pl5Pbv+GP7P466ipTTIvnc23WoclnBCfAL8bYrjMMd8@public.gmane.org
> > Fixes: bd8169efae8b ("nilfs2: add update functions of virtual block address to dat")
> > Signed-off-by: ZhangPeng <zhangpeng362-hv44wF8Li93QT0dZR+AlfA@public.gmane.org>
> > ---
> >  fs/nilfs2/dat.c | 3 +++
> >  1 file changed, 3 insertions(+)
> >
> > diff --git a/fs/nilfs2/dat.c b/fs/nilfs2/dat.c
> > index 3b55e239705f..84ce050a8fa3 100644
> > --- a/fs/nilfs2/dat.c
> > +++ b/fs/nilfs2/dat.c
> > @@ -111,6 +111,9 @@ static void nilfs_dat_commit_free(struct inode *dat,
> >         kunmap_atomic(kaddr);
> >
> >         nilfs_dat_commit_entry(dat, req);
> > +
> > +       if (req->pr_desc_bh == NULL || req->pr_bitmap_bh == NULL)
> > +               return;
> >         nilfs_palloc_commit_free_entry(dat, req);
> >  }
> >
> > --
> > 2.25.1
> >
>
> Thank you for your help.
>
> This patch actually fixes the NULL pointer dereference, and doesn't
> seem to cause any regressions so far.
> But, if de_blocknr is 0 in the first place, it assumes that
> req->pr_desc_bh and req->pr_bitmap_bh are set properly in the
> corresponding "prepare" operation, nilfs_dat_prepare_end(), so I'd
> like to review in a little more detail why that is broken.
>
> Thanks,
> Ryusuke Konishi

What was happening here was complicated.
There were two state inconsistencies in DAT:
1) A virtual block number (vblocknr == 6145) in the report, was used
twice in a btree.
    Because of this, nilfs_dat_commit_end() on that address was called twice.
2) For the virtual block number (vblocknr == 6145), the DAT bitmap
status was free (second anomaly), and nilfs_dat_commit_alloc() was
performed at the same time as the second nilfs_dat_commit_end() above.

As I mentioned, when nilfs_dat_commit_end() frees a virtual block
number in the bitmap of DAT, nilfs_dat_prepare_end() usually allocates
buffer heads of the bitmap block and the descriptor block, and sets
them to the request struct.  But, the above two anomalies overlap, as
you pointed out, de_block is initialized to 0 by
nilfs_dat_prepare_alloc() in the middle of "prepare" and "commit", and
nilfs_dat_commit_end() tries to use unallocated buffer heads.

As a workaround, your patch only needs two minor fixes that I can handle:
1) The following part should use unlikely() annotation
 +       if (req->pr_desc_bh == NULL || req->pr_bitmap_bh == NULL)
    like
 +       if (unlikely(req->pr_desc_bh == NULL || req->pr_bitmap_bh == NULL))
    because this usually doesn't happen.
2) The "Fixes" tag should be removed, because it's not the cause of this bug.

The problem is that this fix doesn't eliminate the double usage of the
virtual block number, and it leaves de_start and de_end of the virtual
block number broken, which leads to disk GC failure.

So, at least, we should call nilfs_error() to signal that the
filesystem is fatally flawed and prevent further operations.

Please give me a few more days to consider how to deal with it. I
would like to think about whether there is a good solution including
the GC problem.

As for the current outlook, it seems difficult to eliminate
duplication of the virtual block number, so I think we will call
nilfs_error() as follows:

        nilfs_dat_commit_entry(dat, req);
+        if (unlikely(req->pr_desc_bh == NULL || req->pr_bitmap_bh == NULL)) {
+                nilfs_error(dat->i_sb,
+                            "state inconsistency due to duplicate use
of vblocknr = %llu",
+                            (unsigned long long)req->pr_entry_nr);
+                return;
+       }
        nilfs_palloc_commit_free_entry(dat, req);

In that case, I would like to modify your patch and send it upstream
as your patch.  Is that OK ?
Or do you want to fix it more by yourself ?

Thanks,
Ryusuke Konishi

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH] nilfs2: fix NULL pointer dereference in nilfs_palloc_commit_free_entry()
@ 2022-11-19  1:02       ` Ryusuke Konishi
  0 siblings, 0 replies; 10+ messages in thread
From: Ryusuke Konishi @ 2022-11-19  1:02 UTC (permalink / raw)
  To: Peng Zhang
  Cc: ye.xingchen, chi.minghao, vishal.moola, linux-nilfs,
	linux-kernel, syzbot+ebe05ee8e98f755f61d0

On Sat, Nov 19, 2022 at 9:38 AM Ryusuke Konishi wrote:
> On Tue, Nov 15, 2022 at 3:39 AM Ryusuke Konishi  wrote:
> > On Mon, Nov 14, 2022 at 12:39 PM Peng Zhang wrote:
> What was happening here was complicated.
> There were two state inconsistencies in DAT:
> 1) A virtual block number (vblocknr == 6145) in the report, was used
> twice in a btree.
>     Because of this, nilfs_dat_commit_end() on that address was called twice.
> 2) For the virtual block number (vblocknr == 6145), the DAT bitmap
> status was free (second anomaly), and nilfs_dat_commit_alloc() was
> performed at the same time as the second nilfs_dat_commit_end() above.
>
> As I mentioned, when nilfs_dat_commit_end() frees a virtual block
> number in the bitmap of DAT, nilfs_dat_prepare_end() usually allocates
> buffer heads of the bitmap block and the descriptor block, and sets
> them to the request struct.  But, the above two anomalies overlap, as

> you pointed out, de_block is initialized to 0 by
> nilfs_dat_prepare_alloc() in the middle of "prepare" and "commit", and

I wrote it wrong, this should be ".., de_blocknr is initialized to 0
by nilfs_dat_commit_alloc() ..".

Ryusuke Konishi

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH] nilfs2: fix NULL pointer dereference in nilfs_palloc_commit_free_entry()
@ 2022-11-19  1:02       ` Ryusuke Konishi
  0 siblings, 0 replies; 10+ messages in thread
From: Ryusuke Konishi @ 2022-11-19  1:02 UTC (permalink / raw)
  To: Peng Zhang
  Cc: ye.xingchen-Th6q7B73Y6EnDS1+zs4M5A,
	chi.minghao-Th6q7B73Y6EnDS1+zs4M5A,
	vishal.moola-Re5JQEeQqe8AvxtiuMwx3w,
	linux-nilfs-u79uwXL29TY76Z2rM5mHXA,
	linux-kernel-u79uwXL29TY76Z2rM5mHXA,
	syzbot+ebe05ee8e98f755f61d0-Pl5Pbv+GP7P466ipTTIvnc23WoclnBCfAL8bYrjMMd8

On Sat, Nov 19, 2022 at 9:38 AM Ryusuke Konishi wrote:
> On Tue, Nov 15, 2022 at 3:39 AM Ryusuke Konishi  wrote:
> > On Mon, Nov 14, 2022 at 12:39 PM Peng Zhang wrote:
> What was happening here was complicated.
> There were two state inconsistencies in DAT:
> 1) A virtual block number (vblocknr == 6145) in the report, was used
> twice in a btree.
>     Because of this, nilfs_dat_commit_end() on that address was called twice.
> 2) For the virtual block number (vblocknr == 6145), the DAT bitmap
> status was free (second anomaly), and nilfs_dat_commit_alloc() was
> performed at the same time as the second nilfs_dat_commit_end() above.
>
> As I mentioned, when nilfs_dat_commit_end() frees a virtual block
> number in the bitmap of DAT, nilfs_dat_prepare_end() usually allocates
> buffer heads of the bitmap block and the descriptor block, and sets
> them to the request struct.  But, the above two anomalies overlap, as

> you pointed out, de_block is initialized to 0 by
> nilfs_dat_prepare_alloc() in the middle of "prepare" and "commit", and

I wrote it wrong, this should be ".., de_blocknr is initialized to 0
by nilfs_dat_commit_alloc() ..".

Ryusuke Konishi

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH] nilfs2: fix NULL pointer dereference in nilfs_palloc_commit_free_entry()
@ 2022-11-19  7:24         ` Ryusuke Konishi
  0 siblings, 0 replies; 10+ messages in thread
From: Ryusuke Konishi @ 2022-11-19  7:24 UTC (permalink / raw)
  To: zhangpeng (AS)
  Cc: ye.xingchen, chi.minghao, vishal.moola, linux-nilfs,
	linux-kernel, syzbot+ebe05ee8e98f755f61d0

On Sat, Nov 19, 2022 at 3:38 PM zhangpeng (AS)  wrote:
>
> Hi, ZhangPeng,
>
> On Tue, Nov 15, 2022 at 3:39 AM Ryusuke Konishi  wrote:
> As for the current outlook, it seems difficult to eliminate
> duplication of the virtual block number, so I think we will call
> nilfs_error() as follows:
>
>         nilfs_dat_commit_entry(dat, req);
> +        if (unlikely(req->pr_desc_bh == NULL || req->pr_bitmap_bh == NULL)) {
> +                nilfs_error(dat->i_sb,
> +                            "state inconsistency due to duplicate use
> of vblocknr = %llu",
> +                            (unsigned long long)req->pr_entry_nr);
> +                return;
> +       }
>         nilfs_palloc_commit_free_entry(dat, req);
>
> In that case, I would like to modify your patch and send it upstream
> as your patch.  Is that OK ?
> Or do you want to fix it more by yourself ?
>
> Thanks,
> Ryusuke Konishi
>
> Thanks for your advice.
>
> Please modify my patch and send it upstream.
>
> That is really a complicated problem. Duplication of the virtual block
> number is not easy to fix. It is necessary to prevent further operations
>  of the filesystem when the filesystem has a fatal flaw. I will continue
>  to follow up.

All right, thanks!

Ryusuke Konishi

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH] nilfs2: fix NULL pointer dereference in nilfs_palloc_commit_free_entry()
@ 2022-11-19  7:24         ` Ryusuke Konishi
  0 siblings, 0 replies; 10+ messages in thread
From: Ryusuke Konishi @ 2022-11-19  7:24 UTC (permalink / raw)
  To: zhangpeng (AS)
  Cc: ye.xingchen-Th6q7B73Y6EnDS1+zs4M5A,
	chi.minghao-Th6q7B73Y6EnDS1+zs4M5A,
	vishal.moola-Re5JQEeQqe8AvxtiuMwx3w,
	linux-nilfs-u79uwXL29TY76Z2rM5mHXA,
	linux-kernel-u79uwXL29TY76Z2rM5mHXA,
	syzbot+ebe05ee8e98f755f61d0-Pl5Pbv+GP7P466ipTTIvnc23WoclnBCfAL8bYrjMMd8

On Sat, Nov 19, 2022 at 3:38 PM zhangpeng (AS)  wrote:
>
> Hi, ZhangPeng,
>
> On Tue, Nov 15, 2022 at 3:39 AM Ryusuke Konishi  wrote:
> As for the current outlook, it seems difficult to eliminate
> duplication of the virtual block number, so I think we will call
> nilfs_error() as follows:
>
>         nilfs_dat_commit_entry(dat, req);
> +        if (unlikely(req->pr_desc_bh == NULL || req->pr_bitmap_bh == NULL)) {
> +                nilfs_error(dat->i_sb,
> +                            "state inconsistency due to duplicate use
> of vblocknr = %llu",
> +                            (unsigned long long)req->pr_entry_nr);
> +                return;
> +       }
>         nilfs_palloc_commit_free_entry(dat, req);
>
> In that case, I would like to modify your patch and send it upstream
> as your patch.  Is that OK ?
> Or do you want to fix it more by yourself ?
>
> Thanks,
> Ryusuke Konishi
>
> Thanks for your advice.
>
> Please modify my patch and send it upstream.
>
> That is really a complicated problem. Duplication of the virtual block
> number is not easy to fix. It is necessary to prevent further operations
>  of the filesystem when the filesystem has a fatal flaw. I will continue
>  to follow up.

All right, thanks!

Ryusuke Konishi

^ permalink raw reply	[flat|nested] 10+ messages in thread

end of thread, other threads:[~2022-11-19  7:25 UTC | newest]

Thread overview: 10+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2022-11-14  4:04 [PATCH] nilfs2: fix NULL pointer dereference in nilfs_palloc_commit_free_entry() Peng Zhang
2022-11-14  4:04 ` Peng Zhang
2022-11-14 18:39 ` Ryusuke Konishi
2022-11-14 18:39   ` Ryusuke Konishi
2022-11-19  0:38   ` Ryusuke Konishi
2022-11-19  0:38     ` Ryusuke Konishi
2022-11-19  1:02     ` Ryusuke Konishi
2022-11-19  1:02       ` Ryusuke Konishi
     [not found]     ` <422d9c54-a119-b2a2-f381-11f83af3d9ea@huawei.com>
2022-11-19  7:24       ` Ryusuke Konishi
2022-11-19  7:24         ` Ryusuke Konishi

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.