linux-block.vger.kernel.org archive mirror
* general protection fault in wb_timer_fn
@ 2021-09-13  2:00 Hao Sun
  2021-09-14 10:45 ` Christoph Hellwig
  0 siblings, 1 reply; 7+ messages in thread
From: Hao Sun @ 2021-09-13  2:00 UTC (permalink / raw)
  To: Jens Axboe, linux-block; +Cc: linux-kernel

Hello,

When using Healer to fuzz the latest Linux kernel, the following crash
was triggered.

HEAD commit:  4b93c544e90e-thunderbolt: test: split up test cases
git tree: upstream
console output:
https://drive.google.com/file/d/1naN-5p-rFgKpHrshO_kQr5f_KlhvFGGU/view?usp=sharing
kernel config: https://drive.google.com/file/d/1c0u2EeRDhRO-ZCxr9MP2VvAtJd6kfg-p/view?usp=sharing
C reproducer: https://drive.google.com/file/d/19EDhssGw_V1oO2vWOPrgQqve0TdFSgvh/view?usp=sharing
Syzlang reproducer:
https://drive.google.com/file/d/13EGCCoaMe9oitrQCfy44BkGVLbKKFP6X/view?usp=sharing
Similar report:
https://groups.google.com/g/syzkaller-bugs/c/H7fKH_5GVSE/m/-1aj5-d1BAAJ

If you fix this issue, please add the following tag to the commit:
Reported-by: Hao Sun <sunhao.th@gmail.com>

general protection fault, probably for non-canonical address
0xdffffc0000000029: 0000 [#1] PREEMPT SMP KASAN
KASAN: null-ptr-deref in range [0x0000000000000148-0x000000000000014f]
CPU: 2 PID: 8539 Comm: syz-executor Not tainted 5.14.0+ #1
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS
1.13.0-1ubuntu1.1 04/01/2014
RIP: 0010:latency_exceeded block/blk-wbt.c:237 [inline]
RIP: 0010:wb_timer_fn+0x149/0x2080 block/blk-wbt.c:360
Code: 03 80 3c 02 00 0f 85 08 1c 00 00 48 8b 9b b0 00 00 00 48 b8 00
00 00 00 00 fc ff df 48 8d bb 48 01 00 00 48 89 fa 48 c1 ea 03 <80> 3c
02 00 0f 85 d5 1b 00 00 48 8b 83 48 01 00 00 48 8d 7d 28 48
RSP: 0018:ffffc90000848cd0 EFLAGS: 00010212
RAX: dffffc0000000000 RBX: 0000000000000000 RCX: ffff888014e79c80
RDX: 0000000000000029 RSI: ffff888014e79c80 RDI: 0000000000000148
RBP: ffff88801e2adc00 R08: ffffffff83de2dfd R09: 0000000000000003
R10: 0000000000000005 R11: ffffed1003c55bb0 R12: 0000000000000003
R13: 0000000000000000 R14: ffff8881053bc800 R15: ffff88801e2adcd0
FS:  0000000003176940(0000) GS:ffff888063f00000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00000000014d53ad CR3: 000000002e5ff000 CR4: 0000000000350ee0
Call Trace:
 <IRQ>
 call_timer_fn+0x1a5/0x6b0 kernel/time/timer.c:1421
 expire_timers kernel/time/timer.c:1466 [inline]
 __run_timers.part.0+0x6b0/0xa90 kernel/time/timer.c:1734
 __run_timers kernel/time/timer.c:1715 [inline]
 run_timer_softirq+0xb6/0x1d0 kernel/time/timer.c:1747
 __do_softirq+0x1d7/0x93b kernel/softirq.c:558
 invoke_softirq kernel/softirq.c:432 [inline]
 __irq_exit_rcu kernel/softirq.c:636 [inline]
 irq_exit_rcu+0xf2/0x130 kernel/softirq.c:648
 sysvec_apic_timer_interrupt+0x93/0xc0 arch/x86/kernel/apic/apic.c:1097
 </IRQ>
 asm_sysvec_apic_timer_interrupt+0x12/0x20 arch/x86/include/asm/idtentry.h:638
RIP: 0010:get_current arch/x86/include/asm/current.h:15 [inline]
RIP: 0010:write_comp_data+0xe/0x70 kernel/kcov.c:217
Code: 00 8b 89 1c 15 00 00 48 8b 02 48 83 c0 01 48 39 c1 76 07 4c 89
04 c2 48 89 02 c3 90 49 89 fa bf 03 00 00 00 49 89 f1 49 89 d0 <65> 48
8b 34 25 40 f0 01 00 e8 34 ff ff ff 84 c0 74 44 48 8b 86 20
RSP: 0018:ffffc9000106f7f8 EFLAGS: 00000246
RAX: 000000000d0444bb RBX: 80000000fd589217 RCX: ffffffff81ac95bd
RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000000003
RBP: 0000000000000000 R08: 0000000000000000 R09: 0000000000000000
R10: 0000000000000007 R11: ffffed100247709c R12: 0000000000000000
R13: 0000000000000001 R14: dffffc0000000000 R15: ffff88802e806788
 copy_present_pte mm/memory.c:976 [inline]
 copy_pte_range mm/memory.c:1070 [inline]
 copy_pmd_range mm/memory.c:1156 [inline]
 copy_pud_range mm/memory.c:1193 [inline]
 copy_p4d_range mm/memory.c:1217 [inline]
 copy_page_range+0xf4d/0x4750 mm/memory.c:1290
 dup_mmap kernel/fork.c:610 [inline]
 dup_mm+0xa50/0x13d0 kernel/fork.c:1454
 copy_mm kernel/fork.c:1506 [inline]
 copy_process+0x6c23/0x73d0 kernel/fork.c:2195
 kernel_clone+0xe7/0x10d0 kernel/fork.c:2585
 __do_sys_clone+0xc8/0x110 kernel/fork.c:2702
 do_syscall_x64 arch/x86/entry/common.c:50 [inline]
 do_syscall_64+0x35/0xb0 arch/x86/entry/common.c:80
 entry_SYSCALL_64_after_hwframe+0x44/0xae
RIP: 0033:0x471f0f
Code: ed 0f 85 64 01 00 00 64 4c 8b 0c 25 10 00 00 00 45 31 c0 4d 8d
91 d0 02 00 00 31 d2 31 f6 bf 11 00 20 01 b8 38 00 00 00 0f 05 <48> 3d
00 f0 ff ff 0f 87 f5 00 00 00 41 89 c5 85 c0 0f 85 fc 00 00
RSP: 002b:00007ffc62d06000 EFLAGS: 00000246 ORIG_RAX: 0000000000000038
RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 0000000000471f0f
RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000001200011
RBP: 0000000000000001 R08: 0000000000000000 R09: 0000000003176940
R10: 0000000003176c10 R11: 0000000000000246 R12: 0000000000000001
R13: 0000000000000005 R14: 00007ffc62d060b0 R15: 00007ffc62d0606c
Modules linked in:
Dumping ftrace buffer:
   (ftrace buffer empty)
---[ end trace ef2331b2238dd17a ]---
RIP: 0010:latency_exceeded block/blk-wbt.c:237 [inline]
RIP: 0010:wb_timer_fn+0x149/0x2080 block/blk-wbt.c:360
Code: 03 80 3c 02 00 0f 85 08 1c 00 00 48 8b 9b b0 00 00 00 48 b8 00
00 00 00 00 fc ff df 48 8d bb 48 01 00 00 48 89 fa 48 c1 ea 03 <80> 3c
02 00 0f 85 d5 1b 00 00 48 8b 83 48 01 00 00 48 8d 7d 28 48
RSP: 0018:ffffc90000848cd0 EFLAGS: 00010212
RAX: dffffc0000000000 RBX: 0000000000000000 RCX: ffff888014e79c80
RDX: 0000000000000029 RSI: ffff888014e79c80 RDI: 0000000000000148
RBP: ffff88801e2adc00 R08: ffffffff83de2dfd R09: 0000000000000003
R10: 0000000000000005 R11: ffffed1003c55bb0 R12: 0000000000000003
R13: 0000000000000000 R14: ffff8881053bc800 R15: ffff88801e2adcd0
FS:  0000000003176940(0000) GS:ffff888063f00000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00000000014d53ad CR3: 000000002e5ff000 CR4: 0000000000350ee0
----------------
Code disassembly (best guess):
   0: 03 80 3c 02 00 0f    add    0xf00023c(%rax),%eax
   6: 85 08                test   %ecx,(%rax)
   8: 1c 00                sbb    $0x0,%al
   a: 00 48 8b              add    %cl,-0x75(%rax)
   d: 9b                    fwait
   e: b0 00                mov    $0x0,%al
  10: 00 00                add    %al,(%rax)
  12: 48 b8 00 00 00 00 00 movabs $0xdffffc0000000000,%rax
  19: fc ff df
  1c: 48 8d bb 48 01 00 00 lea    0x148(%rbx),%rdi
  23: 48 89 fa              mov    %rdi,%rdx
  26: 48 c1 ea 03          shr    $0x3,%rdx
* 2a: 80 3c 02 00          cmpb   $0x0,(%rdx,%rax,1) <-- trapping instruction
  2e: 0f 85 d5 1b 00 00    jne    0x1c09
  34: 48 8b 83 48 01 00 00 mov    0x148(%rbx),%rax
  3b: 48 8d 7d 28          lea    0x28(%rbp),%rdi
  3f: 48                    rex.W%
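
For readers decoding the report above: the faulting address appears to follow from the generic KASAN shadow mapping, where the shadow byte for an address lives at (addr >> 3) + offset. The following is a minimal userspace sketch of that arithmetic, not kernel code. It assumes the shadow offset 0xdffffc0000000000 seen in RAX and the shift of 3 seen in the `shr $0x3,%rdx` instruction; the helper names `shadow_addr` and `shadow_to_mem` are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Values read from the oops above (assumptions for this sketch). */
#define SHADOW_OFFSET      0xdffffc0000000000ULL /* RAX in the register dump */
#define SHADOW_SCALE_SHIFT 3                     /* shr $0x3,%rdx */

/* Hypothetical helper: address of the shadow byte checked for addr. */
static uint64_t shadow_addr(uint64_t addr)
{
    return (addr >> SHADOW_SCALE_SHIFT) + SHADOW_OFFSET;
}

/* Hypothetical helper: first byte of the 8-byte granule a shadow byte covers. */
static uint64_t shadow_to_mem(uint64_t shadow)
{
    assert(shadow >= SHADOW_OFFSET); /* must lie in the shadow region */
    return (shadow - SHADOW_OFFSET) << SHADOW_SCALE_SHIFT;
}
```

With RBX = 0, the `lea 0x148(%rbx),%rdi` instruction yields RDI = 0x148, and shadow_addr(0x148) gives 0xdffffc0000000029, the non-canonical address in the fault message. Mapping that shadow address back gives 0x148, the start of the reported range [0x148-0x14f]. This is consistent with a NULL pointer being dereferenced at field offset 0x148.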

* Re: general protection fault in wb_timer_fn
  2021-09-13  2:00 general protection fault in wb_timer_fn Hao Sun
@ 2021-09-14 10:45 ` Christoph Hellwig
  2021-09-15  1:49   ` Hao Sun
  0 siblings, 1 reply; 7+ messages in thread
From: Christoph Hellwig @ 2021-09-14 10:45 UTC (permalink / raw)
  To: Hao Sun; +Cc: Jens Axboe, linux-block, linux-kernel

On Mon, Sep 13, 2021 at 10:00:27AM +0800, Hao Sun wrote:
> Hello,
> 
> When using Healer to fuzz the latest Linux kernel, the following crash
> was triggered.
> 
> HEAD commit:  4b93c544e90e-thunderbolt: test: split up test cases
> git tree: upstream
> console output:
> https://drive.google.com/file/d/1naN-5p-rFgKpHrshO_kQr5f_KlhvFGGU/view?usp=sharing
> kernel config: https://drive.google.com/file/d/1c0u2EeRDhRO-ZCxr9MP2VvAtJd6kfg-p/view?usp=sharing
> C reproducer: https://drive.google.com/file/d/19EDhssGw_V1oO2vWOPrgQqve0TdFSgvh/view?usp=sharing
> Syzlang reproducer:
> https://drive.google.com/file/d/13EGCCoaMe9oitrQCfy44BkGVLbKKFP6X/view?usp=sharing

All these google drive links just lead me to badly localized error
messages.  Can you upload these to a less broken hosting platform?

* Re: general protection fault in wb_timer_fn
  2021-09-14 10:45 ` Christoph Hellwig
@ 2021-09-15  1:49   ` Hao Sun
  2021-09-15  7:29     ` Christoph Hellwig
  0 siblings, 1 reply; 7+ messages in thread
From: Hao Sun @ 2021-09-15  1:49 UTC (permalink / raw)
  To: Christoph Hellwig; +Cc: Jens Axboe, linux-block, linux-kernel

On Tue, Sep 14, 2021 at 6:46 PM Christoph Hellwig <hch@infradead.org> wrote:
>
> On Mon, Sep 13, 2021 at 10:00:27AM +0800, Hao Sun wrote:
> > Hello,
> >
> > When using Healer to fuzz the latest Linux kernel, the following crash
> > was triggered.
> >
> > HEAD commit:  4b93c544e90e-thunderbolt: test: split up test cases
> > git tree: upstream
> > console output:
> > https://drive.google.com/file/d/1naN-5p-rFgKpHrshO_kQr5f_KlhvFGGU/view?usp=sharing
> > kernel config: https://drive.google.com/file/d/1c0u2EeRDhRO-ZCxr9MP2VvAtJd6kfg-p/view?usp=sharing
> > C reproducer: https://drive.google.com/file/d/19EDhssGw_V1oO2vWOPrgQqve0TdFSgvh/view?usp=sharing
> > Syzlang reproducer:
> > https://drive.google.com/file/d/13EGCCoaMe9oitrQCfy44BkGVLbKKFP6X/view?usp=sharing
>
> All these google drive links just lead me to badly localized error
> messages.  Can you upload these to a less broken hosting platform?

console output: https://paste.ubuntu.com/p/5qHqPXWmCQ/
kernel config: https://paste.ubuntu.com/p/VsVbFh9ZpQ/
C reproducer: https://paste.ubuntu.com/p/yrYsn4zpcn/
Syzlang reproducer: https://paste.ubuntu.com/p/bCWyNyHncJ/

Just tried the C reproducer on the latest Linux kernel (6880fa6c5660
Linux 5.15-rc1).
The reproducer still crashed the kernel but with a different backtrace.

IPv6: ADDRCONF(NETDEV_CHANGE): wlan0: link becomes ready
IPv6: ADDRCONF(NETDEV_CHANGE): wlan1: link becomes ready
Bluetooth: hci0: command 0x0409 tx timeout
------------[ cut here ]------------
kernel BUG at fs/buffer.c:1510!
invalid opcode: 0000 [#1] PREEMPT SMP
CPU: 0 PID: 5 Comm: kworker/0:0 Not tainted 5.14.0+ #15
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS
rel-1.12.0-59-gc9ba5276e321-prebuilt.qemu.org 04/01/2014
Workqueue: events delayed_fput
RIP: 0010:block_invalidatepage+0x27f/0x2a0 -origin/fs/buffer.c:1510
Code: ff ff e8 b4 07 d7 ff b9 02 00 00 00 be 02 00 00 00 4c 89 ff 48
c7 c2 40 4e 25 84 e8 2b c2 c4 02 e9 c9 fe ff ff e8 91 07 d7 ff <0f> 0b
e8 8a 07 d7 ff 0f 0b e8 83 07 d7 ff 48 8d 5d ff e9 57 ff ff
RSP: 0018:ffffc9000065bb60 EFLAGS: 00010293
RAX: 0000000000000000 RBX: ffffea0000670000 RCX: 0000000000000000
RDX: ffff8880097fa240 RSI: ffffffff81608a9f RDI: ffffea0000670000
RBP: ffffea0000670000 R08: 0000000000000001 R09: 0000000000000000
R10: ffffc9000065b9f8 R11: 0000000000000003 R12: ffffffff81608820
R13: ffffc9000065bc68 R14: 0000000000000000 R15: ffffc9000065bbf0
FS:  0000000000000000(0000) GS:ffff88807dc00000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00007f4aef93fb08 CR3: 0000000108cf2000 CR4: 0000000000750ef0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
PKRU: 55555554
Call Trace:
 do_invalidatepage -origin/mm/truncate.c:157 [inline]
 truncate_cleanup_page+0x15c/0x280 -origin/mm/truncate.c:176
 truncate_inode_pages_range+0x169/0xc30 -origin/mm/truncate.c:325
 kill_bdev.isra.29+0x28/0x30
 blkdev_flush_mapping+0x4c/0x130 -origin/block/bdev.c:658
 blkdev_put_whole+0x54/0x60 -origin/block/bdev.c:689
 blkdev_put+0x6f/0x210 -origin/block/bdev.c:953
 blkdev_close+0x25/0x30 -origin/block/fops.c:459
 __fput+0xdf/0x380 -origin/fs/file_table.c:280
 delayed_fput+0x25/0x40 -origin/fs/file_table.c:308
 process_one_work+0x359/0x850 -origin/kernel/workqueue.c:2297
 worker_thread+0x41/0x4d0 -origin/kernel/workqueue.c:2444
 kthread+0x178/0x1b0 -origin/kernel/kthread.c:319
 ret_from_fork+0x1f/0x30 -origin/arch/x86/entry/entry_64.S:295
Modules linked in:
Dumping ftrace buffer:
   (ftrace buffer empty)
---[ end trace 9dbb8f58f2109f10 ]---
RIP: 0010:block_invalidatepage+0x27f/0x2a0 -origin/fs/buffer.c:1510
Code: ff ff e8 b4 07 d7 ff b9 02 00 00 00 be 02 00 00 00 4c 89 ff 48
c7 c2 40 4e 25 84 e8 2b c2 c4 02 e9 c9 fe ff ff e8 91 07 d7 ff <0f> 0b
e8 8a 07 d7 ff 0f 0b e8 83 07 d7 ff 48 8d 5d ff e9 57 ff ff
RSP: 0018:ffffc9000065bb60 EFLAGS: 00010293
RAX: 0000000000000000 RBX: ffffea0000670000 RCX: 0000000000000000
RDX: ffff8880097fa240 RSI: ffffffff81608a9f RDI: ffffea0000670000
RBP: ffffea0000670000 R08: 0000000000000001 R09: 0000000000000000
R10: ffffc9000065b9f8 R11: 0000000000000003 R12: ffffffff81608820
R13: ffffc9000065bc68 R14: 0000000000000000 R15: ffffc9000065bbf0
FS:  0000000000000000(0000) GS:ffff88807dc00000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00007ff98674f000 CR3: 0000000106b2e000 CR4: 0000000000750ef0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
PKRU: 55555554

* Re: general protection fault in wb_timer_fn
  2021-09-15  1:49   ` Hao Sun
@ 2021-09-15  7:29     ` Christoph Hellwig
  2021-09-15 19:42       ` Yang Shi
  0 siblings, 1 reply; 7+ messages in thread
From: Christoph Hellwig @ 2021-09-15  7:29 UTC (permalink / raw)
  To: Hao Sun
  Cc: Christoph Hellwig, Jens Axboe, linux-block, linux-kernel, linux-mm

On Wed, Sep 15, 2021 at 09:49:49AM +0800, Hao Sun wrote:
> console output: https://paste.ubuntu.com/p/5qHqPXWmCQ/
> kernel config: https://paste.ubuntu.com/p/VsVbFh9ZpQ/
> C reproducer: https://paste.ubuntu.com/p/yrYsn4zpcn/
> Syzlang reproducer: https://paste.ubuntu.com/p/bCWyNyHncJ/
> 
> Just tried the C reproducer on the latest Linux kernel (6880fa6c5660
> Linux 5.15-rc1).
> The reproducer still crashed the kernel but with a different backtrace.

Well, that trace looks very much like an issue in the MM truncate code.
Adding the linux-mm list.


* Re: general protection fault in wb_timer_fn
  2021-09-15  7:29     ` Christoph Hellwig
@ 2021-09-15 19:42       ` Yang Shi
  2021-09-16 10:47         ` Hao Sun
  0 siblings, 1 reply; 7+ messages in thread
From: Yang Shi @ 2021-09-15 19:42 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: Hao Sun, Jens Axboe, linux-block, Linux Kernel Mailing List, Linux MM

On Wed, Sep 15, 2021 at 12:34 AM Christoph Hellwig <hch@infradead.org> wrote:
>
> On Wed, Sep 15, 2021 at 09:49:49AM +0800, Hao Sun wrote:
> > console output: https://paste.ubuntu.com/p/5qHqPXWmCQ/
> > kernel config: https://paste.ubuntu.com/p/VsVbFh9ZpQ/
> > C reproducer: https://paste.ubuntu.com/p/yrYsn4zpcn/
> > Syzlang reproducer: https://paste.ubuntu.com/p/bCWyNyHncJ/
> >
> > Just tried the C reproducer on the latest Linux kernel (6880fa6c5660
> > Linux 5.15-rc1).
> > The reproducer still crashed the kernel but with a different backtrace.
>
> Well, that trace looks very much like an issue in the MM truncate code.
> Adding the linux-mm list.

The BUG_ON triggers when the invalidation range extends past the end of
the page, but the check hardcodes PAGE_SIZE. truncate_cleanup_page()
passes an offset of 0, but the length can be > PAGE_SIZE if the page is
a compound page. That might be caused by READ_ONLY_THP_FOR_FS.
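
To make the reasoning above concrete, here is a minimal userspace model of the overflow check, not the kernel function itself. It assumes that `stop` is `offset + length`, as the check implies, and takes the size limit as a parameter so the hardcoded and compound-aware variants can be compared. The helper name `range_check_fails` and the 2 MiB example size (512 base pages of 4 KiB) are illustrative assumptions.

```c
#include <stdbool.h>

/*
 * Hypothetical model of the check in block_invalidatepage():
 *   BUG_ON(stop > limit || stop < length);
 * Returns true when the BUG_ON condition would hold.
 */
static bool range_check_fails(unsigned int offset, unsigned int length,
                              unsigned int limit)
{
    unsigned int stop = offset + length; /* assumed definition of stop */

    /* stop < length catches unsigned wrap-around of offset + length. */
    return stop > limit || stop < length;
}
```

With offset 0 and a length of 512 * 4096 bytes, the check fails against a limit of 4096 but passes against a limit equal to the compound page size, which is the difference between the two variants of the check discussed here.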

Could you please try the below debug patch to dump page details? I saw
your kernel config has DEBUG_VM enabled.

diff --git a/fs/buffer.c b/fs/buffer.c
index ab7573d72dd7..ed7256112c2b 100644
--- a/fs/buffer.c
+++ b/fs/buffer.c
@@ -1507,7 +1507,8 @@ void block_invalidatepage(struct page *page, unsigned int offset,
        /*
         * Check for overflow
         */
-       BUG_ON(stop > PAGE_SIZE || stop < length);
+       VM_BUG_ON_PAGE((stop > PAGE_SIZE), page);
+       VM_BUG_ON_PAGE((stop < length), page);

        head = page_buffers(page);
        bh = head;

If my speculation is correct, I think the below patch should be able
to fix this issue.

diff --git a/fs/buffer.c b/fs/buffer.c
index ab7573d72dd7..18428cee59af 100644
--- a/fs/buffer.c
+++ b/fs/buffer.c
@@ -1507,7 +1507,7 @@ void block_invalidatepage(struct page *page, unsigned int offset,
        /*
         * Check for overflow
         */
-       BUG_ON(stop > PAGE_SIZE || stop < length);
+       BUG_ON(stop > thp_size(page) || stop < length);

        head = page_buffers(page);
        bh = head;


* Re: general protection fault in wb_timer_fn
  2021-09-15 19:42       ` Yang Shi
@ 2021-09-16 10:47         ` Hao Sun
  2021-09-16 20:18           ` Yang Shi
  0 siblings, 1 reply; 7+ messages in thread
From: Hao Sun @ 2021-09-16 10:47 UTC (permalink / raw)
  To: Yang Shi
  Cc: Christoph Hellwig, Jens Axboe, linux-block,
	Linux Kernel Mailing List, Linux MM

> The BUG is triggered if it tries to invalidate across pages. But it
> hardcoded PAGE_SIZE. The offset passed in by truncate_cleanup_page()
> is 0, but the length might be > PAGE_SIZE if it is a compound page. It
> might be caused by READ_ONLY_THP_FOR_FS.
>
> Could you please try the below debug patch to dump page details? I saw
> your kernel config has DEBUG_VM enabled.
>
> diff --git a/fs/buffer.c b/fs/buffer.c
> index ab7573d72dd7..ed7256112c2b 100644
> --- a/fs/buffer.c
> +++ b/fs/buffer.c
> @@ -1507,7 +1507,8 @@ void block_invalidatepage(struct page *page, unsigned int offset,
>         /*
>          * Check for overflow
>          */
> -       BUG_ON(stop > PAGE_SIZE || stop < length);
> +       VM_BUG_ON_PAGE((stop > PAGE_SIZE), page);
> +       VM_BUG_ON_PAGE((stop < length), page);
>
>         head = page_buffers(page);
>         bh = head;
>

Just patched it.
The following log was printed after executing the C reproducer.

IPv6: ADDRCONF(NETDEV_CHANGE): wlan0: link becomes ready
IPv6: ADDRCONF(NETDEV_CHANGE): wlan1: link becomes ready
Bluetooth: hci0: command 0x0409 tx timeout
Bluetooth: hci0: command 0x041b tx timeout
Bluetooth: hci0: command 0x040f tx timeout
Bluetooth: hci0: command 0x0419 tx timeout
page:ffffea00009c0000 refcount:514 mapcount:0 mapping:ffff8881060db7b0
index:0x0 pfn:0x27000
head:ffffea00009c0000 order:9 compound_mapcount:0 compound_pincount:0
memcg:ffff8880118bc000
aops:def_blk_aops ino:fa00000
flags: 0xfff00000012037(locked|referenced|uptodate|lru|active|private|head|node=0|zone=1|lastcpupid=0x7ff)
raw: 00fff00000012037 ffffea00009491c8 ffff888010c7d030 ffff8881060db7b0
raw: 0000000000000000 ffff888025d10658 00000202ffffffff ffff8880118bc000
page dumped because: VM_BUG_ON_PAGE((stop > ((1UL) << 12)))
page_owner tracks the page as allocated
page last allocated via order 9, migratetype Movable, gfp_mask
0x13c24ca(GFP_TRANSHUGE|__GFP_THISNODE), pid 35, ts 579054356699,
free_ts 543915753195
 set_page_owner include/linux/page_owner.h:31 [inline]
 post_alloc_hook mm/page_alloc.c:2418 [inline]
 prep_new_page+0x1a5/0x240 mm/page_alloc.c:2424
 get_page_from_freelist+0x1f10/0x3b70 mm/page_alloc.c:4153
 __alloc_pages+0x306/0x6e0 mm/page_alloc.c:5375
 __alloc_pages_node include/linux/gfp.h:570 [inline]
 khugepaged_alloc_page+0xa0/0x170 mm/khugepaged.c:881
 collapse_file+0x20a/0x45f0 mm/khugepaged.c:1655
 khugepaged_scan_file mm/khugepaged.c:2051 [inline]
 khugepaged_scan_mm_slot mm/khugepaged.c:2146 [inline]
 khugepaged_do_scan mm/khugepaged.c:2230 [inline]
 khugepaged+0x2e65/0x5c50 mm/khugepaged.c:2275
 kthread+0x3e5/0x4d0 kernel/kthread.c:319
 ret_from_fork+0x1f/0x30 arch/x86/entry/entry_64.S:295
page last free stack trace:
 reset_page_owner include/linux/page_owner.h:24 [inline]
 free_pages_prepare mm/page_alloc.c:1338 [inline]
 free_pcp_prepare+0x412/0x900 mm/page_alloc.c:1389
 free_unref_page_prepare mm/page_alloc.c:3315 [inline]
 free_unref_page+0x19/0x580 mm/page_alloc.c:3394
 release_pages+0x87f/0x2920 mm/swap.c:926
 tlb_batch_pages_flush mm/mmu_gather.c:49 [inline]
 tlb_flush_mmu_free mm/mmu_gather.c:242 [inline]
 tlb_flush_mmu+0x8d/0x610 mm/mmu_gather.c:249
 tlb_finish_mmu+0x93/0x3c0 mm/mmu_gather.c:340
 unmap_region+0x27f/0x350 mm/mmap.c:2653
 __do_munmap+0xabc/0x11e0 mm/mmap.c:2884
 do_munmap mm/mmap.c:2895 [inline]
 munmap_vma_range mm/mmap.c:603 [inline]
 mmap_region+0x2c4/0x1340 mm/mmap.c:1742
 do_mmap+0x7f5/0xe60 mm/mmap.c:1575
 vm_mmap_pgoff+0x1b7/0x290 mm/util.c:519
 ksys_mmap_pgoff+0x49f/0x620 mm/mmap.c:1624
 do_syscall_x64 arch/x86/entry/common.c:50 [inline]
 do_syscall_64+0x35/0xb0 arch/x86/entry/common.c:80
 entry_SYSCALL_64_after_hwframe+0x44/0xae
------------[ cut here ]------------
kernel BUG at fs/buffer.c:1511!
invalid opcode: 0000 [#1] PREEMPT SMP KASAN
CPU: 1 PID: 2954 Comm: kworker/1:2 Not tainted 5.15.0-rc1+ #4
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS
1.13.0-1ubuntu1.1 04/01/2014
Workqueue: events delayed_fput
RIP: 0010:block_invalidatepage+0x599/0x680 fs/buffer.c:1511
Code: 0f 0b e8 9a 6b 9c ff 31 f6 4c 89 e7 e8 10 d7 bf ff e9 df fe ff
ff e8 86 6b 9c ff 48 c7 c6 00 4e 9a 89 4c 89 e7 e8 c7 16 d0 ff <0f> 0b
e8 70 6b 9c ff 48 c7 c6 60 4e 9a 89 4c 89 e7 e8 b1 16 d0 ff
RSP: 0018:ffffc9000e9078b8 EFLAGS: 00010246
RAX: 0000000000000000 RBX: 0000000000000000 RCX: ffff888100bd9c80
RDX: 0000000000000000 RSI: ffff888100bd9c80 RDI: 0000000000000002
RBP: 0000000000000000 R08: ffffffff81d9e369 R09: 000000000000ffff
R10: 0000000000000003 R11: ffffed1026b83f53 R12: ffffea00009c0000
R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000200000
FS:  0000000000000000(0000) GS:ffff888135c00000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00007ffde41e8960 CR3: 0000000019cef000 CR4: 0000000000350ee0
Call Trace:
 do_invalidatepage mm/truncate.c:157 [inline]
 truncate_cleanup_page+0x3e4/0x620 mm/truncate.c:176
 truncate_inode_pages_range+0x26c/0x1910 mm/truncate.c:325
 kill_bdev.isra.0+0x5f/0x80 block/bdev.c:77
 blkdev_flush_mapping+0xdf/0x2e0 block/bdev.c:658
 blkdev_put_whole+0xe8/0x110 block/bdev.c:689
 blkdev_put+0x23c/0x6f0 block/bdev.c:953
 blkdev_close+0x8d/0xb0 block/fops.c:459
 __fput+0x288/0x9f0 fs/file_table.c:280
 delayed_fput+0x56/0x70 fs/file_table.c:308
 process_one_work+0x9df/0x16d0 kernel/workqueue.c:2297
 worker_thread+0x90/0xed0 kernel/workqueue.c:2444
 kthread+0x3e5/0x4d0 kernel/kthread.c:319
 ret_from_fork+0x1f/0x30 arch/x86/entry/entry_64.S:295
Modules linked in:
Dumping ftrace buffer:
   (ftrace buffer empty)
---[ end trace 77cfca54575c5255 ]---
RIP: 0010:block_invalidatepage+0x599/0x680 fs/buffer.c:1511
Code: 0f 0b e8 9a 6b 9c ff 31 f6 4c 89 e7 e8 10 d7 bf ff e9 df fe ff
ff e8 86 6b 9c ff 48 c7 c6 00 4e 9a 89 4c 89 e7 e8 c7 16 d0 ff <0f> 0b
e8 70 6b 9c ff 48 c7 c6 60 4e 9a 89 4c 89 e7 e8 b1 16 d0 ff
RSP: 0018:ffffc9000e9078b8 EFLAGS: 00010246
RAX: 0000000000000000 RBX: 0000000000000000 RCX: ffff888100bd9c80
RDX: 0000000000000000 RSI: ffff888100bd9c80 RDI: 0000000000000002
RBP: 0000000000000000 R08: ffffffff81d9e369 R09: 000000000000ffff
R10: 0000000000000003 R11: ffffed1026b83f53 R12: ffffea00009c0000
R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000200000
FS:  0000000000000000(0000) GS:ffff888135c00000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00007f78dffb7ef8 CR3: 000000001f2ef000 CR4: 0000000000350ee0


> If my speculation is correct, I think the below patch should be able
> to fix this issue.
>
> diff --git a/fs/buffer.c b/fs/buffer.c
> index ab7573d72dd7..18428cee59af 100644
> --- a/fs/buffer.c
> +++ b/fs/buffer.c
> @@ -1507,7 +1507,7 @@ void block_invalidatepage(struct page *page, unsigned int offset,
>         /*
>          * Check for overflow
>          */
> -       BUG_ON(stop > PAGE_SIZE || stop < length);
> +       BUG_ON(stop > thp_size(page) || stop < length);
>
>         head = page_buffers(page);
>         bh = head;
>

Yes, the C reproducer no longer crashes the kernel after applying the
above patch.
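
As a cross-check of the page dump above, the size of the dumped compound page can be derived from its order. This is a minimal userspace sketch, assuming a base page shift of 12 (taken from the `(1UL) << 12` in the VM_BUG_ON_PAGE message) and that a compound page of order n spans 2^n base pages; the helper name `compound_size_bytes` is hypothetical.

```c
/* Base page shift, from "(1UL) << 12" in the dump (assumption). */
#define BASE_PAGE_SHIFT 12

/* Hypothetical helper: bytes covered by a compound page of the given order. */
static unsigned long compound_size_bytes(unsigned int order)
{
    return 1UL << (BASE_PAGE_SHIFT + order);
}
```

For the `order:9` head page in the dump this gives 0x200000 (2 MiB), the same value seen in R15 of the register dump. That is well above the 4096-byte limit the debug check compares against.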

* Re: general protection fault in wb_timer_fn
  2021-09-16 10:47         ` Hao Sun
@ 2021-09-16 20:18           ` Yang Shi
  0 siblings, 0 replies; 7+ messages in thread
From: Yang Shi @ 2021-09-16 20:18 UTC (permalink / raw)
  To: Hao Sun, Matthew Wilcox, Hugh Dickins
  Cc: Christoph Hellwig, Jens Axboe, linux-block,
	Linux Kernel Mailing List, Linux MM

On Thu, Sep 16, 2021 at 3:47 AM Hao Sun <sunhao.th@gmail.com> wrote:
>
> > The BUG is triggered if it tries to invalidate across pages. But it
> > hardcoded PAGE_SIZE. The offset passed in by truncate_cleanup_page()
> > is 0, but the length might be > PAGE_SIZE if it is a compound page. It
> > might be caused by READ_ONLY_THP_FOR_FS.
> >
> > Could you please try the below debug patch to dump page details? I saw
> > your kernel config has DEBUG_VM enabled.
> >
> > diff --git a/fs/buffer.c b/fs/buffer.c
> > index ab7573d72dd7..ed7256112c2b 100644
> > --- a/fs/buffer.c
> > +++ b/fs/buffer.c
> > @@ -1507,7 +1507,8 @@ void block_invalidatepage(struct page *page, unsigned int offset,
> >         /*
> >          * Check for overflow
> >          */
> > -       BUG_ON(stop > PAGE_SIZE || stop < length);
> > +       VM_BUG_ON_PAGE((stop > PAGE_SIZE), page);
> > +       VM_BUG_ON_PAGE((stop < length), page);
> >
> >         head = page_buffers(page);
> >         bh = head;
> >
>
> Just patched it.
> The following log was printed after executing the C reproducer.
>
> IPv6: ADDRCONF(NETDEV_CHANGE): wlan0: link becomes ready
> IPv6: ADDRCONF(NETDEV_CHANGE): wlan1: link becomes ready
> Bluetooth: hci0: command 0x0409 tx timeout
> Bluetooth: hci0: command 0x041b tx timeout
> Bluetooth: hci0: command 0x040f tx timeout
> Bluetooth: hci0: command 0x0419 tx timeout
> page:ffffea00009c0000 refcount:514 mapcount:0 mapping:ffff8881060db7b0
> index:0x0 pfn:0x27000
> head:ffffea00009c0000 order:9 compound_mapcount:0 compound_pincount:0
> memcg:ffff8880118bc000
> aops:def_blk_aops ino:fa00000
> flags: 0xfff00000012037(locked|referenced|uptodate|lru|active|private|head|node=0|zone=1|lastcpupid=0x7ff)
> raw: 00fff00000012037 ffffea00009491c8 ffff888010c7d030 ffff8881060db7b0
> raw: 0000000000000000 ffff888025d10658 00000202ffffffff ffff8880118bc000
> page dumped because: VM_BUG_ON_PAGE((stop > ((1UL) << 12)))
> page_owner tracks the page as allocated
> page last allocated via order 9, migratetype Movable, gfp_mask
> 0x13c24ca(GFP_TRANSHUGE|__GFP_THISNODE), pid 35, ts 579054356699,
> free_ts 543915753195
>  set_page_owner include/linux/page_owner.h:31 [inline]
>  post_alloc_hook mm/page_alloc.c:2418 [inline]
>  prep_new_page+0x1a5/0x240 mm/page_alloc.c:2424
>  get_page_from_freelist+0x1f10/0x3b70 mm/page_alloc.c:4153
>  __alloc_pages+0x306/0x6e0 mm/page_alloc.c:5375
>  __alloc_pages_node include/linux/gfp.h:570 [inline]
>  khugepaged_alloc_page+0xa0/0x170 mm/khugepaged.c:881
>  collapse_file+0x20a/0x45f0 mm/khugepaged.c:1655
>  khugepaged_scan_file mm/khugepaged.c:2051 [inline]
>  khugepaged_scan_mm_slot mm/khugepaged.c:2146 [inline]
>  khugepaged_do_scan mm/khugepaged.c:2230 [inline]
>  khugepaged+0x2e65/0x5c50 mm/khugepaged.c:2275
>  kthread+0x3e5/0x4d0 kernel/kthread.c:319
>  ret_from_fork+0x1f/0x30 arch/x86/entry/entry_64.S:295
> page last free stack trace:
>  reset_page_owner include/linux/page_owner.h:24 [inline]
>  free_pages_prepare mm/page_alloc.c:1338 [inline]
>  free_pcp_prepare+0x412/0x900 mm/page_alloc.c:1389
>  free_unref_page_prepare mm/page_alloc.c:3315 [inline]
>  free_unref_page+0x19/0x580 mm/page_alloc.c:3394
>  release_pages+0x87f/0x2920 mm/swap.c:926
>  tlb_batch_pages_flush mm/mmu_gather.c:49 [inline]
>  tlb_flush_mmu_free mm/mmu_gather.c:242 [inline]
>  tlb_flush_mmu+0x8d/0x610 mm/mmu_gather.c:249
>  tlb_finish_mmu+0x93/0x3c0 mm/mmu_gather.c:340
>  unmap_region+0x27f/0x350 mm/mmap.c:2653
>  __do_munmap+0xabc/0x11e0 mm/mmap.c:2884
>  do_munmap mm/mmap.c:2895 [inline]
>  munmap_vma_range mm/mmap.c:603 [inline]
>  mmap_region+0x2c4/0x1340 mm/mmap.c:1742
>  do_mmap+0x7f5/0xe60 mm/mmap.c:1575
>  vm_mmap_pgoff+0x1b7/0x290 mm/util.c:519
>  ksys_mmap_pgoff+0x49f/0x620 mm/mmap.c:1624
>  do_syscall_x64 arch/x86/entry/common.c:50 [inline]
>  do_syscall_64+0x35/0xb0 arch/x86/entry/common.c:80
>  entry_SYSCALL_64_after_hwframe+0x44/0xae
> ------------[ cut here ]------------
> kernel BUG at fs/buffer.c:1511!
> invalid opcode: 0000 [#1] PREEMPT SMP KASAN
> CPU: 1 PID: 2954 Comm: kworker/1:2 Not tainted 5.15.0-rc1+ #4
> Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS
> 1.13.0-1ubuntu1.1 04/01/2014
> Workqueue: events delayed_fput
> RIP: 0010:block_invalidatepage+0x599/0x680 fs/buffer.c:1511
> Code: 0f 0b e8 9a 6b 9c ff 31 f6 4c 89 e7 e8 10 d7 bf ff e9 df fe ff
> ff e8 86 6b 9c ff 48 c7 c6 00 4e 9a 89 4c 89 e7 e8 c7 16 d0 ff <0f> 0b
> e8 70 6b 9c ff 48 c7 c6 60 4e 9a 89 4c 89 e7 e8 b1 16 d0 ff
> RSP: 0018:ffffc9000e9078b8 EFLAGS: 00010246
> RAX: 0000000000000000 RBX: 0000000000000000 RCX: ffff888100bd9c80
> RDX: 0000000000000000 RSI: ffff888100bd9c80 RDI: 0000000000000002
> RBP: 0000000000000000 R08: ffffffff81d9e369 R09: 000000000000ffff
> R10: 0000000000000003 R11: ffffed1026b83f53 R12: ffffea00009c0000
> R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000200000
> FS:  0000000000000000(0000) GS:ffff888135c00000(0000) knlGS:0000000000000000
> CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> CR2: 00007ffde41e8960 CR3: 0000000019cef000 CR4: 0000000000350ee0
> Call Trace:
>  do_invalidatepage mm/truncate.c:157 [inline]
>  truncate_cleanup_page+0x3e4/0x620 mm/truncate.c:176
>  truncate_inode_pages_range+0x26c/0x1910 mm/truncate.c:325
>  kill_bdev.isra.0+0x5f/0x80 block/bdev.c:77
>  blkdev_flush_mapping+0xdf/0x2e0 block/bdev.c:658
>  blkdev_put_whole+0xe8/0x110 block/bdev.c:689
>  blkdev_put+0x23c/0x6f0 block/bdev.c:953
>  blkdev_close+0x8d/0xb0 block/fops.c:459
>  __fput+0x288/0x9f0 fs/file_table.c:280
>  delayed_fput+0x56/0x70 fs/file_table.c:308
>  process_one_work+0x9df/0x16d0 kernel/workqueue.c:2297
>  worker_thread+0x90/0xed0 kernel/workqueue.c:2444
>  kthread+0x3e5/0x4d0 kernel/kthread.c:319
>  ret_from_fork+0x1f/0x30 arch/x86/entry/entry_64.S:295
> Modules linked in:
> Dumping ftrace buffer:
>    (ftrace buffer empty)
> ---[ end trace 77cfca54575c5255 ]---
> RIP: 0010:block_invalidatepage+0x599/0x680 fs/buffer.c:1511
> Code: 0f 0b e8 9a 6b 9c ff 31 f6 4c 89 e7 e8 10 d7 bf ff e9 df fe ff
> ff e8 86 6b 9c ff 48 c7 c6 00 4e 9a 89 4c 89 e7 e8 c7 16 d0 ff <0f> 0b
> e8 70 6b 9c ff 48 c7 c6 60 4e 9a 89 4c 89 e7 e8 b1 16 d0 ff
> RSP: 0018:ffffc9000e9078b8 EFLAGS: 00010246
> RAX: 0000000000000000 RBX: 0000000000000000 RCX: ffff888100bd9c80
> RDX: 0000000000000000 RSI: ffff888100bd9c80 RDI: 0000000000000002
> RBP: 0000000000000000 R08: ffffffff81d9e369 R09: 000000000000ffff
> R10: 0000000000000003 R11: ffffed1026b83f53 R12: ffffea00009c0000
> R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000200000
> FS:  0000000000000000(0000) GS:ffff888135c00000(0000) knlGS:0000000000000000
> CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> CR2: 00007f78dffb7ef8 CR3: 000000001f2ef000 CR4: 0000000000350ee0
>
>
> > If my speculation is correct, I think the below patch should be able
> > to fix this issue.
> >
> > diff --git a/fs/buffer.c b/fs/buffer.c
> > index ab7573d72dd7..18428cee59af 100644
> > --- a/fs/buffer.c
> > +++ b/fs/buffer.c
> > @@ -1507,7 +1507,7 @@ void block_invalidatepage(struct page *page,
> > unsigned int offset,
> >         /*
> >          * Check for overflow
> >          */
> > -       BUG_ON(stop > PAGE_SIZE || stop < length);
> > +       BUG_ON(stop > thp_size(page) || stop < length);
> >
> >         head = page_buffers(page);
> >         bh = head;
> >
>
> Yes, the C reproducer can not crash the kernel anymore after patching
> the above code.

Thank you for running the test. This does prove my speculation. It
seems commit eb6ecbed0aa2 ("mm, thp: relax the VM_DENYWRITE constraint
on file-backed THPs") opens up many more cases for file THPs.

It seems your test case opens a null block device and mmaps it with
PROT_EXEC. This is why the THP is collapsed.

The above fix is kind of ad-hoc. Further investigation shows a bigger
problem in invalidatepage(). None of the implementations are THP
aware; they all hardcode PAGE_SIZE. Some trigger BUG(), like
block_invalidatepage(); others just return an error if length is >
PAGE_SIZE.
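
To make the arithmetic concrete, here is a minimal user-space sketch of
the overflow check, not the kernel code itself. It assumes 4 KiB base
pages (the "(1UL) << 12" in the log) and the order-9 compound page from
the dump, with offset 0 and length 0x200000 (the value in R15). All
function and macro names below are hypothetical and exist only for
illustration; model_compound_size() merely stands in for what
thp_size() would return for such a page.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical user-space model; none of these names are kernel APIs. */
#define MODEL_PAGE_SHIFT 12UL                      /* assumes 4 KiB base pages */
#define MODEL_PAGE_SIZE  (1UL << MODEL_PAGE_SHIFT)

/* Size in bytes of a compound page of the given order (order 0 = one base page). */
unsigned long model_compound_size(unsigned int order)
{
    /* Guard against shifting past the width of unsigned long. */
    assert(order < 8 * sizeof(unsigned long) - MODEL_PAGE_SHIFT);
    return MODEL_PAGE_SIZE << order;
}

/* Models BUG_ON(stop > PAGE_SIZE || stop < length): true if it would fire. */
bool old_check_fires(unsigned int offset, unsigned int length)
{
    unsigned int stop = offset + length;

    /* "stop < length" catches unsigned wrap-around of offset + length. */
    return stop > MODEL_PAGE_SIZE || stop < length;
}

/* Models the check with the limit raised to the compound page size. */
bool relaxed_check_fires(unsigned int order, unsigned int offset,
                         unsigned int length)
{
    unsigned int stop = offset + length;

    return stop > model_compound_size(order) || stop < length;
}
```

Under these assumptions, old_check_fires(0, 0x200000) is true, which
matches the BUG at fs/buffer.c:1511, while relaxed_check_fires(9, 0,
0x200000) is false. The sketch only covers the bounds check; it says
nothing about the per-subpage private state discussed next.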

We could convert PAGE_SIZE to thp_size(), but that does not seem to be
enough. The current implementations invalidate only one subpage
(typically the head page), while other subpages may have private data
too: PG_private is per subpage, so multiple subpages may have it set,
IIUC. The extra refcount pins taken for the private data of those
subpages may then prevent the THP from being split and reclaimed.

I could submit a patch to close the BUG() for now, since more work
definitely needs to be done to make everything right. However, how to
fix this may conflict with Willy's page folio work, so that may not
happen any time soon.


end of thread, other threads:[~2021-09-16 20:18 UTC | newest]

Thread overview: 7+ messages
2021-09-13  2:00 general protection fault in wb_timer_fn Hao Sun
2021-09-14 10:45 ` Christoph Hellwig
2021-09-15  1:49   ` Hao Sun
2021-09-15  7:29     ` Christoph Hellwig
2021-09-15 19:42       ` Yang Shi
2021-09-16 10:47         ` Hao Sun
2021-09-16 20:18           ` Yang Shi
