All of lore.kernel.org
 help / color / mirror / Atom feed
* [Question] do we need fail __device_add_disk when functions in it return error
       [not found] <3d3dce33-a2d7-ebfa-c35e-55c2de7df7af@huawei.com>
@ 2021-04-07 11:50 ` Yufen Yu
  2021-04-07 12:55   ` Greg Kroah-Hartman
  0 siblings, 1 reply; 2+ messages in thread
From: Yufen Yu @ 2021-04-07 11:50 UTC (permalink / raw)
  To: linux-block
  Cc: axboe, Christoph Hellwig, Ming Lei, Greg Kroah-Hartman, rafael

Hi, all

Recently, our syzbot test reported NULL pointer dereference in device_del()
by injecting memory allocation fail in device_add() invoking from register_disk(),
as following:

general protection fault, probably for non-canonical address 0xdffffc0000000021: 0000 [#1] PREEMPT SMP KASAN
KASAN: null-ptr-deref in range [0x0000000000000108-0x000000000000010f]
CPU: 0 PID: 9441 Comm: syz-executor348 Tainted: G        W         5.10.0+ #6
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Ubuntu-1.8.2-1ubuntu1 04/01/2014
RIP: 0010:kill_device+0x9b/0x160 drivers/base/core.c:3050
Code: fd 49 83 c6 48 4c 89 f3 48 c1 eb 03 42 80 3c 3b 00 74 08 4c 89 f7 e8 94 ef 8c fd bd 08 01 00 00 49 03 2e 48 89 e8 48 c1 e8 03 <42> 8a 04 38 84 c0 0f 85 89 00 00 00 0f b6 6d 00 83 e5 01 31 ff 89
RSP: 0018:ffffc900032bfd38 EFLAGS: 00010206
RAX: 0000000000000021 RBX: 1ffff1100299b816 RCX: ffff8881115cc200
RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000000005
RBP: 0000000000000108 R08: dffffc0000000000 R09: ffff8881115cc200
R10: ffffed100299b830 R11: 0000000000000000 R12: dffffc0000000000
R13: 0000000000000008 R14: ffff888014cdc0b0 R15: dffffc0000000000
FS:  0000000000e29880(0000) GS:ffff888061000000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00000000006cc098 CR3: 0000000018764000 CR4: 00000000003506f0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
Call Trace:
  device_del+0x62/0xa00 drivers/base/core.c:3078
  del_gendisk+0x725/0x880 block/genhd.c:952
  loop_remove+0x42/0xa0 drivers/block/loop.c:2193
  loop_control_ioctl+0x31d/0x3a0 drivers/block/loop.c:2292
  vfs_ioctl+0xa2/0xf0 fs/ioctl.c:48
  __do_sys_ioctl fs/ioctl.c:753 [inline]
  __se_sys_ioctl fs/ioctl.c:739 [inline]
  __x64_sys_ioctl+0xfe/0x140 fs/ioctl.c:739
  do_syscall_64+0x2d/0x70 arch/x86/entry/common.c:46
  entry_SYSCALL_64_after_hwframe+0x44/0xa9

When device_add() fail caused by some memory allocation fail, it will free
'dev->p' and set the pointer as NULL. Since add_disk() will ignore the functions
fail and go on working, it will cause various NULL pointer dereference in del_gendisk(),
including 'dev->p' is NULL in kill_device() from device_del(), 'name' is NULL in
sysfs_remove_link() from del_gendisk(), kobj->sd is 'NULL' when call dpm_sysfs_remove()
from device_del(), and so on.

I try to fix the the problem by passing device_del() when device_add() fail[1], which is
is not complete. And Greg thinks we should fix up users of device_del().

My question is do we need to check return value of register_disk() and fail __device_add_disk()
when functions in it fail.
Is there some historical reason that let us ignore various fail from __device_add_disk()
and register_disk()?

[1] https://lore.kernel.org/linux-block/20210401130138.2164928-1-yuyufen@huawei.com/

Thanks,
Yufen

^ permalink raw reply	[flat|nested] 2+ messages in thread

* Re: [Question] do we need fail __device_add_disk when functions in it return error
  2021-04-07 11:50 ` [Question] do we need fail __device_add_disk when functions in it return error Yufen Yu
@ 2021-04-07 12:55   ` Greg Kroah-Hartman
  0 siblings, 0 replies; 2+ messages in thread
From: Greg Kroah-Hartman @ 2021-04-07 12:55 UTC (permalink / raw)
  To: Yufen Yu; +Cc: linux-block, axboe, Christoph Hellwig, Ming Lei, rafael

On Wed, Apr 07, 2021 at 07:50:20PM +0800, Yufen Yu wrote:
> Hi, all
> 
> Recently, our syzbot test reported NULL pointer dereference in device_del()
> by injecting memory allocation fail in device_add() invoking from register_disk(),
> as following:
> 
> general protection fault, probably for non-canonical address 0xdffffc0000000021: 0000 [#1] PREEMPT SMP KASAN
> KASAN: null-ptr-deref in range [0x0000000000000108-0x000000000000010f]
> CPU: 0 PID: 9441 Comm: syz-executor348 Tainted: G        W         5.10.0+ #6
> Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Ubuntu-1.8.2-1ubuntu1 04/01/2014
> RIP: 0010:kill_device+0x9b/0x160 drivers/base/core.c:3050
> Code: fd 49 83 c6 48 4c 89 f3 48 c1 eb 03 42 80 3c 3b 00 74 08 4c 89 f7 e8 94 ef 8c fd bd 08 01 00 00 49 03 2e 48 89 e8 48 c1 e8 03 <42> 8a 04 38 84 c0 0f 85 89 00 00 00 0f b6 6d 00 83 e5 01 31 ff 89
> RSP: 0018:ffffc900032bfd38 EFLAGS: 00010206
> RAX: 0000000000000021 RBX: 1ffff1100299b816 RCX: ffff8881115cc200
> RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000000005
> RBP: 0000000000000108 R08: dffffc0000000000 R09: ffff8881115cc200
> R10: ffffed100299b830 R11: 0000000000000000 R12: dffffc0000000000
> R13: 0000000000000008 R14: ffff888014cdc0b0 R15: dffffc0000000000
> FS:  0000000000e29880(0000) GS:ffff888061000000(0000) knlGS:0000000000000000
> CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> CR2: 00000000006cc098 CR3: 0000000018764000 CR4: 00000000003506f0
> DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
> DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
> Call Trace:
>  device_del+0x62/0xa00 drivers/base/core.c:3078
>  del_gendisk+0x725/0x880 block/genhd.c:952
>  loop_remove+0x42/0xa0 drivers/block/loop.c:2193
>  loop_control_ioctl+0x31d/0x3a0 drivers/block/loop.c:2292
>  vfs_ioctl+0xa2/0xf0 fs/ioctl.c:48
>  __do_sys_ioctl fs/ioctl.c:753 [inline]
>  __se_sys_ioctl fs/ioctl.c:739 [inline]
>  __x64_sys_ioctl+0xfe/0x140 fs/ioctl.c:739
>  do_syscall_64+0x2d/0x70 arch/x86/entry/common.c:46
>  entry_SYSCALL_64_after_hwframe+0x44/0xa9
> 
> When device_add() fail caused by some memory allocation fail, it will free
> 'dev->p' and set the pointer as NULL. Since add_disk() will ignore the functions
> fail and go on working, it will cause various NULL pointer dereference in del_gendisk(),
> including 'dev->p' is NULL in kill_device() from device_del(), 'name' is NULL in
> sysfs_remove_link() from del_gendisk(), kobj->sd is 'NULL' when call dpm_sysfs_remove()
> from device_del(), and so on.
> 
> I try to fix the the problem by passing device_del() when device_add() fail[1], which is
> is not complete. And Greg thinks we should fix up users of device_del().
> 
> My question is do we need to check return value of register_disk() and fail __device_add_disk()
> when functions in it fail.
> Is there some historical reason that let us ignore various fail from __device_add_disk()
> and register_disk()?

Probably because the only way you can ever hit this is if you are
injecting faults into the system.  Can you ever normally fail the memory
allocation any other way under a normal system operation?

Try making a patch and see what it looks like...

good luck!

greg k-h

^ permalink raw reply	[flat|nested] 2+ messages in thread

end of thread, other threads:[~2021-04-07 12:55 UTC | newest]

Thread overview: 2+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
     [not found] <3d3dce33-a2d7-ebfa-c35e-55c2de7df7af@huawei.com>
2021-04-07 11:50 ` [Question] do we need fail __device_add_disk when functions in it return error Yufen Yu
2021-04-07 12:55   ` Greg Kroah-Hartman

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.