general protection fault in del_gendisk

From: Tadeusz Struk @ 2021-10-29 19:13 UTC
  To: Jens Axboe; +Cc: linux-block

Hi,
I'm looking at a bug found by the syzkaller robot [1], and I just wanted
to confirm that my understanding is correct and that the issue can be closed.
First, the kernel is configured with some fault injections enabled:

CONFIG_FAULT_INJECTION=y
CONFIG_FAILSLAB=y
CONFIG_FAIL_PAGE_ALLOC=y

The test adds loop devices, which causes some entries in sysfs to be created.
It then issues ioctls that call:
__device_add_disk() -> register_disk()
which eventually reaches sysfs_create_files() and fails there,
in line 627 [2], because the fault injector forces a failure.
That can be seen in the trace [3]:
[   34.089707][ T1813] FAULT_INJECTION: forcing a failure.

The sysfs code returns -ENOMEM, but because the __device_add_disk()
implementation mostly calls void functions and doesn't return on errors [4],
it carries on and hits some warnings, like:
disk_add_events() -> sysfs_create_files() -> sysfs_create_file_ns() -> WARN()
and eventually triggers a general protection fault in the sysfs code and panics there.
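
To illustrate the pattern I mean, here is a small user-space model. It is
not kernel code: every toy_* name is made up for this mail, and the "missing
directory" flag only stands in for the point where the real code would touch
an object that was never created.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for a disk and its sysfs directory. */
struct toy_disk {
	int *sysfs_dir;        /* stays NULL when creation failed */
	int warnings;          /* counts WARN()-like events */
	bool used_missing_dir; /* set where real code would fault */
};

/* Stand-in for fault injection: force the next allocation to fail. */
bool toy_inject_failure;

static int toy_create_dir(struct toy_disk *d)
{
	if (toy_inject_failure)
		return -ENOMEM;
	d->sysfs_dir = calloc(1, sizeof(*d->sysfs_dir));
	return d->sysfs_dir ? 0 : -ENOMEM;
}

/* void helper: the error is noticed but cannot reach the caller. */
static void toy_register(struct toy_disk *d)
{
	if (toy_create_dir(d))
		d->warnings++;
}

/* Later step that assumes the directory exists. */
static void toy_add_events(struct toy_disk *d)
{
	if (!d->sysfs_dir) {
		d->warnings++;
		d->used_missing_dir = true;
		return;
	}
	(*d->sysfs_dir)++;
}

/* void entry point: setup continues even after the first step failed. */
void toy_add_disk_void(struct toy_disk *d)
{
	assert(d != NULL);
	toy_register(d);
	toy_add_events(d);
}

void toy_release(struct toy_disk *d)
{
	free(d->sysfs_dir);
	d->sysfs_dir = NULL;
}
```

With toy_inject_failure set, toy_add_disk_void() still runs the second step
against a disk that has no directory, and the caller has no way to find out.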

I think for this to recover and return an error to the caller via ioctl(),
the __device_add_disk() code would need to be reworked to handle errors
and return them to the caller.
My question is: is it implemented like this by design? Are there any plans
to make it fail more gracefully?
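
For reference, the kind of rework I have in mind looks roughly like the
user-space sketch below. It is only an illustration of returning an int and
unwinding with goto; the sketch_* names and the two-step setup are invented
here and are not the actual genhd.c code.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical disk with two resources that are set up in order. */
struct sketch_disk {
	void *dir;    /* stand-in for the sysfs directory */
	void *events; /* stand-in for the event attributes */
};

/* Stand-in for fault injection: step 1 or 2 fails, 0 means no failure. */
int sketch_fail_step;

static int sketch_alloc(void **slot, int step)
{
	if (sketch_fail_step == step)
		return -ENOMEM;
	*slot = malloc(1);
	return *slot ? 0 : -ENOMEM;
}

/* Returns 0 on success or a negative errno, leaving nothing half set up. */
int sketch_add_disk(struct sketch_disk *d)
{
	int ret;

	assert(d != NULL);

	ret = sketch_alloc(&d->dir, 1);
	if (ret)
		return ret;

	ret = sketch_alloc(&d->events, 2);
	if (ret)
		goto out_free_dir;

	return 0;

out_free_dir:
	/* Undo the first step so the caller sees a clean failure. */
	free(d->dir);
	d->dir = NULL;
	return ret;
}

void sketch_del_disk(struct sketch_disk *d)
{
	free(d->events);
	free(d->dir);
	d->events = NULL;
	d->dir = NULL;
}
```

With this shape, an ioctl handler could pass the negative errno straight back
to user space instead of continuing with a partially registered disk.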

[1] https://syzkaller.appspot.com/bug?id=c234dd5151b92650adff0683a8c567c269fb39e5
[2] https://elixir.bootlin.com/linux/v5.14.15/source/fs/kernfs/dir.c#L583
[3] https://syzkaller.appspot.com/text?tag=CrashLog&x=113d8bfcb00000
[4] https://elixir.bootlin.com/linux/v5.14.15/source/block/genhd.c#L532

-- 
Thanks,
Tadeusz
