All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH] btrfs: Fix UAF during concurrent mount and device scan
@ 2020-01-09 11:02 Nikolay Borisov
  2020-01-09 14:44 ` David Sterba
                   ` (2 more replies)
  0 siblings, 3 replies; 7+ messages in thread
From: Nikolay Borisov @ 2020-01-09 11:02 UTC (permalink / raw)
  To: linux-btrfs; +Cc: jth, dsterba, Nikolay Borisov

Johannes' recent device close cleanup patches uncovered an UAF which was
triggered during device rescan, in device_list_add. In particular,
device_list_add was seeing a struct btrfs_device with a stale
fs_info member but valid bdev one. This caused btrfs_info_in_rcu to
crash inside btrfs_printk when the latter was dereferencing
device->fs_info. Kasan report looks like:

==================================================================
BUG: KASAN: use-after-free in device_list_add+0xf60/0x10e0 [btrfs]
BTRFS info (device dm-0): has skinny extents
Read of size 8 at addr ffff88810c70a348 by task systemd-udevd/10461

CPU: 3 PID: 10461 Comm: systemd-udevd Tainted: G        W    L    5.5.0-rc5-default #842
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.10.2-1ubuntu1 04/01/2014
Call Trace:
 dump_stack+0x75/0xa0
 ? device_list_add+0xf60/0x10e0 [btrfs]
 print_address_description.constprop.8+0x92/0x3c2
 ? device_list_add+0xf60/0x10e0 [btrfs]
 ? device_list_add+0xf60/0x10e0 [btrfs]
 __kasan_report.cold.10+0x1a/0x3b
 ? iput+0x40/0x600
 ? device_list_add+0xf60/0x10e0 [btrfs]
 kasan_report+0xe/0x20
 device_list_add+0xf60/0x10e0 [btrfs]
 ? btrfs_alloc_device+0x680/0x680 [btrfs]
 ? blkdev_get+0x166/0x220
 ? btrfs_scan_one_device+0x38f/0x530 [btrfs]
 btrfs_scan_one_device+0x38f/0x530 [btrfs]
 ? device_list_add+0x10e0/0x10e0 [btrfs]
 ? __mutex_lock_slowpath+0x10/0x10
 ? _copy_from_user+0x70/0xa0
 btrfs_control_ioctl+0xb8/0x210 [btrfs]
 do_vfs_ioctl+0x190/0xf10
 ? may_open_dev+0xc0/0xc0
 ? ___slab_alloc+0x4af/0x4f0
 ? compat_ioctl_preallocate+0x1b0/0x1b0
 ? __fsnotify_inode_delete+0x20/0x20
 ? __kasan_slab_free+0x1da/0x270
 ? do_sys_open+0x16b/0x350
 ? kmem_cache_free+0xab/0x330
 ksys_ioctl+0x3a/0x70
 __x64_sys_ioctl+0x6f/0xb0
 do_syscall_64+0x8c/0x3a0
 ? prepare_exit_to_usermode+0x1d/0x1a0
 entry_SYSCALL_64_after_hwframe+0x44/0xa9
RIP: 0033:0x7f3bca9caf47
RSP: 002b:00007ffd418e45b8 EFLAGS: 00000246 ORIG_RAX: 0000000000000010
RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 00007f3bca9caf47
RDX: 00007ffd418e45d0 RSI: 0000000090009427 RDI: 0000000000000007
RBP: 00007ffd418e45d0 R08: 302d6d642f766564 R09: 0000000000000003
R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000007
R13: 0000000000000000 R14: 000055c30563ab70 R15: 000055c305636870

Allocated by task 10425:
 save_stack+0x19/0x80
 __kasan_kmalloc.constprop.8+0xc1/0xd0
 kvmalloc_node+0x40/0x70
 btrfs_mount_root+0xc6/0xee0 [btrfs]
 legacy_get_tree+0xed/0x1d0
 vfs_get_tree+0x7d/0x230
 fc_mount+0xf/0x80
 vfs_kern_mount.part.44+0x5c/0x80
 btrfs_mount+0x220/0x1310 [btrfs]
 legacy_get_tree+0xed/0x1d0
 vfs_get_tree+0x7d/0x230
 do_mount+0xfbf/0x1560
 __x64_sys_mount+0x162/0x1b0
 do_syscall_64+0x8c/0x3a0
 entry_SYSCALL_64_after_hwframe+0x44/0xa9

Freed by task 10454:
 save_stack+0x19/0x80
 __kasan_slab_free+0x1c5/0x270
 kfree+0xc9/0x2a0
 deactivate_locked_super+0x74/0xc0
 deactivate_super+0x123/0x140
 cleanup_mnt+0x204/0x450
 task_work_run+0x104/0x180
 exit_to_usermode_loop+0xfa/0x120
 do_syscall_64+0x2d7/0x3a0
 entry_SYSCALL_64_after_hwframe+0x44/0xa9

Memory state around the buggy address:
 ffff88810c70a200: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
 ffff88810c70a280: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
>ffff88810c70a300: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
                                              ^
 ffff88810c70a380: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
 ffff88810c70a400: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
==================================================================

Instrumenting the code proved the theory:

mount-1231  [002] ....    46.498558: btrfs_init_devices_late: Updating fs_info for ffff888114e14400: old = 0000000000000000 new=ffff888110f2a000
umount-1271  [001] ....    46.562981: close_fs_devices.part.34: set bdev to NULL for ffff888114e14400 (STALE FS_DEVICES), fs_info= ffff888110f2a000
umount-1271  [001] ....    46.563317: btrfs_kill_super: Freeing fs_info: ffff888110f2a000
mount-1280  [001] ....    46.586577: open_fs_devices: Setting valid bdev to ffff888114e14400 (fs_info = ffff888110f2a000
systemd-udevd-1282  [003] ....    46.595672: device_list_add: found device with bdev: ffff888114e14400 fs_info: ffff888110f2a000
systemd-udevd-1282  [003] ....    46.603702: device_list_add: device: dm-0
systemd-udevd-1282  [003] ....    46.611201: device_list_add: found device with bdev: ffff888114e14400 fs_info: ffff888110f2a000
systemd-udevd-1282  [003] ....    46.611204: device_list_add: device: dm-0
systemd-udevd-1287  [003] .N..    46.646134: device_list_add: found device with bdev: ffff888114e14400 fs_info: ffff888110f2a000
systemd-udevd-1287  [003] .N..    46.646138: device_list_add: device: dm-0
systemd-udevd-1287  [003] ....    46.653470: device_list_add: found device with bdev: ffff888114e14400 fs_info: ffff888110f2a000
systemd-udevd-1287  [003] ....    46.653473: device_list_add: device: dm-0
mount-1280  [001] ....    46.696126: btrfs_init_devices_late: Updating fs_info for ffff888114e14400: old = ffff888110f2a000 new=ffff888111662000

This log shows how an fs  is being unmounted which causes device close
routine to be invoked. It sets bdev to NULL but doesn't reset fs_info.
Afterwards the fs_info itself is freed from btrfs_kill_super at the same
time the device is still anchored at fs_devices list. Subsequently a
mount is triggered which sets btrfs_fs_device::bdev to a valid value, yet
btrfs_fs_device::fs_info is still stale/freed. Before btrfs_fill_super
is called and re-initializes btrfs_fs_device::fs_info a concurrent device
scan is triggered, it finds the device with its ->bdev pointer set to
valid value and eventually calls btrfs_info_in_rcu in device_list_add
which causes the crash.

Simply setting btrfs_fs_device::fs_info to NULL prevents the crash but
doesn't fix the race. In fact the race cannot be solved because device
scan is asynchronous in its nature so it makes no sense to try and
synchronize it with pending mounts.

Fixes: cc1824fcd334 ("btrfs: reset device back to allocation state when removing")
Signed-off-by: Nikolay Borisov <nborisov@suse.com>
---

With this fix I can no longer get generic/085 to crash/generate the KASAN warning,
even with  an mdelay added in btrfs_mount_root which triggered the issue reliably.

 fs/btrfs/volumes.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/fs/btrfs/volumes.c b/fs/btrfs/volumes.c
index 65e78e59d5c4..ad8944cc4dd1 100644
--- a/fs/btrfs/volumes.c
+++ b/fs/btrfs/volumes.c
@@ -1086,6 +1086,7 @@ static void btrfs_close_one_device(struct btrfs_device *device)

 	atomic_set(&device->dev_stats_ccnt, 0);
 	extent_io_tree_release(&device->alloc_state);
+	device->fs_info = NULL;

 	/* Verify the device is back in a pristine state  */
 	ASSERT(!test_bit(BTRFS_DEV_STATE_FLUSH_SENT, &device->dev_state));
--
2.17.1


^ permalink raw reply related	[flat|nested] 7+ messages in thread

* Re: [PATCH] btrfs: Fix UAF during concurrent mount and device scan
  2020-01-09 11:02 [PATCH] btrfs: Fix UAF during concurrent mount and device scan Nikolay Borisov
@ 2020-01-09 14:44 ` David Sterba
  2020-01-09 15:46   ` Nikolay Borisov
  2020-01-13 17:06   ` David Sterba
  2020-01-09 15:20 ` Johannes Thumshirn
  2020-01-10  9:05 ` Anand Jain
  2 siblings, 2 replies; 7+ messages in thread
From: David Sterba @ 2020-01-09 14:44 UTC (permalink / raw)
  To: Nikolay Borisov; +Cc: linux-btrfs, jth, dsterba

On Thu, Jan 09, 2020 at 01:02:10PM +0200, Nikolay Borisov wrote:
> This log shows how an fs  is being unmounted which causes device close
> routine to be invoked. It sets bdev to NULL but doesn't reset fs_info.
> Afterwards the fs_info itself is freed from btrfs_kill_super at the same
> time the device is still anchored at fs_devices list. Subsequently a
> mount is triggered which sets btrfs_fs_device::bdev to a valid value, yet
> btrfs_fs_device::fs_info is still stale/freed. Before btrfs_fill_super
> is called and re-initializes btrfs_fs_device::fs_info a concurrent device
> scan is triggered, it finds the device with its ->bdev pointer set to
> valid value and eventually calls btrfs_info_in_rcu in device_list_add
> which causes the crash.
> 
> Simply setting btrfs_fs_device::fs_info to NULL prevents the crash but
> doesn't fix the race. In fact the race cannot be solved because device
> scan is asynchronous in its nature so it makes no sense to try and
> synchronize it with pending mounts.

We've had bugs when mount and scan raced, I don't see why you think this
cannot be fixed and synchronized in this case. At minimum I'd think that
the device_list_mutex should be enough as it's the one thing that
excludes mount and scan for the new and removed members of the device
lists etc.

> Fixes: cc1824fcd334 ("btrfs: reset device back to allocation state when removing")

That's still it misc-next so I'd rather fold it into the patch than to
have this split fix.

> Signed-off-by: Nikolay Borisov <nborisov@suse.com>
> ---
> 
> With this fix I can no longer get generic/085 to crash/generate the KASAN warning,
> even with  an mdelay added in btrfs_mount_root which triggered the issue reliably.
> 
>  fs/btrfs/volumes.c | 1 +
>  1 file changed, 1 insertion(+)
> 
> diff --git a/fs/btrfs/volumes.c b/fs/btrfs/volumes.c
> index 65e78e59d5c4..ad8944cc4dd1 100644
> --- a/fs/btrfs/volumes.c
> +++ b/fs/btrfs/volumes.c
> @@ -1086,6 +1086,7 @@ static void btrfs_close_one_device(struct btrfs_device *device)
> 
>  	atomic_set(&device->dev_stats_ccnt, 0);
>  	extent_io_tree_release(&device->alloc_state);
> +	device->fs_info = NULL;
> 
>  	/* Verify the device is back in a pristine state  */
>  	ASSERT(!test_bit(BTRFS_DEV_STATE_FLUSH_SENT, &device->dev_state));
> --
> 2.17.1

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH] btrfs: Fix UAF during concurrent mount and device scan
  2020-01-09 11:02 [PATCH] btrfs: Fix UAF during concurrent mount and device scan Nikolay Borisov
  2020-01-09 14:44 ` David Sterba
@ 2020-01-09 15:20 ` Johannes Thumshirn
  2020-01-10  9:05 ` Anand Jain
  2 siblings, 0 replies; 7+ messages in thread
From: Johannes Thumshirn @ 2020-01-09 15:20 UTC (permalink / raw)
  To: Nikolay Borisov, linux-btrfs; +Cc: dsterba

Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH] btrfs: Fix UAF during concurrent mount and device scan
  2020-01-09 14:44 ` David Sterba
@ 2020-01-09 15:46   ` Nikolay Borisov
  2020-01-13 17:06   ` David Sterba
  1 sibling, 0 replies; 7+ messages in thread
From: Nikolay Borisov @ 2020-01-09 15:46 UTC (permalink / raw)
  To: dsterba, linux-btrfs, jth, dsterba



On 9.01.20 г. 16:44 ч., David Sterba wrote:
> On Thu, Jan 09, 2020 at 01:02:10PM +0200, Nikolay Borisov wrote:
>> This log shows how an fs  is being unmounted which causes device close
>> routine to be invoked. It sets bdev to NULL but doesn't reset fs_info.
>> Afterwards the fs_info itself is freed from btrfs_kill_super at the same
>> time the device is still anchored at fs_devices list. Subsequently a
>> mount is triggered which sets btrfs_fs_device::bdev to a valid value, yet
>> btrfs_fs_device::fs_info is still stale/freed. Before btrfs_fill_super
>> is called and re-initializes btrfs_fs_device::fs_info a concurrent device
>> scan is triggered, it finds the device with its ->bdev pointer set to
>> valid value and eventually calls btrfs_info_in_rcu in device_list_add
>> which causes the crash.
>>
>> Simply setting btrfs_fs_device::fs_info to NULL prevents the crash but
>> doesn't fix the race. In fact the race cannot be solved because device
>> scan is asynchronous in its nature so it makes no sense to try and
>> synchronize it with pending mounts.
> 
> We've had bugs when mount and scan raced, I don't see why you think this
> cannot be fixed and synchronized in this case. At minimum I'd think that
> the device_list_mutex should be enough as it's the one thing that
> excludes mount and scan for the new and removed members of the device
> lists etc.

Yes, but this means mount will also need to take device_list_mutex for
the duration of the whole mount operation.

> 
>> Fixes: cc1824fcd334 ("btrfs: reset device back to allocation state when removing")
> 
> That's still it misc-next so I'd rather fold it into the patch than to
> have this split fix.
> 
>> Signed-off-by: Nikolay Borisov <nborisov@suse.com>
>> ---
>>
>> With this fix I can no longer get generic/085 to crash/generate the KASAN warning,
>> even with  an mdelay added in btrfs_mount_root which triggered the issue reliably.
>>
>>  fs/btrfs/volumes.c | 1 +
>>  1 file changed, 1 insertion(+)
>>
>> diff --git a/fs/btrfs/volumes.c b/fs/btrfs/volumes.c
>> index 65e78e59d5c4..ad8944cc4dd1 100644
>> --- a/fs/btrfs/volumes.c
>> +++ b/fs/btrfs/volumes.c
>> @@ -1086,6 +1086,7 @@ static void btrfs_close_one_device(struct btrfs_device *device)
>>
>>  	atomic_set(&device->dev_stats_ccnt, 0);
>>  	extent_io_tree_release(&device->alloc_state);
>> +	device->fs_info = NULL;
>>
>>  	/* Verify the device is back in a pristine state  */
>>  	ASSERT(!test_bit(BTRFS_DEV_STATE_FLUSH_SENT, &device->dev_state));
>> --
>> 2.17.1

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH] btrfs: Fix UAF during concurrent mount and device scan
  2020-01-09 11:02 [PATCH] btrfs: Fix UAF during concurrent mount and device scan Nikolay Borisov
  2020-01-09 14:44 ` David Sterba
  2020-01-09 15:20 ` Johannes Thumshirn
@ 2020-01-10  9:05 ` Anand Jain
  2 siblings, 0 replies; 7+ messages in thread
From: Anand Jain @ 2020-01-10  9:05 UTC (permalink / raw)
  To: Nikolay Borisov, linux-btrfs; +Cc: jth, dsterba

On 9/1/20 7:02 PM, Nikolay Borisov wrote:
> Johannes' recent device close cleanup patches uncovered an UAF which was


> triggered during device rescan, in device_list_add. In particular,
> device_list_add was seeing a struct btrfs_device with a stale
> fs_info member

> but valid bdev one.

   Strange. But a closed device shall have bdev NULL may be it was
   just closing, not yet completely closed?

----------
static void btrfs_close_one_device(struct btrfs_device *device)
::
         btrfs_close_bdev(device);
         if (device->bdev) {
                 fs_devices->open_devices--;
                 device->bdev = NULL;   <----
         }
         clear_bit(BTRFS_DEV_STATE_WRITEABLE, &device->dev_state);
----------

  So to me it looks like the thread which called device scan (systemd)
  is racing with some other thread,  do you see the any contending
  threads?

> This caused btrfs_info_in_rcu to
> crash inside btrfs_printk when the latter was dereferencing
> device->fs_info.

  Do you mean when dereferencing device::fs_info::sb ?
  A null device::fs_info we won't crash it will just print unknown.

------------
                 printk("%sBTRFS %s (device %s): %pV\n", lvl, type,
                         fs_info ? fs_info->sb->s_id : "<unknown>", &vaf);
-------------

> Kasan report looks like:
> 
> ==================================================================
> BUG: KASAN: use-after-free in device_list_add+0xf60/0x10e0 [btrfs]
> BTRFS info (device dm-0): has skinny extents
> Read of size 8 at addr ffff88810c70a348 by task systemd-udevd/10461
> 
> CPU: 3 PID: 10461 Comm: systemd-udevd Tainted: G        W    L    5.5.0-rc5-default #842
> Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.10.2-1ubuntu1 04/01/2014
> Call Trace:
>   dump_stack+0x75/0xa0
>   ? device_list_add+0xf60/0x10e0 [btrfs]
>   print_address_description.constprop.8+0x92/0x3c2
>   ? device_list_add+0xf60/0x10e0 [btrfs]
>   ? device_list_add+0xf60/0x10e0 [btrfs]
>   __kasan_report.cold.10+0x1a/0x3b
>   ? iput+0x40/0x600
>   ? device_list_add+0xf60/0x10e0 [btrfs]
>   kasan_report+0xe/0x20
>   device_list_add+0xf60/0x10e0 [btrfs]
>   ? btrfs_alloc_device+0x680/0x680 [btrfs]
>   ? blkdev_get+0x166/0x220
>   ? btrfs_scan_one_device+0x38f/0x530 [btrfs]
>   btrfs_scan_one_device+0x38f/0x530 [btrfs]
>   ? device_list_add+0x10e0/0x10e0 [btrfs]
>   ? __mutex_lock_slowpath+0x10/0x10
>   ? _copy_from_user+0x70/0xa0
>   btrfs_control_ioctl+0xb8/0x210 [btrfs]
>   do_vfs_ioctl+0x190/0xf10
>   ? may_open_dev+0xc0/0xc0
>   ? ___slab_alloc+0x4af/0x4f0
>   ? compat_ioctl_preallocate+0x1b0/0x1b0
>   ? __fsnotify_inode_delete+0x20/0x20
>   ? __kasan_slab_free+0x1da/0x270
>   ? do_sys_open+0x16b/0x350
>   ? kmem_cache_free+0xab/0x330
>   ksys_ioctl+0x3a/0x70
>   __x64_sys_ioctl+0x6f/0xb0
>   do_syscall_64+0x8c/0x3a0
>   ? prepare_exit_to_usermode+0x1d/0x1a0
>   entry_SYSCALL_64_after_hwframe+0x44/0xa9
> RIP: 0033:0x7f3bca9caf47
> RSP: 002b:00007ffd418e45b8 EFLAGS: 00000246 ORIG_RAX: 0000000000000010
> RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 00007f3bca9caf47
> RDX: 00007ffd418e45d0 RSI: 0000000090009427 RDI: 0000000000000007
> RBP: 00007ffd418e45d0 R08: 302d6d642f766564 R09: 0000000000000003
> R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000007
> R13: 0000000000000000 R14: 000055c30563ab70 R15: 000055c305636870
> 
> Allocated by task 10425:
>   save_stack+0x19/0x80
>   __kasan_kmalloc.constprop.8+0xc1/0xd0
>   kvmalloc_node+0x40/0x70
>   btrfs_mount_root+0xc6/0xee0 [btrfs]
>   legacy_get_tree+0xed/0x1d0
>   vfs_get_tree+0x7d/0x230
>   fc_mount+0xf/0x80
>   vfs_kern_mount.part.44+0x5c/0x80
>   btrfs_mount+0x220/0x1310 [btrfs]
>   legacy_get_tree+0xed/0x1d0
>   vfs_get_tree+0x7d/0x230
>   do_mount+0xfbf/0x1560
>   __x64_sys_mount+0x162/0x1b0
>   do_syscall_64+0x8c/0x3a0
>   entry_SYSCALL_64_after_hwframe+0x44/0xa9
> 
> Freed by task 10454:
>   save_stack+0x19/0x80
>   __kasan_slab_free+0x1c5/0x270
>   kfree+0xc9/0x2a0
>   deactivate_locked_super+0x74/0xc0
>   deactivate_super+0x123/0x140
>   cleanup_mnt+0x204/0x450
>   task_work_run+0x104/0x180
>   exit_to_usermode_loop+0xfa/0x120
>   do_syscall_64+0x2d7/0x3a0
>   entry_SYSCALL_64_after_hwframe+0x44/0xa9
> 
> Memory state around the buggy address:
>   ffff88810c70a200: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
>   ffff88810c70a280: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
>> ffff88810c70a300: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
>                                                ^
>   ffff88810c70a380: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
>   ffff88810c70a400: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
> ==================================================================
> 
> Instrumenting the code proved the theory:
> 
> mount-1231  [002] ....    46.498558: btrfs_init_devices_late: Updating fs_info for ffff888114e14400: old = 0000000000000000 new=ffff888110f2a000
> umount-1271  [001] ....    46.562981: close_fs_devices.part.34: set bdev to NULL for ffff888114e14400 (STALE FS_DEVICES), fs_info= ffff888110f2a000
> umount-1271  [001] ....    46.563317: btrfs_kill_super: Freeing fs_info: ffff888110f2a000
> mount-1280  [001] ....    46.586577: open_fs_devices: Setting valid bdev to ffff888114e14400 (fs_info = ffff888110f2a000
> systemd-udevd-1282  [003] ....    46.595672: device_list_add: found device with bdev: ffff888114e14400 fs_info: ffff888110f2a000
> systemd-udevd-1282  [003] ....    46.603702: device_list_add: device: dm-0
> systemd-udevd-1282  [003] ....    46.611201: device_list_add: found device with bdev: ffff888114e14400 fs_info: ffff888110f2a000
> systemd-udevd-1282  [003] ....    46.611204: device_list_add: device: dm-0
> systemd-udevd-1287  [003] .N..    46.646134: device_list_add: found device with bdev: ffff888114e14400 fs_info: ffff888110f2a000
> systemd-udevd-1287  [003] .N..    46.646138: device_list_add: device: dm-0
> systemd-udevd-1287  [003] ....    46.653470: device_list_add: found device with bdev: ffff888114e14400 fs_info: ffff888110f2a000
> systemd-udevd-1287  [003] ....    46.653473: device_list_add: device: dm-0
> mount-1280  [001] ....    46.696126: btrfs_init_devices_late: Updating fs_info for ffff888114e14400: old = ffff888110f2a000 new=ffff888111662000
> 
> This log shows how an fs  is being unmounted which causes device close
> routine to be invoked. It sets bdev to NULL but doesn't reset fs_info.
> Afterwards the fs_info itself is freed from btrfs_kill_super at the same
> time the device is still anchored at fs_devices list. Subsequently a
> mount is triggered which sets btrfs_fs_device::bdev to a valid value, yet
                                  ^^^
  device::bdev.

> btrfs_fs_device::fs_info is still stale/freed. Before btrfs_fill_super
> is called and re-initializes btrfs_fs_device::fs_info a concurrent device
> scan is triggered, it finds the device with its ->bdev pointer set to
> valid value and eventually calls btrfs_info_in_rcu in device_list_add
> which causes the crash.
> 
> Simply setting btrfs_fs_device::fs_info to NULL prevents the crash but
> doesn't fix the race. In fact the race cannot be solved because device
> scan is asynchronous in its nature so it makes no sense to try and
> synchronize it with pending mounts.
> 
> Fixes: cc1824fcd334 ("btrfs: reset device back to allocation state when removing")

  On the basics of that its fundamentally wrong to use the fs_info in
  device_list_add(), I have sent a fix. Which will indirectly fix this
  UAF, unfortunately. And as of now I am not convinced if this is
  related to cc1824fcd334. Few analysis items are missing IMO.

  Thanks for the report.

Anand

> Signed-off-by: Nikolay Borisov <nborisov@suse.com>
> ---
> 
> With this fix I can no longer get generic/085 to crash/generate the KASAN warning,
> even with  an mdelay added in btrfs_mount_root which triggered the issue reliably.
> 
>   fs/btrfs/volumes.c | 1 +
>   1 file changed, 1 insertion(+)
> 
> diff --git a/fs/btrfs/volumes.c b/fs/btrfs/volumes.c
> index 65e78e59d5c4..ad8944cc4dd1 100644
> --- a/fs/btrfs/volumes.c
> +++ b/fs/btrfs/volumes.c
> @@ -1086,6 +1086,7 @@ static void btrfs_close_one_device(struct btrfs_device *device)
> 
>   	atomic_set(&device->dev_stats_ccnt, 0);
>   	extent_io_tree_release(&device->alloc_state);
> +	device->fs_info = NULL;
> 
>   	/* Verify the device is back in a pristine state  */
>   	ASSERT(!test_bit(BTRFS_DEV_STATE_FLUSH_SENT, &device->dev_state));
> --
> 2.17.1
> 


^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH] btrfs: Fix UAF during concurrent mount and device scan
  2020-01-09 14:44 ` David Sterba
  2020-01-09 15:46   ` Nikolay Borisov
@ 2020-01-13 17:06   ` David Sterba
  2020-01-13 17:27     ` Johannes Thumshirn
  1 sibling, 1 reply; 7+ messages in thread
From: David Sterba @ 2020-01-13 17:06 UTC (permalink / raw)
  To: dsterba, Nikolay Borisov, linux-btrfs, jth, dsterba

On Thu, Jan 09, 2020 at 03:44:44PM +0100, David Sterba wrote:
> On Thu, Jan 09, 2020 at 01:02:10PM +0200, Nikolay Borisov wrote:
> > This log shows how an fs  is being unmounted which causes device close
> > routine to be invoked. It sets bdev to NULL but doesn't reset fs_info.
> > Afterwards the fs_info itself is freed from btrfs_kill_super at the same
> > time the device is still anchored at fs_devices list. Subsequently a
> > mount is triggered which sets btrfs_fs_device::bdev to a valid value, yet
> > btrfs_fs_device::fs_info is still stale/freed. Before btrfs_fill_super
> > is called and re-initializes btrfs_fs_device::fs_info a concurrent device
> > scan is triggered, it finds the device with its ->bdev pointer set to
> > valid value and eventually calls btrfs_info_in_rcu in device_list_add
> > which causes the crash.
> > 
> > Simply setting btrfs_fs_device::fs_info to NULL prevents the crash but
> > doesn't fix the race. In fact the race cannot be solved because device
> > scan is asynchronous in its nature so it makes no sense to try and
> > synchronize it with pending mounts.
> 
> We've had bugs when mount and scan raced, I don't see why you think this
> cannot be fixed and synchronized in this case. At minimum I'd think that
> the device_list_mutex should be enough as it's the one thing that
> excludes mount and scan for the new and removed members of the device
> lists etc.
> 
> > Fixes: cc1824fcd334 ("btrfs: reset device back to allocation state when removing")
> 
> That's still it misc-next so I'd rather fold it into the patch than to
> have this split fix.

Now folded to "btrfs: reset device back to allocation state when
removing" and pushed.

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH] btrfs: Fix UAF during concurrent mount and device scan
  2020-01-13 17:06   ` David Sterba
@ 2020-01-13 17:27     ` Johannes Thumshirn
  0 siblings, 0 replies; 7+ messages in thread
From: Johannes Thumshirn @ 2020-01-13 17:27 UTC (permalink / raw)
  To: dsterba, Nikolay Borisov, linux-btrfs, jth, dsterba

On 13/01/2020, 18:06, "linux-btrfs-owner@vger.kernel.org on behalf of David Sterba" <linux-btrfs-owner@vger.kernel.org on behalf of dsterba@suse.cz> wrote:

[...]
    
    Now folded to "btrfs: reset device back to allocation state when
    removing" and pushed.
    
Thanks to you both


^ permalink raw reply	[flat|nested] 7+ messages in thread

end of thread, other threads:[~2020-01-13 17:27 UTC | newest]

Thread overview: 7+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2020-01-09 11:02 [PATCH] btrfs: Fix UAF during concurrent mount and device scan Nikolay Borisov
2020-01-09 14:44 ` David Sterba
2020-01-09 15:46   ` Nikolay Borisov
2020-01-13 17:06   ` David Sterba
2020-01-13 17:27     ` Johannes Thumshirn
2020-01-09 15:20 ` Johannes Thumshirn
2020-01-10  9:05 ` Anand Jain

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.