All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH 0/3] Handle bad dev_root properly with rescue=all
@ 2021-03-11 16:23 Josef Bacik
  2021-03-11 16:23 ` [PATCH 1/3] btrfs: init devices always Josef Bacik
                   ` (5 more replies)
  0 siblings, 6 replies; 15+ messages in thread
From: Josef Bacik @ 2021-03-11 16:23 UTC (permalink / raw)
  To: linux-btrfs, kernel-team

Hello,

My recent debugging session with Neal's broken filesystem uncovered a glaring
hole in my rescue=all patches, they don't deal with a NULL dev_root properly.
In testing I only ever tested corrupting the extent tree or the csum tree, since
those are the most problematic.  The following 3 fixes allowed Neal to get
rescue=all working without panicing the machine, and I verified everything by
using btrfs-corrupt-block to corrupt a dev root of a file system.  Thanks,

Josef

Josef Bacik (3):
  btrfs: init devices always
  btrfs: do not init dev stats if we have no dev_root
  btrfs: don't init dev replace for bad dev root

 fs/btrfs/dev-replace.c | 3 +++
 fs/btrfs/disk-io.c     | 2 +-
 fs/btrfs/volumes.c     | 3 +++
 3 files changed, 7 insertions(+), 1 deletion(-)

-- 
2.26.2


^ permalink raw reply	[flat|nested] 15+ messages in thread

* [PATCH 1/3] btrfs: init devices always
  2021-03-11 16:23 [PATCH 0/3] Handle bad dev_root properly with rescue=all Josef Bacik
@ 2021-03-11 16:23 ` Josef Bacik
  2021-03-12  5:52   ` Anand Jain
  2021-03-12  5:58   ` Anand Jain
  2021-03-11 16:23 ` [PATCH 2/3] btrfs: do not init dev stats if we have no dev_root Josef Bacik
                   ` (4 subsequent siblings)
  5 siblings, 2 replies; 15+ messages in thread
From: Josef Bacik @ 2021-03-11 16:23 UTC (permalink / raw)
  To: linux-btrfs, kernel-team; +Cc: Neal Gompa

Neal reported a panic trying to use -o rescue=all

BUG: kernel NULL pointer dereference, address: 0000000000000030
PGD 0 P4D 0
Oops: 0000 [#1] SMP NOPTI
CPU: 0 PID: 696 Comm: mount Tainted: G        W         5.12.0-rc2+ #296
Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.13.0-2.fc32 04/01/2014
RIP: 0010:btrfs_device_init_dev_stats+0x1d/0x200
RSP: 0018:ffffafaec1483bb8 EFLAGS: 00010286
RAX: 0000000000000000 RBX: ffff9a5715bcb298 RCX: 0000000000000070
RDX: ffff9a5703248000 RSI: ffff9a57052ea150 RDI: ffff9a5715bca400
RBP: ffff9a57052ea150 R08: 0000000000000070 R09: ffff9a57052ea150
R10: 000130faf0741c10 R11: 0000000000000000 R12: ffff9a5703700000
R13: 0000000000000000 R14: ffff9a5715bcb278 R15: ffff9a57052ea150
FS:  00007f600d122c40(0000) GS:ffff9a577bc00000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 0000000000000030 CR3: 0000000112a46005 CR4: 0000000000370ef0
Call Trace:
 ? btrfs_init_dev_stats+0x1f/0xf0
 ? kmem_cache_alloc+0xef/0x1f0
 btrfs_init_dev_stats+0x5f/0xf0
 open_ctree+0x10cb/0x1720
 btrfs_mount_root.cold+0x12/0xea
 legacy_get_tree+0x27/0x40
 vfs_get_tree+0x25/0xb0
 vfs_kern_mount.part.0+0x71/0xb0
 btrfs_mount+0x10d/0x380
 legacy_get_tree+0x27/0x40
 vfs_get_tree+0x25/0xb0
 path_mount+0x433/0xa00
 __x64_sys_mount+0xe3/0x120
 do_syscall_64+0x33/0x40
 entry_SYSCALL_64_after_hwframe+0x44/0xae

This happens because when we call btrfs_init_dev_stats we do
device->fs_info->dev_root.  However device->fs_info isn't init'ed
because we were only calling btrfs_init_devices_late() if we properly
read the device root.  However we don't actually need the device root to
init the devices, this function simply assigns the devices their
->fs_info pointer properly, so this needs to be done unconditionally
always so that we can properly deref device->fs_info in rescue cases.

Reported-by: Neal Gompa <ngompa13@gmail.com>
Signed-off-by: Josef Bacik <josef@toxicpanda.com>
---
 fs/btrfs/disk-io.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/fs/btrfs/disk-io.c b/fs/btrfs/disk-io.c
index 41b718cfea40..63656bf23ff2 100644
--- a/fs/btrfs/disk-io.c
+++ b/fs/btrfs/disk-io.c
@@ -2387,8 +2387,8 @@ static int btrfs_read_roots(struct btrfs_fs_info *fs_info)
 	} else {
 		set_bit(BTRFS_ROOT_TRACK_DIRTY, &root->state);
 		fs_info->dev_root = root;
-		btrfs_init_devices_late(fs_info);
 	}
+	btrfs_init_devices_late(fs_info);
 
 	/* If IGNOREDATACSUMS is set don't bother reading the csum root. */
 	if (!btrfs_test_opt(fs_info, IGNOREDATACSUMS)) {
-- 
2.26.2


^ permalink raw reply related	[flat|nested] 15+ messages in thread

* [PATCH 2/3] btrfs: do not init dev stats if we have no dev_root
  2021-03-11 16:23 [PATCH 0/3] Handle bad dev_root properly with rescue=all Josef Bacik
  2021-03-11 16:23 ` [PATCH 1/3] btrfs: init devices always Josef Bacik
@ 2021-03-11 16:23 ` Josef Bacik
  2021-03-12  5:59   ` Anand Jain
  2021-03-11 16:23 ` [PATCH 3/3] btrfs: don't init dev replace for bad dev root Josef Bacik
                   ` (3 subsequent siblings)
  5 siblings, 1 reply; 15+ messages in thread
From: Josef Bacik @ 2021-03-11 16:23 UTC (permalink / raw)
  To: linux-btrfs, kernel-team; +Cc: Neal Gompa

Neal reported a panic trying to use -o rescue=all

BUG: kernel NULL pointer dereference, address: 0000000000000030
PGD 0 P4D 0
Oops: 0000 [#1] SMP PTI
CPU: 0 PID: 4095 Comm: mount Not tainted 5.11.0-0.rc7.149.fc34.x86_64 #1
RIP: 0010:btrfs_device_init_dev_stats+0x4c/0x1f0
RSP: 0018:ffffa60285fbfb68 EFLAGS: 00010246
RAX: 0000000000000000 RBX: ffff88b88f806498 RCX: ffff88b82e7a2a10
RDX: ffffa60285fbfb97 RSI: ffff88b82e7a2a10 RDI: 0000000000000000
RBP: ffff88b88f806b3c R08: 0000000000000000 R09: 0000000000000000
R10: ffff88b82e7a2a10 R11: 0000000000000000 R12: ffff88b88f806a00
R13: ffff88b88f806478 R14: ffff88b88f806a00 R15: ffff88b82e7a2a10
FS:  00007f698be1ec40(0000) GS:ffff88b937e00000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 0000000000000030 CR3: 0000000092c9c006 CR4: 00000000003706f0
Call Trace:
? btrfs_init_dev_stats+0x1f/0xf0
btrfs_init_dev_stats+0x62/0xf0
open_ctree+0x1019/0x15ff
btrfs_mount_root.cold+0x13/0xfa
legacy_get_tree+0x27/0x40
vfs_get_tree+0x25/0xb0
vfs_kern_mount.part.0+0x71/0xb0
btrfs_mount+0x131/0x3d0
? legacy_get_tree+0x27/0x40
? btrfs_show_options+0x640/0x640
legacy_get_tree+0x27/0x40
vfs_get_tree+0x25/0xb0
path_mount+0x441/0xa80
__x64_sys_mount+0xf4/0x130
do_syscall_64+0x33/0x40
entry_SYSCALL_64_after_hwframe+0x44/0xa9
RIP: 0033:0x7f698c04e52e

This happens because we unconditionally attempt to init device stats on
mount, but we may not have been able to read the device root.  Fix this
by skipping init'ing the device stats if we do not have a device root.

Reported-by: Neal Gompa <ngompa13@gmail.com>
Signed-off-by: Josef Bacik <josef@toxicpanda.com>
---
 fs/btrfs/volumes.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/fs/btrfs/volumes.c b/fs/btrfs/volumes.c
index 995920fcce9b..d4ca721c1d91 100644
--- a/fs/btrfs/volumes.c
+++ b/fs/btrfs/volumes.c
@@ -7448,6 +7448,9 @@ static int btrfs_device_init_dev_stats(struct btrfs_device *device,
 	int item_size;
 	int i, ret, slot;
 
+	if (!device->fs_info->dev_root)
+		return 0;
+
 	key.objectid = BTRFS_DEV_STATS_OBJECTID;
 	key.type = BTRFS_PERSISTENT_ITEM_KEY;
 	key.offset = device->devid;
-- 
2.26.2


^ permalink raw reply related	[flat|nested] 15+ messages in thread

* [PATCH 3/3] btrfs: don't init dev replace for bad dev root
  2021-03-11 16:23 [PATCH 0/3] Handle bad dev_root properly with rescue=all Josef Bacik
  2021-03-11 16:23 ` [PATCH 1/3] btrfs: init devices always Josef Bacik
  2021-03-11 16:23 ` [PATCH 2/3] btrfs: do not init dev stats if we have no dev_root Josef Bacik
@ 2021-03-11 16:23 ` Josef Bacik
  2021-03-12  6:50   ` Anand Jain
  2021-03-11 19:18 ` [PATCH 0/3] Handle bad dev_root properly with rescue=all Neal Gompa
                   ` (2 subsequent siblings)
  5 siblings, 1 reply; 15+ messages in thread
From: Josef Bacik @ 2021-03-11 16:23 UTC (permalink / raw)
  To: linux-btrfs, kernel-team; +Cc: Neal Gompa

While helping Neal fix his broken file system I added a debug patch to
catch if we were calling btrfs_search_slot with a NULL root, and this
stack trace popped

we tried to search with a NULL root
CPU: 0 PID: 1760 Comm: mount Not tainted 5.11.0-155.nealbtrfstest.1.fc34.x86_64 #1
Hardware name: VMware, Inc. VMware Virtual Platform/440BX Desktop Reference Platform, BIOS 6.00 07/22/2020
Call Trace:
 dump_stack+0x6b/0x83
 btrfs_search_slot.cold+0x11/0x1b
 ? btrfs_init_dev_replace+0x36/0x450
 btrfs_init_dev_replace+0x71/0x450
 open_ctree+0x1054/0x1610
 btrfs_mount_root.cold+0x13/0xfa
 legacy_get_tree+0x27/0x40
 vfs_get_tree+0x25/0xb0
 vfs_kern_mount.part.0+0x71/0xb0
 btrfs_mount+0x131/0x3d0
 ? legacy_get_tree+0x27/0x40
 ? btrfs_show_options+0x640/0x640
 legacy_get_tree+0x27/0x40
 vfs_get_tree+0x25/0xb0
 path_mount+0x441/0xa80
 __x64_sys_mount+0xf4/0x130
 do_syscall_64+0x33/0x40
 entry_SYSCALL_64_after_hwframe+0x44/0xa9
RIP: 0033:0x7f644730352e

Fix this by not starting the device replace stuff if we do not have a
NULL dev root.

Reported-by: Neal Gompa <ngompa13@gmail.com>
Signed-off-by: Josef Bacik <josef@toxicpanda.com>
---
 fs/btrfs/dev-replace.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/fs/btrfs/dev-replace.c b/fs/btrfs/dev-replace.c
index 3a9c1e046ebe..d05f73530af7 100644
--- a/fs/btrfs/dev-replace.c
+++ b/fs/btrfs/dev-replace.c
@@ -81,6 +81,9 @@ int btrfs_init_dev_replace(struct btrfs_fs_info *fs_info)
 	struct btrfs_dev_replace_item *ptr;
 	u64 src_devid;
 
+	if (!dev_root)
+		return 0;
+
 	path = btrfs_alloc_path();
 	if (!path) {
 		ret = -ENOMEM;
-- 
2.26.2


^ permalink raw reply related	[flat|nested] 15+ messages in thread

* Re: [PATCH 0/3] Handle bad dev_root properly with rescue=all
  2021-03-11 16:23 [PATCH 0/3] Handle bad dev_root properly with rescue=all Josef Bacik
                   ` (2 preceding siblings ...)
  2021-03-11 16:23 ` [PATCH 3/3] btrfs: don't init dev replace for bad dev root Josef Bacik
@ 2021-03-11 19:18 ` Neal Gompa
  2021-03-17 12:27 ` David Sterba
  2021-03-18 15:43 ` David Sterba
  5 siblings, 0 replies; 15+ messages in thread
From: Neal Gompa @ 2021-03-11 19:18 UTC (permalink / raw)
  To: Josef Bacik; +Cc: Btrfs BTRFS, kernel-team

On Thu, Mar 11, 2021 at 11:25 AM Josef Bacik <josef@toxicpanda.com> wrote:
>
> Hello,
>
> My recent debugging session with Neal's broken filesystem uncovered a glaring
> hole in my rescue=all patches, they don't deal with a NULL dev_root properly.
> In testing I only ever tested corrupting the extent tree or the csum tree, since
> those are the most problematic.  The following 3 fixes allowed Neal to get
> rescue=all working without panicing the machine, and I verified everything by
> using btrfs-corrupt-block to corrupt a dev root of a file system.  Thanks,
>
> Josef
>
> Josef Bacik (3):
>   btrfs: init devices always
>   btrfs: do not init dev stats if we have no dev_root
>   btrfs: don't init dev replace for bad dev root
>
>  fs/btrfs/dev-replace.c | 3 +++
>  fs/btrfs/disk-io.c     | 2 +-
>  fs/btrfs/volumes.c     | 3 +++
>  3 files changed, 7 insertions(+), 1 deletion(-)
>
> --
> 2.26.2
>

I'm very happy that my freak disk controller issue yielded some good
fixes for Btrfs rescue mode code. :)

Reviewed-by: Neal Gompa <ngompa13@gmail.com>


-- 
真実はいつも一つ!/ Always, there's only one truth!

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH 1/3] btrfs: init devices always
  2021-03-11 16:23 ` [PATCH 1/3] btrfs: init devices always Josef Bacik
@ 2021-03-12  5:52   ` Anand Jain
  2021-03-12  5:57     ` Anand Jain
  2021-03-12  5:58   ` Anand Jain
  1 sibling, 1 reply; 15+ messages in thread
From: Anand Jain @ 2021-03-12  5:52 UTC (permalink / raw)
  To: Josef Bacik, linux-btrfs, kernel-team; +Cc: Neal Gompa

On 12/3/21 12:23 am, Josef Bacik wrote:
> Neal reported a panic trying to use -o rescue=all
> 
> BUG: kernel NULL pointer dereference, address: 0000000000000030
> PGD 0 P4D 0
> Oops: 0000 [#1] SMP NOPTI
> CPU: 0 PID: 696 Comm: mount Tainted: G        W         5.12.0-rc2+ #296
> Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.13.0-2.fc32 04/01/2014
> RIP: 0010:btrfs_device_init_dev_stats+0x1d/0x200
> RSP: 0018:ffffafaec1483bb8 EFLAGS: 00010286
> RAX: 0000000000000000 RBX: ffff9a5715bcb298 RCX: 0000000000000070
> RDX: ffff9a5703248000 RSI: ffff9a57052ea150 RDI: ffff9a5715bca400
> RBP: ffff9a57052ea150 R08: 0000000000000070 R09: ffff9a57052ea150
> R10: 000130faf0741c10 R11: 0000000000000000 R12: ffff9a5703700000
> R13: 0000000000000000 R14: ffff9a5715bcb278 R15: ffff9a57052ea150
> FS:  00007f600d122c40(0000) GS:ffff9a577bc00000(0000) knlGS:0000000000000000
> CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> CR2: 0000000000000030 CR3: 0000000112a46005 CR4: 0000000000370ef0
> Call Trace:
>   ? btrfs_init_dev_stats+0x1f/0xf0
>   ? kmem_cache_alloc+0xef/0x1f0
>   btrfs_init_dev_stats+0x5f/0xf0
>   open_ctree+0x10cb/0x1720
>   btrfs_mount_root.cold+0x12/0xea
>   legacy_get_tree+0x27/0x40
>   vfs_get_tree+0x25/0xb0
>   vfs_kern_mount.part.0+0x71/0xb0
>   btrfs_mount+0x10d/0x380
>   legacy_get_tree+0x27/0x40
>   vfs_get_tree+0x25/0xb0
>   path_mount+0x433/0xa00
>   __x64_sys_mount+0xe3/0x120
>   do_syscall_64+0x33/0x40
>   entry_SYSCALL_64_after_hwframe+0x44/0xae
> 
> This happens because when we call btrfs_init_dev_stats we do
> device->fs_info->dev_root.  However device->fs_info isn't init'ed
> because we were only calling btrfs_init_devices_late() if we properly
> read the device root.  


> However we don't actually need the device root to
> init the devices, this function simply assigns the devices their
> ->fs_info pointer properly, so this needs to be done unconditionally
> always so that we can properly deref device->fs_info in rescue cases.
  btrfs_device_init_dev_stats() calls btrfs_search_slot() leading
  to btrfs_search_slot_get_root(), and does de-reference root (dev_root)
  to get fs_info.

-------------
  static int btrfs_device_init_dev_stats(struct btrfs_device *device,
                                        struct btrfs_path *path)
::
         ret = btrfs_search_slot(NULL, device->fs_info->dev_root, &key, 
path, 0, 0);


int btrfs_search_slot(struct btrfs_trans_handle *trans, struct 
btrfs_root *root, ...)
::
         b = btrfs_search_slot_get_root(root, p, write_lock_level);


static struct extent_buffer *btrfs_search_slot_get_root(struct 
btrfs_root *root, ...)
{
         struct btrfs_fs_info *fs_info = root->fs_info;
--------------

  Can we allocate a dummy dev_root and set its dev_root::fs_info?

Thanks, Anand


> Reported-by: Neal Gompa <ngompa13@gmail.com>
> Signed-off-by: Josef Bacik <josef@toxicpanda.com>
> ---
>   fs/btrfs/disk-io.c | 2 +-
>   1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/fs/btrfs/disk-io.c b/fs/btrfs/disk-io.c
> index 41b718cfea40..63656bf23ff2 100644
> --- a/fs/btrfs/disk-io.c
> +++ b/fs/btrfs/disk-io.c
> @@ -2387,8 +2387,8 @@ static int btrfs_read_roots(struct btrfs_fs_info *fs_info)
>   	} else {
>   		set_bit(BTRFS_ROOT_TRACK_DIRTY, &root->state);
>   		fs_info->dev_root = root;
> -		btrfs_init_devices_late(fs_info);
>   	}
> +	btrfs_init_devices_late(fs_info);
>   
>   	/* If IGNOREDATACSUMS is set don't bother reading the csum root. */
>   	if (!btrfs_test_opt(fs_info, IGNOREDATACSUMS)) {
> 


^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH 1/3] btrfs: init devices always
  2021-03-12  5:52   ` Anand Jain
@ 2021-03-12  5:57     ` Anand Jain
  2021-03-17 11:03       ` David Sterba
  0 siblings, 1 reply; 15+ messages in thread
From: Anand Jain @ 2021-03-12  5:57 UTC (permalink / raw)
  To: Josef Bacik, linux-btrfs, kernel-team; +Cc: Neal Gompa



On 12/3/21 1:52 pm, Anand Jain wrote:
> On 12/3/21 12:23 am, Josef Bacik wrote:
>> Neal reported a panic trying to use -o rescue=all
>>
>> BUG: kernel NULL pointer dereference, address: 0000000000000030
>> PGD 0 P4D 0
>> Oops: 0000 [#1] SMP NOPTI
>> CPU: 0 PID: 696 Comm: mount Tainted: G        W         5.12.0-rc2+ #296
>> Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.13.0-2.fc32 
>> 04/01/2014
>> RIP: 0010:btrfs_device_init_dev_stats+0x1d/0x200
>> RSP: 0018:ffffafaec1483bb8 EFLAGS: 00010286
>> RAX: 0000000000000000 RBX: ffff9a5715bcb298 RCX: 0000000000000070
>> RDX: ffff9a5703248000 RSI: ffff9a57052ea150 RDI: ffff9a5715bca400
>> RBP: ffff9a57052ea150 R08: 0000000000000070 R09: ffff9a57052ea150
>> R10: 000130faf0741c10 R11: 0000000000000000 R12: ffff9a5703700000
>> R13: 0000000000000000 R14: ffff9a5715bcb278 R15: ffff9a57052ea150
>> FS:  00007f600d122c40(0000) GS:ffff9a577bc00000(0000) 
>> knlGS:0000000000000000
>> CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
>> CR2: 0000000000000030 CR3: 0000000112a46005 CR4: 0000000000370ef0
>> Call Trace:
>>   ? btrfs_init_dev_stats+0x1f/0xf0
>>   ? kmem_cache_alloc+0xef/0x1f0
>>   btrfs_init_dev_stats+0x5f/0xf0
>>   open_ctree+0x10cb/0x1720
>>   btrfs_mount_root.cold+0x12/0xea
>>   legacy_get_tree+0x27/0x40
>>   vfs_get_tree+0x25/0xb0
>>   vfs_kern_mount.part.0+0x71/0xb0
>>   btrfs_mount+0x10d/0x380
>>   legacy_get_tree+0x27/0x40
>>   vfs_get_tree+0x25/0xb0
>>   path_mount+0x433/0xa00
>>   __x64_sys_mount+0xe3/0x120
>>   do_syscall_64+0x33/0x40
>>   entry_SYSCALL_64_after_hwframe+0x44/0xae
>>
>> This happens because when we call btrfs_init_dev_stats we do
>> device->fs_info->dev_root.  However device->fs_info isn't init'ed
>> because we were only calling btrfs_init_devices_late() if we properly
>> read the device root. 
> 
> 
>> However we don't actually need the device root to
>> init the devices, this function simply assigns the devices their
>> ->fs_info pointer properly, so this needs to be done unconditionally
>> always so that we can properly deref device->fs_info in rescue cases.


>   btrfs_device_init_dev_stats() calls btrfs_search_slot() leading
>   to btrfs_search_slot_get_root(), and does de-reference root (dev_root)
>   to get fs_info.

  Never mind. patch 2/3 handles it. Spoke too early.
  Maybe can reorder the patches during integration.
-Anand


> -------------
>   static int btrfs_device_init_dev_stats(struct btrfs_device *device,
>                                         struct btrfs_path *path)
> ::
>          ret = btrfs_search_slot(NULL, device->fs_info->dev_root, &key, 
> path, 0, 0);
> 
> 
> int btrfs_search_slot(struct btrfs_trans_handle *trans, struct 
> btrfs_root *root, ...)
> ::
>          b = btrfs_search_slot_get_root(root, p, write_lock_level);
> 
> 
> static struct extent_buffer *btrfs_search_slot_get_root(struct 
> btrfs_root *root, ...)
> {
>          struct btrfs_fs_info *fs_info = root->fs_info;
> --------------
> 
>   Can we allocate a dummy dev_root and set its dev_root::fs_info?



> Thanks, Anand
> 
> 
>> Reported-by: Neal Gompa <ngompa13@gmail.com>
>> Signed-off-by: Josef Bacik <josef@toxicpanda.com>
>> ---
>>   fs/btrfs/disk-io.c | 2 +-
>>   1 file changed, 1 insertion(+), 1 deletion(-)
>>
>> diff --git a/fs/btrfs/disk-io.c b/fs/btrfs/disk-io.c
>> index 41b718cfea40..63656bf23ff2 100644
>> --- a/fs/btrfs/disk-io.c
>> +++ b/fs/btrfs/disk-io.c
>> @@ -2387,8 +2387,8 @@ static int btrfs_read_roots(struct btrfs_fs_info 
>> *fs_info)
>>       } else {
>>           set_bit(BTRFS_ROOT_TRACK_DIRTY, &root->state);
>>           fs_info->dev_root = root;
>> -        btrfs_init_devices_late(fs_info);
>>       }
>> +    btrfs_init_devices_late(fs_info);
>>       /* If IGNOREDATACSUMS is set don't bother reading the csum root. */
>>       if (!btrfs_test_opt(fs_info, IGNOREDATACSUMS)) {
>>
> 

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH 1/3] btrfs: init devices always
  2021-03-11 16:23 ` [PATCH 1/3] btrfs: init devices always Josef Bacik
  2021-03-12  5:52   ` Anand Jain
@ 2021-03-12  5:58   ` Anand Jain
  1 sibling, 0 replies; 15+ messages in thread
From: Anand Jain @ 2021-03-12  5:58 UTC (permalink / raw)
  To: Josef Bacik, linux-btrfs, kernel-team; +Cc: Neal Gompa

On 12/3/21 12:23 am, Josef Bacik wrote:
> Neal reported a panic trying to use -o rescue=all
> 
> BUG: kernel NULL pointer dereference, address: 0000000000000030
> PGD 0 P4D 0
> Oops: 0000 [#1] SMP NOPTI
> CPU: 0 PID: 696 Comm: mount Tainted: G        W         5.12.0-rc2+ #296
> Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.13.0-2.fc32 04/01/2014
> RIP: 0010:btrfs_device_init_dev_stats+0x1d/0x200
> RSP: 0018:ffffafaec1483bb8 EFLAGS: 00010286
> RAX: 0000000000000000 RBX: ffff9a5715bcb298 RCX: 0000000000000070
> RDX: ffff9a5703248000 RSI: ffff9a57052ea150 RDI: ffff9a5715bca400
> RBP: ffff9a57052ea150 R08: 0000000000000070 R09: ffff9a57052ea150
> R10: 000130faf0741c10 R11: 0000000000000000 R12: ffff9a5703700000
> R13: 0000000000000000 R14: ffff9a5715bcb278 R15: ffff9a57052ea150
> FS:  00007f600d122c40(0000) GS:ffff9a577bc00000(0000) knlGS:0000000000000000
> CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> CR2: 0000000000000030 CR3: 0000000112a46005 CR4: 0000000000370ef0
> Call Trace:
>   ? btrfs_init_dev_stats+0x1f/0xf0
>   ? kmem_cache_alloc+0xef/0x1f0
>   btrfs_init_dev_stats+0x5f/0xf0
>   open_ctree+0x10cb/0x1720
>   btrfs_mount_root.cold+0x12/0xea
>   legacy_get_tree+0x27/0x40
>   vfs_get_tree+0x25/0xb0
>   vfs_kern_mount.part.0+0x71/0xb0
>   btrfs_mount+0x10d/0x380
>   legacy_get_tree+0x27/0x40
>   vfs_get_tree+0x25/0xb0
>   path_mount+0x433/0xa00
>   __x64_sys_mount+0xe3/0x120
>   do_syscall_64+0x33/0x40
>   entry_SYSCALL_64_after_hwframe+0x44/0xae
> 
> This happens because when we call btrfs_init_dev_stats we do
> device->fs_info->dev_root.  However device->fs_info isn't init'ed
> because we were only calling btrfs_init_devices_late() if we properly
> read the device root.  However we don't actually need the device root to
> init the devices, this function simply assigns the devices their
> ->fs_info pointer properly, so this needs to be done unconditionally
> always so that we can properly deref device->fs_info in rescue cases.
> 
> Reported-by: Neal Gompa <ngompa13@gmail.com>
> Signed-off-by: Josef Bacik <josef@toxicpanda.com>

Reviewed-by: Anand Jain <anand.jain@oracle.com>


^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH 2/3] btrfs: do not init dev stats if we have no dev_root
  2021-03-11 16:23 ` [PATCH 2/3] btrfs: do not init dev stats if we have no dev_root Josef Bacik
@ 2021-03-12  5:59   ` Anand Jain
  0 siblings, 0 replies; 15+ messages in thread
From: Anand Jain @ 2021-03-12  5:59 UTC (permalink / raw)
  To: Josef Bacik, linux-btrfs, kernel-team; +Cc: Neal Gompa

On 12/3/21 12:23 am, Josef Bacik wrote:
> Neal reported a panic trying to use -o rescue=all
> 
> BUG: kernel NULL pointer dereference, address: 0000000000000030
> PGD 0 P4D 0
> Oops: 0000 [#1] SMP PTI
> CPU: 0 PID: 4095 Comm: mount Not tainted 5.11.0-0.rc7.149.fc34.x86_64 #1
> RIP: 0010:btrfs_device_init_dev_stats+0x4c/0x1f0
> RSP: 0018:ffffa60285fbfb68 EFLAGS: 00010246
> RAX: 0000000000000000 RBX: ffff88b88f806498 RCX: ffff88b82e7a2a10
> RDX: ffffa60285fbfb97 RSI: ffff88b82e7a2a10 RDI: 0000000000000000
> RBP: ffff88b88f806b3c R08: 0000000000000000 R09: 0000000000000000
> R10: ffff88b82e7a2a10 R11: 0000000000000000 R12: ffff88b88f806a00
> R13: ffff88b88f806478 R14: ffff88b88f806a00 R15: ffff88b82e7a2a10
> FS:  00007f698be1ec40(0000) GS:ffff88b937e00000(0000) knlGS:0000000000000000
> CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> CR2: 0000000000000030 CR3: 0000000092c9c006 CR4: 00000000003706f0
> Call Trace:
> ? btrfs_init_dev_stats+0x1f/0xf0
> btrfs_init_dev_stats+0x62/0xf0
> open_ctree+0x1019/0x15ff
> btrfs_mount_root.cold+0x13/0xfa
> legacy_get_tree+0x27/0x40
> vfs_get_tree+0x25/0xb0
> vfs_kern_mount.part.0+0x71/0xb0
> btrfs_mount+0x131/0x3d0
> ? legacy_get_tree+0x27/0x40
> ? btrfs_show_options+0x640/0x640
> legacy_get_tree+0x27/0x40
> vfs_get_tree+0x25/0xb0
> path_mount+0x441/0xa80
> __x64_sys_mount+0xf4/0x130
> do_syscall_64+0x33/0x40
> entry_SYSCALL_64_after_hwframe+0x44/0xa9
> RIP: 0033:0x7f698c04e52e
> 
> This happens because we unconditionally attempt to init device stats on
> mount, but we may not have been able to read the device root.  Fix this
> by skipping init'ing the device stats if we do not have a device root.
> 
> Reported-by: Neal Gompa <ngompa13@gmail.com>
> Signed-off-by: Josef Bacik <josef@toxicpanda.com>

  Reviewed-by: Anand Jain <anand.jain@oracle.com>

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH 3/3] btrfs: don't init dev replace for bad dev root
  2021-03-11 16:23 ` [PATCH 3/3] btrfs: don't init dev replace for bad dev root Josef Bacik
@ 2021-03-12  6:50   ` Anand Jain
  0 siblings, 0 replies; 15+ messages in thread
From: Anand Jain @ 2021-03-12  6:50 UTC (permalink / raw)
  To: Josef Bacik, linux-btrfs, kernel-team; +Cc: Neal Gompa

On 12/3/21 12:23 am, Josef Bacik wrote:
> While helping Neal fix his broken file system I added a debug patch to
> catch if we were calling btrfs_search_slot with a NULL root, and this
> stack trace popped
> 
> we tried to search with a NULL root
> CPU: 0 PID: 1760 Comm: mount Not tainted 5.11.0-155.nealbtrfstest.1.fc34.x86_64 #1
> Hardware name: VMware, Inc. VMware Virtual Platform/440BX Desktop Reference Platform, BIOS 6.00 07/22/2020
> Call Trace:
>   dump_stack+0x6b/0x83
>   btrfs_search_slot.cold+0x11/0x1b
>   ? btrfs_init_dev_replace+0x36/0x450
>   btrfs_init_dev_replace+0x71/0x450
>   open_ctree+0x1054/0x1610
>   btrfs_mount_root.cold+0x13/0xfa
>   legacy_get_tree+0x27/0x40
>   vfs_get_tree+0x25/0xb0
>   vfs_kern_mount.part.0+0x71/0xb0
>   btrfs_mount+0x131/0x3d0
>   ? legacy_get_tree+0x27/0x40
>   ? btrfs_show_options+0x640/0x640
>   legacy_get_tree+0x27/0x40
>   vfs_get_tree+0x25/0xb0
>   path_mount+0x441/0xa80
>   __x64_sys_mount+0xf4/0x130
>   do_syscall_64+0x33/0x40
>   entry_SYSCALL_64_after_hwframe+0x44/0xa9
> RIP: 0033:0x7f644730352e
> 
> Fix this by not starting the device replace stuff if we do not have a
> NULL dev root.
> 
> Reported-by: Neal Gompa <ngompa13@gmail.com>
> Signed-off-by: Josef Bacik <josef@toxicpanda.com>
> ---
>   fs/btrfs/dev-replace.c | 3 +++
>   1 file changed, 3 insertions(+)
> 
> diff --git a/fs/btrfs/dev-replace.c b/fs/btrfs/dev-replace.c
> index 3a9c1e046ebe..d05f73530af7 100644
> --- a/fs/btrfs/dev-replace.c
> +++ b/fs/btrfs/dev-replace.c
> @@ -81,6 +81,9 @@ int btrfs_init_dev_replace(struct btrfs_fs_info *fs_info)
>   	struct btrfs_dev_replace_item *ptr;
>   	u64 src_devid;
>   
> +	if (!dev_root)
> +		return 0;
> +

  If we have a replace device target. fs_info->dev_replace->is_valid is
  0, and btrfs_dev_replace_is_ongoing() is false. And num_device count
  at various places does not include replace target device. And also
  this happens only in RO mount.

  Reviewed-by: Anand Jain <anand.jain@oracle.com>


>   	path = btrfs_alloc_path();
>   	if (!path) {
>   		ret = -ENOMEM;
> 


^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH 1/3] btrfs: init devices always
  2021-03-12  5:57     ` Anand Jain
@ 2021-03-17 11:03       ` David Sterba
  0 siblings, 0 replies; 15+ messages in thread
From: David Sterba @ 2021-03-17 11:03 UTC (permalink / raw)
  To: Anand Jain; +Cc: Josef Bacik, linux-btrfs, kernel-team, Neal Gompa

On Fri, Mar 12, 2021 at 01:57:32PM +0800, Anand Jain wrote:
> 
> 
> On 12/3/21 1:52 pm, Anand Jain wrote:
> > On 12/3/21 12:23 am, Josef Bacik wrote:
> >> Neal reported a panic trying to use -o rescue=all
> >>
> >> BUG: kernel NULL pointer dereference, address: 0000000000000030
> >> PGD 0 P4D 0
> >> Oops: 0000 [#1] SMP NOPTI
> >> CPU: 0 PID: 696 Comm: mount Tainted: G        W         5.12.0-rc2+ #296
> >> Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.13.0-2.fc32 
> >> 04/01/2014
> >> RIP: 0010:btrfs_device_init_dev_stats+0x1d/0x200
> >> RSP: 0018:ffffafaec1483bb8 EFLAGS: 00010286
> >> RAX: 0000000000000000 RBX: ffff9a5715bcb298 RCX: 0000000000000070
> >> RDX: ffff9a5703248000 RSI: ffff9a57052ea150 RDI: ffff9a5715bca400
> >> RBP: ffff9a57052ea150 R08: 0000000000000070 R09: ffff9a57052ea150
> >> R10: 000130faf0741c10 R11: 0000000000000000 R12: ffff9a5703700000
> >> R13: 0000000000000000 R14: ffff9a5715bcb278 R15: ffff9a57052ea150
> >> FS:  00007f600d122c40(0000) GS:ffff9a577bc00000(0000) 
> >> knlGS:0000000000000000
> >> CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> >> CR2: 0000000000000030 CR3: 0000000112a46005 CR4: 0000000000370ef0
> >> Call Trace:
> >>   ? btrfs_init_dev_stats+0x1f/0xf0
> >>   ? kmem_cache_alloc+0xef/0x1f0
> >>   btrfs_init_dev_stats+0x5f/0xf0
> >>   open_ctree+0x10cb/0x1720
> >>   btrfs_mount_root.cold+0x12/0xea
> >>   legacy_get_tree+0x27/0x40
> >>   vfs_get_tree+0x25/0xb0
> >>   vfs_kern_mount.part.0+0x71/0xb0
> >>   btrfs_mount+0x10d/0x380
> >>   legacy_get_tree+0x27/0x40
> >>   vfs_get_tree+0x25/0xb0
> >>   path_mount+0x433/0xa00
> >>   __x64_sys_mount+0xe3/0x120
> >>   do_syscall_64+0x33/0x40
> >>   entry_SYSCALL_64_after_hwframe+0x44/0xae
> >>
> >> This happens because when we call btrfs_init_dev_stats we do
> >> device->fs_info->dev_root.  However device->fs_info isn't init'ed
> >> because we were only calling btrfs_init_devices_late() if we properly
> >> read the device root. 
> > 
> > 
> >> However we don't actually need the device root to
> >> init the devices, this function simply assigns the devices their
> >> ->fs_info pointer properly, so this needs to be done unconditionally
> >> always so that we can properly deref device->fs_info in rescue cases.
> 
> 
> >   btrfs_device_init_dev_stats() calls btrfs_search_slot() leading
> >   to btrfs_search_slot_get_root(), and does de-reference root (dev_root)
> >   to get fs_info.
> 
>   Never mind. patch 2/3 handles it. Spoke too early.
>   Maybe can reorder the patches during integration.

I think swapping the order makes sense, though all the patches fix some
sort of crash, the number should be decreasing.

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH 0/3] Handle bad dev_root properly with rescue=all
  2021-03-11 16:23 [PATCH 0/3] Handle bad dev_root properly with rescue=all Josef Bacik
                   ` (3 preceding siblings ...)
  2021-03-11 19:18 ` [PATCH 0/3] Handle bad dev_root properly with rescue=all Neal Gompa
@ 2021-03-17 12:27 ` David Sterba
  2021-03-17 15:30   ` Josef Bacik
  2021-03-18 15:43 ` David Sterba
  5 siblings, 1 reply; 15+ messages in thread
From: David Sterba @ 2021-03-17 12:27 UTC (permalink / raw)
  To: Josef Bacik; +Cc: linux-btrfs, kernel-team

On Thu, Mar 11, 2021 at 11:23:13AM -0500, Josef Bacik wrote:
> Hello,
> 
> My recent debugging session with Neal's broken filesystem uncovered a glaring
> hole in my rescue=all patches, they don't deal with a NULL dev_root properly.
> In testing I only ever tested corrupting the extent tree or the csum tree, since
> those are the most problematic.  The following 3 fixes allowed Neal to get
> rescue=all working without panicing the machine, and I verified everything by
> using btrfs-corrupt-block to corrupt a dev root of a file system.  Thanks,

When rescue= is set lots of things can't work, I was wondering if we
should add messages once the "if (!dev_root)" cases are hit but I don't
think we need it, it should be clear enough from the other rescue=
related messages.

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH 0/3] Handle bad dev_root properly with rescue=all
  2021-03-17 12:27 ` David Sterba
@ 2021-03-17 15:30   ` Josef Bacik
  0 siblings, 0 replies; 15+ messages in thread
From: Josef Bacik @ 2021-03-17 15:30 UTC (permalink / raw)
  To: dsterba, linux-btrfs, kernel-team

On 3/17/21 8:27 AM, David Sterba wrote:
> On Thu, Mar 11, 2021 at 11:23:13AM -0500, Josef Bacik wrote:
>> Hello,
>>
>> My recent debugging session with Neal's broken filesystem uncovered a glaring
>> hole in my rescue=all patches, they don't deal with a NULL dev_root properly.
>> In testing I only ever tested corrupting the extent tree or the csum tree, since
>> those are the most problematic.  The following 3 fixes allowed Neal to get
>> rescue=all working without panicing the machine, and I verified everything by
>> using btrfs-corrupt-block to corrupt a dev root of a file system.  Thanks,
> 
> When rescue= is set lots of things can't work, I was wondering if we
> should add messages once the "if (!dev_root)" cases are hit but I don't
> think we need it, it should be clear enough from the other rescue=
> related messages.
> 

Yeah I went back and forth on this, and I came to the same conclusion as you. 
If we're doing rescue=all we know things are bad, and we just want our data 
back.  Thanks,

Josef

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH 0/3] Handle bad dev_root properly with rescue=all
  2021-03-11 16:23 [PATCH 0/3] Handle bad dev_root properly with rescue=all Josef Bacik
                   ` (4 preceding siblings ...)
  2021-03-17 12:27 ` David Sterba
@ 2021-03-18 15:43 ` David Sterba
  2021-03-18 20:45   ` Josef Bacik
  5 siblings, 1 reply; 15+ messages in thread
From: David Sterba @ 2021-03-18 15:43 UTC (permalink / raw)
  To: Josef Bacik; +Cc: linux-btrfs, kernel-team

On Thu, Mar 11, 2021 at 11:23:13AM -0500, Josef Bacik wrote:
[...]

> rescue=all working without panicing the machine,
> and I verified everything by
> using btrfs-corrupt-block to corrupt a dev root of a file system.  Thanks,

We need to have such testing part of some suite but it depends on the
utilities that we don't want to ship by default. Last time we discussed
how to make btrfs-corrupt-block or similar available in source form in
fstests it was not pretty, like having half of the btrfs-progs sources
providing code to read and modify the data structures.

So, what if: we have such tests in the btrfs-progs testsuite. As they're
potentially dangerous, not run by default, but available for easy setup
and test in a VM. The testsuite can be exported into a tarball, with the
binaries included. Or simply run it built from git, that works too just
needs more dependencies installed.

I was thinking about collecting all the stress tests into one place,
fstests already has some but my idea is to provide stress testing of
btrfs-specific features, more at once. Or require some 3rd party tools
and data sources to provide the load.

It would be some infrastructure duplication with fstests, but lots of
that is already done in btrfs-progs and I'd rather go with some
duplication instead of development-time-only testing.

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH 0/3] Handle bad dev_root properly with rescue=all
  2021-03-18 15:43 ` David Sterba
@ 2021-03-18 20:45   ` Josef Bacik
  0 siblings, 0 replies; 15+ messages in thread
From: Josef Bacik @ 2021-03-18 20:45 UTC (permalink / raw)
  To: dsterba, linux-btrfs, kernel-team

On 3/18/21 11:43 AM, David Sterba wrote:
> On Thu, Mar 11, 2021 at 11:23:13AM -0500, Josef Bacik wrote:
> [...]
> 
>> rescue=all working without panicing the machine,
>> and I verified everything by
>> using btrfs-corrupt-block to corrupt a dev root of a file system.  Thanks,
> 
> We need to have such testing part of some suite but it depends on the
> utilities that we don't want to ship by default. Last time we discussed
> how to make btrfs-corrupt-block or similar available in source form in
> fstests it was not pretty, like having half of the btrfs-progs sources
> providing code to read and modify the data structures.
> 
> So, what if: we have such tests in the btrfs-progs testsuite. As they're
> potentially dangerous, not run by default, but available for easy setup
> and test in a VM. The testsuite can be exported into a tarball, with the
> binaries included. Or simply run it built from git, that works too just
> needs more dependencies installed.
> 
> I was thinking about collecting all the stress tests into one place,
> fstests already has some but my idea is to provide stress testing of
> btrfs-specific features, more at once. Or require some 3rd party tools
> and data sources to provide the load.
> 
> It would be some infrastructure duplication with fstests, but lots of
> that is already done in btrfs-progs and I'd rather go with some
> duplication instead of development-time-only testing.
> 

I actually had Boris working on this, basically allow us to set 
BTRFS_CORRUPT_PROG inside our fstests config, and then allow us to write these 
tests for fstests.  I'd prefer to keep this inside xfstests for mostly selfish 
reasons, it's easier for me to adapt the existing continual testing to take 
advantage of this and automatically run the xfstests stuff and post the results. 
  If we keep it in btrfs-progs I have to write a bunch of code to run those to 
post the results to our continual testing stuff.  Thanks,

Josef

^ permalink raw reply	[flat|nested] 15+ messages in thread

end of thread, other threads:[~2021-03-18 20:45 UTC | newest]

Thread overview: 15+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2021-03-11 16:23 [PATCH 0/3] Handle bad dev_root properly with rescue=all Josef Bacik
2021-03-11 16:23 ` [PATCH 1/3] btrfs: init devices always Josef Bacik
2021-03-12  5:52   ` Anand Jain
2021-03-12  5:57     ` Anand Jain
2021-03-17 11:03       ` David Sterba
2021-03-12  5:58   ` Anand Jain
2021-03-11 16:23 ` [PATCH 2/3] btrfs: do not init dev stats if we have no dev_root Josef Bacik
2021-03-12  5:59   ` Anand Jain
2021-03-11 16:23 ` [PATCH 3/3] btrfs: don't init dev replace for bad dev root Josef Bacik
2021-03-12  6:50   ` Anand Jain
2021-03-11 19:18 ` [PATCH 0/3] Handle bad dev_root properly with rescue=all Neal Gompa
2021-03-17 12:27 ` David Sterba
2021-03-17 15:30   ` Josef Bacik
2021-03-18 15:43 ` David Sterba
2021-03-18 20:45   ` Josef Bacik

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.