All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH v2 0/7] remove BUG_ON()s in btrfs_close_one_device()
@ 2019-11-13 10:27 Johannes Thumshirn
  2019-11-13 10:27 ` [PATCH v2 1/7] btrfs: decrement number of open devices after closing the device not before Johannes Thumshirn
                   ` (8 more replies)
  0 siblings, 9 replies; 24+ messages in thread
From: Johannes Thumshirn @ 2019-11-13 10:27 UTC (permalink / raw)
  To: David Sterba; +Cc: Qu Wenru, Linux BTRFS Mailinglist, Johannes Thumshirn

This series attempts to remove the BUG_ON()s in btrfs_close_one_device().
Therefore some reorganization of btrfs_close_one_device() and
close_fs_devices() was needed.

Forthermore a new BUG_ON() had to be temporarily introduced but is removed
again in the last patch of theis series.

Although it is generally legal to return -ENOMEM on umount(2), the error
handling up until close_ctree() as neither close_ctree() nor btrfs_put_super()
would be able to handle the error.

This series has passed fstests without any deviation from the baseline and
also the new error handling was tested via error injection using this snippet:

diff --git a/fs/btrfs/volumes.c b/fs/btrfs/volumes.c
index 7c55169c0613..c58802c9c39c 100644
--- a/fs/btrfs/volumes.c
+++ b/fs/btrfs/volumes.c
@@ -1069,6 +1069,9 @@ static int btrfs_close_one_device(struct btrfs_device *device)
 
 	new_device = btrfs_alloc_device(NULL, &device->devid,
 					device->uuid);
+	btrfs_free_device(new_device);
+	pr_err("%s() INJECTING -ENOMEM\n", __func__);
+	new_device = ERR_PTR(-ENOMEM);
 	if (IS_ERR(new_device))
 		return PTR_ERR(new_device);

Changes to v1:

Fixed the decremt of btrfs_fs_devices::seeding.

In addition to this, I've added two patches changing btrfs_fs_devices::seeding
and btrfs_fs_devices::rotating to bool, as they are in fact used as booleans.

Johannes Thumshirn (7):
  btrfs: decrement number of open devices after closing the device not
    before
  btrfs: handle device allocation failure in btrfs_close_one_device()
  btrfs: handle allocation failure in strdup
  btrfs: handle error return of close_fs_devices()
  btrfs: remove final BUG_ON() in close_fs_devices()
  btrfs: change btrfs_fs_devices::seeing to bool
  btrfs: change btrfs_fs_devices::rotating to bool

 fs/btrfs/volumes.c | 81 ++++++++++++++++++++++++++++++++++++------------------
 fs/btrfs/volumes.h |  4 +--
 2 files changed, 56 insertions(+), 29 deletions(-)

-- 
2.16.4


^ permalink raw reply related	[flat|nested] 24+ messages in thread

* [PATCH v2 1/7] btrfs: decrement number of open devices after closing the device not before
  2019-11-13 10:27 [PATCH v2 0/7] remove BUG_ON()s in btrfs_close_one_device() Johannes Thumshirn
@ 2019-11-13 10:27 ` Johannes Thumshirn
  2019-11-13 10:27 ` [PATCH v2 2/7] btrfs: handle device allocation failure in btrfs_close_one_device() Johannes Thumshirn
                   ` (7 subsequent siblings)
  8 siblings, 0 replies; 24+ messages in thread
From: Johannes Thumshirn @ 2019-11-13 10:27 UTC (permalink / raw)
  To: David Sterba; +Cc: Qu Wenru, Linux BTRFS Mailinglist, Johannes Thumshirn

In btrfs_close_one_device we're decrementing the number of open devices
before we're calling btrfs_close_bdev().

As there is no intermediate exit between these points in this function it
is technically OK to do so, but it a) is a bit harder to understand and b)
further refactoring to implement proper error handling in
btrfs_close_one_device() will change the layout of the function.

Move both operations closer together and move the decrement step after
btrfs_close_bdev().

Signed-off-by: Johannes Thumshirn <jthumshirn@suse.de>
---
 fs/btrfs/volumes.c | 5 ++---
 1 file changed, 2 insertions(+), 3 deletions(-)

diff --git a/fs/btrfs/volumes.c b/fs/btrfs/volumes.c
index 22a5bd991e47..5ee26e7fca32 100644
--- a/fs/btrfs/volumes.c
+++ b/fs/btrfs/volumes.c
@@ -1067,9 +1067,6 @@ static void btrfs_close_one_device(struct btrfs_device *device)
 	struct btrfs_device *new_device;
 	struct rcu_string *name;
 
-	if (device->bdev)
-		fs_devices->open_devices--;
-
 	if (test_bit(BTRFS_DEV_STATE_WRITEABLE, &device->dev_state) &&
 	    device->devid != BTRFS_DEV_REPLACE_DEVID) {
 		list_del_init(&device->dev_alloc_list);
@@ -1080,6 +1077,8 @@ static void btrfs_close_one_device(struct btrfs_device *device)
 		fs_devices->missing_devices--;
 
 	btrfs_close_bdev(device);
+	if (device->bdev)
+		fs_devices->open_devices--;
 
 	new_device = btrfs_alloc_device(NULL, &device->devid,
 					device->uuid);
-- 
2.16.4


^ permalink raw reply related	[flat|nested] 24+ messages in thread

* [PATCH v2 2/7] btrfs: handle device allocation failure in btrfs_close_one_device()
  2019-11-13 10:27 [PATCH v2 0/7] remove BUG_ON()s in btrfs_close_one_device() Johannes Thumshirn
  2019-11-13 10:27 ` [PATCH v2 1/7] btrfs: decrement number of open devices after closing the device not before Johannes Thumshirn
@ 2019-11-13 10:27 ` Johannes Thumshirn
  2019-11-13 14:58   ` David Sterba
  2019-11-13 10:27 ` [PATCH v2 3/7] btrfs: handle allocation failure in strdup Johannes Thumshirn
                   ` (6 subsequent siblings)
  8 siblings, 1 reply; 24+ messages in thread
From: Johannes Thumshirn @ 2019-11-13 10:27 UTC (permalink / raw)
  To: David Sterba; +Cc: Qu Wenru, Linux BTRFS Mailinglist, Johannes Thumshirn

In btrfs_close_one_device() we're allocating a new device and if this
fails we BUG().

Move the allocation to the top of the function and return an error in case
it failed.

The BUG_ON() is temporarily moved to close_fs_devices(), the caller of
btrfs_close_one_device() as further work is pending to untangle this.

Signed-off-by: Johannes Thumshirn <jthumshirn@suse.de>
---
 fs/btrfs/volumes.c | 27 +++++++++++++++++++++------
 1 file changed, 21 insertions(+), 6 deletions(-)

diff --git a/fs/btrfs/volumes.c b/fs/btrfs/volumes.c
index 5ee26e7fca32..0a2a73907563 100644
--- a/fs/btrfs/volumes.c
+++ b/fs/btrfs/volumes.c
@@ -1061,12 +1061,17 @@ static void btrfs_close_bdev(struct btrfs_device *device)
 	blkdev_put(device->bdev, device->mode);
 }
 
-static void btrfs_close_one_device(struct btrfs_device *device)
+static int btrfs_close_one_device(struct btrfs_device *device)
 {
 	struct btrfs_fs_devices *fs_devices = device->fs_devices;
 	struct btrfs_device *new_device;
 	struct rcu_string *name;
 
+	new_device = btrfs_alloc_device(NULL, &device->devid,
+					device->uuid);
+	if (IS_ERR(new_device))
+		goto err_close_device;
+
 	if (test_bit(BTRFS_DEV_STATE_WRITEABLE, &device->dev_state) &&
 	    device->devid != BTRFS_DEV_REPLACE_DEVID) {
 		list_del_init(&device->dev_alloc_list);
@@ -1080,10 +1085,6 @@ static void btrfs_close_one_device(struct btrfs_device *device)
 	if (device->bdev)
 		fs_devices->open_devices--;
 
-	new_device = btrfs_alloc_device(NULL, &device->devid,
-					device->uuid);
-	BUG_ON(IS_ERR(new_device)); /* -ENOMEM */
-
 	/* Safe because we are under uuid_mutex */
 	if (device->name) {
 		name = rcu_string_strdup(device->name->str, GFP_NOFS);
@@ -1096,18 +1097,32 @@ static void btrfs_close_one_device(struct btrfs_device *device)
 
 	synchronize_rcu();
 	btrfs_free_device(device);
+
+	return 0;
+
+err_close_device:
+	btrfs_close_bdev(device);
+	if (device->bdev) {
+		fs_devices->open_devices--;
+		btrfs_sysfs_rm_device_link(fs_devices, device);
+		device->bdev = NULL;
+	}
+
+	return -ENOMEM;
 }
 
 static int close_fs_devices(struct btrfs_fs_devices *fs_devices)
 {
 	struct btrfs_device *device, *tmp;
+	int ret;
 
 	if (--fs_devices->opened > 0)
 		return 0;
 
 	mutex_lock(&fs_devices->device_list_mutex);
 	list_for_each_entry_safe(device, tmp, &fs_devices->devices, dev_list) {
-		btrfs_close_one_device(device);
+		ret = btrfs_close_one_device(device);
+		BUG_ON(ret); /* -ENOMEM */
 	}
 	mutex_unlock(&fs_devices->device_list_mutex);
 
-- 
2.16.4


^ permalink raw reply related	[flat|nested] 24+ messages in thread

* [PATCH v2 3/7] btrfs: handle allocation failure in strdup
  2019-11-13 10:27 [PATCH v2 0/7] remove BUG_ON()s in btrfs_close_one_device() Johannes Thumshirn
  2019-11-13 10:27 ` [PATCH v2 1/7] btrfs: decrement number of open devices after closing the device not before Johannes Thumshirn
  2019-11-13 10:27 ` [PATCH v2 2/7] btrfs: handle device allocation failure in btrfs_close_one_device() Johannes Thumshirn
@ 2019-11-13 10:27 ` Johannes Thumshirn
  2019-11-14 11:00   ` Anand Jain
  2019-11-15 21:11   ` Nikolay Borisov
  2019-11-13 10:27 ` [PATCH v2 4/7] btrfs: handle error return of close_fs_devices() Johannes Thumshirn
                   ` (5 subsequent siblings)
  8 siblings, 2 replies; 24+ messages in thread
From: Johannes Thumshirn @ 2019-11-13 10:27 UTC (permalink / raw)
  To: David Sterba; +Cc: Qu Wenru, Linux BTRFS Mailinglist, Johannes Thumshirn

Gracefully handle allocation failures in btrfs_close_one_device()'s
rcu_string_strdup() instead of crashing the machine.

Signed-off-by: Johannes Thumshirn <jthumshirn@suse.de>
---
 fs/btrfs/volumes.c | 22 ++++++++++++++--------
 1 file changed, 14 insertions(+), 8 deletions(-)

diff --git a/fs/btrfs/volumes.c b/fs/btrfs/volumes.c
index 0a2a73907563..e5864ca3bb3b 100644
--- a/fs/btrfs/volumes.c
+++ b/fs/btrfs/volumes.c
@@ -1064,7 +1064,7 @@ static void btrfs_close_bdev(struct btrfs_device *device)
 static int btrfs_close_one_device(struct btrfs_device *device)
 {
 	struct btrfs_fs_devices *fs_devices = device->fs_devices;
-	struct btrfs_device *new_device;
+	struct btrfs_device *new_device = NULL;
 	struct rcu_string *name;
 
 	new_device = btrfs_alloc_device(NULL, &device->devid,
@@ -1072,6 +1072,15 @@ static int btrfs_close_one_device(struct btrfs_device *device)
 	if (IS_ERR(new_device))
 		goto err_close_device;
 
+	/* Safe because we are under uuid_mutex */
+	if (device->name) {
+		name = rcu_string_strdup(device->name->str, GFP_NOFS);
+		if (!name)
+			goto err_free_device;
+
+		rcu_assign_pointer(new_device->name, name);
+	}
+
 	if (test_bit(BTRFS_DEV_STATE_WRITEABLE, &device->dev_state) &&
 	    device->devid != BTRFS_DEV_REPLACE_DEVID) {
 		list_del_init(&device->dev_alloc_list);
@@ -1085,13 +1094,6 @@ static int btrfs_close_one_device(struct btrfs_device *device)
 	if (device->bdev)
 		fs_devices->open_devices--;
 
-	/* Safe because we are under uuid_mutex */
-	if (device->name) {
-		name = rcu_string_strdup(device->name->str, GFP_NOFS);
-		BUG_ON(!name); /* -ENOMEM */
-		rcu_assign_pointer(new_device->name, name);
-	}
-
 	list_replace_rcu(&device->dev_list, &new_device->dev_list);
 	new_device->fs_devices = device->fs_devices;
 
@@ -1100,6 +1102,10 @@ static int btrfs_close_one_device(struct btrfs_device *device)
 
 	return 0;
 
+err_free_device:
+	if (new_device)
+		btrfs_free_device(new_device);
+
 err_close_device:
 	btrfs_close_bdev(device);
 	if (device->bdev) {
-- 
2.16.4


^ permalink raw reply related	[flat|nested] 24+ messages in thread

* [PATCH v2 4/7] btrfs: handle error return of close_fs_devices()
  2019-11-13 10:27 [PATCH v2 0/7] remove BUG_ON()s in btrfs_close_one_device() Johannes Thumshirn
                   ` (2 preceding siblings ...)
  2019-11-13 10:27 ` [PATCH v2 3/7] btrfs: handle allocation failure in strdup Johannes Thumshirn
@ 2019-11-13 10:27 ` Johannes Thumshirn
  2019-11-13 15:00   ` David Sterba
  2019-11-13 10:27 ` [PATCH v2 5/7] btrfs: remove final BUG_ON() in close_fs_devices() Johannes Thumshirn
                   ` (4 subsequent siblings)
  8 siblings, 1 reply; 24+ messages in thread
From: Johannes Thumshirn @ 2019-11-13 10:27 UTC (permalink / raw)
  To: David Sterba; +Cc: Qu Wenru, Linux BTRFS Mailinglist, Johannes Thumshirn

close_fs_devices() will be able to return an error instead of crashing
after the following patch.

Prepare btrfs_close_devices() for this.

Signed-off-by: Johannes Thumshirn <jthumshirn@suse.de>
---
 fs/btrfs/volumes.c | 11 +++++++----
 1 file changed, 7 insertions(+), 4 deletions(-)

diff --git a/fs/btrfs/volumes.c b/fs/btrfs/volumes.c
index e5864ca3bb3b..be1fd935edf7 100644
--- a/fs/btrfs/volumes.c
+++ b/fs/btrfs/volumes.c
@@ -1143,10 +1143,10 @@ static int close_fs_devices(struct btrfs_fs_devices *fs_devices)
 int btrfs_close_devices(struct btrfs_fs_devices *fs_devices)
 {
 	struct btrfs_fs_devices *seed_devices = NULL;
-	int ret;
+	int err, err2 = 0;
 
 	mutex_lock(&uuid_mutex);
-	ret = close_fs_devices(fs_devices);
+	err = close_fs_devices(fs_devices);
 	if (!fs_devices->opened) {
 		seed_devices = fs_devices->seed;
 		fs_devices->seed = NULL;
@@ -1156,10 +1156,13 @@ int btrfs_close_devices(struct btrfs_fs_devices *fs_devices)
 	while (seed_devices) {
 		fs_devices = seed_devices;
 		seed_devices = fs_devices->seed;
-		close_fs_devices(fs_devices);
+		err2 = close_fs_devices(fs_devices);
 		free_fs_devices(fs_devices);
 	}
-	return ret;
+
+	if (err2)
+		return err2;
+	return err;
 }
 
 static int open_fs_devices(struct btrfs_fs_devices *fs_devices,
-- 
2.16.4


^ permalink raw reply related	[flat|nested] 24+ messages in thread

* [PATCH v2 5/7] btrfs: remove final BUG_ON() in close_fs_devices()
  2019-11-13 10:27 [PATCH v2 0/7] remove BUG_ON()s in btrfs_close_one_device() Johannes Thumshirn
                   ` (3 preceding siblings ...)
  2019-11-13 10:27 ` [PATCH v2 4/7] btrfs: handle error return of close_fs_devices() Johannes Thumshirn
@ 2019-11-13 10:27 ` Johannes Thumshirn
  2019-11-13 15:02   ` David Sterba
  2019-11-13 10:27 ` [PATCH v2 6/7] btrfs: change btrfs_fs_devices::seeing to bool Johannes Thumshirn
                   ` (3 subsequent siblings)
  8 siblings, 1 reply; 24+ messages in thread
From: Johannes Thumshirn @ 2019-11-13 10:27 UTC (permalink / raw)
  To: David Sterba; +Cc: Qu Wenru, Linux BTRFS Mailinglist, Johannes Thumshirn

Now that the preparation work is done, remove the temporary BUG_ON() in
close_fs_devices() and return an error instead.

Signed-off-by: Johannes Thumshirn <jthumshirn@suse.de>

---
Changes to v1:
- btrfs_fs_devices::seeding is a 'boolean' flags and no counter, don't
  decrement it (Qu)
---
 fs/btrfs/volumes.c | 6 +++++-
 1 file changed, 5 insertions(+), 1 deletion(-)

diff --git a/fs/btrfs/volumes.c b/fs/btrfs/volumes.c
index be1fd935edf7..25e4608e20f1 100644
--- a/fs/btrfs/volumes.c
+++ b/fs/btrfs/volumes.c
@@ -1128,7 +1128,11 @@ static int close_fs_devices(struct btrfs_fs_devices *fs_devices)
 	mutex_lock(&fs_devices->device_list_mutex);
 	list_for_each_entry_safe(device, tmp, &fs_devices->devices, dev_list) {
 		ret = btrfs_close_one_device(device);
-		BUG_ON(ret); /* -ENOMEM */
+		if (ret) {
+			mutex_unlock(&fs_devices->device_list_mutex);
+			return ret;
+		}
+		fs_devices->opened--;
 	}
 	mutex_unlock(&fs_devices->device_list_mutex);
 
-- 
2.16.4


^ permalink raw reply related	[flat|nested] 24+ messages in thread

* [PATCH v2 6/7] btrfs: change btrfs_fs_devices::seeing to bool
  2019-11-13 10:27 [PATCH v2 0/7] remove BUG_ON()s in btrfs_close_one_device() Johannes Thumshirn
                   ` (4 preceding siblings ...)
  2019-11-13 10:27 ` [PATCH v2 5/7] btrfs: remove final BUG_ON() in close_fs_devices() Johannes Thumshirn
@ 2019-11-13 10:27 ` Johannes Thumshirn
  2019-11-14 11:04   ` Anand Jain
  2019-11-13 10:27 ` [PATCH v2 7/7] btrfs: change btrfs_fs_devices::rotating " Johannes Thumshirn
                   ` (2 subsequent siblings)
  8 siblings, 1 reply; 24+ messages in thread
From: Johannes Thumshirn @ 2019-11-13 10:27 UTC (permalink / raw)
  To: David Sterba; +Cc: Qu Wenru, Linux BTRFS Mailinglist, Johannes Thumshirn

struct btrfs_fs_devices::seeding currently is declared as an integer
variable but only used as a boolean.

Change the variable definition to bool and update to code touching it to
set 'true' and 'false'.

Signed-off-by: Johannes Thumshirn <jthumshirn@suse.de>
---
 fs/btrfs/volumes.c | 8 ++++----
 fs/btrfs/volumes.h | 2 +-
 2 files changed, 5 insertions(+), 5 deletions(-)

diff --git a/fs/btrfs/volumes.c b/fs/btrfs/volumes.c
index 25e4608e20f1..ce9c6fa3a32c 100644
--- a/fs/btrfs/volumes.c
+++ b/fs/btrfs/volumes.c
@@ -634,7 +634,7 @@ static int btrfs_open_one_device(struct btrfs_fs_devices *fs_devices,
 		}
 
 		clear_bit(BTRFS_DEV_STATE_WRITEABLE, &device->dev_state);
-		fs_devices->seeding = 1;
+		fs_devices->seeding = true;
 	} else {
 		if (bdev_read_only(bdev))
 			clear_bit(BTRFS_DEV_STATE_WRITEABLE, &device->dev_state);
@@ -1139,7 +1139,7 @@ static int close_fs_devices(struct btrfs_fs_devices *fs_devices)
 	WARN_ON(fs_devices->open_devices);
 	WARN_ON(fs_devices->rw_devices);
 	fs_devices->opened = 0;
-	fs_devices->seeding = 0;
+	fs_devices->seeding = false;
 
 	return 0;
 }
@@ -2297,7 +2297,7 @@ static int btrfs_prepare_sprout(struct btrfs_fs_info *fs_info)
 	list_splice_init(&fs_devices->alloc_list, &seed_devices->alloc_list);
 	mutex_unlock(&fs_info->chunk_mutex);
 
-	fs_devices->seeding = 0;
+	fs_devices->seeding = false;
 	fs_devices->num_devices = 0;
 	fs_devices->open_devices = 0;
 	fs_devices->missing_devices = 0;
@@ -6681,7 +6681,7 @@ static struct btrfs_fs_devices *open_seed_devices(struct btrfs_fs_info *fs_info,
 		if (IS_ERR(fs_devices))
 			return fs_devices;
 
-		fs_devices->seeding = 1;
+		fs_devices->seeding = true;
 		fs_devices->opened = 1;
 		return fs_devices;
 	}
diff --git a/fs/btrfs/volumes.h b/fs/btrfs/volumes.h
index 46987a2da786..8e9513b3fe9d 100644
--- a/fs/btrfs/volumes.h
+++ b/fs/btrfs/volumes.h
@@ -243,7 +243,7 @@ struct btrfs_fs_devices {
 	struct list_head alloc_list;
 
 	struct btrfs_fs_devices *seed;
-	int seeding;
+	bool seeding;
 
 	int opened;
 
-- 
2.16.4


^ permalink raw reply related	[flat|nested] 24+ messages in thread

* [PATCH v2 7/7] btrfs: change btrfs_fs_devices::rotating to bool
  2019-11-13 10:27 [PATCH v2 0/7] remove BUG_ON()s in btrfs_close_one_device() Johannes Thumshirn
                   ` (5 preceding siblings ...)
  2019-11-13 10:27 ` [PATCH v2 6/7] btrfs: change btrfs_fs_devices::seeing to bool Johannes Thumshirn
@ 2019-11-13 10:27 ` Johannes Thumshirn
  2019-11-14 11:05   ` Anand Jain
  2019-11-13 11:56 ` [PATCH v2 0/7] remove BUG_ON()s in btrfs_close_one_device() Qu Wenruo
  2019-11-13 15:05 ` David Sterba
  8 siblings, 1 reply; 24+ messages in thread
From: Johannes Thumshirn @ 2019-11-13 10:27 UTC (permalink / raw)
  To: David Sterba; +Cc: Qu Wenru, Linux BTRFS Mailinglist, Johannes Thumshirn

struct btrfs_fs_devices::rotating currently is declared as an integer
variable but only used as a boolean.

Change the variable definition to bool and update to code touching it to
set 'true' and 'false'.

Signed-off-by: Johannes Thumshirn <jthumshirn@suse.de>
---
 fs/btrfs/volumes.c | 6 +++---
 fs/btrfs/volumes.h | 2 +-
 2 files changed, 4 insertions(+), 4 deletions(-)

diff --git a/fs/btrfs/volumes.c b/fs/btrfs/volumes.c
index ce9c6fa3a32c..6d3bfea2e2d5 100644
--- a/fs/btrfs/volumes.c
+++ b/fs/btrfs/volumes.c
@@ -644,7 +644,7 @@ static int btrfs_open_one_device(struct btrfs_fs_devices *fs_devices,
 
 	q = bdev_get_queue(bdev);
 	if (!blk_queue_nonrot(q))
-		fs_devices->rotating = 1;
+		fs_devices->rotating = true;
 
 	device->bdev = bdev;
 	clear_bit(BTRFS_DEV_STATE_IN_FS_METADATA, &device->dev_state);
@@ -2301,7 +2301,7 @@ static int btrfs_prepare_sprout(struct btrfs_fs_info *fs_info)
 	fs_devices->num_devices = 0;
 	fs_devices->open_devices = 0;
 	fs_devices->missing_devices = 0;
-	fs_devices->rotating = 0;
+	fs_devices->rotating = false;
 	fs_devices->seed = seed_devices;
 
 	generate_random_uuid(fs_devices->fsid);
@@ -2496,7 +2496,7 @@ int btrfs_init_new_device(struct btrfs_fs_info *fs_info, const char *device_path
 	atomic64_add(device->total_bytes, &fs_info->free_chunk_space);
 
 	if (!blk_queue_nonrot(q))
-		fs_devices->rotating = 1;
+		fs_devices->rotating = true;
 
 	orig_super_total_bytes = btrfs_super_total_bytes(fs_info->super_copy);
 	btrfs_set_super_total_bytes(fs_info->super_copy,
diff --git a/fs/btrfs/volumes.h b/fs/btrfs/volumes.h
index 8e9513b3fe9d..fc1b564b9cfe 100644
--- a/fs/btrfs/volumes.h
+++ b/fs/btrfs/volumes.h
@@ -250,7 +250,7 @@ struct btrfs_fs_devices {
 	/* set when we find or add a device that doesn't have the
 	 * nonrot flag set
 	 */
-	int rotating;
+	bool rotating;
 
 	struct btrfs_fs_info *fs_info;
 	/* sysfs kobjects */
-- 
2.16.4


^ permalink raw reply related	[flat|nested] 24+ messages in thread

* Re: [PATCH v2 0/7] remove BUG_ON()s in btrfs_close_one_device()
  2019-11-13 10:27 [PATCH v2 0/7] remove BUG_ON()s in btrfs_close_one_device() Johannes Thumshirn
                   ` (6 preceding siblings ...)
  2019-11-13 10:27 ` [PATCH v2 7/7] btrfs: change btrfs_fs_devices::rotating " Johannes Thumshirn
@ 2019-11-13 11:56 ` Qu Wenruo
  2019-11-13 15:05 ` David Sterba
  8 siblings, 0 replies; 24+ messages in thread
From: Qu Wenruo @ 2019-11-13 11:56 UTC (permalink / raw)
  To: Johannes Thumshirn, David Sterba; +Cc: Qu Wenru, Linux BTRFS Mailinglist


[-- Attachment #1.1: Type: text/plain, Size: 2182 bytes --]



On 2019/11/13 下午6:27, Johannes Thumshirn wrote:
> This series attempts to remove the BUG_ON()s in btrfs_close_one_device().
> Therefore some reorganization of btrfs_close_one_device() and
> close_fs_devices() was needed.
> 
> Forthermore a new BUG_ON() had to be temporarily introduced but is removed
> again in the last patch of theis series.
> 
> Although it is generally legal to return -ENOMEM on umount(2), the error
> handling up until close_ctree() as neither close_ctree() nor btrfs_put_super()
> would be able to handle the error.
> 
> This series has passed fstests without any deviation from the baseline and
> also the new error handling was tested via error injection using this snippet:
> 
> diff --git a/fs/btrfs/volumes.c b/fs/btrfs/volumes.c
> index 7c55169c0613..c58802c9c39c 100644
> --- a/fs/btrfs/volumes.c
> +++ b/fs/btrfs/volumes.c
> @@ -1069,6 +1069,9 @@ static int btrfs_close_one_device(struct btrfs_device *device)
>  
>  	new_device = btrfs_alloc_device(NULL, &device->devid,
>  					device->uuid);
> +	btrfs_free_device(new_device);
> +	pr_err("%s() INJECTING -ENOMEM\n", __func__);
> +	new_device = ERR_PTR(-ENOMEM);
>  	if (IS_ERR(new_device))
>  		return PTR_ERR(new_device);
> 
> Changes to v1:
> 
> Fixed the decremt of btrfs_fs_devices::seeding.
> 
> In addition to this, I've added two patches changing btrfs_fs_devices::seeding
> and btrfs_fs_devices::rotating to bool, as they are in fact used as booleans.

Reviewed-by: Qu Wenruo <wqu@suse.com>

Thanks,
Qu

> 
> Johannes Thumshirn (7):
>   btrfs: decrement number of open devices after closing the device not
>     before
>   btrfs: handle device allocation failure in btrfs_close_one_device()
>   btrfs: handle allocation failure in strdup
>   btrfs: handle error return of close_fs_devices()
>   btrfs: remove final BUG_ON() in close_fs_devices()
>   btrfs: change btrfs_fs_devices::seeing to bool
>   btrfs: change btrfs_fs_devices::rotating to bool
> 
>  fs/btrfs/volumes.c | 81 ++++++++++++++++++++++++++++++++++++------------------
>  fs/btrfs/volumes.h |  4 +--
>  2 files changed, 56 insertions(+), 29 deletions(-)
> 


[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 488 bytes --]

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH v2 2/7] btrfs: handle device allocation failure in btrfs_close_one_device()
  2019-11-13 10:27 ` [PATCH v2 2/7] btrfs: handle device allocation failure in btrfs_close_one_device() Johannes Thumshirn
@ 2019-11-13 14:58   ` David Sterba
  2019-11-14  8:48     ` Johannes Thumshirn
  0 siblings, 1 reply; 24+ messages in thread
From: David Sterba @ 2019-11-13 14:58 UTC (permalink / raw)
  To: Johannes Thumshirn; +Cc: David Sterba, Qu Wenru, Linux BTRFS Mailinglist

On Wed, Nov 13, 2019 at 11:27:23AM +0100, Johannes Thumshirn wrote:
> In btrfs_close_one_device() we're allocating a new device and if this
> fails we BUG().
> 
> Move the allocation to the top of the function and return an error in case
> it failed.
> 
> The BUG_ON() is temporarily moved to close_fs_devices(), the caller of
> btrfs_close_one_device() as further work is pending to untangle this.
> 
> Signed-off-by: Johannes Thumshirn <jthumshirn@suse.de>
> ---
>  fs/btrfs/volumes.c | 27 +++++++++++++++++++++------
>  1 file changed, 21 insertions(+), 6 deletions(-)
> 
> diff --git a/fs/btrfs/volumes.c b/fs/btrfs/volumes.c
> index 5ee26e7fca32..0a2a73907563 100644
> --- a/fs/btrfs/volumes.c
> +++ b/fs/btrfs/volumes.c
> @@ -1061,12 +1061,17 @@ static void btrfs_close_bdev(struct btrfs_device *device)
>  	blkdev_put(device->bdev, device->mode);
>  }
>  
> -static void btrfs_close_one_device(struct btrfs_device *device)
> +static int btrfs_close_one_device(struct btrfs_device *device)
>  {
>  	struct btrfs_fs_devices *fs_devices = device->fs_devices;
>  	struct btrfs_device *new_device;
>  	struct rcu_string *name;
>  
> +	new_device = btrfs_alloc_device(NULL, &device->devid,
> +					device->uuid);
> +	if (IS_ERR(new_device))
> +		goto err_close_device;
> +
>  	if (test_bit(BTRFS_DEV_STATE_WRITEABLE, &device->dev_state) &&
>  	    device->devid != BTRFS_DEV_REPLACE_DEVID) {
>  		list_del_init(&device->dev_alloc_list);
> @@ -1080,10 +1085,6 @@ static void btrfs_close_one_device(struct btrfs_device *device)
>  	if (device->bdev)
>  		fs_devices->open_devices--;
>  
> -	new_device = btrfs_alloc_device(NULL, &device->devid,
> -					device->uuid);
> -	BUG_ON(IS_ERR(new_device)); /* -ENOMEM */
> -
>  	/* Safe because we are under uuid_mutex */
>  	if (device->name) {
>  		name = rcu_string_strdup(device->name->str, GFP_NOFS);
> @@ -1096,18 +1097,32 @@ static void btrfs_close_one_device(struct btrfs_device *device)
>  
>  	synchronize_rcu();
>  	btrfs_free_device(device);
> +
> +	return 0;
> +
> +err_close_device:
> +	btrfs_close_bdev(device);
> +	if (device->bdev) {
> +		fs_devices->open_devices--;
> +		btrfs_sysfs_rm_device_link(fs_devices, device);
> +		device->bdev = NULL;
> +	}

I don't understand this part: the 'device' pointer is from the argument,
so the device we want to delete from the list and for that all the state
bit tests, bdev close, list replace rcu and synchronize_rcu should
happen -- in case we have a newly allocated new_device.

What I don't understand how the short version after label
err_close_device: is correct. The device is still left in the list but
with NULL bdev but rw_devices, missing_devices is untouched.

That a device closing needs to allocate memory for a new device instead
of reinitializing it again is stupid but with the simplified device
closing I'm not sure the state is well defined.

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH v2 4/7] btrfs: handle error return of close_fs_devices()
  2019-11-13 10:27 ` [PATCH v2 4/7] btrfs: handle error return of close_fs_devices() Johannes Thumshirn
@ 2019-11-13 15:00   ` David Sterba
  2019-11-14  8:15     ` Johannes Thumshirn
  0 siblings, 1 reply; 24+ messages in thread
From: David Sterba @ 2019-11-13 15:00 UTC (permalink / raw)
  To: Johannes Thumshirn; +Cc: David Sterba, Qu Wenru, Linux BTRFS Mailinglist

On Wed, Nov 13, 2019 at 11:27:25AM +0100, Johannes Thumshirn wrote:
> close_fs_devices() will be able to return an error instead of crashing
> after the following patch.
> 
> Prepare btrfs_close_devices() for this.
> 
> Signed-off-by: Johannes Thumshirn <jthumshirn@suse.de>
> ---
>  fs/btrfs/volumes.c | 11 +++++++----
>  1 file changed, 7 insertions(+), 4 deletions(-)
> 
> diff --git a/fs/btrfs/volumes.c b/fs/btrfs/volumes.c
> index e5864ca3bb3b..be1fd935edf7 100644
> --- a/fs/btrfs/volumes.c
> +++ b/fs/btrfs/volumes.c
> @@ -1143,10 +1143,10 @@ static int close_fs_devices(struct btrfs_fs_devices *fs_devices)
>  int btrfs_close_devices(struct btrfs_fs_devices *fs_devices)
>  {
>  	struct btrfs_fs_devices *seed_devices = NULL;
> -	int ret;
> +	int err, err2 = 0;

Please use ret and ret2.

>  	mutex_lock(&uuid_mutex);
> -	ret = close_fs_devices(fs_devices);
> +	err = close_fs_devices(fs_devices);
>  	if (!fs_devices->opened) {
>  		seed_devices = fs_devices->seed;
>  		fs_devices->seed = NULL;
> @@ -1156,10 +1156,13 @@ int btrfs_close_devices(struct btrfs_fs_devices *fs_devices)
>  	while (seed_devices) {
>  		fs_devices = seed_devices;
>  		seed_devices = fs_devices->seed;
> -		close_fs_devices(fs_devices);
> +		err2 = close_fs_devices(fs_devices);

So only the last error value gets propagated to the return statements.
Is that intentional?

>  		free_fs_devices(fs_devices);
>  	}
> -	return ret;
> +
> +	if (err2)
> +		return err2;
> +	return err;

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH v2 5/7] btrfs: remove final BUG_ON() in close_fs_devices()
  2019-11-13 10:27 ` [PATCH v2 5/7] btrfs: remove final BUG_ON() in close_fs_devices() Johannes Thumshirn
@ 2019-11-13 15:02   ` David Sterba
  2019-11-14  9:01     ` Johannes Thumshirn
  0 siblings, 1 reply; 24+ messages in thread
From: David Sterba @ 2019-11-13 15:02 UTC (permalink / raw)
  To: Johannes Thumshirn; +Cc: David Sterba, Qu Wenru, Linux BTRFS Mailinglist

On Wed, Nov 13, 2019 at 11:27:26AM +0100, Johannes Thumshirn wrote:
> Now that the preparation work is done, remove the temporary BUG_ON() in
> close_fs_devices() and return an error instead.
> 
> Signed-off-by: Johannes Thumshirn <jthumshirn@suse.de>
> 
> ---
> Changes to v1:
> - btrfs_fs_devices::seeding is a 'boolean' flags and no counter, don't
>   decrement it (Qu)
> ---
>  fs/btrfs/volumes.c | 6 +++++-
>  1 file changed, 5 insertions(+), 1 deletion(-)
> 
> diff --git a/fs/btrfs/volumes.c b/fs/btrfs/volumes.c
> index be1fd935edf7..25e4608e20f1 100644
> --- a/fs/btrfs/volumes.c
> +++ b/fs/btrfs/volumes.c
> @@ -1128,7 +1128,11 @@ static int close_fs_devices(struct btrfs_fs_devices *fs_devices)
>  	mutex_lock(&fs_devices->device_list_mutex);
>  	list_for_each_entry_safe(device, tmp, &fs_devices->devices, dev_list) {
>  		ret = btrfs_close_one_device(device);
> -		BUG_ON(ret); /* -ENOMEM */
> +		if (ret) {
> +			mutex_unlock(&fs_devices->device_list_mutex);
> +			return ret;

This can fail in the middle of the loop thus leaving some devices in the
list and keeping open_devices half-changed.

Not all callers of close_fs_devices handle the errors so this can break
invariants, where eg. fs_devices->opened is expected to be 0 after the
function call, similar for ->seeding or ->rw_devices.

> +		}
> +		fs_devices->opened--;
>  	}
>  	mutex_unlock(&fs_devices->device_list_mutex);
>  
> -- 
> 2.16.4

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH v2 0/7] remove BUG_ON()s in btrfs_close_one_device()
  2019-11-13 10:27 [PATCH v2 0/7] remove BUG_ON()s in btrfs_close_one_device() Johannes Thumshirn
                   ` (7 preceding siblings ...)
  2019-11-13 11:56 ` [PATCH v2 0/7] remove BUG_ON()s in btrfs_close_one_device() Qu Wenruo
@ 2019-11-13 15:05 ` David Sterba
  8 siblings, 0 replies; 24+ messages in thread
From: David Sterba @ 2019-11-13 15:05 UTC (permalink / raw)
  To: Johannes Thumshirn; +Cc: David Sterba, Qu Wenru, Linux BTRFS Mailinglist

On Wed, Nov 13, 2019 at 11:27:21AM +0100, Johannes Thumshirn wrote:
> This series attempts to remove the BUG_ON()s in btrfs_close_one_device().
> Therefore some reorganization of btrfs_close_one_device() and
> close_fs_devices() was needed.
> 
> Forthermore a new BUG_ON() had to be temporarily introduced but is removed
> again in the last patch of theis series.
> 
> Although it is generally legal to return -ENOMEM on umount(2), the error
> handling up until close_ctree() as neither close_ctree() nor btrfs_put_super()
> would be able to handle the error.
> 
> This series has passed fstests without any deviation from the baseline and
> also the new error handling was tested via error injection using this snippet:
> 
> diff --git a/fs/btrfs/volumes.c b/fs/btrfs/volumes.c
> index 7c55169c0613..c58802c9c39c 100644
> --- a/fs/btrfs/volumes.c
> +++ b/fs/btrfs/volumes.c
> @@ -1069,6 +1069,9 @@ static int btrfs_close_one_device(struct btrfs_device *device)
>  
>  	new_device = btrfs_alloc_device(NULL, &device->devid,
>  					device->uuid);
> +	btrfs_free_device(new_device);
> +	pr_err("%s() INJECTING -ENOMEM\n", __func__);
> +	new_device = ERR_PTR(-ENOMEM);
>  	if (IS_ERR(new_device))
>  		return PTR_ERR(new_device);
> 
> Changes to v1:
> 
> Fixed the decremt of btrfs_fs_devices::seeding.
> 
> In addition to this, I've added two patches changing btrfs_fs_devices::seeding
> and btrfs_fs_devices::rotating to bool, as they are in fact used as booleans.
> 
> Johannes Thumshirn (7):
>   btrfs: decrement number of open devices after closing the device not
>     before
>   btrfs: handle device allocation failure in btrfs_close_one_device()
>   btrfs: handle allocation failure in strdup
>   btrfs: handle error return of close_fs_devices()
>   btrfs: remove final BUG_ON() in close_fs_devices()
>   btrfs: change btrfs_fs_devices::seeing to bool
>   btrfs: change btrfs_fs_devices::rotating to bool

I'll add the last 2 for now as they're obviously correct, the rest has
some questions that I'd like to clarify before merging.

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH v2 4/7] btrfs: handle error return of close_fs_devices()
  2019-11-13 15:00   ` David Sterba
@ 2019-11-14  8:15     ` Johannes Thumshirn
  0 siblings, 0 replies; 24+ messages in thread
From: Johannes Thumshirn @ 2019-11-14  8:15 UTC (permalink / raw)
  To: dsterba, David Sterba, Qu Wenru, Linux BTRFS Mailinglist

On 13/11/2019 16:00, David Sterba wrote:
> On Wed, Nov 13, 2019 at 11:27:25AM +0100, Johannes Thumshirn wrote:
>> close_fs_devices() will be able to return an error instead of crashing
>> after the following patch.
>>
>> Prepare btrfs_close_devices() for this.
>>
>> Signed-off-by: Johannes Thumshirn <jthumshirn@suse.de>
>> ---
>>  fs/btrfs/volumes.c | 11 +++++++----
>>  1 file changed, 7 insertions(+), 4 deletions(-)
>>
>> diff --git a/fs/btrfs/volumes.c b/fs/btrfs/volumes.c
>> index e5864ca3bb3b..be1fd935edf7 100644
>> --- a/fs/btrfs/volumes.c
>> +++ b/fs/btrfs/volumes.c
>> @@ -1143,10 +1143,10 @@ static int close_fs_devices(struct btrfs_fs_devices *fs_devices)
>>  int btrfs_close_devices(struct btrfs_fs_devices *fs_devices)
>>  {
>>  	struct btrfs_fs_devices *seed_devices = NULL;
>> -	int ret;
>> +	int err, err2 = 0;
> 
> Please use ret and ret2.

Sure.

>>  	mutex_lock(&uuid_mutex);
>> -	ret = close_fs_devices(fs_devices);
>> +	err = close_fs_devices(fs_devices);
>>  	if (!fs_devices->opened) {
>>  		seed_devices = fs_devices->seed;
>>  		fs_devices->seed = NULL;
>> @@ -1156,10 +1156,13 @@ int btrfs_close_devices(struct btrfs_fs_devices *fs_devices)
>>  	while (seed_devices) {
>>  		fs_devices = seed_devices;
>>  		seed_devices = fs_devices->seed;
>> -		close_fs_devices(fs_devices);
>> +		err2 = close_fs_devices(fs_devices);
> 
> So only the last error value gets propagated to the return statements.
> Is that intentional?

Yes it was. Even if the first close_fs_devices() call fails, we can
still try to close eventual seed devices. If this fails as well, we
return the second error. If it succeeds we'll return the first error (or 0).

Byte,
	Johannes
-- 
Johannes Thumshirn                            SUSE Labs Filesystems
jthumshirn@suse.de                                +49 911 74053 689
SUSE Software Solutions Germany GmbH
Maxfeldstr. 5
90409 Nürnberg
Germany
(HRB 36809, AG Nürnberg)
Geschäftsführer: Felix Imendörffer
Key fingerprint = EC38 9CAB C2C4 F25D 8600 D0D0 0393 969D 2D76 0850

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH v2 2/7] btrfs: handle device allocation failure in btrfs_close_one_device()
  2019-11-13 14:58   ` David Sterba
@ 2019-11-14  8:48     ` Johannes Thumshirn
  2019-11-14 10:56       ` Anand Jain
  0 siblings, 1 reply; 24+ messages in thread
From: Johannes Thumshirn @ 2019-11-14  8:48 UTC (permalink / raw)
  To: dsterba, David Sterba, Qu Wenru, Linux BTRFS Mailinglist

On 13/11/2019 15:58, David Sterba wrote:
> On Wed, Nov 13, 2019 at 11:27:23AM +0100, Johannes Thumshirn wrote:
>> In btrfs_close_one_device() we're allocating a new device and if this
>> fails we BUG().
>>
>> Move the allocation to the top of the function and return an error in case
>> it failed.
>>
>> The BUG_ON() is temporarily moved to close_fs_devices(), the caller of
>> btrfs_close_one_device() as further work is pending to untangle this.
>>
>> Signed-off-by: Johannes Thumshirn <jthumshirn@suse.de>
>> ---
>>  fs/btrfs/volumes.c | 27 +++++++++++++++++++++------
>>  1 file changed, 21 insertions(+), 6 deletions(-)
>>
>> diff --git a/fs/btrfs/volumes.c b/fs/btrfs/volumes.c
>> index 5ee26e7fca32..0a2a73907563 100644
>> --- a/fs/btrfs/volumes.c
>> +++ b/fs/btrfs/volumes.c
>> @@ -1061,12 +1061,17 @@ static void btrfs_close_bdev(struct btrfs_device *device)
>>  	blkdev_put(device->bdev, device->mode);
>>  }
>>  
>> -static void btrfs_close_one_device(struct btrfs_device *device)
>> +static int btrfs_close_one_device(struct btrfs_device *device)
>>  {
>>  	struct btrfs_fs_devices *fs_devices = device->fs_devices;
>>  	struct btrfs_device *new_device;
>>  	struct rcu_string *name;
>>  
>> +	new_device = btrfs_alloc_device(NULL, &device->devid,
>> +					device->uuid);
>> +	if (IS_ERR(new_device))
>> +		goto err_close_device;
>> +
>>  	if (test_bit(BTRFS_DEV_STATE_WRITEABLE, &device->dev_state) &&
>>  	    device->devid != BTRFS_DEV_REPLACE_DEVID) {
>>  		list_del_init(&device->dev_alloc_list);
>> @@ -1080,10 +1085,6 @@ static void btrfs_close_one_device(struct btrfs_device *device)
>>  	if (device->bdev)
>>  		fs_devices->open_devices--;
>>  
>> -	new_device = btrfs_alloc_device(NULL, &device->devid,
>> -					device->uuid);
>> -	BUG_ON(IS_ERR(new_device)); /* -ENOMEM */
>> -
>>  	/* Safe because we are under uuid_mutex */
>>  	if (device->name) {
>>  		name = rcu_string_strdup(device->name->str, GFP_NOFS);
>> @@ -1096,18 +1097,32 @@ static void btrfs_close_one_device(struct btrfs_device *device)
>>  
>>  	synchronize_rcu();
>>  	btrfs_free_device(device);
>> +
>> +	return 0;
>> +
>> +err_close_device:
>> +	btrfs_close_bdev(device);
>> +	if (device->bdev) {
>> +		fs_devices->open_devices--;
>> +		btrfs_sysfs_rm_device_link(fs_devices, device);
>> +		device->bdev = NULL;
>> +	}
> 
> I don't understand this part: the 'device' pointer is from the argument,
> so the device we want to delete from the list and for that all the state
> bit tests, bdev close, list replace rcu and synchronize_rcu should
> happen -- in case we have a newly allocated new_device.
> 
> What I don't understand how the short version after label
> err_close_device: is correct. The device is still left in the list but
> with NULL bdev but rw_devices, missing_devices is untouched.
> 
> That a device closing needs to allocate memory for a new device instead
> of reinitializing it again is stupid but with the simplified device
> closing I'm not sure the state is well defined.

As we couldn't allocate memory to remove the device from the list, we
have to keep it in the list (technically even leaking some memory here).

What we definitively need to do is clear the ->bdev pointer, otherwise
we'll trip over a NULL-pointer in open_fs_devices().

open_fs_devices() will traverse the list and call
btrfs_open_one_device() this will fail as device->bdev is (still) set
thus latest_dev is NULL and then this 'fs_devices->latest_bdev =
latest_dev->bdev;' will blow up.

If you have a better solution I'm all ears. This is what I came up with
to tackle the problem of half initialized devices.

One thing we could do though is call btrfs_free_stale_devices() in the
error case.

Byte,
	Johannes
-- 
Johannes Thumshirn                            SUSE Labs Filesystems
jthumshirn@suse.de                                +49 911 74053 689
SUSE Software Solutions Germany GmbH
Maxfeldstr. 5
90409 Nürnberg
Germany
(HRB 36809, AG Nürnberg)
Geschäftsführer: Felix Imendörffer
Key fingerprint = EC38 9CAB C2C4 F25D 8600 D0D0 0393 969D 2D76 0850

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH v2 5/7] btrfs: remove final BUG_ON() in close_fs_devices()
  2019-11-13 15:02   ` David Sterba
@ 2019-11-14  9:01     ` Johannes Thumshirn
  0 siblings, 0 replies; 24+ messages in thread
From: Johannes Thumshirn @ 2019-11-14  9:01 UTC (permalink / raw)
  To: dsterba, David Sterba, Qu Wenru, Linux BTRFS Mailinglist

On 13/11/2019 16:02, David Sterba wrote:
> On Wed, Nov 13, 2019 at 11:27:26AM +0100, Johannes Thumshirn wrote:
>> Now that the preparation work is done, remove the temporary BUG_ON() in
>> close_fs_devices() and return an error instead.
>>
>> Signed-off-by: Johannes Thumshirn <jthumshirn@suse.de>
>>
>> ---
>> Changes to v1:
>> - btrfs_fs_devices::seeding is a 'boolean' flags and no counter, don't
>>   decrement it (Qu)
>> ---
>>  fs/btrfs/volumes.c | 6 +++++-
>>  1 file changed, 5 insertions(+), 1 deletion(-)
>>
>> diff --git a/fs/btrfs/volumes.c b/fs/btrfs/volumes.c
>> index be1fd935edf7..25e4608e20f1 100644
>> --- a/fs/btrfs/volumes.c
>> +++ b/fs/btrfs/volumes.c
>> @@ -1128,7 +1128,11 @@ static int close_fs_devices(struct btrfs_fs_devices *fs_devices)
>>  	mutex_lock(&fs_devices->device_list_mutex);
>>  	list_for_each_entry_safe(device, tmp, &fs_devices->devices, dev_list) {
>>  		ret = btrfs_close_one_device(device);
>> -		BUG_ON(ret); /* -ENOMEM */
>> +		if (ret) {
>> +			mutex_unlock(&fs_devices->device_list_mutex);
>> +			return ret;
> 
> This can fail in the middle of the loop thus leaving some devices in the
> list and keeping open_devices half-changed.
> 
> Not all callers of close_fs_devices handle the errors so this can break
> invariants, where eg. fs_devices->opened is expected to be 0 after the
> function call, similar for ->seeding or ->rw_devices.

Please see my answer to "btrfs: handle device allocation failure in
btrfs_close_one_device()" on this topic.

-- 
Johannes Thumshirn                            SUSE Labs Filesystems
jthumshirn@suse.de                                +49 911 74053 689
SUSE Software Solutions Germany GmbH
Maxfeldstr. 5
90409 Nürnberg
Germany
(HRB 36809, AG Nürnberg)
Geschäftsführer: Felix Imendörffer
Key fingerprint = EC38 9CAB C2C4 F25D 8600 D0D0 0393 969D 2D76 0850

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH v2 2/7] btrfs: handle device allocation failure in btrfs_close_one_device()
  2019-11-14  8:48     ` Johannes Thumshirn
@ 2019-11-14 10:56       ` Anand Jain
  2019-11-14 12:03         ` Johannes Thumshirn
  2019-11-14 13:02         ` Johannes Thumshirn
  0 siblings, 2 replies; 24+ messages in thread
From: Anand Jain @ 2019-11-14 10:56 UTC (permalink / raw)
  To: Johannes Thumshirn, dsterba, David Sterba, Qu Wenru,
	Linux BTRFS Mailinglist

On 14/11/19 4:48 PM, Johannes Thumshirn wrote:
> On 13/11/2019 15:58, David Sterba wrote:
>> On Wed, Nov 13, 2019 at 11:27:23AM +0100, Johannes Thumshirn wrote:
>>> In btrfs_close_one_device() we're allocating a new device and if this
>>> fails we BUG().
>>>
>>> Move the allocation to the top of the function and return an error in case
>>> it failed.
>>>
>>> The BUG_ON() is temporarily moved to close_fs_devices(), the caller of
>>> btrfs_close_one_device() as further work is pending to untangle this.
>>>
>>> Signed-off-by: Johannes Thumshirn <jthumshirn@suse.de>
>>> ---
>>>   fs/btrfs/volumes.c | 27 +++++++++++++++++++++------
>>>   1 file changed, 21 insertions(+), 6 deletions(-)
>>>
>>> diff --git a/fs/btrfs/volumes.c b/fs/btrfs/volumes.c
>>> index 5ee26e7fca32..0a2a73907563 100644
>>> --- a/fs/btrfs/volumes.c
>>> +++ b/fs/btrfs/volumes.c
>>> @@ -1061,12 +1061,17 @@ static void btrfs_close_bdev(struct btrfs_device *device)
>>>   	blkdev_put(device->bdev, device->mode);
>>>   }
>>>   
>>> -static void btrfs_close_one_device(struct btrfs_device *device)
>>> +static int btrfs_close_one_device(struct btrfs_device *device)
>>>   {
>>>   	struct btrfs_fs_devices *fs_devices = device->fs_devices;
>>>   	struct btrfs_device *new_device;
>>>   	struct rcu_string *name;
>>>   
>>> +	new_device = btrfs_alloc_device(NULL, &device->devid,
>>> +					device->uuid);
>>> +	if (IS_ERR(new_device))
>>> +		goto err_close_device;
>>> +
>>>   	if (test_bit(BTRFS_DEV_STATE_WRITEABLE, &device->dev_state) &&
>>>   	    device->devid != BTRFS_DEV_REPLACE_DEVID) {
>>>   		list_del_init(&device->dev_alloc_list);
>>> @@ -1080,10 +1085,6 @@ static void btrfs_close_one_device(struct btrfs_device *device)
>>>   	if (device->bdev)
>>>   		fs_devices->open_devices--;
>>>   
>>> -	new_device = btrfs_alloc_device(NULL, &device->devid,
>>> -					device->uuid);
>>> -	BUG_ON(IS_ERR(new_device)); /* -ENOMEM */
>>> -
>>>   	/* Safe because we are under uuid_mutex */
>>>   	if (device->name) {
>>>   		name = rcu_string_strdup(device->name->str, GFP_NOFS);
>>> @@ -1096,18 +1097,32 @@ static void btrfs_close_one_device(struct btrfs_device *device)
>>>   
>>>   	synchronize_rcu();
>>>   	btrfs_free_device(device);
>>> +
>>> +	return 0;
>>> +
>>> +err_close_device:
>>> +	btrfs_close_bdev(device);
>>> +	if (device->bdev) {
>>> +		fs_devices->open_devices--;
>>> +		btrfs_sysfs_rm_device_link(fs_devices, device);
>>> +		device->bdev = NULL;
>>> +	}
>>
>> I don't understand this part: the 'device' pointer is from the argument,
>> so the device we want to delete from the list and for that all the state
>> bit tests, bdev close, list replace rcu and synchronize_rcu should
>> happen -- in case we have a newly allocated new_device.
>>
>> What I don't understand how the short version after label
>> err_close_device: is correct. The device is still left in the list but
>> with NULL bdev but rw_devices, missing_devices is untouched.
>>
>> That a device closing needs to allocate memory for a new device instead
>> of reinitializing it again is stupid but with the simplified device
>> closing I'm not sure the state is well defined.
> 
> As we couldn't allocate memory to remove the device from the list, we
> have to keep it in the list (technically even leaking some memory here).
> 
> What we definitively need to do is clear the ->bdev pointer, otherwise
> we'll trip over a NULL-pointer in open_fs_devices().
> 
> open_fs_devices() will traverse the list and call
> btrfs_open_one_device() this will fail as device->bdev is (still) set
> thus latest_dev is NULL and then this 'fs_devices->latest_bdev =
> latest_dev->bdev;' will blow up.
> 
> If you have a better solution I'm all ears. This is what I came up with
> to tackle the problem of half initialized devices.
> 
> One thing we could do though is call btrfs_free_stale_devices() in the
> error case.
> 
> Byte,
> 	Johannes
> 

Johannes,

   Thanks for attempting to fix this.

   I wrote comments about this unoptimized code here [1]

   [1]
    ML email therad
     'invalid opcode in close_fs_devices'

 
https://groups.google.com/forum/#!msg/syzkaller-bugs/eSgcqygYaXE/6wuz-0jMCwAJ

   You may want to review.

   Yes David is correct why a closed device will still remain in the
   dev_alloc_list even after the close here in this patch.

Thanks, Anand

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH v2 3/7] btrfs: handle allocation failure in strdup
  2019-11-13 10:27 ` [PATCH v2 3/7] btrfs: handle allocation failure in strdup Johannes Thumshirn
@ 2019-11-14 11:00   ` Anand Jain
  2019-11-15  9:39     ` David Sterba
  2019-11-15 21:11   ` Nikolay Borisov
  1 sibling, 1 reply; 24+ messages in thread
From: Anand Jain @ 2019-11-14 11:00 UTC (permalink / raw)
  To: Johannes Thumshirn, David Sterba; +Cc: Qu Wenru, Linux BTRFS Mailinglist

On 13/11/19 6:27 PM, Johannes Thumshirn wrote:
> Gracefully handle allocation failures in btrfs_close_one_device()'s
> rcu_string_strdup() instead of crashing the machine.
> 
> Signed-off-by: Johannes Thumshirn <jthumshirn@suse.de>
> ---
>   fs/btrfs/volumes.c | 22 ++++++++++++++--------
>   1 file changed, 14 insertions(+), 8 deletions(-)
> 
> diff --git a/fs/btrfs/volumes.c b/fs/btrfs/volumes.c
> index 0a2a73907563..e5864ca3bb3b 100644
> --- a/fs/btrfs/volumes.c
> +++ b/fs/btrfs/volumes.c
> @@ -1064,7 +1064,7 @@ static void btrfs_close_bdev(struct btrfs_device *device)
>   static int btrfs_close_one_device(struct btrfs_device *device)
>   {
>   	struct btrfs_fs_devices *fs_devices = device->fs_devices;
> -	struct btrfs_device *new_device;
> +	struct btrfs_device *new_device = NULL;
>   	struct rcu_string *name;
>   
>   	new_device = btrfs_alloc_device(NULL, &device->devid,
> @@ -1072,6 +1072,15 @@ static int btrfs_close_one_device(struct btrfs_device *device)
>   	if (IS_ERR(new_device))
>   		goto err_close_device;
>   
> +	/* Safe because we are under uuid_mutex */
> +	if (device->name) {
> +		name = rcu_string_strdup(device->name->str, GFP_NOFS);
> +		if (!name)
> +			goto err_free_device;
> +
> +		rcu_assign_pointer(new_device->name, name);
> +	}
> +


  Any idea why do we need to strdup() at all to close a device?

Thanks, Anand


>   	if (test_bit(BTRFS_DEV_STATE_WRITEABLE, &device->dev_state) &&
>   	    device->devid != BTRFS_DEV_REPLACE_DEVID) {
>   		list_del_init(&device->dev_alloc_list);
> @@ -1085,13 +1094,6 @@ static int btrfs_close_one_device(struct btrfs_device *device)
>   	if (device->bdev)
>   		fs_devices->open_devices--;
>   
> -	/* Safe because we are under uuid_mutex */
> -	if (device->name) {
> -		name = rcu_string_strdup(device->name->str, GFP_NOFS);
> -		BUG_ON(!name); /* -ENOMEM */
> -		rcu_assign_pointer(new_device->name, name);
> -	}
> -
>   	list_replace_rcu(&device->dev_list, &new_device->dev_list);
>   	new_device->fs_devices = device->fs_devices;
>   
> @@ -1100,6 +1102,10 @@ static int btrfs_close_one_device(struct btrfs_device *device)
>   
>   	return 0;
>   
> +err_free_device:
> +	if (new_device)
> +		btrfs_free_device(new_device);
> +
>   err_close_device:
>   	btrfs_close_bdev(device);
>   	if (device->bdev) {
> 


^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH v2 6/7] btrfs: change btrfs_fs_devices::seeing to bool
  2019-11-13 10:27 ` [PATCH v2 6/7] btrfs: change btrfs_fs_devices::seeing to bool Johannes Thumshirn
@ 2019-11-14 11:04   ` Anand Jain
  0 siblings, 0 replies; 24+ messages in thread
From: Anand Jain @ 2019-11-14 11:04 UTC (permalink / raw)
  To: Johannes Thumshirn, David Sterba; +Cc: Qu Wenru, Linux BTRFS Mailinglist

On 13/11/19 6:27 PM, Johannes Thumshirn wrote:
> struct btrfs_fs_devices::seeding currently is declared as an integer
> variable but only used as a boolean.
> 
> Change the variable definition to bool and update to code touching it to
> set 'true' and 'false'.
> 
> Signed-off-by: Johannes Thumshirn <jthumshirn@suse.de>

Nit:
Re: [PATCH v2 6/7] btrfs: change btrfs_fs_devices::seeing to bool
                                                       ^d

Reviewed-by: Anand Jain <anand.jain@oracle.com>

Thanks, Anand


> ---
>   fs/btrfs/volumes.c | 8 ++++----
>   fs/btrfs/volumes.h | 2 +-
>   2 files changed, 5 insertions(+), 5 deletions(-)
> 
> diff --git a/fs/btrfs/volumes.c b/fs/btrfs/volumes.c
> index 25e4608e20f1..ce9c6fa3a32c 100644
> --- a/fs/btrfs/volumes.c
> +++ b/fs/btrfs/volumes.c
> @@ -634,7 +634,7 @@ static int btrfs_open_one_device(struct btrfs_fs_devices *fs_devices,
>   		}
>   
>   		clear_bit(BTRFS_DEV_STATE_WRITEABLE, &device->dev_state);
> -		fs_devices->seeding = 1;
> +		fs_devices->seeding = true;
>   	} else {
>   		if (bdev_read_only(bdev))
>   			clear_bit(BTRFS_DEV_STATE_WRITEABLE, &device->dev_state);
> @@ -1139,7 +1139,7 @@ static int close_fs_devices(struct btrfs_fs_devices *fs_devices)
>   	WARN_ON(fs_devices->open_devices);
>   	WARN_ON(fs_devices->rw_devices);
>   	fs_devices->opened = 0;
> -	fs_devices->seeding = 0;
> +	fs_devices->seeding = false;
>   
>   	return 0;
>   }
> @@ -2297,7 +2297,7 @@ static int btrfs_prepare_sprout(struct btrfs_fs_info *fs_info)
>   	list_splice_init(&fs_devices->alloc_list, &seed_devices->alloc_list);
>   	mutex_unlock(&fs_info->chunk_mutex);
>   
> -	fs_devices->seeding = 0;
> +	fs_devices->seeding = false;
>   	fs_devices->num_devices = 0;
>   	fs_devices->open_devices = 0;
>   	fs_devices->missing_devices = 0;
> @@ -6681,7 +6681,7 @@ static struct btrfs_fs_devices *open_seed_devices(struct btrfs_fs_info *fs_info,
>   		if (IS_ERR(fs_devices))
>   			return fs_devices;
>   
> -		fs_devices->seeding = 1;
> +		fs_devices->seeding = true;
>   		fs_devices->opened = 1;
>   		return fs_devices;
>   	}
> diff --git a/fs/btrfs/volumes.h b/fs/btrfs/volumes.h
> index 46987a2da786..8e9513b3fe9d 100644
> --- a/fs/btrfs/volumes.h
> +++ b/fs/btrfs/volumes.h
> @@ -243,7 +243,7 @@ struct btrfs_fs_devices {
>   	struct list_head alloc_list;
>   
>   	struct btrfs_fs_devices *seed;
> -	int seeding;
> +	bool seeding;
>   
>   	int opened;
>   
> 


^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH v2 7/7] btrfs: change btrfs_fs_devices::rotating to bool
  2019-11-13 10:27 ` [PATCH v2 7/7] btrfs: change btrfs_fs_devices::rotating " Johannes Thumshirn
@ 2019-11-14 11:05   ` Anand Jain
  0 siblings, 0 replies; 24+ messages in thread
From: Anand Jain @ 2019-11-14 11:05 UTC (permalink / raw)
  To: Johannes Thumshirn, David Sterba; +Cc: Qu Wenru, Linux BTRFS Mailinglist

On 13/11/19 6:27 PM, Johannes Thumshirn wrote:
> struct btrfs_fs_devices::rotating currently is declared as an integer
> variable but only used as a boolean.
> 
> Change the variable definition to bool and update to code touching it to
> set 'true' and 'false'.
> 
> Signed-off-by: Johannes Thumshirn <jthumshirn@suse.de>


Reviewed-by: Anand Jain <anand.jain@oracle.com>

Thanks, Anand


> ---
>   fs/btrfs/volumes.c | 6 +++---
>   fs/btrfs/volumes.h | 2 +-
>   2 files changed, 4 insertions(+), 4 deletions(-)
> 
> diff --git a/fs/btrfs/volumes.c b/fs/btrfs/volumes.c
> index ce9c6fa3a32c..6d3bfea2e2d5 100644
> --- a/fs/btrfs/volumes.c
> +++ b/fs/btrfs/volumes.c
> @@ -644,7 +644,7 @@ static int btrfs_open_one_device(struct btrfs_fs_devices *fs_devices,
>   
>   	q = bdev_get_queue(bdev);
>   	if (!blk_queue_nonrot(q))
> -		fs_devices->rotating = 1;
> +		fs_devices->rotating = true;
>   
>   	device->bdev = bdev;
>   	clear_bit(BTRFS_DEV_STATE_IN_FS_METADATA, &device->dev_state);
> @@ -2301,7 +2301,7 @@ static int btrfs_prepare_sprout(struct btrfs_fs_info *fs_info)
>   	fs_devices->num_devices = 0;
>   	fs_devices->open_devices = 0;
>   	fs_devices->missing_devices = 0;
> -	fs_devices->rotating = 0;
> +	fs_devices->rotating = false;
>   	fs_devices->seed = seed_devices;
>   
>   	generate_random_uuid(fs_devices->fsid);
> @@ -2496,7 +2496,7 @@ int btrfs_init_new_device(struct btrfs_fs_info *fs_info, const char *device_path
>   	atomic64_add(device->total_bytes, &fs_info->free_chunk_space);
>   
>   	if (!blk_queue_nonrot(q))
> -		fs_devices->rotating = 1;
> +		fs_devices->rotating = true;
>   
>   	orig_super_total_bytes = btrfs_super_total_bytes(fs_info->super_copy);
>   	btrfs_set_super_total_bytes(fs_info->super_copy,
> diff --git a/fs/btrfs/volumes.h b/fs/btrfs/volumes.h
> index 8e9513b3fe9d..fc1b564b9cfe 100644
> --- a/fs/btrfs/volumes.h
> +++ b/fs/btrfs/volumes.h
> @@ -250,7 +250,7 @@ struct btrfs_fs_devices {
>   	/* set when we find or add a device that doesn't have the
>   	 * nonrot flag set
>   	 */
> -	int rotating;
> +	bool rotating;
>   
>   	struct btrfs_fs_info *fs_info;
>   	/* sysfs kobjects */
> 


^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH v2 2/7] btrfs: handle device allocation failure in btrfs_close_one_device()
  2019-11-14 10:56       ` Anand Jain
@ 2019-11-14 12:03         ` Johannes Thumshirn
  2019-11-14 13:02         ` Johannes Thumshirn
  1 sibling, 0 replies; 24+ messages in thread
From: Johannes Thumshirn @ 2019-11-14 12:03 UTC (permalink / raw)
  To: Anand Jain, dsterba, David Sterba, Qu Wenru, Linux BTRFS Mailinglist

On 14/11/2019 11:56, Anand Jain wrote:
> On 14/11/19 4:48 PM, Johannes Thumshirn wrote:
>> On 13/11/2019 15:58, David Sterba wrote:
>>> On Wed, Nov 13, 2019 at 11:27:23AM +0100, Johannes Thumshirn wrote:
>>>> In btrfs_close_one_device() we're allocating a new device and if this
>>>> fails we BUG().
>>>>
>>>> Move the allocation to the top of the function and return an error
>>>> in case
>>>> it failed.
>>>>
>>>> The BUG_ON() is temporarily moved to close_fs_devices(), the caller of
>>>> btrfs_close_one_device() as further work is pending to untangle this.
>>>>
>>>> Signed-off-by: Johannes Thumshirn <jthumshirn@suse.de>
>>>> ---
>>>>   fs/btrfs/volumes.c | 27 +++++++++++++++++++++------
>>>>   1 file changed, 21 insertions(+), 6 deletions(-)
>>>>
>>>> diff --git a/fs/btrfs/volumes.c b/fs/btrfs/volumes.c
>>>> index 5ee26e7fca32..0a2a73907563 100644
>>>> --- a/fs/btrfs/volumes.c
>>>> +++ b/fs/btrfs/volumes.c
>>>> @@ -1061,12 +1061,17 @@ static void btrfs_close_bdev(struct
>>>> btrfs_device *device)
>>>>       blkdev_put(device->bdev, device->mode);
>>>>   }
>>>>   -static void btrfs_close_one_device(struct btrfs_device *device)
>>>> +static int btrfs_close_one_device(struct btrfs_device *device)
>>>>   {
>>>>       struct btrfs_fs_devices *fs_devices = device->fs_devices;
>>>>       struct btrfs_device *new_device;
>>>>       struct rcu_string *name;
>>>>   +    new_device = btrfs_alloc_device(NULL, &device->devid,
>>>> +                    device->uuid);
>>>> +    if (IS_ERR(new_device))
>>>> +        goto err_close_device;
>>>> +
>>>>       if (test_bit(BTRFS_DEV_STATE_WRITEABLE, &device->dev_state) &&
>>>>           device->devid != BTRFS_DEV_REPLACE_DEVID) {
>>>>           list_del_init(&device->dev_alloc_list);
>>>> @@ -1080,10 +1085,6 @@ static void btrfs_close_one_device(struct
>>>> btrfs_device *device)
>>>>       if (device->bdev)
>>>>           fs_devices->open_devices--;
>>>>   -    new_device = btrfs_alloc_device(NULL, &device->devid,
>>>> -                    device->uuid);
>>>> -    BUG_ON(IS_ERR(new_device)); /* -ENOMEM */
>>>> -
>>>>       /* Safe because we are under uuid_mutex */
>>>>       if (device->name) {
>>>>           name = rcu_string_strdup(device->name->str, GFP_NOFS);
>>>> @@ -1096,18 +1097,32 @@ static void btrfs_close_one_device(struct
>>>> btrfs_device *device)
>>>>         synchronize_rcu();
>>>>       btrfs_free_device(device);
>>>> +
>>>> +    return 0;
>>>> +
>>>> +err_close_device:
>>>> +    btrfs_close_bdev(device);
>>>> +    if (device->bdev) {
>>>> +        fs_devices->open_devices--;
>>>> +        btrfs_sysfs_rm_device_link(fs_devices, device);
>>>> +        device->bdev = NULL;
>>>> +    }
>>>
>>> I don't understand this part: the 'device' pointer is from the argument,
>>> so the device we want to delete from the list and for that all the state
>>> bit tests, bdev close, list replace rcu and synchronize_rcu should
>>> happen -- in case we have a newly allocated new_device.
>>>
>>> What I don't understand how the short version after label
>>> err_close_device: is correct. The device is still left in the list but
>>> with NULL bdev but rw_devices, missing_devices is untouched.
>>>
>>> That a device closing needs to allocate memory for a new device instead
>>> of reinitializing it again is stupid but with the simplified device
>>> closing I'm not sure the state is well defined.
>>
>> As we couldn't allocate memory to remove the device from the list, we
>> have to keep it in the list (technically even leaking some memory here).
>>
>> What we definitively need to do is clear the ->bdev pointer, otherwise
>> we'll trip over a NULL-pointer in open_fs_devices().
>>
>> open_fs_devices() will traverse the list and call
>> btrfs_open_one_device() this will fail as device->bdev is (still) set
>> thus latest_dev is NULL and then this 'fs_devices->latest_bdev =
>> latest_dev->bdev;' will blow up.
>>
>> If you have a better solution I'm all ears. This is what I came up with
>> to tackle the problem of half initialized devices.
>>
>> One thing we could do though is call btrfs_free_stale_devices() in the
>> error case.
>>
>> Byte,
>>     Johannes
>>
> 
> Johannes,
> 
>   Thanks for attempting to fix this.
> 
>   I wrote comments about this unoptimized code here [1]
> 
>   [1]
>    ML email therad
>     'invalid opcode in close_fs_devices'
> 
> 
> https://groups.google.com/forum/#!msg/syzkaller-bugs/eSgcqygYaXE/6wuz-0jMCwAJ
> 
> 
>   You may want to review.
> 
>   Yes David is correct why a closed device will still remain in the
>   dev_alloc_list even after the close here in this patch.

Yes I know, this is why I did this dance. One thing I thought of is,
having a temporary list of the devices to delete and then do the
list_for_each_entry_safe() btrfs_close_one_device() loop on this list.

But this will only work if we really want to remove all devices.

-- 
Johannes Thumshirn                            SUSE Labs Filesystems
jthumshirn@suse.de                                +49 911 74053 689
SUSE Software Solutions Germany GmbH
Maxfeldstr. 5
90409 Nürnberg
Germany
(HRB 36809, AG Nürnberg)
Geschäftsführer: Felix Imendörffer
Key fingerprint = EC38 9CAB C2C4 F25D 8600 D0D0 0393 969D 2D76 0850

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH v2 2/7] btrfs: handle device allocation failure in btrfs_close_one_device()
  2019-11-14 10:56       ` Anand Jain
  2019-11-14 12:03         ` Johannes Thumshirn
@ 2019-11-14 13:02         ` Johannes Thumshirn
  1 sibling, 0 replies; 24+ messages in thread
From: Johannes Thumshirn @ 2019-11-14 13:02 UTC (permalink / raw)
  To: Anand Jain, dsterba, David Sterba, Qu Wenru, Linux BTRFS Mailinglist

On 14/11/2019 11:56, Anand Jain wrote:
> Yes David is correct why a closed device will still remain in the
>   dev_alloc_list even after the close here in this patch.

OK, re-visited the Code again. And I think you're right I've moved this
hunk quite a bit:

        if (test_bit(BTRFS_DEV_STATE_WRITEABLE, &device->dev_state) &&
            device->devid != BTRFS_DEV_REPLACE_DEVID) {
                list_del_init(&device->dev_alloc_list);
                fs_devices->rw_devices--;
        }


My initial intention was to first have the allocations done so I don't
have to undo anything in case of a failure.

I'm back to the drawing board here.
-- 
Johannes Thumshirn                            SUSE Labs Filesystems
jthumshirn@suse.de                                +49 911 74053 689
SUSE Software Solutions Germany GmbH
Maxfeldstr. 5
90409 Nürnberg
Germany
(HRB 36809, AG Nürnberg)
Geschäftsführer: Felix Imendörffer
Key fingerprint = EC38 9CAB C2C4 F25D 8600 D0D0 0393 969D 2D76 0850

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH v2 3/7] btrfs: handle allocation failure in strdup
  2019-11-14 11:00   ` Anand Jain
@ 2019-11-15  9:39     ` David Sterba
  0 siblings, 0 replies; 24+ messages in thread
From: David Sterba @ 2019-11-15  9:39 UTC (permalink / raw)
  To: Anand Jain
  Cc: Johannes Thumshirn, David Sterba, Qu Wenru, Linux BTRFS Mailinglist

On Thu, Nov 14, 2019 at 07:00:54PM +0800, Anand Jain wrote:
> On 13/11/19 6:27 PM, Johannes Thumshirn wrote:
> > Gracefully handle allocation failures in btrfs_close_one_device()'s
> > rcu_string_strdup() instead of crashing the machine.
> > 
> > Signed-off-by: Johannes Thumshirn <jthumshirn@suse.de>
> > ---
> >   fs/btrfs/volumes.c | 22 ++++++++++++++--------
> >   1 file changed, 14 insertions(+), 8 deletions(-)
> > 
> > diff --git a/fs/btrfs/volumes.c b/fs/btrfs/volumes.c
> > index 0a2a73907563..e5864ca3bb3b 100644
> > --- a/fs/btrfs/volumes.c
> > +++ b/fs/btrfs/volumes.c
> > @@ -1064,7 +1064,7 @@ static void btrfs_close_bdev(struct btrfs_device *device)
> >   static int btrfs_close_one_device(struct btrfs_device *device)
> >   {
> >   	struct btrfs_fs_devices *fs_devices = device->fs_devices;
> > -	struct btrfs_device *new_device;
> > +	struct btrfs_device *new_device = NULL;
> >   	struct rcu_string *name;
> >   
> >   	new_device = btrfs_alloc_device(NULL, &device->devid,
> > @@ -1072,6 +1072,15 @@ static int btrfs_close_one_device(struct btrfs_device *device)
> >   	if (IS_ERR(new_device))
> >   		goto err_close_device;
> >   
> > +	/* Safe because we are under uuid_mutex */
> > +	if (device->name) {
> > +		name = rcu_string_strdup(device->name->str, GFP_NOFS);
> > +		if (!name)
> > +			goto err_free_device;
> > +
> > +		rcu_assign_pointer(new_device->name, name);
> > +	}
> > +
> 
>   Any idea why do we need to strdup() at all to close a device?

It shouldn't be needed but that's how it got implemented since the
beginning in e4404d6e8da678d852. The device on close is duplicated, so
has to be the name.

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH v2 3/7] btrfs: handle allocation failure in strdup
  2019-11-13 10:27 ` [PATCH v2 3/7] btrfs: handle allocation failure in strdup Johannes Thumshirn
  2019-11-14 11:00   ` Anand Jain
@ 2019-11-15 21:11   ` Nikolay Borisov
  1 sibling, 0 replies; 24+ messages in thread
From: Nikolay Borisov @ 2019-11-15 21:11 UTC (permalink / raw)
  To: Johannes Thumshirn, David Sterba; +Cc: Qu Wenru, Linux BTRFS Mailinglist



On 13.11.19 г. 12:27 ч., Johannes Thumshirn wrote:
> Gracefully handle allocation failures in btrfs_close_one_device()'s
> rcu_string_strdup() instead of crashing the machine.
> 
> Signed-off-by: Johannes Thumshirn <jthumshirn@suse.de>
> ---
>  fs/btrfs/volumes.c | 22 ++++++++++++++--------
>  1 file changed, 14 insertions(+), 8 deletions(-)
> 
> diff --git a/fs/btrfs/volumes.c b/fs/btrfs/volumes.c
> index 0a2a73907563..e5864ca3bb3b 100644
> --- a/fs/btrfs/volumes.c
> +++ b/fs/btrfs/volumes.c
> @@ -1064,7 +1064,7 @@ static void btrfs_close_bdev(struct btrfs_device *device)
>  static int btrfs_close_one_device(struct btrfs_device *device)
>  {
>  	struct btrfs_fs_devices *fs_devices = device->fs_devices;
> -	struct btrfs_device *new_device;
> +	struct btrfs_device *new_device = NULL;
>  	struct rcu_string *name;
>  
>  	new_device = btrfs_alloc_device(NULL, &device->devid,
> @@ -1072,6 +1072,15 @@ static int btrfs_close_one_device(struct btrfs_device *device)
>  	if (IS_ERR(new_device))
>  		goto err_close_device;
>  
> +	/* Safe because we are under uuid_mutex */
> +	if (device->name) {
> +		name = rcu_string_strdup(device->name->str, GFP_NOFS);
> +		if (!name)
> +			goto err_free_device;
> +
> +		rcu_assign_pointer(new_device->name, name);
> +	}

This could really be:


diff --git a/fs/btrfs/volumes.c b/fs/btrfs/volumes.c
index e148b13905c5..7bb3cd8afa7a 100644
--- a/fs/btrfs/volumes.c
+++ b/fs/btrfs/volumes.c
@@ -1086,11 +1086,8 @@ static void btrfs_close_one_device(struct
btrfs_device *device)
        BUG_ON(IS_ERR(new_device)); /* -ENOMEM */

        /* Safe because we are under uuid_mutex */
-       if (device->name) {
-               name = rcu_string_strdup(device->name->str, GFP_NOFS);
-               BUG_ON(!name); /* -ENOMEM */
-               rcu_assign_pointer(new_device->name, name);
-       }
+       new_device->name = device->name;
+       device->name = NULL;

        list_replace_rcu(&device->dev_list, &new_device->dev_list);
        new_device->fs_devices = device->fs_devices;

rcu_string_free already checks if device->name is non-NULL.


<snip>

^ permalink raw reply related	[flat|nested] 24+ messages in thread

end of thread, other threads:[~2019-11-15 21:11 UTC | newest]

Thread overview: 24+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2019-11-13 10:27 [PATCH v2 0/7] remove BUG_ON()s in btrfs_close_one_device() Johannes Thumshirn
2019-11-13 10:27 ` [PATCH v2 1/7] btrfs: decrement number of open devices after closing the device not before Johannes Thumshirn
2019-11-13 10:27 ` [PATCH v2 2/7] btrfs: handle device allocation failure in btrfs_close_one_device() Johannes Thumshirn
2019-11-13 14:58   ` David Sterba
2019-11-14  8:48     ` Johannes Thumshirn
2019-11-14 10:56       ` Anand Jain
2019-11-14 12:03         ` Johannes Thumshirn
2019-11-14 13:02         ` Johannes Thumshirn
2019-11-13 10:27 ` [PATCH v2 3/7] btrfs: handle allocation failure in strdup Johannes Thumshirn
2019-11-14 11:00   ` Anand Jain
2019-11-15  9:39     ` David Sterba
2019-11-15 21:11   ` Nikolay Borisov
2019-11-13 10:27 ` [PATCH v2 4/7] btrfs: handle error return of close_fs_devices() Johannes Thumshirn
2019-11-13 15:00   ` David Sterba
2019-11-14  8:15     ` Johannes Thumshirn
2019-11-13 10:27 ` [PATCH v2 5/7] btrfs: remove final BUG_ON() in close_fs_devices() Johannes Thumshirn
2019-11-13 15:02   ` David Sterba
2019-11-14  9:01     ` Johannes Thumshirn
2019-11-13 10:27 ` [PATCH v2 6/7] btrfs: change btrfs_fs_devices::seeing to bool Johannes Thumshirn
2019-11-14 11:04   ` Anand Jain
2019-11-13 10:27 ` [PATCH v2 7/7] btrfs: change btrfs_fs_devices::rotating " Johannes Thumshirn
2019-11-14 11:05   ` Anand Jain
2019-11-13 11:56 ` [PATCH v2 0/7] remove BUG_ON()s in btrfs_close_one_device() Qu Wenruo
2019-11-13 15:05 ` David Sterba

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.