Linux-BTRFS Archive on lore.kernel.org
 help / color / Atom feed
* [PATCH v2 00/21] btrfs: refactor and generalize chunk/dev_extent/extent allocation
@ 2020-02-12  7:20 Naohiro Aota
  2020-02-12  7:20 ` [PATCH v2 01/21] btrfs: change type of full_search to bool Naohiro Aota
                   ` (20 more replies)
  0 siblings, 21 replies; 57+ messages in thread
From: Naohiro Aota @ 2020-02-12  7:20 UTC (permalink / raw)
  To: linux-btrfs, David Sterba
  Cc: Chris Mason, Josef Bacik, Nikolay Borisov, Damien Le Moal,
	Johannes Thumshirn, Hannes Reinecke, Anand Jain, linux-fsdevel,
	Naohiro Aota

This series refactors chunk allocation, device_extent allocation and
extent allocation functions and make them generalized to be able to
implement other allocation policy easily.

On top of this series, we can simplify some part of the "btrfs: zoned
block device support" series as adding a new type of chunk allocator
and extent allocator for zoned block devices. Furthermore, we will be
able to implement and test some other allocator in the idea page of
the wiki e.g. SSD caching, dedicated metadata drive, chunk allocation
groups, and so on.

This series has no functional changes except introducing "enum
btrfs_chunk_allocation_policy" and "enum
btrfs_extent_allocation_policy".

* Refactoring chunk/dev_extent allocator

Two functions are separated from find_free_dev_extent_start().
dev_extent_search_start() decides the starting position of the search.
dev_extent_hole_check() checks if a hole found is suitable for device
extent allocation.

__btrfs_alloc_chunk() is split into four functions. set_parameters()
initializes the parameters of an allocation. gather_device_info()
loops over devices and gather information of
them. decide_stripe_size() decides the size of chunk and
device_extent. And, create_chunk() creates a chunk and device extents.

* Refactoring extent allocator

Three functions are introduced in
find_free_extent(). prepare_allocation() initializes the parameters
and gives a hint byte to start the allocation with. do_allocation()
handles the actual allocation in a given block group.
release_block_group() is called when it gives up an allocation from a
block group, so the allocation context should be reset.

Two functions are introduced in find_free_extent_update_loop().
found_extent() is called when the allocator finally find a proper
extent. chunk_allocation_failed() is called when it failed to allocate
a new chunk. An allocator implementation can use this hook to set the
next stage to try e.g. LOOP_NO_EMPTY_SIZE.

Furthermore, LOOP_NO_EMPTY_SIZE stage is tweaked so that other
allocator than the current clustered allocator skips this stage.

* Patch organization

Patch 1 is a trivial patch to fix the type of an argument of
find_free_extent_update_loop().

Patch 2 removes a BUG_ON from __btrfs_alloc_chunk().

Patches 3-10 refactors chunk and device_extent allocation functions:
find_free_dev_extent_start() and __btrfs_alloc_chunk().

Patches 11-21 refactors extent allocation function: find_free_extent()
and find_free_extent_update_loop().

* Changelog

 - v2
   - Stop separating "clustered_alloc_info" from find_free_extent_ctl
   - Change return type of dev_extent_hole_check() to bool
   - Rename set_parameters() to init_alloc_chunk_ctl()
   - Add a patch to remove BUG_ON from __btrfs_alloc_chunk()

Naohiro Aota (21):
  btrfs: change type of full_search to bool
  btrfs: do not BUG_ON with invalid profile
  btrfs: introduce chunk allocation policy
  btrfs: refactor find_free_dev_extent_start()
  btrfs: introduce alloc_chunk_ctl
  btrfs: factor out init_alloc_chunk_ctl
  btrfs: factor out gather_device_info()
  btrfs: factor out decide_stripe_size()
  btrfs: factor out create_chunk()
  btrfs: parameterize dev_extent_min
  btrfs: introduce extent allocation policy
  btrfs: move hint_byte into find_free_extent_ctl
  btrfs: move vairalbes for clustered allocation into
    find_free_extent_ctl
  btrfs: factor out do_allocation()
  btrfs: drop unnecessary arguments from clustered allocation functions
  btrfs: factor out release_block_group()
  btrfs: factor out found_extent()
  btrfs: drop unnecessary arguments from find_free_extent_update_loop()
  btrfs: factor out chunk_allocation_failed()
  btrfs: skip LOOP_NO_EMPTY_SIZE if not clustered allocation
  btrfs: factor out prepare_allocation()

 fs/btrfs/extent-tree.c | 313 +++++++++++++++++++++-----------
 fs/btrfs/volumes.c     | 398 +++++++++++++++++++++++++++--------------
 fs/btrfs/volumes.h     |   6 +
 3 files changed, 481 insertions(+), 236 deletions(-)

-- 
2.25.0


^ permalink raw reply	[flat|nested] 57+ messages in thread

* [PATCH v2 01/21] btrfs: change type of full_search to bool
  2020-02-12  7:20 [PATCH v2 00/21] btrfs: refactor and generalize chunk/dev_extent/extent allocation Naohiro Aota
@ 2020-02-12  7:20 ` Naohiro Aota
  2020-02-12  7:20 ` [PATCH v2 02/21] btrfs: do not BUG_ON with invalid profile Naohiro Aota
                   ` (19 subsequent siblings)
  20 siblings, 0 replies; 57+ messages in thread
From: Naohiro Aota @ 2020-02-12  7:20 UTC (permalink / raw)
  To: linux-btrfs, David Sterba
  Cc: Chris Mason, Josef Bacik, Nikolay Borisov, Damien Le Moal,
	Johannes Thumshirn, Hannes Reinecke, Anand Jain, linux-fsdevel,
	Naohiro Aota, Johannes Thumshirn

While the "full_search" variable defined in find_free_extent() is bool, but
the full_search argument of find_free_extent_update_loop() is defined as
int. Let's trivially fix the argument type.

Signed-off-by: Naohiro Aota <naohiro.aota@wdc.com>
Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>
Reviewed-by: Josef Bacik <josef@toxicpanda.com>
---
 fs/btrfs/extent-tree.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/fs/btrfs/extent-tree.c b/fs/btrfs/extent-tree.c
index 0163fdd59f8f..227d4d628b90 100644
--- a/fs/btrfs/extent-tree.c
+++ b/fs/btrfs/extent-tree.c
@@ -3672,7 +3672,7 @@ static int find_free_extent_update_loop(struct btrfs_fs_info *fs_info,
 					struct btrfs_free_cluster *last_ptr,
 					struct btrfs_key *ins,
 					struct find_free_extent_ctl *ffe_ctl,
-					int full_search, bool use_cluster)
+					bool full_search, bool use_cluster)
 {
 	struct btrfs_root *root = fs_info->extent_root;
 	int ret;
-- 
2.25.0


^ permalink raw reply	[flat|nested] 57+ messages in thread

* [PATCH v2 02/21] btrfs: do not BUG_ON with invalid profile
  2020-02-12  7:20 [PATCH v2 00/21] btrfs: refactor and generalize chunk/dev_extent/extent allocation Naohiro Aota
  2020-02-12  7:20 ` [PATCH v2 01/21] btrfs: change type of full_search to bool Naohiro Aota
@ 2020-02-12  7:20 ` Naohiro Aota
  2020-02-12  7:58   ` Johannes Thumshirn
  2020-02-12 14:14   ` Josef Bacik
  2020-02-12  7:20 ` [PATCH v2 03/21] btrfs: introduce chunk allocation policy Naohiro Aota
                   ` (18 subsequent siblings)
  20 siblings, 2 replies; 57+ messages in thread
From: Naohiro Aota @ 2020-02-12  7:20 UTC (permalink / raw)
  To: linux-btrfs, David Sterba
  Cc: Chris Mason, Josef Bacik, Nikolay Borisov, Damien Le Moal,
	Johannes Thumshirn, Hannes Reinecke, Anand Jain, linux-fsdevel,
	Naohiro Aota

Do not BUG_ON() when an invalid profile is passed to __btrfs_alloc_chunk().
Instead return -EINVAL with ASSERT() to catch a bug in the development
stage.

Suggested-by: Johannes Thumshirn <Johannes.Thumshirn@wdc.com>
Signed-off-by: Naohiro Aota <naohiro.aota@wdc.com>
---
 fs/btrfs/volumes.c | 5 ++++-
 1 file changed, 4 insertions(+), 1 deletion(-)

diff --git a/fs/btrfs/volumes.c b/fs/btrfs/volumes.c
index 9cfc668f91f4..911c6b7c650b 100644
--- a/fs/btrfs/volumes.c
+++ b/fs/btrfs/volumes.c
@@ -4809,7 +4809,10 @@ static int __btrfs_alloc_chunk(struct btrfs_trans_handle *trans,
 	int j;
 	int index;
 
-	BUG_ON(!alloc_profile_is_valid(type, 0));
+	if (!alloc_profile_is_valid(type, 0)) {
+		ASSERT(0);
+		return -EINVAL;
+	}
 
 	if (list_empty(&fs_devices->alloc_list)) {
 		if (btrfs_test_opt(info, ENOSPC_DEBUG))
-- 
2.25.0


^ permalink raw reply	[flat|nested] 57+ messages in thread

* [PATCH v2 03/21] btrfs: introduce chunk allocation policy
  2020-02-12  7:20 [PATCH v2 00/21] btrfs: refactor and generalize chunk/dev_extent/extent allocation Naohiro Aota
  2020-02-12  7:20 ` [PATCH v2 01/21] btrfs: change type of full_search to bool Naohiro Aota
  2020-02-12  7:20 ` [PATCH v2 02/21] btrfs: do not BUG_ON with invalid profile Naohiro Aota
@ 2020-02-12  7:20 ` Naohiro Aota
  2020-02-12  7:20 ` [PATCH v2 04/21] btrfs: refactor find_free_dev_extent_start() Naohiro Aota
                   ` (17 subsequent siblings)
  20 siblings, 0 replies; 57+ messages in thread
From: Naohiro Aota @ 2020-02-12  7:20 UTC (permalink / raw)
  To: linux-btrfs, David Sterba
  Cc: Chris Mason, Josef Bacik, Nikolay Borisov, Damien Le Moal,
	Johannes Thumshirn, Hannes Reinecke, Anand Jain, linux-fsdevel,
	Naohiro Aota, Johannes Thumshirn

This commit introduces chunk allocation policy for btrfs. This policy
controls how btrfs allocate a chunk and device extents from devices.

There is no functional change introduced with this commit.

Signed-off-by: Naohiro Aota <naohiro.aota@wdc.com>
Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>
Reviewed-by: Josef Bacik <josef@toxicpanda.com>
---
 fs/btrfs/volumes.c | 1 +
 fs/btrfs/volumes.h | 6 ++++++
 2 files changed, 7 insertions(+)

diff --git a/fs/btrfs/volumes.c b/fs/btrfs/volumes.c
index 911c6b7c650b..9fcbed9b57f9 100644
--- a/fs/btrfs/volumes.c
+++ b/fs/btrfs/volumes.c
@@ -1209,6 +1209,7 @@ static int open_fs_devices(struct btrfs_fs_devices *fs_devices,
 	fs_devices->opened = 1;
 	fs_devices->latest_bdev = latest_dev->bdev;
 	fs_devices->total_rw_bytes = 0;
+	fs_devices->chunk_alloc_policy = BTRFS_CHUNK_ALLOC_REGULAR;
 out:
 	return ret;
 }
diff --git a/fs/btrfs/volumes.h b/fs/btrfs/volumes.h
index 690d4f5a0653..f07e1c4240b9 100644
--- a/fs/btrfs/volumes.h
+++ b/fs/btrfs/volumes.h
@@ -209,6 +209,10 @@ BTRFS_DEVICE_GETSET_FUNCS(total_bytes);
 BTRFS_DEVICE_GETSET_FUNCS(disk_total_bytes);
 BTRFS_DEVICE_GETSET_FUNCS(bytes_used);
 
+enum btrfs_chunk_allocation_policy {
+	BTRFS_CHUNK_ALLOC_REGULAR,
+};
+
 struct btrfs_fs_devices {
 	u8 fsid[BTRFS_FSID_SIZE]; /* FS specific uuid */
 	u8 metadata_uuid[BTRFS_FSID_SIZE];
@@ -259,6 +263,8 @@ struct btrfs_fs_devices {
 	struct kobject fsid_kobj;
 	struct kobject *devices_kobj;
 	struct completion kobj_unregister;
+
+	enum btrfs_chunk_allocation_policy chunk_alloc_policy;
 };
 
 #define BTRFS_BIO_INLINE_CSUM_SIZE	64
-- 
2.25.0


^ permalink raw reply	[flat|nested] 57+ messages in thread

* [PATCH v2 04/21] btrfs: refactor find_free_dev_extent_start()
  2020-02-12  7:20 [PATCH v2 00/21] btrfs: refactor and generalize chunk/dev_extent/extent allocation Naohiro Aota
                   ` (2 preceding siblings ...)
  2020-02-12  7:20 ` [PATCH v2 03/21] btrfs: introduce chunk allocation policy Naohiro Aota
@ 2020-02-12  7:20 ` Naohiro Aota
  2020-02-12 14:16   ` Josef Bacik
  2020-02-13 11:32   ` Johannes Thumshirn
  2020-02-12  7:20 ` [PATCH v2 05/21] btrfs: introduce alloc_chunk_ctl Naohiro Aota
                   ` (16 subsequent siblings)
  20 siblings, 2 replies; 57+ messages in thread
From: Naohiro Aota @ 2020-02-12  7:20 UTC (permalink / raw)
  To: linux-btrfs, David Sterba
  Cc: Chris Mason, Josef Bacik, Nikolay Borisov, Damien Le Moal,
	Johannes Thumshirn, Hannes Reinecke, Anand Jain, linux-fsdevel,
	Naohiro Aota

Factor out two functions from find_free_dev_extent_start().
dev_extent_search_start() decides the starting position of the search.
dev_extent_hole_check() checks if a hole found is suitable for device
extent allocation.

These functions also have the switch-cases to change the allocation
behavior depending on the policy.

Signed-off-by: Naohiro Aota <naohiro.aota@wdc.com>
---
 fs/btrfs/volumes.c | 80 ++++++++++++++++++++++++++++++++++------------
 1 file changed, 60 insertions(+), 20 deletions(-)

diff --git a/fs/btrfs/volumes.c b/fs/btrfs/volumes.c
index 9fcbed9b57f9..5f4e875a42f6 100644
--- a/fs/btrfs/volumes.c
+++ b/fs/btrfs/volumes.c
@@ -1384,6 +1384,61 @@ static bool contains_pending_extent(struct btrfs_device *device, u64 *start,
 	return false;
 }
 
+static u64 dev_extent_search_start(struct btrfs_device *device, u64 start)
+{
+	switch (device->fs_devices->chunk_alloc_policy) {
+	case BTRFS_CHUNK_ALLOC_REGULAR:
+		/*
+		 * We don't want to overwrite the superblock on the
+		 * drive nor any area used by the boot loader (grub
+		 * for example), so we make sure to start at an offset
+		 * of at least 1MB.
+		 */
+		return max_t(u64, start, SZ_1M);
+	default:
+		BUG();
+	}
+}
+
+/**
+ * dev_extent_hole_check - check if specified hole is suitable for allocation
+ * @device:	the device which we have the hole
+ * @hole_start: starting position of the hole
+ * @hole_size:	the size of the hole
+ * @num_bytes:	the size of the free space that we need
+ *
+ * This function may modify @hole_start and @hole_end to reflect the
+ * suitable position for allocation. Returns 1 if hole position is
+ * updated, 0 otherwise.
+ */
+static bool dev_extent_hole_check(struct btrfs_device *device, u64 *hole_start,
+				  u64 *hole_size, u64 num_bytes)
+{
+	bool changed = false;
+	u64 hole_end = *hole_start + *hole_size;
+
+	/*
+	 * Have to check before we set max_hole_start, otherwise we
+	 * could end up sending back this offset anyway.
+	 */
+	if (contains_pending_extent(device, hole_start, *hole_size)) {
+		if (hole_end >= *hole_start)
+			*hole_size = hole_end - *hole_start;
+		else
+			*hole_size = 0;
+		changed = true;
+	}
+
+	switch (device->fs_devices->chunk_alloc_policy) {
+	case BTRFS_CHUNK_ALLOC_REGULAR:
+		/* No extra check */
+		break;
+	default:
+		BUG();
+	}
+
+	return changed;
+}
 
 /*
  * find_free_dev_extent_start - find free space in the specified device
@@ -1430,12 +1485,7 @@ static int find_free_dev_extent_start(struct btrfs_device *device,
 	int slot;
 	struct extent_buffer *l;
 
-	/*
-	 * We don't want to overwrite the superblock on the drive nor any area
-	 * used by the boot loader (grub for example), so we make sure to start
-	 * at an offset of at least 1MB.
-	 */
-	search_start = max_t(u64, search_start, SZ_1M);
+	search_start = dev_extent_search_start(device, search_start);
 
 	path = btrfs_alloc_path();
 	if (!path)
@@ -1493,18 +1543,8 @@ static int find_free_dev_extent_start(struct btrfs_device *device,
 
 		if (key.offset > search_start) {
 			hole_size = key.offset - search_start;
-
-			/*
-			 * Have to check before we set max_hole_start, otherwise
-			 * we could end up sending back this offset anyway.
-			 */
-			if (contains_pending_extent(device, &search_start,
-						    hole_size)) {
-				if (key.offset >= search_start)
-					hole_size = key.offset - search_start;
-				else
-					hole_size = 0;
-			}
+			dev_extent_hole_check(device, &search_start, &hole_size,
+					      num_bytes);
 
 			if (hole_size > max_hole_size) {
 				max_hole_start = search_start;
@@ -1543,8 +1583,8 @@ static int find_free_dev_extent_start(struct btrfs_device *device,
 	 */
 	if (search_end > search_start) {
 		hole_size = search_end - search_start;
-
-		if (contains_pending_extent(device, &search_start, hole_size)) {
+		if (dev_extent_hole_check(device, &search_start, &hole_size,
+					  num_bytes)) {
 			btrfs_release_path(path);
 			goto again;
 		}
-- 
2.25.0


^ permalink raw reply	[flat|nested] 57+ messages in thread

* [PATCH v2 05/21] btrfs: introduce alloc_chunk_ctl
  2020-02-12  7:20 [PATCH v2 00/21] btrfs: refactor and generalize chunk/dev_extent/extent allocation Naohiro Aota
                   ` (3 preceding siblings ...)
  2020-02-12  7:20 ` [PATCH v2 04/21] btrfs: refactor find_free_dev_extent_start() Naohiro Aota
@ 2020-02-12  7:20 ` Naohiro Aota
  2020-02-12  8:00   ` Johannes Thumshirn
  2020-02-13 16:17   ` Josef Bacik
  2020-02-12  7:20 ` [PATCH v2 06/21] btrfs: factor out init_alloc_chunk_ctl Naohiro Aota
                   ` (15 subsequent siblings)
  20 siblings, 2 replies; 57+ messages in thread
From: Naohiro Aota @ 2020-02-12  7:20 UTC (permalink / raw)
  To: linux-btrfs, David Sterba
  Cc: Chris Mason, Josef Bacik, Nikolay Borisov, Damien Le Moal,
	Johannes Thumshirn, Hannes Reinecke, Anand Jain, linux-fsdevel,
	Naohiro Aota

Introduce "struct alloc_chunk_ctl" to wrap needed parameters for the chunk
allocation.  This will be used to split __btrfs_alloc_chunk() into smaller
functions.

This commit folds a number of local variables in __btrfs_alloc_chunk() into
one "struct alloc_chunk_ctl ctl". There is no functional change.

Signed-off-by: Naohiro Aota <naohiro.aota@wdc.com>
---
 fs/btrfs/volumes.c | 151 ++++++++++++++++++++++++++-------------------
 1 file changed, 86 insertions(+), 65 deletions(-)

diff --git a/fs/btrfs/volumes.c b/fs/btrfs/volumes.c
index 5f4e875a42f6..ed90e9d2bd9b 100644
--- a/fs/btrfs/volumes.c
+++ b/fs/btrfs/volumes.c
@@ -4818,6 +4818,29 @@ static void check_raid1c34_incompat_flag(struct btrfs_fs_info *info, u64 type)
 	btrfs_set_fs_incompat(info, RAID1C34);
 }
 
+/*
+ * Structure used internally for __btrfs_alloc_chunk() function.
+ * Wraps needed parameters.
+ */
+struct alloc_chunk_ctl {
+	u64 start;
+	u64 type;
+	int num_stripes;	/* total number of stripes to allocate */
+	int sub_stripes;	/* sub_stripes info for map */
+	int dev_stripes;	/* stripes per dev */
+	int devs_max;		/* max devs to use */
+	int devs_min;		/* min devs needed */
+	int devs_increment;	/* ndevs has to be a multiple of this */
+	int ncopies;		/* how many copies to data has */
+	int nparity;		/* number of stripes worth of bytes to
+				   store parity information */
+	u64 max_stripe_size;
+	u64 max_chunk_size;
+	u64 stripe_size;
+	u64 chunk_size;
+	int ndevs;
+};
+
 static int __btrfs_alloc_chunk(struct btrfs_trans_handle *trans,
 			       u64 start, u64 type)
 {
@@ -4828,23 +4851,11 @@ static int __btrfs_alloc_chunk(struct btrfs_trans_handle *trans,
 	struct extent_map_tree *em_tree;
 	struct extent_map *em;
 	struct btrfs_device_info *devices_info = NULL;
+	struct alloc_chunk_ctl ctl;
 	u64 total_avail;
-	int num_stripes;	/* total number of stripes to allocate */
 	int data_stripes;	/* number of stripes that count for
 				   block group size */
-	int sub_stripes;	/* sub_stripes info for map */
-	int dev_stripes;	/* stripes per dev */
-	int devs_max;		/* max devs to use */
-	int devs_min;		/* min devs needed */
-	int devs_increment;	/* ndevs has to be a multiple of this */
-	int ncopies;		/* how many copies to data has */
-	int nparity;		/* number of stripes worth of bytes to
-				   store parity information */
 	int ret;
-	u64 max_stripe_size;
-	u64 max_chunk_size;
-	u64 stripe_size;
-	u64 chunk_size;
 	int ndevs;
 	int i;
 	int j;
@@ -4861,32 +4872,36 @@ static int __btrfs_alloc_chunk(struct btrfs_trans_handle *trans,
 		return -ENOSPC;
 	}
 
+	ctl.start = start;
+	ctl.type = type;
+
 	index = btrfs_bg_flags_to_raid_index(type);
 
-	sub_stripes = btrfs_raid_array[index].sub_stripes;
-	dev_stripes = btrfs_raid_array[index].dev_stripes;
-	devs_max = btrfs_raid_array[index].devs_max;
-	if (!devs_max)
-		devs_max = BTRFS_MAX_DEVS(info);
-	devs_min = btrfs_raid_array[index].devs_min;
-	devs_increment = btrfs_raid_array[index].devs_increment;
-	ncopies = btrfs_raid_array[index].ncopies;
-	nparity = btrfs_raid_array[index].nparity;
+	ctl.sub_stripes = btrfs_raid_array[index].sub_stripes;
+	ctl.dev_stripes = btrfs_raid_array[index].dev_stripes;
+	ctl.devs_max = btrfs_raid_array[index].devs_max;
+	if (!ctl.devs_max)
+		ctl.devs_max = BTRFS_MAX_DEVS(info);
+	ctl.devs_min = btrfs_raid_array[index].devs_min;
+	ctl.devs_increment = btrfs_raid_array[index].devs_increment;
+	ctl.ncopies = btrfs_raid_array[index].ncopies;
+	ctl.nparity = btrfs_raid_array[index].nparity;
 
 	if (type & BTRFS_BLOCK_GROUP_DATA) {
-		max_stripe_size = SZ_1G;
-		max_chunk_size = BTRFS_MAX_DATA_CHUNK_SIZE;
+		ctl.max_stripe_size = SZ_1G;
+		ctl.max_chunk_size = BTRFS_MAX_DATA_CHUNK_SIZE;
 	} else if (type & BTRFS_BLOCK_GROUP_METADATA) {
 		/* for larger filesystems, use larger metadata chunks */
 		if (fs_devices->total_rw_bytes > 50ULL * SZ_1G)
-			max_stripe_size = SZ_1G;
+			ctl.max_stripe_size = SZ_1G;
 		else
-			max_stripe_size = SZ_256M;
-		max_chunk_size = max_stripe_size;
+			ctl.max_stripe_size = SZ_256M;
+		ctl.max_chunk_size = ctl.max_stripe_size;
 	} else if (type & BTRFS_BLOCK_GROUP_SYSTEM) {
-		max_stripe_size = SZ_32M;
-		max_chunk_size = 2 * max_stripe_size;
-		devs_max = min_t(int, devs_max, BTRFS_MAX_DEVS_SYS_CHUNK);
+		ctl.max_stripe_size = SZ_32M;
+		ctl.max_chunk_size = 2 * ctl.max_stripe_size;
+		ctl.devs_max = min_t(int, ctl.devs_max,
+				      BTRFS_MAX_DEVS_SYS_CHUNK);
 	} else {
 		btrfs_err(info, "invalid chunk type 0x%llx requested",
 		       type);
@@ -4894,8 +4909,8 @@ static int __btrfs_alloc_chunk(struct btrfs_trans_handle *trans,
 	}
 
 	/* We don't want a chunk larger than 10% of writable space */
-	max_chunk_size = min(div_factor(fs_devices->total_rw_bytes, 1),
-			     max_chunk_size);
+	ctl.max_chunk_size = min(div_factor(fs_devices->total_rw_bytes, 1),
+				  ctl.max_chunk_size);
 
 	devices_info = kcalloc(fs_devices->rw_devices, sizeof(*devices_info),
 			       GFP_NOFS);
@@ -4931,21 +4946,21 @@ static int __btrfs_alloc_chunk(struct btrfs_trans_handle *trans,
 		if (total_avail == 0)
 			continue;
 
-		ret = find_free_dev_extent(device,
-					   max_stripe_size * dev_stripes,
-					   &dev_offset, &max_avail);
+		ret = find_free_dev_extent(
+			device, ctl.max_stripe_size * ctl.dev_stripes,
+			&dev_offset, &max_avail);
 		if (ret && ret != -ENOSPC)
 			goto error;
 
 		if (ret == 0)
-			max_avail = max_stripe_size * dev_stripes;
+			max_avail = ctl.max_stripe_size * ctl.dev_stripes;
 
-		if (max_avail < BTRFS_STRIPE_LEN * dev_stripes) {
+		if (max_avail < BTRFS_STRIPE_LEN * ctl.dev_stripes) {
 			if (btrfs_test_opt(info, ENOSPC_DEBUG))
 				btrfs_debug(info,
 			"%s: devid %llu has no free space, have=%llu want=%u",
 					    __func__, device->devid, max_avail,
-					    BTRFS_STRIPE_LEN * dev_stripes);
+					    BTRFS_STRIPE_LEN * ctl.dev_stripes);
 			continue;
 		}
 
@@ -4960,30 +4975,31 @@ static int __btrfs_alloc_chunk(struct btrfs_trans_handle *trans,
 		devices_info[ndevs].dev = device;
 		++ndevs;
 	}
+	ctl.ndevs = ndevs;
 
 	/*
 	 * now sort the devices by hole size / available space
 	 */
-	sort(devices_info, ndevs, sizeof(struct btrfs_device_info),
+	sort(devices_info, ctl.ndevs, sizeof(struct btrfs_device_info),
 	     btrfs_cmp_device_info, NULL);
 
 	/*
 	 * Round down to number of usable stripes, devs_increment can be any
 	 * number so we can't use round_down()
 	 */
-	ndevs -= ndevs % devs_increment;
+	ctl.ndevs -= ctl.ndevs % ctl.devs_increment;
 
-	if (ndevs < devs_min) {
+	if (ctl.ndevs < ctl.devs_min) {
 		ret = -ENOSPC;
 		if (btrfs_test_opt(info, ENOSPC_DEBUG)) {
 			btrfs_debug(info,
 	"%s: not enough devices with free space: have=%d minimum required=%d",
-				    __func__, ndevs, devs_min);
+				    __func__, ctl.ndevs, ctl.devs_min);
 		}
 		goto error;
 	}
 
-	ndevs = min(ndevs, devs_max);
+	ctl.ndevs = min(ctl.ndevs, ctl.devs_max);
 
 	/*
 	 * The primary goal is to maximize the number of stripes, so use as
@@ -4992,14 +5008,15 @@ static int __btrfs_alloc_chunk(struct btrfs_trans_handle *trans,
 	 * The DUP profile stores more than one stripe per device, the
 	 * max_avail is the total size so we have to adjust.
 	 */
-	stripe_size = div_u64(devices_info[ndevs - 1].max_avail, dev_stripes);
-	num_stripes = ndevs * dev_stripes;
+	ctl.stripe_size = div_u64(devices_info[ctl.ndevs - 1].max_avail,
+				   ctl.dev_stripes);
+	ctl.num_stripes = ctl.ndevs * ctl.dev_stripes;
 
 	/*
 	 * this will have to be fixed for RAID1 and RAID10 over
 	 * more drives
 	 */
-	data_stripes = (num_stripes - nparity) / ncopies;
+	data_stripes = (ctl.num_stripes - ctl.nparity) / ctl.ncopies;
 
 	/*
 	 * Use the number of data stripes to figure out how big this chunk
@@ -5007,44 +5024,46 @@ static int __btrfs_alloc_chunk(struct btrfs_trans_handle *trans,
 	 * and compare that answer with the max chunk size. If it's higher,
 	 * we try to reduce stripe_size.
 	 */
-	if (stripe_size * data_stripes > max_chunk_size) {
+	if (ctl.stripe_size * data_stripes > ctl.max_chunk_size) {
 		/*
 		 * Reduce stripe_size, round it up to a 16MB boundary again and
 		 * then use it, unless it ends up being even bigger than the
 		 * previous value we had already.
 		 */
-		stripe_size = min(round_up(div_u64(max_chunk_size,
-						   data_stripes), SZ_16M),
-				  stripe_size);
+		ctl.stripe_size =
+			min(round_up(div_u64(ctl.max_chunk_size, data_stripes),
+				     SZ_16M),
+			    ctl.stripe_size);
 	}
 
 	/* align to BTRFS_STRIPE_LEN */
-	stripe_size = round_down(stripe_size, BTRFS_STRIPE_LEN);
+	ctl.stripe_size = round_down(ctl.stripe_size, BTRFS_STRIPE_LEN);
 
-	map = kmalloc(map_lookup_size(num_stripes), GFP_NOFS);
+	map = kmalloc(map_lookup_size(ctl.num_stripes), GFP_NOFS);
 	if (!map) {
 		ret = -ENOMEM;
 		goto error;
 	}
-	map->num_stripes = num_stripes;
 
-	for (i = 0; i < ndevs; ++i) {
-		for (j = 0; j < dev_stripes; ++j) {
-			int s = i * dev_stripes + j;
+	map->num_stripes = ctl.num_stripes;
+
+	for (i = 0; i < ctl.ndevs; ++i) {
+		for (j = 0; j < ctl.dev_stripes; ++j) {
+			int s = i * ctl.dev_stripes + j;
 			map->stripes[s].dev = devices_info[i].dev;
 			map->stripes[s].physical = devices_info[i].dev_offset +
-						   j * stripe_size;
+						   j * ctl.stripe_size;
 		}
 	}
 	map->stripe_len = BTRFS_STRIPE_LEN;
 	map->io_align = BTRFS_STRIPE_LEN;
 	map->io_width = BTRFS_STRIPE_LEN;
 	map->type = type;
-	map->sub_stripes = sub_stripes;
+	map->sub_stripes = ctl.sub_stripes;
 
-	chunk_size = stripe_size * data_stripes;
+	ctl.chunk_size = ctl.stripe_size * data_stripes;
 
-	trace_btrfs_chunk_alloc(info, map, start, chunk_size);
+	trace_btrfs_chunk_alloc(info, map, start, ctl.chunk_size);
 
 	em = alloc_extent_map();
 	if (!em) {
@@ -5055,10 +5074,10 @@ static int __btrfs_alloc_chunk(struct btrfs_trans_handle *trans,
 	set_bit(EXTENT_FLAG_FS_MAPPING, &em->flags);
 	em->map_lookup = map;
 	em->start = start;
-	em->len = chunk_size;
+	em->len = ctl.chunk_size;
 	em->block_start = 0;
 	em->block_len = em->len;
-	em->orig_block_len = stripe_size;
+	em->orig_block_len = ctl.stripe_size;
 
 	em_tree = &info->mapping_tree;
 	write_lock(&em_tree->lock);
@@ -5070,20 +5089,22 @@ static int __btrfs_alloc_chunk(struct btrfs_trans_handle *trans,
 	}
 	write_unlock(&em_tree->lock);
 
-	ret = btrfs_make_block_group(trans, 0, type, start, chunk_size);
+	ret = btrfs_make_block_group(trans, 0, type, start, ctl.chunk_size);
 	if (ret)
 		goto error_del_extent;
 
 	for (i = 0; i < map->num_stripes; i++) {
 		struct btrfs_device *dev = map->stripes[i].dev;
 
-		btrfs_device_set_bytes_used(dev, dev->bytes_used + stripe_size);
+		btrfs_device_set_bytes_used(dev,
+					    dev->bytes_used + ctl.stripe_size);
 		if (list_empty(&dev->post_commit_list))
 			list_add_tail(&dev->post_commit_list,
 				      &trans->transaction->dev_update_list);
 	}
 
-	atomic64_sub(stripe_size * map->num_stripes, &info->free_chunk_space);
+	atomic64_sub(ctl.stripe_size * map->num_stripes,
+		     &info->free_chunk_space);
 
 	free_extent_map(em);
 	check_raid56_incompat_flag(info, type);
-- 
2.25.0


^ permalink raw reply	[flat|nested] 57+ messages in thread

* [PATCH v2 06/21] btrfs: factor out init_alloc_chunk_ctl
  2020-02-12  7:20 [PATCH v2 00/21] btrfs: refactor and generalize chunk/dev_extent/extent allocation Naohiro Aota
                   ` (4 preceding siblings ...)
  2020-02-12  7:20 ` [PATCH v2 05/21] btrfs: introduce alloc_chunk_ctl Naohiro Aota
@ 2020-02-12  7:20 ` Naohiro Aota
  2020-02-13 16:20   ` Josef Bacik
  2020-02-12  7:20 ` [PATCH v2 07/21] btrfs: factor out gather_device_info() Naohiro Aota
                   ` (14 subsequent siblings)
  20 siblings, 1 reply; 57+ messages in thread
From: Naohiro Aota @ 2020-02-12  7:20 UTC (permalink / raw)
  To: linux-btrfs, David Sterba
  Cc: Chris Mason, Josef Bacik, Nikolay Borisov, Damien Le Moal,
	Johannes Thumshirn, Hannes Reinecke, Anand Jain, linux-fsdevel,
	Naohiro Aota

Factor out init_alloc_chunk_ctl() from __btrfs_alloc_chunk(). This function
initialises parameters of "struct alloc_chunk_ctl" for allocation.
init_alloc_chunk_ctl() handles a common part of the initialisation to load
the RAID parameters from btrfs_raid_array.
init_alloc_chunk_ctl_policy_regular() decides some parameters for its
allocation.

Signed-off-by: Naohiro Aota <naohiro.aota@wdc.com>
---
 fs/btrfs/volumes.c | 97 ++++++++++++++++++++++++++++------------------
 1 file changed, 60 insertions(+), 37 deletions(-)

diff --git a/fs/btrfs/volumes.c b/fs/btrfs/volumes.c
index ed90e9d2bd9b..9181e3ab617d 100644
--- a/fs/btrfs/volumes.c
+++ b/fs/btrfs/volumes.c
@@ -4841,6 +4841,61 @@ struct alloc_chunk_ctl {
 	int ndevs;
 };
 
+static void
+init_alloc_chunk_ctl_policy_regular(struct btrfs_fs_devices *fs_devices,
+				    struct alloc_chunk_ctl *ctl)
+{
+	u64 type = ctl->type;
+
+	if (type & BTRFS_BLOCK_GROUP_DATA) {
+		ctl->max_stripe_size = SZ_1G;
+		ctl->max_chunk_size = BTRFS_MAX_DATA_CHUNK_SIZE;
+	} else if (type & BTRFS_BLOCK_GROUP_METADATA) {
+		/* for larger filesystems, use larger metadata chunks */
+		if (fs_devices->total_rw_bytes > 50ULL * SZ_1G)
+			ctl->max_stripe_size = SZ_1G;
+		else
+			ctl->max_stripe_size = SZ_256M;
+		ctl->max_chunk_size = ctl->max_stripe_size;
+	} else if (type & BTRFS_BLOCK_GROUP_SYSTEM) {
+		ctl->max_stripe_size = SZ_32M;
+		ctl->max_chunk_size = 2 * ctl->max_stripe_size;
+		ctl->devs_max = min_t(int, ctl->devs_max,
+				      BTRFS_MAX_DEVS_SYS_CHUNK);
+	} else {
+		BUG();
+	}
+
+	/* We don't want a chunk larger than 10% of writable space */
+	ctl->max_chunk_size = min(div_factor(fs_devices->total_rw_bytes, 1),
+				  ctl->max_chunk_size);
+}
+
+static void init_alloc_chunk_ctl(struct btrfs_fs_devices *fs_devices,
+				 struct alloc_chunk_ctl *ctl)
+{
+	int index = btrfs_bg_flags_to_raid_index(ctl->type);
+
+	ctl->sub_stripes = btrfs_raid_array[index].sub_stripes;
+	ctl->dev_stripes = btrfs_raid_array[index].dev_stripes;
+	ctl->devs_max = btrfs_raid_array[index].devs_max;
+	if (!ctl->devs_max)
+		ctl->devs_max = BTRFS_MAX_DEVS(fs_devices->fs_info);
+	ctl->devs_min = btrfs_raid_array[index].devs_min;
+	ctl->devs_increment = btrfs_raid_array[index].devs_increment;
+	ctl->ncopies = btrfs_raid_array[index].ncopies;
+	ctl->nparity = btrfs_raid_array[index].nparity;
+	ctl->ndevs = 0;
+
+	switch (fs_devices->chunk_alloc_policy) {
+	case BTRFS_CHUNK_ALLOC_REGULAR:
+		init_alloc_chunk_ctl_policy_regular(fs_devices, ctl);
+		break;
+	default:
+		BUG();
+	}
+}
+
 static int __btrfs_alloc_chunk(struct btrfs_trans_handle *trans,
 			       u64 start, u64 type)
 {
@@ -4859,7 +4914,6 @@ static int __btrfs_alloc_chunk(struct btrfs_trans_handle *trans,
 	int ndevs;
 	int i;
 	int j;
-	int index;
 
 	if (!alloc_profile_is_valid(type, 0)) {
 		ASSERT(0);
@@ -4872,45 +4926,14 @@ static int __btrfs_alloc_chunk(struct btrfs_trans_handle *trans,
 		return -ENOSPC;
 	}
 
-	ctl.start = start;
-	ctl.type = type;
-
-	index = btrfs_bg_flags_to_raid_index(type);
-
-	ctl.sub_stripes = btrfs_raid_array[index].sub_stripes;
-	ctl.dev_stripes = btrfs_raid_array[index].dev_stripes;
-	ctl.devs_max = btrfs_raid_array[index].devs_max;
-	if (!ctl.devs_max)
-		ctl.devs_max = BTRFS_MAX_DEVS(info);
-	ctl.devs_min = btrfs_raid_array[index].devs_min;
-	ctl.devs_increment = btrfs_raid_array[index].devs_increment;
-	ctl.ncopies = btrfs_raid_array[index].ncopies;
-	ctl.nparity = btrfs_raid_array[index].nparity;
-
-	if (type & BTRFS_BLOCK_GROUP_DATA) {
-		ctl.max_stripe_size = SZ_1G;
-		ctl.max_chunk_size = BTRFS_MAX_DATA_CHUNK_SIZE;
-	} else if (type & BTRFS_BLOCK_GROUP_METADATA) {
-		/* for larger filesystems, use larger metadata chunks */
-		if (fs_devices->total_rw_bytes > 50ULL * SZ_1G)
-			ctl.max_stripe_size = SZ_1G;
-		else
-			ctl.max_stripe_size = SZ_256M;
-		ctl.max_chunk_size = ctl.max_stripe_size;
-	} else if (type & BTRFS_BLOCK_GROUP_SYSTEM) {
-		ctl.max_stripe_size = SZ_32M;
-		ctl.max_chunk_size = 2 * ctl.max_stripe_size;
-		ctl.devs_max = min_t(int, ctl.devs_max,
-				      BTRFS_MAX_DEVS_SYS_CHUNK);
-	} else {
-		btrfs_err(info, "invalid chunk type 0x%llx requested",
-		       type);
+	if (!(type & BTRFS_BLOCK_GROUP_TYPE_MASK)) {
+		btrfs_err(info, "invalid chunk type 0x%llx requested", type);
 		BUG();
 	}
 
-	/* We don't want a chunk larger than 10% of writable space */
-	ctl.max_chunk_size = min(div_factor(fs_devices->total_rw_bytes, 1),
-				  ctl.max_chunk_size);
+	ctl.start = start;
+	ctl.type = type;
+	init_alloc_chunk_ctl(fs_devices, &ctl);
 
 	devices_info = kcalloc(fs_devices->rw_devices, sizeof(*devices_info),
 			       GFP_NOFS);
-- 
2.25.0


^ permalink raw reply	[flat|nested] 57+ messages in thread

* [PATCH v2 07/21] btrfs: factor out gather_device_info()
  2020-02-12  7:20 [PATCH v2 00/21] btrfs: refactor and generalize chunk/dev_extent/extent allocation Naohiro Aota
                   ` (5 preceding siblings ...)
  2020-02-12  7:20 ` [PATCH v2 06/21] btrfs: factor out init_alloc_chunk_ctl Naohiro Aota
@ 2020-02-12  7:20 ` Naohiro Aota
  2020-02-12  8:04   ` Johannes Thumshirn
  2020-02-12  7:20 ` [PATCH v2 08/21] btrfs: factor out decide_stripe_size() Naohiro Aota
                   ` (13 subsequent siblings)
  20 siblings, 1 reply; 57+ messages in thread
From: Naohiro Aota @ 2020-02-12  7:20 UTC (permalink / raw)
  To: linux-btrfs, David Sterba
  Cc: Chris Mason, Josef Bacik, Nikolay Borisov, Damien Le Moal,
	Johannes Thumshirn, Hannes Reinecke, Anand Jain, linux-fsdevel,
	Naohiro Aota

Factor out gather_device_info() from __btrfs_alloc_chunk(). This function
iterates over devices list and gather information about devices. This
commit also introduces "max_avail" and "dev_extent_min" to fold the same
calculation to one variable.

This commit has no functional changes.

Signed-off-by: Naohiro Aota <naohiro.aota@wdc.com>
Reviewed-by: Josef Bacik <josef@toxicpanda.com>
---
 fs/btrfs/volumes.c | 119 +++++++++++++++++++++++++--------------------
 1 file changed, 66 insertions(+), 53 deletions(-)

diff --git a/fs/btrfs/volumes.c b/fs/btrfs/volumes.c
index 9181e3ab617d..2bd3ed830a28 100644
--- a/fs/btrfs/volumes.c
+++ b/fs/btrfs/volumes.c
@@ -4896,59 +4896,25 @@ static void init_alloc_chunk_ctl(struct btrfs_fs_devices *fs_devices,
 	}
 }
 
-static int __btrfs_alloc_chunk(struct btrfs_trans_handle *trans,
-			       u64 start, u64 type)
+static int gather_device_info(struct btrfs_fs_devices *fs_devices,
+			      struct alloc_chunk_ctl *ctl,
+			      struct btrfs_device_info *devices_info)
 {
-	struct btrfs_fs_info *info = trans->fs_info;
-	struct btrfs_fs_devices *fs_devices = info->fs_devices;
+	struct btrfs_fs_info *info = fs_devices->fs_info;
 	struct btrfs_device *device;
-	struct map_lookup *map = NULL;
-	struct extent_map_tree *em_tree;
-	struct extent_map *em;
-	struct btrfs_device_info *devices_info = NULL;
-	struct alloc_chunk_ctl ctl;
 	u64 total_avail;
-	int data_stripes;	/* number of stripes that count for
-				   block group size */
+	u64 dev_extent_want = ctl->max_stripe_size * ctl->dev_stripes;
+	u64 dev_extent_min = BTRFS_STRIPE_LEN * ctl->dev_stripes;
 	int ret;
-	int ndevs;
-	int i;
-	int j;
-
-	if (!alloc_profile_is_valid(type, 0)) {
-		ASSERT(0);
-		return -EINVAL;
-	}
-
-	if (list_empty(&fs_devices->alloc_list)) {
-		if (btrfs_test_opt(info, ENOSPC_DEBUG))
-			btrfs_debug(info, "%s: no writable device", __func__);
-		return -ENOSPC;
-	}
-
-	if (!(type & BTRFS_BLOCK_GROUP_TYPE_MASK)) {
-		btrfs_err(info, "invalid chunk type 0x%llx requested", type);
-		BUG();
-	}
-
-	ctl.start = start;
-	ctl.type = type;
-	init_alloc_chunk_ctl(fs_devices, &ctl);
-
-	devices_info = kcalloc(fs_devices->rw_devices, sizeof(*devices_info),
-			       GFP_NOFS);
-	if (!devices_info)
-		return -ENOMEM;
+	int ndevs = 0;
+	u64 max_avail;
+	u64 dev_offset;
 
 	/*
 	 * in the first pass through the devices list, we gather information
 	 * about the available holes on each device.
 	 */
-	ndevs = 0;
 	list_for_each_entry(device, &fs_devices->alloc_list, dev_alloc_list) {
-		u64 max_avail;
-		u64 dev_offset;
-
 		if (!test_bit(BTRFS_DEV_STATE_WRITEABLE, &device->dev_state)) {
 			WARN(1, KERN_ERR
 			       "BTRFS: read-only device in alloc_list\n");
@@ -4969,21 +4935,20 @@ static int __btrfs_alloc_chunk(struct btrfs_trans_handle *trans,
 		if (total_avail == 0)
 			continue;
 
-		ret = find_free_dev_extent(
-			device, ctl.max_stripe_size * ctl.dev_stripes,
-			&dev_offset, &max_avail);
+		ret = find_free_dev_extent(device, dev_extent_want, &dev_offset,
+					   &max_avail);
 		if (ret && ret != -ENOSPC)
-			goto error;
+			return ret;
 
 		if (ret == 0)
-			max_avail = ctl.max_stripe_size * ctl.dev_stripes;
+			max_avail = dev_extent_want;
 
-		if (max_avail < BTRFS_STRIPE_LEN * ctl.dev_stripes) {
+		if (max_avail < dev_extent_min) {
 			if (btrfs_test_opt(info, ENOSPC_DEBUG))
 				btrfs_debug(info,
-			"%s: devid %llu has no free space, have=%llu want=%u",
+			"%s: devid %llu has no free space, have=%llu want=%llu",
 					    __func__, device->devid, max_avail,
-					    BTRFS_STRIPE_LEN * ctl.dev_stripes);
+					    dev_extent_min);
 			continue;
 		}
 
@@ -4998,14 +4963,62 @@ static int __btrfs_alloc_chunk(struct btrfs_trans_handle *trans,
 		devices_info[ndevs].dev = device;
 		++ndevs;
 	}
-	ctl.ndevs = ndevs;
+	ctl->ndevs = ndevs;
 
 	/*
 	 * now sort the devices by hole size / available space
 	 */
-	sort(devices_info, ctl.ndevs, sizeof(struct btrfs_device_info),
+	sort(devices_info, ndevs, sizeof(struct btrfs_device_info),
 	     btrfs_cmp_device_info, NULL);
 
+	return 0;
+}
+
+static int __btrfs_alloc_chunk(struct btrfs_trans_handle *trans,
+			       u64 start, u64 type)
+{
+	struct btrfs_fs_info *info = trans->fs_info;
+	struct btrfs_fs_devices *fs_devices = info->fs_devices;
+	struct map_lookup *map = NULL;
+	struct extent_map_tree *em_tree;
+	struct extent_map *em;
+	struct btrfs_device_info *devices_info = NULL;
+	struct alloc_chunk_ctl ctl;
+	int data_stripes;	/* number of stripes that count for
+				   block group size */
+	int ret;
+	int i;
+	int j;
+
+	if (!alloc_profile_is_valid(type, 0)) {
+		ASSERT(0);
+		return -EINVAL;
+	}
+
+	if (list_empty(&fs_devices->alloc_list)) {
+		if (btrfs_test_opt(info, ENOSPC_DEBUG))
+			btrfs_debug(info, "%s: no writable device", __func__);
+		return -ENOSPC;
+	}
+
+	if (!(type & BTRFS_BLOCK_GROUP_TYPE_MASK)) {
+		btrfs_err(info, "invalid chunk type 0x%llx requested", type);
+		BUG();
+	}
+
+	ctl.start = start;
+	ctl.type = type;
+	init_alloc_chunk_ctl(fs_devices, &ctl);
+
+	devices_info = kcalloc(fs_devices->rw_devices, sizeof(*devices_info),
+			       GFP_NOFS);
+	if (!devices_info)
+		return -ENOMEM;
+
+	ret = gather_device_info(fs_devices, &ctl, devices_info);
+	if (ret < 0)
+		goto error;
+
 	/*
 	 * Round down to number of usable stripes, devs_increment can be any
 	 * number so we can't use round_down()
-- 
2.25.0


^ permalink raw reply	[flat|nested] 57+ messages in thread

* [PATCH v2 08/21] btrfs: factor out decide_stripe_size()
  2020-02-12  7:20 [PATCH v2 00/21] btrfs: refactor and generalize chunk/dev_extent/extent allocation Naohiro Aota
                   ` (6 preceding siblings ...)
  2020-02-12  7:20 ` [PATCH v2 07/21] btrfs: factor out gather_device_info() Naohiro Aota
@ 2020-02-12  7:20 ` Naohiro Aota
  2020-02-12  7:20 ` [PATCH v2 09/21] btrfs: factor out create_chunk() Naohiro Aota
                   ` (12 subsequent siblings)
  20 siblings, 0 replies; 57+ messages in thread
From: Naohiro Aota @ 2020-02-12  7:20 UTC (permalink / raw)
  To: linux-btrfs, David Sterba
  Cc: Chris Mason, Josef Bacik, Nikolay Borisov, Damien Le Moal,
	Johannes Thumshirn, Hannes Reinecke, Anand Jain, linux-fsdevel,
	Naohiro Aota, Johannes Thumshirn

Factor out decide_stripe_size() from __btrfs_alloc_chunk(). This function
calculates the actual stripe size to allocate. decide_stripe_size() handles
the common case to round down the 'ndevs' to 'devs_increment' and check the
upper and lower limitation of 'ndevs'. decide_stripe_size_regular() decides
the size of a stripe and the size of a chunk. The policy is to maximize the
number of stripes.

This commit has no functional changes.

Signed-off-by: Naohiro Aota <naohiro.aota@wdc.com>
Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>
Reviewed-by: Josef Bacik <josef@toxicpanda.com>
---
 fs/btrfs/volumes.c | 138 ++++++++++++++++++++++++++-------------------
 1 file changed, 80 insertions(+), 58 deletions(-)

diff --git a/fs/btrfs/volumes.c b/fs/btrfs/volumes.c
index 2bd3ed830a28..00085943e4dd 100644
--- a/fs/btrfs/volumes.c
+++ b/fs/btrfs/volumes.c
@@ -4974,6 +4974,84 @@ static int gather_device_info(struct btrfs_fs_devices *fs_devices,
 	return 0;
 }
 
+static int decide_stripe_size_regular(struct alloc_chunk_ctl *ctl,
+				      struct btrfs_device_info *devices_info)
+{
+	int data_stripes;	/* number of stripes that count for
+				   block group size */
+
+	/*
+	 * The primary goal is to maximize the number of stripes, so use as
+	 * many devices as possible, even if the stripes are not maximum sized.
+	 *
+	 * The DUP profile stores more than one stripe per device, the
+	 * max_avail is the total size so we have to adjust.
+	 */
+	ctl->stripe_size = div_u64(devices_info[ctl->ndevs - 1].max_avail,
+				   ctl->dev_stripes);
+	ctl->num_stripes = ctl->ndevs * ctl->dev_stripes;
+
+	/*
+	 * this will have to be fixed for RAID1 and RAID10 over
+	 * more drives
+	 */
+	data_stripes = (ctl->num_stripes - ctl->nparity) / ctl->ncopies;
+
+	/*
+	 * Use the number of data stripes to figure out how big this chunk
+	 * is really going to be in terms of logical address space,
+	 * and compare that answer with the max chunk size. If it's higher,
+	 * we try to reduce stripe_size.
+	 */
+	if (ctl->stripe_size * data_stripes > ctl->max_chunk_size) {
+		/*
+		 * Reduce stripe_size, round it up to a 16MB boundary again and
+		 * then use it, unless it ends up being even bigger than the
+		 * previous value we had already.
+		 */
+		ctl->stripe_size = min(round_up(div_u64(ctl->max_chunk_size,
+							data_stripes), SZ_16M),
+				       ctl->stripe_size);
+	}
+
+	/* align to BTRFS_STRIPE_LEN */
+	ctl->stripe_size = round_down(ctl->stripe_size, BTRFS_STRIPE_LEN);
+	ctl->chunk_size = ctl->stripe_size * data_stripes;
+
+	return 0;
+}
+
+static int decide_stripe_size(struct btrfs_fs_devices *fs_devices,
+			      struct alloc_chunk_ctl *ctl,
+			      struct btrfs_device_info *devices_info)
+{
+	struct btrfs_fs_info *info = fs_devices->fs_info;
+
+	/*
+	 * Round down to number of usable stripes, devs_increment can be any
+	 * number so we can't use round_down()
+	 */
+	ctl->ndevs -= ctl->ndevs % ctl->devs_increment;
+
+	if (ctl->ndevs < ctl->devs_min) {
+		if (btrfs_test_opt(info, ENOSPC_DEBUG)) {
+			btrfs_debug(info,
+	"%s: not enough devices with free space: have=%d minimum required=%d",
+				    __func__, ctl->ndevs, ctl->devs_min);
+		}
+		return -ENOSPC;
+	}
+
+	ctl->ndevs = min(ctl->ndevs, ctl->devs_max);
+
+	switch (fs_devices->chunk_alloc_policy) {
+	case BTRFS_CHUNK_ALLOC_REGULAR:
+		return decide_stripe_size_regular(ctl, devices_info);
+	default:
+		BUG();
+	}
+}
+
 static int __btrfs_alloc_chunk(struct btrfs_trans_handle *trans,
 			       u64 start, u64 type)
 {
@@ -4984,8 +5062,6 @@ static int __btrfs_alloc_chunk(struct btrfs_trans_handle *trans,
 	struct extent_map *em;
 	struct btrfs_device_info *devices_info = NULL;
 	struct alloc_chunk_ctl ctl;
-	int data_stripes;	/* number of stripes that count for
-				   block group size */
 	int ret;
 	int i;
 	int j;
@@ -5019,61 +5095,9 @@ static int __btrfs_alloc_chunk(struct btrfs_trans_handle *trans,
 	if (ret < 0)
 		goto error;
 
-	/*
-	 * Round down to number of usable stripes, devs_increment can be any
-	 * number so we can't use round_down()
-	 */
-	ctl.ndevs -= ctl.ndevs % ctl.devs_increment;
-
-	if (ctl.ndevs < ctl.devs_min) {
-		ret = -ENOSPC;
-		if (btrfs_test_opt(info, ENOSPC_DEBUG)) {
-			btrfs_debug(info,
-	"%s: not enough devices with free space: have=%d minimum required=%d",
-				    __func__, ctl.ndevs, ctl.devs_min);
-		}
+	ret = decide_stripe_size(fs_devices, &ctl, devices_info);
+	if (ret < 0)
 		goto error;
-	}
-
-	ctl.ndevs = min(ctl.ndevs, ctl.devs_max);
-
-	/*
-	 * The primary goal is to maximize the number of stripes, so use as
-	 * many devices as possible, even if the stripes are not maximum sized.
-	 *
-	 * The DUP profile stores more than one stripe per device, the
-	 * max_avail is the total size so we have to adjust.
-	 */
-	ctl.stripe_size = div_u64(devices_info[ctl.ndevs - 1].max_avail,
-				   ctl.dev_stripes);
-	ctl.num_stripes = ctl.ndevs * ctl.dev_stripes;
-
-	/*
-	 * this will have to be fixed for RAID1 and RAID10 over
-	 * more drives
-	 */
-	data_stripes = (ctl.num_stripes - ctl.nparity) / ctl.ncopies;
-
-	/*
-	 * Use the number of data stripes to figure out how big this chunk
-	 * is really going to be in terms of logical address space,
-	 * and compare that answer with the max chunk size. If it's higher,
-	 * we try to reduce stripe_size.
-	 */
-	if (ctl.stripe_size * data_stripes > ctl.max_chunk_size) {
-		/*
-		 * Reduce stripe_size, round it up to a 16MB boundary again and
-		 * then use it, unless it ends up being even bigger than the
-		 * previous value we had already.
-		 */
-		ctl.stripe_size =
-			min(round_up(div_u64(ctl.max_chunk_size, data_stripes),
-				     SZ_16M),
-			    ctl.stripe_size);
-	}
-
-	/* align to BTRFS_STRIPE_LEN */
-	ctl.stripe_size = round_down(ctl.stripe_size, BTRFS_STRIPE_LEN);
 
 	map = kmalloc(map_lookup_size(ctl.num_stripes), GFP_NOFS);
 	if (!map) {
@@ -5097,8 +5121,6 @@ static int __btrfs_alloc_chunk(struct btrfs_trans_handle *trans,
 	map->type = type;
 	map->sub_stripes = ctl.sub_stripes;
 
-	ctl.chunk_size = ctl.stripe_size * data_stripes;
-
 	trace_btrfs_chunk_alloc(info, map, start, ctl.chunk_size);
 
 	em = alloc_extent_map();
-- 
2.25.0


^ permalink raw reply	[flat|nested] 57+ messages in thread

* [PATCH v2 09/21] btrfs: factor out create_chunk()
  2020-02-12  7:20 [PATCH v2 00/21] btrfs: refactor and generalize chunk/dev_extent/extent allocation Naohiro Aota
                   ` (7 preceding siblings ...)
  2020-02-12  7:20 ` [PATCH v2 08/21] btrfs: factor out decide_stripe_size() Naohiro Aota
@ 2020-02-12  7:20 ` Naohiro Aota
  2020-02-12  8:07   ` Johannes Thumshirn
  2020-02-13 16:24   ` Josef Bacik
  2020-02-12  7:20 ` [PATCH v2 10/21] btrfs: parameterize dev_extent_min Naohiro Aota
                   ` (11 subsequent siblings)
  20 siblings, 2 replies; 57+ messages in thread
From: Naohiro Aota @ 2020-02-12  7:20 UTC (permalink / raw)
  To: linux-btrfs, David Sterba
  Cc: Chris Mason, Josef Bacik, Nikolay Borisov, Damien Le Moal,
	Johannes Thumshirn, Hannes Reinecke, Anand Jain, linux-fsdevel,
	Naohiro Aota

Factor out create_chunk() from __btrfs_alloc_chunk(). This function finally
creates a chunk. There is no functional changes.

Signed-off-by: Naohiro Aota <naohiro.aota@wdc.com>
---
 fs/btrfs/volumes.c | 130 ++++++++++++++++++++++++---------------------
 1 file changed, 70 insertions(+), 60 deletions(-)

diff --git a/fs/btrfs/volumes.c b/fs/btrfs/volumes.c
index 00085943e4dd..3e2e3896d72a 100644
--- a/fs/btrfs/volumes.c
+++ b/fs/btrfs/volumes.c
@@ -5052,90 +5052,53 @@ static int decide_stripe_size(struct btrfs_fs_devices *fs_devices,
 	}
 }
 
-static int __btrfs_alloc_chunk(struct btrfs_trans_handle *trans,
-			       u64 start, u64 type)
+static int create_chunk(struct btrfs_trans_handle *trans,
+			struct alloc_chunk_ctl *ctl,
+			struct btrfs_device_info *devices_info)
 {
 	struct btrfs_fs_info *info = trans->fs_info;
-	struct btrfs_fs_devices *fs_devices = info->fs_devices;
 	struct map_lookup *map = NULL;
 	struct extent_map_tree *em_tree;
 	struct extent_map *em;
-	struct btrfs_device_info *devices_info = NULL;
-	struct alloc_chunk_ctl ctl;
+	u64 start = ctl->start;
+	u64 type = ctl->type;
 	int ret;
 	int i;
 	int j;
 
-	if (!alloc_profile_is_valid(type, 0)) {
-		ASSERT(0);
-		return -EINVAL;
-	}
-
-	if (list_empty(&fs_devices->alloc_list)) {
-		if (btrfs_test_opt(info, ENOSPC_DEBUG))
-			btrfs_debug(info, "%s: no writable device", __func__);
-		return -ENOSPC;
-	}
-
-	if (!(type & BTRFS_BLOCK_GROUP_TYPE_MASK)) {
-		btrfs_err(info, "invalid chunk type 0x%llx requested", type);
-		BUG();
-	}
-
-	ctl.start = start;
-	ctl.type = type;
-	init_alloc_chunk_ctl(fs_devices, &ctl);
-
-	devices_info = kcalloc(fs_devices->rw_devices, sizeof(*devices_info),
-			       GFP_NOFS);
-	if (!devices_info)
+	map = kmalloc(map_lookup_size(ctl->num_stripes), GFP_NOFS);
+	if (!map)
 		return -ENOMEM;
+	map->num_stripes = ctl->num_stripes;
 
-	ret = gather_device_info(fs_devices, &ctl, devices_info);
-	if (ret < 0)
-		goto error;
-
-	ret = decide_stripe_size(fs_devices, &ctl, devices_info);
-	if (ret < 0)
-		goto error;
-
-	map = kmalloc(map_lookup_size(ctl.num_stripes), GFP_NOFS);
-	if (!map) {
-		ret = -ENOMEM;
-		goto error;
-	}
-
-	map->num_stripes = ctl.num_stripes;
-
-	for (i = 0; i < ctl.ndevs; ++i) {
-		for (j = 0; j < ctl.dev_stripes; ++j) {
-			int s = i * ctl.dev_stripes + j;
+	for (i = 0; i < ctl->ndevs; ++i) {
+		for (j = 0; j < ctl->dev_stripes; ++j) {
+			int s = i * ctl->dev_stripes + j;
 			map->stripes[s].dev = devices_info[i].dev;
 			map->stripes[s].physical = devices_info[i].dev_offset +
-						   j * ctl.stripe_size;
+						   j * ctl->stripe_size;
 		}
 	}
 	map->stripe_len = BTRFS_STRIPE_LEN;
 	map->io_align = BTRFS_STRIPE_LEN;
 	map->io_width = BTRFS_STRIPE_LEN;
 	map->type = type;
-	map->sub_stripes = ctl.sub_stripes;
+	map->sub_stripes = ctl->sub_stripes;
 
-	trace_btrfs_chunk_alloc(info, map, start, ctl.chunk_size);
+	trace_btrfs_chunk_alloc(info, map, start, ctl->chunk_size);
 
 	em = alloc_extent_map();
 	if (!em) {
 		kfree(map);
-		ret = -ENOMEM;
-		goto error;
+		return -ENOMEM;
 	}
 	set_bit(EXTENT_FLAG_FS_MAPPING, &em->flags);
 	em->map_lookup = map;
 	em->start = start;
-	em->len = ctl.chunk_size;
+	em->len = ctl->chunk_size;
 	em->block_start = 0;
 	em->block_len = em->len;
-	em->orig_block_len = ctl.stripe_size;
+	em->orig_block_len = ctl->stripe_size;
 
 	em_tree = &info->mapping_tree;
 	write_lock(&em_tree->lock);
@@ -5143,11 +5106,11 @@ static int __btrfs_alloc_chunk(struct btrfs_trans_handle *trans,
 	if (ret) {
 		write_unlock(&em_tree->lock);
 		free_extent_map(em);
-		goto error;
+		return ret;
 	}
 	write_unlock(&em_tree->lock);
 
-	ret = btrfs_make_block_group(trans, 0, type, start, ctl.chunk_size);
+	ret = btrfs_make_block_group(trans, 0, type, start, ctl->chunk_size);
 	if (ret)
 		goto error_del_extent;
 
@@ -5155,20 +5118,19 @@ static int __btrfs_alloc_chunk(struct btrfs_trans_handle *trans,
 		struct btrfs_device *dev = map->stripes[i].dev;
 
 		btrfs_device_set_bytes_used(dev,
-					    dev->bytes_used + ctl.stripe_size);
+					    dev->bytes_used + ctl->stripe_size);
 		if (list_empty(&dev->post_commit_list))
 			list_add_tail(&dev->post_commit_list,
 				      &trans->transaction->dev_update_list);
 	}
 
-	atomic64_sub(ctl.stripe_size * map->num_stripes,
+	atomic64_sub(ctl->stripe_size * map->num_stripes,
 		     &info->free_chunk_space);
 
 	free_extent_map(em);
 	check_raid56_incompat_flag(info, type);
 	check_raid1c34_incompat_flag(info, type);
 
-	kfree(devices_info);
 	return 0;
 
 error_del_extent:
@@ -5180,7 +5142,55 @@ static int __btrfs_alloc_chunk(struct btrfs_trans_handle *trans,
 	free_extent_map(em);
 	/* One for the tree reference */
 	free_extent_map(em);
-error:
+
+	return ret;
+}
+
+static int __btrfs_alloc_chunk(struct btrfs_trans_handle *trans,
+			       u64 start, u64 type)
+{
+	struct btrfs_fs_info *info = trans->fs_info;
+	struct btrfs_fs_devices *fs_devices = info->fs_devices;
+	struct btrfs_device_info *devices_info = NULL;
+	struct alloc_chunk_ctl ctl;
+	int ret;
+
+	if (!alloc_profile_is_valid(type, 0)) {
+		ASSERT(0);
+		return -EINVAL;
+	}
+
+	if (list_empty(&fs_devices->alloc_list)) {
+		if (btrfs_test_opt(info, ENOSPC_DEBUG))
+			btrfs_debug(info, "%s: no writable device", __func__);
+		return -ENOSPC;
+	}
+
+	if (!(type & BTRFS_BLOCK_GROUP_TYPE_MASK)) {
+		btrfs_err(info, "invalid chunk type 0x%llx requested", type);
+		BUG();
+	}
+
+	ctl.start = start;
+	ctl.type = type;
+	init_alloc_chunk_ctl(fs_devices, &ctl);
+
+	devices_info = kcalloc(fs_devices->rw_devices, sizeof(*devices_info),
+			       GFP_NOFS);
+	if (!devices_info)
+		return -ENOMEM;
+
+	ret = gather_device_info(fs_devices, &ctl, devices_info);
+	if (ret < 0)
+		goto out;
+
+	ret = decide_stripe_size(fs_devices, &ctl, devices_info);
+	if (ret < 0)
+		goto out;
+
+	ret = create_chunk(trans, &ctl, devices_info);
+
+out:
 	kfree(devices_info);
 	return ret;
 }
-- 
2.25.0


^ permalink raw reply	[flat|nested] 57+ messages in thread

* [PATCH v2 10/21] btrfs: parameterize dev_extent_min
  2020-02-12  7:20 [PATCH v2 00/21] btrfs: refactor and generalize chunk/dev_extent/extent allocation Naohiro Aota
                   ` (8 preceding siblings ...)
  2020-02-12  7:20 ` [PATCH v2 09/21] btrfs: factor out create_chunk() Naohiro Aota
@ 2020-02-12  7:20 ` Naohiro Aota
  2020-02-13 16:27   ` Josef Bacik
  2020-02-12  7:20 ` [PATCH v2 11/21] btrfs: introduce extent allocation policy Naohiro Aota
                   ` (10 subsequent siblings)
  20 siblings, 1 reply; 57+ messages in thread
From: Naohiro Aota @ 2020-02-12  7:20 UTC (permalink / raw)
  To: linux-btrfs, David Sterba
  Cc: Chris Mason, Josef Bacik, Nikolay Borisov, Damien Le Moal,
	Johannes Thumshirn, Hannes Reinecke, Anand Jain, linux-fsdevel,
	Naohiro Aota

Currently, we ignore a device whose available space is less than
"BTRFS_STRIPE_LEN * dev_stripes". This is a lower limit for current
allocation policy (to maximize the number of stripes). This commit
parameterizes dev_extent_min, so that other policies can set their own
lower limitation to ignore a device with an insufficient space.

Signed-off-by: Naohiro Aota <naohiro.aota@wdc.com>
---
 fs/btrfs/volumes.c | 9 +++++----
 1 file changed, 5 insertions(+), 4 deletions(-)

diff --git a/fs/btrfs/volumes.c b/fs/btrfs/volumes.c
index 3e2e3896d72a..38c2c425b997 100644
--- a/fs/btrfs/volumes.c
+++ b/fs/btrfs/volumes.c
@@ -4836,6 +4836,7 @@ struct alloc_chunk_ctl {
 				   store parity information */
 	u64 max_stripe_size;
 	u64 max_chunk_size;
+	u64 dev_extent_min;
 	u64 stripe_size;
 	u64 chunk_size;
 	int ndevs;
@@ -4869,6 +4870,7 @@ init_alloc_chunk_ctl_policy_regular(struct btrfs_fs_devices *fs_devices,
 	/* We don't want a chunk larger than 10% of writable space */
 	ctl->max_chunk_size = min(div_factor(fs_devices->total_rw_bytes, 1),
 				  ctl->max_chunk_size);
+	ctl->dev_extent_min = BTRFS_STRIPE_LEN * ctl->dev_stripes;
 }
 
 static void init_alloc_chunk_ctl(struct btrfs_fs_devices *fs_devices,
@@ -4904,7 +4906,6 @@ static int gather_device_info(struct btrfs_fs_devices *fs_devices,
 	struct btrfs_device *device;
 	u64 total_avail;
 	u64 dev_extent_want = ctl->max_stripe_size * ctl->dev_stripes;
-	u64 dev_extent_min = BTRFS_STRIPE_LEN * ctl->dev_stripes;
 	int ret;
 	int ndevs = 0;
 	u64 max_avail;
@@ -4932,7 +4933,7 @@ static int gather_device_info(struct btrfs_fs_devices *fs_devices,
 			total_avail = 0;
 
 		/* If there is no space on this device, skip it. */
-		if (total_avail == 0)
+		if (total_avail < ctl->dev_extent_min)
 			continue;
 
 		ret = find_free_dev_extent(device, dev_extent_want, &dev_offset,
@@ -4943,12 +4944,12 @@ static int gather_device_info(struct btrfs_fs_devices *fs_devices,
 		if (ret == 0)
 			max_avail = dev_extent_want;
 
-		if (max_avail < dev_extent_min) {
+		if (max_avail < ctl->dev_extent_min) {
 			if (btrfs_test_opt(info, ENOSPC_DEBUG))
 				btrfs_debug(info,
 			"%s: devid %llu has no free space, have=%llu want=%llu",
 					    __func__, device->devid, max_avail,
-					    dev_extent_min);
+					    ctl->dev_extent_min);
 			continue;
 		}
 
-- 
2.25.0


^ permalink raw reply	[flat|nested] 57+ messages in thread

* [PATCH v2 11/21] btrfs: introduce extent allocation policy
  2020-02-12  7:20 [PATCH v2 00/21] btrfs: refactor and generalize chunk/dev_extent/extent allocation Naohiro Aota
                   ` (9 preceding siblings ...)
  2020-02-12  7:20 ` [PATCH v2 10/21] btrfs: parameterize dev_extent_min Naohiro Aota
@ 2020-02-12  7:20 ` Naohiro Aota
  2020-02-13 19:55   ` Josef Bacik
  2020-02-12  7:20 ` [PATCH v2 12/21] btrfs: move hint_byte into find_free_extent_ctl Naohiro Aota
                   ` (9 subsequent siblings)
  20 siblings, 1 reply; 57+ messages in thread
From: Naohiro Aota @ 2020-02-12  7:20 UTC (permalink / raw)
  To: linux-btrfs, David Sterba
  Cc: Chris Mason, Josef Bacik, Nikolay Borisov, Damien Le Moal,
	Johannes Thumshirn, Hannes Reinecke, Anand Jain, linux-fsdevel,
	Naohiro Aota

This commit introduces extent allocation policy for btrfs. This policy
controls how btrfs allocate an extents from block groups.

There is no functional change introduced with this commit.

Signed-off-by: Naohiro Aota <naohiro.aota@wdc.com>
---
 fs/btrfs/extent-tree.c | 8 ++++++++
 1 file changed, 8 insertions(+)

diff --git a/fs/btrfs/extent-tree.c b/fs/btrfs/extent-tree.c
index 227d4d628b90..247d68eb4735 100644
--- a/fs/btrfs/extent-tree.c
+++ b/fs/btrfs/extent-tree.c
@@ -3438,6 +3438,10 @@ btrfs_release_block_group(struct btrfs_block_group *cache,
 	btrfs_put_block_group(cache);
 }
 
+enum btrfs_extent_allocation_policy {
+	BTRFS_EXTENT_ALLOC_CLUSTERED,
+};
+
 /*
  * Structure used internally for find_free_extent() function.  Wraps needed
  * parameters.
@@ -3489,6 +3493,9 @@ struct find_free_extent_ctl {
 
 	/* Found result */
 	u64 found_offset;
+
+	/* Allocation policy */
+	enum btrfs_extent_allocation_policy policy;
 };
 
 
@@ -3826,6 +3833,7 @@ static noinline int find_free_extent(struct btrfs_fs_info *fs_info,
 	ffe_ctl.have_caching_bg = false;
 	ffe_ctl.orig_have_caching_bg = false;
 	ffe_ctl.found_offset = 0;
+	ffe_ctl.policy = BTRFS_EXTENT_ALLOC_CLUSTERED;
 
 	ins->type = BTRFS_EXTENT_ITEM_KEY;
 	ins->objectid = 0;
-- 
2.25.0


^ permalink raw reply	[flat|nested] 57+ messages in thread

* [PATCH v2 12/21] btrfs: move hint_byte into find_free_extent_ctl
  2020-02-12  7:20 [PATCH v2 00/21] btrfs: refactor and generalize chunk/dev_extent/extent allocation Naohiro Aota
                   ` (10 preceding siblings ...)
  2020-02-12  7:20 ` [PATCH v2 11/21] btrfs: introduce extent allocation policy Naohiro Aota
@ 2020-02-12  7:20 ` Naohiro Aota
  2020-02-12 14:11   ` Johannes Thumshirn
  2020-02-13 19:56   ` Josef Bacik
  2020-02-12  7:20 ` [PATCH v2 13/21] btrfs: move vairalbes for clustered allocation " Naohiro Aota
                   ` (8 subsequent siblings)
  20 siblings, 2 replies; 57+ messages in thread
From: Naohiro Aota @ 2020-02-12  7:20 UTC (permalink / raw)
  To: linux-btrfs, David Sterba
  Cc: Chris Mason, Josef Bacik, Nikolay Borisov, Damien Le Moal,
	Johannes Thumshirn, Hannes Reinecke, Anand Jain, linux-fsdevel,
	Naohiro Aota

This commit moves hint_byte into find_free_extent_ctl, so that we can
modify the hint_byte in the other functions. This will help us split
find_free_extent further. This commit also renames the function argument
"hint_byte" to "hint_byte_orig" to avoid misuse.

Signed-off-by: Naohiro Aota <naohiro.aota@wdc.com>
---
 fs/btrfs/extent-tree.c | 14 +++++++++-----
 1 file changed, 9 insertions(+), 5 deletions(-)

diff --git a/fs/btrfs/extent-tree.c b/fs/btrfs/extent-tree.c
index 247d68eb4735..b1f52eee24fe 100644
--- a/fs/btrfs/extent-tree.c
+++ b/fs/btrfs/extent-tree.c
@@ -3494,6 +3494,9 @@ struct find_free_extent_ctl {
 	/* Found result */
 	u64 found_offset;
 
+	/* Hint byte to start looking for an empty space */
+	u64 hint_byte;
+
 	/* Allocation policy */
 	enum btrfs_extent_allocation_policy policy;
 };
@@ -3808,7 +3811,7 @@ static int find_free_extent_update_loop(struct btrfs_fs_info *fs_info,
  */
 static noinline int find_free_extent(struct btrfs_fs_info *fs_info,
 				u64 ram_bytes, u64 num_bytes, u64 empty_size,
-				u64 hint_byte, struct btrfs_key *ins,
+				u64 hint_byte_orig, struct btrfs_key *ins,
 				u64 flags, int delalloc)
 {
 	int ret = 0;
@@ -3833,6 +3836,7 @@ static noinline int find_free_extent(struct btrfs_fs_info *fs_info,
 	ffe_ctl.have_caching_bg = false;
 	ffe_ctl.orig_have_caching_bg = false;
 	ffe_ctl.found_offset = 0;
+	ffe_ctl.hint_byte = hint_byte_orig;
 	ffe_ctl.policy = BTRFS_EXTENT_ALLOC_CLUSTERED;
 
 	ins->type = BTRFS_EXTENT_ITEM_KEY;
@@ -3875,14 +3879,14 @@ static noinline int find_free_extent(struct btrfs_fs_info *fs_info,
 	if (last_ptr) {
 		spin_lock(&last_ptr->lock);
 		if (last_ptr->block_group)
-			hint_byte = last_ptr->window_start;
+			ffe_ctl.hint_byte = last_ptr->window_start;
 		if (last_ptr->fragmented) {
 			/*
 			 * We still set window_start so we can keep track of the
 			 * last place we found an allocation to try and save
 			 * some time.
 			 */
-			hint_byte = last_ptr->window_start;
+			ffe_ctl.hint_byte = last_ptr->window_start;
 			use_cluster = false;
 		}
 		spin_unlock(&last_ptr->lock);
@@ -3890,8 +3894,8 @@ static noinline int find_free_extent(struct btrfs_fs_info *fs_info,
 
 	ffe_ctl.search_start = max(ffe_ctl.search_start,
 				   first_logical_byte(fs_info, 0));
-	ffe_ctl.search_start = max(ffe_ctl.search_start, hint_byte);
-	if (ffe_ctl.search_start == hint_byte) {
+	ffe_ctl.search_start = max(ffe_ctl.search_start, ffe_ctl.hint_byte);
+	if (ffe_ctl.search_start == ffe_ctl.hint_byte) {
 		block_group = btrfs_lookup_block_group(fs_info,
 						       ffe_ctl.search_start);
 		/*
-- 
2.25.0


^ permalink raw reply	[flat|nested] 57+ messages in thread

* [PATCH v2 13/21] btrfs: move vairalbes for clustered allocation into find_free_extent_ctl
  2020-02-12  7:20 [PATCH v2 00/21] btrfs: refactor and generalize chunk/dev_extent/extent allocation Naohiro Aota
                   ` (11 preceding siblings ...)
  2020-02-12  7:20 ` [PATCH v2 12/21] btrfs: move hint_byte into find_free_extent_ctl Naohiro Aota
@ 2020-02-12  7:20 ` " Naohiro Aota
  2020-02-12 14:14   ` Johannes Thumshirn
                     ` (2 more replies)
  2020-02-12  7:20 ` [PATCH v2 14/21] btrfs: factor out do_allocation() Naohiro Aota
                   ` (7 subsequent siblings)
  20 siblings, 3 replies; 57+ messages in thread
From: Naohiro Aota @ 2020-02-12  7:20 UTC (permalink / raw)
  To: linux-btrfs, David Sterba
  Cc: Chris Mason, Josef Bacik, Nikolay Borisov, Damien Le Moal,
	Johannes Thumshirn, Hannes Reinecke, Anand Jain, linux-fsdevel,
	Naohiro Aota

Move "last_ptr" and "use_cluster" into struct find_free_extent_ctl, so that
hook functions for clustered allocator can use these variables.

Signed-off-by: Naohiro Aota <naohiro.aota@wdc.com>
---
 fs/btrfs/extent-tree.c | 38 +++++++++++++++++++++++---------------
 1 file changed, 23 insertions(+), 15 deletions(-)

diff --git a/fs/btrfs/extent-tree.c b/fs/btrfs/extent-tree.c
index b1f52eee24fe..fb62842ff3e6 100644
--- a/fs/btrfs/extent-tree.c
+++ b/fs/btrfs/extent-tree.c
@@ -3458,6 +3458,8 @@ struct find_free_extent_ctl {
 
 	/* For clustered allocation */
 	u64 empty_cluster;
+	struct btrfs_free_cluster *last_ptr;
+	bool use_cluster;
 
 	bool have_caching_bg;
 	bool orig_have_caching_bg;
@@ -3816,11 +3818,9 @@ static noinline int find_free_extent(struct btrfs_fs_info *fs_info,
 {
 	int ret = 0;
 	int cache_block_group_error = 0;
-	struct btrfs_free_cluster *last_ptr = NULL;
 	struct btrfs_block_group *block_group = NULL;
 	struct find_free_extent_ctl ffe_ctl = {0};
 	struct btrfs_space_info *space_info;
-	bool use_cluster = true;
 	bool full_search = false;
 
 	WARN_ON(num_bytes < fs_info->sectorsize);
@@ -3829,8 +3829,6 @@ static noinline int find_free_extent(struct btrfs_fs_info *fs_info,
 	ffe_ctl.empty_size = empty_size;
 	ffe_ctl.flags = flags;
 	ffe_ctl.search_start = 0;
-	ffe_ctl.retry_clustered = false;
-	ffe_ctl.retry_unclustered = false;
 	ffe_ctl.delalloc = delalloc;
 	ffe_ctl.index = btrfs_bg_flags_to_raid_index(flags);
 	ffe_ctl.have_caching_bg = false;
@@ -3839,6 +3837,12 @@ static noinline int find_free_extent(struct btrfs_fs_info *fs_info,
 	ffe_ctl.hint_byte = hint_byte_orig;
 	ffe_ctl.policy = BTRFS_EXTENT_ALLOC_CLUSTERED;
 
+	/* For clustered allocation */
+	ffe_ctl.retry_clustered = false;
+	ffe_ctl.retry_unclustered = false;
+	ffe_ctl.last_ptr = NULL;
+	ffe_ctl.use_cluster = true;
+
 	ins->type = BTRFS_EXTENT_ITEM_KEY;
 	ins->objectid = 0;
 	ins->offset = 0;
@@ -3869,14 +3873,16 @@ static noinline int find_free_extent(struct btrfs_fs_info *fs_info,
 			spin_unlock(&space_info->lock);
 			return -ENOSPC;
 		} else if (space_info->max_extent_size) {
-			use_cluster = false;
+			ffe_ctl.use_cluster = false;
 		}
 		spin_unlock(&space_info->lock);
 	}
 
-	last_ptr = fetch_cluster_info(fs_info, space_info,
-				      &ffe_ctl.empty_cluster);
-	if (last_ptr) {
+	ffe_ctl.last_ptr = fetch_cluster_info(fs_info, space_info,
+					      &ffe_ctl.empty_cluster);
+	if (ffe_ctl.last_ptr) {
+		struct btrfs_free_cluster *last_ptr = ffe_ctl.last_ptr;
+
 		spin_lock(&last_ptr->lock);
 		if (last_ptr->block_group)
 			ffe_ctl.hint_byte = last_ptr->window_start;
@@ -3887,7 +3893,7 @@ static noinline int find_free_extent(struct btrfs_fs_info *fs_info,
 			 * some time.
 			 */
 			ffe_ctl.hint_byte = last_ptr->window_start;
-			use_cluster = false;
+			ffe_ctl.use_cluster = false;
 		}
 		spin_unlock(&last_ptr->lock);
 	}
@@ -4000,10 +4006,11 @@ static noinline int find_free_extent(struct btrfs_fs_info *fs_info,
 		 * Ok we want to try and use the cluster allocator, so
 		 * lets look there
 		 */
-		if (last_ptr && use_cluster) {
+		if (ffe_ctl.last_ptr && ffe_ctl.use_cluster) {
 			struct btrfs_block_group *cluster_bg = NULL;
 
-			ret = find_free_extent_clustered(block_group, last_ptr,
+			ret = find_free_extent_clustered(block_group,
+							 ffe_ctl.last_ptr,
 							 &ffe_ctl, &cluster_bg);
 
 			if (ret == 0) {
@@ -4021,8 +4028,8 @@ static noinline int find_free_extent(struct btrfs_fs_info *fs_info,
 			/* ret == -ENOENT case falls through */
 		}
 
-		ret = find_free_extent_unclustered(block_group, last_ptr,
-						   &ffe_ctl);
+		ret = find_free_extent_unclustered(block_group,
+						   ffe_ctl.last_ptr, &ffe_ctl);
 		if (ret == -EAGAIN)
 			goto have_block_group;
 		else if (ret > 0)
@@ -4071,8 +4078,9 @@ static noinline int find_free_extent(struct btrfs_fs_info *fs_info,
 	}
 	up_read(&space_info->groups_sem);
 
-	ret = find_free_extent_update_loop(fs_info, last_ptr, ins, &ffe_ctl,
-					   full_search, use_cluster);
+	ret = find_free_extent_update_loop(fs_info, ffe_ctl.last_ptr, ins,
+					   &ffe_ctl, full_search,
+					   ffe_ctl.use_cluster);
 	if (ret > 0)
 		goto search;
 
-- 
2.25.0


^ permalink raw reply	[flat|nested] 57+ messages in thread

* [PATCH v2 14/21] btrfs: factor out do_allocation()
  2020-02-12  7:20 [PATCH v2 00/21] btrfs: refactor and generalize chunk/dev_extent/extent allocation Naohiro Aota
                   ` (12 preceding siblings ...)
  2020-02-12  7:20 ` [PATCH v2 13/21] btrfs: move vairalbes for clustered allocation " Naohiro Aota
@ 2020-02-12  7:20 ` Naohiro Aota
  2020-02-12 14:45   ` Johannes Thumshirn
  2020-02-13 19:57   ` Josef Bacik
  2020-02-12  7:20 ` [PATCH v2 15/21] btrfs: drop unnecessary arguments from clustered allocation functions Naohiro Aota
                   ` (6 subsequent siblings)
  20 siblings, 2 replies; 57+ messages in thread
From: Naohiro Aota @ 2020-02-12  7:20 UTC (permalink / raw)
  To: linux-btrfs, David Sterba
  Cc: Chris Mason, Josef Bacik, Nikolay Borisov, Damien Le Moal,
	Johannes Thumshirn, Hannes Reinecke, Anand Jain, linux-fsdevel,
	Naohiro Aota

Factor out do_allocation() from find_free_extent(). This function do an
actual allocation in a given block group. The ffe_ctl->policy is used to
determine the actual allocator function to use.

Signed-off-by: Naohiro Aota <naohiro.aota@wdc.com>
---
 fs/btrfs/extent-tree.c | 78 +++++++++++++++++++++++++-----------------
 1 file changed, 47 insertions(+), 31 deletions(-)

diff --git a/fs/btrfs/extent-tree.c b/fs/btrfs/extent-tree.c
index fb62842ff3e6..f9fadddf0144 100644
--- a/fs/btrfs/extent-tree.c
+++ b/fs/btrfs/extent-tree.c
@@ -3675,6 +3675,39 @@ static int find_free_extent_unclustered(struct btrfs_block_group *bg,
 	return 0;
 }
 
+static int do_allocation_clustered(struct btrfs_block_group *block_group,
+				   struct find_free_extent_ctl *ffe_ctl,
+				   struct btrfs_block_group **bg_ret)
+{
+	int ret;
+
+	/*
+	 * Ok we want to try and use the cluster allocator, so lets look there
+	 */
+	if (ffe_ctl->last_ptr && ffe_ctl->use_cluster) {
+		ret = find_free_extent_clustered(block_group, ffe_ctl->last_ptr,
+						 ffe_ctl, bg_ret);
+		if (ret >= 0 || ret == -EAGAIN)
+			return ret;
+		/* ret == -ENOENT case falls through */
+	}
+
+	return find_free_extent_unclustered(block_group, ffe_ctl->last_ptr,
+					    ffe_ctl);
+}
+
+static int do_allocation(struct btrfs_block_group *block_group,
+			 struct find_free_extent_ctl *ffe_ctl,
+			 struct btrfs_block_group **bg_ret)
+{
+	switch (ffe_ctl->policy) {
+	case BTRFS_EXTENT_ALLOC_CLUSTERED:
+		return do_allocation_clustered(block_group, ffe_ctl, bg_ret);
+	default:
+		BUG();
+	}
+}
+
 /*
  * Return >0 means caller needs to re-search for free extent
  * Return 0 means we have the needed free extent.
@@ -3942,6 +3975,8 @@ static noinline int find_free_extent(struct btrfs_fs_info *fs_info,
 	down_read(&space_info->groups_sem);
 	list_for_each_entry(block_group,
 			    &space_info->block_groups[ffe_ctl.index], list) {
+		struct btrfs_block_group *bg_ret;
+
 		/* If the block group is read-only, we can skip it entirely. */
 		if (unlikely(block_group->ro))
 			continue;
@@ -4002,40 +4037,21 @@ static noinline int find_free_extent(struct btrfs_fs_info *fs_info,
 		if (unlikely(block_group->cached == BTRFS_CACHE_ERROR))
 			goto loop;
 
-		/*
-		 * Ok we want to try and use the cluster allocator, so
-		 * lets look there
-		 */
-		if (ffe_ctl.last_ptr && ffe_ctl.use_cluster) {
-			struct btrfs_block_group *cluster_bg = NULL;
-
-			ret = find_free_extent_clustered(block_group,
-							 ffe_ctl.last_ptr,
-							 &ffe_ctl, &cluster_bg);
-
-			if (ret == 0) {
-				if (cluster_bg && cluster_bg != block_group) {
-					btrfs_release_block_group(block_group,
-								  delalloc);
-					block_group = cluster_bg;
-				}
-				goto checks;
-			} else if (ret == -EAGAIN) {
-				goto have_block_group;
-			} else if (ret > 0) {
-				goto loop;
+		bg_ret = NULL;
+		ret = do_allocation(block_group, &ffe_ctl, &bg_ret);
+		if (ret == 0) {
+			if (bg_ret && bg_ret != block_group) {
+				btrfs_release_block_group(block_group,
+							  delalloc);
+				block_group = bg_ret;
 			}
-			/* ret == -ENOENT case falls through */
-		}
-
-		ret = find_free_extent_unclustered(block_group,
-						   ffe_ctl.last_ptr, &ffe_ctl);
-		if (ret == -EAGAIN)
+		} else if (ret == -EAGAIN) {
 			goto have_block_group;
-		else if (ret > 0)
+		} else if (ret > 0) {
 			goto loop;
-		/* ret == 0 case falls through */
-checks:
+		}
+
+		/* checks */
 		ffe_ctl.search_start = round_up(ffe_ctl.found_offset,
 					     fs_info->stripesize);
 
-- 
2.25.0


^ permalink raw reply	[flat|nested] 57+ messages in thread

* [PATCH v2 15/21] btrfs: drop unnecessary arguments from clustered allocation functions
  2020-02-12  7:20 [PATCH v2 00/21] btrfs: refactor and generalize chunk/dev_extent/extent allocation Naohiro Aota
                   ` (13 preceding siblings ...)
  2020-02-12  7:20 ` [PATCH v2 14/21] btrfs: factor out do_allocation() Naohiro Aota
@ 2020-02-12  7:20 ` Naohiro Aota
  2020-02-12 14:47   ` Johannes Thumshirn
  2020-02-13 19:58   ` Josef Bacik
  2020-02-12  7:20 ` [PATCH v2 16/21] btrfs: factor out release_block_group() Naohiro Aota
                   ` (5 subsequent siblings)
  20 siblings, 2 replies; 57+ messages in thread
From: Naohiro Aota @ 2020-02-12  7:20 UTC (permalink / raw)
  To: linux-btrfs, David Sterba
  Cc: Chris Mason, Josef Bacik, Nikolay Borisov, Damien Le Moal,
	Johannes Thumshirn, Hannes Reinecke, Anand Jain, linux-fsdevel,
	Naohiro Aota

Now that, find_free_extent_clustered() and find_free_extent_unclustered()
can access "last_ptr" from the "clustered" variable. So, we can drop it
from the arguments.

Signed-off-by: Naohiro Aota <naohiro.aota@wdc.com>
---
 fs/btrfs/extent-tree.c | 16 +++++++---------
 1 file changed, 7 insertions(+), 9 deletions(-)

diff --git a/fs/btrfs/extent-tree.c b/fs/btrfs/extent-tree.c
index f9fadddf0144..3dee6a385137 100644
--- a/fs/btrfs/extent-tree.c
+++ b/fs/btrfs/extent-tree.c
@@ -3513,11 +3513,11 @@ struct find_free_extent_ctl {
  * Return 0 means we have found a location and set ffe_ctl->found_offset.
  */
 static int find_free_extent_clustered(struct btrfs_block_group *bg,
-		struct btrfs_free_cluster *last_ptr,
-		struct find_free_extent_ctl *ffe_ctl,
-		struct btrfs_block_group **cluster_bg_ret)
+				      struct find_free_extent_ctl *ffe_ctl,
+				      struct btrfs_block_group **cluster_bg_ret)
 {
 	struct btrfs_block_group *cluster_bg;
+	struct btrfs_free_cluster *last_ptr = ffe_ctl->last_ptr;
 	u64 aligned_cluster;
 	u64 offset;
 	int ret;
@@ -3617,9 +3617,9 @@ static int find_free_extent_clustered(struct btrfs_block_group *bg,
  * Return -EAGAIN to inform caller that we need to re-search this block group
  */
 static int find_free_extent_unclustered(struct btrfs_block_group *bg,
-		struct btrfs_free_cluster *last_ptr,
-		struct find_free_extent_ctl *ffe_ctl)
+					struct find_free_extent_ctl *ffe_ctl)
 {
+	struct btrfs_free_cluster *last_ptr = ffe_ctl->last_ptr;
 	u64 offset;
 
 	/*
@@ -3685,15 +3685,13 @@ static int do_allocation_clustered(struct btrfs_block_group *block_group,
 	 * Ok we want to try and use the cluster allocator, so lets look there
 	 */
 	if (ffe_ctl->last_ptr && ffe_ctl->use_cluster) {
-		ret = find_free_extent_clustered(block_group, ffe_ctl->last_ptr,
-						 ffe_ctl, bg_ret);
+		ret = find_free_extent_clustered(block_group, ffe_ctl, bg_ret);
 		if (ret >= 0 || ret == -EAGAIN)
 			return ret;
 		/* ret == -ENOENT case falls through */
 	}
 
-	return find_free_extent_unclustered(block_group, ffe_ctl->last_ptr,
-					    ffe_ctl);
+	return find_free_extent_unclustered(block_group, ffe_ctl);
 }
 
 static int do_allocation(struct btrfs_block_group *block_group,
-- 
2.25.0


^ permalink raw reply	[flat|nested] 57+ messages in thread

* [PATCH v2 16/21] btrfs: factor out release_block_group()
  2020-02-12  7:20 [PATCH v2 00/21] btrfs: refactor and generalize chunk/dev_extent/extent allocation Naohiro Aota
                   ` (14 preceding siblings ...)
  2020-02-12  7:20 ` [PATCH v2 15/21] btrfs: drop unnecessary arguments from clustered allocation functions Naohiro Aota
@ 2020-02-12  7:20 ` Naohiro Aota
  2020-02-12 14:48   ` Johannes Thumshirn
  2020-02-13 19:58   ` Josef Bacik
  2020-02-12  7:20 ` [PATCH v2 17/21] btrfs: factor out found_extent() Naohiro Aota
                   ` (4 subsequent siblings)
  20 siblings, 2 replies; 57+ messages in thread
From: Naohiro Aota @ 2020-02-12  7:20 UTC (permalink / raw)
  To: linux-btrfs, David Sterba
  Cc: Chris Mason, Josef Bacik, Nikolay Borisov, Damien Le Moal,
	Johannes Thumshirn, Hannes Reinecke, Anand Jain, linux-fsdevel,
	Naohiro Aota

Factor out release_block_group() from find_free_extent(). This function is
called when it gives up an allocation from a block group. Each allocation
policy should reset their information for an allocation in the next block
group.

Signed-off-by: Naohiro Aota <naohiro.aota@wdc.com>
---
 fs/btrfs/extent-tree.c | 24 +++++++++++++++++++-----
 1 file changed, 19 insertions(+), 5 deletions(-)

diff --git a/fs/btrfs/extent-tree.c b/fs/btrfs/extent-tree.c
index 3dee6a385137..276c12392a85 100644
--- a/fs/btrfs/extent-tree.c
+++ b/fs/btrfs/extent-tree.c
@@ -3706,6 +3706,24 @@ static int do_allocation(struct btrfs_block_group *block_group,
 	}
 }
 
+static void release_block_group(struct btrfs_block_group *block_group,
+				struct find_free_extent_ctl *ffe_ctl,
+				int delalloc)
+{
+	switch (ffe_ctl->policy) {
+	case BTRFS_EXTENT_ALLOC_CLUSTERED:
+		ffe_ctl->retry_clustered = false;
+		ffe_ctl->retry_unclustered = false;
+		break;
+	default:
+		BUG();
+	}
+
+	BUG_ON(btrfs_bg_flags_to_raid_index(block_group->flags) !=
+		ffe_ctl->index);
+	btrfs_release_block_group(block_group, delalloc);
+}
+
 /*
  * Return >0 means caller needs to re-search for free extent
  * Return 0 means we have the needed free extent.
@@ -4083,11 +4101,7 @@ static noinline int find_free_extent(struct btrfs_fs_info *fs_info,
 		btrfs_release_block_group(block_group, delalloc);
 		break;
 loop:
-		ffe_ctl.retry_clustered = false;
-		ffe_ctl.retry_unclustered = false;
-		BUG_ON(btrfs_bg_flags_to_raid_index(block_group->flags) !=
-		       ffe_ctl.index);
-		btrfs_release_block_group(block_group, delalloc);
+		release_block_group(block_group, &ffe_ctl, delalloc);
 		cond_resched();
 	}
 	up_read(&space_info->groups_sem);
-- 
2.25.0


^ permalink raw reply	[flat|nested] 57+ messages in thread

* [PATCH v2 17/21] btrfs: factor out found_extent()
  2020-02-12  7:20 [PATCH v2 00/21] btrfs: refactor and generalize chunk/dev_extent/extent allocation Naohiro Aota
                   ` (15 preceding siblings ...)
  2020-02-12  7:20 ` [PATCH v2 16/21] btrfs: factor out release_block_group() Naohiro Aota
@ 2020-02-12  7:20 ` Naohiro Aota
  2020-02-12 14:50   ` Johannes Thumshirn
  2020-02-13 19:58   ` Josef Bacik
  2020-02-12  7:20 ` [PATCH v2 18/21] btrfs: drop unnecessary arguments from find_free_extent_update_loop() Naohiro Aota
                   ` (3 subsequent siblings)
  20 siblings, 2 replies; 57+ messages in thread
From: Naohiro Aota @ 2020-02-12  7:20 UTC (permalink / raw)
  To: linux-btrfs, David Sterba
  Cc: Chris Mason, Josef Bacik, Nikolay Borisov, Damien Le Moal,
	Johannes Thumshirn, Hannes Reinecke, Anand Jain, linux-fsdevel,
	Naohiro Aota

Factor out found_extent() from find_free_extent_update_loop(). This
function is called when a proper extent is found and before returning from
find_free_extent().  Hook functions like found_extent_clustered() should
save information for a next allocation.

Signed-off-by: Naohiro Aota <naohiro.aota@wdc.com>
---
 fs/btrfs/extent-tree.c | 30 +++++++++++++++++++++++++-----
 1 file changed, 25 insertions(+), 5 deletions(-)

diff --git a/fs/btrfs/extent-tree.c b/fs/btrfs/extent-tree.c
index 276c12392a85..f3fa7869389b 100644
--- a/fs/btrfs/extent-tree.c
+++ b/fs/btrfs/extent-tree.c
@@ -3724,6 +3724,30 @@ static void release_block_group(struct btrfs_block_group *block_group,
 	btrfs_release_block_group(block_group, delalloc);
 }
 
+static void found_extent_clustered(struct find_free_extent_ctl *ffe_ctl,
+				   struct btrfs_key *ins)
+{
+	struct btrfs_free_cluster *last_ptr = ffe_ctl->last_ptr;
+
+	if (!ffe_ctl->use_cluster && last_ptr) {
+		spin_lock(&last_ptr->lock);
+		last_ptr->window_start = ins->objectid;
+		spin_unlock(&last_ptr->lock);
+	}
+}
+
+static void found_extent(struct find_free_extent_ctl *ffe_ctl,
+			 struct btrfs_key *ins)
+{
+	switch (ffe_ctl->policy) {
+	case BTRFS_EXTENT_ALLOC_CLUSTERED:
+		found_extent_clustered(ffe_ctl, ins);
+		break;
+	default:
+		BUG();
+	}
+}
+
 /*
  * Return >0 means caller needs to re-search for free extent
  * Return 0 means we have the needed free extent.
@@ -3750,11 +3774,7 @@ static int find_free_extent_update_loop(struct btrfs_fs_info *fs_info,
 		return 1;
 
 	if (ins->objectid) {
-		if (!use_cluster && last_ptr) {
-			spin_lock(&last_ptr->lock);
-			last_ptr->window_start = ins->objectid;
-			spin_unlock(&last_ptr->lock);
-		}
+		found_extent(ffe_ctl, ins);
 		return 0;
 	}
 
-- 
2.25.0


^ permalink raw reply	[flat|nested] 57+ messages in thread

* [PATCH v2 18/21] btrfs: drop unnecessary arguments from find_free_extent_update_loop()
  2020-02-12  7:20 [PATCH v2 00/21] btrfs: refactor and generalize chunk/dev_extent/extent allocation Naohiro Aota
                   ` (16 preceding siblings ...)
  2020-02-12  7:20 ` [PATCH v2 17/21] btrfs: factor out found_extent() Naohiro Aota
@ 2020-02-12  7:20 ` Naohiro Aota
  2020-02-12 14:51   ` Johannes Thumshirn
  2020-02-13 19:58   ` Josef Bacik
  2020-02-12  7:20 ` [PATCH v2 19/21] btrfs: factor out chunk_allocation_failed() Naohiro Aota
                   ` (2 subsequent siblings)
  20 siblings, 2 replies; 57+ messages in thread
From: Naohiro Aota @ 2020-02-12  7:20 UTC (permalink / raw)
  To: linux-btrfs, David Sterba
  Cc: Chris Mason, Josef Bacik, Nikolay Borisov, Damien Le Moal,
	Johannes Thumshirn, Hannes Reinecke, Anand Jain, linux-fsdevel,
	Naohiro Aota

Now that, we don't use last_ptr and use_cluster in the function. Drop these
arguments from it.

Signed-off-by: Naohiro Aota <naohiro.aota@wdc.com>
---
 fs/btrfs/extent-tree.c | 7 ++-----
 1 file changed, 2 insertions(+), 5 deletions(-)

diff --git a/fs/btrfs/extent-tree.c b/fs/btrfs/extent-tree.c
index f3fa7869389b..efc653e6be29 100644
--- a/fs/btrfs/extent-tree.c
+++ b/fs/btrfs/extent-tree.c
@@ -3754,10 +3754,9 @@ static void found_extent(struct find_free_extent_ctl *ffe_ctl,
  * Return <0 means we failed to locate any free extent.
  */
 static int find_free_extent_update_loop(struct btrfs_fs_info *fs_info,
-					struct btrfs_free_cluster *last_ptr,
 					struct btrfs_key *ins,
 					struct find_free_extent_ctl *ffe_ctl,
-					bool full_search, bool use_cluster)
+					bool full_search)
 {
 	struct btrfs_root *root = fs_info->extent_root;
 	int ret;
@@ -4126,9 +4125,7 @@ static noinline int find_free_extent(struct btrfs_fs_info *fs_info,
 	}
 	up_read(&space_info->groups_sem);
 
-	ret = find_free_extent_update_loop(fs_info, ffe_ctl.last_ptr, ins,
-					   &ffe_ctl, full_search,
-					   ffe_ctl.use_cluster);
+	ret = find_free_extent_update_loop(fs_info, ins, &ffe_ctl, full_search);
 	if (ret > 0)
 		goto search;
 
-- 
2.25.0


^ permalink raw reply	[flat|nested] 57+ messages in thread

* [PATCH v2 19/21] btrfs: factor out chunk_allocation_failed()
  2020-02-12  7:20 [PATCH v2 00/21] btrfs: refactor and generalize chunk/dev_extent/extent allocation Naohiro Aota
                   ` (17 preceding siblings ...)
  2020-02-12  7:20 ` [PATCH v2 18/21] btrfs: drop unnecessary arguments from find_free_extent_update_loop() Naohiro Aota
@ 2020-02-12  7:20 ` Naohiro Aota
  2020-02-12 14:56   ` Johannes Thumshirn
  2020-02-13 19:52   ` Josef Bacik
  2020-02-12  7:20 ` [PATCH v2 20/21] btrfs: skip LOOP_NO_EMPTY_SIZE if not clustered allocation Naohiro Aota
  2020-02-12  7:20 ` [PATCH v2 21/21] btrfs: factor out prepare_allocation() Naohiro Aota
  20 siblings, 2 replies; 57+ messages in thread
From: Naohiro Aota @ 2020-02-12  7:20 UTC (permalink / raw)
  To: linux-btrfs, David Sterba
  Cc: Chris Mason, Josef Bacik, Nikolay Borisov, Damien Le Moal,
	Johannes Thumshirn, Hannes Reinecke, Anand Jain, linux-fsdevel,
	Naohiro Aota

Factor out chunk_allocation_failed() from find_free_extent_update_loop().
This function is called when it failed to allocate a chunk. The function
can modify "ffe_ctl->loop" and return 0 to continue with the next stage.
Or, it can return -ENOSPC to give up here.

Signed-off-by: Naohiro Aota <naohiro.aota@wdc.com>
---
 fs/btrfs/extent-tree.c | 24 ++++++++++++++++--------
 1 file changed, 16 insertions(+), 8 deletions(-)

diff --git a/fs/btrfs/extent-tree.c b/fs/btrfs/extent-tree.c
index efc653e6be29..8f0d489f76fa 100644
--- a/fs/btrfs/extent-tree.c
+++ b/fs/btrfs/extent-tree.c
@@ -3748,6 +3748,21 @@ static void found_extent(struct find_free_extent_ctl *ffe_ctl,
 	}
 }
 
+static int chunk_allocation_failed(struct find_free_extent_ctl *ffe_ctl)
+{
+	switch (ffe_ctl->policy) {
+	case BTRFS_EXTENT_ALLOC_CLUSTERED:
+		/*
+		 * If we can't allocate a new chunk we've already looped through
+		 * at least once, move on to the NO_EMPTY_SIZE case.
+		 */
+		ffe_ctl->loop = LOOP_NO_EMPTY_SIZE;
+		return 0;
+	default:
+		BUG();
+	}
+}
+
 /*
  * Return >0 means caller needs to re-search for free extent
  * Return 0 means we have the needed free extent.
@@ -3819,19 +3834,12 @@ static int find_free_extent_update_loop(struct btrfs_fs_info *fs_info,
 			ret = btrfs_chunk_alloc(trans, ffe_ctl->flags,
 						CHUNK_ALLOC_FORCE);
 
-			/*
-			 * If we can't allocate a new chunk we've already looped
-			 * through at least once, move on to the NO_EMPTY_SIZE
-			 * case.
-			 */
 			if (ret == -ENOSPC)
-				ffe_ctl->loop = LOOP_NO_EMPTY_SIZE;
+				ret = chunk_allocation_failed(ffe_ctl);
 
 			/* Do not bail out on ENOSPC since we can do more. */
 			if (ret < 0 && ret != -ENOSPC)
 				btrfs_abort_transaction(trans, ret);
-			else
-				ret = 0;
 			if (!exist)
 				btrfs_end_transaction(trans);
 			if (ret)
-- 
2.25.0


^ permalink raw reply	[flat|nested] 57+ messages in thread

* [PATCH v2 20/21] btrfs: skip LOOP_NO_EMPTY_SIZE if not clustered allocation
  2020-02-12  7:20 [PATCH v2 00/21] btrfs: refactor and generalize chunk/dev_extent/extent allocation Naohiro Aota
                   ` (18 preceding siblings ...)
  2020-02-12  7:20 ` [PATCH v2 19/21] btrfs: factor out chunk_allocation_failed() Naohiro Aota
@ 2020-02-12  7:20 ` Naohiro Aota
  2020-02-13 19:55   ` Josef Bacik
  2020-02-12  7:20 ` [PATCH v2 21/21] btrfs: factor out prepare_allocation() Naohiro Aota
  20 siblings, 1 reply; 57+ messages in thread
From: Naohiro Aota @ 2020-02-12  7:20 UTC (permalink / raw)
  To: linux-btrfs, David Sterba
  Cc: Chris Mason, Josef Bacik, Nikolay Borisov, Damien Le Moal,
	Johannes Thumshirn, Hannes Reinecke, Anand Jain, linux-fsdevel,
	Naohiro Aota

LOOP_NO_EMPTY_SIZE is solely dedicated for clustered allocation. So,
we can skip this stage and go to LOOP_GIVEUP stage to indicate we gave
up the allocation. This commit also moves the scope of the "clustered"
variable.

Signed-off-by: Naohiro Aota <naohiro.aota@wdc.com>
---
 fs/btrfs/extent-tree.c | 6 ++++++
 1 file changed, 6 insertions(+)

diff --git a/fs/btrfs/extent-tree.c b/fs/btrfs/extent-tree.c
index 8f0d489f76fa..3ab0d2f5d718 100644
--- a/fs/btrfs/extent-tree.c
+++ b/fs/btrfs/extent-tree.c
@@ -3373,6 +3373,7 @@ enum btrfs_loop_type {
 	LOOP_CACHING_WAIT,
 	LOOP_ALLOC_CHUNK,
 	LOOP_NO_EMPTY_SIZE,
+	LOOP_GIVEUP,
 };
 
 static inline void
@@ -3847,6 +3848,11 @@ static int find_free_extent_update_loop(struct btrfs_fs_info *fs_info,
 		}
 
 		if (ffe_ctl->loop == LOOP_NO_EMPTY_SIZE) {
+			if (ffe_ctl->policy != BTRFS_EXTENT_ALLOC_CLUSTERED) {
+				ffe_ctl->loop = LOOP_GIVEUP;
+				return -ENOSPC;
+			}
+
 			/*
 			 * Don't loop again if we already have no empty_size and
 			 * no empty_cluster.
-- 
2.25.0


^ permalink raw reply	[flat|nested] 57+ messages in thread

* [PATCH v2 21/21] btrfs: factor out prepare_allocation()
  2020-02-12  7:20 [PATCH v2 00/21] btrfs: refactor and generalize chunk/dev_extent/extent allocation Naohiro Aota
                   ` (19 preceding siblings ...)
  2020-02-12  7:20 ` [PATCH v2 20/21] btrfs: skip LOOP_NO_EMPTY_SIZE if not clustered allocation Naohiro Aota
@ 2020-02-12  7:20 ` Naohiro Aota
  2020-02-13 19:59   ` Josef Bacik
  20 siblings, 1 reply; 57+ messages in thread
From: Naohiro Aota @ 2020-02-12  7:20 UTC (permalink / raw)
  To: linux-btrfs, David Sterba
  Cc: Chris Mason, Josef Bacik, Nikolay Borisov, Damien Le Moal,
	Johannes Thumshirn, Hannes Reinecke, Anand Jain, linux-fsdevel,
	Naohiro Aota

This function finally factor out prepare_allocation() form
find_free_extent(). This function is called before the allocation loop and
a specific allocator function like prepare_allocation_clustered() should
initialize their private information and can set proper hint_byte to
indicate where to start the allocation with.

Signed-off-by: Naohiro Aota <naohiro.aota@wdc.com>
---
 fs/btrfs/extent-tree.c | 110 +++++++++++++++++++++++++----------------
 1 file changed, 68 insertions(+), 42 deletions(-)

diff --git a/fs/btrfs/extent-tree.c b/fs/btrfs/extent-tree.c
index 3ab0d2f5d718..6da7f6e03767 100644
--- a/fs/btrfs/extent-tree.c
+++ b/fs/btrfs/extent-tree.c
@@ -3868,6 +3868,71 @@ static int find_free_extent_update_loop(struct btrfs_fs_info *fs_info,
 	return -ENOSPC;
 }
 
+static int prepare_allocation_clustered(struct btrfs_fs_info *fs_info,
+					struct find_free_extent_ctl *ffe_ctl,
+					struct btrfs_space_info *space_info,
+					struct btrfs_key *ins)
+{
+	/*
+	 * If our free space is heavily fragmented we may not be able to make
+	 * big contiguous allocations, so instead of doing the expensive search
+	 * for free space, simply return ENOSPC with our max_extent_size so we
+	 * can go ahead and search for a more manageable chunk.
+	 *
+	 * If our max_extent_size is large enough for our allocation simply
+	 * disable clustering since we will likely not be able to find enough
+	 * space to create a cluster and induce latency trying.
+	 */
+	if (unlikely(space_info->max_extent_size)) {
+		spin_lock(&space_info->lock);
+		if (space_info->max_extent_size &&
+		    ffe_ctl->num_bytes > space_info->max_extent_size) {
+			ins->offset = space_info->max_extent_size;
+			spin_unlock(&space_info->lock);
+			return -ENOSPC;
+		} else if (space_info->max_extent_size) {
+			ffe_ctl->use_cluster = false;
+		}
+		spin_unlock(&space_info->lock);
+	}
+
+	ffe_ctl->last_ptr = fetch_cluster_info(fs_info, space_info,
+					       &ffe_ctl->empty_cluster);
+	if (ffe_ctl->last_ptr) {
+		struct btrfs_free_cluster *last_ptr = ffe_ctl->last_ptr;
+
+		spin_lock(&last_ptr->lock);
+		if (last_ptr->block_group)
+			ffe_ctl->hint_byte = last_ptr->window_start;
+		if (last_ptr->fragmented) {
+			/*
+			 * We still set window_start so we can keep track of the
+			 * last place we found an allocation to try and save
+			 * some time.
+			 */
+			ffe_ctl->hint_byte = last_ptr->window_start;
+			ffe_ctl->use_cluster = false;
+		}
+		spin_unlock(&last_ptr->lock);
+	}
+
+	return 0;
+}
+
+static int prepare_allocation(struct btrfs_fs_info *fs_info,
+			      struct find_free_extent_ctl *ffe_ctl,
+			      struct btrfs_space_info *space_info,
+			      struct btrfs_key *ins)
+{
+	switch (ffe_ctl->policy) {
+	case BTRFS_EXTENT_ALLOC_CLUSTERED:
+		return prepare_allocation_clustered(fs_info, ffe_ctl,
+						    space_info, ins);
+	default:
+		BUG();
+	}
+}
+
 /*
  * walks the btree of allocated extents and find a hole of a given size.
  * The key ins is changed to record the hole:
@@ -3937,48 +4002,9 @@ static noinline int find_free_extent(struct btrfs_fs_info *fs_info,
 		return -ENOSPC;
 	}
 
-	/*
-	 * If our free space is heavily fragmented we may not be able to make
-	 * big contiguous allocations, so instead of doing the expensive search
-	 * for free space, simply return ENOSPC with our max_extent_size so we
-	 * can go ahead and search for a more manageable chunk.
-	 *
-	 * If our max_extent_size is large enough for our allocation simply
-	 * disable clustering since we will likely not be able to find enough
-	 * space to create a cluster and induce latency trying.
-	 */
-	if (unlikely(space_info->max_extent_size)) {
-		spin_lock(&space_info->lock);
-		if (space_info->max_extent_size &&
-		    num_bytes > space_info->max_extent_size) {
-			ins->offset = space_info->max_extent_size;
-			spin_unlock(&space_info->lock);
-			return -ENOSPC;
-		} else if (space_info->max_extent_size) {
-			ffe_ctl.use_cluster = false;
-		}
-		spin_unlock(&space_info->lock);
-	}
-
-	ffe_ctl.last_ptr = fetch_cluster_info(fs_info, space_info,
-					      &ffe_ctl.empty_cluster);
-	if (ffe_ctl.last_ptr) {
-		struct btrfs_free_cluster *last_ptr = ffe_ctl.last_ptr;
-
-		spin_lock(&last_ptr->lock);
-		if (last_ptr->block_group)
-			ffe_ctl.hint_byte = last_ptr->window_start;
-		if (last_ptr->fragmented) {
-			/*
-			 * We still set window_start so we can keep track of the
-			 * last place we found an allocation to try and save
-			 * some time.
-			 */
-			ffe_ctl.hint_byte = last_ptr->window_start;
-			ffe_ctl.use_cluster = false;
-		}
-		spin_unlock(&last_ptr->lock);
-	}
+	ret = prepare_allocation(fs_info, &ffe_ctl, space_info, ins);
+	if (ret < 0)
+		return ret;
 
 	ffe_ctl.search_start = max(ffe_ctl.search_start,
 				   first_logical_byte(fs_info, 0));
-- 
2.25.0


^ permalink raw reply	[flat|nested] 57+ messages in thread

* Re: [PATCH v2 02/21] btrfs: do not BUG_ON with invalid profile
  2020-02-12  7:20 ` [PATCH v2 02/21] btrfs: do not BUG_ON with invalid profile Naohiro Aota
@ 2020-02-12  7:58   ` Johannes Thumshirn
  2020-02-12 14:14   ` Josef Bacik
  1 sibling, 0 replies; 57+ messages in thread
From: Johannes Thumshirn @ 2020-02-12  7:58 UTC (permalink / raw)
  To: Naohiro Aota, linux-btrfs, David Sterba
  Cc: Chris Mason, Josef Bacik, Nikolay Borisov, Damien Le Moal,
	Hannes Reinecke, Anand Jain, linux-fsdevel

Looks good,
Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>

^ permalink raw reply	[flat|nested] 57+ messages in thread

* Re: [PATCH v2 05/21] btrfs: introduce alloc_chunk_ctl
  2020-02-12  7:20 ` [PATCH v2 05/21] btrfs: introduce alloc_chunk_ctl Naohiro Aota
@ 2020-02-12  8:00   ` Johannes Thumshirn
  2020-02-13 16:17   ` Josef Bacik
  1 sibling, 0 replies; 57+ messages in thread
From: Johannes Thumshirn @ 2020-02-12  8:00 UTC (permalink / raw)
  To: Naohiro Aota, linux-btrfs, David Sterba
  Cc: Chris Mason, Josef Bacik, Nikolay Borisov, Damien Le Moal,
	Hannes Reinecke, Anand Jain, linux-fsdevel

Looks good,
Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>

^ permalink raw reply	[flat|nested] 57+ messages in thread

* Re: [PATCH v2 07/21] btrfs: factor out gather_device_info()
  2020-02-12  7:20 ` [PATCH v2 07/21] btrfs: factor out gather_device_info() Naohiro Aota
@ 2020-02-12  8:04   ` Johannes Thumshirn
  0 siblings, 0 replies; 57+ messages in thread
From: Johannes Thumshirn @ 2020-02-12  8:04 UTC (permalink / raw)
  To: Naohiro Aota, linux-btrfs, David Sterba
  Cc: Chris Mason, Josef Bacik, Nikolay Borisov, Damien Le Moal,
	Hannes Reinecke, Anand Jain, linux-fsdevel

On 12/02/2020 08:21, Naohiro Aota wrote:
> +	if (!(type & BTRFS_BLOCK_GROUP_TYPE_MASK)) {
> +		btrfs_err(info, "invalid chunk type 0x%llx requested", type);
> +		BUG();
> +	}

This could be

if (!(type & BTRFS_BLOCK_GROUP_TYPE_MASK)) {
	btrfs_err(info, "invalid chunk type 0x%llx requested", type);
	ASSERT(0);
	return -EINVAL;
}

as well.

^ permalink raw reply	[flat|nested] 57+ messages in thread

* Re: [PATCH v2 09/21] btrfs: factor out create_chunk()
  2020-02-12  7:20 ` [PATCH v2 09/21] btrfs: factor out create_chunk() Naohiro Aota
@ 2020-02-12  8:07   ` Johannes Thumshirn
  2020-02-13 16:24   ` Josef Bacik
  1 sibling, 0 replies; 57+ messages in thread
From: Johannes Thumshirn @ 2020-02-12  8:07 UTC (permalink / raw)
  To: Naohiro Aota, linux-btrfs, David Sterba
  Cc: Chris Mason, Josef Bacik, Nikolay Borisov, Damien Le Moal,
	Hannes Reinecke, Anand Jain, linux-fsdevel

Looks good,
Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>

^ permalink raw reply	[flat|nested] 57+ messages in thread

* Re: [PATCH v2 12/21] btrfs: move hint_byte into find_free_extent_ctl
  2020-02-12  7:20 ` [PATCH v2 12/21] btrfs: move hint_byte into find_free_extent_ctl Naohiro Aota
@ 2020-02-12 14:11   ` Johannes Thumshirn
  2020-02-13 19:56   ` Josef Bacik
  1 sibling, 0 replies; 57+ messages in thread
From: Johannes Thumshirn @ 2020-02-12 14:11 UTC (permalink / raw)
  To: Naohiro Aota, linux-btrfs, David Sterba
  Cc: Chris Mason, Josef Bacik, Nikolay Borisov, Damien Le Moal,
	Hannes Reinecke, Anand Jain, linux-fsdevel

Looks good,
Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>

^ permalink raw reply	[flat|nested] 57+ messages in thread

* Re: [PATCH v2 13/21] btrfs: move vairalbes for clustered allocation into find_free_extent_ctl
  2020-02-12  7:20 ` [PATCH v2 13/21] btrfs: move vairalbes for clustered allocation " Naohiro Aota
@ 2020-02-12 14:14   ` Johannes Thumshirn
  2020-02-13 19:57   ` Josef Bacik
  2020-02-20 14:27   ` Christoph Hellwig
  2 siblings, 0 replies; 57+ messages in thread
From: Johannes Thumshirn @ 2020-02-12 14:14 UTC (permalink / raw)
  To: Naohiro Aota, linux-btrfs, David Sterba
  Cc: Chris Mason, Josef Bacik, Nikolay Borisov, Damien Le Moal,
	Hannes Reinecke, Anand Jain, linux-fsdevel

Looks good,
Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>

^ permalink raw reply	[flat|nested] 57+ messages in thread

* Re: [PATCH v2 02/21] btrfs: do not BUG_ON with invalid profile
  2020-02-12  7:20 ` [PATCH v2 02/21] btrfs: do not BUG_ON with invalid profile Naohiro Aota
  2020-02-12  7:58   ` Johannes Thumshirn
@ 2020-02-12 14:14   ` Josef Bacik
  1 sibling, 0 replies; 57+ messages in thread
From: Josef Bacik @ 2020-02-12 14:14 UTC (permalink / raw)
  To: Naohiro Aota, linux-btrfs, David Sterba
  Cc: Chris Mason, Nikolay Borisov, Damien Le Moal, Johannes Thumshirn,
	Hannes Reinecke, Anand Jain, linux-fsdevel

On 2/12/20 2:20 AM, Naohiro Aota wrote:
> Do not BUG_ON() when an invalid profile is passed to __btrfs_alloc_chunk().
> Instead return -EINVAL with ASSERT() to catch a bug in the development
> stage.
> 
> Suggested-by: Johannes Thumshirn <Johannes.Thumshirn@wdc.com>
> Signed-off-by: Naohiro Aota <naohiro.aota@wdc.com>

Reviewed-by: Josef Bacik <josef@toxicpanda.com>

Thanks,

Josef

^ permalink raw reply	[flat|nested] 57+ messages in thread

* Re: [PATCH v2 04/21] btrfs: refactor find_free_dev_extent_start()
  2020-02-12  7:20 ` [PATCH v2 04/21] btrfs: refactor find_free_dev_extent_start() Naohiro Aota
@ 2020-02-12 14:16   ` Josef Bacik
  2020-02-13 11:32   ` Johannes Thumshirn
  1 sibling, 0 replies; 57+ messages in thread
From: Josef Bacik @ 2020-02-12 14:16 UTC (permalink / raw)
  To: Naohiro Aota, linux-btrfs, David Sterba
  Cc: Chris Mason, Nikolay Borisov, Damien Le Moal, Johannes Thumshirn,
	Hannes Reinecke, Anand Jain, linux-fsdevel

On 2/12/20 2:20 AM, Naohiro Aota wrote:
> Factor out two functions from find_free_dev_extent_start().
> dev_extent_search_start() decides the starting position of the search.
> dev_extent_hole_check() checks if a hole found is suitable for device
> extent allocation.
> 
> These functions also have the switch-cases to change the allocation
> behavior depending on the policy.
> 
> Signed-off-by: Naohiro Aota <naohiro.aota@wdc.com>

Reviewed-by: Josef Bacik <josef@toxicpanda.com>

Thanks,

Josef

^ permalink raw reply	[flat|nested] 57+ messages in thread

* Re: [PATCH v2 14/21] btrfs: factor out do_allocation()
  2020-02-12  7:20 ` [PATCH v2 14/21] btrfs: factor out do_allocation() Naohiro Aota
@ 2020-02-12 14:45   ` Johannes Thumshirn
  2020-02-13 19:57   ` Josef Bacik
  1 sibling, 0 replies; 57+ messages in thread
From: Johannes Thumshirn @ 2020-02-12 14:45 UTC (permalink / raw)
  To: Naohiro Aota, linux-btrfs, David Sterba
  Cc: Chris Mason, Josef Bacik, Nikolay Borisov, Damien Le Moal,
	Hannes Reinecke, Anand Jain, linux-fsdevel

Looks good,
Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>

^ permalink raw reply	[flat|nested] 57+ messages in thread

* Re: [PATCH v2 15/21] btrfs: drop unnecessary arguments from clustered allocation functions
  2020-02-12  7:20 ` [PATCH v2 15/21] btrfs: drop unnecessary arguments from clustered allocation functions Naohiro Aota
@ 2020-02-12 14:47   ` Johannes Thumshirn
  2020-02-13 19:58   ` Josef Bacik
  1 sibling, 0 replies; 57+ messages in thread
From: Johannes Thumshirn @ 2020-02-12 14:47 UTC (permalink / raw)
  To: Naohiro Aota, linux-btrfs, David Sterba
  Cc: Chris Mason, Josef Bacik, Nikolay Borisov, Damien Le Moal,
	Hannes Reinecke, Anand Jain, linux-fsdevel

On 12/02/2020 08:21, Naohiro Aota wrote:
> Now that, find_free_extent_clustered() and find_free_extent_unclustered()
> can access "last_ptr" from the "clustered" variable. So, we can drop it
> from the arguments.

I'd drop the ". So", like:

[...] from the "clustered" variable, we can [...]

^ permalink raw reply	[flat|nested] 57+ messages in thread

* Re: [PATCH v2 16/21] btrfs: factor out release_block_group()
  2020-02-12  7:20 ` [PATCH v2 16/21] btrfs: factor out release_block_group() Naohiro Aota
@ 2020-02-12 14:48   ` Johannes Thumshirn
  2020-02-13 19:58   ` Josef Bacik
  1 sibling, 0 replies; 57+ messages in thread
From: Johannes Thumshirn @ 2020-02-12 14:48 UTC (permalink / raw)
  To: Naohiro Aota, linux-btrfs, David Sterba
  Cc: Chris Mason, Josef Bacik, Nikolay Borisov, Damien Le Moal,
	Hannes Reinecke, Anand Jain, linux-fsdevel

Looks good,
Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>

^ permalink raw reply	[flat|nested] 57+ messages in thread

* Re: [PATCH v2 17/21] btrfs: factor out found_extent()
  2020-02-12  7:20 ` [PATCH v2 17/21] btrfs: factor out found_extent() Naohiro Aota
@ 2020-02-12 14:50   ` Johannes Thumshirn
  2020-02-13 19:58   ` Josef Bacik
  1 sibling, 0 replies; 57+ messages in thread
From: Johannes Thumshirn @ 2020-02-12 14:50 UTC (permalink / raw)
  To: Naohiro Aota, linux-btrfs, David Sterba
  Cc: Chris Mason, Josef Bacik, Nikolay Borisov, Damien Le Moal,
	Hannes Reinecke, Anand Jain, linux-fsdevel

Looks good,
Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>

^ permalink raw reply	[flat|nested] 57+ messages in thread

* Re: [PATCH v2 18/21] btrfs: drop unnecessary arguments from find_free_extent_update_loop()
  2020-02-12  7:20 ` [PATCH v2 18/21] btrfs: drop unnecessary arguments from find_free_extent_update_loop() Naohiro Aota
@ 2020-02-12 14:51   ` Johannes Thumshirn
  2020-02-13 19:58   ` Josef Bacik
  1 sibling, 0 replies; 57+ messages in thread
From: Johannes Thumshirn @ 2020-02-12 14:51 UTC (permalink / raw)
  To: Naohiro Aota, linux-btrfs, David Sterba
  Cc: Chris Mason, Josef Bacik, Nikolay Borisov, Damien Le Moal,
	Hannes Reinecke, Anand Jain, linux-fsdevel

Looks good,
Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>

^ permalink raw reply	[flat|nested] 57+ messages in thread

* Re: [PATCH v2 19/21] btrfs: factor out chunk_allocation_failed()
  2020-02-12  7:20 ` [PATCH v2 19/21] btrfs: factor out chunk_allocation_failed() Naohiro Aota
@ 2020-02-12 14:56   ` Johannes Thumshirn
  2020-02-13 19:52   ` Josef Bacik
  1 sibling, 0 replies; 57+ messages in thread
From: Johannes Thumshirn @ 2020-02-12 14:56 UTC (permalink / raw)
  To: Naohiro Aota, linux-btrfs, David Sterba
  Cc: Chris Mason, Josef Bacik, Nikolay Borisov, Damien Le Moal,
	Hannes Reinecke, Anand Jain, linux-fsdevel

Looks good,
Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>

^ permalink raw reply	[flat|nested] 57+ messages in thread

* Re: [PATCH v2 04/21] btrfs: refactor find_free_dev_extent_start()
  2020-02-12  7:20 ` [PATCH v2 04/21] btrfs: refactor find_free_dev_extent_start() Naohiro Aota
  2020-02-12 14:16   ` Josef Bacik
@ 2020-02-13 11:32   ` Johannes Thumshirn
  1 sibling, 0 replies; 57+ messages in thread
From: Johannes Thumshirn @ 2020-02-13 11:32 UTC (permalink / raw)
  To: Naohiro Aota, linux-btrfs, David Sterba
  Cc: Chris Mason, Josef Bacik, Nikolay Borisov, Damien Le Moal,
	Hannes Reinecke, Anand Jain, linux-fsdevel

Looks good,
Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>

^ permalink raw reply	[flat|nested] 57+ messages in thread

* Re: [PATCH v2 05/21] btrfs: introduce alloc_chunk_ctl
  2020-02-12  7:20 ` [PATCH v2 05/21] btrfs: introduce alloc_chunk_ctl Naohiro Aota
  2020-02-12  8:00   ` Johannes Thumshirn
@ 2020-02-13 16:17   ` Josef Bacik
  1 sibling, 0 replies; 57+ messages in thread
From: Josef Bacik @ 2020-02-13 16:17 UTC (permalink / raw)
  To: Naohiro Aota, linux-btrfs, David Sterba
  Cc: Chris Mason, Nikolay Borisov, Damien Le Moal, Johannes Thumshirn,
	Hannes Reinecke, Anand Jain, linux-fsdevel

On 2/12/20 2:20 AM, Naohiro Aota wrote:
> Introduce "struct alloc_chunk_ctl" to wrap needed parameters for the chunk
> allocation.  This will be used to split __btrfs_alloc_chunk() into smaller
> functions.
> 
> This commit folds a number of local variables in __btrfs_alloc_chunk() into
> one "struct alloc_chunk_ctl ctl". There is no functional change.
> 
> Signed-off-by: Naohiro Aota <naohiro.aota@wdc.com>

Reviewed-by: Josef Bacik <josef@toxicpanda.com>

Thanks,

Josef

^ permalink raw reply	[flat|nested] 57+ messages in thread

* Re: [PATCH v2 06/21] btrfs: factor out init_alloc_chunk_ctl
  2020-02-12  7:20 ` [PATCH v2 06/21] btrfs: factor out init_alloc_chunk_ctl Naohiro Aota
@ 2020-02-13 16:20   ` Josef Bacik
  0 siblings, 0 replies; 57+ messages in thread
From: Josef Bacik @ 2020-02-13 16:20 UTC (permalink / raw)
  To: Naohiro Aota, linux-btrfs, David Sterba
  Cc: Chris Mason, Nikolay Borisov, Damien Le Moal, Johannes Thumshirn,
	Hannes Reinecke, Anand Jain, linux-fsdevel

On 2/12/20 2:20 AM, Naohiro Aota wrote:
> Factor out init_alloc_chunk_ctl() from __btrfs_alloc_chunk(). This function
> initialises parameters of "struct alloc_chunk_ctl" for allocation.
> init_alloc_chunk_ctl() handles a common part of the initialisation to load
> the RAID parameters from btrfs_raid_array.
> init_alloc_chunk_ctl_policy_regular() decides some parameters for its
> allocation.
> 
> Signed-off-by: Naohiro Aota <naohiro.aota@wdc.com>

Reviewed-by: Josef Bacik <josef@toxicpanda.com>

Thanks,

Josef

^ permalink raw reply	[flat|nested] 57+ messages in thread

* Re: [PATCH v2 09/21] btrfs: factor out create_chunk()
  2020-02-12  7:20 ` [PATCH v2 09/21] btrfs: factor out create_chunk() Naohiro Aota
  2020-02-12  8:07   ` Johannes Thumshirn
@ 2020-02-13 16:24   ` Josef Bacik
  2020-02-20 10:17     ` Naohiro Aota
  1 sibling, 1 reply; 57+ messages in thread
From: Josef Bacik @ 2020-02-13 16:24 UTC (permalink / raw)
  To: Naohiro Aota, linux-btrfs, David Sterba
  Cc: Chris Mason, Nikolay Borisov, Damien Le Moal, Johannes Thumshirn,
	Hannes Reinecke, Anand Jain, linux-fsdevel

On 2/12/20 2:20 AM, Naohiro Aota wrote:
> Factor out create_chunk() from __btrfs_alloc_chunk(). This function finally
> creates a chunk. There is no functional changes.
> 
> Signed-off-by: Naohiro Aota <naohiro.aota@wdc.com>
> ---
>   fs/btrfs/volumes.c | 130 ++++++++++++++++++++++++---------------------
>   1 file changed, 70 insertions(+), 60 deletions(-)
> 
> diff --git a/fs/btrfs/volumes.c b/fs/btrfs/volumes.c
> index 00085943e4dd..3e2e3896d72a 100644
> --- a/fs/btrfs/volumes.c
> +++ b/fs/btrfs/volumes.c
> @@ -5052,90 +5052,53 @@ static int decide_stripe_size(struct btrfs_fs_devices *fs_devices,
>   	}
>   }
>   
> -static int __btrfs_alloc_chunk(struct btrfs_trans_handle *trans,
> -			       u64 start, u64 type)
> +static int create_chunk(struct btrfs_trans_handle *trans,
> +			struct alloc_chunk_ctl *ctl,
> +			struct btrfs_device_info *devices_info)
>   {
>   	struct btrfs_fs_info *info = trans->fs_info;
> -	struct btrfs_fs_devices *fs_devices = info->fs_devices;
>   	struct map_lookup *map = NULL;
>   	struct extent_map_tree *em_tree;
>   	struct extent_map *em;
> -	struct btrfs_device_info *devices_info = NULL;
> -	struct alloc_chunk_ctl ctl;
> +	u64 start = ctl->start;
> +	u64 type = ctl->type;
>   	int ret;
>   	int i;
>   	int j;
>   
> -	if (!alloc_profile_is_valid(type, 0)) {
> -		ASSERT(0);
> -		return -EINVAL;
> -	}
> -
> -	if (list_empty(&fs_devices->alloc_list)) {
> -		if (btrfs_test_opt(info, ENOSPC_DEBUG))
> -			btrfs_debug(info, "%s: no writable device", __func__);
> -		return -ENOSPC;
> -	}
> -
> -	if (!(type & BTRFS_BLOCK_GROUP_TYPE_MASK)) {
> -		btrfs_err(info, "invalid chunk type 0x%llx requested", type);
> -		BUG();
> -	}
> -
> -	ctl.start = start;
> -	ctl.type = type;
> -	init_alloc_chunk_ctl(fs_devices, &ctl);
> -
> -	devices_info = kcalloc(fs_devices->rw_devices, sizeof(*devices_info),
> -			       GFP_NOFS);
> -	if (!devices_info)
> +	map = kmalloc(map_lookup_size(ctl->num_stripes), GFP_NOFS);
> +	if (!map)
>   		return -ENOMEM;
> +	map->num_stripes = ctl->num_stripes;
>   
> -	ret = gather_device_info(fs_devices, &ctl, devices_info);
> -	if (ret < 0)
> -		goto error;
> -
> -	ret = decide_stripe_size(fs_devices, &ctl, devices_info);
> -	if (ret < 0)
> -		goto error;
> -
> -	map = kmalloc(map_lookup_size(ctl.num_stripes), GFP_NOFS);
> -	if (!map) {
> -		ret = -ENOMEM;
> -		goto error;
> -	}
> -
> -	map->num_stripes = ctl.num_stripes;
> -
> -	for (i = 0; i < ctl.ndevs; ++i) {
> -		for (j = 0; j < ctl.dev_stripes; ++j) {
> -			int s = i * ctl.dev_stripes + j;
> +	for (i = 0; i < ctl->ndevs; ++i) {
> +		for (j = 0; j < ctl->dev_stripes; ++j) {
> +			int s = i * ctl->dev_stripes + j;
>   			map->stripes[s].dev = devices_info[i].dev;
>   			map->stripes[s].physical = devices_info[i].dev_offset +
> -						   j * ctl.stripe_size;
> +						   j * ctl->stripe_size;
>   		}
>   	}
>   	map->stripe_len = BTRFS_STRIPE_LEN;
>   	map->io_align = BTRFS_STRIPE_LEN;
>   	map->io_width = BTRFS_STRIPE_LEN;
>   	map->type = type;
> -	map->sub_stripes = ctl.sub_stripes;
> +	map->sub_stripes = ctl->sub_stripes;
>   
> -	trace_btrfs_chunk_alloc(info, map, start, ctl.chunk_size);
> +	trace_btrfs_chunk_alloc(info, map, start, ctl->chunk_size);
>   
>   	em = alloc_extent_map();
>   	if (!em) {
>   		kfree(map);
> -		ret = -ENOMEM;
> -		goto error;
> +		return -ENOMEM;
>   	}
>   	set_bit(EXTENT_FLAG_FS_MAPPING, &em->flags);
>   	em->map_lookup = map;
>   	em->start = start;
> -	em->len = ctl.chunk_size;
> +	em->len = ctl->chunk_size;
>   	em->block_start = 0;
>   	em->block_len = em->len;
> -	em->orig_block_len = ctl.stripe_size;
> +	em->orig_block_len = ctl->stripe_size;
>   
>   	em_tree = &info->mapping_tree;
>   	write_lock(&em_tree->lock);
> @@ -5143,11 +5106,11 @@ static int __btrfs_alloc_chunk(struct btrfs_trans_handle *trans,
>   	if (ret) {
>   		write_unlock(&em_tree->lock);
>   		free_extent_map(em);
> -		goto error;
> +		return ret;
>   	}
>   	write_unlock(&em_tree->lock);
>   
> -	ret = btrfs_make_block_group(trans, 0, type, start, ctl.chunk_size);
> +	ret = btrfs_make_block_group(trans, 0, type, start, ctl->chunk_size);
>   	if (ret)
>   		goto error_del_extent;
>   
> @@ -5155,20 +5118,19 @@ static int __btrfs_alloc_chunk(struct btrfs_trans_handle *trans,
>   		struct btrfs_device *dev = map->stripes[i].dev;
>   
>   		btrfs_device_set_bytes_used(dev,
> -					    dev->bytes_used + ctl.stripe_size);
> +					    dev->bytes_used + ctl->stripe_size);
>   		if (list_empty(&dev->post_commit_list))
>   			list_add_tail(&dev->post_commit_list,
>   				      &trans->transaction->dev_update_list);
>   	}
>   
> -	atomic64_sub(ctl.stripe_size * map->num_stripes,
> +	atomic64_sub(ctl->stripe_size * map->num_stripes,
>   		     &info->free_chunk_space);
>   
>   	free_extent_map(em);
>   	check_raid56_incompat_flag(info, type);
>   	check_raid1c34_incompat_flag(info, type);
>   
> -	kfree(devices_info);
>   	return 0;
>   
>   error_del_extent:
> @@ -5180,7 +5142,55 @@ static int __btrfs_alloc_chunk(struct btrfs_trans_handle *trans,
>   	free_extent_map(em);
>   	/* One for the tree reference */
>   	free_extent_map(em);
> -error:
> +
> +	return ret;
> +}
> +
> +static int __btrfs_alloc_chunk(struct btrfs_trans_handle *trans,
> +			       u64 start, u64 type)
> +{
> +	struct btrfs_fs_info *info = trans->fs_info;
> +	struct btrfs_fs_devices *fs_devices = info->fs_devices;
> +	struct btrfs_device_info *devices_info = NULL;
> +	struct alloc_chunk_ctl ctl;
> +	int ret;
> +
> +	if (!alloc_profile_is_valid(type, 0)) {
> +		ASSERT(0);
> +		return -EINVAL;
> +	}
> +
> +	if (list_empty(&fs_devices->alloc_list)) {
> +		if (btrfs_test_opt(info, ENOSPC_DEBUG))
> +			btrfs_debug(info, "%s: no writable device", __func__);
> +		return -ENOSPC;
> +	}
> +
> +	if (!(type & BTRFS_BLOCK_GROUP_TYPE_MASK)) {
> +		btrfs_err(info, "invalid chunk type 0x%llx requested", type);
> +		BUG();
> +	}

This is superfluous, alloc_profile_is_valid() handles this check.  Thanks,

Josef

^ permalink raw reply	[flat|nested] 57+ messages in thread

* Re: [PATCH v2 10/21] btrfs: parameterize dev_extent_min
  2020-02-12  7:20 ` [PATCH v2 10/21] btrfs: parameterize dev_extent_min Naohiro Aota
@ 2020-02-13 16:27   ` Josef Bacik
  0 siblings, 0 replies; 57+ messages in thread
From: Josef Bacik @ 2020-02-13 16:27 UTC (permalink / raw)
  To: Naohiro Aota, linux-btrfs, David Sterba
  Cc: Chris Mason, Nikolay Borisov, Damien Le Moal, Johannes Thumshirn,
	Hannes Reinecke, Anand Jain, linux-fsdevel

On 2/12/20 2:20 AM, Naohiro Aota wrote:
> Currently, we ignore a device whose available space is less than
> "BTRFS_STRIPE_LEN * dev_stripes". This is a lower limit for current
> allocation policy (to maximize the number of stripes). This commit
> parameterizes dev_extent_min, so that other policies can set their own
> lower limitation to ignore a device with an insufficient space.
> 
> Signed-off-by: Naohiro Aota <naohiro.aota@wdc.com>

Reviewed-by: Josef Bacik <josef@toxicpanda.com>

Thanks,

Josef

^ permalink raw reply	[flat|nested] 57+ messages in thread

* Re: [PATCH v2 19/21] btrfs: factor out chunk_allocation_failed()
  2020-02-12  7:20 ` [PATCH v2 19/21] btrfs: factor out chunk_allocation_failed() Naohiro Aota
  2020-02-12 14:56   ` Johannes Thumshirn
@ 2020-02-13 19:52   ` Josef Bacik
  2020-02-20  9:57     ` Naohiro Aota
  1 sibling, 1 reply; 57+ messages in thread
From: Josef Bacik @ 2020-02-13 19:52 UTC (permalink / raw)
  To: Naohiro Aota, linux-btrfs, David Sterba
  Cc: Chris Mason, Nikolay Borisov, Damien Le Moal, Johannes Thumshirn,
	Hannes Reinecke, Anand Jain, linux-fsdevel

On 2/12/20 2:20 AM, Naohiro Aota wrote:
> Factor out chunk_allocation_failed() from find_free_extent_update_loop().
> This function is called when it failed to allocate a chunk. The function
> can modify "ffe_ctl->loop" and return 0 to continue with the next stage.
> Or, it can return -ENOSPC to give up here.
> 
> Signed-off-by: Naohiro Aota <naohiro.aota@wdc.com>
> ---
>   fs/btrfs/extent-tree.c | 24 ++++++++++++++++--------
>   1 file changed, 16 insertions(+), 8 deletions(-)
> 
> diff --git a/fs/btrfs/extent-tree.c b/fs/btrfs/extent-tree.c
> index efc653e6be29..8f0d489f76fa 100644
> --- a/fs/btrfs/extent-tree.c
> +++ b/fs/btrfs/extent-tree.c
> @@ -3748,6 +3748,21 @@ static void found_extent(struct find_free_extent_ctl *ffe_ctl,
>   	}
>   }
>   
> +static int chunk_allocation_failed(struct find_free_extent_ctl *ffe_ctl)
> +{
> +	switch (ffe_ctl->policy) {
> +	case BTRFS_EXTENT_ALLOC_CLUSTERED:
> +		/*
> +		 * If we can't allocate a new chunk we've already looped through
> +		 * at least once, move on to the NO_EMPTY_SIZE case.
> +		 */
> +		ffe_ctl->loop = LOOP_NO_EMPTY_SIZE;
> +		return 0;
> +	default:
> +		BUG();
> +	}
> +}
> +
>   /*
>    * Return >0 means caller needs to re-search for free extent
>    * Return 0 means we have the needed free extent.
> @@ -3819,19 +3834,12 @@ static int find_free_extent_update_loop(struct btrfs_fs_info *fs_info,
>   			ret = btrfs_chunk_alloc(trans, ffe_ctl->flags,
>   						CHUNK_ALLOC_FORCE);
>   
> -			/*
> -			 * If we can't allocate a new chunk we've already looped
> -			 * through at least once, move on to the NO_EMPTY_SIZE
> -			 * case.
> -			 */
>   			if (ret == -ENOSPC)
> -				ffe_ctl->loop = LOOP_NO_EMPTY_SIZE;
> +				ret = chunk_allocation_failed(ffe_ctl);
>   
>   			/* Do not bail out on ENOSPC since we can do more. */
>   			if (ret < 0 && ret != -ENOSPC)
>   				btrfs_abort_transaction(trans, ret);
> -			else
> -				ret = 0;

You can't delete this, btrfs_chunk_alloc() will return 1 if it succeeded, and 
we'll leak that out somewhere and bad things will happen.  Thanks,

Josef

^ permalink raw reply	[flat|nested] 57+ messages in thread

* Re: [PATCH v2 20/21] btrfs: skip LOOP_NO_EMPTY_SIZE if not clustered allocation
  2020-02-12  7:20 ` [PATCH v2 20/21] btrfs: skip LOOP_NO_EMPTY_SIZE if not clustered allocation Naohiro Aota
@ 2020-02-13 19:55   ` Josef Bacik
  2020-02-20  9:56     ` Naohiro Aota
  0 siblings, 1 reply; 57+ messages in thread
From: Josef Bacik @ 2020-02-13 19:55 UTC (permalink / raw)
  To: Naohiro Aota, linux-btrfs, David Sterba
  Cc: Chris Mason, Nikolay Borisov, Damien Le Moal, Johannes Thumshirn,
	Hannes Reinecke, Anand Jain, linux-fsdevel

On 2/12/20 2:20 AM, Naohiro Aota wrote:
> LOOP_NO_EMPTY_SIZE is solely dedicated for clustered allocation. So,
> we can skip this stage and go to LOOP_GIVEUP stage to indicate we gave
> up the allocation. This commit also moves the scope of the "clustered"
> variable.
> 
> Signed-off-by: Naohiro Aota <naohiro.aota@wdc.com>
> ---
>   fs/btrfs/extent-tree.c | 6 ++++++
>   1 file changed, 6 insertions(+)
> 
> diff --git a/fs/btrfs/extent-tree.c b/fs/btrfs/extent-tree.c
> index 8f0d489f76fa..3ab0d2f5d718 100644
> --- a/fs/btrfs/extent-tree.c
> +++ b/fs/btrfs/extent-tree.c
> @@ -3373,6 +3373,7 @@ enum btrfs_loop_type {
>   	LOOP_CACHING_WAIT,
>   	LOOP_ALLOC_CHUNK,
>   	LOOP_NO_EMPTY_SIZE,
> +	LOOP_GIVEUP,

Why do we need a new loop definition here?  Can we just return ENOSPC and be 
done?  You don't appear to use it anywhere, so it doesn't seem like it's needed. 
  Thanks,

Josef

^ permalink raw reply	[flat|nested] 57+ messages in thread

* Re: [PATCH v2 11/21] btrfs: introduce extent allocation policy
  2020-02-12  7:20 ` [PATCH v2 11/21] btrfs: introduce extent allocation policy Naohiro Aota
@ 2020-02-13 19:55   ` Josef Bacik
  0 siblings, 0 replies; 57+ messages in thread
From: Josef Bacik @ 2020-02-13 19:55 UTC (permalink / raw)
  To: Naohiro Aota, linux-btrfs, David Sterba
  Cc: Chris Mason, Nikolay Borisov, Damien Le Moal, Johannes Thumshirn,
	Hannes Reinecke, Anand Jain, linux-fsdevel

On 2/12/20 2:20 AM, Naohiro Aota wrote:
> This commit introduces extent allocation policy for btrfs. This policy
> controls how btrfs allocate an extents from block groups.
> 
> There is no functional change introduced with this commit.
> 
> Signed-off-by: Naohiro Aota <naohiro.aota@wdc.com>

Reviewed-by: Josef Bacik <josef@toxicpanda.com>

Thanks,

Josef

^ permalink raw reply	[flat|nested] 57+ messages in thread

* Re: [PATCH v2 12/21] btrfs: move hint_byte into find_free_extent_ctl
  2020-02-12  7:20 ` [PATCH v2 12/21] btrfs: move hint_byte into find_free_extent_ctl Naohiro Aota
  2020-02-12 14:11   ` Johannes Thumshirn
@ 2020-02-13 19:56   ` Josef Bacik
  1 sibling, 0 replies; 57+ messages in thread
From: Josef Bacik @ 2020-02-13 19:56 UTC (permalink / raw)
  To: Naohiro Aota, linux-btrfs, David Sterba
  Cc: Chris Mason, Nikolay Borisov, Damien Le Moal, Johannes Thumshirn,
	Hannes Reinecke, Anand Jain, linux-fsdevel

On 2/12/20 2:20 AM, Naohiro Aota wrote:
> This commit moves hint_byte into find_free_extent_ctl, so that we can
> modify the hint_byte in the other functions. This will help us split
> find_free_extent further. This commit also renames the function argument
> "hint_byte" to "hint_byte_orig" to avoid misuse.
> 
> Signed-off-by: Naohiro Aota <naohiro.aota@wdc.com>

Reviewed-by: Josef Bacik <josef@toxicpanda.com>

Thanks,

Josef

^ permalink raw reply	[flat|nested] 57+ messages in thread

* Re: [PATCH v2 13/21] btrfs: move vairalbes for clustered allocation into find_free_extent_ctl
  2020-02-12  7:20 ` [PATCH v2 13/21] btrfs: move vairalbes for clustered allocation " Naohiro Aota
  2020-02-12 14:14   ` Johannes Thumshirn
@ 2020-02-13 19:57   ` Josef Bacik
  2020-02-20 14:27   ` Christoph Hellwig
  2 siblings, 0 replies; 57+ messages in thread
From: Josef Bacik @ 2020-02-13 19:57 UTC (permalink / raw)
  To: Naohiro Aota, linux-btrfs, David Sterba
  Cc: Chris Mason, Nikolay Borisov, Damien Le Moal, Johannes Thumshirn,
	Hannes Reinecke, Anand Jain, linux-fsdevel

On 2/12/20 2:20 AM, Naohiro Aota wrote:
> Move "last_ptr" and "use_cluster" into struct find_free_extent_ctl, so that
> hook functions for clustered allocator can use these variables.
> 
> Signed-off-by: Naohiro Aota <naohiro.aota@wdc.com>

Reviewed-by: Josef Bacik <josef@toxicpanda.com>

Thanks,

Josef

^ permalink raw reply	[flat|nested] 57+ messages in thread

* Re: [PATCH v2 14/21] btrfs: factor out do_allocation()
  2020-02-12  7:20 ` [PATCH v2 14/21] btrfs: factor out do_allocation() Naohiro Aota
  2020-02-12 14:45   ` Johannes Thumshirn
@ 2020-02-13 19:57   ` Josef Bacik
  1 sibling, 0 replies; 57+ messages in thread
From: Josef Bacik @ 2020-02-13 19:57 UTC (permalink / raw)
  To: Naohiro Aota, linux-btrfs, David Sterba
  Cc: Chris Mason, Nikolay Borisov, Damien Le Moal, Johannes Thumshirn,
	Hannes Reinecke, Anand Jain, linux-fsdevel

On 2/12/20 2:20 AM, Naohiro Aota wrote:
> Factor out do_allocation() from find_free_extent(). This function do an
> actual allocation in a given block group. The ffe_ctl->policy is used to
> determine the actual allocator function to use.
> 
> Signed-off-by: Naohiro Aota <naohiro.aota@wdc.com>

Reviewed-by: Josef Bacik <josef@toxicpanda.com>

Thanks,

Josef

^ permalink raw reply	[flat|nested] 57+ messages in thread

* Re: [PATCH v2 15/21] btrfs: drop unnecessary arguments from clustered allocation functions
  2020-02-12  7:20 ` [PATCH v2 15/21] btrfs: drop unnecessary arguments from clustered allocation functions Naohiro Aota
  2020-02-12 14:47   ` Johannes Thumshirn
@ 2020-02-13 19:58   ` Josef Bacik
  1 sibling, 0 replies; 57+ messages in thread
From: Josef Bacik @ 2020-02-13 19:58 UTC (permalink / raw)
  To: Naohiro Aota, linux-btrfs, David Sterba
  Cc: Chris Mason, Nikolay Borisov, Damien Le Moal, Johannes Thumshirn,
	Hannes Reinecke, Anand Jain, linux-fsdevel

On 2/12/20 2:20 AM, Naohiro Aota wrote:
> Now that, find_free_extent_clustered() and find_free_extent_unclustered()
> can access "last_ptr" from the "clustered" variable. So, we can drop it
> from the arguments.
> 
> Signed-off-by: Naohiro Aota <naohiro.aota@wdc.com>

Reviewed-by: Josef Bacik <josef@toxicpanda.com>

Thanks,

Josef

^ permalink raw reply	[flat|nested] 57+ messages in thread

* Re: [PATCH v2 16/21] btrfs: factor out release_block_group()
  2020-02-12  7:20 ` [PATCH v2 16/21] btrfs: factor out release_block_group() Naohiro Aota
  2020-02-12 14:48   ` Johannes Thumshirn
@ 2020-02-13 19:58   ` Josef Bacik
  1 sibling, 0 replies; 57+ messages in thread
From: Josef Bacik @ 2020-02-13 19:58 UTC (permalink / raw)
  To: Naohiro Aota, linux-btrfs, David Sterba
  Cc: Chris Mason, Nikolay Borisov, Damien Le Moal, Johannes Thumshirn,
	Hannes Reinecke, Anand Jain, linux-fsdevel

On 2/12/20 2:20 AM, Naohiro Aota wrote:
> Factor out release_block_group() from find_free_extent(). This function is
> called when it gives up an allocation from a block group. Each allocation
> policy should reset their information for an allocation in the next block
> group.
> 
> Signed-off-by: Naohiro Aota <naohiro.aota@wdc.com>

Reviewed-by: Josef Bacik <josef@toxicpanda.com>

Thanks,

Josef

^ permalink raw reply	[flat|nested] 57+ messages in thread

* Re: [PATCH v2 17/21] btrfs: factor out found_extent()
  2020-02-12  7:20 ` [PATCH v2 17/21] btrfs: factor out found_extent() Naohiro Aota
  2020-02-12 14:50   ` Johannes Thumshirn
@ 2020-02-13 19:58   ` Josef Bacik
  1 sibling, 0 replies; 57+ messages in thread
From: Josef Bacik @ 2020-02-13 19:58 UTC (permalink / raw)
  To: Naohiro Aota, linux-btrfs, David Sterba
  Cc: Chris Mason, Nikolay Borisov, Damien Le Moal, Johannes Thumshirn,
	Hannes Reinecke, Anand Jain, linux-fsdevel

On 2/12/20 2:20 AM, Naohiro Aota wrote:
> Factor out found_extent() from find_free_extent_update_loop(). This
> function is called when a proper extent is found and before returning from
> find_free_extent().  Hook functions like found_extent_clustered() should
> save information for a next allocation.
> 
> Signed-off-by: Naohiro Aota <naohiro.aota@wdc.com>

Reviewed-by: Josef Bacik <josef@toxicpanda.com>

Thanks,

Josef

^ permalink raw reply	[flat|nested] 57+ messages in thread

* Re: [PATCH v2 18/21] btrfs: drop unnecessary arguments from find_free_extent_update_loop()
  2020-02-12  7:20 ` [PATCH v2 18/21] btrfs: drop unnecessary arguments from find_free_extent_update_loop() Naohiro Aota
  2020-02-12 14:51   ` Johannes Thumshirn
@ 2020-02-13 19:58   ` Josef Bacik
  1 sibling, 0 replies; 57+ messages in thread
From: Josef Bacik @ 2020-02-13 19:58 UTC (permalink / raw)
  To: Naohiro Aota, linux-btrfs, David Sterba
  Cc: Chris Mason, Nikolay Borisov, Damien Le Moal, Johannes Thumshirn,
	Hannes Reinecke, Anand Jain, linux-fsdevel

On 2/12/20 2:20 AM, Naohiro Aota wrote:
> Now that, we don't use last_ptr and use_cluster in the function. Drop these
> arguments from it.
> 
> Signed-off-by: Naohiro Aota <naohiro.aota@wdc.com>

Reviewed-by: Josef Bacik <josef@toxicpanda.com>

Thanks,

Josef

^ permalink raw reply	[flat|nested] 57+ messages in thread

* Re: [PATCH v2 21/21] btrfs: factor out prepare_allocation()
  2020-02-12  7:20 ` [PATCH v2 21/21] btrfs: factor out prepare_allocation() Naohiro Aota
@ 2020-02-13 19:59   ` Josef Bacik
  0 siblings, 0 replies; 57+ messages in thread
From: Josef Bacik @ 2020-02-13 19:59 UTC (permalink / raw)
  To: Naohiro Aota, linux-btrfs, David Sterba
  Cc: Chris Mason, Nikolay Borisov, Damien Le Moal, Johannes Thumshirn,
	Hannes Reinecke, Anand Jain, linux-fsdevel

On 2/12/20 2:20 AM, Naohiro Aota wrote:
> This function finally factor out prepare_allocation() form
> find_free_extent(). This function is called before the allocation loop and
> a specific allocator function like prepare_allocation_clustered() should
> initialize their private information and can set proper hint_byte to
> indicate where to start the allocation with.
> 
> Signed-off-by: Naohiro Aota <naohiro.aota@wdc.com>

Reviewed-by: Josef Bacik <josef@toxicpanda.com>

Thanks,

Josef

^ permalink raw reply	[flat|nested] 57+ messages in thread

* Re: [PATCH v2 20/21] btrfs: skip LOOP_NO_EMPTY_SIZE if not clustered allocation
  2020-02-13 19:55   ` Josef Bacik
@ 2020-02-20  9:56     ` Naohiro Aota
  2020-02-20 15:40       ` Josef Bacik
  0 siblings, 1 reply; 57+ messages in thread
From: Naohiro Aota @ 2020-02-20  9:56 UTC (permalink / raw)
  To: Josef Bacik
  Cc: linux-btrfs, David Sterba, Chris Mason, Nikolay Borisov,
	Damien Le Moal, Johannes Thumshirn, Hannes Reinecke, Anand Jain,
	linux-fsdevel

On Thu, Feb 13, 2020 at 02:55:30PM -0500, Josef Bacik wrote:
>On 2/12/20 2:20 AM, Naohiro Aota wrote:
>>LOOP_NO_EMPTY_SIZE is solely dedicated for clustered allocation. So,
>>we can skip this stage and go to LOOP_GIVEUP stage to indicate we gave
>>up the allocation. This commit also moves the scope of the "clustered"
>>variable.
>>
>>Signed-off-by: Naohiro Aota <naohiro.aota@wdc.com>
>>---
>>  fs/btrfs/extent-tree.c | 6 ++++++
>>  1 file changed, 6 insertions(+)
>>
>>diff --git a/fs/btrfs/extent-tree.c b/fs/btrfs/extent-tree.c
>>index 8f0d489f76fa..3ab0d2f5d718 100644
>>--- a/fs/btrfs/extent-tree.c
>>+++ b/fs/btrfs/extent-tree.c
>>@@ -3373,6 +3373,7 @@ enum btrfs_loop_type {
>>  	LOOP_CACHING_WAIT,
>>  	LOOP_ALLOC_CHUNK,
>>  	LOOP_NO_EMPTY_SIZE,
>>+	LOOP_GIVEUP,
>
>Why do we need a new loop definition here?  Can we just return ENOSPC 
>and be done?  You don't appear to use it anywhere, so it doesn't seem 
>like it's needed.  Thanks,
>
>Josef

This is for other allocation policy to skip unnecessary loop stages
(e.g. LOOP_NO_EMPTY_SIZE) from an earlier stage. For example, zoned
allocation policy can implement the code below in
chunk_allocation_failed() to skip the following stages.

diff --git a/fs/btrfs/extent-tree.c b/fs/btrfs/extent-tree.c
index 4badfae0c932..0a18c09b078b 100644
--- a/fs/btrfs/extent-tree.c
+++ b/fs/btrfs/extent-tree.c
@@ -3775,6 +3854,10 @@ static int chunk_allocation_failed(struct find_free_extent_ctl *ffe_ctl)
                  */
                 ffe_ctl->loop = LOOP_NO_EMPTY_SIZE;
                 return 0;
+       case BTRFS_EXTENT_ALLOC_ZONED:
+               /* give up here */
+               ffe_ctl->loop = LOOP_GIVEUP;
+               return -ENOSPC;
         default:
                 BUG();
         }

But, I can keep this LOOP_GIVEUP introduction patch later with this
zoned allocator ones.

Thanks,

^ permalink raw reply	[flat|nested] 57+ messages in thread

* Re: [PATCH v2 19/21] btrfs: factor out chunk_allocation_failed()
  2020-02-13 19:52   ` Josef Bacik
@ 2020-02-20  9:57     ` Naohiro Aota
  0 siblings, 0 replies; 57+ messages in thread
From: Naohiro Aota @ 2020-02-20  9:57 UTC (permalink / raw)
  To: Josef Bacik
  Cc: linux-btrfs, David Sterba, Chris Mason, Nikolay Borisov,
	Damien Le Moal, Johannes Thumshirn, Hannes Reinecke, Anand Jain,
	linux-fsdevel

On Thu, Feb 13, 2020 at 02:52:08PM -0500, Josef Bacik wrote:
>On 2/12/20 2:20 AM, Naohiro Aota wrote:
>>Factor out chunk_allocation_failed() from find_free_extent_update_loop().
>>This function is called when it failed to allocate a chunk. The function
>>can modify "ffe_ctl->loop" and return 0 to continue with the next stage.
>>Or, it can return -ENOSPC to give up here.
>>
>>Signed-off-by: Naohiro Aota <naohiro.aota@wdc.com>
>>---
>>  fs/btrfs/extent-tree.c | 24 ++++++++++++++++--------
>>  1 file changed, 16 insertions(+), 8 deletions(-)
>>
>>diff --git a/fs/btrfs/extent-tree.c b/fs/btrfs/extent-tree.c
>>index efc653e6be29..8f0d489f76fa 100644
>>--- a/fs/btrfs/extent-tree.c
>>+++ b/fs/btrfs/extent-tree.c
>>@@ -3748,6 +3748,21 @@ static void found_extent(struct find_free_extent_ctl *ffe_ctl,
>>  	}
>>  }
>>+static int chunk_allocation_failed(struct find_free_extent_ctl *ffe_ctl)
>>+{
>>+	switch (ffe_ctl->policy) {
>>+	case BTRFS_EXTENT_ALLOC_CLUSTERED:
>>+		/*
>>+		 * If we can't allocate a new chunk we've already looped through
>>+		 * at least once, move on to the NO_EMPTY_SIZE case.
>>+		 */
>>+		ffe_ctl->loop = LOOP_NO_EMPTY_SIZE;
>>+		return 0;
>>+	default:
>>+		BUG();
>>+	}
>>+}
>>+
>>  /*
>>   * Return >0 means caller needs to re-search for free extent
>>   * Return 0 means we have the needed free extent.
>>@@ -3819,19 +3834,12 @@ static int find_free_extent_update_loop(struct btrfs_fs_info *fs_info,
>>  			ret = btrfs_chunk_alloc(trans, ffe_ctl->flags,
>>  						CHUNK_ALLOC_FORCE);
>>-			/*
>>-			 * If we can't allocate a new chunk we've already looped
>>-			 * through at least once, move on to the NO_EMPTY_SIZE
>>-			 * case.
>>-			 */
>>  			if (ret == -ENOSPC)
>>-				ffe_ctl->loop = LOOP_NO_EMPTY_SIZE;
>>+				ret = chunk_allocation_failed(ffe_ctl);
>>  			/* Do not bail out on ENOSPC since we can do more. */
>>  			if (ret < 0 && ret != -ENOSPC)
>>  				btrfs_abort_transaction(trans, ret);
>>-			else
>>-				ret = 0;
>
>You can't delete this, btrfs_chunk_alloc() will return 1 if it 
>succeeded, and we'll leak that out somewhere and bad things will 
>happen.  Thanks,
>
>Josef

Ah, I missed that case... I'll fix in the next series.

Thanks,

^ permalink raw reply	[flat|nested] 57+ messages in thread

* Re: [PATCH v2 09/21] btrfs: factor out create_chunk()
  2020-02-13 16:24   ` Josef Bacik
@ 2020-02-20 10:17     ` Naohiro Aota
  0 siblings, 0 replies; 57+ messages in thread
From: Naohiro Aota @ 2020-02-20 10:17 UTC (permalink / raw)
  To: Josef Bacik
  Cc: linux-btrfs, David Sterba, Chris Mason, Nikolay Borisov,
	Damien Le Moal, Johannes Thumshirn, Hannes Reinecke, Anand Jain,
	linux-fsdevel

On Thu, Feb 13, 2020 at 11:24:57AM -0500, Josef Bacik wrote:
>On 2/12/20 2:20 AM, Naohiro Aota wrote:
>>Factor out create_chunk() from __btrfs_alloc_chunk(). This function finally
>>creates a chunk. There is no functional changes.
>>
>>Signed-off-by: Naohiro Aota <naohiro.aota@wdc.com>
>>---
>>  fs/btrfs/volumes.c | 130 ++++++++++++++++++++++++---------------------
>>  1 file changed, 70 insertions(+), 60 deletions(-)
>>
>>diff --git a/fs/btrfs/volumes.c b/fs/btrfs/volumes.c
>>index 00085943e4dd..3e2e3896d72a 100644
>>--- a/fs/btrfs/volumes.c
>>+++ b/fs/btrfs/volumes.c
>>@@ -5052,90 +5052,53 @@ static int decide_stripe_size(struct btrfs_fs_devices *fs_devices,
>>  	}
>>  }
>>-static int __btrfs_alloc_chunk(struct btrfs_trans_handle *trans,
>>-			       u64 start, u64 type)
>>+static int create_chunk(struct btrfs_trans_handle *trans,
>>+			struct alloc_chunk_ctl *ctl,
>>+			struct btrfs_device_info *devices_info)
>>  {
>>  	struct btrfs_fs_info *info = trans->fs_info;
>>-	struct btrfs_fs_devices *fs_devices = info->fs_devices;
>>  	struct map_lookup *map = NULL;
>>  	struct extent_map_tree *em_tree;
>>  	struct extent_map *em;
>>-	struct btrfs_device_info *devices_info = NULL;
>>-	struct alloc_chunk_ctl ctl;
>>+	u64 start = ctl->start;
>>+	u64 type = ctl->type;
>>  	int ret;
>>  	int i;
>>  	int j;
>>-	if (!alloc_profile_is_valid(type, 0)) {
>>-		ASSERT(0);
>>-		return -EINVAL;
>>-	}
>>-
>>-	if (list_empty(&fs_devices->alloc_list)) {
>>-		if (btrfs_test_opt(info, ENOSPC_DEBUG))
>>-			btrfs_debug(info, "%s: no writable device", __func__);
>>-		return -ENOSPC;
>>-	}
>>-
>>-	if (!(type & BTRFS_BLOCK_GROUP_TYPE_MASK)) {
>>-		btrfs_err(info, "invalid chunk type 0x%llx requested", type);
>>-		BUG();
>>-	}
>>-
>>-	ctl.start = start;
>>-	ctl.type = type;
>>-	init_alloc_chunk_ctl(fs_devices, &ctl);
>>-
>>-	devices_info = kcalloc(fs_devices->rw_devices, sizeof(*devices_info),
>>-			       GFP_NOFS);
>>-	if (!devices_info)
>>+	map = kmalloc(map_lookup_size(ctl->num_stripes), GFP_NOFS);
>>+	if (!map)
>>  		return -ENOMEM;
>>+	map->num_stripes = ctl->num_stripes;
>>-	ret = gather_device_info(fs_devices, &ctl, devices_info);
>>-	if (ret < 0)
>>-		goto error;
>>-
>>-	ret = decide_stripe_size(fs_devices, &ctl, devices_info);
>>-	if (ret < 0)
>>-		goto error;
>>-
>>-	map = kmalloc(map_lookup_size(ctl.num_stripes), GFP_NOFS);
>>-	if (!map) {
>>-		ret = -ENOMEM;
>>-		goto error;
>>-	}
>>-
>>-	map->num_stripes = ctl.num_stripes;
>>-
>>-	for (i = 0; i < ctl.ndevs; ++i) {
>>-		for (j = 0; j < ctl.dev_stripes; ++j) {
>>-			int s = i * ctl.dev_stripes + j;
>>+	for (i = 0; i < ctl->ndevs; ++i) {
>>+		for (j = 0; j < ctl->dev_stripes; ++j) {
>>+			int s = i * ctl->dev_stripes + j;
>>  			map->stripes[s].dev = devices_info[i].dev;
>>  			map->stripes[s].physical = devices_info[i].dev_offset +
>>-						   j * ctl.stripe_size;
>>+						   j * ctl->stripe_size;
>>  		}
>>  	}
>>  	map->stripe_len = BTRFS_STRIPE_LEN;
>>  	map->io_align = BTRFS_STRIPE_LEN;
>>  	map->io_width = BTRFS_STRIPE_LEN;
>>  	map->type = type;
>>-	map->sub_stripes = ctl.sub_stripes;
>>+	map->sub_stripes = ctl->sub_stripes;
>>-	trace_btrfs_chunk_alloc(info, map, start, ctl.chunk_size);
>>+	trace_btrfs_chunk_alloc(info, map, start, ctl->chunk_size);
>>  	em = alloc_extent_map();
>>  	if (!em) {
>>  		kfree(map);
>>-		ret = -ENOMEM;
>>-		goto error;
>>+		return -ENOMEM;
>>  	}
>>  	set_bit(EXTENT_FLAG_FS_MAPPING, &em->flags);
>>  	em->map_lookup = map;
>>  	em->start = start;
>>-	em->len = ctl.chunk_size;
>>+	em->len = ctl->chunk_size;
>>  	em->block_start = 0;
>>  	em->block_len = em->len;
>>-	em->orig_block_len = ctl.stripe_size;
>>+	em->orig_block_len = ctl->stripe_size;
>>  	em_tree = &info->mapping_tree;
>>  	write_lock(&em_tree->lock);
>>@@ -5143,11 +5106,11 @@ static int __btrfs_alloc_chunk(struct btrfs_trans_handle *trans,
>>  	if (ret) {
>>  		write_unlock(&em_tree->lock);
>>  		free_extent_map(em);
>>-		goto error;
>>+		return ret;
>>  	}
>>  	write_unlock(&em_tree->lock);
>>-	ret = btrfs_make_block_group(trans, 0, type, start, ctl.chunk_size);
>>+	ret = btrfs_make_block_group(trans, 0, type, start, ctl->chunk_size);
>>  	if (ret)
>>  		goto error_del_extent;
>>@@ -5155,20 +5118,19 @@ static int __btrfs_alloc_chunk(struct btrfs_trans_handle *trans,
>>  		struct btrfs_device *dev = map->stripes[i].dev;
>>  		btrfs_device_set_bytes_used(dev,
>>-					    dev->bytes_used + ctl.stripe_size);
>>+					    dev->bytes_used + ctl->stripe_size);
>>  		if (list_empty(&dev->post_commit_list))
>>  			list_add_tail(&dev->post_commit_list,
>>  				      &trans->transaction->dev_update_list);
>>  	}
>>-	atomic64_sub(ctl.stripe_size * map->num_stripes,
>>+	atomic64_sub(ctl->stripe_size * map->num_stripes,
>>  		     &info->free_chunk_space);
>>  	free_extent_map(em);
>>  	check_raid56_incompat_flag(info, type);
>>  	check_raid1c34_incompat_flag(info, type);
>>-	kfree(devices_info);
>>  	return 0;
>>  error_del_extent:
>>@@ -5180,7 +5142,55 @@ static int __btrfs_alloc_chunk(struct btrfs_trans_handle *trans,
>>  	free_extent_map(em);
>>  	/* One for the tree reference */
>>  	free_extent_map(em);
>>-error:
>>+
>>+	return ret;
>>+}
>>+
>>+static int __btrfs_alloc_chunk(struct btrfs_trans_handle *trans,
>>+			       u64 start, u64 type)
>>+{
>>+	struct btrfs_fs_info *info = trans->fs_info;
>>+	struct btrfs_fs_devices *fs_devices = info->fs_devices;
>>+	struct btrfs_device_info *devices_info = NULL;
>>+	struct alloc_chunk_ctl ctl;
>>+	int ret;
>>+
>>+	if (!alloc_profile_is_valid(type, 0)) {
>>+		ASSERT(0);
>>+		return -EINVAL;
>>+	}
>>+
>>+	if (list_empty(&fs_devices->alloc_list)) {
>>+		if (btrfs_test_opt(info, ENOSPC_DEBUG))
>>+			btrfs_debug(info, "%s: no writable device", __func__);
>>+		return -ENOSPC;
>>+	}
>>+
>>+	if (!(type & BTRFS_BLOCK_GROUP_TYPE_MASK)) {
>>+		btrfs_err(info, "invalid chunk type 0x%llx requested", type);
>>+		BUG();
>>+	}
>
>This is superfluous, alloc_profile_is_valid() handles this check.  Thanks,
>
>Josef

This checks if at least one block group type (data, metadata or
system) flag is set. OTOH, alloc_profile_is_valid() checks if profile
bits are valid.

Maybe, we can move this check into alloc_profile_is_valid()?

Thanks,

^ permalink raw reply	[flat|nested] 57+ messages in thread

* Re: [PATCH v2 13/21] btrfs: move vairalbes for clustered allocation into find_free_extent_ctl
  2020-02-12  7:20 ` [PATCH v2 13/21] btrfs: move vairalbes for clustered allocation " Naohiro Aota
  2020-02-12 14:14   ` Johannes Thumshirn
  2020-02-13 19:57   ` Josef Bacik
@ 2020-02-20 14:27   ` Christoph Hellwig
  2 siblings, 0 replies; 57+ messages in thread
From: Christoph Hellwig @ 2020-02-20 14:27 UTC (permalink / raw)
  To: Naohiro Aota
  Cc: linux-btrfs, David Sterba, Chris Mason, Josef Bacik,
	Nikolay Borisov, Damien Le Moal, Johannes Thumshirn,
	Hannes Reinecke, Anand Jain, linux-fsdevel

s/vairalbes/variables/ in the Subject.

^ permalink raw reply	[flat|nested] 57+ messages in thread

* Re: [PATCH v2 20/21] btrfs: skip LOOP_NO_EMPTY_SIZE if not clustered allocation
  2020-02-20  9:56     ` Naohiro Aota
@ 2020-02-20 15:40       ` Josef Bacik
  0 siblings, 0 replies; 57+ messages in thread
From: Josef Bacik @ 2020-02-20 15:40 UTC (permalink / raw)
  To: Naohiro Aota
  Cc: linux-btrfs, David Sterba, Chris Mason, Nikolay Borisov,
	Damien Le Moal, Johannes Thumshirn, Hannes Reinecke, Anand Jain,
	linux-fsdevel

On 2/20/20 4:56 AM, Naohiro Aota wrote:
> On Thu, Feb 13, 2020 at 02:55:30PM -0500, Josef Bacik wrote:
>> On 2/12/20 2:20 AM, Naohiro Aota wrote:
>>> LOOP_NO_EMPTY_SIZE is solely dedicated for clustered allocation. So,
>>> we can skip this stage and go to LOOP_GIVEUP stage to indicate we gave
>>> up the allocation. This commit also moves the scope of the "clustered"
>>> variable.
>>>
>>> Signed-off-by: Naohiro Aota <naohiro.aota@wdc.com>
>>> ---
>>>  fs/btrfs/extent-tree.c | 6 ++++++
>>>  1 file changed, 6 insertions(+)
>>>
>>> diff --git a/fs/btrfs/extent-tree.c b/fs/btrfs/extent-tree.c
>>> index 8f0d489f76fa..3ab0d2f5d718 100644
>>> --- a/fs/btrfs/extent-tree.c
>>> +++ b/fs/btrfs/extent-tree.c
>>> @@ -3373,6 +3373,7 @@ enum btrfs_loop_type {
>>>      LOOP_CACHING_WAIT,
>>>      LOOP_ALLOC_CHUNK,
>>>      LOOP_NO_EMPTY_SIZE,
>>> +    LOOP_GIVEUP,
>>
>> Why do we need a new loop definition here?  Can we just return ENOSPC and be 
>> done?  You don't appear to use it anywhere, so it doesn't seem like it's 
>> needed.  Thanks,
>>
>> Josef
> 
> This is for other allocation policy to skip unnecessary loop stages
> (e.g. LOOP_NO_EMPTY_SIZE) from an earlier stage. For example, zoned
> allocation policy can implement the code below in
> chunk_allocation_failed() to skip the following stages.
> 
> diff --git a/fs/btrfs/extent-tree.c b/fs/btrfs/extent-tree.c
> index 4badfae0c932..0a18c09b078b 100644
> --- a/fs/btrfs/extent-tree.c
> +++ b/fs/btrfs/extent-tree.c
> @@ -3775,6 +3854,10 @@ static int chunk_allocation_failed(struct 
> find_free_extent_ctl *ffe_ctl)
>                   */
>                  ffe_ctl->loop = LOOP_NO_EMPTY_SIZE;
>                  return 0;
> +       case BTRFS_EXTENT_ALLOC_ZONED:
> +               /* give up here */
> +               ffe_ctl->loop = LOOP_GIVEUP;
> +               return -ENOSPC;
>          default:
>                  BUG();
>          }
> 
> But, I can keep this LOOP_GIVEUP introduction patch later with this
> zoned allocator ones.
> 

Yes I'd rather they be with the real user, otherwise it's just confusing.  Thanks,

Josef

^ permalink raw reply	[flat|nested] 57+ messages in thread

end of thread, back to index

Thread overview: 57+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2020-02-12  7:20 [PATCH v2 00/21] btrfs: refactor and generalize chunk/dev_extent/extent allocation Naohiro Aota
2020-02-12  7:20 ` [PATCH v2 01/21] btrfs: change type of full_search to bool Naohiro Aota
2020-02-12  7:20 ` [PATCH v2 02/21] btrfs: do not BUG_ON with invalid profile Naohiro Aota
2020-02-12  7:58   ` Johannes Thumshirn
2020-02-12 14:14   ` Josef Bacik
2020-02-12  7:20 ` [PATCH v2 03/21] btrfs: introduce chunk allocation policy Naohiro Aota
2020-02-12  7:20 ` [PATCH v2 04/21] btrfs: refactor find_free_dev_extent_start() Naohiro Aota
2020-02-12 14:16   ` Josef Bacik
2020-02-13 11:32   ` Johannes Thumshirn
2020-02-12  7:20 ` [PATCH v2 05/21] btrfs: introduce alloc_chunk_ctl Naohiro Aota
2020-02-12  8:00   ` Johannes Thumshirn
2020-02-13 16:17   ` Josef Bacik
2020-02-12  7:20 ` [PATCH v2 06/21] btrfs: factor out init_alloc_chunk_ctl Naohiro Aota
2020-02-13 16:20   ` Josef Bacik
2020-02-12  7:20 ` [PATCH v2 07/21] btrfs: factor out gather_device_info() Naohiro Aota
2020-02-12  8:04   ` Johannes Thumshirn
2020-02-12  7:20 ` [PATCH v2 08/21] btrfs: factor out decide_stripe_size() Naohiro Aota
2020-02-12  7:20 ` [PATCH v2 09/21] btrfs: factor out create_chunk() Naohiro Aota
2020-02-12  8:07   ` Johannes Thumshirn
2020-02-13 16:24   ` Josef Bacik
2020-02-20 10:17     ` Naohiro Aota
2020-02-12  7:20 ` [PATCH v2 10/21] btrfs: parameterize dev_extent_min Naohiro Aota
2020-02-13 16:27   ` Josef Bacik
2020-02-12  7:20 ` [PATCH v2 11/21] btrfs: introduce extent allocation policy Naohiro Aota
2020-02-13 19:55   ` Josef Bacik
2020-02-12  7:20 ` [PATCH v2 12/21] btrfs: move hint_byte into find_free_extent_ctl Naohiro Aota
2020-02-12 14:11   ` Johannes Thumshirn
2020-02-13 19:56   ` Josef Bacik
2020-02-12  7:20 ` [PATCH v2 13/21] btrfs: move vairalbes for clustered allocation " Naohiro Aota
2020-02-12 14:14   ` Johannes Thumshirn
2020-02-13 19:57   ` Josef Bacik
2020-02-20 14:27   ` Christoph Hellwig
2020-02-12  7:20 ` [PATCH v2 14/21] btrfs: factor out do_allocation() Naohiro Aota
2020-02-12 14:45   ` Johannes Thumshirn
2020-02-13 19:57   ` Josef Bacik
2020-02-12  7:20 ` [PATCH v2 15/21] btrfs: drop unnecessary arguments from clustered allocation functions Naohiro Aota
2020-02-12 14:47   ` Johannes Thumshirn
2020-02-13 19:58   ` Josef Bacik
2020-02-12  7:20 ` [PATCH v2 16/21] btrfs: factor out release_block_group() Naohiro Aota
2020-02-12 14:48   ` Johannes Thumshirn
2020-02-13 19:58   ` Josef Bacik
2020-02-12  7:20 ` [PATCH v2 17/21] btrfs: factor out found_extent() Naohiro Aota
2020-02-12 14:50   ` Johannes Thumshirn
2020-02-13 19:58   ` Josef Bacik
2020-02-12  7:20 ` [PATCH v2 18/21] btrfs: drop unnecessary arguments from find_free_extent_update_loop() Naohiro Aota
2020-02-12 14:51   ` Johannes Thumshirn
2020-02-13 19:58   ` Josef Bacik
2020-02-12  7:20 ` [PATCH v2 19/21] btrfs: factor out chunk_allocation_failed() Naohiro Aota
2020-02-12 14:56   ` Johannes Thumshirn
2020-02-13 19:52   ` Josef Bacik
2020-02-20  9:57     ` Naohiro Aota
2020-02-12  7:20 ` [PATCH v2 20/21] btrfs: skip LOOP_NO_EMPTY_SIZE if not clustered allocation Naohiro Aota
2020-02-13 19:55   ` Josef Bacik
2020-02-20  9:56     ` Naohiro Aota
2020-02-20 15:40       ` Josef Bacik
2020-02-12  7:20 ` [PATCH v2 21/21] btrfs: factor out prepare_allocation() Naohiro Aota
2020-02-13 19:59   ` Josef Bacik

Linux-BTRFS Archive on lore.kernel.org

Archives are clonable:
	git clone --mirror https://lore.kernel.org/linux-btrfs/0 linux-btrfs/git/0.git

	# If you have public-inbox 1.1+ installed, you may
	# initialize and index your mirror using the following commands:
	public-inbox-init -V2 linux-btrfs linux-btrfs/ https://lore.kernel.org/linux-btrfs \
		linux-btrfs@vger.kernel.org
	public-inbox-index linux-btrfs

Example config snippet for mirrors

Newsgroup available over NNTP:
	nntp://nntp.lore.kernel.org/org.kernel.vger.linux-btrfs


AGPL code for this site: git clone https://public-inbox.org/public-inbox.git