* [PATCH v3 00/21] btrfs: refactor and generalize chunk/dev_extent/extent allocation
From: Naohiro Aota @ 2020-02-25  3:56 UTC
  To: linux-btrfs, David Sterba; +Cc: Josef Bacik, Nikolay Borisov, Naohiro Aota

This series refactors the chunk allocation, device extent allocation
and extent allocation functions and generalizes them so that other
allocation policies can be implemented easily.

On top of this series, we can simplify parts of the "btrfs: zoned
block device support" series, reducing it to adding a new type of
chunk allocator and extent allocator for zoned block devices.
Furthermore, we will be able to implement and test other allocators
from the ideas page of the wiki, e.g. SSD caching, a dedicated
metadata drive, chunk allocation groups, and so on.

This series has no functional changes except introducing "enum
btrfs_chunk_allocation_policy" and "enum
btrfs_extent_allocation_policy".
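
Concretely, the two enums start with a single entry each (taken from
patches 3 and 11):

	enum btrfs_chunk_allocation_policy {
		BTRFS_CHUNK_ALLOC_REGULAR,
	};

	enum btrfs_extent_allocation_policy {
		BTRFS_EXTENT_ALLOC_CLUSTERED,
	};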

* Refactoring chunk/dev_extent allocator

Two functions are split out of find_free_dev_extent_start().
dev_extent_search_start() decides the starting position of the search.
dev_extent_hole_check() checks whether a found hole is suitable for
device extent allocation.

__btrfs_alloc_chunk() is split into four functions.
init_alloc_chunk_ctl() initializes the parameters of an allocation.
gather_device_info() loops over the devices and gathers information
about them. decide_stripe_size() decides the size of the chunk and of
the device extents. Finally, create_chunk() creates the chunk and the
device extents.
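
For illustration, the resulting top-level flow of __btrfs_alloc_chunk()
looks roughly like this (condensed from patches 5-9; the validation and
error paths are abbreviated):

	static int __btrfs_alloc_chunk(struct btrfs_trans_handle *trans,
				       u64 start, u64 type)
	{
		struct btrfs_fs_devices *fs_devices = trans->fs_info->fs_devices;
		struct btrfs_device_info *devices_info;
		struct alloc_chunk_ctl ctl;
		int ret;

		/* ... profile and chunk type validation as before ... */

		ctl.start = start;
		ctl.type = type;
		/* Load the RAID parameters and policy specific limits. */
		init_alloc_chunk_ctl(fs_devices, &ctl);

		devices_info = kcalloc(fs_devices->rw_devices,
				       sizeof(*devices_info), GFP_NOFS);
		if (!devices_info)
			return -ENOMEM;

		/* Collect and sort the holes available on each device. */
		ret = gather_device_info(fs_devices, &ctl, devices_info);
		if (ret < 0)
			goto out;

		/* Compute stripe_size and chunk_size under the policy. */
		ret = decide_stripe_size(fs_devices, &ctl, devices_info);
		if (ret < 0)
			goto out;

		/* Create the mapping, block group and device extents. */
		ret = create_chunk(trans, &ctl, devices_info);
	out:
		kfree(devices_info);
		return ret;
	}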

* Refactoring extent allocator

Three functions are introduced in find_free_extent().
prepare_allocation() initializes the parameters and provides the hint
byte to start the allocation from. do_allocation() handles the actual
allocation within a given block group. release_block_group() is called
when the allocator gives up on a block group, and resets the
allocation context.

Two functions are introduced in find_free_extent_update_loop().
found_extent() is called when the allocator finally finds a proper
extent. chunk_allocation_failed() is called when allocating a new
chunk fails; an allocator implementation can use this hook to set the
next stage to try, e.g. LOOP_NO_EMPTY_SIZE.

Furthermore, the LOOP_NO_EMPTY_SIZE stage is tweaked so that
allocators other than the current clustered one skip this stage.
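
All of these hooks dispatch on ffe_ctl->policy with the same
switch-case pattern used by the chunk-side helpers. As a rough sketch
(the do_allocation_clustered() helper name follows the naming of the
later patches and is illustrative here; argument lists abbreviated):

	static int do_allocation(struct btrfs_block_group *block_group,
				 struct find_free_extent_ctl *ffe_ctl,
				 struct btrfs_block_group **bg_ret)
	{
		switch (ffe_ctl->policy) {
		case BTRFS_EXTENT_ALLOC_CLUSTERED:
			/*
			 * Sketch: the clustered implementation tries the
			 * free space cluster first and falls back to an
			 * unclustered search in this block group.
			 */
			return do_allocation_clustered(block_group,
						       ffe_ctl, bg_ret);
		default:
			BUG();
		}
	}

A new allocator then adds a value to the policy enums and a case to
each hook, leaving the common control flow untouched.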

* Patch organization

Patch 1 is a trivial patch to fix the type of an argument of
find_free_extent_update_loop().

Patch 2 removes a BUG_ON from __btrfs_alloc_chunk().

Patches 3-10 refactor the chunk and device extent allocation
functions: find_free_dev_extent_start() and __btrfs_alloc_chunk().

Patches 11-21 refactor the extent allocation functions:
find_free_extent() and find_free_extent_update_loop().

* Changelog

 - v3
   - Fix handling of btrfs_chunk_alloc()'s return value
   - Drop LOOP_GIVEUP, which is not currently useful
   - Convert another BUG_ON to ASSERT and -EINVAL in patch 6
   - Subtle typo and wording fix

 - v2
   - Stop separating "clustered_alloc_info" from find_free_extent_ctl
   - Change return type of dev_extent_hole_check() to bool
   - Rename set_parameters() to init_alloc_chunk_ctl()
   - Add a patch to remove BUG_ON from __btrfs_alloc_chunk()

Naohiro Aota (21):
  btrfs: change type of full_search to bool
  btrfs: do not BUG_ON with invalid profile
  btrfs: introduce chunk allocation policy
  btrfs: refactor find_free_dev_extent_start()
  btrfs: introduce alloc_chunk_ctl
  btrfs: factor out init_alloc_chunk_ctl
  btrfs: factor out gather_device_info()
  btrfs: factor out decide_stripe_size()
  btrfs: factor out create_chunk()
  btrfs: parameterize dev_extent_min
  btrfs: introduce extent allocation policy
  btrfs: move hint_byte into find_free_extent_ctl
  btrfs: move variables for clustered allocation into
    find_free_extent_ctl
  btrfs: factor out do_allocation()
  btrfs: drop unnecessary arguments from clustered allocation functions
  btrfs: factor out release_block_group()
  btrfs: factor out found_extent()
  btrfs: drop unnecessary arguments from find_free_extent_update_loop()
  btrfs: factor out chunk_allocation_failed()
  btrfs: skip LOOP_NO_EMPTY_SIZE if not clustered allocation
  btrfs: factor out prepare_allocation()

 fs/btrfs/extent-tree.c | 313 +++++++++++++++++++++-----------
 fs/btrfs/volumes.c     | 399 +++++++++++++++++++++++++++--------------
 fs/btrfs/volumes.h     |   6 +
 3 files changed, 481 insertions(+), 237 deletions(-)

-- 
2.25.1


* [PATCH v3 01/21] btrfs: change type of full_search to bool
From: Naohiro Aota @ 2020-02-25  3:56 UTC
  To: linux-btrfs, David Sterba
  Cc: Josef Bacik, Nikolay Borisov, Naohiro Aota, Johannes Thumshirn

While the "full_search" variable defined in find_free_extent() is bool, but
the full_search argument of find_free_extent_update_loop() is defined as
int. Let's trivially fix the argument type.

Signed-off-by: Naohiro Aota <naohiro.aota@wdc.com>
Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>
Reviewed-by: Josef Bacik <josef@toxicpanda.com>
---
 fs/btrfs/extent-tree.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/fs/btrfs/extent-tree.c b/fs/btrfs/extent-tree.c
index a7bc66121330..ed646a13956d 100644
--- a/fs/btrfs/extent-tree.c
+++ b/fs/btrfs/extent-tree.c
@@ -3672,7 +3672,7 @@ static int find_free_extent_update_loop(struct btrfs_fs_info *fs_info,
 					struct btrfs_free_cluster *last_ptr,
 					struct btrfs_key *ins,
 					struct find_free_extent_ctl *ffe_ctl,
-					int full_search, bool use_cluster)
+					bool full_search, bool use_cluster)
 {
 	struct btrfs_root *root = fs_info->extent_root;
 	int ret;
-- 
2.25.1


* [PATCH v3 02/21] btrfs: do not BUG_ON with invalid profile
From: Naohiro Aota @ 2020-02-25  3:56 UTC
  To: linux-btrfs, David Sterba
  Cc: Josef Bacik, Nikolay Borisov, Naohiro Aota, Johannes Thumshirn

Do not BUG_ON() when an invalid profile is passed to
__btrfs_alloc_chunk(). Instead, return -EINVAL and use ASSERT() to
catch the bug during development.

Suggested-by: Johannes Thumshirn <Johannes.Thumshirn@wdc.com>
Signed-off-by: Naohiro Aota <naohiro.aota@wdc.com>
Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>
Reviewed-by: Josef Bacik <josef@toxicpanda.com>
---
 fs/btrfs/volumes.c | 5 ++++-
 1 file changed, 4 insertions(+), 1 deletion(-)

diff --git a/fs/btrfs/volumes.c b/fs/btrfs/volumes.c
index 9cfc668f91f4..911c6b7c650b 100644
--- a/fs/btrfs/volumes.c
+++ b/fs/btrfs/volumes.c
@@ -4809,7 +4809,10 @@ static int __btrfs_alloc_chunk(struct btrfs_trans_handle *trans,
 	int j;
 	int index;
 
-	BUG_ON(!alloc_profile_is_valid(type, 0));
+	if (!alloc_profile_is_valid(type, 0)) {
+		ASSERT(0);
+		return -EINVAL;
+	}
 
 	if (list_empty(&fs_devices->alloc_list)) {
 		if (btrfs_test_opt(info, ENOSPC_DEBUG))
-- 
2.25.1


* [PATCH v3 03/21] btrfs: introduce chunk allocation policy
From: Naohiro Aota @ 2020-02-25  3:56 UTC
  To: linux-btrfs, David Sterba
  Cc: Josef Bacik, Nikolay Borisov, Naohiro Aota, Johannes Thumshirn

This commit introduces a chunk allocation policy for btrfs. This
policy controls how btrfs allocates chunks and device extents from the
devices.

There is no functional change introduced with this commit.

Signed-off-by: Naohiro Aota <naohiro.aota@wdc.com>
Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>
Reviewed-by: Josef Bacik <josef@toxicpanda.com>
---
 fs/btrfs/volumes.c | 1 +
 fs/btrfs/volumes.h | 6 ++++++
 2 files changed, 7 insertions(+)

diff --git a/fs/btrfs/volumes.c b/fs/btrfs/volumes.c
index 911c6b7c650b..9fcbed9b57f9 100644
--- a/fs/btrfs/volumes.c
+++ b/fs/btrfs/volumes.c
@@ -1209,6 +1209,7 @@ static int open_fs_devices(struct btrfs_fs_devices *fs_devices,
 	fs_devices->opened = 1;
 	fs_devices->latest_bdev = latest_dev->bdev;
 	fs_devices->total_rw_bytes = 0;
+	fs_devices->chunk_alloc_policy = BTRFS_CHUNK_ALLOC_REGULAR;
 out:
 	return ret;
 }
diff --git a/fs/btrfs/volumes.h b/fs/btrfs/volumes.h
index 309cda477589..6f462b337fce 100644
--- a/fs/btrfs/volumes.h
+++ b/fs/btrfs/volumes.h
@@ -209,6 +209,10 @@ BTRFS_DEVICE_GETSET_FUNCS(total_bytes);
 BTRFS_DEVICE_GETSET_FUNCS(disk_total_bytes);
 BTRFS_DEVICE_GETSET_FUNCS(bytes_used);
 
+enum btrfs_chunk_allocation_policy {
+	BTRFS_CHUNK_ALLOC_REGULAR,
+};
+
 struct btrfs_fs_devices {
 	u8 fsid[BTRFS_FSID_SIZE]; /* FS specific uuid */
 	u8 metadata_uuid[BTRFS_FSID_SIZE];
@@ -260,6 +264,8 @@ struct btrfs_fs_devices {
 	struct kobject *devices_kobj;
 	struct kobject *devinfo_kobj;
 	struct completion kobj_unregister;
+
+	enum btrfs_chunk_allocation_policy chunk_alloc_policy;
 };
 
 #define BTRFS_BIO_INLINE_CSUM_SIZE	64
-- 
2.25.1


* [PATCH v3 04/21] btrfs: refactor find_free_dev_extent_start()
From: Naohiro Aota @ 2020-02-25  3:56 UTC
  To: linux-btrfs, David Sterba
  Cc: Josef Bacik, Nikolay Borisov, Naohiro Aota, Johannes Thumshirn

Factor out two functions from find_free_dev_extent_start().
dev_extent_search_start() decides the starting position of the search.
dev_extent_hole_check() checks whether a found hole is suitable for
device extent allocation.

These functions also contain the switch-cases that change the
allocation behavior depending on the policy.

Signed-off-by: Naohiro Aota <naohiro.aota@wdc.com>
Reviewed-by: Josef Bacik <josef@toxicpanda.com>
Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>
---
 fs/btrfs/volumes.c | 80 ++++++++++++++++++++++++++++++++++------------
 1 file changed, 60 insertions(+), 20 deletions(-)

diff --git a/fs/btrfs/volumes.c b/fs/btrfs/volumes.c
index 9fcbed9b57f9..5f4e875a42f6 100644
--- a/fs/btrfs/volumes.c
+++ b/fs/btrfs/volumes.c
@@ -1384,6 +1384,61 @@ static bool contains_pending_extent(struct btrfs_device *device, u64 *start,
 	return false;
 }
 
+static u64 dev_extent_search_start(struct btrfs_device *device, u64 start)
+{
+	switch (device->fs_devices->chunk_alloc_policy) {
+	case BTRFS_CHUNK_ALLOC_REGULAR:
+		/*
+		 * We don't want to overwrite the superblock on the
+		 * drive nor any area used by the boot loader (grub
+		 * for example), so we make sure to start at an offset
+		 * of at least 1MB.
+		 */
+		return max_t(u64, start, SZ_1M);
+	default:
+		BUG();
+	}
+}
+
+/**
+ * dev_extent_hole_check - check if specified hole is suitable for allocation
+ * @device:	the device which we have the hole
+ * @hole_start: starting position of the hole
+ * @hole_size:	the size of the hole
+ * @num_bytes:	the size of the free space that we need
+ *
+ * This function may modify @hole_start and @hole_size to reflect the
+ * suitable position for allocation. Returns true if the hole position
+ * is updated, false otherwise.
+ */
+static bool dev_extent_hole_check(struct btrfs_device *device, u64 *hole_start,
+				  u64 *hole_size, u64 num_bytes)
+{
+	bool changed = false;
+	u64 hole_end = *hole_start + *hole_size;
+
+	/*
+	 * Have to check before we set max_hole_start, otherwise we
+	 * could end up sending back this offset anyway.
+	 */
+	if (contains_pending_extent(device, hole_start, *hole_size)) {
+		if (hole_end >= *hole_start)
+			*hole_size = hole_end - *hole_start;
+		else
+			*hole_size = 0;
+		changed = true;
+	}
+
+	switch (device->fs_devices->chunk_alloc_policy) {
+	case BTRFS_CHUNK_ALLOC_REGULAR:
+		/* No extra check */
+		break;
+	default:
+		BUG();
+	}
+
+	return changed;
+}
 
 /*
  * find_free_dev_extent_start - find free space in the specified device
@@ -1430,12 +1485,7 @@ static int find_free_dev_extent_start(struct btrfs_device *device,
 	int slot;
 	struct extent_buffer *l;
 
-	/*
-	 * We don't want to overwrite the superblock on the drive nor any area
-	 * used by the boot loader (grub for example), so we make sure to start
-	 * at an offset of at least 1MB.
-	 */
-	search_start = max_t(u64, search_start, SZ_1M);
+	search_start = dev_extent_search_start(device, search_start);
 
 	path = btrfs_alloc_path();
 	if (!path)
@@ -1493,18 +1543,8 @@ static int find_free_dev_extent_start(struct btrfs_device *device,
 
 		if (key.offset > search_start) {
 			hole_size = key.offset - search_start;
-
-			/*
-			 * Have to check before we set max_hole_start, otherwise
-			 * we could end up sending back this offset anyway.
-			 */
-			if (contains_pending_extent(device, &search_start,
-						    hole_size)) {
-				if (key.offset >= search_start)
-					hole_size = key.offset - search_start;
-				else
-					hole_size = 0;
-			}
+			dev_extent_hole_check(device, &search_start, &hole_size,
+					      num_bytes);
 
 			if (hole_size > max_hole_size) {
 				max_hole_start = search_start;
@@ -1543,8 +1583,8 @@ static int find_free_dev_extent_start(struct btrfs_device *device,
 	 */
 	if (search_end > search_start) {
 		hole_size = search_end - search_start;
-
-		if (contains_pending_extent(device, &search_start, hole_size)) {
+		if (dev_extent_hole_check(device, &search_start, &hole_size,
+					  num_bytes)) {
 			btrfs_release_path(path);
 			goto again;
 		}
-- 
2.25.1


* [PATCH v3 05/21] btrfs: introduce alloc_chunk_ctl
From: Naohiro Aota @ 2020-02-25  3:56 UTC
  To: linux-btrfs, David Sterba
  Cc: Josef Bacik, Nikolay Borisov, Naohiro Aota, Johannes Thumshirn

Introduce "struct alloc_chunk_ctl" to wrap needed parameters for the chunk
allocation.  This will be used to split __btrfs_alloc_chunk() into smaller
functions.

This commit folds a number of local variables in __btrfs_alloc_chunk() into
one "struct alloc_chunk_ctl ctl". There is no functional change.

Signed-off-by: Naohiro Aota <naohiro.aota@wdc.com>
Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>
Reviewed-by: Josef Bacik <josef@toxicpanda.com>
---
 fs/btrfs/volumes.c | 151 ++++++++++++++++++++++++++-------------------
 1 file changed, 86 insertions(+), 65 deletions(-)

diff --git a/fs/btrfs/volumes.c b/fs/btrfs/volumes.c
index 5f4e875a42f6..ed90e9d2bd9b 100644
--- a/fs/btrfs/volumes.c
+++ b/fs/btrfs/volumes.c
@@ -4818,6 +4818,29 @@ static void check_raid1c34_incompat_flag(struct btrfs_fs_info *info, u64 type)
 	btrfs_set_fs_incompat(info, RAID1C34);
 }
 
+/*
+ * Structure used internally for __btrfs_alloc_chunk() function.
+ * Wraps needed parameters.
+ */
+struct alloc_chunk_ctl {
+	u64 start;
+	u64 type;
+	int num_stripes;	/* total number of stripes to allocate */
+	int sub_stripes;	/* sub_stripes info for map */
+	int dev_stripes;	/* stripes per dev */
+	int devs_max;		/* max devs to use */
+	int devs_min;		/* min devs needed */
+	int devs_increment;	/* ndevs has to be a multiple of this */
+	int ncopies;		/* how many copies of the data we have */
+	int nparity;		/* number of stripes worth of bytes to
+				   store parity information */
+	u64 max_stripe_size;
+	u64 max_chunk_size;
+	u64 stripe_size;
+	u64 chunk_size;
+	int ndevs;
+};
+
 static int __btrfs_alloc_chunk(struct btrfs_trans_handle *trans,
 			       u64 start, u64 type)
 {
@@ -4828,23 +4851,11 @@ static int __btrfs_alloc_chunk(struct btrfs_trans_handle *trans,
 	struct extent_map_tree *em_tree;
 	struct extent_map *em;
 	struct btrfs_device_info *devices_info = NULL;
+	struct alloc_chunk_ctl ctl;
 	u64 total_avail;
-	int num_stripes;	/* total number of stripes to allocate */
 	int data_stripes;	/* number of stripes that count for
 				   block group size */
-	int sub_stripes;	/* sub_stripes info for map */
-	int dev_stripes;	/* stripes per dev */
-	int devs_max;		/* max devs to use */
-	int devs_min;		/* min devs needed */
-	int devs_increment;	/* ndevs has to be a multiple of this */
-	int ncopies;		/* how many copies to data has */
-	int nparity;		/* number of stripes worth of bytes to
-				   store parity information */
 	int ret;
-	u64 max_stripe_size;
-	u64 max_chunk_size;
-	u64 stripe_size;
-	u64 chunk_size;
 	int ndevs;
 	int i;
 	int j;
@@ -4861,32 +4872,36 @@ static int __btrfs_alloc_chunk(struct btrfs_trans_handle *trans,
 		return -ENOSPC;
 	}
 
+	ctl.start = start;
+	ctl.type = type;
+
 	index = btrfs_bg_flags_to_raid_index(type);
 
-	sub_stripes = btrfs_raid_array[index].sub_stripes;
-	dev_stripes = btrfs_raid_array[index].dev_stripes;
-	devs_max = btrfs_raid_array[index].devs_max;
-	if (!devs_max)
-		devs_max = BTRFS_MAX_DEVS(info);
-	devs_min = btrfs_raid_array[index].devs_min;
-	devs_increment = btrfs_raid_array[index].devs_increment;
-	ncopies = btrfs_raid_array[index].ncopies;
-	nparity = btrfs_raid_array[index].nparity;
+	ctl.sub_stripes = btrfs_raid_array[index].sub_stripes;
+	ctl.dev_stripes = btrfs_raid_array[index].dev_stripes;
+	ctl.devs_max = btrfs_raid_array[index].devs_max;
+	if (!ctl.devs_max)
+		ctl.devs_max = BTRFS_MAX_DEVS(info);
+	ctl.devs_min = btrfs_raid_array[index].devs_min;
+	ctl.devs_increment = btrfs_raid_array[index].devs_increment;
+	ctl.ncopies = btrfs_raid_array[index].ncopies;
+	ctl.nparity = btrfs_raid_array[index].nparity;
 
 	if (type & BTRFS_BLOCK_GROUP_DATA) {
-		max_stripe_size = SZ_1G;
-		max_chunk_size = BTRFS_MAX_DATA_CHUNK_SIZE;
+		ctl.max_stripe_size = SZ_1G;
+		ctl.max_chunk_size = BTRFS_MAX_DATA_CHUNK_SIZE;
 	} else if (type & BTRFS_BLOCK_GROUP_METADATA) {
 		/* for larger filesystems, use larger metadata chunks */
 		if (fs_devices->total_rw_bytes > 50ULL * SZ_1G)
-			max_stripe_size = SZ_1G;
+			ctl.max_stripe_size = SZ_1G;
 		else
-			max_stripe_size = SZ_256M;
-		max_chunk_size = max_stripe_size;
+			ctl.max_stripe_size = SZ_256M;
+		ctl.max_chunk_size = ctl.max_stripe_size;
 	} else if (type & BTRFS_BLOCK_GROUP_SYSTEM) {
-		max_stripe_size = SZ_32M;
-		max_chunk_size = 2 * max_stripe_size;
-		devs_max = min_t(int, devs_max, BTRFS_MAX_DEVS_SYS_CHUNK);
+		ctl.max_stripe_size = SZ_32M;
+		ctl.max_chunk_size = 2 * ctl.max_stripe_size;
+		ctl.devs_max = min_t(int, ctl.devs_max,
+				      BTRFS_MAX_DEVS_SYS_CHUNK);
 	} else {
 		btrfs_err(info, "invalid chunk type 0x%llx requested",
 		       type);
@@ -4894,8 +4909,8 @@ static int __btrfs_alloc_chunk(struct btrfs_trans_handle *trans,
 	}
 
 	/* We don't want a chunk larger than 10% of writable space */
-	max_chunk_size = min(div_factor(fs_devices->total_rw_bytes, 1),
-			     max_chunk_size);
+	ctl.max_chunk_size = min(div_factor(fs_devices->total_rw_bytes, 1),
+				  ctl.max_chunk_size);
 
 	devices_info = kcalloc(fs_devices->rw_devices, sizeof(*devices_info),
 			       GFP_NOFS);
@@ -4931,21 +4946,21 @@ static int __btrfs_alloc_chunk(struct btrfs_trans_handle *trans,
 		if (total_avail == 0)
 			continue;
 
-		ret = find_free_dev_extent(device,
-					   max_stripe_size * dev_stripes,
-					   &dev_offset, &max_avail);
+		ret = find_free_dev_extent(
+			device, ctl.max_stripe_size * ctl.dev_stripes,
+			&dev_offset, &max_avail);
 		if (ret && ret != -ENOSPC)
 			goto error;
 
 		if (ret == 0)
-			max_avail = max_stripe_size * dev_stripes;
+			max_avail = ctl.max_stripe_size * ctl.dev_stripes;
 
-		if (max_avail < BTRFS_STRIPE_LEN * dev_stripes) {
+		if (max_avail < BTRFS_STRIPE_LEN * ctl.dev_stripes) {
 			if (btrfs_test_opt(info, ENOSPC_DEBUG))
 				btrfs_debug(info,
 			"%s: devid %llu has no free space, have=%llu want=%u",
 					    __func__, device->devid, max_avail,
-					    BTRFS_STRIPE_LEN * dev_stripes);
+					    BTRFS_STRIPE_LEN * ctl.dev_stripes);
 			continue;
 		}
 
@@ -4960,30 +4975,31 @@ static int __btrfs_alloc_chunk(struct btrfs_trans_handle *trans,
 		devices_info[ndevs].dev = device;
 		++ndevs;
 	}
+	ctl.ndevs = ndevs;
 
 	/*
 	 * now sort the devices by hole size / available space
 	 */
-	sort(devices_info, ndevs, sizeof(struct btrfs_device_info),
+	sort(devices_info, ctl.ndevs, sizeof(struct btrfs_device_info),
 	     btrfs_cmp_device_info, NULL);
 
 	/*
 	 * Round down to number of usable stripes, devs_increment can be any
 	 * number so we can't use round_down()
 	 */
-	ndevs -= ndevs % devs_increment;
+	ctl.ndevs -= ctl.ndevs % ctl.devs_increment;
 
-	if (ndevs < devs_min) {
+	if (ctl.ndevs < ctl.devs_min) {
 		ret = -ENOSPC;
 		if (btrfs_test_opt(info, ENOSPC_DEBUG)) {
 			btrfs_debug(info,
 	"%s: not enough devices with free space: have=%d minimum required=%d",
-				    __func__, ndevs, devs_min);
+				    __func__, ctl.ndevs, ctl.devs_min);
 		}
 		goto error;
 	}
 
-	ndevs = min(ndevs, devs_max);
+	ctl.ndevs = min(ctl.ndevs, ctl.devs_max);
 
 	/*
 	 * The primary goal is to maximize the number of stripes, so use as
@@ -4992,14 +5008,15 @@ static int __btrfs_alloc_chunk(struct btrfs_trans_handle *trans,
 	 * The DUP profile stores more than one stripe per device, the
 	 * max_avail is the total size so we have to adjust.
 	 */
-	stripe_size = div_u64(devices_info[ndevs - 1].max_avail, dev_stripes);
-	num_stripes = ndevs * dev_stripes;
+	ctl.stripe_size = div_u64(devices_info[ctl.ndevs - 1].max_avail,
+				   ctl.dev_stripes);
+	ctl.num_stripes = ctl.ndevs * ctl.dev_stripes;
 
 	/*
 	 * this will have to be fixed for RAID1 and RAID10 over
 	 * more drives
 	 */
-	data_stripes = (num_stripes - nparity) / ncopies;
+	data_stripes = (ctl.num_stripes - ctl.nparity) / ctl.ncopies;
 
 	/*
 	 * Use the number of data stripes to figure out how big this chunk
@@ -5007,44 +5024,46 @@ static int __btrfs_alloc_chunk(struct btrfs_trans_handle *trans,
 	 * and compare that answer with the max chunk size. If it's higher,
 	 * we try to reduce stripe_size.
 	 */
-	if (stripe_size * data_stripes > max_chunk_size) {
+	if (ctl.stripe_size * data_stripes > ctl.max_chunk_size) {
 		/*
 		 * Reduce stripe_size, round it up to a 16MB boundary again and
 		 * then use it, unless it ends up being even bigger than the
 		 * previous value we had already.
 		 */
-		stripe_size = min(round_up(div_u64(max_chunk_size,
-						   data_stripes), SZ_16M),
-				  stripe_size);
+		ctl.stripe_size =
+			min(round_up(div_u64(ctl.max_chunk_size, data_stripes),
+				     SZ_16M),
+			    ctl.stripe_size);
 	}
 
 	/* align to BTRFS_STRIPE_LEN */
-	stripe_size = round_down(stripe_size, BTRFS_STRIPE_LEN);
+	ctl.stripe_size = round_down(ctl.stripe_size, BTRFS_STRIPE_LEN);
 
-	map = kmalloc(map_lookup_size(num_stripes), GFP_NOFS);
+	map = kmalloc(map_lookup_size(ctl.num_stripes), GFP_NOFS);
 	if (!map) {
 		ret = -ENOMEM;
 		goto error;
 	}
-	map->num_stripes = num_stripes;
 
-	for (i = 0; i < ndevs; ++i) {
-		for (j = 0; j < dev_stripes; ++j) {
-			int s = i * dev_stripes + j;
+	map->num_stripes = ctl.num_stripes;
+
+	for (i = 0; i < ctl.ndevs; ++i) {
+		for (j = 0; j < ctl.dev_stripes; ++j) {
+			int s = i * ctl.dev_stripes + j;
 			map->stripes[s].dev = devices_info[i].dev;
 			map->stripes[s].physical = devices_info[i].dev_offset +
-						   j * stripe_size;
+						   j * ctl.stripe_size;
 		}
 	}
 	map->stripe_len = BTRFS_STRIPE_LEN;
 	map->io_align = BTRFS_STRIPE_LEN;
 	map->io_width = BTRFS_STRIPE_LEN;
 	map->type = type;
-	map->sub_stripes = sub_stripes;
+	map->sub_stripes = ctl.sub_stripes;
 
-	chunk_size = stripe_size * data_stripes;
+	ctl.chunk_size = ctl.stripe_size * data_stripes;
 
-	trace_btrfs_chunk_alloc(info, map, start, chunk_size);
+	trace_btrfs_chunk_alloc(info, map, start, ctl.chunk_size);
 
 	em = alloc_extent_map();
 	if (!em) {
@@ -5055,10 +5074,10 @@ static int __btrfs_alloc_chunk(struct btrfs_trans_handle *trans,
 	set_bit(EXTENT_FLAG_FS_MAPPING, &em->flags);
 	em->map_lookup = map;
 	em->start = start;
-	em->len = chunk_size;
+	em->len = ctl.chunk_size;
 	em->block_start = 0;
 	em->block_len = em->len;
-	em->orig_block_len = stripe_size;
+	em->orig_block_len = ctl.stripe_size;
 
 	em_tree = &info->mapping_tree;
 	write_lock(&em_tree->lock);
@@ -5070,20 +5089,22 @@ static int __btrfs_alloc_chunk(struct btrfs_trans_handle *trans,
 	}
 	write_unlock(&em_tree->lock);
 
-	ret = btrfs_make_block_group(trans, 0, type, start, chunk_size);
+	ret = btrfs_make_block_group(trans, 0, type, start, ctl.chunk_size);
 	if (ret)
 		goto error_del_extent;
 
 	for (i = 0; i < map->num_stripes; i++) {
 		struct btrfs_device *dev = map->stripes[i].dev;
 
-		btrfs_device_set_bytes_used(dev, dev->bytes_used + stripe_size);
+		btrfs_device_set_bytes_used(dev,
+					    dev->bytes_used + ctl.stripe_size);
 		if (list_empty(&dev->post_commit_list))
 			list_add_tail(&dev->post_commit_list,
 				      &trans->transaction->dev_update_list);
 	}
 
-	atomic64_sub(stripe_size * map->num_stripes, &info->free_chunk_space);
+	atomic64_sub(ctl.stripe_size * map->num_stripes,
+		     &info->free_chunk_space);
 
 	free_extent_map(em);
 	check_raid56_incompat_flag(info, type);
-- 
2.25.1


* [PATCH v3 06/21] btrfs: factor out init_alloc_chunk_ctl
From: Naohiro Aota @ 2020-02-25  3:56 UTC
  To: linux-btrfs, David Sterba; +Cc: Josef Bacik, Nikolay Borisov, Naohiro Aota

Factor out init_alloc_chunk_ctl() from __btrfs_alloc_chunk(). This
function initializes the parameters of "struct alloc_chunk_ctl" for an
allocation. init_alloc_chunk_ctl() handles the common part of the
initialization, loading the RAID parameters from btrfs_raid_array,
while init_alloc_chunk_ctl_policy_regular() decides the remaining
parameters for the regular allocation policy.

The last "else" case in the original code is moved to
__btrfs_alloc_chunk() so that the error case is handled in the common
code. Replace the BUG_ON with an ASSERT() and an error return at the
same time.

Signed-off-by: Naohiro Aota <naohiro.aota@wdc.com>
Reviewed-by: Josef Bacik <josef@toxicpanda.com>
---
 fs/btrfs/volumes.c | 100 ++++++++++++++++++++++++++++-----------------
 1 file changed, 62 insertions(+), 38 deletions(-)

diff --git a/fs/btrfs/volumes.c b/fs/btrfs/volumes.c
index ed90e9d2bd9b..b15c2bb6f913 100644
--- a/fs/btrfs/volumes.c
+++ b/fs/btrfs/volumes.c
@@ -4841,6 +4841,61 @@ struct alloc_chunk_ctl {
 	int ndevs;
 };
 
+static void
+init_alloc_chunk_ctl_policy_regular(struct btrfs_fs_devices *fs_devices,
+				    struct alloc_chunk_ctl *ctl)
+{
+	u64 type = ctl->type;
+
+	if (type & BTRFS_BLOCK_GROUP_DATA) {
+		ctl->max_stripe_size = SZ_1G;
+		ctl->max_chunk_size = BTRFS_MAX_DATA_CHUNK_SIZE;
+	} else if (type & BTRFS_BLOCK_GROUP_METADATA) {
+		/* for larger filesystems, use larger metadata chunks */
+		if (fs_devices->total_rw_bytes > 50ULL * SZ_1G)
+			ctl->max_stripe_size = SZ_1G;
+		else
+			ctl->max_stripe_size = SZ_256M;
+		ctl->max_chunk_size = ctl->max_stripe_size;
+	} else if (type & BTRFS_BLOCK_GROUP_SYSTEM) {
+		ctl->max_stripe_size = SZ_32M;
+		ctl->max_chunk_size = 2 * ctl->max_stripe_size;
+		ctl->devs_max = min_t(int, ctl->devs_max,
+				      BTRFS_MAX_DEVS_SYS_CHUNK);
+	} else {
+		BUG();
+	}
+
+	/* We don't want a chunk larger than 10% of writable space */
+	ctl->max_chunk_size = min(div_factor(fs_devices->total_rw_bytes, 1),
+				  ctl->max_chunk_size);
+}
+
+static void init_alloc_chunk_ctl(struct btrfs_fs_devices *fs_devices,
+				 struct alloc_chunk_ctl *ctl)
+{
+	int index = btrfs_bg_flags_to_raid_index(ctl->type);
+
+	ctl->sub_stripes = btrfs_raid_array[index].sub_stripes;
+	ctl->dev_stripes = btrfs_raid_array[index].dev_stripes;
+	ctl->devs_max = btrfs_raid_array[index].devs_max;
+	if (!ctl->devs_max)
+		ctl->devs_max = BTRFS_MAX_DEVS(fs_devices->fs_info);
+	ctl->devs_min = btrfs_raid_array[index].devs_min;
+	ctl->devs_increment = btrfs_raid_array[index].devs_increment;
+	ctl->ncopies = btrfs_raid_array[index].ncopies;
+	ctl->nparity = btrfs_raid_array[index].nparity;
+	ctl->ndevs = 0;
+
+	switch (fs_devices->chunk_alloc_policy) {
+	case BTRFS_CHUNK_ALLOC_REGULAR:
+		init_alloc_chunk_ctl_policy_regular(fs_devices, ctl);
+		break;
+	default:
+		BUG();
+	}
+}
+
 static int __btrfs_alloc_chunk(struct btrfs_trans_handle *trans,
 			       u64 start, u64 type)
 {
@@ -4859,7 +4914,6 @@ static int __btrfs_alloc_chunk(struct btrfs_trans_handle *trans,
 	int ndevs;
 	int i;
 	int j;
-	int index;
 
 	if (!alloc_profile_is_valid(type, 0)) {
 		ASSERT(0);
@@ -4872,45 +4926,15 @@ static int __btrfs_alloc_chunk(struct btrfs_trans_handle *trans,
 		return -ENOSPC;
 	}
 
-	ctl.start = start;
-	ctl.type = type;
-
-	index = btrfs_bg_flags_to_raid_index(type);
-
-	ctl.sub_stripes = btrfs_raid_array[index].sub_stripes;
-	ctl.dev_stripes = btrfs_raid_array[index].dev_stripes;
-	ctl.devs_max = btrfs_raid_array[index].devs_max;
-	if (!ctl.devs_max)
-		ctl.devs_max = BTRFS_MAX_DEVS(info);
-	ctl.devs_min = btrfs_raid_array[index].devs_min;
-	ctl.devs_increment = btrfs_raid_array[index].devs_increment;
-	ctl.ncopies = btrfs_raid_array[index].ncopies;
-	ctl.nparity = btrfs_raid_array[index].nparity;
-
-	if (type & BTRFS_BLOCK_GROUP_DATA) {
-		ctl.max_stripe_size = SZ_1G;
-		ctl.max_chunk_size = BTRFS_MAX_DATA_CHUNK_SIZE;
-	} else if (type & BTRFS_BLOCK_GROUP_METADATA) {
-		/* for larger filesystems, use larger metadata chunks */
-		if (fs_devices->total_rw_bytes > 50ULL * SZ_1G)
-			ctl.max_stripe_size = SZ_1G;
-		else
-			ctl.max_stripe_size = SZ_256M;
-		ctl.max_chunk_size = ctl.max_stripe_size;
-	} else if (type & BTRFS_BLOCK_GROUP_SYSTEM) {
-		ctl.max_stripe_size = SZ_32M;
-		ctl.max_chunk_size = 2 * ctl.max_stripe_size;
-		ctl.devs_max = min_t(int, ctl.devs_max,
-				      BTRFS_MAX_DEVS_SYS_CHUNK);
-	} else {
-		btrfs_err(info, "invalid chunk type 0x%llx requested",
-		       type);
-		BUG();
+	if (!(type & BTRFS_BLOCK_GROUP_TYPE_MASK)) {
+		btrfs_err(info, "invalid chunk type 0x%llx requested", type);
+		ASSERT(0);
+		return -EINVAL;
 	}
 
-	/* We don't want a chunk larger than 10% of writable space */
-	ctl.max_chunk_size = min(div_factor(fs_devices->total_rw_bytes, 1),
-				  ctl.max_chunk_size);
+	ctl.start = start;
+	ctl.type = type;
+	init_alloc_chunk_ctl(fs_devices, &ctl);
 
 	devices_info = kcalloc(fs_devices->rw_devices, sizeof(*devices_info),
 			       GFP_NOFS);
-- 
2.25.1


* [PATCH v3 07/21] btrfs: factor out gather_device_info()
From: Naohiro Aota @ 2020-02-25  3:56 UTC
  To: linux-btrfs, David Sterba; +Cc: Josef Bacik, Nikolay Borisov, Naohiro Aota

Factor out gather_device_info() from __btrfs_alloc_chunk(). This
function iterates over the devices list and gathers information about
the devices. This commit also introduces "dev_extent_want" and
"dev_extent_min" to fold the repeated calculations into single
variables.

This commit has no functional changes.

Signed-off-by: Naohiro Aota <naohiro.aota@wdc.com>
Reviewed-by: Josef Bacik <josef@toxicpanda.com>
---
 fs/btrfs/volumes.c | 121 +++++++++++++++++++++++++--------------------
 1 file changed, 67 insertions(+), 54 deletions(-)

diff --git a/fs/btrfs/volumes.c b/fs/btrfs/volumes.c
index b15c2bb6f913..cd3a46a803d8 100644
--- a/fs/btrfs/volumes.c
+++ b/fs/btrfs/volumes.c
@@ -4896,60 +4896,25 @@ static void init_alloc_chunk_ctl(struct btrfs_fs_devices *fs_devices,
 	}
 }
 
-static int __btrfs_alloc_chunk(struct btrfs_trans_handle *trans,
-			       u64 start, u64 type)
+static int gather_device_info(struct btrfs_fs_devices *fs_devices,
+			      struct alloc_chunk_ctl *ctl,
+			      struct btrfs_device_info *devices_info)
 {
-	struct btrfs_fs_info *info = trans->fs_info;
-	struct btrfs_fs_devices *fs_devices = info->fs_devices;
+	struct btrfs_fs_info *info = fs_devices->fs_info;
 	struct btrfs_device *device;
-	struct map_lookup *map = NULL;
-	struct extent_map_tree *em_tree;
-	struct extent_map *em;
-	struct btrfs_device_info *devices_info = NULL;
-	struct alloc_chunk_ctl ctl;
 	u64 total_avail;
-	int data_stripes;	/* number of stripes that count for
-				   block group size */
+	u64 dev_extent_want = ctl->max_stripe_size * ctl->dev_stripes;
+	u64 dev_extent_min = BTRFS_STRIPE_LEN * ctl->dev_stripes;
 	int ret;
-	int ndevs;
-	int i;
-	int j;
-
-	if (!alloc_profile_is_valid(type, 0)) {
-		ASSERT(0);
-		return -EINVAL;
-	}
-
-	if (list_empty(&fs_devices->alloc_list)) {
-		if (btrfs_test_opt(info, ENOSPC_DEBUG))
-			btrfs_debug(info, "%s: no writable device", __func__);
-		return -ENOSPC;
-	}
-
-	if (!(type & BTRFS_BLOCK_GROUP_TYPE_MASK)) {
-		btrfs_err(info, "invalid chunk type 0x%llx requested", type);
-		ASSERT(0);
-		return -EINVAL;
-	}
-
-	ctl.start = start;
-	ctl.type = type;
-	init_alloc_chunk_ctl(fs_devices, &ctl);
-
-	devices_info = kcalloc(fs_devices->rw_devices, sizeof(*devices_info),
-			       GFP_NOFS);
-	if (!devices_info)
-		return -ENOMEM;
+	int ndevs = 0;
+	u64 max_avail;
+	u64 dev_offset;
 
 	/*
 	 * in the first pass through the devices list, we gather information
 	 * about the available holes on each device.
 	 */
-	ndevs = 0;
 	list_for_each_entry(device, &fs_devices->alloc_list, dev_alloc_list) {
-		u64 max_avail;
-		u64 dev_offset;
-
 		if (!test_bit(BTRFS_DEV_STATE_WRITEABLE, &device->dev_state)) {
 			WARN(1, KERN_ERR
 			       "BTRFS: read-only device in alloc_list\n");
@@ -4970,21 +4935,20 @@ static int __btrfs_alloc_chunk(struct btrfs_trans_handle *trans,
 		if (total_avail == 0)
 			continue;
 
-		ret = find_free_dev_extent(
-			device, ctl.max_stripe_size * ctl.dev_stripes,
-			&dev_offset, &max_avail);
+		ret = find_free_dev_extent(device, dev_extent_want, &dev_offset,
+					   &max_avail);
 		if (ret && ret != -ENOSPC)
-			goto error;
+			return ret;
 
 		if (ret == 0)
-			max_avail = ctl.max_stripe_size * ctl.dev_stripes;
+			max_avail = dev_extent_want;
 
-		if (max_avail < BTRFS_STRIPE_LEN * ctl.dev_stripes) {
+		if (max_avail < dev_extent_min) {
 			if (btrfs_test_opt(info, ENOSPC_DEBUG))
 				btrfs_debug(info,
-			"%s: devid %llu has no free space, have=%llu want=%u",
+			"%s: devid %llu has no free space, have=%llu want=%llu",
 					    __func__, device->devid, max_avail,
-					    BTRFS_STRIPE_LEN * ctl.dev_stripes);
+					    dev_extent_min);
 			continue;
 		}
 
@@ -4999,14 +4963,63 @@ static int __btrfs_alloc_chunk(struct btrfs_trans_handle *trans,
 		devices_info[ndevs].dev = device;
 		++ndevs;
 	}
-	ctl.ndevs = ndevs;
+	ctl->ndevs = ndevs;
 
 	/*
 	 * now sort the devices by hole size / available space
 	 */
-	sort(devices_info, ctl.ndevs, sizeof(struct btrfs_device_info),
+	sort(devices_info, ndevs, sizeof(struct btrfs_device_info),
 	     btrfs_cmp_device_info, NULL);
 
+	return 0;
+}
+
+static int __btrfs_alloc_chunk(struct btrfs_trans_handle *trans,
+			       u64 start, u64 type)
+{
+	struct btrfs_fs_info *info = trans->fs_info;
+	struct btrfs_fs_devices *fs_devices = info->fs_devices;
+	struct map_lookup *map = NULL;
+	struct extent_map_tree *em_tree;
+	struct extent_map *em;
+	struct btrfs_device_info *devices_info = NULL;
+	struct alloc_chunk_ctl ctl;
+	int data_stripes;	/* number of stripes that count for
+				   block group size */
+	int ret;
+	int i;
+	int j;
+
+	if (!alloc_profile_is_valid(type, 0)) {
+		ASSERT(0);
+		return -EINVAL;
+	}
+
+	if (list_empty(&fs_devices->alloc_list)) {
+		if (btrfs_test_opt(info, ENOSPC_DEBUG))
+			btrfs_debug(info, "%s: no writable device", __func__);
+		return -ENOSPC;
+	}
+
+	if (!(type & BTRFS_BLOCK_GROUP_TYPE_MASK)) {
+		btrfs_err(info, "invalid chunk type 0x%llx requested", type);
+		ASSERT(0);
+		return -EINVAL;
+	}
+
+	ctl.start = start;
+	ctl.type = type;
+	init_alloc_chunk_ctl(fs_devices, &ctl);
+
+	devices_info = kcalloc(fs_devices->rw_devices, sizeof(*devices_info),
+			       GFP_NOFS);
+	if (!devices_info)
+		return -ENOMEM;
+
+	ret = gather_device_info(fs_devices, &ctl, devices_info);
+	if (ret < 0)
+		goto error;
+
 	/*
 	 * Round down to number of usable stripes, devs_increment can be any
 	 * number so we can't use round_down()
-- 
2.25.1


* [PATCH v3 08/21] btrfs: factor out decide_stripe_size()
From: Naohiro Aota @ 2020-02-25  3:56 UTC
  To: linux-btrfs, David Sterba
  Cc: Josef Bacik, Nikolay Borisov, Naohiro Aota, Johannes Thumshirn

Factor out decide_stripe_size() from __btrfs_alloc_chunk(). This
function calculates the actual stripe size to allocate.
decide_stripe_size() handles the common part: rounding down 'ndevs' to
a multiple of 'devs_increment' and checking the upper and lower limits
of 'ndevs'. decide_stripe_size_regular() then decides the size of a
stripe and the size of a chunk. The policy is to maximize the number
of stripes.

This commit has no functional changes.

Signed-off-by: Naohiro Aota <naohiro.aota@wdc.com>
Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>
Reviewed-by: Josef Bacik <josef@toxicpanda.com>
---
 fs/btrfs/volumes.c | 138 ++++++++++++++++++++++++++-------------------
 1 file changed, 80 insertions(+), 58 deletions(-)

diff --git a/fs/btrfs/volumes.c b/fs/btrfs/volumes.c
index cd3a46a803d8..fe194bc0bce3 100644
--- a/fs/btrfs/volumes.c
+++ b/fs/btrfs/volumes.c
@@ -4974,6 +4974,84 @@ static int gather_device_info(struct btrfs_fs_devices *fs_devices,
 	return 0;
 }
 
+static int decide_stripe_size_regular(struct alloc_chunk_ctl *ctl,
+				      struct btrfs_device_info *devices_info)
+{
+	int data_stripes;	/* number of stripes that count for
+				   block group size */
+
+	/*
+	 * The primary goal is to maximize the number of stripes, so use as
+	 * many devices as possible, even if the stripes are not maximum sized.
+	 *
+	 * The DUP profile stores more than one stripe per device, the
+	 * max_avail is the total size so we have to adjust.
+	 */
+	ctl->stripe_size = div_u64(devices_info[ctl->ndevs - 1].max_avail,
+				   ctl->dev_stripes);
+	ctl->num_stripes = ctl->ndevs * ctl->dev_stripes;
+
+	/*
+	 * this will have to be fixed for RAID1 and RAID10 over
+	 * more drives
+	 */
+	data_stripes = (ctl->num_stripes - ctl->nparity) / ctl->ncopies;
+
+	/*
+	 * Use the number of data stripes to figure out how big this chunk
+	 * is really going to be in terms of logical address space,
+	 * and compare that answer with the max chunk size. If it's higher,
+	 * we try to reduce stripe_size.
+	 */
+	if (ctl->stripe_size * data_stripes > ctl->max_chunk_size) {
+		/*
+		 * Reduce stripe_size, round it up to a 16MB boundary again and
+		 * then use it, unless it ends up being even bigger than the
+		 * previous value we had already.
+		 */
+		ctl->stripe_size = min(round_up(div_u64(ctl->max_chunk_size,
+							data_stripes), SZ_16M),
+				       ctl->stripe_size);
+	}
+
+	/* align to BTRFS_STRIPE_LEN */
+	ctl->stripe_size = round_down(ctl->stripe_size, BTRFS_STRIPE_LEN);
+	ctl->chunk_size = ctl->stripe_size * data_stripes;
+
+	return 0;
+}
+
+static int decide_stripe_size(struct btrfs_fs_devices *fs_devices,
+			      struct alloc_chunk_ctl *ctl,
+			      struct btrfs_device_info *devices_info)
+{
+	struct btrfs_fs_info *info = fs_devices->fs_info;
+
+	/*
+	 * Round down to number of usable stripes, devs_increment can be any
+	 * number so we can't use round_down()
+	 */
+	ctl->ndevs -= ctl->ndevs % ctl->devs_increment;
+
+	if (ctl->ndevs < ctl->devs_min) {
+		if (btrfs_test_opt(info, ENOSPC_DEBUG)) {
+			btrfs_debug(info,
+	"%s: not enough devices with free space: have=%d minimum required=%d",
+				    __func__, ctl->ndevs, ctl->devs_min);
+		}
+		return -ENOSPC;
+	}
+
+	ctl->ndevs = min(ctl->ndevs, ctl->devs_max);
+
+	switch (fs_devices->chunk_alloc_policy) {
+	case BTRFS_CHUNK_ALLOC_REGULAR:
+		return decide_stripe_size_regular(ctl, devices_info);
+	default:
+		BUG();
+	}
+}
+
 static int __btrfs_alloc_chunk(struct btrfs_trans_handle *trans,
 			       u64 start, u64 type)
 {
@@ -4984,8 +5062,6 @@ static int __btrfs_alloc_chunk(struct btrfs_trans_handle *trans,
 	struct extent_map *em;
 	struct btrfs_device_info *devices_info = NULL;
 	struct alloc_chunk_ctl ctl;
-	int data_stripes;	/* number of stripes that count for
-				   block group size */
 	int ret;
 	int i;
 	int j;
@@ -5020,61 +5096,9 @@ static int __btrfs_alloc_chunk(struct btrfs_trans_handle *trans,
 	if (ret < 0)
 		goto error;
 
-	/*
-	 * Round down to number of usable stripes, devs_increment can be any
-	 * number so we can't use round_down()
-	 */
-	ctl.ndevs -= ctl.ndevs % ctl.devs_increment;
-
-	if (ctl.ndevs < ctl.devs_min) {
-		ret = -ENOSPC;
-		if (btrfs_test_opt(info, ENOSPC_DEBUG)) {
-			btrfs_debug(info,
-	"%s: not enough devices with free space: have=%d minimum required=%d",
-				    __func__, ctl.ndevs, ctl.devs_min);
-		}
+	ret = decide_stripe_size(fs_devices, &ctl, devices_info);
+	if (ret < 0)
 		goto error;
-	}
-
-	ctl.ndevs = min(ctl.ndevs, ctl.devs_max);
-
-	/*
-	 * The primary goal is to maximize the number of stripes, so use as
-	 * many devices as possible, even if the stripes are not maximum sized.
-	 *
-	 * The DUP profile stores more than one stripe per device, the
-	 * max_avail is the total size so we have to adjust.
-	 */
-	ctl.stripe_size = div_u64(devices_info[ctl.ndevs - 1].max_avail,
-				   ctl.dev_stripes);
-	ctl.num_stripes = ctl.ndevs * ctl.dev_stripes;
-
-	/*
-	 * this will have to be fixed for RAID1 and RAID10 over
-	 * more drives
-	 */
-	data_stripes = (ctl.num_stripes - ctl.nparity) / ctl.ncopies;
-
-	/*
-	 * Use the number of data stripes to figure out how big this chunk
-	 * is really going to be in terms of logical address space,
-	 * and compare that answer with the max chunk size. If it's higher,
-	 * we try to reduce stripe_size.
-	 */
-	if (ctl.stripe_size * data_stripes > ctl.max_chunk_size) {
-		/*
-		 * Reduce stripe_size, round it up to a 16MB boundary again and
-		 * then use it, unless it ends up being even bigger than the
-		 * previous value we had already.
-		 */
-		ctl.stripe_size =
-			min(round_up(div_u64(ctl.max_chunk_size, data_stripes),
-				     SZ_16M),
-			    ctl.stripe_size);
-	}
-
-	/* align to BTRFS_STRIPE_LEN */
-	ctl.stripe_size = round_down(ctl.stripe_size, BTRFS_STRIPE_LEN);
 
 	map = kmalloc(map_lookup_size(ctl.num_stripes), GFP_NOFS);
 	if (!map) {
@@ -5098,8 +5122,6 @@ static int __btrfs_alloc_chunk(struct btrfs_trans_handle *trans,
 	map->type = type;
 	map->sub_stripes = ctl.sub_stripes;
 
-	ctl.chunk_size = ctl.stripe_size * data_stripes;
-
 	trace_btrfs_chunk_alloc(info, map, start, ctl.chunk_size);
 
 	em = alloc_extent_map();
-- 
2.25.1


* [PATCH v3 09/21] btrfs: factor out create_chunk()
From: Naohiro Aota @ 2020-02-25  3:56 UTC
  To: linux-btrfs, David Sterba
  Cc: Josef Bacik, Nikolay Borisov, Naohiro Aota, Johannes Thumshirn

Factor out create_chunk() from __btrfs_alloc_chunk(). This function
finally creates the chunk. There are no functional changes.

Signed-off-by: Naohiro Aota <naohiro.aota@wdc.com>
Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>
---
 fs/btrfs/volumes.c | 132 ++++++++++++++++++++++++---------------------
 1 file changed, 71 insertions(+), 61 deletions(-)

diff --git a/fs/btrfs/volumes.c b/fs/btrfs/volumes.c
index fe194bc0bce3..cd854d6b8208 100644
--- a/fs/btrfs/volumes.c
+++ b/fs/btrfs/volumes.c
@@ -5052,91 +5052,53 @@ static int decide_stripe_size(struct btrfs_fs_devices *fs_devices,
 	}
 }
 
-static int __btrfs_alloc_chunk(struct btrfs_trans_handle *trans,
-			       u64 start, u64 type)
+static int create_chunk(struct btrfs_trans_handle *trans,
+			struct alloc_chunk_ctl *ctl,
+			struct btrfs_device_info *devices_info)
 {
 	struct btrfs_fs_info *info = trans->fs_info;
-	struct btrfs_fs_devices *fs_devices = info->fs_devices;
 	struct map_lookup *map = NULL;
 	struct extent_map_tree *em_tree;
 	struct extent_map *em;
-	struct btrfs_device_info *devices_info = NULL;
-	struct alloc_chunk_ctl ctl;
+	u64 start = ctl->start;
+	u64 type = ctl->type;
 	int ret;
 	int i;
 	int j;
 
-	if (!alloc_profile_is_valid(type, 0)) {
-		ASSERT(0);
-		return -EINVAL;
-	}
-
-	if (list_empty(&fs_devices->alloc_list)) {
-		if (btrfs_test_opt(info, ENOSPC_DEBUG))
-			btrfs_debug(info, "%s: no writable device", __func__);
-		return -ENOSPC;
-	}
-
-	if (!(type & BTRFS_BLOCK_GROUP_TYPE_MASK)) {
-		btrfs_err(info, "invalid chunk type 0x%llx requested", type);
-		ASSERT(0);
-		return -EINVAL;
-	}
-
-	ctl.start = start;
-	ctl.type = type;
-	init_alloc_chunk_ctl(fs_devices, &ctl);
-
-	devices_info = kcalloc(fs_devices->rw_devices, sizeof(*devices_info),
-			       GFP_NOFS);
-	if (!devices_info)
+	map = kmalloc(map_lookup_size(ctl->num_stripes), GFP_NOFS);
+	if (!map)
 		return -ENOMEM;
+	map->num_stripes = ctl->num_stripes;
 
-	ret = gather_device_info(fs_devices, &ctl, devices_info);
-	if (ret < 0)
-		goto error;
-
-	ret = decide_stripe_size(fs_devices, &ctl, devices_info);
-	if (ret < 0)
-		goto error;
-
-	map = kmalloc(map_lookup_size(ctl.num_stripes), GFP_NOFS);
-	if (!map) {
-		ret = -ENOMEM;
-		goto error;
-	}
-
-	map->num_stripes = ctl.num_stripes;
-
-	for (i = 0; i < ctl.ndevs; ++i) {
-		for (j = 0; j < ctl.dev_stripes; ++j) {
-			int s = i * ctl.dev_stripes + j;
+	for (i = 0; i < ctl->ndevs; ++i) {
+		for (j = 0; j < ctl->dev_stripes; ++j) {
+			int s = i * ctl->dev_stripes + j;
 			map->stripes[s].dev = devices_info[i].dev;
 			map->stripes[s].physical = devices_info[i].dev_offset +
-						   j * ctl.stripe_size;
+						   j * ctl->stripe_size;
 		}
 	}
 	map->stripe_len = BTRFS_STRIPE_LEN;
 	map->io_align = BTRFS_STRIPE_LEN;
 	map->io_width = BTRFS_STRIPE_LEN;
 	map->type = type;
-	map->sub_stripes = ctl.sub_stripes;
+	map->sub_stripes = ctl->sub_stripes;
 
-	trace_btrfs_chunk_alloc(info, map, start, ctl.chunk_size);
+	trace_btrfs_chunk_alloc(info, map, start, ctl->chunk_size);
 
 	em = alloc_extent_map();
 	if (!em) {
 		kfree(map);
-		ret = -ENOMEM;
-		goto error;
+		return -ENOMEM;
 	}
 	set_bit(EXTENT_FLAG_FS_MAPPING, &em->flags);
 	em->map_lookup = map;
 	em->start = start;
-	em->len = ctl.chunk_size;
+	em->len = ctl->chunk_size;
 	em->block_start = 0;
 	em->block_len = em->len;
-	em->orig_block_len = ctl.stripe_size;
+	em->orig_block_len = ctl->stripe_size;
 
 	em_tree = &info->mapping_tree;
 	write_lock(&em_tree->lock);
@@ -5144,11 +5106,11 @@ static int __btrfs_alloc_chunk(struct btrfs_trans_handle *trans,
 	if (ret) {
 		write_unlock(&em_tree->lock);
 		free_extent_map(em);
-		goto error;
+		return ret;
 	}
 	write_unlock(&em_tree->lock);
 
-	ret = btrfs_make_block_group(trans, 0, type, start, ctl.chunk_size);
+	ret = btrfs_make_block_group(trans, 0, type, start, ctl->chunk_size);
 	if (ret)
 		goto error_del_extent;
 
@@ -5156,20 +5118,19 @@ static int __btrfs_alloc_chunk(struct btrfs_trans_handle *trans,
 		struct btrfs_device *dev = map->stripes[i].dev;
 
 		btrfs_device_set_bytes_used(dev,
-					    dev->bytes_used + ctl.stripe_size);
+					    dev->bytes_used + ctl->stripe_size);
 		if (list_empty(&dev->post_commit_list))
 			list_add_tail(&dev->post_commit_list,
 				      &trans->transaction->dev_update_list);
 	}
 
-	atomic64_sub(ctl.stripe_size * map->num_stripes,
+	atomic64_sub(ctl->stripe_size * map->num_stripes,
 		     &info->free_chunk_space);
 
 	free_extent_map(em);
 	check_raid56_incompat_flag(info, type);
 	check_raid1c34_incompat_flag(info, type);
 
-	kfree(devices_info);
 	return 0;
 
 error_del_extent:
@@ -5181,7 +5142,56 @@ static int __btrfs_alloc_chunk(struct btrfs_trans_handle *trans,
 	free_extent_map(em);
 	/* One for the tree reference */
 	free_extent_map(em);
-error:
+
+	return ret;
+}
+
+static int __btrfs_alloc_chunk(struct btrfs_trans_handle *trans,
+			       u64 start, u64 type)
+{
+	struct btrfs_fs_info *info = trans->fs_info;
+	struct btrfs_fs_devices *fs_devices = info->fs_devices;
+	struct btrfs_device_info *devices_info = NULL;
+	struct alloc_chunk_ctl ctl;
+	int ret;
+
+	if (!alloc_profile_is_valid(type, 0)) {
+		ASSERT(0);
+		return -EINVAL;
+	}
+
+	if (list_empty(&fs_devices->alloc_list)) {
+		if (btrfs_test_opt(info, ENOSPC_DEBUG))
+			btrfs_debug(info, "%s: no writable device", __func__);
+		return -ENOSPC;
+	}
+
+	if (!(type & BTRFS_BLOCK_GROUP_TYPE_MASK)) {
+		btrfs_err(info, "invalid chunk type 0x%llx requested", type);
+		ASSERT(0);
+		return -EINVAL;
+	}
+
+	ctl.start = start;
+	ctl.type = type;
+	init_alloc_chunk_ctl(fs_devices, &ctl);
+
+	devices_info = kcalloc(fs_devices->rw_devices, sizeof(*devices_info),
+			       GFP_NOFS);
+	if (!devices_info)
+		return -ENOMEM;
+
+	ret = gather_device_info(fs_devices, &ctl, devices_info);
+	if (ret < 0)
+		goto out;
+
+	ret = decide_stripe_size(fs_devices, &ctl, devices_info);
+	if (ret < 0)
+		goto out;
+
+	ret = create_chunk(trans, &ctl, devices_info);
+
+out:
 	kfree(devices_info);
 	return ret;
 }
-- 
2.25.1


* [PATCH v3 10/21] btrfs: parameterize dev_extent_min
From: Naohiro Aota @ 2020-02-25  3:56 UTC
  To: linux-btrfs, David Sterba; +Cc: Josef Bacik, Nikolay Borisov, Naohiro Aota

Currently, we ignore a device whose available space is less than
"BTRFS_STRIPE_LEN * dev_stripes". This is the lower limit for the
current allocation policy (which maximizes the number of stripes).
This commit parameterizes dev_extent_min so that other policies can
set their own lower limit for ignoring a device with insufficient
space.

Signed-off-by: Naohiro Aota <naohiro.aota@wdc.com>
Reviewed-by: Josef Bacik <josef@toxicpanda.com>
---
 fs/btrfs/volumes.c | 9 +++++----
 1 file changed, 5 insertions(+), 4 deletions(-)

diff --git a/fs/btrfs/volumes.c b/fs/btrfs/volumes.c
index cd854d6b8208..c5c64252d1f3 100644
--- a/fs/btrfs/volumes.c
+++ b/fs/btrfs/volumes.c
@@ -4836,6 +4836,7 @@ struct alloc_chunk_ctl {
 				   store parity information */
 	u64 max_stripe_size;
 	u64 max_chunk_size;
+	u64 dev_extent_min;
 	u64 stripe_size;
 	u64 chunk_size;
 	int ndevs;
@@ -4869,6 +4870,7 @@ init_alloc_chunk_ctl_policy_regular(struct btrfs_fs_devices *fs_devices,
 	/* We don't want a chunk larger than 10% of writable space */
 	ctl->max_chunk_size = min(div_factor(fs_devices->total_rw_bytes, 1),
 				  ctl->max_chunk_size);
+	ctl->dev_extent_min = BTRFS_STRIPE_LEN * ctl->dev_stripes;
 }
 
 static void init_alloc_chunk_ctl(struct btrfs_fs_devices *fs_devices,
@@ -4904,7 +4906,6 @@ static int gather_device_info(struct btrfs_fs_devices *fs_devices,
 	struct btrfs_device *device;
 	u64 total_avail;
 	u64 dev_extent_want = ctl->max_stripe_size * ctl->dev_stripes;
-	u64 dev_extent_min = BTRFS_STRIPE_LEN * ctl->dev_stripes;
 	int ret;
 	int ndevs = 0;
 	u64 max_avail;
@@ -4932,7 +4933,7 @@ static int gather_device_info(struct btrfs_fs_devices *fs_devices,
 			total_avail = 0;
 
 		/* If there is no space on this device, skip it. */
-		if (total_avail == 0)
+		if (total_avail < ctl->dev_extent_min)
 			continue;
 
 		ret = find_free_dev_extent(device, dev_extent_want, &dev_offset,
@@ -4943,12 +4944,12 @@ static int gather_device_info(struct btrfs_fs_devices *fs_devices,
 		if (ret == 0)
 			max_avail = dev_extent_want;
 
-		if (max_avail < dev_extent_min) {
+		if (max_avail < ctl->dev_extent_min) {
 			if (btrfs_test_opt(info, ENOSPC_DEBUG))
 				btrfs_debug(info,
 			"%s: devid %llu has no free space, have=%llu want=%llu",
 					    __func__, device->devid, max_avail,
-					    dev_extent_min);
+					    ctl->dev_extent_min);
 			continue;
 		}
 
-- 
2.25.1


* [PATCH v3 11/21] btrfs: introduce extent allocation policy
From: Naohiro Aota @ 2020-02-25  3:56 UTC
  To: linux-btrfs, David Sterba; +Cc: Josef Bacik, Nikolay Borisov, Naohiro Aota

This commit introduces an extent allocation policy for btrfs. This
policy controls how btrfs allocates extents from block groups.

There is no functional change introduced with this commit.

Signed-off-by: Naohiro Aota <naohiro.aota@wdc.com>
Reviewed-by: Josef Bacik <josef@toxicpanda.com>
---
 fs/btrfs/extent-tree.c | 8 ++++++++
 1 file changed, 8 insertions(+)

diff --git a/fs/btrfs/extent-tree.c b/fs/btrfs/extent-tree.c
index ed646a13956d..edc72e768119 100644
--- a/fs/btrfs/extent-tree.c
+++ b/fs/btrfs/extent-tree.c
@@ -3438,6 +3438,10 @@ btrfs_release_block_group(struct btrfs_block_group *cache,
 	btrfs_put_block_group(cache);
 }
 
+enum btrfs_extent_allocation_policy {
+	BTRFS_EXTENT_ALLOC_CLUSTERED,
+};
+
 /*
  * Structure used internally for find_free_extent() function.  Wraps needed
  * parameters.
@@ -3489,6 +3493,9 @@ struct find_free_extent_ctl {
 
 	/* Found result */
 	u64 found_offset;
+
+	/* Allocation policy */
+	enum btrfs_extent_allocation_policy policy;
 };
 
 
@@ -3826,6 +3833,7 @@ static noinline int find_free_extent(struct btrfs_fs_info *fs_info,
 	ffe_ctl.have_caching_bg = false;
 	ffe_ctl.orig_have_caching_bg = false;
 	ffe_ctl.found_offset = 0;
+	ffe_ctl.policy = BTRFS_EXTENT_ALLOC_CLUSTERED;
 
 	ins->type = BTRFS_EXTENT_ITEM_KEY;
 	ins->objectid = 0;
-- 
2.25.1


* [PATCH v3 12/21] btrfs: move hint_byte into find_free_extent_ctl
  2020-02-25  3:56 [PATCH v3 00/21] btrfs: refactor and generalize chunk/dev_extent/extent allocation Naohiro Aota
                   ` (10 preceding siblings ...)
  2020-02-25  3:56 ` [PATCH v3 11/21] btrfs: introduce extent allocation policy Naohiro Aota
@ 2020-02-25  3:56 ` Naohiro Aota
  2020-02-25  3:56 ` [PATCH v3 13/21] btrfs: move variables for clustered allocation " Naohiro Aota
                   ` (9 subsequent siblings)
  21 siblings, 0 replies; 25+ messages in thread
From: Naohiro Aota @ 2020-02-25  3:56 UTC (permalink / raw)
  To: linux-btrfs, David Sterba
  Cc: Josef Bacik, Nikolay Borisov, Naohiro Aota, Johannes Thumshirn

This commit moves hint_byte into find_free_extent_ctl, so that we can
modify hint_byte in the other functions. This will help us split
find_free_extent() further. This commit also renames the function argument
"hint_byte" to "hint_byte_orig" to avoid misuse.

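With the hint in the context, any helper on the find_free_extent() path
can now steer the search start; a minimal hypothetical sketch
(example_adjust_hint() is made up and not part of this series):

static void example_adjust_hint(struct find_free_extent_ctl *ffe_ctl,
				u64 skip_end)
{
	/* Push the search hint past a region we want to avoid. */
	ffe_ctl->hint_byte = max(ffe_ctl->hint_byte, skip_end);
}
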
Signed-off-by: Naohiro Aota <naohiro.aota@wdc.com>
Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>
Reviewed-by: Josef Bacik <josef@toxicpanda.com>
---
 fs/btrfs/extent-tree.c | 14 +++++++++-----
 1 file changed, 9 insertions(+), 5 deletions(-)

diff --git a/fs/btrfs/extent-tree.c b/fs/btrfs/extent-tree.c
index edc72e768119..93f07988d480 100644
--- a/fs/btrfs/extent-tree.c
+++ b/fs/btrfs/extent-tree.c
@@ -3494,6 +3494,9 @@ struct find_free_extent_ctl {
 	/* Found result */
 	u64 found_offset;
 
+	/* Hint byte to start looking for an empty space */
+	u64 hint_byte;
+
 	/* Allocation policy */
 	enum btrfs_extent_allocation_policy policy;
 };
@@ -3808,7 +3811,7 @@ static int find_free_extent_update_loop(struct btrfs_fs_info *fs_info,
  */
 static noinline int find_free_extent(struct btrfs_fs_info *fs_info,
 				u64 ram_bytes, u64 num_bytes, u64 empty_size,
-				u64 hint_byte, struct btrfs_key *ins,
+				u64 hint_byte_orig, struct btrfs_key *ins,
 				u64 flags, int delalloc)
 {
 	int ret = 0;
@@ -3833,6 +3836,7 @@ static noinline int find_free_extent(struct btrfs_fs_info *fs_info,
 	ffe_ctl.have_caching_bg = false;
 	ffe_ctl.orig_have_caching_bg = false;
 	ffe_ctl.found_offset = 0;
+	ffe_ctl.hint_byte = hint_byte_orig;
 	ffe_ctl.policy = BTRFS_EXTENT_ALLOC_CLUSTERED;
 
 	ins->type = BTRFS_EXTENT_ITEM_KEY;
@@ -3875,14 +3879,14 @@ static noinline int find_free_extent(struct btrfs_fs_info *fs_info,
 	if (last_ptr) {
 		spin_lock(&last_ptr->lock);
 		if (last_ptr->block_group)
-			hint_byte = last_ptr->window_start;
+			ffe_ctl.hint_byte = last_ptr->window_start;
 		if (last_ptr->fragmented) {
 			/*
 			 * We still set window_start so we can keep track of the
 			 * last place we found an allocation to try and save
 			 * some time.
 			 */
-			hint_byte = last_ptr->window_start;
+			ffe_ctl.hint_byte = last_ptr->window_start;
 			use_cluster = false;
 		}
 		spin_unlock(&last_ptr->lock);
@@ -3890,8 +3894,8 @@ static noinline int find_free_extent(struct btrfs_fs_info *fs_info,
 
 	ffe_ctl.search_start = max(ffe_ctl.search_start,
 				   first_logical_byte(fs_info, 0));
-	ffe_ctl.search_start = max(ffe_ctl.search_start, hint_byte);
-	if (ffe_ctl.search_start == hint_byte) {
+	ffe_ctl.search_start = max(ffe_ctl.search_start, ffe_ctl.hint_byte);
+	if (ffe_ctl.search_start == ffe_ctl.hint_byte) {
 		block_group = btrfs_lookup_block_group(fs_info,
 						       ffe_ctl.search_start);
 		/*
-- 
2.25.1


* [PATCH v3 13/21] btrfs: move variables for clustered allocation into find_free_extent_ctl
  2020-02-25  3:56 [PATCH v3 00/21] btrfs: refactor and generalize chunk/dev_extent/extent allocation Naohiro Aota
                   ` (11 preceding siblings ...)
  2020-02-25  3:56 ` [PATCH v3 12/21] btrfs: move hint_byte into find_free_extent_ctl Naohiro Aota
@ 2020-02-25  3:56 ` Naohiro Aota
  2020-02-25  3:56 ` [PATCH v3 14/21] btrfs: factor out do_allocation() Naohiro Aota
                   ` (8 subsequent siblings)
  21 siblings, 0 replies; 25+ messages in thread
From: Naohiro Aota @ 2020-02-25  3:56 UTC (permalink / raw)
  To: linux-btrfs, David Sterba
  Cc: Josef Bacik, Nikolay Borisov, Naohiro Aota, Johannes Thumshirn

Move "last_ptr" and "use_cluster" into struct find_free_extent_ctl, so that
the hook functions for the clustered allocator can use these variables.

Signed-off-by: Naohiro Aota <naohiro.aota@wdc.com>
Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>
Reviewed-by: Josef Bacik <josef@toxicpanda.com>
---
 fs/btrfs/extent-tree.c | 38 +++++++++++++++++++++++---------------
 1 file changed, 23 insertions(+), 15 deletions(-)

diff --git a/fs/btrfs/extent-tree.c b/fs/btrfs/extent-tree.c
index 93f07988d480..cbb89ac6cda3 100644
--- a/fs/btrfs/extent-tree.c
+++ b/fs/btrfs/extent-tree.c
@@ -3458,6 +3458,8 @@ struct find_free_extent_ctl {
 
 	/* For clustered allocation */
 	u64 empty_cluster;
+	struct btrfs_free_cluster *last_ptr;
+	bool use_cluster;
 
 	bool have_caching_bg;
 	bool orig_have_caching_bg;
@@ -3816,11 +3818,9 @@ static noinline int find_free_extent(struct btrfs_fs_info *fs_info,
 {
 	int ret = 0;
 	int cache_block_group_error = 0;
-	struct btrfs_free_cluster *last_ptr = NULL;
 	struct btrfs_block_group *block_group = NULL;
 	struct find_free_extent_ctl ffe_ctl = {0};
 	struct btrfs_space_info *space_info;
-	bool use_cluster = true;
 	bool full_search = false;
 
 	WARN_ON(num_bytes < fs_info->sectorsize);
@@ -3829,8 +3829,6 @@ static noinline int find_free_extent(struct btrfs_fs_info *fs_info,
 	ffe_ctl.empty_size = empty_size;
 	ffe_ctl.flags = flags;
 	ffe_ctl.search_start = 0;
-	ffe_ctl.retry_clustered = false;
-	ffe_ctl.retry_unclustered = false;
 	ffe_ctl.delalloc = delalloc;
 	ffe_ctl.index = btrfs_bg_flags_to_raid_index(flags);
 	ffe_ctl.have_caching_bg = false;
@@ -3839,6 +3837,12 @@ static noinline int find_free_extent(struct btrfs_fs_info *fs_info,
 	ffe_ctl.hint_byte = hint_byte_orig;
 	ffe_ctl.policy = BTRFS_EXTENT_ALLOC_CLUSTERED;
 
+	/* For clustered allocation */
+	ffe_ctl.retry_clustered = false;
+	ffe_ctl.retry_unclustered = false;
+	ffe_ctl.last_ptr = NULL;
+	ffe_ctl.use_cluster = true;
+
 	ins->type = BTRFS_EXTENT_ITEM_KEY;
 	ins->objectid = 0;
 	ins->offset = 0;
@@ -3869,14 +3873,16 @@ static noinline int find_free_extent(struct btrfs_fs_info *fs_info,
 			spin_unlock(&space_info->lock);
 			return -ENOSPC;
 		} else if (space_info->max_extent_size) {
-			use_cluster = false;
+			ffe_ctl.use_cluster = false;
 		}
 		spin_unlock(&space_info->lock);
 	}
 
-	last_ptr = fetch_cluster_info(fs_info, space_info,
-				      &ffe_ctl.empty_cluster);
-	if (last_ptr) {
+	ffe_ctl.last_ptr = fetch_cluster_info(fs_info, space_info,
+					      &ffe_ctl.empty_cluster);
+	if (ffe_ctl.last_ptr) {
+		struct btrfs_free_cluster *last_ptr = ffe_ctl.last_ptr;
+
 		spin_lock(&last_ptr->lock);
 		if (last_ptr->block_group)
 			ffe_ctl.hint_byte = last_ptr->window_start;
@@ -3887,7 +3893,7 @@ static noinline int find_free_extent(struct btrfs_fs_info *fs_info,
 			 * some time.
 			 */
 			ffe_ctl.hint_byte = last_ptr->window_start;
-			use_cluster = false;
+			ffe_ctl.use_cluster = false;
 		}
 		spin_unlock(&last_ptr->lock);
 	}
@@ -4000,10 +4006,11 @@ static noinline int find_free_extent(struct btrfs_fs_info *fs_info,
 		 * Ok we want to try and use the cluster allocator, so
 		 * lets look there
 		 */
-		if (last_ptr && use_cluster) {
+		if (ffe_ctl.last_ptr && ffe_ctl.use_cluster) {
 			struct btrfs_block_group *cluster_bg = NULL;
 
-			ret = find_free_extent_clustered(block_group, last_ptr,
+			ret = find_free_extent_clustered(block_group,
+							 ffe_ctl.last_ptr,
 							 &ffe_ctl, &cluster_bg);
 
 			if (ret == 0) {
@@ -4021,8 +4028,8 @@ static noinline int find_free_extent(struct btrfs_fs_info *fs_info,
 			/* ret == -ENOENT case falls through */
 		}
 
-		ret = find_free_extent_unclustered(block_group, last_ptr,
-						   &ffe_ctl);
+		ret = find_free_extent_unclustered(block_group,
+						   ffe_ctl.last_ptr, &ffe_ctl);
 		if (ret == -EAGAIN)
 			goto have_block_group;
 		else if (ret > 0)
@@ -4071,8 +4078,9 @@ static noinline int find_free_extent(struct btrfs_fs_info *fs_info,
 	}
 	up_read(&space_info->groups_sem);
 
-	ret = find_free_extent_update_loop(fs_info, last_ptr, ins, &ffe_ctl,
-					   full_search, use_cluster);
+	ret = find_free_extent_update_loop(fs_info, ffe_ctl.last_ptr, ins,
+					   &ffe_ctl, full_search,
+					   ffe_ctl.use_cluster);
 	if (ret > 0)
 		goto search;
 
-- 
2.25.1


* [PATCH v3 14/21] btrfs: factor out do_allocation()
  2020-02-25  3:56 [PATCH v3 00/21] btrfs: refactor and generalize chunk/dev_extent/extent allocation Naohiro Aota
                   ` (12 preceding siblings ...)
  2020-02-25  3:56 ` [PATCH v3 13/21] btrfs: move variables for clustered allocation " Naohiro Aota
@ 2020-02-25  3:56 ` Naohiro Aota
  2020-02-25  3:56 ` [PATCH v3 15/21] btrfs: drop unnecessary arguments from clustered allocation functions Naohiro Aota
                   ` (7 subsequent siblings)
  21 siblings, 0 replies; 25+ messages in thread
From: Naohiro Aota @ 2020-02-25  3:56 UTC (permalink / raw)
  To: linux-btrfs, David Sterba
  Cc: Josef Bacik, Nikolay Borisov, Naohiro Aota, Johannes Thumshirn

Factor out do_allocation() from find_free_extent(). This function does the
actual allocation in a given block group. The ffe_ctl->policy is used to
determine the actual allocator function to use.
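
A new policy plugs in by adding a switch case that dispatches to its own
allocator function; a hypothetical sketch (the zoned names below are made
up here and not part of this series):

static int do_allocation(struct btrfs_block_group *block_group,
			 struct find_free_extent_ctl *ffe_ctl,
			 struct btrfs_block_group **bg_ret)
{
	switch (ffe_ctl->policy) {
	case BTRFS_EXTENT_ALLOC_CLUSTERED:
		return do_allocation_clustered(block_group, ffe_ctl, bg_ret);
	case BTRFS_EXTENT_ALLOC_ZONED:		/* hypothetical */
		return do_allocation_zoned(block_group, ffe_ctl, bg_ret);
	default:
		BUG();
	}
}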

Signed-off-by: Naohiro Aota <naohiro.aota@wdc.com>
Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>
---
 fs/btrfs/extent-tree.c | 78 +++++++++++++++++++++++++-----------------
 1 file changed, 47 insertions(+), 31 deletions(-)

diff --git a/fs/btrfs/extent-tree.c b/fs/btrfs/extent-tree.c
index cbb89ac6cda3..1ed12a033ba9 100644
--- a/fs/btrfs/extent-tree.c
+++ b/fs/btrfs/extent-tree.c
@@ -3675,6 +3675,39 @@ static int find_free_extent_unclustered(struct btrfs_block_group *bg,
 	return 0;
 }
 
+static int do_allocation_clustered(struct btrfs_block_group *block_group,
+				   struct find_free_extent_ctl *ffe_ctl,
+				   struct btrfs_block_group **bg_ret)
+{
+	int ret;
+
+	/*
+	 * Ok we want to try and use the cluster allocator, so lets look there
+	 */
+	if (ffe_ctl->last_ptr && ffe_ctl->use_cluster) {
+		ret = find_free_extent_clustered(block_group, ffe_ctl->last_ptr,
+						 ffe_ctl, bg_ret);
+		if (ret >= 0 || ret == -EAGAIN)
+			return ret;
+		/* ret == -ENOENT case falls through */
+	}
+
+	return find_free_extent_unclustered(block_group, ffe_ctl->last_ptr,
+					    ffe_ctl);
+}
+
+static int do_allocation(struct btrfs_block_group *block_group,
+			 struct find_free_extent_ctl *ffe_ctl,
+			 struct btrfs_block_group **bg_ret)
+{
+	switch (ffe_ctl->policy) {
+	case BTRFS_EXTENT_ALLOC_CLUSTERED:
+		return do_allocation_clustered(block_group, ffe_ctl, bg_ret);
+	default:
+		BUG();
+	}
+}
+
 /*
  * Return >0 means caller needs to re-search for free extent
  * Return 0 means we have the needed free extent.
@@ -3942,6 +3975,8 @@ static noinline int find_free_extent(struct btrfs_fs_info *fs_info,
 	down_read(&space_info->groups_sem);
 	list_for_each_entry(block_group,
 			    &space_info->block_groups[ffe_ctl.index], list) {
+		struct btrfs_block_group *bg_ret;
+
 		/* If the block group is read-only, we can skip it entirely. */
 		if (unlikely(block_group->ro))
 			continue;
@@ -4002,40 +4037,21 @@ static noinline int find_free_extent(struct btrfs_fs_info *fs_info,
 		if (unlikely(block_group->cached == BTRFS_CACHE_ERROR))
 			goto loop;
 
-		/*
-		 * Ok we want to try and use the cluster allocator, so
-		 * lets look there
-		 */
-		if (ffe_ctl.last_ptr && ffe_ctl.use_cluster) {
-			struct btrfs_block_group *cluster_bg = NULL;
-
-			ret = find_free_extent_clustered(block_group,
-							 ffe_ctl.last_ptr,
-							 &ffe_ctl, &cluster_bg);
-
-			if (ret == 0) {
-				if (cluster_bg && cluster_bg != block_group) {
-					btrfs_release_block_group(block_group,
-								  delalloc);
-					block_group = cluster_bg;
-				}
-				goto checks;
-			} else if (ret == -EAGAIN) {
-				goto have_block_group;
-			} else if (ret > 0) {
-				goto loop;
+		bg_ret = NULL;
+		ret = do_allocation(block_group, &ffe_ctl, &bg_ret);
+		if (ret == 0) {
+			if (bg_ret && bg_ret != block_group) {
+				btrfs_release_block_group(block_group,
+							  delalloc);
+				block_group = bg_ret;
 			}
-			/* ret == -ENOENT case falls through */
-		}
-
-		ret = find_free_extent_unclustered(block_group,
-						   ffe_ctl.last_ptr, &ffe_ctl);
-		if (ret == -EAGAIN)
+		} else if (ret == -EAGAIN) {
 			goto have_block_group;
-		else if (ret > 0)
+		} else if (ret > 0) {
 			goto loop;
-		/* ret == 0 case falls through */
-checks:
+		}
+
+		/* checks */
 		ffe_ctl.search_start = round_up(ffe_ctl.found_offset,
 					     fs_info->stripesize);
 
-- 
2.25.1


* [PATCH v3 15/21] btrfs: drop unnecessary arguments from clustered allocation functions
  2020-02-25  3:56 [PATCH v3 00/21] btrfs: refactor and generalize chunk/dev_extent/extent allocation Naohiro Aota
                   ` (13 preceding siblings ...)
  2020-02-25  3:56 ` [PATCH v3 14/21] btrfs: factor out do_allocation() Naohiro Aota
@ 2020-02-25  3:56 ` Naohiro Aota
  2020-02-25  3:56 ` [PATCH v3 16/21] btrfs: factor out release_block_group() Naohiro Aota
                   ` (6 subsequent siblings)
  21 siblings, 0 replies; 25+ messages in thread
From: Naohiro Aota @ 2020-02-25  3:56 UTC (permalink / raw)
  To: linux-btrfs, David Sterba; +Cc: Josef Bacik, Nikolay Borisov, Naohiro Aota

Now that find_free_extent_clustered() and find_free_extent_unclustered()
can access "last_ptr" through "ffe_ctl", we can drop it from their
arguments.

Signed-off-by: Naohiro Aota <naohiro.aota@wdc.com>
Reviewed-by: Josef Bacik <josef@toxicpanda.com>
---
 fs/btrfs/extent-tree.c | 16 +++++++---------
 1 file changed, 7 insertions(+), 9 deletions(-)

diff --git a/fs/btrfs/extent-tree.c b/fs/btrfs/extent-tree.c
index 1ed12a033ba9..1c0c94a2a8e0 100644
--- a/fs/btrfs/extent-tree.c
+++ b/fs/btrfs/extent-tree.c
@@ -3513,11 +3513,11 @@ struct find_free_extent_ctl {
  * Return 0 means we have found a location and set ffe_ctl->found_offset.
  */
 static int find_free_extent_clustered(struct btrfs_block_group *bg,
-		struct btrfs_free_cluster *last_ptr,
-		struct find_free_extent_ctl *ffe_ctl,
-		struct btrfs_block_group **cluster_bg_ret)
+				      struct find_free_extent_ctl *ffe_ctl,
+				      struct btrfs_block_group **cluster_bg_ret)
 {
 	struct btrfs_block_group *cluster_bg;
+	struct btrfs_free_cluster *last_ptr = ffe_ctl->last_ptr;
 	u64 aligned_cluster;
 	u64 offset;
 	int ret;
@@ -3617,9 +3617,9 @@ static int find_free_extent_clustered(struct btrfs_block_group *bg,
  * Return -EAGAIN to inform caller that we need to re-search this block group
  */
 static int find_free_extent_unclustered(struct btrfs_block_group *bg,
-		struct btrfs_free_cluster *last_ptr,
-		struct find_free_extent_ctl *ffe_ctl)
+					struct find_free_extent_ctl *ffe_ctl)
 {
+	struct btrfs_free_cluster *last_ptr = ffe_ctl->last_ptr;
 	u64 offset;
 
 	/*
@@ -3685,15 +3685,13 @@ static int do_allocation_clustered(struct btrfs_block_group *block_group,
 	 * Ok we want to try and use the cluster allocator, so lets look there
 	 */
 	if (ffe_ctl->last_ptr && ffe_ctl->use_cluster) {
-		ret = find_free_extent_clustered(block_group, ffe_ctl->last_ptr,
-						 ffe_ctl, bg_ret);
+		ret = find_free_extent_clustered(block_group, ffe_ctl, bg_ret);
 		if (ret >= 0 || ret == -EAGAIN)
 			return ret;
 		/* ret == -ENOENT case falls through */
 	}
 
-	return find_free_extent_unclustered(block_group, ffe_ctl->last_ptr,
-					    ffe_ctl);
+	return find_free_extent_unclustered(block_group, ffe_ctl);
 }
 
 static int do_allocation(struct btrfs_block_group *block_group,
-- 
2.25.1


* [PATCH v3 16/21] btrfs: factor out release_block_group()
  2020-02-25  3:56 [PATCH v3 00/21] btrfs: refactor and generalize chunk/dev_extent/extent allocation Naohiro Aota
                   ` (14 preceding siblings ...)
  2020-02-25  3:56 ` [PATCH v3 15/21] btrfs: drop unnecessary arguments from clustered allocation functions Naohiro Aota
@ 2020-02-25  3:56 ` Naohiro Aota
  2020-02-25  3:56 ` [PATCH v3 17/21] btrfs: factor out found_extent() Naohiro Aota
                   ` (5 subsequent siblings)
  21 siblings, 0 replies; 25+ messages in thread
From: Naohiro Aota @ 2020-02-25  3:56 UTC (permalink / raw)
  To: linux-btrfs, David Sterba
  Cc: Josef Bacik, Nikolay Borisov, Naohiro Aota, Johannes Thumshirn

Factor out release_block_group() from find_free_extent(). This function is
called when the allocator gives up an allocation from a block group. Each
allocation policy should reset its information for an allocation in the
next block group.

Signed-off-by: Naohiro Aota <naohiro.aota@wdc.com>
Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>
Reviewed-by: Josef Bacik <josef@toxicpanda.com>
---
 fs/btrfs/extent-tree.c | 24 +++++++++++++++++++-----
 1 file changed, 19 insertions(+), 5 deletions(-)

diff --git a/fs/btrfs/extent-tree.c b/fs/btrfs/extent-tree.c
index 1c0c94a2a8e0..76e05ed33f1f 100644
--- a/fs/btrfs/extent-tree.c
+++ b/fs/btrfs/extent-tree.c
@@ -3706,6 +3706,24 @@ static int do_allocation(struct btrfs_block_group *block_group,
 	}
 }
 
+static void release_block_group(struct btrfs_block_group *block_group,
+				struct find_free_extent_ctl *ffe_ctl,
+				int delalloc)
+{
+	switch (ffe_ctl->policy) {
+	case BTRFS_EXTENT_ALLOC_CLUSTERED:
+		ffe_ctl->retry_clustered = false;
+		ffe_ctl->retry_unclustered = false;
+		break;
+	default:
+		BUG();
+	}
+
+	BUG_ON(btrfs_bg_flags_to_raid_index(block_group->flags) !=
+		ffe_ctl->index);
+	btrfs_release_block_group(block_group, delalloc);
+}
+
 /*
  * Return >0 means caller needs to re-search for free extent
  * Return 0 means we have the needed free extent.
@@ -4083,11 +4101,7 @@ static noinline int find_free_extent(struct btrfs_fs_info *fs_info,
 		btrfs_release_block_group(block_group, delalloc);
 		break;
 loop:
-		ffe_ctl.retry_clustered = false;
-		ffe_ctl.retry_unclustered = false;
-		BUG_ON(btrfs_bg_flags_to_raid_index(block_group->flags) !=
-		       ffe_ctl.index);
-		btrfs_release_block_group(block_group, delalloc);
+		release_block_group(block_group, &ffe_ctl, delalloc);
 		cond_resched();
 	}
 	up_read(&space_info->groups_sem);
-- 
2.25.1


* [PATCH v3 17/21] btrfs: factor out found_extent()
  2020-02-25  3:56 [PATCH v3 00/21] btrfs: refactor and generalize chunk/dev_extent/extent allocation Naohiro Aota
                   ` (15 preceding siblings ...)
  2020-02-25  3:56 ` [PATCH v3 16/21] btrfs: factor out release_block_group() Naohiro Aota
@ 2020-02-25  3:56 ` Naohiro Aota
  2020-02-25  3:56 ` [PATCH v3 18/21] btrfs: drop unnecessary arguments from find_free_extent_update_loop() Naohiro Aota
                   ` (4 subsequent siblings)
  21 siblings, 0 replies; 25+ messages in thread
From: Naohiro Aota @ 2020-02-25  3:56 UTC (permalink / raw)
  To: linux-btrfs, David Sterba
  Cc: Josef Bacik, Nikolay Borisov, Naohiro Aota, Johannes Thumshirn

Factor out found_extent() from find_free_extent_update_loop(). This
function is called when a proper extent is found and before returning from
find_free_extent().  Hook functions like found_extent_clustered() should
save information for the next allocation.
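
A hook for another policy could, for example, remember where the
allocation ended so the next search starts right after it. A hypothetical
sketch (found_extent_example() is made up; ins->objectid is the start and
ins->offset the size of the returned extent):

static void found_extent_example(struct find_free_extent_ctl *ffe_ctl,
				 struct btrfs_key *ins)
{
	/* Start the next search just past this allocation. */
	ffe_ctl->hint_byte = ins->objectid + ins->offset;
}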

Signed-off-by: Naohiro Aota <naohiro.aota@wdc.com>
Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>
Reviewed-by: Josef Bacik <josef@toxicpanda.com>
---
 fs/btrfs/extent-tree.c | 30 +++++++++++++++++++++++++-----
 1 file changed, 25 insertions(+), 5 deletions(-)

diff --git a/fs/btrfs/extent-tree.c b/fs/btrfs/extent-tree.c
index 76e05ed33f1f..46d0338659d7 100644
--- a/fs/btrfs/extent-tree.c
+++ b/fs/btrfs/extent-tree.c
@@ -3724,6 +3724,30 @@ static void release_block_group(struct btrfs_block_group *block_group,
 	btrfs_release_block_group(block_group, delalloc);
 }
 
+static void found_extent_clustered(struct find_free_extent_ctl *ffe_ctl,
+				   struct btrfs_key *ins)
+{
+	struct btrfs_free_cluster *last_ptr = ffe_ctl->last_ptr;
+
+	if (!ffe_ctl->use_cluster && last_ptr) {
+		spin_lock(&last_ptr->lock);
+		last_ptr->window_start = ins->objectid;
+		spin_unlock(&last_ptr->lock);
+	}
+}
+
+static void found_extent(struct find_free_extent_ctl *ffe_ctl,
+			 struct btrfs_key *ins)
+{
+	switch (ffe_ctl->policy) {
+	case BTRFS_EXTENT_ALLOC_CLUSTERED:
+		found_extent_clustered(ffe_ctl, ins);
+		break;
+	default:
+		BUG();
+	}
+}
+
 /*
  * Return >0 means caller needs to re-search for free extent
  * Return 0 means we have the needed free extent.
@@ -3750,11 +3774,7 @@ static int find_free_extent_update_loop(struct btrfs_fs_info *fs_info,
 		return 1;
 
 	if (ins->objectid) {
-		if (!use_cluster && last_ptr) {
-			spin_lock(&last_ptr->lock);
-			last_ptr->window_start = ins->objectid;
-			spin_unlock(&last_ptr->lock);
-		}
+		found_extent(ffe_ctl, ins);
 		return 0;
 	}
 
-- 
2.25.1


* [PATCH v3 18/21] btrfs: drop unnecessary arguments from find_free_extent_update_loop()
  2020-02-25  3:56 [PATCH v3 00/21] btrfs: refactor and generalize chunk/dev_extent/extent allocation Naohiro Aota
                   ` (16 preceding siblings ...)
  2020-02-25  3:56 ` [PATCH v3 17/21] btrfs: factor out found_extent() Naohiro Aota
@ 2020-02-25  3:56 ` Naohiro Aota
  2020-02-25  3:56 ` [PATCH v3 19/21] btrfs: factor out chunk_allocation_failed() Naohiro Aota
                   ` (3 subsequent siblings)
  21 siblings, 0 replies; 25+ messages in thread
From: Naohiro Aota @ 2020-02-25  3:56 UTC (permalink / raw)
  To: linux-btrfs, David Sterba
  Cc: Josef Bacik, Nikolay Borisov, Naohiro Aota, Johannes Thumshirn

Now that we don't use last_ptr and use_cluster in the function, drop these
arguments from it.

Signed-off-by: Naohiro Aota <naohiro.aota@wdc.com>
Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>
Reviewed-by: Josef Bacik <josef@toxicpanda.com>
---
 fs/btrfs/extent-tree.c | 7 ++-----
 1 file changed, 2 insertions(+), 5 deletions(-)

diff --git a/fs/btrfs/extent-tree.c b/fs/btrfs/extent-tree.c
index 46d0338659d7..05edd69f5fe1 100644
--- a/fs/btrfs/extent-tree.c
+++ b/fs/btrfs/extent-tree.c
@@ -3754,10 +3754,9 @@ static void found_extent(struct find_free_extent_ctl *ffe_ctl,
  * Return <0 means we failed to locate any free extent.
  */
 static int find_free_extent_update_loop(struct btrfs_fs_info *fs_info,
-					struct btrfs_free_cluster *last_ptr,
 					struct btrfs_key *ins,
 					struct find_free_extent_ctl *ffe_ctl,
-					bool full_search, bool use_cluster)
+					bool full_search)
 {
 	struct btrfs_root *root = fs_info->extent_root;
 	int ret;
@@ -4126,9 +4125,7 @@ static noinline int find_free_extent(struct btrfs_fs_info *fs_info,
 	}
 	up_read(&space_info->groups_sem);
 
-	ret = find_free_extent_update_loop(fs_info, ffe_ctl.last_ptr, ins,
-					   &ffe_ctl, full_search,
-					   ffe_ctl.use_cluster);
+	ret = find_free_extent_update_loop(fs_info, ins, &ffe_ctl, full_search);
 	if (ret > 0)
 		goto search;
 
-- 
2.25.1


* [PATCH v3 19/21] btrfs: factor out chunk_allocation_failed()
  2020-02-25  3:56 [PATCH v3 00/21] btrfs: refactor and generalize chunk/dev_extent/extent allocation Naohiro Aota
                   ` (17 preceding siblings ...)
  2020-02-25  3:56 ` [PATCH v3 18/21] btrfs: drop unnecessary arguments from find_free_extent_update_loop() Naohiro Aota
@ 2020-02-25  3:56 ` Naohiro Aota
  2020-02-25  3:56 ` [PATCH v3 20/21] btrfs: skip LOOP_NO_EMPTY_SIZE if not clustered allocation Naohiro Aota
                   ` (2 subsequent siblings)
  21 siblings, 0 replies; 25+ messages in thread
From: Naohiro Aota @ 2020-02-25  3:56 UTC (permalink / raw)
  To: linux-btrfs, David Sterba
  Cc: Josef Bacik, Nikolay Borisov, Naohiro Aota, Johannes Thumshirn

Factor out chunk_allocation_failed() from find_free_extent_update_loop().
This function is called when chunk allocation has failed. The function can
modify "ffe_ctl->loop" and return 0 to continue with the next stage, or it
can return -ENOSPC to give up here.
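
A policy with no further stage to fall back to would simply return -ENOSPC
from its case; a hypothetical sketch (the zoned case is made up and not
part of this series):

static int chunk_allocation_failed(struct find_free_extent_ctl *ffe_ctl)
{
	switch (ffe_ctl->policy) {
	case BTRFS_EXTENT_ALLOC_CLUSTERED:
		/* Move on to the NO_EMPTY_SIZE stage and retry. */
		ffe_ctl->loop = LOOP_NO_EMPTY_SIZE;
		return 0;
	case BTRFS_EXTENT_ALLOC_ZONED:		/* hypothetical */
		return -ENOSPC;
	default:
		BUG();
	}
}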

Signed-off-by: Naohiro Aota <naohiro.aota@wdc.com>
Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>
---
 fs/btrfs/extent-tree.c | 27 ++++++++++++++++++---------
 1 file changed, 18 insertions(+), 9 deletions(-)

diff --git a/fs/btrfs/extent-tree.c b/fs/btrfs/extent-tree.c
index 05edd69f5fe1..cb82eaf28033 100644
--- a/fs/btrfs/extent-tree.c
+++ b/fs/btrfs/extent-tree.c
@@ -3748,6 +3748,21 @@ static void found_extent(struct find_free_extent_ctl *ffe_ctl,
 	}
 }
 
+static int chunk_allocation_failed(struct find_free_extent_ctl *ffe_ctl)
+{
+	switch (ffe_ctl->policy) {
+	case BTRFS_EXTENT_ALLOC_CLUSTERED:
+		/*
+		 * If we can't allocate a new chunk we've already looped through
+		 * at least once, move on to the NO_EMPTY_SIZE case.
+		 */
+		ffe_ctl->loop = LOOP_NO_EMPTY_SIZE;
+		return 0;
+	default:
+		BUG();
+	}
+}
+
 /*
  * Return >0 means caller needs to re-search for free extent
  * Return 0 means we have the needed free extent.
@@ -3819,16 +3834,10 @@ static int find_free_extent_update_loop(struct btrfs_fs_info *fs_info,
 			ret = btrfs_chunk_alloc(trans, ffe_ctl->flags,
 						CHUNK_ALLOC_FORCE);
 
-			/*
-			 * If we can't allocate a new chunk we've already looped
-			 * through at least once, move on to the NO_EMPTY_SIZE
-			 * case.
-			 */
-			if (ret == -ENOSPC)
-				ffe_ctl->loop = LOOP_NO_EMPTY_SIZE;
-
 			/* Do not bail out on ENOSPC since we can do more. */
-			if (ret < 0 && ret != -ENOSPC)
+			if (ret == -ENOSPC)
+				ret = chunk_allocation_failed(ffe_ctl);
+			else if (ret < 0)
 				btrfs_abort_transaction(trans, ret);
 			else
 				ret = 0;
-- 
2.25.1


* [PATCH v3 20/21] btrfs: skip LOOP_NO_EMPTY_SIZE if not clustered allocation
  2020-02-25  3:56 [PATCH v3 00/21] btrfs: refactor and generalize chunk/dev_extent/extent allocation Naohiro Aota
                   ` (18 preceding siblings ...)
  2020-02-25  3:56 ` [PATCH v3 19/21] btrfs: factor out chunk_allocation_failed() Naohiro Aota
@ 2020-02-25  3:56 ` Naohiro Aota
  2020-02-28 14:20   ` David Sterba
  2020-02-25  3:56 ` [PATCH v3 21/21] btrfs: factor out prepare_allocation() Naohiro Aota
  2020-02-25 14:34 ` [PATCH v3 00/21] btrfs: refactor and generalize chunk/dev_extent/extent allocation David Sterba
  21 siblings, 1 reply; 25+ messages in thread
From: Naohiro Aota @ 2020-02-25  3:56 UTC (permalink / raw)
  To: linux-btrfs, David Sterba; +Cc: Josef Bacik, Nikolay Borisov, Naohiro Aota

LOOP_NO_EMPTY_SIZE is dedicated solely to clustered allocation, so other
allocation policies can skip this stage and give up the allocation.

Signed-off-by: Naohiro Aota <naohiro.aota@wdc.com>
---
 fs/btrfs/extent-tree.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/fs/btrfs/extent-tree.c b/fs/btrfs/extent-tree.c
index cb82eaf28033..055097bff12b 100644
--- a/fs/btrfs/extent-tree.c
+++ b/fs/btrfs/extent-tree.c
@@ -3848,6 +3848,9 @@ static int find_free_extent_update_loop(struct btrfs_fs_info *fs_info,
 		}
 
 		if (ffe_ctl->loop == LOOP_NO_EMPTY_SIZE) {
+			if (ffe_ctl->policy != BTRFS_EXTENT_ALLOC_CLUSTERED)
+				return -ENOSPC;
+
 			/*
 			 * Don't loop again if we already have no empty_size and
 			 * no empty_cluster.
-- 
2.25.1


* [PATCH v3 21/21] btrfs: factor out prepare_allocation()
  2020-02-25  3:56 [PATCH v3 00/21] btrfs: refactor and generalize chunk/dev_extent/extent allocation Naohiro Aota
                   ` (19 preceding siblings ...)
  2020-02-25  3:56 ` [PATCH v3 20/21] btrfs: skip LOOP_NO_EMPTY_SIZE if not clustered allocation Naohiro Aota
@ 2020-02-25  3:56 ` Naohiro Aota
  2020-02-25 14:34 ` [PATCH v3 00/21] btrfs: refactor and generalize chunk/dev_extent/extent allocation David Sterba
  21 siblings, 0 replies; 25+ messages in thread
From: Naohiro Aota @ 2020-02-25  3:56 UTC (permalink / raw)
  To: linux-btrfs, David Sterba; +Cc: Josef Bacik, Nikolay Borisov, Naohiro Aota

This commit finally factors out prepare_allocation() from
find_free_extent(). This function is called before the allocation loop,
and a policy-specific function like prepare_allocation_clustered() should
initialize its private information and can set a proper hint_byte to
indicate where to start the allocation.
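
A hook for a policy with no private state could be as simple as the
sketch below (prepare_allocation_example() is made up for illustration):

static int prepare_allocation_example(struct btrfs_fs_info *fs_info,
				      struct find_free_extent_ctl *ffe_ctl,
				      struct btrfs_space_info *space_info,
				      struct btrfs_key *ins)
{
	/* No cluster to consult: keep the caller's hint_byte as is. */
	return 0;
}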

Signed-off-by: Naohiro Aota <naohiro.aota@wdc.com>
Reviewed-by: Josef Bacik <josef@toxicpanda.com>
---
 fs/btrfs/extent-tree.c | 110 +++++++++++++++++++++++++----------------
 1 file changed, 68 insertions(+), 42 deletions(-)

diff --git a/fs/btrfs/extent-tree.c b/fs/btrfs/extent-tree.c
index 055097bff12b..1340485b392b 100644
--- a/fs/btrfs/extent-tree.c
+++ b/fs/btrfs/extent-tree.c
@@ -3866,6 +3866,71 @@ static int find_free_extent_update_loop(struct btrfs_fs_info *fs_info,
 	return -ENOSPC;
 }
 
+static int prepare_allocation_clustered(struct btrfs_fs_info *fs_info,
+					struct find_free_extent_ctl *ffe_ctl,
+					struct btrfs_space_info *space_info,
+					struct btrfs_key *ins)
+{
+	/*
+	 * If our free space is heavily fragmented we may not be able to make
+	 * big contiguous allocations, so instead of doing the expensive search
+	 * for free space, simply return ENOSPC with our max_extent_size so we
+	 * can go ahead and search for a more manageable chunk.
+	 *
+	 * If our max_extent_size is large enough for our allocation simply
+	 * disable clustering since we will likely not be able to find enough
+	 * space to create a cluster and induce latency trying.
+	 */
+	if (unlikely(space_info->max_extent_size)) {
+		spin_lock(&space_info->lock);
+		if (space_info->max_extent_size &&
+		    ffe_ctl->num_bytes > space_info->max_extent_size) {
+			ins->offset = space_info->max_extent_size;
+			spin_unlock(&space_info->lock);
+			return -ENOSPC;
+		} else if (space_info->max_extent_size) {
+			ffe_ctl->use_cluster = false;
+		}
+		spin_unlock(&space_info->lock);
+	}
+
+	ffe_ctl->last_ptr = fetch_cluster_info(fs_info, space_info,
+					       &ffe_ctl->empty_cluster);
+	if (ffe_ctl->last_ptr) {
+		struct btrfs_free_cluster *last_ptr = ffe_ctl->last_ptr;
+
+		spin_lock(&last_ptr->lock);
+		if (last_ptr->block_group)
+			ffe_ctl->hint_byte = last_ptr->window_start;
+		if (last_ptr->fragmented) {
+			/*
+			 * We still set window_start so we can keep track of the
+			 * last place we found an allocation to try and save
+			 * some time.
+			 */
+			ffe_ctl->hint_byte = last_ptr->window_start;
+			ffe_ctl->use_cluster = false;
+		}
+		spin_unlock(&last_ptr->lock);
+	}
+
+	return 0;
+}
+
+static int prepare_allocation(struct btrfs_fs_info *fs_info,
+			      struct find_free_extent_ctl *ffe_ctl,
+			      struct btrfs_space_info *space_info,
+			      struct btrfs_key *ins)
+{
+	switch (ffe_ctl->policy) {
+	case BTRFS_EXTENT_ALLOC_CLUSTERED:
+		return prepare_allocation_clustered(fs_info, ffe_ctl,
+						    space_info, ins);
+	default:
+		BUG();
+	}
+}
+
 /*
  * walks the btree of allocated extents and find a hole of a given size.
  * The key ins is changed to record the hole:
@@ -3935,48 +4000,9 @@ static noinline int find_free_extent(struct btrfs_fs_info *fs_info,
 		return -ENOSPC;
 	}
 
-	/*
-	 * If our free space is heavily fragmented we may not be able to make
-	 * big contiguous allocations, so instead of doing the expensive search
-	 * for free space, simply return ENOSPC with our max_extent_size so we
-	 * can go ahead and search for a more manageable chunk.
-	 *
-	 * If our max_extent_size is large enough for our allocation simply
-	 * disable clustering since we will likely not be able to find enough
-	 * space to create a cluster and induce latency trying.
-	 */
-	if (unlikely(space_info->max_extent_size)) {
-		spin_lock(&space_info->lock);
-		if (space_info->max_extent_size &&
-		    num_bytes > space_info->max_extent_size) {
-			ins->offset = space_info->max_extent_size;
-			spin_unlock(&space_info->lock);
-			return -ENOSPC;
-		} else if (space_info->max_extent_size) {
-			ffe_ctl.use_cluster = false;
-		}
-		spin_unlock(&space_info->lock);
-	}
-
-	ffe_ctl.last_ptr = fetch_cluster_info(fs_info, space_info,
-					      &ffe_ctl.empty_cluster);
-	if (ffe_ctl.last_ptr) {
-		struct btrfs_free_cluster *last_ptr = ffe_ctl.last_ptr;
-
-		spin_lock(&last_ptr->lock);
-		if (last_ptr->block_group)
-			ffe_ctl.hint_byte = last_ptr->window_start;
-		if (last_ptr->fragmented) {
-			/*
-			 * We still set window_start so we can keep track of the
-			 * last place we found an allocation to try and save
-			 * some time.
-			 */
-			ffe_ctl.hint_byte = last_ptr->window_start;
-			ffe_ctl.use_cluster = false;
-		}
-		spin_unlock(&last_ptr->lock);
-	}
+	ret = prepare_allocation(fs_info, &ffe_ctl, space_info, ins);
+	if (ret < 0)
+		return ret;
 
 	ffe_ctl.search_start = max(ffe_ctl.search_start,
 				   first_logical_byte(fs_info, 0));
-- 
2.25.1


* Re: [PATCH v3 00/21] btrfs: refactor and generalize chunk/dev_extent/extent allocation
  2020-02-25  3:56 [PATCH v3 00/21] btrfs: refactor and generalize chunk/dev_extent/extent allocation Naohiro Aota
                   ` (20 preceding siblings ...)
  2020-02-25  3:56 ` [PATCH v3 21/21] btrfs: factor out prepare_allocation() Naohiro Aota
@ 2020-02-25 14:34 ` David Sterba
  2020-02-28 14:32   ` David Sterba
  21 siblings, 1 reply; 25+ messages in thread
From: David Sterba @ 2020-02-25 14:34 UTC (permalink / raw)
  To: Naohiro Aota; +Cc: linux-btrfs, David Sterba, Josef Bacik, Nikolay Borisov

Hi,

On Tue, Feb 25, 2020 at 12:56:05PM +0900, Naohiro Aota wrote:
> This series refactors chunk allocation, device_extent allocation and
> extent allocation functions and make them generalized to be able to
> implement other allocation policy easily.

I went through the patches and haven't seen anything serious so will do
another pass and then move it to misc-next as it's all fairly
straightforward.


* Re: [PATCH v3 20/21] btrfs: skip LOOP_NO_EMPTY_SIZE if not clustered allocation
  2020-02-25  3:56 ` [PATCH v3 20/21] btrfs: skip LOOP_NO_EMPTY_SIZE if not clustered allocation Naohiro Aota
@ 2020-02-28 14:20   ` David Sterba
  0 siblings, 0 replies; 25+ messages in thread
From: David Sterba @ 2020-02-28 14:20 UTC (permalink / raw)
  To: Naohiro Aota; +Cc: linux-btrfs, David Sterba, Josef Bacik, Nikolay Borisov

On Tue, Feb 25, 2020 at 12:56:25PM +0900, Naohiro Aota wrote:
> LOOP_NO_EMPTY_SIZE is dedicated solely to clustered allocation, so other
> allocation policies can skip this stage and give up the allocation.
> 
> Signed-off-by: Naohiro Aota <naohiro.aota@wdc.com>
> ---
>  fs/btrfs/extent-tree.c | 3 +++
>  1 file changed, 3 insertions(+)
> 
> diff --git a/fs/btrfs/extent-tree.c b/fs/btrfs/extent-tree.c
> index cb82eaf28033..055097bff12b 100644
> --- a/fs/btrfs/extent-tree.c
> +++ b/fs/btrfs/extent-tree.c
> @@ -3848,6 +3848,9 @@ static int find_free_extent_update_loop(struct btrfs_fs_info *fs_info,
>  		}
>  
>  		if (ffe_ctl->loop == LOOP_NO_EMPTY_SIZE) {
> +			if (ffe_ctl->policy != BTRFS_EXTENT_ALLOC_CLUSTERED)
> +				return -ENOSPC;

This looks like functional change, unlike the rest so will need some
review still.


* Re: [PATCH v3 00/21] btrfs: refactor and generalize chunk/dev_extent/extent allocation
  2020-02-25 14:34 ` [PATCH v3 00/21] btrfs: refactor and generalize chunk/dev_extent/extent allocation David Sterba
@ 2020-02-28 14:32   ` David Sterba
  0 siblings, 0 replies; 25+ messages in thread
From: David Sterba @ 2020-02-28 14:32 UTC (permalink / raw)
  To: dsterba, Naohiro Aota, linux-btrfs, David Sterba, Josef Bacik,
	Nikolay Borisov

On Tue, Feb 25, 2020 at 03:34:04PM +0100, David Sterba wrote:
> Hi,
> 
> On Tue, Feb 25, 2020 at 12:56:05PM +0900, Naohiro Aota wrote:
> > This series refactors chunk allocation, device_extent allocation and
> > extent allocation functions and make them generalized to be able to
> > implement other allocation policy easily.
> 
> I went through the patches and haven't seen anything serious so will do
> another pass and then move it to misc-next as it's all fairly
> straightforward.

Now pushed to misc-next. I made some changes, but mostly updating
comments of the code that got moved. While it's ok to keep it as is, I
take this opportunity to do that in one go instead of another patch.
