* [PATCH 00/23][v2] Convert data reservations to the ticketing infrastructure
@ 2020-01-31 22:35 Josef Bacik
  2020-01-31 22:35 ` [PATCH 01/23] btrfs: change nr to u64 in btrfs_start_delalloc_roots Josef Bacik
                   ` (22 more replies)
  0 siblings, 23 replies; 56+ messages in thread
From: Josef Bacik @ 2020-01-31 22:35 UTC (permalink / raw)
  To: linux-btrfs, kernel-team

v1->v2:
- dropped the RFC
- realized that I mis-translated the transaction commit logic from the old way
  to the new way, so I've reworked a bunch of patches to clearly pull that
  behavior into the generic flushing code.  I've then cleaned it up later to
  make it easy to bisect down to behavior changes.
- Cleaned up the priority flushing, there's no need for an explicit state array.
- Removed the CHUNK_FORCE_ALLOC from the normal flushing as well, simply keep
  the logic of allocating chunks until we've made our reservation or we are
  full, then fall back on the normal flushing mechanism.

-------------- Original email --------------
Nikolay reported a problem where we'd occasionally fail generic/371.  This test
has two tasks running in a loop, one that fallocates and rm's, and one that
pwrites and rm's.  There is enough space for both of these files to exist at
one time, but sometimes the pwrite would fail.

It would fail because we do not serialize data reservations.  If one task is
stuck doing the reclaim work, and another task comes in and steals its
reservation enough times, we'll give up and return ENOSPC.  We validated this by
adding a printk to the data reservation path to tell us that it was succeeding
at making a reservation while another task was flushing.
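The failure mode can be sketched with a toy model (illustrative only; the
helpers below are made up and are not btrfs functions): without a ticket queue,
the bytes a flusher reclaims can be stolen by a competing task before the
original reserver claims them, so after a bounded number of retries the
reserver gives up with ENOSPC, while a ticketed reserver keeps the reclaimed
bytes for the oldest waiter.

```c
#include <assert.h>
#include <errno.h>

/* Toy model of the race: free_bytes is the shared pool, flush_one()
 * reclaims some space, and a competing task may steal it before the
 * original reserver can claim it.  All names are hypothetical. */
static long free_bytes;

static void flush_one(void) { free_bytes += 4096; } /* reclaim 4K */

static int reserve_unserialized(long need, int stealer_active)
{
	for (int loops = 0; loops < 3; loops++) {
		flush_one();
		if (stealer_active)
			free_bytes -= 4096; /* another task steals the space */
		if (free_bytes >= need) {
			free_bytes -= need;
			return 0;
		}
	}
	return -ENOSPC; /* gave up, like the generic/371 failure */
}

/* With ticketing, reclaimed bytes are handed to the oldest ticket
 * first, so a later arrival cannot starve an earlier one. */
static int reserve_ticketed(long need)
{
	while (free_bytes < need)
		flush_one(); /* reclaimed bytes go to our ticket */
	free_bytes -= need;
	return 0;
}
```

The model is deterministic on purpose; in the real filesystem the "steal" is a
racing reservation, not a fixed subtraction.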

To solve this problem I've converted data reservations over to the ticketing
system that metadata uses.  There are some cleanups and some fixes that have to
come before we can do that.  The following are simple cleanups:

  [PATCH 01/20] btrfs: change nr to u64 in btrfs_start_delalloc_roots
  [PATCH 02/20] btrfs: remove orig from shrink_delalloc
  [PATCH 03/20] btrfs: handle U64_MAX for shrink_delalloc

The following are fixes that are needed to handle data space infos properly.

  [PATCH 04/20] btrfs: make shrink_delalloc take space_info as an arg
  [PATCH 05/20] btrfs: make ALLOC_CHUNK use the space info flags
  [PATCH 06/20] btrfs: call btrfs_try_granting_tickets when freeing
  [PATCH 07/20] btrfs: call btrfs_try_granting_tickets when unpinning
  [PATCH 08/20] btrfs: call btrfs_try_granting_tickets when reserving
  [PATCH 09/20] btrfs: use the btrfs_space_info_free_bytes_may_use

I then converted the data reservation path over to the ticketing infrastructure,
but I did it in a way that mirrors exactly what we currently have.  The idea is
that I want to be able to bisect regressions that come from behavior changes,
and doing that would be hard if I had a single patch doing the whole
conversion at once.  So the following patches only move code around
logically, but preserve the same behavior as before:

  [PATCH 10/20] btrfs: add flushing states for handling data
  [PATCH 11/20] btrfs: add btrfs_reserve_data_bytes and use it
  [PATCH 12/20] btrfs: use ticketing for data space reservations

And then the following patches change the behavior of how we flush space
for data reservations.

  [PATCH 13/20] btrfs: run delayed iputs before committing the
  [PATCH 14/20] btrfs: flush delayed refs when trying to reserve data
  [PATCH 15/20] btrfs: serialize data reservations if we are flushing
  [PATCH 16/20] btrfs: rework chunk allocate for data reservations
  [PATCH 17/20] btrfs: drop the commit_cycles stuff for data

And then a cleanup now that the data reservation code is the same as the
metadata reservation code.

  [PATCH 18/20] btrfs: use the same helper for data and metadata

Finally a patch to change the flushing from direct to asynchronous, mirroring
what we do for metadata for better latency.

  [PATCH 19/20] btrfs: do async reclaim for data reservations

And then a final cleanup now that we're where we want to be with the data
reservation path.

  [PATCH 20/20] btrfs: kill the priority_reclaim_space helper

I've marked this as an RFC because I've only tested this with generic/371.  I'm
starting my overnight runs of xfstests now and will likely find regressions, but
I'd like to get review started so this can get merged ASAP.  Thanks,

Josef



^ permalink raw reply	[flat|nested] 56+ messages in thread

* [PATCH 01/23] btrfs: change nr to u64 in btrfs_start_delalloc_roots
  2020-01-31 22:35 [PATCH 00/23][v2] Convert data reservations to the ticketing infrastructure Josef Bacik
@ 2020-01-31 22:35 ` Josef Bacik
  2020-02-03 10:45   ` Johannes Thumshirn
  2020-02-03 11:15   ` Nikolay Borisov
  2020-01-31 22:35 ` [PATCH 02/23] btrfs: remove orig from shrink_delalloc Josef Bacik
                   ` (21 subsequent siblings)
  22 siblings, 2 replies; 56+ messages in thread
From: Josef Bacik @ 2020-01-31 22:35 UTC (permalink / raw)
  To: linux-btrfs, kernel-team

We have btrfs_wait_ordered_roots(), which takes a u64 for nr, but
btrfs_start_delalloc_roots(), which takes an int for nr, which makes using
them in conjunction, especially for something like (u64)-1, annoying and
inconsistent.  Fix btrfs_start_delalloc_roots() to take a u64 for nr and
adjust start_delalloc_inodes() and its callers appropriately.

This means start_delalloc_inodes() now takes a pointer to nr, since we
want to preserve its ability to return an error; it simply does the nr
adjustment internally as necessary.

Part of adjusting the callers to this means changing
btrfs_writeback_inodes_sb_nr() to take a u64 for items.  This may be
confusing because it seems unrelated, but the caller of
btrfs_writeback_inodes_sb_nr() already passes in a u64, it's just the
function variable that needs to be changed.
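The pointer-and-countdown semantics described above can be sketched in
isolation (start_some_inodes() below is a hypothetical stand-in for the
post-patch start_delalloc_inodes(), not the kernel function): *nr counts down
across roots, and U64_MAX means "flush everything".

```c
#include <stdint.h>

typedef uint64_t u64;
#define U64_MAX UINT64_MAX

/* Hypothetical stand-in for the post-patch start_delalloc_inodes():
 * *nr counts down, U64_MAX means "no limit".  Returns the number of
 * inodes actually queued for this root. */
static int start_some_inodes(u64 *nr, int inodes_on_root)
{
	int queued = 0;

	while (inodes_on_root--) {
		queued++;
		if (*nr != U64_MAX) {
			(*nr)--;
			if (*nr == 0)
				break;
		}
	}
	return queued;
}
```

Because the remaining count lives behind the pointer, the int return value is
free to carry an error code, which is the property the patch preserves.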

Signed-off-by: Josef Bacik <josef@toxicpanda.com>
---
 fs/btrfs/ctree.h       |  2 +-
 fs/btrfs/dev-replace.c |  2 +-
 fs/btrfs/inode.c       | 27 +++++++++++----------------
 fs/btrfs/ioctl.c       |  2 +-
 fs/btrfs/space-info.c  |  2 +-
 5 files changed, 15 insertions(+), 20 deletions(-)

diff --git a/fs/btrfs/ctree.h b/fs/btrfs/ctree.h
index 8a2c1665baad..6afa0885a9bb 100644
--- a/fs/btrfs/ctree.h
+++ b/fs/btrfs/ctree.h
@@ -2884,7 +2884,7 @@ int btrfs_truncate_inode_items(struct btrfs_trans_handle *trans,
 			       u32 min_type);
 
 int btrfs_start_delalloc_snapshot(struct btrfs_root *root);
-int btrfs_start_delalloc_roots(struct btrfs_fs_info *fs_info, int nr);
+int btrfs_start_delalloc_roots(struct btrfs_fs_info *fs_info, u64 nr);
 int btrfs_set_extent_delalloc(struct inode *inode, u64 start, u64 end,
 			      unsigned int extra_bits,
 			      struct extent_state **cached_state);
diff --git a/fs/btrfs/dev-replace.c b/fs/btrfs/dev-replace.c
index f639dde2a679..6ff08eb3c35d 100644
--- a/fs/btrfs/dev-replace.c
+++ b/fs/btrfs/dev-replace.c
@@ -593,7 +593,7 @@ static int btrfs_dev_replace_finishing(struct btrfs_fs_info *fs_info,
 	 * flush all outstanding I/O and inode extent mappings before the
 	 * copy operation is declared as being finished
 	 */
-	ret = btrfs_start_delalloc_roots(fs_info, -1);
+	ret = btrfs_start_delalloc_roots(fs_info, U64_MAX);
 	if (ret) {
 		mutex_unlock(&dev_replace->lock_finishing_cancel_unmount);
 		return ret;
diff --git a/fs/btrfs/inode.c b/fs/btrfs/inode.c
index 9320f13778ce..5c6ce78bff1d 100644
--- a/fs/btrfs/inode.c
+++ b/fs/btrfs/inode.c
@@ -9619,7 +9619,8 @@ static struct btrfs_delalloc_work *btrfs_alloc_delalloc_work(struct inode *inode
  * some fairly slow code that needs optimization. This walks the list
  * of all the inodes with pending delalloc and forces them to disk.
  */
-static int start_delalloc_inodes(struct btrfs_root *root, int nr, bool snapshot)
+static int start_delalloc_inodes(struct btrfs_root *root, u64 *nr,
+				 bool snapshot)
 {
 	struct btrfs_inode *binode;
 	struct inode *inode;
@@ -9659,9 +9660,11 @@ static int start_delalloc_inodes(struct btrfs_root *root, int nr, bool snapshot)
 		list_add_tail(&work->list, &works);
 		btrfs_queue_work(root->fs_info->flush_workers,
 				 &work->work);
-		ret++;
-		if (nr != -1 && ret >= nr)
-			goto out;
+		if (*nr != U64_MAX) {
+			(*nr)--;
+			if (*nr == 0)
+				goto out;
+		}
 		cond_resched();
 		spin_lock(&root->delalloc_lock);
 	}
@@ -9686,18 +9689,15 @@ static int start_delalloc_inodes(struct btrfs_root *root, int nr, bool snapshot)
 int btrfs_start_delalloc_snapshot(struct btrfs_root *root)
 {
 	struct btrfs_fs_info *fs_info = root->fs_info;
-	int ret;
+	u64 nr = U64_MAX;
 
 	if (test_bit(BTRFS_FS_STATE_ERROR, &fs_info->fs_state))
 		return -EROFS;
 
-	ret = start_delalloc_inodes(root, -1, true);
-	if (ret > 0)
-		ret = 0;
-	return ret;
+	return start_delalloc_inodes(root, &nr, true);
 }
 
-int btrfs_start_delalloc_roots(struct btrfs_fs_info *fs_info, int nr)
+int btrfs_start_delalloc_roots(struct btrfs_fs_info *fs_info, u64 nr)
 {
 	struct btrfs_root *root;
 	struct list_head splice;
@@ -9720,15 +9720,10 @@ int btrfs_start_delalloc_roots(struct btrfs_fs_info *fs_info, int nr)
 			       &fs_info->delalloc_roots);
 		spin_unlock(&fs_info->delalloc_root_lock);
 
-		ret = start_delalloc_inodes(root, nr, false);
+		ret = start_delalloc_inodes(root, &nr, false);
 		btrfs_put_root(root);
 		if (ret < 0)
 			goto out;
-
-		if (nr != -1) {
-			nr -= ret;
-			WARN_ON(nr < 0);
-		}
 		spin_lock(&fs_info->delalloc_root_lock);
 	}
 	spin_unlock(&fs_info->delalloc_root_lock);
diff --git a/fs/btrfs/ioctl.c b/fs/btrfs/ioctl.c
index ecb6b188df15..442c89502f06 100644
--- a/fs/btrfs/ioctl.c
+++ b/fs/btrfs/ioctl.c
@@ -5510,7 +5510,7 @@ long btrfs_ioctl(struct file *file, unsigned int
 	case BTRFS_IOC_SYNC: {
 		int ret;
 
-		ret = btrfs_start_delalloc_roots(fs_info, -1);
+		ret = btrfs_start_delalloc_roots(fs_info, U64_MAX);
 		if (ret)
 			return ret;
 		ret = btrfs_sync_fs(inode->i_sb, 1);
diff --git a/fs/btrfs/space-info.c b/fs/btrfs/space-info.c
index 01297c5b2666..edda1ee0455e 100644
--- a/fs/btrfs/space-info.c
+++ b/fs/btrfs/space-info.c
@@ -310,7 +310,7 @@ void btrfs_dump_space_info(struct btrfs_fs_info *fs_info,
 }
 
 static void btrfs_writeback_inodes_sb_nr(struct btrfs_fs_info *fs_info,
-					 unsigned long nr_pages, int nr_items)
+					 unsigned long nr_pages, u64 nr_items)
 {
 	struct super_block *sb = fs_info->sb;
 
-- 
2.24.1



* [PATCH 02/23] btrfs: remove orig from shrink_delalloc
  2020-01-31 22:35 [PATCH 00/23][v2] Convert data reservations to the ticketing infrastructure Josef Bacik
  2020-01-31 22:35 ` [PATCH 01/23] btrfs: change nr to u64 in btrfs_start_delalloc_roots Josef Bacik
@ 2020-01-31 22:35 ` Josef Bacik
  2020-02-03 10:46   ` Johannes Thumshirn
  2020-01-31 22:35 ` [PATCH 03/23] btrfs: handle U64_MAX for shrink_delalloc Josef Bacik
                   ` (20 subsequent siblings)
  22 siblings, 1 reply; 56+ messages in thread
From: Josef Bacik @ 2020-01-31 22:35 UTC (permalink / raw)
  To: linux-btrfs, kernel-team; +Cc: Nikolay Borisov

We don't use orig anywhere inside shrink_delalloc(), so remove it.

Reviewed-by: Nikolay Borisov <nborisov@suse.com>
Signed-off-by: Josef Bacik <josef@toxicpanda.com>
---
 fs/btrfs/space-info.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/fs/btrfs/space-info.c b/fs/btrfs/space-info.c
index edda1ee0455e..671c3a379224 100644
--- a/fs/btrfs/space-info.c
+++ b/fs/btrfs/space-info.c
@@ -350,7 +350,7 @@ static inline u64 calc_reclaim_items_nr(struct btrfs_fs_info *fs_info,
  * shrink metadata reservation for delalloc
  */
 static void shrink_delalloc(struct btrfs_fs_info *fs_info, u64 to_reclaim,
-			    u64 orig, bool wait_ordered)
+			    bool wait_ordered)
 {
 	struct btrfs_space_info *space_info;
 	struct btrfs_trans_handle *trans;
@@ -569,7 +569,7 @@ static void flush_space(struct btrfs_fs_info *fs_info,
 		break;
 	case FLUSH_DELALLOC:
 	case FLUSH_DELALLOC_WAIT:
-		shrink_delalloc(fs_info, num_bytes * 2, num_bytes,
+		shrink_delalloc(fs_info, num_bytes * 2,
 				state == FLUSH_DELALLOC_WAIT);
 		break;
 	case FLUSH_DELAYED_REFS_NR:
-- 
2.24.1



* [PATCH 03/23] btrfs: handle U64_MAX for shrink_delalloc
  2020-01-31 22:35 [PATCH 00/23][v2] Convert data reservations to the ticketing infrastructure Josef Bacik
  2020-01-31 22:35 ` [PATCH 01/23] btrfs: change nr to u64 in btrfs_start_delalloc_roots Josef Bacik
  2020-01-31 22:35 ` [PATCH 02/23] btrfs: remove orig from shrink_delalloc Josef Bacik
@ 2020-01-31 22:35 ` Josef Bacik
  2020-02-03 11:01   ` Johannes Thumshirn
  2020-02-03 11:05   ` Johannes Thumshirn
  2020-01-31 22:35 ` [PATCH 04/23] btrfs: make shrink_delalloc take space_info as an arg Josef Bacik
                   ` (19 subsequent siblings)
  22 siblings, 2 replies; 56+ messages in thread
From: Josef Bacik @ 2020-01-31 22:35 UTC (permalink / raw)
  To: linux-btrfs, kernel-team; +Cc: Nikolay Borisov

Data allocations are going to want to pass in U64_MAX for the amount of
space to flush; adjust shrink_delalloc() to handle this properly.
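The arithmetic being added can be sketched standalone (a simplified model of
the patched logic; EXTENT_SIZE_PER_ITEM is 256K in space-info.c of this era,
and shrink_targets() is a made-up helper name): U64_MAX short-circuits to
"flush everything", otherwise the item count is doubled because delalloc bytes
and metadata reservations don't track each other exactly.

```c
#include <stdint.h>

typedef uint64_t u64;
#define U64_MAX UINT64_MAX
#define EXTENT_SIZE_PER_ITEM (256 * 1024ULL) /* SZ_256K in space-info.c */

/* Mirrors calc_reclaim_items_nr(): at least one item. */
static u64 calc_reclaim_items_nr(u64 to_reclaim)
{
	u64 nr = to_reclaim / EXTENT_SIZE_PER_ITEM; /* div_u64() in kernel */

	return nr ? nr : 1;
}

/* Simplified model of the patched shrink_delalloc() setup: compute how
 * many items and bytes to flush for a given reclaim target. */
static void shrink_targets(u64 to_reclaim, u64 *items, u64 *bytes)
{
	if (to_reclaim == U64_MAX) {
		*items = U64_MAX;
		*bytes = U64_MAX;
	} else {
		/* 2x the metadata estimate, per the patch's comment */
		*items = calc_reclaim_items_nr(to_reclaim) * 2;
		*bytes = *items * EXTENT_SIZE_PER_ITEM;
	}
}
```

Note the 2x multiplier moves here from the FLUSH_DELALLOC caller, which is why
the flush_space() hunk drops its own `num_bytes * 2`.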

Reviewed-by: Nikolay Borisov <nborisov@suse.com>
Signed-off-by: Josef Bacik <josef@toxicpanda.com>
---
 fs/btrfs/space-info.c | 17 ++++++++++++++---
 1 file changed, 14 insertions(+), 3 deletions(-)

diff --git a/fs/btrfs/space-info.c b/fs/btrfs/space-info.c
index 671c3a379224..924cee245e4a 100644
--- a/fs/btrfs/space-info.c
+++ b/fs/btrfs/space-info.c
@@ -363,8 +363,19 @@ static void shrink_delalloc(struct btrfs_fs_info *fs_info, u64 to_reclaim,
 	int loops;
 
 	/* Calc the number of the pages we need flush for space reservation */
-	items = calc_reclaim_items_nr(fs_info, to_reclaim);
-	to_reclaim = items * EXTENT_SIZE_PER_ITEM;
+	if (to_reclaim == U64_MAX) {
+		items = U64_MAX;
+	} else {
+		/*
+		 * to_reclaim is set to however much metadata we need to
+		 * reclaim, but reclaiming that much data doesn't really track
+		 * exactly, so increase the amount to reclaim by 2x in order to
+		 * make sure we're flushing enough delalloc to hopefully reclaim
+		 * some metadata reservations.
+		 */
+		items = calc_reclaim_items_nr(fs_info, to_reclaim) * 2;
+		to_reclaim = items * EXTENT_SIZE_PER_ITEM;
+	}
 
 	trans = (struct btrfs_trans_handle *)current->journal_info;
 	space_info = btrfs_find_space_info(fs_info, BTRFS_BLOCK_GROUP_METADATA);
@@ -569,7 +580,7 @@ static void flush_space(struct btrfs_fs_info *fs_info,
 		break;
 	case FLUSH_DELALLOC:
 	case FLUSH_DELALLOC_WAIT:
-		shrink_delalloc(fs_info, num_bytes * 2,
+		shrink_delalloc(fs_info, num_bytes,
 				state == FLUSH_DELALLOC_WAIT);
 		break;
 	case FLUSH_DELAYED_REFS_NR:
-- 
2.24.1



* [PATCH 04/23] btrfs: make shrink_delalloc take space_info as an arg
  2020-01-31 22:35 [PATCH 00/23][v2] Convert data reservations to the ticketing infrastructure Josef Bacik
                   ` (2 preceding siblings ...)
  2020-01-31 22:35 ` [PATCH 03/23] btrfs: handle U64_MAX for shrink_delalloc Josef Bacik
@ 2020-01-31 22:35 ` Josef Bacik
  2020-02-03 10:53   ` Johannes Thumshirn
  2020-01-31 22:35 ` [PATCH 05/23] btrfs: make ALLOC_CHUNK use the space info flags Josef Bacik
                   ` (18 subsequent siblings)
  22 siblings, 1 reply; 56+ messages in thread
From: Josef Bacik @ 2020-01-31 22:35 UTC (permalink / raw)
  To: linux-btrfs, kernel-team; +Cc: Nikolay Borisov

Currently shrink_delalloc() just looks up the metadata space info, but
this won't work if we're trying to reclaim space for data chunks.  The
space_info we want is already passed into flush_space(), so simply pass
that along to shrink_delalloc().

Reviewed-by: Nikolay Borisov <nborisov@suse.com>
Signed-off-by: Josef Bacik <josef@toxicpanda.com>
---
 fs/btrfs/space-info.c | 9 ++++-----
 1 file changed, 4 insertions(+), 5 deletions(-)

diff --git a/fs/btrfs/space-info.c b/fs/btrfs/space-info.c
index 924cee245e4a..17e2b5a53cb5 100644
--- a/fs/btrfs/space-info.c
+++ b/fs/btrfs/space-info.c
@@ -349,10 +349,10 @@ static inline u64 calc_reclaim_items_nr(struct btrfs_fs_info *fs_info,
 /*
  * shrink metadata reservation for delalloc
  */
-static void shrink_delalloc(struct btrfs_fs_info *fs_info, u64 to_reclaim,
-			    bool wait_ordered)
+static void shrink_delalloc(struct btrfs_fs_info *fs_info,
+			    struct btrfs_space_info *space_info,
+			    u64 to_reclaim, bool wait_ordered)
 {
-	struct btrfs_space_info *space_info;
 	struct btrfs_trans_handle *trans;
 	u64 delalloc_bytes;
 	u64 dio_bytes;
@@ -378,7 +378,6 @@ static void shrink_delalloc(struct btrfs_fs_info *fs_info, u64 to_reclaim,
 	}
 
 	trans = (struct btrfs_trans_handle *)current->journal_info;
-	space_info = btrfs_find_space_info(fs_info, BTRFS_BLOCK_GROUP_METADATA);
 
 	delalloc_bytes = percpu_counter_sum_positive(
 						&fs_info->delalloc_bytes);
@@ -580,7 +579,7 @@ static void flush_space(struct btrfs_fs_info *fs_info,
 		break;
 	case FLUSH_DELALLOC:
 	case FLUSH_DELALLOC_WAIT:
-		shrink_delalloc(fs_info, num_bytes,
+		shrink_delalloc(fs_info, space_info, num_bytes,
 				state == FLUSH_DELALLOC_WAIT);
 		break;
 	case FLUSH_DELAYED_REFS_NR:
-- 
2.24.1



* [PATCH 05/23] btrfs: make ALLOC_CHUNK use the space info flags
  2020-01-31 22:35 [PATCH 00/23][v2] Convert data reservations to the ticketing infrastructure Josef Bacik
                   ` (3 preceding siblings ...)
  2020-01-31 22:35 ` [PATCH 04/23] btrfs: make shrink_delalloc take space_info as an arg Josef Bacik
@ 2020-01-31 22:35 ` Josef Bacik
  2020-02-03 11:19   ` Johannes Thumshirn
  2020-01-31 22:35 ` [PATCH 06/23] btrfs: call btrfs_try_granting_tickets when freeing reserved bytes Josef Bacik
                   ` (17 subsequent siblings)
  22 siblings, 1 reply; 56+ messages in thread
From: Josef Bacik @ 2020-01-31 22:35 UTC (permalink / raw)
  To: linux-btrfs, kernel-team; +Cc: Nikolay Borisov

We have traditionally used flush_space() to flush metadata space, so
we've been unconditionally using btrfs_metadata_alloc_profile() for the
profile when allocating a chunk.  However, if we're going to use this for
data we need to use btrfs_get_alloc_profile() on the space_info we pass
in.

Reviewed-by: Nikolay Borisov <nborisov@suse.com>
Signed-off-by: Josef Bacik <josef@toxicpanda.com>
---
 fs/btrfs/space-info.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/fs/btrfs/space-info.c b/fs/btrfs/space-info.c
index 17e2b5a53cb5..5a92851af2b3 100644
--- a/fs/btrfs/space-info.c
+++ b/fs/btrfs/space-info.c
@@ -604,7 +604,8 @@ static void flush_space(struct btrfs_fs_info *fs_info,
 			break;
 		}
 		ret = btrfs_chunk_alloc(trans,
-				btrfs_metadata_alloc_profile(fs_info),
+				btrfs_get_alloc_profile(fs_info,
+							space_info->flags),
 				(state == ALLOC_CHUNK) ? CHUNK_ALLOC_NO_FORCE :
 					CHUNK_ALLOC_FORCE);
 		btrfs_end_transaction(trans);
-- 
2.24.1



* [PATCH 06/23] btrfs: call btrfs_try_granting_tickets when freeing reserved bytes
  2020-01-31 22:35 [PATCH 00/23][v2] Convert data reservations to the ticketing infrastructure Josef Bacik
                   ` (4 preceding siblings ...)
  2020-01-31 22:35 ` [PATCH 05/23] btrfs: make ALLOC_CHUNK use the space info flags Josef Bacik
@ 2020-01-31 22:35 ` Josef Bacik
  2020-01-31 22:35 ` [PATCH 07/23] btrfs: call btrfs_try_granting_tickets when unpinning anything Josef Bacik
                   ` (16 subsequent siblings)
  22 siblings, 0 replies; 56+ messages in thread
From: Josef Bacik @ 2020-01-31 22:35 UTC (permalink / raw)
  To: linux-btrfs, kernel-team; +Cc: Nikolay Borisov

We were missing a call to btrfs_try_granting_tickets() in
btrfs_free_reserved_bytes(), so add it to handle the case where we're
able to satisfy an allocation because we've freed a pending reservation.

Reviewed-by: Nikolay Borisov <nborisov@suse.com>
Signed-off-by: Josef Bacik <josef@toxicpanda.com>
---
 fs/btrfs/block-group.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/fs/btrfs/block-group.c b/fs/btrfs/block-group.c
index 77ec0597bd17..616d0dd69394 100644
--- a/fs/btrfs/block-group.c
+++ b/fs/btrfs/block-group.c
@@ -2932,6 +2932,8 @@ void btrfs_free_reserved_bytes(struct btrfs_block_group *cache,
 	if (delalloc)
 		cache->delalloc_bytes -= num_bytes;
 	spin_unlock(&cache->lock);
+
+	btrfs_try_granting_tickets(cache->fs_info, space_info);
 	spin_unlock(&space_info->lock);
 }
 
-- 
2.24.1



* [PATCH 07/23] btrfs: call btrfs_try_granting_tickets when unpinning anything
  2020-01-31 22:35 [PATCH 00/23][v2] Convert data reservations to the ticketing infrastructure Josef Bacik
                   ` (5 preceding siblings ...)
  2020-01-31 22:35 ` [PATCH 06/23] btrfs: call btrfs_try_granting_tickets when freeing reserved bytes Josef Bacik
@ 2020-01-31 22:35 ` Josef Bacik
  2020-02-03 11:24   ` Johannes Thumshirn
  2020-01-31 22:35 ` [PATCH 08/23] btrfs: call btrfs_try_granting_tickets when reserving space Josef Bacik
                   ` (15 subsequent siblings)
  22 siblings, 1 reply; 56+ messages in thread
From: Josef Bacik @ 2020-01-31 22:35 UTC (permalink / raw)
  To: linux-btrfs, kernel-team; +Cc: Nikolay Borisov

When unpinning we were only calling btrfs_try_granting_tickets() if
global_rsv->space_info == space_info, which is problematic because we
use ticketing for SYSTEM chunks, and want to use it for DATA as well.
Fix this by moving this call outside of that if statement.

Reviewed-by: Nikolay Borisov <nborisov@suse.com>
Signed-off-by: Josef Bacik <josef@toxicpanda.com>
---
 fs/btrfs/extent-tree.c | 7 +++----
 1 file changed, 3 insertions(+), 4 deletions(-)

diff --git a/fs/btrfs/extent-tree.c b/fs/btrfs/extent-tree.c
index 0783341ef2e7..dfa810854850 100644
--- a/fs/btrfs/extent-tree.c
+++ b/fs/btrfs/extent-tree.c
@@ -2885,11 +2885,10 @@ static int unpin_extent_range(struct btrfs_fs_info *fs_info,
 				len -= to_add;
 			}
 			spin_unlock(&global_rsv->lock);
-			/* Add to any tickets we may have */
-			if (len)
-				btrfs_try_granting_tickets(fs_info,
-							   space_info);
 		}
+		/* Add to any tickets we may have */
+		if (!readonly && return_free_space && len)
+			btrfs_try_granting_tickets(fs_info, space_info);
 		spin_unlock(&space_info->lock);
 	}
 
-- 
2.24.1



* [PATCH 08/23] btrfs: call btrfs_try_granting_tickets when reserving space
  2020-01-31 22:35 [PATCH 00/23][v2] Convert data reservations to the ticketing infrastructure Josef Bacik
                   ` (6 preceding siblings ...)
  2020-01-31 22:35 ` [PATCH 07/23] btrfs: call btrfs_try_granting_tickets when unpinning anything Josef Bacik
@ 2020-01-31 22:35 ` Josef Bacik
  2020-01-31 22:35 ` [PATCH 09/23] btrfs: use the btrfs_space_info_free_bytes_may_use helper for delalloc Josef Bacik
                   ` (14 subsequent siblings)
  22 siblings, 0 replies; 56+ messages in thread
From: Josef Bacik @ 2020-01-31 22:35 UTC (permalink / raw)
  To: linux-btrfs, kernel-team; +Cc: Nikolay Borisov

With compression on we can end up using less space than we reserved,
freeing up the excess and thus making it possible to satisfy another
space reservation.  Add the call for this scenario.
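The situation is easy to model (illustrative only, not the kernel structs or
function: reserve_compressed() is a made-up name): we reserve ram_bytes up
front at the uncompressed size, the compressed extent only occupies num_bytes,
and the difference becomes immediately grantable to waiting tickets.

```c
#include <stdint.h>

typedef uint64_t u64;

/* Toy space accounting: bytes_may_use holds the optimistic reservation
 * made at write time (uncompressed size), bytes_reserved what the extent
 * actually occupies.  Returns how many bytes become grantable to waiting
 * tickets when compression shrank the extent. */
static u64 reserve_compressed(u64 *bytes_may_use, u64 *bytes_reserved,
			      u64 ram_bytes, u64 num_bytes)
{
	*bytes_may_use -= ram_bytes;  /* drop the optimistic reservation */
	*bytes_reserved += num_bytes; /* account the real extent size */

	/* surplus that btrfs_try_granting_tickets() could hand out */
	return ram_bytes > num_bytes ? ram_bytes - num_bytes : 0;
}
```

When the extent is incompressible, ram_bytes == num_bytes and the surplus is
zero, matching the `num_bytes < ram_bytes` guard in the patch.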

Reviewed-by: Nikolay Borisov <nborisov@suse.com>
Signed-off-by: Josef Bacik <josef@toxicpanda.com>
---
 fs/btrfs/block-group.c | 7 +++++++
 1 file changed, 7 insertions(+)

diff --git a/fs/btrfs/block-group.c b/fs/btrfs/block-group.c
index 616d0dd69394..aca4510f1f2d 100644
--- a/fs/btrfs/block-group.c
+++ b/fs/btrfs/block-group.c
@@ -2899,6 +2899,13 @@ int btrfs_add_reserved_bytes(struct btrfs_block_group *cache,
 						      space_info, -ram_bytes);
 		if (delalloc)
 			cache->delalloc_bytes += num_bytes;
+
+		/*
+		 * Compression can use less space than we reserved, so wake
+		 * tickets if that happens.
+		 */
+		if (num_bytes < ram_bytes)
+			btrfs_try_granting_tickets(cache->fs_info, space_info);
 	}
 	spin_unlock(&cache->lock);
 	spin_unlock(&space_info->lock);
-- 
2.24.1



* [PATCH 09/23] btrfs: use the btrfs_space_info_free_bytes_may_use helper for delalloc
  2020-01-31 22:35 [PATCH 00/23][v2] Convert data reservations to the ticketing infrastructure Josef Bacik
                   ` (7 preceding siblings ...)
  2020-01-31 22:35 ` [PATCH 08/23] btrfs: call btrfs_try_granting_tickets when reserving space Josef Bacik
@ 2020-01-31 22:35 ` Josef Bacik
  2020-02-03 14:46   ` Johannes Thumshirn
  2020-01-31 22:36 ` [PATCH 10/23] btrfs: use btrfs_start_delalloc_roots in shrink_delalloc Josef Bacik
                   ` (13 subsequent siblings)
  22 siblings, 1 reply; 56+ messages in thread
From: Josef Bacik @ 2020-01-31 22:35 UTC (permalink / raw)
  To: linux-btrfs, kernel-team; +Cc: Nikolay Borisov

We are going to use the ticket infrastructure for data, so use the
btrfs_space_info_free_bytes_may_use() helper in
btrfs_free_reserved_data_space_noquota() so we get the
try_granting_tickets call when we free our reservation.

Reviewed-by: Nikolay Borisov <nborisov@suse.com>
Signed-off-by: Josef Bacik <josef@toxicpanda.com>
---
 fs/btrfs/delalloc-space.c | 4 +---
 1 file changed, 1 insertion(+), 3 deletions(-)

diff --git a/fs/btrfs/delalloc-space.c b/fs/btrfs/delalloc-space.c
index 4cdac4d834f5..08cfef49f88b 100644
--- a/fs/btrfs/delalloc-space.c
+++ b/fs/btrfs/delalloc-space.c
@@ -179,9 +179,7 @@ void btrfs_free_reserved_data_space_noquota(struct inode *inode, u64 start,
 	start = round_down(start, fs_info->sectorsize);
 
 	data_sinfo = fs_info->data_sinfo;
-	spin_lock(&data_sinfo->lock);
-	btrfs_space_info_update_bytes_may_use(fs_info, data_sinfo, -len);
-	spin_unlock(&data_sinfo->lock);
+	btrfs_space_info_free_bytes_may_use(fs_info, data_sinfo, len);
 }
 
 /*
-- 
2.24.1



* [PATCH 10/23] btrfs: use btrfs_start_delalloc_roots in shrink_delalloc
  2020-01-31 22:35 [PATCH 00/23][v2] Convert data reservations to the ticketing infrastructure Josef Bacik
                   ` (8 preceding siblings ...)
  2020-01-31 22:35 ` [PATCH 09/23] btrfs: use the btrfs_space_info_free_bytes_may_use helper for delalloc Josef Bacik
@ 2020-01-31 22:36 ` Josef Bacik
  2020-02-03 12:03   ` Nikolay Borisov
  2020-01-31 22:36 ` [PATCH 11/23] btrfs: check tickets after waiting on ordered extents Josef Bacik
                   ` (12 subsequent siblings)
  22 siblings, 1 reply; 56+ messages in thread
From: Josef Bacik @ 2020-01-31 22:36 UTC (permalink / raw)
  To: linux-btrfs, kernel-team

The original iteration of flushing had us flushing delalloc and then
checking to see if we could make our reservation, thus we were very
careful about how many pages we would flush at once.

But now that everything is async and we satisfy tickets as the space
becomes available, we don't have to keep track of any of this; simply try
to flush the number of dirty inodes we may have in order to reclaim
space for our reservation.  This cleans up our delalloc flushing
significantly.

The async_pages logic is dropped because btrfs_start_delalloc_roots()
already handles the case where we generate async extents, so we no
longer need it.

Signed-off-by: Josef Bacik <josef@toxicpanda.com>
---
 fs/btrfs/space-info.c | 55 +------------------------------------------
 1 file changed, 1 insertion(+), 54 deletions(-)

diff --git a/fs/btrfs/space-info.c b/fs/btrfs/space-info.c
index 5a92851af2b3..a69e3d9057ff 100644
--- a/fs/btrfs/space-info.c
+++ b/fs/btrfs/space-info.c
@@ -309,28 +309,6 @@ void btrfs_dump_space_info(struct btrfs_fs_info *fs_info,
 	up_read(&info->groups_sem);
 }
 
-static void btrfs_writeback_inodes_sb_nr(struct btrfs_fs_info *fs_info,
-					 unsigned long nr_pages, u64 nr_items)
-{
-	struct super_block *sb = fs_info->sb;
-
-	if (down_read_trylock(&sb->s_umount)) {
-		writeback_inodes_sb_nr(sb, nr_pages, WB_REASON_FS_FREE_SPACE);
-		up_read(&sb->s_umount);
-	} else {
-		/*
-		 * We needn't worry the filesystem going from r/w to r/o though
-		 * we don't acquire ->s_umount mutex, because the filesystem
-		 * should guarantee the delalloc inodes list be empty after
-		 * the filesystem is readonly(all dirty pages are written to
-		 * the disk).
-		 */
-		btrfs_start_delalloc_roots(fs_info, nr_items);
-		if (!current->journal_info)
-			btrfs_wait_ordered_roots(fs_info, nr_items, 0, (u64)-1);
-	}
-}
-
 static inline u64 calc_reclaim_items_nr(struct btrfs_fs_info *fs_info,
 					u64 to_reclaim)
 {
@@ -356,10 +334,8 @@ static void shrink_delalloc(struct btrfs_fs_info *fs_info,
 	struct btrfs_trans_handle *trans;
 	u64 delalloc_bytes;
 	u64 dio_bytes;
-	u64 async_pages;
 	u64 items;
 	long time_left;
-	unsigned long nr_pages;
 	int loops;
 
 	/* Calc the number of the pages we need flush for space reservation */
@@ -400,37 +376,8 @@ static void shrink_delalloc(struct btrfs_fs_info *fs_info,
 
 	loops = 0;
 	while ((delalloc_bytes || dio_bytes) && loops < 3) {
-		nr_pages = min(delalloc_bytes, to_reclaim) >> PAGE_SHIFT;
-
-		/*
-		 * Triggers inode writeback for up to nr_pages. This will invoke
-		 * ->writepages callback and trigger delalloc filling
-		 *  (btrfs_run_delalloc_range()).
-		 */
-		btrfs_writeback_inodes_sb_nr(fs_info, nr_pages, items);
-
-		/*
-		 * We need to wait for the compressed pages to start before
-		 * we continue.
-		 */
-		async_pages = atomic_read(&fs_info->async_delalloc_pages);
-		if (!async_pages)
-			goto skip_async;
-
-		/*
-		 * Calculate how many compressed pages we want to be written
-		 * before we continue. I.e if there are more async pages than we
-		 * require wait_event will wait until nr_pages are written.
-		 */
-		if (async_pages <= nr_pages)
-			async_pages = 0;
-		else
-			async_pages -= nr_pages;
+		btrfs_start_delalloc_roots(fs_info, items);
 
-		wait_event(fs_info->async_submit_wait,
-			   atomic_read(&fs_info->async_delalloc_pages) <=
-			   (int)async_pages);
-skip_async:
 		spin_lock(&space_info->lock);
 		if (list_empty(&space_info->tickets) &&
 		    list_empty(&space_info->priority_tickets)) {
-- 
2.24.1



* [PATCH 11/23] btrfs: check tickets after waiting on ordered extents
  2020-01-31 22:35 [PATCH 00/23][v2] Convert data reservations to the ticketing infrastructure Josef Bacik
                   ` (9 preceding siblings ...)
  2020-01-31 22:36 ` [PATCH 10/23] btrfs: use btrfs_start_delalloc_roots in shrink_delalloc Josef Bacik
@ 2020-01-31 22:36 ` Josef Bacik
  2020-02-03 13:10   ` Nikolay Borisov
  2020-01-31 22:36 ` [PATCH 12/23] btrfs: add the data transaction commit logic into may_commit_transaction Josef Bacik
                   ` (11 subsequent siblings)
  22 siblings, 1 reply; 56+ messages in thread
From: Josef Bacik @ 2020-01-31 22:36 UTC (permalink / raw)
  To: linux-btrfs, kernel-team

Right now, if the space is freed up after the ordered extents complete
(which is likely, since the reservations are held until they complete),
we would do extra delalloc flushing before noticing that we no longer
have any tickets.  Fix this by moving the ticket check after the ordered
extent wait.

Signed-off-by: Josef Bacik <josef@toxicpanda.com>
---
 fs/btrfs/space-info.c | 17 +++++++++--------
 1 file changed, 9 insertions(+), 8 deletions(-)

diff --git a/fs/btrfs/space-info.c b/fs/btrfs/space-info.c
index a69e3d9057ff..955f59f4b1d0 100644
--- a/fs/btrfs/space-info.c
+++ b/fs/btrfs/space-info.c
@@ -378,14 +378,6 @@ static void shrink_delalloc(struct btrfs_fs_info *fs_info,
 	while ((delalloc_bytes || dio_bytes) && loops < 3) {
 		btrfs_start_delalloc_roots(fs_info, items);
 
-		spin_lock(&space_info->lock);
-		if (list_empty(&space_info->tickets) &&
-		    list_empty(&space_info->priority_tickets)) {
-			spin_unlock(&space_info->lock);
-			break;
-		}
-		spin_unlock(&space_info->lock);
-
 		loops++;
 		if (wait_ordered && !trans) {
 			btrfs_wait_ordered_roots(fs_info, items, 0, (u64)-1);
@@ -394,6 +386,15 @@ static void shrink_delalloc(struct btrfs_fs_info *fs_info,
 			if (time_left)
 				break;
 		}
+
+		spin_lock(&space_info->lock);
+		if (list_empty(&space_info->tickets) &&
+		    list_empty(&space_info->priority_tickets)) {
+			spin_unlock(&space_info->lock);
+			break;
+		}
+		spin_unlock(&space_info->lock);
+
 		delalloc_bytes = percpu_counter_sum_positive(
 						&fs_info->delalloc_bytes);
 		dio_bytes = percpu_counter_sum_positive(&fs_info->dio_bytes);
-- 
2.24.1



* [PATCH 12/23] btrfs: add the data transaction commit logic into may_commit_transaction
  2020-01-31 22:35 [PATCH 00/23][v2] Convert data reservations to the ticketing infrastructure Josef Bacik
                   ` (10 preceding siblings ...)
  2020-01-31 22:36 ` [PATCH 11/23] btrfs: check tickets after waiting on ordered extents Josef Bacik
@ 2020-01-31 22:36 ` Josef Bacik
  2020-02-03 14:00   ` Nikolay Borisov
  2020-01-31 22:36 ` [PATCH 13/23] btrfs: add flushing states for handling data reservations Josef Bacik
                   ` (10 subsequent siblings)
  22 siblings, 1 reply; 56+ messages in thread
From: Josef Bacik @ 2020-01-31 22:36 UTC (permalink / raw)
  To: linux-btrfs, kernel-team

Data space flushing currently commits the transaction twice in a row
unconditionally, and only on the third and final commit does it first
check whether there are enough pinned extents to satisfy its
reservation.

Encode this logic into may_commit_transaction().  In the next patch we
will pass in U64_MAX for bytes_needed the first two times, and the final
time we will pass in the actual bytes we need so the normal logic will
apply.

This patch exists solely to keep the logical changes I will make to the
flushing state machine separate, making it easier to bisect any
performance related regressions.
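
The new entry check can be distilled into a hedged userspace sketch
(flag value and names are placeholders, not the kernel definitions): a
data reservation passing U64_MAX wants the old unconditional commit,
while anything else falls through to the normal pinned-bytes
accounting:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define MODEL_BLOCK_GROUP_DATA	(1ULL << 0)	/* placeholder flag bit */

/* Does this call take the do_commit shortcut in the model of
 * may_commit_transaction()? */
static bool commits_unconditionally(uint64_t space_flags,
				    uint64_t bytes_needed)
{
	return (space_flags & MODEL_BLOCK_GROUP_DATA) &&
	       bytes_needed == UINT64_MAX;
}
```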

Signed-off-by: Josef Bacik <josef@toxicpanda.com>
---
 fs/btrfs/space-info.c | 24 +++++++++++++++++++-----
 1 file changed, 19 insertions(+), 5 deletions(-)

diff --git a/fs/btrfs/space-info.c b/fs/btrfs/space-info.c
index 955f59f4b1d0..9f8ef2a09ad9 100644
--- a/fs/btrfs/space-info.c
+++ b/fs/btrfs/space-info.c
@@ -412,20 +412,32 @@ static void shrink_delalloc(struct btrfs_fs_info *fs_info,
  * will return -ENOSPC.
  */
 static int may_commit_transaction(struct btrfs_fs_info *fs_info,
-				  struct btrfs_space_info *space_info)
+				  struct btrfs_space_info *space_info,
+				  u64 bytes_needed)
 {
 	struct reserve_ticket *ticket = NULL;
 	struct btrfs_block_rsv *delayed_rsv = &fs_info->delayed_block_rsv;
 	struct btrfs_block_rsv *delayed_refs_rsv = &fs_info->delayed_refs_rsv;
 	struct btrfs_trans_handle *trans;
-	u64 bytes_needed;
 	u64 reclaim_bytes = 0;
 	u64 cur_free_bytes = 0;
+	bool do_commit = false;
 
 	trans = (struct btrfs_trans_handle *)current->journal_info;
 	if (trans)
 		return -EAGAIN;
 
+	/*
+	 * If we are data and have passed in U64_MAX we just want to
+	 * unconditionally commit the transaction to match the previous data
+	 * flushing behavior.
+	 */
+	if ((space_info->flags & BTRFS_BLOCK_GROUP_DATA) &&
+	   bytes_needed == U64_MAX) {
+		do_commit = true;
+		goto check_pinned;
+	}
+
 	spin_lock(&space_info->lock);
 	cur_free_bytes = btrfs_space_info_used(space_info, true);
 	if (cur_free_bytes < space_info->total_bytes)
@@ -439,7 +451,7 @@ static int may_commit_transaction(struct btrfs_fs_info *fs_info,
 	else if (!list_empty(&space_info->tickets))
 		ticket = list_first_entry(&space_info->tickets,
 					  struct reserve_ticket, list);
-	bytes_needed = (ticket) ? ticket->bytes : 0;
+	bytes_needed = (ticket) ? ticket->bytes : bytes_needed;
 
 	if (bytes_needed > cur_free_bytes)
 		bytes_needed -= cur_free_bytes;
@@ -450,6 +462,7 @@ static int may_commit_transaction(struct btrfs_fs_info *fs_info,
 	if (!bytes_needed)
 		return 0;
 
+check_pinned:
 	trans = btrfs_join_transaction(fs_info->extent_root);
 	if (IS_ERR(trans))
 		return PTR_ERR(trans);
@@ -459,7 +472,8 @@ static int may_commit_transaction(struct btrfs_fs_info *fs_info,
 	 * we have block groups that are going to be freed, allowing us to
 	 * possibly do a chunk allocation the next loop through.
 	 */
-	if (test_bit(BTRFS_TRANS_HAVE_FREE_BGS, &trans->transaction->flags) ||
+	if (do_commit ||
+	    test_bit(BTRFS_TRANS_HAVE_FREE_BGS, &trans->transaction->flags) ||
 	    __percpu_counter_compare(&space_info->total_bytes_pinned,
 				     bytes_needed,
 				     BTRFS_TOTAL_BYTES_PINNED_BATCH) >= 0)
@@ -570,7 +584,7 @@ static void flush_space(struct btrfs_fs_info *fs_info,
 		btrfs_wait_on_delayed_iputs(fs_info);
 		break;
 	case COMMIT_TRANS:
-		ret = may_commit_transaction(fs_info, space_info);
+		ret = may_commit_transaction(fs_info, space_info, num_bytes);
 		break;
 	default:
 		ret = -ENOSPC;
-- 
2.24.1



* [PATCH 13/23] btrfs: add flushing states for handling data reservations
  2020-01-31 22:35 [PATCH 00/23][v2] Convert data reservations to the ticketing infrastructure Josef Bacik
                   ` (11 preceding siblings ...)
  2020-01-31 22:36 ` [PATCH 12/23] btrfs: add the data transaction commit logic into may_commit_transaction Josef Bacik
@ 2020-01-31 22:36 ` Josef Bacik
  2020-02-03 14:00   ` Nikolay Borisov
  2020-01-31 22:36 ` [PATCH 14/23] btrfs: add btrfs_reserve_data_bytes and use it Josef Bacik
                   ` (9 subsequent siblings)
  22 siblings, 1 reply; 56+ messages in thread
From: Josef Bacik @ 2020-01-31 22:36 UTC (permalink / raw)
  To: linux-btrfs, kernel-team

Currently the way we do data reservations is by seeing if we have enough
space in our space_info.  If we do not, and we're a normal inode, we'll

1) Attempt to force a chunk allocation until we can't anymore.
2) If that fails we'll flush delalloc, then commit the transaction, then
   run the delayed iputs.

If we are a free space inode we're only allowed to force a chunk
allocation.  In order to use the normal flushing mechanism we need to
encode this into a flush state array for normal inodes.  Since both
cases start by allocating chunks until the space info is full, there is
no need to add chunk allocation as a flush state; it will be handled
specially.
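
A minimal userspace sketch of the split described above (enum names
mirror the kernel's, values are placeholders): a normal inode walks all
three states, while a free space inode gets no states at all and may
only force chunk allocation, which both cases do up front anyway:

```c
#include <assert.h>

enum model_flush_state {
	FLUSH_DELALLOC_WAIT,
	COMMIT_TRANS,
	RUN_DELAYED_IPUTS,
};

/* Mirrors the data_flush_states[] array added by this patch. */
static const enum model_flush_state data_flush_states[] = {
	FLUSH_DELALLOC_WAIT,
	COMMIT_TRANS,
	RUN_DELAYED_IPUTS,
};

static int nr_data_flush_states(int is_free_space_inode)
{
	if (is_free_space_inode)
		return 0;	/* chunk allocation only, no flush states */
	return (int)(sizeof(data_flush_states) /
		     sizeof(data_flush_states[0]));
}
```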

Signed-off-by: Josef Bacik <josef@toxicpanda.com>
---
 fs/btrfs/ctree.h      | 2 ++
 fs/btrfs/space-info.c | 6 ++++++
 2 files changed, 8 insertions(+)

diff --git a/fs/btrfs/ctree.h b/fs/btrfs/ctree.h
index 6afa0885a9bb..865b24a1759e 100644
--- a/fs/btrfs/ctree.h
+++ b/fs/btrfs/ctree.h
@@ -2526,6 +2526,8 @@ enum btrfs_reserve_flush_enum {
 	 */
 	BTRFS_RESERVE_FLUSH_LIMIT,
 	BTRFS_RESERVE_FLUSH_EVICT,
+	BTRFS_RESERVE_FLUSH_DATA,
+	BTRFS_RESERVE_FLUSH_FREE_SPACE_INODE,
 	BTRFS_RESERVE_FLUSH_ALL,
 };
 
diff --git a/fs/btrfs/space-info.c b/fs/btrfs/space-info.c
index 9f8ef2a09ad9..ad203717269c 100644
--- a/fs/btrfs/space-info.c
+++ b/fs/btrfs/space-info.c
@@ -816,6 +816,12 @@ static const enum btrfs_flush_state evict_flush_states[] = {
 	COMMIT_TRANS,
 };
 
+static const enum btrfs_flush_state data_flush_states[] = {
+	FLUSH_DELALLOC_WAIT,
+	COMMIT_TRANS,
+	RUN_DELAYED_IPUTS,
+};
+
 static void priority_reclaim_metadata_space(struct btrfs_fs_info *fs_info,
 				struct btrfs_space_info *space_info,
 				struct reserve_ticket *ticket,
-- 
2.24.1



* [PATCH 14/23] btrfs: add btrfs_reserve_data_bytes and use it
  2020-01-31 22:35 [PATCH 00/23][v2] Convert data reservations to the ticketing infrastructure Josef Bacik
                   ` (12 preceding siblings ...)
  2020-01-31 22:36 ` [PATCH 13/23] btrfs: add flushing states for handling data reservations Josef Bacik
@ 2020-01-31 22:36 ` Josef Bacik
  2020-02-03 14:20   ` Nikolay Borisov
  2020-01-31 22:36 ` [PATCH 15/23] btrfs: use ticketing for data space reservations Josef Bacik
                   ` (8 subsequent siblings)
  22 siblings, 1 reply; 56+ messages in thread
From: Josef Bacik @ 2020-01-31 22:36 UTC (permalink / raw)
  To: linux-btrfs, kernel-team

Create a new function btrfs_reserve_data_bytes() in order to handle data
reservations.  This uses the new flush types and flush states to handle
making data reservations.

This patch specifically does not change any functionality, and is
purposefully not cleaned up in order to make bisection easier for the
future patches.  The new helper is identical to the old helper in how it
handles data reservations.  We first try to force a chunk allocation,
and then we run through the flush states all at once and in the same
order that they were done with the old helper.

Subsequent patches will clean this up and change the behavior of the
flushing, and it is important to keep those changes separate so we can
easily bisect down to the patch that caused the regression, rather than
the patch that made us start using the new infrastructure.
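
The commit_cycles behavior the new helper preserves can be modeled in
userspace (all names hypothetical).  One pass walks the state list; the
first two passes run everything and hand the commit step U64_MAX
(commit unconditionally), while the final pass skips delalloc flushing
and hands the commit step the real reservation size:

```c
#include <assert.h>
#include <stdint.h>

enum pass_state { PASS_DELALLOC_WAIT, PASS_COMMIT_TRANS, PASS_DELAYED_IPUTS };

/* Returns how many states actually run in one pass and records, via
 * *commit_bytes, the size handed to the commit step. */
static int run_flush_pass(int commit_cycles, uint64_t bytes,
			  uint64_t *commit_bytes)
{
	static const enum pass_state states[] = {
		PASS_DELALLOC_WAIT, PASS_COMMIT_TRANS, PASS_DELAYED_IPUTS,
	};
	int i, ran = 0;

	for (i = 0; i < 3; i++) {
		uint64_t flush_bytes = UINT64_MAX;

		if (!commit_cycles) {
			if (states[i] == PASS_DELALLOC_WAIT)
				continue;	/* skipped on the last pass */
			if (states[i] == PASS_COMMIT_TRANS)
				flush_bytes = bytes;	/* real check */
		}
		if (states[i] == PASS_COMMIT_TRANS)
			*commit_bytes = flush_bytes;
		ran++;	/* flush_space(fs_info, ..., flush_bytes, state) */
	}
	return ran;
}
```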

Signed-off-by: Josef Bacik <josef@toxicpanda.com>
---
 fs/btrfs/delalloc-space.c | 119 ++------------------------------------
 fs/btrfs/space-info.c     |  92 +++++++++++++++++++++++++++++
 fs/btrfs/space-info.h     |   2 +
 3 files changed, 98 insertions(+), 115 deletions(-)

diff --git a/fs/btrfs/delalloc-space.c b/fs/btrfs/delalloc-space.c
index 08cfef49f88b..c13d8609cc99 100644
--- a/fs/btrfs/delalloc-space.c
+++ b/fs/btrfs/delalloc-space.c
@@ -13,126 +13,15 @@ int btrfs_alloc_data_chunk_ondemand(struct btrfs_inode *inode, u64 bytes)
 {
 	struct btrfs_root *root = inode->root;
 	struct btrfs_fs_info *fs_info = root->fs_info;
-	struct btrfs_space_info *data_sinfo = fs_info->data_sinfo;
-	u64 used;
-	int ret = 0;
-	int need_commit = 2;
-	int have_pinned_space;
+	enum btrfs_reserve_flush_enum flush = BTRFS_RESERVE_FLUSH_DATA;
 
 	/* Make sure bytes are sectorsize aligned */
 	bytes = ALIGN(bytes, fs_info->sectorsize);
 
-	if (btrfs_is_free_space_inode(inode)) {
-		need_commit = 0;
-		ASSERT(current->journal_info);
-	}
-
-again:
-	/* Make sure we have enough space to handle the data first */
-	spin_lock(&data_sinfo->lock);
-	used = btrfs_space_info_used(data_sinfo, true);
-
-	if (used + bytes > data_sinfo->total_bytes) {
-		struct btrfs_trans_handle *trans;
-
-		/*
-		 * If we don't have enough free bytes in this space then we need
-		 * to alloc a new chunk.
-		 */
-		if (!data_sinfo->full) {
-			u64 alloc_target;
-
-			data_sinfo->force_alloc = CHUNK_ALLOC_FORCE;
-			spin_unlock(&data_sinfo->lock);
-
-			alloc_target = btrfs_data_alloc_profile(fs_info);
-			/*
-			 * It is ugly that we don't call nolock join
-			 * transaction for the free space inode case here.
-			 * But it is safe because we only do the data space
-			 * reservation for the free space cache in the
-			 * transaction context, the common join transaction
-			 * just increase the counter of the current transaction
-			 * handler, doesn't try to acquire the trans_lock of
-			 * the fs.
-			 */
-			trans = btrfs_join_transaction(root);
-			if (IS_ERR(trans))
-				return PTR_ERR(trans);
-
-			ret = btrfs_chunk_alloc(trans, alloc_target,
-						CHUNK_ALLOC_NO_FORCE);
-			btrfs_end_transaction(trans);
-			if (ret < 0) {
-				if (ret != -ENOSPC)
-					return ret;
-				else {
-					have_pinned_space = 1;
-					goto commit_trans;
-				}
-			}
-
-			goto again;
-		}
-
-		/*
-		 * If we don't have enough pinned space to deal with this
-		 * allocation, and no removed chunk in current transaction,
-		 * don't bother committing the transaction.
-		 */
-		have_pinned_space = __percpu_counter_compare(
-			&data_sinfo->total_bytes_pinned,
-			used + bytes - data_sinfo->total_bytes,
-			BTRFS_TOTAL_BYTES_PINNED_BATCH);
-		spin_unlock(&data_sinfo->lock);
-
-		/* Commit the current transaction and try again */
-commit_trans:
-		if (need_commit) {
-			need_commit--;
-
-			if (need_commit > 0) {
-				btrfs_start_delalloc_roots(fs_info, -1);
-				btrfs_wait_ordered_roots(fs_info, U64_MAX, 0,
-							 (u64)-1);
-			}
-
-			trans = btrfs_join_transaction(root);
-			if (IS_ERR(trans))
-				return PTR_ERR(trans);
-			if (have_pinned_space >= 0 ||
-			    test_bit(BTRFS_TRANS_HAVE_FREE_BGS,
-				     &trans->transaction->flags) ||
-			    need_commit > 0) {
-				ret = btrfs_commit_transaction(trans);
-				if (ret)
-					return ret;
-				/*
-				 * The cleaner kthread might still be doing iput
-				 * operations. Wait for it to finish so that
-				 * more space is released.  We don't need to
-				 * explicitly run the delayed iputs here because
-				 * the commit_transaction would have woken up
-				 * the cleaner.
-				 */
-				ret = btrfs_wait_on_delayed_iputs(fs_info);
-				if (ret)
-					return ret;
-				goto again;
-			} else {
-				btrfs_end_transaction(trans);
-			}
-		}
-
-		trace_btrfs_space_reservation(fs_info,
-					      "space_info:enospc",
-					      data_sinfo->flags, bytes, 1);
-		return -ENOSPC;
-	}
-	btrfs_space_info_update_bytes_may_use(fs_info, data_sinfo, bytes);
-	spin_unlock(&data_sinfo->lock);
+	if (btrfs_is_free_space_inode(inode))
+		flush = BTRFS_RESERVE_FLUSH_FREE_SPACE_INODE;
 
-	return 0;
+	return btrfs_reserve_data_bytes(fs_info, bytes, flush);
 }
 
 int btrfs_check_data_free_space(struct inode *inode,
diff --git a/fs/btrfs/space-info.c b/fs/btrfs/space-info.c
index ad203717269c..6b71f6d3a348 100644
--- a/fs/btrfs/space-info.c
+++ b/fs/btrfs/space-info.c
@@ -1082,3 +1082,95 @@ int btrfs_reserve_metadata_bytes(struct btrfs_root *root,
 	}
 	return ret;
 }
+
+/**
+ * btrfs_reserve_data_bytes - try to reserve data bytes for an allocation
+ * @fs_info - the filesystem we are allocating in
+ * @bytes - the number of bytes we need
+ * @flush - how we are allowed to flush
+ *
+ * This will reserve bytes from the data space info.  If there is not enough
+ * space then we will attempt to flush space as specified by @flush.
+ */
+int btrfs_reserve_data_bytes(struct btrfs_fs_info *fs_info, u64 bytes,
+			     enum btrfs_reserve_flush_enum flush)
+{
+	struct btrfs_space_info *data_sinfo = fs_info->data_sinfo;
+	const enum btrfs_flush_state *states = NULL;
+	u64 used;
+	int states_nr = 0;
+	int commit_cycles = 2;
+	int ret = -ENOSPC;
+
+	ASSERT(!current->journal_info || flush != BTRFS_RESERVE_FLUSH_DATA);
+
+	if (flush == BTRFS_RESERVE_FLUSH_DATA) {
+		states = data_flush_states;
+		states_nr = ARRAY_SIZE(data_flush_states);
+	}
+
+	spin_lock(&data_sinfo->lock);
+again:
+	used = btrfs_space_info_used(data_sinfo, true);
+
+	if (used + bytes > data_sinfo->total_bytes) {
+		u64 prev_total_bytes = data_sinfo->total_bytes;
+		int flush_state = 0;
+
+		spin_unlock(&data_sinfo->lock);
+
+		/*
+		 * Everybody can force chunk allocation, so try this first to
+		 * see if we can just bail here and make our reservation.
+		 */
+		flush_space(fs_info, data_sinfo, bytes, ALLOC_CHUNK_FORCE);
+		spin_lock(&data_sinfo->lock);
+		if (prev_total_bytes < data_sinfo->total_bytes)
+			goto again;
+		spin_unlock(&data_sinfo->lock);
+
+		/*
+		 * Cycle through the rest of the flushing options for our flush
+		 * type, then try again.
+		 */
+		while (flush_state < states_nr) {
+			u64 flush_bytes = U64_MAX;
+
+			/*
+			 * Previously we unconditionally committed the
+			 * transaction twice before finally checking against
+			 * pinned space before committing the final time.  We
+			 * also skipped flushing delalloc the final pass
+			 * through.
+			 */
+			if (!commit_cycles) {
+				if (states[flush_state] == FLUSH_DELALLOC_WAIT) {
+					flush_state++;
+					continue;
+				}
+				if (states[flush_state] == COMMIT_TRANS)
+					flush_bytes = bytes;
+			}
+
+			flush_space(fs_info, data_sinfo, flush_bytes,
+				    states[flush_state]);
+			flush_state++;
+		}
+
+		if (!commit_cycles)
+			goto out;
+
+		commit_cycles--;
+		spin_lock(&data_sinfo->lock);
+		goto again;
+	}
+	btrfs_space_info_update_bytes_may_use(fs_info, data_sinfo, bytes);
+	ret = 0;
+	spin_unlock(&data_sinfo->lock);
+out:
+	if (ret)
+		trace_btrfs_space_reservation(fs_info,
+					      "space_info:enospc",
+					      data_sinfo->flags, bytes, 1);
+	return ret;
+}
diff --git a/fs/btrfs/space-info.h b/fs/btrfs/space-info.h
index 24514cd2c6c1..179f757c4a6b 100644
--- a/fs/btrfs/space-info.h
+++ b/fs/btrfs/space-info.h
@@ -141,5 +141,7 @@ static inline void btrfs_space_info_free_bytes_may_use(
 	btrfs_try_granting_tickets(fs_info, space_info);
 	spin_unlock(&space_info->lock);
 }
+int btrfs_reserve_data_bytes(struct btrfs_fs_info *fs_info, u64 bytes,
+			     enum btrfs_reserve_flush_enum flush);
 
 #endif /* BTRFS_SPACE_INFO_H */
-- 
2.24.1



* [PATCH 15/23] btrfs: use ticketing for data space reservations
  2020-01-31 22:35 [PATCH 00/23][v2] Convert data reservations to the ticketing infrastructure Josef Bacik
                   ` (13 preceding siblings ...)
  2020-01-31 22:36 ` [PATCH 14/23] btrfs: add btrfs_reserve_data_bytes and use it Josef Bacik
@ 2020-01-31 22:36 ` Josef Bacik
  2020-02-03 14:29   ` Nikolay Borisov
  2020-01-31 22:36 ` [PATCH 16/23] btrfs: serialize data reservations if we are flushing Josef Bacik
                   ` (7 subsequent siblings)
  22 siblings, 1 reply; 56+ messages in thread
From: Josef Bacik @ 2020-01-31 22:36 UTC (permalink / raw)
  To: linux-btrfs, kernel-team

Now that we have all the infrastructure in place, use the ticketing
infrastructure to make data allocations.  This still maintains the exact
same flushing behavior, but now we're using tickets to get our
reservations satisfied.
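
The ticketed reserve path can be sketched in a minimal userspace model
(the struct is a stand-in for btrfs_space_info, not the real thing): a
reservation that fits bumps the in-use counter under the lock, and
anything else queues a ticket for the flushing machinery to satisfy:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

struct space_info_model {
	uint64_t total_bytes;
	uint64_t bytes_used;	/* stands in for btrfs_space_info_used() */
	int queued_tickets;
};

/* Returns true when the reservation was made directly, false when a
 * ticket was queued and the caller would wait on it. */
static bool reserve_or_queue(struct space_info_model *si, uint64_t bytes)
{
	if (si->bytes_used + bytes <= si->total_bytes) {
		si->bytes_used += bytes;	/* direct reservation */
		return true;
	}
	si->queued_tickets++;	/* ticket added to the priority list */
	return false;
}
```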

Signed-off-by: Josef Bacik <josef@toxicpanda.com>
---
 fs/btrfs/space-info.c | 128 ++++++++++++++++++++++--------------------
 1 file changed, 68 insertions(+), 60 deletions(-)

diff --git a/fs/btrfs/space-info.c b/fs/btrfs/space-info.c
index 6b71f6d3a348..43c5775bcbc6 100644
--- a/fs/btrfs/space-info.c
+++ b/fs/btrfs/space-info.c
@@ -829,7 +829,7 @@ static void priority_reclaim_metadata_space(struct btrfs_fs_info *fs_info,
 				int states_nr)
 {
 	u64 to_reclaim;
-	int flush_state;
+	int flush_state = 0;
 
 	spin_lock(&space_info->lock);
 	to_reclaim = btrfs_calc_reclaim_metadata_size(fs_info, space_info);
@@ -839,7 +839,6 @@ static void priority_reclaim_metadata_space(struct btrfs_fs_info *fs_info,
 	}
 	spin_unlock(&space_info->lock);
 
-	flush_state = 0;
 	do {
 		flush_space(fs_info, space_info, to_reclaim, states[flush_state]);
 		flush_state++;
@@ -852,6 +851,54 @@ static void priority_reclaim_metadata_space(struct btrfs_fs_info *fs_info,
 	} while (flush_state < states_nr);
 }
 
+static void priority_reclaim_data_space(struct btrfs_fs_info *fs_info,
+					struct btrfs_space_info *space_info,
+					struct reserve_ticket *ticket,
+					const enum btrfs_flush_state *states,
+					int states_nr)
+{
+	int flush_state = 0;
+	int commit_cycles = 2;
+
+	while (!space_info->full) {
+		flush_space(fs_info, space_info, U64_MAX, ALLOC_CHUNK_FORCE);
+		spin_lock(&space_info->lock);
+		if (ticket->bytes == 0) {
+			spin_unlock(&space_info->lock);
+			return;
+		}
+		spin_unlock(&space_info->lock);
+	}
+again:
+	while (flush_state < states_nr) {
+		u64 flush_bytes = U64_MAX;
+
+		if (!commit_cycles) {
+			if (states[flush_state] == FLUSH_DELALLOC_WAIT) {
+				flush_state++;
+				continue;
+			}
+			if (states[flush_state] == COMMIT_TRANS)
+				flush_bytes = ticket->bytes;
+		}
+
+		flush_space(fs_info, space_info, flush_bytes,
+			    states[flush_state]);
+		spin_lock(&space_info->lock);
+		if (ticket->bytes == 0) {
+			spin_unlock(&space_info->lock);
+			return;
+		}
+		spin_unlock(&space_info->lock);
+		flush_state++;
+	}
+	if (commit_cycles) {
+		commit_cycles--;
+		flush_state = 0;
+		goto again;
+	}
+}
+
 static void wait_reserve_ticket(struct btrfs_fs_info *fs_info,
 				struct btrfs_space_info *space_info,
 				struct reserve_ticket *ticket)
@@ -917,6 +964,15 @@ static int handle_reserve_ticket(struct btrfs_fs_info *fs_info,
 						evict_flush_states,
 						ARRAY_SIZE(evict_flush_states));
 		break;
+	case BTRFS_RESERVE_FLUSH_DATA:
+		priority_reclaim_data_space(fs_info, space_info, ticket,
+					data_flush_states,
+					ARRAY_SIZE(data_flush_states));
+		break;
+	case BTRFS_RESERVE_FLUSH_FREE_SPACE_INODE:
+		priority_reclaim_data_space(fs_info, space_info, ticket,
+					    NULL, 0);
+		break;
 	default:
 		ASSERT(0);
 		break;
@@ -1096,78 +1152,30 @@ int btrfs_reserve_data_bytes(struct btrfs_fs_info *fs_info, u64 bytes,
 			     enum btrfs_reserve_flush_enum flush)
 {
 	struct btrfs_space_info *data_sinfo = fs_info->data_sinfo;
-	const enum btrfs_flush_state *states = NULL;
 	u64 used;
-	int states_nr = 0;
-	int commit_cycles = 2;
 	int ret = -ENOSPC;
 
 	ASSERT(!current->journal_info || flush != BTRFS_RESERVE_FLUSH_DATA);
 
-	if (flush == BTRFS_RESERVE_FLUSH_DATA) {
-		states = data_flush_states;
-		states_nr = ARRAY_SIZE(data_flush_states);
-	}
-
 	spin_lock(&data_sinfo->lock);
-again:
 	used = btrfs_space_info_used(data_sinfo, true);
 
 	if (used + bytes > data_sinfo->total_bytes) {
-		u64 prev_total_bytes = data_sinfo->total_bytes;
-		int flush_state = 0;
+		struct reserve_ticket ticket;
 
+		init_waitqueue_head(&ticket.wait);
+		ticket.bytes = bytes;
+		ticket.error = 0;
+		list_add_tail(&ticket.list, &data_sinfo->priority_tickets);
 		spin_unlock(&data_sinfo->lock);
 
-		/*
-		 * Everybody can force chunk allocation, so try this first to
-		 * see if we can just bail here and make our reservation.
-		 */
-		flush_space(fs_info, data_sinfo, bytes, ALLOC_CHUNK_FORCE);
-		spin_lock(&data_sinfo->lock);
-		if (prev_total_bytes < data_sinfo->total_bytes)
-			goto again;
+		ret = handle_reserve_ticket(fs_info, data_sinfo, &ticket,
+					    flush);
+	} else {
+		btrfs_space_info_update_bytes_may_use(fs_info, data_sinfo, bytes);
+		ret = 0;
 		spin_unlock(&data_sinfo->lock);
-
-		/*
-		 * Cycle through the rest of the flushing options for our flush
-		 * type, then try again.
-		 */
-		while (flush_state < states_nr) {
-			u64 flush_bytes = U64_MAX;
-
-			/*
-			 * Previously we unconditionally committed the
-			 * transaction twice before finally checking against
-			 * pinned space before committing the final time.  We
-			 * also skipped flushing delalloc the final pass
-			 * through.
-			 */
-			if (!commit_cycles) {
-				if (states[flush_state] == FLUSH_DELALLOC_WAIT) {
-					flush_state++;
-					continue;
-				}
-				if (states[flush_state] == COMMIT_TRANS)
-					flush_bytes = bytes;
-			}
-
-			flush_space(fs_info, data_sinfo, flush_bytes,
-				    states[flush_state]);
-			flush_state++;
-		}
-
-		if (!commit_cycles)
-			goto out;
-
-		commit_cycles--;
-		spin_lock(&data_sinfo->lock);
-		goto again;
 	}
-	btrfs_space_info_update_bytes_may_use(fs_info, data_sinfo, bytes);
-	ret = 0;
-	spin_unlock(&data_sinfo->lock);
-out:
 	if (ret)
 		trace_btrfs_space_reservation(fs_info,
 					      "space_info:enospc",
-- 
2.24.1



* [PATCH 16/23] btrfs: serialize data reservations if we are flushing
  2020-01-31 22:35 [PATCH 00/23][v2] Convert data reservations to the ticketing infrastructure Josef Bacik
                   ` (14 preceding siblings ...)
  2020-01-31 22:36 ` [PATCH 15/23] btrfs: use ticketing for data space reservations Josef Bacik
@ 2020-01-31 22:36 ` Josef Bacik
  2020-02-03 15:06   ` Nikolay Borisov
  2020-01-31 22:36 ` [PATCH 17/23] btrfs: use the same helper for data and metadata reservations Josef Bacik
                   ` (6 subsequent siblings)
  22 siblings, 1 reply; 56+ messages in thread
From: Josef Bacik @ 2020-01-31 22:36 UTC (permalink / raw)
  To: linux-btrfs, kernel-team

Nikolay reported a problem where generic/371 would fail sometimes with a
slow drive.  The gist of the test is that we fallocate a file in
parallel with a pwrite of a different file.  These two files combined
are smaller than the file system, but sometimes the pwrite would ENOSPC.

A fair bit of investigation uncovered the fact that the fallocate
workload was racing in and grabbing the free space that the pwrite
workload was trying to free up so it could make its own reservation.
After a few loops of this eventually the pwrite workload would error out
with an ENOSPC.

We've had the same problem with metadata as well, and we serialized all
metadata allocations to solve it.  This wasn't usually a problem with
data because data reservations are more straightforward, but obviously
it could still happen.

Fix this by not allowing reservations to occur if there are any pending
tickets waiting to be satisfied on the space info.
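
The new admission check amounts to the following hypothetical sketch:
even when the reservation would fit, any already-queued ticket forces
the new request onto the ticket list too, so late arrivals cannot steal
space that flushing just freed for an earlier waiter:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Userspace model of the pending_tickets condition this patch adds. */
static bool must_queue_ticket(uint64_t used, uint64_t bytes,
			      uint64_t total_bytes, int nr_pending_tickets)
{
	bool pending_tickets = nr_pending_tickets > 0;

	return pending_tickets || used + bytes > total_bytes;
}
```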

Signed-off-by: Josef Bacik <josef@toxicpanda.com>
---
 fs/btrfs/space-info.c | 6 +++++-
 1 file changed, 5 insertions(+), 1 deletion(-)

diff --git a/fs/btrfs/space-info.c b/fs/btrfs/space-info.c
index 43c5775bcbc6..97379524bac8 100644
--- a/fs/btrfs/space-info.c
+++ b/fs/btrfs/space-info.c
@@ -1154,13 +1154,17 @@ int btrfs_reserve_data_bytes(struct btrfs_fs_info *fs_info, u64 bytes,
 	struct btrfs_space_info *data_sinfo = fs_info->data_sinfo;
 	u64 used;
 	int ret = -ENOSPC;
+	bool pending_tickets;
 
 	ASSERT(!current->journal_info || flush != BTRFS_RESERVE_FLUSH_DATA);
 
 	spin_lock(&data_sinfo->lock);
 	used = btrfs_space_info_used(data_sinfo, true);
+	pending_tickets = !list_empty(&data_sinfo->tickets) ||
+		!list_empty(&data_sinfo->priority_tickets);
 
-	if (used + bytes > data_sinfo->total_bytes) {
+	if (pending_tickets ||
+	    used + bytes > data_sinfo->total_bytes) {
 		struct reserve_ticket ticket;
 
 		init_waitqueue_head(&ticket.wait);
-- 
2.24.1



* [PATCH 17/23] btrfs: use the same helper for data and metadata reservations
  2020-01-31 22:35 [PATCH 00/23][v2] Convert data reservations to the ticketing infrastructure Josef Bacik
                   ` (15 preceding siblings ...)
  2020-01-31 22:36 ` [PATCH 16/23] btrfs: serialize data reservations if we are flushing Josef Bacik
@ 2020-01-31 22:36 ` Josef Bacik
  2020-02-03 15:47   ` Nikolay Borisov
  2020-01-31 22:36 ` [PATCH 18/23] btrfs: drop the commit_cycles stuff for data reservations Josef Bacik
                   ` (5 subsequent siblings)
  22 siblings, 1 reply; 56+ messages in thread
From: Josef Bacik @ 2020-01-31 22:36 UTC (permalink / raw)
  To: linux-btrfs, kernel-team

Now that data reservations follow the same pattern as metadata
reservations we can simply rename __reserve_metadata_bytes to
__reserve_bytes and use that helper for data reservations.

Signed-off-by: Josef Bacik <josef@toxicpanda.com>
---
 fs/btrfs/space-info.c | 48 +++++++++++++------------------------------
 1 file changed, 14 insertions(+), 34 deletions(-)

diff --git a/fs/btrfs/space-info.c b/fs/btrfs/space-info.c
index 97379524bac8..13a3692a0122 100644
--- a/fs/btrfs/space-info.c
+++ b/fs/btrfs/space-info.c
@@ -1016,10 +1016,9 @@ static int handle_reserve_ticket(struct btrfs_fs_info *fs_info,
  * regain reservations will be made and this will fail if there is not enough
  * space already.
  */
-static int __reserve_metadata_bytes(struct btrfs_fs_info *fs_info,
-				    struct btrfs_space_info *space_info,
-				    u64 orig_bytes,
-				    enum btrfs_reserve_flush_enum flush)
+static int __reserve_bytes(struct btrfs_fs_info *fs_info,
+			   struct btrfs_space_info *space_info, u64 orig_bytes,
+			   enum btrfs_reserve_flush_enum flush)
 {
 	struct reserve_ticket ticket;
 	u64 used;
@@ -1119,8 +1118,8 @@ int btrfs_reserve_metadata_bytes(struct btrfs_root *root,
 	struct btrfs_block_rsv *global_rsv = &fs_info->global_block_rsv;
 	int ret;
 
-	ret = __reserve_metadata_bytes(fs_info, block_rsv->space_info,
-				       orig_bytes, flush);
+	ret = __reserve_bytes(fs_info, block_rsv->space_info, orig_bytes,
+			      flush);
 	if (ret == -ENOSPC &&
 	    unlikely(root->orphan_cleanup_state == ORPHAN_CLEANUP_STARTED)) {
 		if (block_rsv != global_rsv &&
@@ -1152,37 +1151,18 @@ int btrfs_reserve_data_bytes(struct btrfs_fs_info *fs_info, u64 bytes,
 			     enum btrfs_reserve_flush_enum flush)
 {
 	struct btrfs_space_info *data_sinfo = fs_info->data_sinfo;
-	u64 used;
-	int ret = -ENOSPC;
-	bool pending_tickets;
+	int ret;
 
+	ASSERT(flush == BTRFS_RESERVE_FLUSH_DATA ||
+	       flush == BTRFS_RESERVE_FLUSH_FREE_SPACE_INODE);
 	ASSERT(!current->journal_info || flush != BTRFS_RESERVE_FLUSH_DATA);
 
-	spin_lock(&data_sinfo->lock);
-	used = btrfs_space_info_used(data_sinfo, true);
-	pending_tickets = !list_empty(&data_sinfo->tickets) ||
-		!list_empty(&data_sinfo->priority_tickets);
-
-	if (pending_tickets ||
-	    used + bytes > data_sinfo->total_bytes) {
-		struct reserve_ticket ticket;
-
-		init_waitqueue_head(&ticket.wait);
-		ticket.bytes = bytes;
-		ticket.error = 0;
-		list_add_tail(&ticket.list, &data_sinfo->priority_tickets);
-		spin_unlock(&data_sinfo->lock);
-
-		ret = handle_reserve_ticket(fs_info, data_sinfo, &ticket,
-					    flush);
-	} else {
-		btrfs_space_info_update_bytes_may_use(fs_info, data_sinfo, bytes);
-		ret = 0;
-		spin_unlock(&data_sinfo->lock);
-	}
-	if (ret)
-		trace_btrfs_space_reservation(fs_info,
-					      "space_info:enospc",
+	ret = __reserve_bytes(fs_info, data_sinfo, bytes, flush);
+	if (ret == -ENOSPC) {
+		trace_btrfs_space_reservation(fs_info, "space_info:enospc",
 					      data_sinfo->flags, bytes, 1);
+		if (btrfs_test_opt(fs_info, ENOSPC_DEBUG))
+			btrfs_dump_space_info(fs_info, data_sinfo, bytes, 0);
+	}
 	return ret;
 }
-- 
2.24.1



* [PATCH 18/23] btrfs: drop the commit_cycles stuff for data reservations
  2020-01-31 22:35 [PATCH 00/23][v2] Convert data reservations to the ticketing infrastructure Josef Bacik
                   ` (16 preceding siblings ...)
  2020-01-31 22:36 ` [PATCH 17/23] btrfs: use the same helper for data and metadata reservations Josef Bacik
@ 2020-01-31 22:36 ` Josef Bacik
  2020-02-03 16:02   ` Nikolay Borisov
  2020-01-31 22:36 ` [PATCH 19/23] btrfs: don't pass bytes_needed to may_commit_transaction Josef Bacik
                   ` (4 subsequent siblings)
  22 siblings, 1 reply; 56+ messages in thread
From: Josef Bacik @ 2020-01-31 22:36 UTC (permalink / raw)
  To: linux-btrfs, kernel-team

This is an old wart left over from how we previously did data
reservations.  Before, other tasks could race in and take a reservation
while we were flushing space, so we needed to make sure we looped a few
times before giving up.  Now that we're using the ticketing
infrastructure we don't have to worry about this and can drop the logic
altogether.
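
With pending tickets serializing reservations, the reclaim path
collapses to a single walk of the state list that stops as soon as the
ticket is satisfied.  A hypothetical model, where satisfied_after is
how many flushes it takes before ticket->bytes would hit zero:

```c
#include <assert.h>

/* One-pass data reclaim, no commit_cycles retries. */
static int flushes_until_done(int states_nr, int satisfied_after)
{
	int flush_state = 0, flushes = 0;

	while (flush_state < states_nr) {
		flushes++;	/* flush_space(..., U64_MAX, state) */
		if (flushes >= satisfied_after)
			break;	/* ticket->bytes == 0, stop early */
		flush_state++;
	}
	return flushes;
}
```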

Signed-off-by: Josef Bacik <josef@toxicpanda.com>
---
 fs/btrfs/space-info.c | 22 ++--------------------
 1 file changed, 2 insertions(+), 20 deletions(-)

diff --git a/fs/btrfs/space-info.c b/fs/btrfs/space-info.c
index 13a3692a0122..3060754a3341 100644
--- a/fs/btrfs/space-info.c
+++ b/fs/btrfs/space-info.c
@@ -858,7 +858,6 @@ static void priority_reclaim_data_space(struct btrfs_fs_info *fs_info,
 					int states_nr)
 {
 	int flush_state = 0;
-	int commit_cycles = 2;
 
 	while (!space_info->full) {
 		flush_space(fs_info, space_info, U64_MAX, ALLOC_CHUNK_FORCE);
@@ -869,21 +868,9 @@ static void priority_reclaim_data_space(struct btrfs_fs_info *fs_info,
 		}
 		spin_unlock(&space_info->lock);
 	}
-again:
-	while (flush_state < states_nr) {
-		u64 flush_bytes = U64_MAX;
-
-		if (!commit_cycles) {
-			if (states[flush_state] == FLUSH_DELALLOC_WAIT) {
-				flush_state++;
-				continue;
-			}
-			if (states[flush_state] == COMMIT_TRANS)
-				flush_bytes = ticket->bytes;
-		}
 
-		flush_space(fs_info, space_info, flush_bytes,
-			    states[flush_state]);
+	while (flush_state < states_nr) {
+		flush_space(fs_info, space_info, U64_MAX, states[flush_state]);
 		spin_lock(&space_info->lock);
 		if (ticket->bytes == 0) {
 			spin_unlock(&space_info->lock);
@@ -892,11 +879,6 @@ static void priority_reclaim_data_space(struct btrfs_fs_info *fs_info,
 		spin_unlock(&space_info->lock);
 		flush_state++;
 	}
-	if (commit_cycles) {
-		commit_cycles--;
-		flush_state = 0;
-		goto again;
-	}
 }
 
 static void wait_reserve_ticket(struct btrfs_fs_info *fs_info,
-- 
2.24.1


^ permalink raw reply related	[flat|nested] 56+ messages in thread
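The control flow this patch leaves behind can be modeled in a few lines of
standalone C.  This is an illustrative sketch only (the ticket struct, the
state names, and the per-state reclaim amounts are made up, not kernel
code): with commit_cycles gone, the priority reclaim path walks the state
list exactly once and stops as soon as the ticket is satisfied.

```c
#include <assert.h>

/* Illustrative model of the simplified priority flush loop. */
enum flush_state { FLUSH_DELALLOC_WAIT, RUN_DELAYED_IPUTS, COMMIT_TRANS };

struct ticket { unsigned long long bytes; };

/* Pretend each state reclaims a fixed amount toward the ticket. */
static void flush_space(struct ticket *t, enum flush_state s)
{
	unsigned long long reclaimed[] = { 4096, 8192, 1u << 20 };
	unsigned long long r = reclaimed[s];

	t->bytes = (t->bytes > r) ? t->bytes - r : 0;
}

static int priority_reclaim(struct ticket *t,
			    const enum flush_state *states, int nr)
{
	/* No "again:" label and no commit_cycles counter: one pass only. */
	for (int i = 0; i < nr; i++) {
		flush_space(t, states[i]);
		if (t->bytes == 0)
			return 0;	/* ticket satisfied */
	}
	return -1;			/* caller fails the ticket (ENOSPC) */
}
```

The point of the sketch is only the shape of the loop: previously the
equivalent walk could be restarted up to two extra times via commit_cycles.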

* [PATCH 19/23] btrfs: don't pass bytes_needed to may_commit_transaction
  2020-01-31 22:35 [PATCH 00/23][v2] Convert data reservations to the ticketing infrastructure Josef Bacik
                   ` (17 preceding siblings ...)
  2020-01-31 22:36 ` [PATCH 18/23] btrfs: drop the commit_cycles stuff for data reservations Josef Bacik
@ 2020-01-31 22:36 ` Josef Bacik
  2020-02-03 16:10   ` Nikolay Borisov
  2020-01-31 22:36 ` [PATCH 20/23] btrfs: don't force commit if we are data Josef Bacik
                   ` (3 subsequent siblings)
  22 siblings, 1 reply; 56+ messages in thread
From: Josef Bacik @ 2020-01-31 22:36 UTC (permalink / raw)
  To: linux-btrfs, kernel-team

This was put in place to mirror the way data flushing handled committing
the transaction.  Now that we no longer loop on committing the
transaction, simply force a transaction commit if we are flushing data
space.

Signed-off-by: Josef Bacik <josef@toxicpanda.com>
---
 fs/btrfs/space-info.c | 16 +++++++---------
 1 file changed, 7 insertions(+), 9 deletions(-)

diff --git a/fs/btrfs/space-info.c b/fs/btrfs/space-info.c
index 3060754a3341..cef14a4d4167 100644
--- a/fs/btrfs/space-info.c
+++ b/fs/btrfs/space-info.c
@@ -412,14 +412,14 @@ static void shrink_delalloc(struct btrfs_fs_info *fs_info,
  * will return -ENOSPC.
  */
 static int may_commit_transaction(struct btrfs_fs_info *fs_info,
-				  struct btrfs_space_info *space_info,
-				  u64 bytes_needed)
+				  struct btrfs_space_info *space_info)
 {
 	struct reserve_ticket *ticket = NULL;
 	struct btrfs_block_rsv *delayed_rsv = &fs_info->delayed_block_rsv;
 	struct btrfs_block_rsv *delayed_refs_rsv = &fs_info->delayed_refs_rsv;
 	struct btrfs_trans_handle *trans;
 	u64 reclaim_bytes = 0;
+	u64 bytes_needed;
 	u64 cur_free_bytes = 0;
 	bool do_commit = false;
 
@@ -428,12 +428,10 @@ static int may_commit_transaction(struct btrfs_fs_info *fs_info,
 		return -EAGAIN;
 
 	/*
-	 * If we are data and have passed in U64_MAX we just want to
-	 * unconditionally commit the transaction to match the previous data
-	 * flushing behavior.
+	 * If we are data just force the commit, we aren't likely to do this
+	 * over and over again.
 	 */
-	if ((space_info->flags & BTRFS_BLOCK_GROUP_DATA) &&
-	   bytes_needed == U64_MAX) {
+	if (space_info->flags & BTRFS_BLOCK_GROUP_DATA) {
 		do_commit = true;
 		goto check_pinned;
 	}
@@ -451,7 +449,7 @@ static int may_commit_transaction(struct btrfs_fs_info *fs_info,
 	else if (!list_empty(&space_info->tickets))
 		ticket = list_first_entry(&space_info->tickets,
 					  struct reserve_ticket, list);
-	bytes_needed = (ticket) ? ticket->bytes : bytes_needed;
+	bytes_needed = (ticket) ? ticket->bytes : 0;
 
 	if (bytes_needed > cur_free_bytes)
 		bytes_needed -= cur_free_bytes;
@@ -584,7 +582,7 @@ static void flush_space(struct btrfs_fs_info *fs_info,
 		btrfs_wait_on_delayed_iputs(fs_info);
 		break;
 	case COMMIT_TRANS:
-		ret = may_commit_transaction(fs_info, space_info, num_bytes);
+		ret = may_commit_transaction(fs_info, space_info);
 		break;
 	default:
 		ret = -ENOSPC;
-- 
2.24.1


^ permalink raw reply related	[flat|nested] 56+ messages in thread
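As a rough model of the new behavior (illustrative standalone C, not the
kernel code; the helper name and ticket struct are made up): bytes_needed
is now always derived from the head ticket, priority tickets first, minus
whatever space is already free, instead of being passed in by the caller.

```c
#include <assert.h>
#include <stddef.h>

struct ticket { unsigned long long bytes; };

/* Model of the bytes_needed calculation after this change. */
static unsigned long long
bytes_needed_from_tickets(const struct ticket *prio_head,
			  const struct ticket *head,
			  unsigned long long cur_free)
{
	/* Priority tickets are consulted before normal ones. */
	const struct ticket *t = prio_head ? prio_head : head;
	unsigned long long needed = t ? t->bytes : 0;

	if (needed > cur_free)
		needed -= cur_free;
	else
		needed = 0;
	return needed;
}
```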

* [PATCH 20/23] btrfs: don't force commit if we are data
  2020-01-31 22:35 [PATCH 00/23][v2] Convert data reservations to the ticketing infrastructure Josef Bacik
                   ` (18 preceding siblings ...)
  2020-01-31 22:36 ` [PATCH 19/23] btrfs: don't pass bytes_needed to may_commit_transaction Josef Bacik
@ 2020-01-31 22:36 ` Josef Bacik
  2020-02-03 16:12   ` Nikolay Borisov
  2020-01-31 22:36 ` [PATCH 21/23] btrfs: run delayed iputs before committing the transaction for data Josef Bacik
                   ` (2 subsequent siblings)
  22 siblings, 1 reply; 56+ messages in thread
From: Josef Bacik @ 2020-01-31 22:36 UTC (permalink / raw)
  To: linux-btrfs, kernel-team

We used to unconditionally commit the transaction at least twice, and
then on the third try check against the pinned space to make sure
committing the transaction was worth the effort.  This is overkill: with
tickets we know nobody is going to steal our reservation, so if we can't
satisfy our reservation with the pinned amount simply bail out.

Signed-off-by: Josef Bacik <josef@toxicpanda.com>
---
 fs/btrfs/space-info.c | 14 +-------------
 1 file changed, 1 insertion(+), 13 deletions(-)

diff --git a/fs/btrfs/space-info.c b/fs/btrfs/space-info.c
index cef14a4d4167..0c2d8e66cf5e 100644
--- a/fs/btrfs/space-info.c
+++ b/fs/btrfs/space-info.c
@@ -421,21 +421,11 @@ static int may_commit_transaction(struct btrfs_fs_info *fs_info,
 	u64 reclaim_bytes = 0;
 	u64 bytes_needed;
 	u64 cur_free_bytes = 0;
-	bool do_commit = false;
 
 	trans = (struct btrfs_trans_handle *)current->journal_info;
 	if (trans)
 		return -EAGAIN;
 
-	/*
-	 * If we are data just force the commit, we aren't likely to do this
-	 * over and over again.
-	 */
-	if (space_info->flags & BTRFS_BLOCK_GROUP_DATA) {
-		do_commit = true;
-		goto check_pinned;
-	}
-
 	spin_lock(&space_info->lock);
 	cur_free_bytes = btrfs_space_info_used(space_info, true);
 	if (cur_free_bytes < space_info->total_bytes)
@@ -460,7 +450,6 @@ static int may_commit_transaction(struct btrfs_fs_info *fs_info,
 	if (!bytes_needed)
 		return 0;
 
-check_pinned:
 	trans = btrfs_join_transaction(fs_info->extent_root);
 	if (IS_ERR(trans))
 		return PTR_ERR(trans);
@@ -470,8 +459,7 @@ static int may_commit_transaction(struct btrfs_fs_info *fs_info,
 	 * we have block groups that are going to be freed, allowing us to
 	 * possibly do a chunk allocation the next loop through.
 	 */
-	if (do_commit ||
-	    test_bit(BTRFS_TRANS_HAVE_FREE_BGS, &trans->transaction->flags) ||
+	if (test_bit(BTRFS_TRANS_HAVE_FREE_BGS, &trans->transaction->flags) ||
 	    __percpu_counter_compare(&space_info->total_bytes_pinned,
 				     bytes_needed,
 				     BTRFS_TOTAL_BYTES_PINNED_BATCH) >= 0)
-- 
2.24.1


^ permalink raw reply related	[flat|nested] 56+ messages in thread

* [PATCH 21/23] btrfs: run delayed iputs before committing the transaction for data
  2020-01-31 22:35 [PATCH 00/23][v2] Convert data reservations to the ticketing infrastructure Josef Bacik
                   ` (19 preceding siblings ...)
  2020-01-31 22:36 ` [PATCH 20/23] btrfs: don't force commit if we are data Josef Bacik
@ 2020-01-31 22:36 ` Josef Bacik
  2020-02-03 16:13   ` Nikolay Borisov
  2020-01-31 22:36 ` [PATCH 22/23] btrfs: flush delayed refs when trying to reserve data space Josef Bacik
  2020-01-31 22:36 ` [PATCH 23/23] btrfs: do async reclaim for data reservations Josef Bacik
  22 siblings, 1 reply; 56+ messages in thread
From: Josef Bacik @ 2020-01-31 22:36 UTC (permalink / raw)
  To: linux-btrfs, kernel-team

Previously we waited on delayed iputs after committing the transaction,
but this doesn't make much sense.  We want to run the delayed iputs
first, because they add pinned space, which makes us more likely to be
able to commit the transaction and reclaim that space.  Fix this by
running the delayed iputs before committing the transaction.

Signed-off-by: Josef Bacik <josef@toxicpanda.com>
---
 fs/btrfs/space-info.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/fs/btrfs/space-info.c b/fs/btrfs/space-info.c
index 0c2d8e66cf5e..c86fad4174f1 100644
--- a/fs/btrfs/space-info.c
+++ b/fs/btrfs/space-info.c
@@ -804,8 +804,8 @@ static const enum btrfs_flush_state evict_flush_states[] = {
 
 static const enum btrfs_flush_state data_flush_states[] = {
 	FLUSH_DELALLOC_WAIT,
-	COMMIT_TRANS,
 	RUN_DELAYED_IPUTS,
+	COMMIT_TRANS,
 };
 
 static void priority_reclaim_metadata_space(struct btrfs_fs_info *fs_info,
-- 
2.24.1


^ permalink raw reply related	[flat|nested] 56+ messages in thread

* [PATCH 22/23] btrfs: flush delayed refs when trying to reserve data space
  2020-01-31 22:35 [PATCH 00/23][v2] Convert data reservations to the ticketing infrastructure Josef Bacik
                   ` (20 preceding siblings ...)
  2020-01-31 22:36 ` [PATCH 21/23] btrfs: run delayed iputs before committing the transaction for data Josef Bacik
@ 2020-01-31 22:36 ` Josef Bacik
  2020-02-03 16:16   ` Nikolay Borisov
  2020-01-31 22:36 ` [PATCH 23/23] btrfs: do async reclaim for data reservations Josef Bacik
  22 siblings, 1 reply; 56+ messages in thread
From: Josef Bacik @ 2020-01-31 22:36 UTC (permalink / raw)
  To: linux-btrfs, kernel-team

We can end up with freed extents still sitting in the delayed refs, in
which case may_commit_transaction() may not think we have enough pinned
space to commit the transaction, and we'll ENOSPC early.  Handle this by
running the delayed refs in order to make sure pinned is up to date
before we try to commit the transaction.

Signed-off-by: Josef Bacik <josef@toxicpanda.com>
---
 fs/btrfs/space-info.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/fs/btrfs/space-info.c b/fs/btrfs/space-info.c
index c86fad4174f1..5b0dc1046daa 100644
--- a/fs/btrfs/space-info.c
+++ b/fs/btrfs/space-info.c
@@ -805,6 +805,7 @@ static const enum btrfs_flush_state evict_flush_states[] = {
 static const enum btrfs_flush_state data_flush_states[] = {
 	FLUSH_DELALLOC_WAIT,
 	RUN_DELAYED_IPUTS,
+	FLUSH_DELAYED_REFS,
 	COMMIT_TRANS,
 };
 
-- 
2.24.1


^ permalink raw reply related	[flat|nested] 56+ messages in thread
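The accounting problem this patch addresses can be shown with a tiny
standalone model (illustrative C only; the struct and helper names are
made up, not kernel code): bytes freed by pending delayed refs are not yet
reflected in total_bytes_pinned, so a commit check that runs before
FLUSH_DELAYED_REFS can under-count and bail with ENOSPC.

```c
#include <assert.h>

struct space {
	unsigned long long pinned;	 /* accounted pinned bytes */
	unsigned long long delayed_refs; /* freed bytes not yet accounted */
};

static void run_delayed_refs(struct space *s)
{
	/* Running the delayed refs moves freed bytes into pinned. */
	s->pinned += s->delayed_refs;
	s->delayed_refs = 0;
}

/* Only worth committing if pinned space covers the reservation. */
static int may_commit(const struct space *s, unsigned long long needed)
{
	return s->pinned >= needed ? 0 : -28; /* -ENOSPC */
}
```

Putting FLUSH_DELAYED_REFS ahead of COMMIT_TRANS in the state array
corresponds to calling run_delayed_refs() before may_commit() here.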

* [PATCH 23/23] btrfs: do async reclaim for data reservations
  2020-01-31 22:35 [PATCH 00/23][v2] Convert data reservations to the ticketing infrastructure Josef Bacik
                   ` (21 preceding siblings ...)
  2020-01-31 22:36 ` [PATCH 22/23] btrfs: flush delayed refs when trying to reserve data space Josef Bacik
@ 2020-01-31 22:36 ` Josef Bacik
  2020-02-03 17:19   ` Nikolay Borisov
  22 siblings, 1 reply; 56+ messages in thread
From: Josef Bacik @ 2020-01-31 22:36 UTC (permalink / raw)
  To: linux-btrfs, kernel-team

Now that we have the data ticketing infrastructure in place, move normal
data reservations over to an async reclaim helper that satisfies the
tickets.  Before, multiple tasks could race in and each allocate a chunk,
resulting in more data chunks than we actually need.  Serializing these
allocations and making a single thread responsible for flushing means we
only allocate chunks as needed, and also cuts down on transaction commits
and other flush-related activity.

Priority reservations will still work as they have before, simply
trying to allocate a chunk until they can make their reservation.

Signed-off-by: Josef Bacik <josef@toxicpanda.com>
---
 fs/btrfs/ctree.h      |   3 +-
 fs/btrfs/disk-io.c    |   2 +-
 fs/btrfs/space-info.c | 123 ++++++++++++++++++++++++++++++------------
 3 files changed, 91 insertions(+), 37 deletions(-)

diff --git a/fs/btrfs/ctree.h b/fs/btrfs/ctree.h
index 865b24a1759e..709823a23c62 100644
--- a/fs/btrfs/ctree.h
+++ b/fs/btrfs/ctree.h
@@ -493,7 +493,7 @@ enum btrfs_orphan_cleanup_state {
 	ORPHAN_CLEANUP_DONE	= 2,
 };
 
-void btrfs_init_async_reclaim_work(struct work_struct *work);
+void btrfs_init_async_reclaim_work(struct btrfs_fs_info *fs_info);
 
 /* fs_info */
 struct reloc_control;
@@ -917,6 +917,7 @@ struct btrfs_fs_info {
 
 	/* Used to reclaim the metadata space in the background. */
 	struct work_struct async_reclaim_work;
+	struct work_struct async_data_reclaim_work;
 
 	spinlock_t unused_bgs_lock;
 	struct list_head unused_bgs;
diff --git a/fs/btrfs/disk-io.c b/fs/btrfs/disk-io.c
index 56d0a24aec74..825242f5c3f7 100644
--- a/fs/btrfs/disk-io.c
+++ b/fs/btrfs/disk-io.c
@@ -2753,7 +2753,7 @@ void btrfs_init_fs_info(struct btrfs_fs_info *fs_info)
 	fs_info->check_integrity_print_mask = 0;
 #endif
 	btrfs_init_balance(fs_info);
-	btrfs_init_async_reclaim_work(&fs_info->async_reclaim_work);
+	btrfs_init_async_reclaim_work(fs_info);
 
 	spin_lock_init(&fs_info->block_group_cache_lock);
 	fs_info->block_group_cache_tree = RB_ROOT;
diff --git a/fs/btrfs/space-info.c b/fs/btrfs/space-info.c
index 5b0dc1046daa..4249a6711200 100644
--- a/fs/btrfs/space-info.c
+++ b/fs/btrfs/space-info.c
@@ -780,9 +780,83 @@ static void btrfs_async_reclaim_metadata_space(struct work_struct *work)
 	} while (flush_state <= COMMIT_TRANS);
 }
 
-void btrfs_init_async_reclaim_work(struct work_struct *work)
+static const enum btrfs_flush_state data_flush_states[] = {
+	FLUSH_DELALLOC_WAIT,
+	RUN_DELAYED_IPUTS,
+	FLUSH_DELAYED_REFS,
+	COMMIT_TRANS,
+};
+
+static void btrfs_async_reclaim_data_space(struct work_struct *work)
 {
-	INIT_WORK(work, btrfs_async_reclaim_metadata_space);
+	struct btrfs_fs_info *fs_info;
+	struct btrfs_space_info *space_info;
+	u64 last_tickets_id;
+	int flush_state = 0;
+
+	fs_info = container_of(work, struct btrfs_fs_info,
+			       async_data_reclaim_work);
+	space_info = fs_info->data_sinfo;
+
+	spin_lock(&space_info->lock);
+	if (list_empty(&space_info->tickets)) {
+		space_info->flush = 0;
+		spin_unlock(&space_info->lock);
+		return;
+	}
+	last_tickets_id = space_info->tickets_id;
+	spin_unlock(&space_info->lock);
+
+	while (!space_info->full) {
+		flush_space(fs_info, space_info, U64_MAX, ALLOC_CHUNK_FORCE);
+		spin_lock(&space_info->lock);
+		if (list_empty(&space_info->tickets)) {
+			space_info->flush = 0;
+			spin_unlock(&space_info->lock);
+			return;
+		}
+		last_tickets_id = space_info->tickets_id;
+		spin_unlock(&space_info->lock);
+	}
+
+	while (flush_state < ARRAY_SIZE(data_flush_states)) {
+		flush_space(fs_info, space_info, U64_MAX,
+			    data_flush_states[flush_state]);
+		spin_lock(&space_info->lock);
+		if (list_empty(&space_info->tickets)) {
+			space_info->flush = 0;
+			spin_unlock(&space_info->lock);
+			return;
+		}
+
+		if (last_tickets_id == space_info->tickets_id) {
+			flush_state++;
+		} else {
+			last_tickets_id = space_info->tickets_id;
+			flush_state = 0;
+		}
+
+		if (flush_state >= ARRAY_SIZE(data_flush_states)) {
+			if (space_info->full) {
+				if (maybe_fail_all_tickets(fs_info,
+							   space_info))
+					flush_state = 0;
+				else
+					space_info->flush = 0;
+			} else {
+				flush_state = 0;
+			}
+		}
+		spin_unlock(&space_info->lock);
+	}
+}
+
+void btrfs_init_async_reclaim_work(struct btrfs_fs_info *fs_info)
+{
+	INIT_WORK(&fs_info->async_reclaim_work,
+		  btrfs_async_reclaim_metadata_space);
+	INIT_WORK(&fs_info->async_data_reclaim_work,
+		  btrfs_async_reclaim_data_space);
 }
 
 static const enum btrfs_flush_state priority_flush_states[] = {
@@ -802,13 +876,6 @@ static const enum btrfs_flush_state evict_flush_states[] = {
 	COMMIT_TRANS,
 };
 
-static const enum btrfs_flush_state data_flush_states[] = {
-	FLUSH_DELALLOC_WAIT,
-	RUN_DELAYED_IPUTS,
-	FLUSH_DELAYED_REFS,
-	COMMIT_TRANS,
-};
-
 static void priority_reclaim_metadata_space(struct btrfs_fs_info *fs_info,
 				struct btrfs_space_info *space_info,
 				struct reserve_ticket *ticket,
@@ -840,12 +907,8 @@ static void priority_reclaim_metadata_space(struct btrfs_fs_info *fs_info,
 
 static void priority_reclaim_data_space(struct btrfs_fs_info *fs_info,
 					struct btrfs_space_info *space_info,
-					struct reserve_ticket *ticket,
-					const enum btrfs_flush_state *states,
-					int states_nr)
+					struct reserve_ticket *ticket)
 {
-	int flush_state = 0;
-
 	while (!space_info->full) {
 		flush_space(fs_info, space_info, U64_MAX, ALLOC_CHUNK_FORCE);
 		spin_lock(&space_info->lock);
@@ -855,17 +918,6 @@ static void priority_reclaim_data_space(struct btrfs_fs_info *fs_info,
 		}
 		spin_unlock(&space_info->lock);
 	}
-
-	while (flush_state < states_nr) {
-		flush_space(fs_info, space_info, U64_MAX, states[flush_state]);
-		spin_lock(&space_info->lock);
-		if (ticket->bytes == 0) {
-			spin_unlock(&space_info->lock);
-			return;
-		}
-		spin_unlock(&space_info->lock);
-		flush_state++;
-	}
 }
 
 static void wait_reserve_ticket(struct btrfs_fs_info *fs_info,
@@ -920,6 +972,7 @@ static int handle_reserve_ticket(struct btrfs_fs_info *fs_info,
 	int ret;
 
 	switch (flush) {
+	case BTRFS_RESERVE_FLUSH_DATA:
 	case BTRFS_RESERVE_FLUSH_ALL:
 		wait_reserve_ticket(fs_info, space_info, ticket);
 		break;
@@ -933,14 +986,8 @@ static int handle_reserve_ticket(struct btrfs_fs_info *fs_info,
 						evict_flush_states,
 						ARRAY_SIZE(evict_flush_states));
 		break;
-	case BTRFS_RESERVE_FLUSH_DATA:
-		priority_reclaim_data_space(fs_info, space_info, ticket,
-					data_flush_states,
-					ARRAY_SIZE(data_flush_states));
-		break;
 	case BTRFS_RESERVE_FLUSH_FREE_SPACE_INODE:
-		priority_reclaim_data_space(fs_info, space_info, ticket,
-					    NULL, 0);
+		priority_reclaim_data_space(fs_info, space_info, ticket);
 		break;
 	default:
 		ASSERT(0);
@@ -989,6 +1036,7 @@ static int __reserve_bytes(struct btrfs_fs_info *fs_info,
 			   struct btrfs_space_info *space_info, u64 orig_bytes,
 			   enum btrfs_reserve_flush_enum flush)
 {
+	struct work_struct *async_work;
 	struct reserve_ticket ticket;
 	u64 used;
 	int ret = 0;
@@ -997,6 +1045,11 @@ static int __reserve_bytes(struct btrfs_fs_info *fs_info,
 	ASSERT(orig_bytes);
 	ASSERT(!current->journal_info || flush != BTRFS_RESERVE_FLUSH_ALL);
 
+	if (flush == BTRFS_RESERVE_FLUSH_DATA)
+		async_work = &fs_info->async_data_reclaim_work;
+	else
+		async_work = &fs_info->async_reclaim_work;
+
 	spin_lock(&space_info->lock);
 	ret = -ENOSPC;
 	used = btrfs_space_info_used(space_info, true);
@@ -1026,7 +1079,8 @@ static int __reserve_bytes(struct btrfs_fs_info *fs_info,
 		ticket.bytes = orig_bytes;
 		ticket.error = 0;
 		init_waitqueue_head(&ticket.wait);
-		if (flush == BTRFS_RESERVE_FLUSH_ALL) {
+		if (flush == BTRFS_RESERVE_FLUSH_ALL ||
+		    flush == BTRFS_RESERVE_FLUSH_DATA) {
 			list_add_tail(&ticket.list, &space_info->tickets);
 			if (!space_info->flush) {
 				space_info->flush = 1;
@@ -1034,8 +1088,7 @@ static int __reserve_bytes(struct btrfs_fs_info *fs_info,
 							  space_info->flags,
 							  orig_bytes, flush,
 							  "enospc");
-				queue_work(system_unbound_wq,
-					   &fs_info->async_reclaim_work);
+				queue_work(system_unbound_wq, async_work);
 			}
 		} else {
 			list_add_tail(&ticket.list,
-- 
2.24.1


^ permalink raw reply related	[flat|nested] 56+ messages in thread
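The progress check in the new worker can be sketched as follows
(illustrative standalone C, not kernel code): tickets_id advances whenever
a ticket is satisfied, so an id that changed since the last flush means
the current pass made progress and we restart from the cheapest state,
while an unchanged id escalates to the next, more expensive state.

```c
#include <assert.h>

/* Model of the flush_state advancement in the async data reclaim loop. */
static int next_flush_state(int flush_state, unsigned long long *last_id,
			    unsigned long long tickets_id)
{
	if (*last_id == tickets_id)
		return flush_state + 1;	/* no progress: escalate */
	*last_id = tickets_id;		/* progress: start over */
	return 0;
}
```

When the returned state walks off the end of the array, the worker either
fails the remaining tickets (space full) or starts another full pass, as
in the hunk above.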

* Re: [PATCH 01/23] btrfs: change nr to u64 in btrfs_start_delalloc_roots
  2020-01-31 22:35 ` [PATCH 01/23] btrfs: change nr to u64 in btrfs_start_delalloc_roots Josef Bacik
@ 2020-02-03 10:45   ` Johannes Thumshirn
  2020-02-03 11:15   ` Nikolay Borisov
  1 sibling, 0 replies; 56+ messages in thread
From: Johannes Thumshirn @ 2020-02-03 10:45 UTC (permalink / raw)
  To: Josef Bacik, linux-btrfs, kernel-team

Looks good,
Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>

^ permalink raw reply	[flat|nested] 56+ messages in thread

* Re: [PATCH 02/23] btrfs: remove orig from shrink_delalloc
  2020-01-31 22:35 ` [PATCH 02/23] btrfs: remove orig from shrink_delalloc Josef Bacik
@ 2020-02-03 10:46   ` Johannes Thumshirn
  0 siblings, 0 replies; 56+ messages in thread
From: Johannes Thumshirn @ 2020-02-03 10:46 UTC (permalink / raw)
  To: Josef Bacik, linux-btrfs, kernel-team; +Cc: Nikolay Borisov

Looks good,
Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>

^ permalink raw reply	[flat|nested] 56+ messages in thread

* Re: [PATCH 04/23] btrfs: make shrink_delalloc take space_info as an arg
  2020-01-31 22:35 ` [PATCH 04/23] btrfs: make shrink_delalloc take space_info as an arg Josef Bacik
@ 2020-02-03 10:53   ` Johannes Thumshirn
  0 siblings, 0 replies; 56+ messages in thread
From: Johannes Thumshirn @ 2020-02-03 10:53 UTC (permalink / raw)
  To: Josef Bacik, linux-btrfs, kernel-team; +Cc: Nikolay Borisov

Looks good,
Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>

^ permalink raw reply	[flat|nested] 56+ messages in thread

* Re: [PATCH 03/23] btrfs: handle U64_MAX for shrink_delalloc
  2020-01-31 22:35 ` [PATCH 03/23] btrfs: handle U64_MAX for shrink_delalloc Josef Bacik
@ 2020-02-03 11:01   ` Johannes Thumshirn
  2020-02-03 11:05   ` Johannes Thumshirn
  1 sibling, 0 replies; 56+ messages in thread
From: Johannes Thumshirn @ 2020-02-03 11:01 UTC (permalink / raw)
  To: Josef Bacik, linux-btrfs, kernel-team; +Cc: Nikolay Borisov

Looks good,
Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>



^ permalink raw reply	[flat|nested] 56+ messages in thread

* Re: [PATCH 03/23] btrfs: handle U64_MAX for shrink_delalloc
  2020-01-31 22:35 ` [PATCH 03/23] btrfs: handle U64_MAX for shrink_delalloc Josef Bacik
  2020-02-03 11:01   ` Johannes Thumshirn
@ 2020-02-03 11:05   ` Johannes Thumshirn
  1 sibling, 0 replies; 56+ messages in thread
From: Johannes Thumshirn @ 2020-02-03 11:05 UTC (permalink / raw)
  To: Josef Bacik, linux-btrfs, kernel-team; +Cc: Nikolay Borisov

Looks good,
Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>


^ permalink raw reply	[flat|nested] 56+ messages in thread

* Re: [PATCH 01/23] btrfs: change nr to u64 in btrfs_start_delalloc_roots
  2020-01-31 22:35 ` [PATCH 01/23] btrfs: change nr to u64 in btrfs_start_delalloc_roots Josef Bacik
  2020-02-03 10:45   ` Johannes Thumshirn
@ 2020-02-03 11:15   ` Nikolay Borisov
  1 sibling, 0 replies; 56+ messages in thread
From: Nikolay Borisov @ 2020-02-03 11:15 UTC (permalink / raw)
  To: Josef Bacik, linux-btrfs, kernel-team



On 1.02.20 г. 0:35 ч., Josef Bacik wrote:
> We have btrfs_wait_ordered_roots() which takes a u64 for nr, but
> btrfs_start_delalloc_roots() that takes an int for nr, which makes using
> them in conjunction, especially for something like (u64)-1, annoying and
> inconsistent.  Fix btrfs_start_delalloc_roots() to take a u64 for nr and
> adjust start_delalloc_inodes() and its callers appropriately.
> 
> This means we've adjusted start_delalloc_inodes() to take a pointer of
> nr since we want to preserve the ability for start_delalloc_inodes() to
> return an error, so simply make it do the nr adjusting as necessary.
> 
> Part of adjusting the callers to this means changing
> btrfs_writeback_inodes_sb_nr() to take a u64 for items.  This may be
> confusing because it seems unrelated, but the caller of
> btrfs_writeback_inodes_sb_nr() already passes in a u64, it's just the
> function variable that needs to be changed.
> 
> Signed-off-by: Josef Bacik <josef@toxicpanda.com>

Reviewed-by: Nikolay Borisov <nborisov@suse.com>

^ permalink raw reply	[flat|nested] 56+ messages in thread

* Re: [PATCH 05/23] btrfs: make ALLOC_CHUNK use the space info flags
  2020-01-31 22:35 ` [PATCH 05/23] btrfs: make ALLOC_CHUNK use the space info flags Josef Bacik
@ 2020-02-03 11:19   ` Johannes Thumshirn
  0 siblings, 0 replies; 56+ messages in thread
From: Johannes Thumshirn @ 2020-02-03 11:19 UTC (permalink / raw)
  To: Josef Bacik, linux-btrfs, kernel-team; +Cc: Nikolay Borisov

Looks good,
Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>

^ permalink raw reply	[flat|nested] 56+ messages in thread

* Re: [PATCH 07/23] btrfs: call btrfs_try_granting_tickets when unpinning anything
  2020-01-31 22:35 ` [PATCH 07/23] btrfs: call btrfs_try_granting_tickets when unpinning anything Josef Bacik
@ 2020-02-03 11:24   ` Johannes Thumshirn
  0 siblings, 0 replies; 56+ messages in thread
From: Johannes Thumshirn @ 2020-02-03 11:24 UTC (permalink / raw)
  To: Josef Bacik, linux-btrfs, kernel-team; +Cc: Nikolay Borisov

Looks good,
Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>

^ permalink raw reply	[flat|nested] 56+ messages in thread

* Re: [PATCH 10/23] btrfs: use btrfs_start_delalloc_roots in shrink_delalloc
  2020-01-31 22:36 ` [PATCH 10/23] btrfs: use btrfs_start_delalloc_roots in shrink_delalloc Josef Bacik
@ 2020-02-03 12:03   ` Nikolay Borisov
  0 siblings, 0 replies; 56+ messages in thread
From: Nikolay Borisov @ 2020-02-03 12:03 UTC (permalink / raw)
  To: Josef Bacik, linux-btrfs, kernel-team



On 1.02.20 г. 0:36 ч., Josef Bacik wrote:
> The original iteration of flushing had us flushing delalloc and then
> checking to see if we could make our reservation, thus we were very
> careful about how many pages we would flush at once.
> 
> But now that everything is async and we satisfy tickets as the space
> becomes available we don't have to keep track of any of this, simply try
> and flush the number of dirty inodes we may have in order to reclaim
> space to make our reservation.  This cleans up our delalloc flushing
> significantly.
> 
> The async_pages stuff is dropped because btrfs_start_delalloc_roots()
> handles the case that we generate async extents for us, so we no longer
> require this extra logic.
> 
> Signed-off-by: Josef Bacik <josef@toxicpanda.com>


This change really equates to calling ->writepages() for every delalloc
inode. The old code in btrfs_writeback_inodes_sb_nr was more or less
doing the same. So:

Reviewed-by: Nikolay Borisov <nborisov@suse.com>

^ permalink raw reply	[flat|nested] 56+ messages in thread

* Re: [PATCH 11/23] btrfs: check tickets after waiting on ordered extents
  2020-01-31 22:36 ` [PATCH 11/23] btrfs: check tickets after waiting on ordered extents Josef Bacik
@ 2020-02-03 13:10   ` Nikolay Borisov
  2020-02-03 15:57     ` Josef Bacik
  0 siblings, 1 reply; 56+ messages in thread
From: Nikolay Borisov @ 2020-02-03 13:10 UTC (permalink / raw)
  To: Josef Bacik, linux-btrfs, kernel-team



On 1.02.20 г. 0:36 ч., Josef Bacik wrote:
> Right now if the space is freed up after the ordered extents complete
> (which is likely since the reservations are held until they complete),
> we would do extra delalloc flushing before we'd notice that we didn't
> have any more tickets.  Fix this by moving the tickets check after our
> wait_ordered_extents check.
> 
> Signed-off-by: Josef Bacik <josef@toxicpanda.com>

This patch makes sense only for metadata. Is this your intention -
tweaking the metadata change behavior? Correct me if I'm wrong but

btrfs_start_delalloc_roots from previous patch will essentially call
btrfs_run_delalloc_range for the roots which will create ordered extents in:
btrfs_run_delalloc_range
  cow_file_range
   add_ordered_extents

Following page writeout, from the endio routines we'll eventually do:

finish_ordered_fn
 btrfs_finish_ordered_io
  insert_reserved_file_extent
   btrfs_alloc_reserved_file_extent
    create delayed ref  <---- after the delayed extent is run this will
free some data space. But this happens in transaction commit context and
not when running ordered extents
  btrfs_remove_ordered_extent
   btrfs_delalloc_release_metadata <- this is only for metadata
    btrfs_inode_rsv_release
     __btrfs_block_rsv_release <-- frees metadata but not data?


I'm looking those patches thinking every change should be pertinent to
data space but apparently it's not. If so I think it will be best if you
update the cover letter for V2 to mention which patches can go in
independently or give more context why this particular patch is
pertinent to data flush.

^ permalink raw reply	[flat|nested] 56+ messages in thread

* Re: [PATCH 12/23] btrfs: add the data transaction commit logic into may_commit_transaction
  2020-01-31 22:36 ` [PATCH 12/23] btrfs: add the data transaction commit logic into may_commit_transaction Josef Bacik
@ 2020-02-03 14:00   ` Nikolay Borisov
  0 siblings, 0 replies; 56+ messages in thread
From: Nikolay Borisov @ 2020-02-03 14:00 UTC (permalink / raw)
  To: Josef Bacik, linux-btrfs, kernel-team



On 1.02.20 г. 0:36 ч., Josef Bacik wrote:
> Data space flushing currently unconditionally commits the transaction
> twice in a row, and the last time it checks if there's enough pinned
> extents to satisfy it's reservation before deciding to commit the
> transaction for the 3rd and final time.
> 
> Encode this logic into may_commit_transaction().  In the next patch we

This is incorrect since the next patch simply introduces some states. So
this patch and the next one should be transposed. I guess even David can
do this but he needs to be explicitly aware of this.

> will pass in U64_MAX for bytes_needed the first two times, and the final
> time we will pass in the actual bytes we need so the normal logic will
> apply.
> 
> This patch exists solely to make the logical changes I will make to the
> flushing state machine separate to make it easier to bisect any
> performance related regressions.
> 
> Signed-off-by: Josef Bacik <josef@toxicpanda.com>

Other than the nit in the changelog this LGTM:

Reviewed-by: Nikolay Borisov <nborisov@suse.com>


<snip>

^ permalink raw reply	[flat|nested] 56+ messages in thread

* Re: [PATCH 13/23] btrfs: add flushing states for handling data reservations
  2020-01-31 22:36 ` [PATCH 13/23] btrfs: add flushing states for handling data reservations Josef Bacik
@ 2020-02-03 14:00   ` Nikolay Borisov
  0 siblings, 0 replies; 56+ messages in thread
From: Nikolay Borisov @ 2020-02-03 14:00 UTC (permalink / raw)
  To: Josef Bacik, linux-btrfs, kernel-team



On 1.02.20 г. 0:36 ч., Josef Bacik wrote:
> Currently the way we do data reservations is by seeing if we have enough
> space in our space_info.  If we do not and we're a normal inode we'll
> 
> 1) Attempt to force a chunk allocation until we can't anymore.
> 2) If that fails we'll flush delalloc, then commit the transaction, then
>    run the delayed iputs.
> 
> If we are a free space inode we're only allowed to force a chunk
> allocation.  In order to use the normal flushing mechanism we need to
> encode this into a flush state array for normal inodes.  Since both will
> start with allocating chunks until the space info is full there is no
> need to add this as a flush state, this will be handled specially.
> 
> Signed-off-by: Josef Bacik <josef@toxicpanda.com>

Reviewed-by: Nikolay Borisov <nborisov@suse.com>

^ permalink raw reply	[flat|nested] 56+ messages in thread

* Re: [PATCH 14/23] btrfs: add btrfs_reserve_data_bytes and use it
  2020-01-31 22:36 ` [PATCH 14/23] btrfs: add btrfs_reserve_data_bytes and use it Josef Bacik
@ 2020-02-03 14:20   ` Nikolay Borisov
  0 siblings, 0 replies; 56+ messages in thread
From: Nikolay Borisov @ 2020-02-03 14:20 UTC (permalink / raw)
  To: Josef Bacik, linux-btrfs, kernel-team



On 1.02.20 г. 0:36 ч., Josef Bacik wrote:
> Create a new function btrfs_reserve_data_bytes() in order to handle data
> reservations.  This uses the new flush types and flush states to handle
> making data reservations.
> 
> This patch specifically does not change any functionality, and is
> purposefully not cleaned up in order to make bisection easier for the
> future patches.  The new helper is identical to the old helper in how it
> handles data reservations.  We first try to force a chunk allocation,
> and then we run through the flush states all at once and in the same
> order that they were done with the old helper.
> 
> Subsequent patches will clean this up and change the behavior of the
> flushing, and it is important to keep those changes separate so we can
> easily bisect down to the patch that caused the regression, rather than
> the patch that made us start using the new infrastructure.
> 
> Signed-off-by: Josef Bacik <josef@toxicpanda.com>



Reviewed-by: Nikolay Borisov <nborisov@suse.com>

^ permalink raw reply	[flat|nested] 56+ messages in thread

* Re: [PATCH 15/23] btrfs: use ticketing for data space reservations
  2020-01-31 22:36 ` [PATCH 15/23] btrfs: use ticketing for data space reservations Josef Bacik
@ 2020-02-03 14:29   ` Nikolay Borisov
  0 siblings, 0 replies; 56+ messages in thread
From: Nikolay Borisov @ 2020-02-03 14:29 UTC (permalink / raw)
  To: Josef Bacik, linux-btrfs, kernel-team



On 1.02.20 г. 0:36 ч., Josef Bacik wrote:
> Now that we have all the infrastructure in place, use the ticketing
> infrastructure to make data allocations.  This still maintains the exact
> same flushing behavior, but now we're using tickets to get our
> reservations satisfied.
> 
> Signed-off-by: Josef Bacik <josef@toxicpanda.com>

Reviewed-by: Nikolay Borisov <nborisov@suse.com>

> ---
>  fs/btrfs/space-info.c | 128 ++++++++++++++++++++++--------------------
>  1 file changed, 68 insertions(+), 60 deletions(-)
> 
> diff --git a/fs/btrfs/space-info.c b/fs/btrfs/space-info.c
> index 6b71f6d3a348..43c5775bcbc6 100644
> --- a/fs/btrfs/space-info.c
> +++ b/fs/btrfs/space-info.c
> @@ -829,7 +829,7 @@ static void priority_reclaim_metadata_space(struct btrfs_fs_info *fs_info,
>  				int states_nr)
>  {
>  	u64 to_reclaim;
> -	int flush_state;
> +	int flush_state = 0;
>  
>  	spin_lock(&space_info->lock);
>  	to_reclaim = btrfs_calc_reclaim_metadata_size(fs_info, space_info);
> @@ -839,7 +839,6 @@ static void priority_reclaim_metadata_space(struct btrfs_fs_info *fs_info,
>  	}
>  	spin_unlock(&space_info->lock);
>  
> -	flush_state = 0;
>  	do {
>  		flush_space(fs_info, space_info, to_reclaim, states[flush_state]);
>  		flush_state++;

nit: Those 2 hunks are unrelated and can be dropped.

<snip>


* Re: [PATCH 09/23] btrfs: use the btrfs_space_info_free_bytes_may_use helper for delalloc
  2020-01-31 22:35 ` [PATCH 09/23] btrfs: use the btrfs_space_info_free_bytes_may_use helper for delalloc Josef Bacik
@ 2020-02-03 14:46   ` Johannes Thumshirn
  0 siblings, 0 replies; 56+ messages in thread
From: Johannes Thumshirn @ 2020-02-03 14:46 UTC (permalink / raw)
  To: Josef Bacik, linux-btrfs, kernel-team; +Cc: Nikolay Borisov

Looks good,
Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>



* Re: [PATCH 16/23] btrfs: serialize data reservations if we are flushing
  2020-01-31 22:36 ` [PATCH 16/23] btrfs: serialize data reservations if we are flushing Josef Bacik
@ 2020-02-03 15:06   ` Nikolay Borisov
  0 siblings, 0 replies; 56+ messages in thread
From: Nikolay Borisov @ 2020-02-03 15:06 UTC (permalink / raw)
  To: Josef Bacik, linux-btrfs, kernel-team



On 1.02.20 at 0:36, Josef Bacik wrote:
> Nikolay reported a problem where generic/371 would fail sometimes with a
> slow drive.  The gist of the test is that we fallocate a file in
> parallel with a pwrite of a different file.  These two files combined
> are smaller than the file system, but sometimes the pwrite would ENOSPC.
> 
> A fair bit of investigation uncovered the fact that the fallocate
> workload was racing in and grabbing the free space that the pwrite
> workload was trying to free up so it could make its own reservation.
> After a few loops of this eventually the pwrite workload would error out
> with an ENOSPC.
> 
> We've had the same problem with metadata as well, and we serialized all
> metadata allocations to address this problem.  This wasn't usually a
> problem with data because data reservations are more straightforward,
> but obviously could still happen.
> 
> Fix this by not allowing reservations to occur if there are any pending
> tickets waiting to be satisfied on the space info.
> 
> Signed-off-by: Josef Bacik <josef@toxicpanda.com>

Reviewed-by: Nikolay Borisov <nborisov@suse.com>


* Re: [PATCH 17/23] btrfs: use the same helper for data and metadata reservations
  2020-01-31 22:36 ` [PATCH 17/23] btrfs: use the same helper for data and metadata reservations Josef Bacik
@ 2020-02-03 15:47   ` Nikolay Borisov
  0 siblings, 0 replies; 56+ messages in thread
From: Nikolay Borisov @ 2020-02-03 15:47 UTC (permalink / raw)
  To: Josef Bacik, linux-btrfs, kernel-team



On 1.02.20 at 0:36, Josef Bacik wrote:
> Now that data reservations follow the same pattern as metadata
> reservations we can simply rename __reserve_metadata_bytes to
> __reserve_bytes and use that helper for data reservations.
> 
> Signed-off-by: Josef Bacik <josef@toxicpanda.com>

It's indeed identical, but it's somewhat cryptic: e.g.
btrfs_can_overcommit will return 0 for data space, so if we don't have
space we proceed further down to set up our ticket. Since for data we
can never be called with BTRFS_RESERVE_FLUSH_ALL, we just add a priority
ticket and eventually execute handle_reserve_ticket in __reserve_bytes.

Please add the above explanation to the changelog or David can do it if
he deems it necessary. Nevertheless,

Reviewed-by: Nikolay Borisov <nborisov@suse.com>


* Re: [PATCH 11/23] btrfs: check tickets after waiting on ordered extents
  2020-02-03 13:10   ` Nikolay Borisov
@ 2020-02-03 15:57     ` Josef Bacik
  0 siblings, 0 replies; 56+ messages in thread
From: Josef Bacik @ 2020-02-03 15:57 UTC (permalink / raw)
  To: Nikolay Borisov, linux-btrfs, kernel-team

On 2/3/20 8:10 AM, Nikolay Borisov wrote:
> 
> 
> On 1.02.20 at 0:36, Josef Bacik wrote:
>> Right now if the space is free'd up after the ordered extents complete
>> (which is likely since the reservations are held until they complete),
>> we would do extra delalloc flushing before we'd notice that we didn't
>> have any more tickets.  Fix this by moving the tickets check after our
>> wait_ordered_extents check.
>>
>> Signed-off-by: Josef Bacik <josef@toxicpanda.com>
> 
> This patch makes sense only for metadata. Is this your intention -
> tweaking the metadata change behavior? Correct me if I'm wrong but
> 
> btrfs_start_delalloc_roots from previous patch will essentially call
> btrfs_run_delalloc_range for the roots which will create ordered extents in:
> btrfs_run_delalloc_range
>    cow_file_range
>     add_ordered_extents
> 
> Following page writeout, from the endio routines we'll eventually do:
> 
> finish_ordered_fn
>   btrfs_finish_ordered_io
>    insert_reserved_file_extent
>     btrfs_alloc_reserved_file_extent
>      create delayed ref  <---- after the delayed extent is run this will
> free some data space. But this happens in transaction commit context and
> not when running ordered extents
>    btrfs_remove_ordered_extent
>     btrfs_delalloc_release_metadata <- this is only for metadata
>      btrfs_inode_rsv_release
>       __btrfs_block_rsv_release <-- frees metadata but not data?
> 
> 
> I'm looking at those patches thinking every change should be pertinent to
> data space but apparently it's not. If so I think it will be best if you
> update the cover letter for V2 to mention which patches can go in
> independently or give more context why this particular patch is
> pertinent to data flush.
> 


Specifically what I'm addressing here for data is this behavior

btrfs_start_delalloc_roots()
   ->check space_info->tickets, it's not empty
btrfs_finish_ordered_io()
   -> we drop a previously uncompressed extent (ie larger one) for a newly
      compressed extent, now space_info->tickets _is_ empty
loop again despite having no tickets to flush for.

Does this scenario happen all the time?  Nope, but I noticed it was happening
sometimes with the data intensive enospc tests in xfstests, so I threw it in
there because it made a tiny difference, plus it's actually correct.

It definitely affects the metadata case more for sure, but I only noticed it 
when I forgot I had compression enabled for one of my xfstests runs.  Thanks,

Josef


* Re: [PATCH 18/23] btrfs: drop the commit_cycles stuff for data reservations
  2020-01-31 22:36 ` [PATCH 18/23] btrfs: drop the commit_cycles stuff for data reservations Josef Bacik
@ 2020-02-03 16:02   ` Nikolay Borisov
  0 siblings, 0 replies; 56+ messages in thread
From: Nikolay Borisov @ 2020-02-03 16:02 UTC (permalink / raw)
  To: Josef Bacik, linux-btrfs, kernel-team



On 1.02.20 at 0:36, Josef Bacik wrote:
> This was an old wart left over from how we previously did data
> reservations.  Before we could have people race in and take a
> reservation while we were flushing space, so we needed to make sure we
> looped a few times before giving up.  Now that we're using the ticketing
> infrastructure we don't have to worry about this and can drop the logic
> altogether.
> 
> Signed-off-by: Josef Bacik <josef@toxicpanda.com>

Reviewed-by: Nikolay Borisov <nborisov@suse.com>

> ---
>  fs/btrfs/space-info.c | 22 ++--------------------
>  1 file changed, 2 insertions(+), 20 deletions(-)
> 
> diff --git a/fs/btrfs/space-info.c b/fs/btrfs/space-info.c
> index 13a3692a0122..3060754a3341 100644
> --- a/fs/btrfs/space-info.c
> +++ b/fs/btrfs/space-info.c
> @@ -858,7 +858,6 @@ static void priority_reclaim_data_space(struct btrfs_fs_info *fs_info,
>  					int states_nr)
>  {
>  	int flush_state = 0;
> -	int commit_cycles = 2;
>  
>  	while (!space_info->full) {
>  		flush_space(fs_info, space_info, U64_MAX, ALLOC_CHUNK_FORCE);
> @@ -869,21 +868,9 @@ static void priority_reclaim_data_space(struct btrfs_fs_info *fs_info,
>  		}
>  		spin_unlock(&space_info->lock);
>  	}
> -again:
> -	while (flush_state < states_nr) {
> -		u64 flush_bytes = U64_MAX;
> -
> -		if (!commit_cycles) {
> -			if (states[flush_state] == FLUSH_DELALLOC_WAIT) {
> -				flush_state++;
> -				continue;
> -			}
> -			if (states[flush_state] == COMMIT_TRANS)
> -				flush_bytes = ticket->bytes;
> -		}
>  
> -		flush_space(fs_info, space_info, flush_bytes,
> -			    states[flush_state]);
> +	while (flush_state < states_nr) {
> +		flush_space(fs_info, space_info, U64_MAX, states[flush_state]);

nit: Wouldn't make much of a difference, but I wonder if the semantics
here are closer to a for loop than a while.

>  		spin_lock(&space_info->lock);
>  		if (ticket->bytes == 0) {
>  			spin_unlock(&space_info->lock);
> @@ -892,11 +879,6 @@ static void priority_reclaim_data_space(struct btrfs_fs_info *fs_info,
>  		spin_unlock(&space_info->lock);
>  		flush_state++;
>  	}
> -	if (commit_cycles) {
> -		commit_cycles--;
> -		flush_state = 0;
> -		goto again;
> -	}
>  }
>  
>  static void wait_reserve_ticket(struct btrfs_fs_info *fs_info,
> 


* Re: [PATCH 19/23] btrfs: don't pass bytes_needed to may_commit_transaction
  2020-01-31 22:36 ` [PATCH 19/23] btrfs: don't pass bytes_needed to may_commit_transaction Josef Bacik
@ 2020-02-03 16:10   ` Nikolay Borisov
  0 siblings, 0 replies; 56+ messages in thread
From: Nikolay Borisov @ 2020-02-03 16:10 UTC (permalink / raw)
  To: Josef Bacik, linux-btrfs, kernel-team



On 1.02.20 at 0:36, Josef Bacik wrote:
> This was put into place in order to mirror the way data flushing handled
> committing the transaction.  Now that we do not loop on committing the
> transaction simply force a transaction commit if we are data.
> 
> Signed-off-by: Josef Bacik <josef@toxicpanda.com>

Reviewed-by: Nikolay Borisov <nborisov@suse.com>

> ---
>  fs/btrfs/space-info.c | 16 +++++++---------
>  1 file changed, 7 insertions(+), 9 deletions(-)
> 
> diff --git a/fs/btrfs/space-info.c b/fs/btrfs/space-info.c
> index 3060754a3341..cef14a4d4167 100644
> --- a/fs/btrfs/space-info.c
> +++ b/fs/btrfs/space-info.c
> @@ -412,14 +412,14 @@ static void shrink_delalloc(struct btrfs_fs_info *fs_info,
>   * will return -ENOSPC.
>   */
>  static int may_commit_transaction(struct btrfs_fs_info *fs_info,
> -				  struct btrfs_space_info *space_info,
> -				  u64 bytes_needed)
> +				  struct btrfs_space_info *space_info)
>  {
>  	struct reserve_ticket *ticket = NULL;
>  	struct btrfs_block_rsv *delayed_rsv = &fs_info->delayed_block_rsv;
>  	struct btrfs_block_rsv *delayed_refs_rsv = &fs_info->delayed_refs_rsv;
>  	struct btrfs_trans_handle *trans;
>  	u64 reclaim_bytes = 0;
> +	u64 bytes_needed;
>  	u64 cur_free_bytes = 0;
>  	bool do_commit = false;
>  
> @@ -428,12 +428,10 @@ static int may_commit_transaction(struct btrfs_fs_info *fs_info,
>  		return -EAGAIN;
>  
>  	/*
> -	 * If we are data and have passed in U64_MAX we just want to
> -	 * unconditionally commit the transaction to match the previous data
> -	 * flushing behavior.
> +	 * If we are data just force the commit, we aren't likely to do this
> +	 * over and over again.
>  	 */

I don't think the 2nd part of the comment brings any value, hence it can
be removed.

> -	if ((space_info->flags & BTRFS_BLOCK_GROUP_DATA) &&
> -	   bytes_needed == U64_MAX) {
> +	if (space_info->flags & BTRFS_BLOCK_GROUP_DATA) {
>  		do_commit = true;
>  		goto check_pinned;
>  	}

<snip>


* Re: [PATCH 20/23] btrfs: don't force commit if we are data
  2020-01-31 22:36 ` [PATCH 20/23] btrfs: don't force commit if we are data Josef Bacik
@ 2020-02-03 16:12   ` Nikolay Borisov
  0 siblings, 0 replies; 56+ messages in thread
From: Nikolay Borisov @ 2020-02-03 16:12 UTC (permalink / raw)
  To: Josef Bacik, linux-btrfs, kernel-team



On 1.02.20 at 0:36, Josef Bacik wrote:
> We used to unconditionally commit the transaction at least 2 times and
> then on the 3rd try check against pinned space to make sure committing
> the transaction was worth the effort.  This is overkill, we know nobody
> is going to steal our reservation, and if we can't make our reservation
> with the pinned amount simply bail out.
> 
> Signed-off-by: Josef Bacik <josef@toxicpanda.com>

Reviewed-by: Nikolay Borisov <nborisov@suse.com>

I think the previous patch can be dropped and simply this one applied
after it.


* Re: [PATCH 21/23] btrfs: run delayed iputs before committing the transaction for data
  2020-01-31 22:36 ` [PATCH 21/23] btrfs: run delayed iputs before committing the transaction for data Josef Bacik
@ 2020-02-03 16:13   ` Nikolay Borisov
  0 siblings, 0 replies; 56+ messages in thread
From: Nikolay Borisov @ 2020-02-03 16:13 UTC (permalink / raw)
  To: Josef Bacik, linux-btrfs, kernel-team



On 1.02.20 at 0:36, Josef Bacik wrote:
> Before we were waiting on iputs after we committed the transaction, but
> this doesn't really make much sense.  We want to reclaim any space we
> may have in order to be more likely to commit the transaction, due to
> pinned space being added by running the delayed iputs.  Fix this by
> making delayed iputs run before committing the transaction.
> 
> Signed-off-by: Josef Bacik <josef@toxicpanda.com>

Reviewed-by: Nikolay Borisov <nborisov@suse.com>

> ---
>  fs/btrfs/space-info.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/fs/btrfs/space-info.c b/fs/btrfs/space-info.c
> index 0c2d8e66cf5e..c86fad4174f1 100644
> --- a/fs/btrfs/space-info.c
> +++ b/fs/btrfs/space-info.c
> @@ -804,8 +804,8 @@ static const enum btrfs_flush_state evict_flush_states[] = {
>  
>  static const enum btrfs_flush_state data_flush_states[] = {
>  	FLUSH_DELALLOC_WAIT,
> -	COMMIT_TRANS,
>  	RUN_DELAYED_IPUTS,
> +	COMMIT_TRANS,
>  };
>  
>  static void priority_reclaim_metadata_space(struct btrfs_fs_info *fs_info,
> 


* Re: [PATCH 22/23] btrfs: flush delayed refs when trying to reserve data space
  2020-01-31 22:36 ` [PATCH 22/23] btrfs: flush delayed refs when trying to reserve data space Josef Bacik
@ 2020-02-03 16:16   ` Nikolay Borisov
  2020-02-03 16:24     ` Josef Bacik
  0 siblings, 1 reply; 56+ messages in thread
From: Nikolay Borisov @ 2020-02-03 16:16 UTC (permalink / raw)
  To: Josef Bacik, linux-btrfs, kernel-team



On 1.02.20 at 0:36, Josef Bacik wrote:
> We can end up with free'd extents in the delayed refs, and thus
> may_commit_transaction() may not think we have enough pinned space to
> commit the transaction and we'll ENOSPC early.  Handle this by running
> the delayed refs in order to make sure pinned is uptodate before we try
> to commit the transaction.
> 
> Signed-off-by: Josef Bacik <josef@toxicpanda.com>

Actually wouldn't it be better if the stages are structured:

FLUSH_DELALLOC which potentially creates a bunch of delayed refs,
FLUSH_DELAYED_REFS which in turn could free some reservations. Then IPUT
to process anything from delayed refs running and finally committing
transaction to reclaim pinned space?

Codewise this is ok.

Reviewed-by: Nikolay Borisov <nborisov@suse.com>
> ---
>  fs/btrfs/space-info.c | 1 +
>  1 file changed, 1 insertion(+)
> 
> diff --git a/fs/btrfs/space-info.c b/fs/btrfs/space-info.c
> index c86fad4174f1..5b0dc1046daa 100644
> --- a/fs/btrfs/space-info.c
> +++ b/fs/btrfs/space-info.c
> @@ -805,6 +805,7 @@ static const enum btrfs_flush_state evict_flush_states[] = {
>  static const enum btrfs_flush_state data_flush_states[] = {
>  	FLUSH_DELALLOC_WAIT,
>  	RUN_DELAYED_IPUTS,
> +	FLUSH_DELAYED_REFS,
>  	COMMIT_TRANS,
>  };
>  
> 


* Re: [PATCH 22/23] btrfs: flush delayed refs when trying to reserve data space
  2020-02-03 16:16   ` Nikolay Borisov
@ 2020-02-03 16:24     ` Josef Bacik
  0 siblings, 0 replies; 56+ messages in thread
From: Josef Bacik @ 2020-02-03 16:24 UTC (permalink / raw)
  To: Nikolay Borisov, linux-btrfs, kernel-team

On 2/3/20 11:16 AM, Nikolay Borisov wrote:
> 
> 
> On 1.02.20 at 0:36, Josef Bacik wrote:
>> We can end up with free'd extents in the delayed refs, and thus
>> may_commit_transaction() may not think we have enough pinned space to
>> commit the transaction and we'll ENOSPC early.  Handle this by running
>> the delayed refs in order to make sure pinned is uptodate before we try
>> to commit the transaction.
>>
>> Signed-off-by: Josef Bacik <josef@toxicpanda.com>
> 
> Actually wouldn't it be better if the stages are structured:
> 
> FLUSH_DELALLOC which potentially creates a bunch of delayed refs,
> FLUSH_DELAYED_REFS which in turn could free some reservations. Then IPUT
> to process anything from delayed refs running and finally committing
> transaction to reclaim pinned space?
> 

It's more about doing the least amount of work for the largest chance of success.

Flushing delalloc will reclaim space in one of two ways, first the compression 
case where we allocate less disk space than we had reserved, second the free'ing 
of extents we drop when overwriting old space.  In order to get that free'd 
space we'd have to wait on delayed refs to make sure pinned was uptodate before 
committing the transaction.

Running delayed iputs only reclaims space in the form of delayed refs flushing 
and then committing the transaction to free that pinned area.  So at this point 
we have to run delayed refs.  Running delayed refs isn't free, so I'd rather 
just coalesce these two operations together and then run delayed refs so we're 
completely sure we need to commit the transaction if we get to that point.  Thanks,

Josef


* Re: [PATCH 23/23] btrfs: do async reclaim for data reservations
  2020-01-31 22:36 ` [PATCH 23/23] btrfs: do async reclaim for data reservations Josef Bacik
@ 2020-02-03 17:19   ` Nikolay Borisov
  2020-02-03 18:51     ` Josef Bacik
  0 siblings, 1 reply; 56+ messages in thread
From: Nikolay Borisov @ 2020-02-03 17:19 UTC (permalink / raw)
  To: Josef Bacik, linux-btrfs, kernel-team



On 1.02.20 at 0:36, Josef Bacik wrote:
> Now that we have the data ticketing stuff in place, move normal data
> reservations to use an async reclaim helper to satisfy tickets.  Before
> we could have multiple tasks race in and both allocate chunks, resulting
> in more data chunks than we would necessarily need.  Serializing these
> allocations and making a single thread responsible for flushing will
> only allocate chunks as needed, as well as cut down on transaction
> commits and other flush related activities.
> 
> Priority reservations will still work as they have before, simply
> trying to allocate a chunk until they can make their reservation.
> 
> Signed-off-by: Josef Bacik <josef@toxicpanda.com>

Reviewed-by: Nikolay Borisov <nborisov@suse.com> but look below for one
minor nit.

> ---
>  fs/btrfs/ctree.h      |   3 +-
>  fs/btrfs/disk-io.c    |   2 +-
>  fs/btrfs/space-info.c | 123 ++++++++++++++++++++++++++++++------------
>  3 files changed, 91 insertions(+), 37 deletions(-)
> 
> diff --git a/fs/btrfs/ctree.h b/fs/btrfs/ctree.h
> index 865b24a1759e..709823a23c62 100644
> --- a/fs/btrfs/ctree.h
> +++ b/fs/btrfs/ctree.h
> @@ -493,7 +493,7 @@ enum btrfs_orphan_cleanup_state {
>  	ORPHAN_CLEANUP_DONE	= 2,
>  };
>  
> -void btrfs_init_async_reclaim_work(struct work_struct *work);
> +void btrfs_init_async_reclaim_work(struct btrfs_fs_info *fs_info);

nit: This doesn't bring much value apart from introducing yet another
function, I'd rather have the 2 INIT_WORK calls open coded in init_fs_info.

>  
>  /* fs_info */
>  struct reloc_control;
> @@ -917,6 +917,7 @@ struct btrfs_fs_info {
>  
>  	/* Used to reclaim the metadata space in the background. */
>  	struct work_struct async_reclaim_work;
> +	struct work_struct async_data_reclaim_work;
>  
>  	spinlock_t unused_bgs_lock;
>  	struct list_head unused_bgs;
> diff --git a/fs/btrfs/disk-io.c b/fs/btrfs/disk-io.c
> index 56d0a24aec74..825242f5c3f7 100644


* Re: [PATCH 23/23] btrfs: do async reclaim for data reservations
  2020-02-03 17:19   ` Nikolay Borisov
@ 2020-02-03 18:51     ` Josef Bacik
  0 siblings, 0 replies; 56+ messages in thread
From: Josef Bacik @ 2020-02-03 18:51 UTC (permalink / raw)
  To: Nikolay Borisov, linux-btrfs, kernel-team

On 2/3/20 12:19 PM, Nikolay Borisov wrote:
> 
> 
> On 1.02.20 at 0:36, Josef Bacik wrote:
>> Now that we have the data ticketing stuff in place, move normal data
>> reservations to use an async reclaim helper to satisfy tickets.  Before
>> we could have multiple tasks race in and both allocate chunks, resulting
>> in more data chunks than we would necessarily need.  Serializing these
>> allocations and making a single thread responsible for flushing will
>> only allocate chunks as needed, as well as cut down on transaction
>> commits and other flush related activities.
>>
>> Priority reservations will still work as they have before, simply
>> trying to allocate a chunk until they can make their reservation.
>>
>> Signed-off-by: Josef Bacik <josef@toxicpanda.com>
> 
> Reviewed-by: Nikolay Borisov <nborisov@suse.com> but look below for one
> minor nit.
> 
>> ---
>>   fs/btrfs/ctree.h      |   3 +-
>>   fs/btrfs/disk-io.c    |   2 +-
>>   fs/btrfs/space-info.c | 123 ++++++++++++++++++++++++++++++------------
>>   3 files changed, 91 insertions(+), 37 deletions(-)
>>
>> diff --git a/fs/btrfs/ctree.h b/fs/btrfs/ctree.h
>> index 865b24a1759e..709823a23c62 100644
>> --- a/fs/btrfs/ctree.h
>> +++ b/fs/btrfs/ctree.h
>> @@ -493,7 +493,7 @@ enum btrfs_orphan_cleanup_state {
>>   	ORPHAN_CLEANUP_DONE	= 2,
>>   };
>>   
>> -void btrfs_init_async_reclaim_work(struct work_struct *work);
>> +void btrfs_init_async_reclaim_work(struct btrfs_fs_info *fs_info);
> 
> nit: This doesn't bring much value apart from introducing yet another
> function, I'd rather have the 2 INIT_WORK calls open coded in init_fs_info.

Then I'd have to export the two reclaim functions, thats why this exists.  Thanks,

Josef


* [PATCH 18/23] btrfs: drop the commit_cycles stuff for data reservations
  2020-07-21 14:22 [PATCH 00/23][v4] Change data reservations to use the ticketing infra Josef Bacik
@ 2020-07-21 14:22 ` Josef Bacik
  0 siblings, 0 replies; 56+ messages in thread
From: Josef Bacik @ 2020-07-21 14:22 UTC (permalink / raw)
  To: linux-btrfs, kernel-team; +Cc: Nikolay Borisov

This was an old wart left over from how we previously did data
reservations.  Before we could have people race in and take a
reservation while we were flushing space, so we needed to make sure we
looped a few times before giving up.  Now that we're using the ticketing
infrastructure we don't have to worry about this and can drop the logic
altogether.

Reviewed-by: Nikolay Borisov <nborisov@suse.com>
Tested-by: Nikolay Borisov <nborisov@suse.com>
Signed-off-by: Josef Bacik <josef@toxicpanda.com>
---
 fs/btrfs/space-info.c | 22 ++--------------------
 1 file changed, 2 insertions(+), 20 deletions(-)

diff --git a/fs/btrfs/space-info.c b/fs/btrfs/space-info.c
index 605be9f4fcac..5efb3eb9253d 100644
--- a/fs/btrfs/space-info.c
+++ b/fs/btrfs/space-info.c
@@ -1077,7 +1077,6 @@ static void priority_reclaim_data_space(struct btrfs_fs_info *fs_info,
 					int states_nr)
 {
 	int flush_state = 0;
-	int commit_cycles = 2;
 
 	while (!space_info->full) {
 		flush_space(fs_info, space_info, U64_MAX, ALLOC_CHUNK_FORCE);
@@ -1088,21 +1087,9 @@ static void priority_reclaim_data_space(struct btrfs_fs_info *fs_info,
 		}
 		spin_unlock(&space_info->lock);
 	}
-again:
-	while (flush_state < states_nr) {
-		u64 flush_bytes = U64_MAX;
-
-		if (!commit_cycles) {
-			if (states[flush_state] == FLUSH_DELALLOC_WAIT) {
-				flush_state++;
-				continue;
-			}
-			if (states[flush_state] == COMMIT_TRANS)
-				flush_bytes = ticket->bytes;
-		}
 
-		flush_space(fs_info, space_info, flush_bytes,
-			    states[flush_state]);
+	while (flush_state < states_nr) {
+		flush_space(fs_info, space_info, U64_MAX, states[flush_state]);
 		spin_lock(&space_info->lock);
 		if (ticket->bytes == 0) {
 			spin_unlock(&space_info->lock);
@@ -1111,11 +1098,6 @@ static void priority_reclaim_data_space(struct btrfs_fs_info *fs_info,
 		spin_unlock(&space_info->lock);
 		flush_state++;
 	}
-	if (commit_cycles) {
-		commit_cycles--;
-		flush_state = 0;
-		goto again;
-	}
 }
 
 static void wait_reserve_ticket(struct btrfs_fs_info *fs_info,
-- 
2.24.1



* [PATCH 18/23] btrfs: drop the commit_cycles stuff for data reservations
  2020-07-08 13:59 [PATCH 00/23][v3] Change data reservations to use the ticketing infra Josef Bacik
@ 2020-07-08 14:00 ` Josef Bacik
  0 siblings, 0 replies; 56+ messages in thread
From: Josef Bacik @ 2020-07-08 14:00 UTC (permalink / raw)
  To: linux-btrfs, kernel-team; +Cc: Nikolay Borisov

This was an old wart left over from how we previously did data
reservations.  Before we could have people race in and take a
reservation while we were flushing space, so we needed to make sure we
looped a few times before giving up.  Now that we're using the ticketing
infrastructure we don't have to worry about this and can drop the logic
altogether.

Reviewed-by: Nikolay Borisov <nborisov@suse.com>
Tested-by: Nikolay Borisov <nborisov@suse.com>
Signed-off-by: Josef Bacik <josef@toxicpanda.com>
---
 fs/btrfs/space-info.c | 22 ++--------------------
 1 file changed, 2 insertions(+), 20 deletions(-)

diff --git a/fs/btrfs/space-info.c b/fs/btrfs/space-info.c
index 605be9f4fcac..5efb3eb9253d 100644
--- a/fs/btrfs/space-info.c
+++ b/fs/btrfs/space-info.c
@@ -1077,7 +1077,6 @@ static void priority_reclaim_data_space(struct btrfs_fs_info *fs_info,
 					int states_nr)
 {
 	int flush_state = 0;
-	int commit_cycles = 2;
 
 	while (!space_info->full) {
 		flush_space(fs_info, space_info, U64_MAX, ALLOC_CHUNK_FORCE);
@@ -1088,21 +1087,9 @@ static void priority_reclaim_data_space(struct btrfs_fs_info *fs_info,
 		}
 		spin_unlock(&space_info->lock);
 	}
-again:
-	while (flush_state < states_nr) {
-		u64 flush_bytes = U64_MAX;
-
-		if (!commit_cycles) {
-			if (states[flush_state] == FLUSH_DELALLOC_WAIT) {
-				flush_state++;
-				continue;
-			}
-			if (states[flush_state] == COMMIT_TRANS)
-				flush_bytes = ticket->bytes;
-		}
 
-		flush_space(fs_info, space_info, flush_bytes,
-			    states[flush_state]);
+	while (flush_state < states_nr) {
+		flush_space(fs_info, space_info, U64_MAX, states[flush_state]);
 		spin_lock(&space_info->lock);
 		if (ticket->bytes == 0) {
 			spin_unlock(&space_info->lock);
@@ -1111,11 +1098,6 @@ static void priority_reclaim_data_space(struct btrfs_fs_info *fs_info,
 		spin_unlock(&space_info->lock);
 		flush_state++;
 	}
-	if (commit_cycles) {
-		commit_cycles--;
-		flush_state = 0;
-		goto again;
-	}
 }
 
 static void wait_reserve_ticket(struct btrfs_fs_info *fs_info,
-- 
2.24.1



* [PATCH 18/23] btrfs: drop the commit_cycles stuff for data reservations
  2020-07-07 15:42 [PATCH 00/23][v2] Change data reservations to use the ticketing infra Josef Bacik
@ 2020-07-07 15:42 ` Josef Bacik
  0 siblings, 0 replies; 56+ messages in thread
From: Josef Bacik @ 2020-07-07 15:42 UTC (permalink / raw)
  To: linux-btrfs, kernel-team; +Cc: Nikolay Borisov

This was an old wart left over from how we previously did data
reservations.  Before we could have people race in and take a
reservation while we were flushing space, so we needed to make sure we
looped a few times before giving up.  Now that we're using the ticketing
infrastructure we don't have to worry about this and can drop the logic
altogether.

Reviewed-by: Nikolay Borisov <nborisov@suse.com>
Tested-by: Nikolay Borisov <nborisov@suse.com>
Signed-off-by: Josef Bacik <josef@toxicpanda.com>
---
 fs/btrfs/space-info.c | 22 ++--------------------
 1 file changed, 2 insertions(+), 20 deletions(-)

diff --git a/fs/btrfs/space-info.c b/fs/btrfs/space-info.c
index 1e070cd485bc..59d0ad1736e8 100644
--- a/fs/btrfs/space-info.c
+++ b/fs/btrfs/space-info.c
@@ -1077,7 +1077,6 @@ static void priority_reclaim_data_space(struct btrfs_fs_info *fs_info,
 					int states_nr)
 {
 	int flush_state = 0;
-	int commit_cycles = 2;
 
 	while (!space_info->full) {
 		flush_space(fs_info, space_info, U64_MAX, ALLOC_CHUNK_FORCE);
@@ -1088,21 +1087,9 @@ static void priority_reclaim_data_space(struct btrfs_fs_info *fs_info,
 		}
 		spin_unlock(&space_info->lock);
 	}
-again:
-	while (flush_state < states_nr) {
-		u64 flush_bytes = U64_MAX;
-
-		if (!commit_cycles) {
-			if (states[flush_state] == FLUSH_DELALLOC_WAIT) {
-				flush_state++;
-				continue;
-			}
-			if (states[flush_state] == COMMIT_TRANS)
-				flush_bytes = ticket->bytes;
-		}
 
-		flush_space(fs_info, space_info, flush_bytes,
-			    states[flush_state]);
+	while (flush_state < states_nr) {
+		flush_space(fs_info, space_info, U64_MAX, states[flush_state]);
 		spin_lock(&space_info->lock);
 		if (ticket->bytes == 0) {
 			spin_unlock(&space_info->lock);
@@ -1111,11 +1098,6 @@ static void priority_reclaim_data_space(struct btrfs_fs_info *fs_info,
 		spin_unlock(&space_info->lock);
 		flush_state++;
 	}
-	if (commit_cycles) {
-		commit_cycles--;
-		flush_state = 0;
-		goto again;
-	}
 }
 
 static void wait_reserve_ticket(struct btrfs_fs_info *fs_info,
-- 
2.24.1



* [PATCH 18/23] btrfs: drop the commit_cycles stuff for data reservations
  2020-06-30 13:58 [PATCH 00/23] Change data reservations to use the ticketing infra Josef Bacik
@ 2020-06-30 13:59 ` Josef Bacik
  0 siblings, 0 replies; 56+ messages in thread
From: Josef Bacik @ 2020-06-30 13:59 UTC (permalink / raw)
  To: linux-btrfs, kernel-team; +Cc: Nikolay Borisov

This was an old wart left over from how we previously did data
reservations.  Before we could have people race in and take a
reservation while we were flushing space, so we needed to make sure we
looped a few times before giving up.  Now that we're using the ticketing
infrastructure we don't have to worry about this and can drop the logic
altogether.

Reviewed-by: Nikolay Borisov <nborisov@suse.com>
Tested-by: Nikolay Borisov <nborisov@suse.com>
Signed-off-by: Josef Bacik <josef@toxicpanda.com>
---
 fs/btrfs/space-info.c | 22 ++--------------------
 1 file changed, 2 insertions(+), 20 deletions(-)

diff --git a/fs/btrfs/space-info.c b/fs/btrfs/space-info.c
index b5a007fab743..c88a31210b9a 100644
--- a/fs/btrfs/space-info.c
+++ b/fs/btrfs/space-info.c
@@ -1075,7 +1075,6 @@ static void priority_reclaim_data_space(struct btrfs_fs_info *fs_info,
 					int states_nr)
 {
 	int flush_state = 0;
-	int commit_cycles = 2;
 
 	while (!space_info->full) {
 		flush_space(fs_info, space_info, U64_MAX, ALLOC_CHUNK_FORCE);
@@ -1086,21 +1085,9 @@ static void priority_reclaim_data_space(struct btrfs_fs_info *fs_info,
 		}
 		spin_unlock(&space_info->lock);
 	}
-again:
-	while (flush_state < states_nr) {
-		u64 flush_bytes = U64_MAX;
-
-		if (!commit_cycles) {
-			if (states[flush_state] == FLUSH_DELALLOC_WAIT) {
-				flush_state++;
-				continue;
-			}
-			if (states[flush_state] == COMMIT_TRANS)
-				flush_bytes = ticket->bytes;
-		}
 
-		flush_space(fs_info, space_info, flush_bytes,
-			    states[flush_state]);
+	while (flush_state < states_nr) {
+		flush_space(fs_info, space_info, U64_MAX, states[flush_state]);
 		spin_lock(&space_info->lock);
 		if (ticket->bytes == 0) {
 			spin_unlock(&space_info->lock);
@@ -1109,11 +1096,6 @@ static void priority_reclaim_data_space(struct btrfs_fs_info *fs_info,
 		spin_unlock(&space_info->lock);
 		flush_state++;
 	}
-	if (commit_cycles) {
-		commit_cycles--;
-		flush_state = 0;
-		goto again;
-	}
 }
 
 static void wait_reserve_ticket(struct btrfs_fs_info *fs_info,
-- 
2.24.1



* Re: [PATCH 18/23] btrfs: drop the commit_cycles stuff for data reservations
  2020-02-04 16:19 ` [PATCH 18/23] btrfs: drop the commit_cycles stuff for data reservations Josef Bacik
@ 2020-02-05  7:44   ` Johannes Thumshirn
  0 siblings, 0 replies; 56+ messages in thread
From: Johannes Thumshirn @ 2020-02-05  7:44 UTC (permalink / raw)
  To: Josef Bacik, linux-btrfs, kernel-team; +Cc: Nikolay Borisov

Looks good,
Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>



* [PATCH 18/23] btrfs: drop the commit_cycles stuff for data reservations
  2020-02-04 16:19 [PATCH 0/23][v4] Convert data reservations to the ticketing infrastructure Josef Bacik
@ 2020-02-04 16:19 ` Josef Bacik
  2020-02-05  7:44   ` Johannes Thumshirn
  0 siblings, 1 reply; 56+ messages in thread
From: Josef Bacik @ 2020-02-04 16:19 UTC (permalink / raw)
  To: linux-btrfs, kernel-team; +Cc: Nikolay Borisov

This was an old wart left over from how we previously did data
reservations.  Back then, other tasks could race in and take a
reservation while we were flushing space, so we needed to make sure we
looped a few times before giving up.  Now that we're using the ticketing
infrastructure we don't have to worry about this and can drop the logic
altogether.

Reviewed-by: Nikolay Borisov <nborisov@suse.com>
Tested-by: Nikolay Borisov <nborisov@suse.com>
Signed-off-by: Josef Bacik <josef@toxicpanda.com>
---
 fs/btrfs/space-info.c | 22 ++--------------------
 1 file changed, 2 insertions(+), 20 deletions(-)

diff --git a/fs/btrfs/space-info.c b/fs/btrfs/space-info.c
index 225f7f8a0503..8c5543328ec4 100644
--- a/fs/btrfs/space-info.c
+++ b/fs/btrfs/space-info.c
@@ -859,7 +859,6 @@ static void priority_reclaim_data_space(struct btrfs_fs_info *fs_info,
 					int states_nr)
 {
 	int flush_state = 0;
-	int commit_cycles = 2;
 
 	while (!space_info->full) {
 		flush_space(fs_info, space_info, U64_MAX, ALLOC_CHUNK_FORCE);
@@ -870,21 +869,9 @@ static void priority_reclaim_data_space(struct btrfs_fs_info *fs_info,
 		}
 		spin_unlock(&space_info->lock);
 	}
-again:
-	while (flush_state < states_nr) {
-		u64 flush_bytes = U64_MAX;
-
-		if (!commit_cycles) {
-			if (states[flush_state] == FLUSH_DELALLOC_WAIT) {
-				flush_state++;
-				continue;
-			}
-			if (states[flush_state] == COMMIT_TRANS)
-				flush_bytes = ticket->bytes;
-		}
 
-		flush_space(fs_info, space_info, flush_bytes,
-			    states[flush_state]);
+	while (flush_state < states_nr) {
+		flush_space(fs_info, space_info, U64_MAX, states[flush_state]);
 		spin_lock(&space_info->lock);
 		if (ticket->bytes == 0) {
 			spin_unlock(&space_info->lock);
@@ -893,11 +880,6 @@ static void priority_reclaim_data_space(struct btrfs_fs_info *fs_info,
 		spin_unlock(&space_info->lock);
 		flush_state++;
 	}
-	if (commit_cycles) {
-		commit_cycles--;
-		flush_state = 0;
-		goto again;
-	}
 }
 
 static void wait_reserve_ticket(struct btrfs_fs_info *fs_info,
-- 
2.24.1



end of thread, other threads:[~2020-07-21 14:23 UTC | newest]

Thread overview: 56+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2020-01-31 22:35 [PATCH 00/23][v2] Convert data reservations to the ticketing infrastructure Josef Bacik
2020-01-31 22:35 ` [PATCH 01/23] btrfs: change nr to u64 in btrfs_start_delalloc_roots Josef Bacik
2020-02-03 10:45   ` Johannes Thumshirn
2020-02-03 11:15   ` Nikolay Borisov
2020-01-31 22:35 ` [PATCH 02/23] btrfs: remove orig from shrink_delalloc Josef Bacik
2020-02-03 10:46   ` Johannes Thumshirn
2020-01-31 22:35 ` [PATCH 03/23] btrfs: handle U64_MAX for shrink_delalloc Josef Bacik
2020-02-03 11:01   ` Johannes Thumshirn
2020-02-03 11:05   ` Johannes Thumshirn
2020-01-31 22:35 ` [PATCH 04/23] btrfs: make shrink_delalloc take space_info as an arg Josef Bacik
2020-02-03 10:53   ` Johannes Thumshirn
2020-01-31 22:35 ` [PATCH 05/23] btrfs: make ALLOC_CHUNK use the space info flags Josef Bacik
2020-02-03 11:19   ` Johannes Thumshirn
2020-01-31 22:35 ` [PATCH 06/23] btrfs: call btrfs_try_granting_tickets when freeing reserved bytes Josef Bacik
2020-01-31 22:35 ` [PATCH 07/23] btrfs: call btrfs_try_granting_tickets when unpinning anything Josef Bacik
2020-02-03 11:24   ` Johannes Thumshirn
2020-01-31 22:35 ` [PATCH 08/23] btrfs: call btrfs_try_granting_tickets when reserving space Josef Bacik
2020-01-31 22:35 ` [PATCH 09/23] btrfs: use the btrfs_space_info_free_bytes_may_use helper for delalloc Josef Bacik
2020-02-03 14:46   ` Johannes Thumshirn
2020-01-31 22:36 ` [PATCH 10/23] btrfs: use btrfs_start_delalloc_roots in shrink_delalloc Josef Bacik
2020-02-03 12:03   ` Nikolay Borisov
2020-01-31 22:36 ` [PATCH 11/23] btrfs: check tickets after waiting on ordered extents Josef Bacik
2020-02-03 13:10   ` Nikolay Borisov
2020-02-03 15:57     ` Josef Bacik
2020-01-31 22:36 ` [PATCH 12/23] btrfs: add the data transaction commit logic into may_commit_transaction Josef Bacik
2020-02-03 14:00   ` Nikolay Borisov
2020-01-31 22:36 ` [PATCH 13/23] btrfs: add flushing states for handling data reservations Josef Bacik
2020-02-03 14:00   ` Nikolay Borisov
2020-01-31 22:36 ` [PATCH 14/23] btrfs: add btrfs_reserve_data_bytes and use it Josef Bacik
2020-02-03 14:20   ` Nikolay Borisov
2020-01-31 22:36 ` [PATCH 15/23] btrfs: use ticketing for data space reservations Josef Bacik
2020-02-03 14:29   ` Nikolay Borisov
2020-01-31 22:36 ` [PATCH 16/23] btrfs: serialize data reservations if we are flushing Josef Bacik
2020-02-03 15:06   ` Nikolay Borisov
2020-01-31 22:36 ` [PATCH 17/23] btrfs: use the same helper for data and metadata reservations Josef Bacik
2020-02-03 15:47   ` Nikolay Borisov
2020-01-31 22:36 ` [PATCH 18/23] btrfs: drop the commit_cycles stuff for data reservations Josef Bacik
2020-02-03 16:02   ` Nikolay Borisov
2020-01-31 22:36 ` [PATCH 19/23] btrfs: don't pass bytes_needed to may_commit_transaction Josef Bacik
2020-02-03 16:10   ` Nikolay Borisov
2020-01-31 22:36 ` [PATCH 20/23] btrfs: don't force commit if we are data Josef Bacik
2020-02-03 16:12   ` Nikolay Borisov
2020-01-31 22:36 ` [PATCH 21/23] btrfs: run delayed iputs before committing the transaction for data Josef Bacik
2020-02-03 16:13   ` Nikolay Borisov
2020-01-31 22:36 ` [PATCH 22/23] btrfs: flush delayed refs when trying to reserve data space Josef Bacik
2020-02-03 16:16   ` Nikolay Borisov
2020-02-03 16:24     ` Josef Bacik
2020-01-31 22:36 ` [PATCH 23/23] btrfs: do async reclaim for data reservations Josef Bacik
2020-02-03 17:19   ` Nikolay Borisov
2020-02-03 18:51     ` Josef Bacik
2020-02-04 16:19 [PATCH 0/23][v4] Convert data reservations to the ticketing infrastructure Josef Bacik
2020-02-04 16:19 ` [PATCH 18/23] btrfs: drop the commit_cycles stuff for data reservations Josef Bacik
2020-02-05  7:44   ` Johannes Thumshirn
2020-06-30 13:58 [PATCH 00/23] Change data reservations to use the ticketing infra Josef Bacik
2020-06-30 13:59 ` [PATCH 18/23] btrfs: drop the commit_cycles stuff for data reservations Josef Bacik
2020-07-07 15:42 [PATCH 00/23][v2] Change data reservations to use the ticketing infra Josef Bacik
2020-07-07 15:42 ` [PATCH 18/23] btrfs: drop the commit_cycles stuff for data reservations Josef Bacik
2020-07-08 13:59 [PATCH 00/23][v3] Change data reservations to use the ticketing infra Josef Bacik
2020-07-08 14:00 ` [PATCH 18/23] btrfs: drop the commit_cycles stuff for data reservations Josef Bacik
2020-07-21 14:22 [PATCH 00/23][v4] Change data reservations to use the ticketing infra Josef Bacik
2020-07-21 14:22 ` [PATCH 18/23] btrfs: drop the commit_cycles stuff for data reservations Josef Bacik
