* [PATCH 00/20][RFC] Convert data reservations to the ticketing infrastructure
@ 2020-01-29 23:50 Josef Bacik
  2020-01-29 23:50 ` [PATCH 01/20] btrfs: change nr to u64 in btrfs_start_delalloc_roots Josef Bacik
                   ` (19 more replies)
  0 siblings, 20 replies; 33+ messages in thread
From: Josef Bacik @ 2020-01-29 23:50 UTC (permalink / raw)
  To: linux-btrfs, kernel-team

Nikolay reported a problem where we'd occasionally fail generic/371.  This test
has two tasks running in a loop, one that fallocates and rm's, and one that
pwrites and rm's.  There is enough space for both of these files to exist at
one time, but sometimes the pwrite would fail.

It would fail because we do not serialize data reservations.  If one task is
stuck doing the reclaim work, and another task comes in and steals its
reservation enough times, we'll give up and return ENOSPC.  We validated this by
adding a printk to the data reservation path to tell us that one task was
succeeding at making a reservation while another task was flushing.
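The difference ticketing makes can be sketched outside the kernel (a toy model with hypothetical names, not btrfs code): without tickets, whoever calls first after space frees up wins, so a late arrival can repeatedly steal space from a task that is already flushing; with a FIFO ticket list, freed space is handed to the oldest waiter first.

```c
#include <assert.h>
#include <stdint.h>
#include <stddef.h>

/* Toy model of a space_info: free bytes plus a FIFO of waiting tickets. */
struct ticket { uint64_t bytes; int granted; };

struct space {
	uint64_t free;
	struct ticket *queue[8];
	size_t head, tail;
};

/* Without ticketing: whoever asks first after space frees up wins. */
static int reserve_no_tickets(struct space *s, uint64_t bytes)
{
	if (s->free >= bytes) {
		s->free -= bytes;
		return 0;
	}
	return -1;	/* ENOSPC; caller must flush and retry, racing others */
}

static void enqueue(struct space *s, struct ticket *t)
{
	s->queue[s->tail++ % 8] = t;
}

/* With ticketing: freed space goes to the oldest ticket first,
 * mirroring the idea behind btrfs_try_granting_tickets(). */
static void free_space(struct space *s, uint64_t bytes)
{
	s->free += bytes;
	while (s->head < s->tail) {
		struct ticket *t = s->queue[s->head % 8];
		if (s->free < t->bytes)
			break;
		s->free -= t->bytes;
		t->granted = 1;
		s->head++;
	}
}

static int reserve_ticketed(struct space *s, uint64_t bytes, struct ticket *t)
{
	if (s->head == s->tail && s->free >= bytes) {
		s->free -= bytes;
		t->granted = 1;
		return 0;
	}
	t->bytes = bytes;
	t->granted = 0;
	enqueue(s, t);
	return 1;	/* caller would wait on the ticket */
}
```

In the first model a task that triggered reclaim has no claim on the space it freed; in the second, the FIFO guarantees it cannot be starved by later arrivals.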

To solve this problem I've converted data reservations over to the ticketing
system that metadata uses.  There are some cleanups and some fixes that have to
come first.  The following are simply cleanups:

  [PATCH 01/20] btrfs: change nr to u64 in btrfs_start_delalloc_roots
  [PATCH 02/20] btrfs: remove orig from shrink_delalloc
  [PATCH 03/20] btrfs: handle U64_MAX for shrink_delalloc

The following are fixes that are needed to handle data space infos properly.

  [PATCH 04/20] btrfs: make shrink_delalloc take space_info as an arg
  [PATCH 05/20] btrfs: make ALLOC_CHUNK use the space info flags
  [PATCH 06/20] btrfs: call btrfs_try_granting_tickets when freeing
  [PATCH 07/20] btrfs: call btrfs_try_granting_tickets when unpinning
  [PATCH 08/20] btrfs: call btrfs_try_granting_tickets when reserving
  [PATCH 09/20] btrfs: use the btrfs_space_info_free_bytes_may_use

I then converted the data reservation path over to the ticketing infrastructure,
but in a way that mirrors exactly what we currently have.  The idea is that I
want to be able to bisect regressions that come from behavior changes, and doing
that would be hard with a single patch doing the whole conversion at once.  So
the following patches only move code around logically, preserving the same
behavior as before:

  [PATCH 10/20] btrfs: add flushing states for handling data
  [PATCH 11/20] btrfs: add btrfs_reserve_data_bytes and use it
  [PATCH 12/20] btrfs: use ticketing for data space reservations

And then the following patches change the behavior of how we flush space for
data reservations:

  [PATCH 13/20] btrfs: run delayed iputs before committing the
  [PATCH 14/20] btrfs: flush delayed refs when trying to reserve data
  [PATCH 15/20] btrfs: serialize data reservations if we are flushing
  [PATCH 16/20] btrfs: rework chunk allocate for data reservations
  [PATCH 17/20] btrfs: drop the commit_cycles stuff for data

And then a cleanup now that the data reservation code is the same as the
metadata reservation code.

  [PATCH 18/20] btrfs: use the same helper for data and metadata

Finally a patch to change the flushing from direct to asynchronous, mirroring
what we do for metadata for better latency.

  [PATCH 19/20] btrfs: do async reclaim for data reservations

And then a final cleanup now that we're where we want to be with the data
reservation path.

  [PATCH 20/20] btrfs: kill the priority_reclaim_space helper

I've marked this as an RFC because I've only tested this with generic/371.  I'm
starting my overnight runs of xfstests now and will likely find regressions, but
I'd like to get review started so this can get merged ASAP.  Thanks,

Josef


^ permalink raw reply	[flat|nested] 33+ messages in thread

* [PATCH 01/20] btrfs: change nr to u64 in btrfs_start_delalloc_roots
  2020-01-29 23:50 [PATCH 00/20][RFC] Convert data reservations to the ticketing infrastructure Josef Bacik
@ 2020-01-29 23:50 ` Josef Bacik
  2020-01-30 12:06   ` Nikolay Borisov
  2020-01-29 23:50 ` [PATCH 02/20] btrfs: remove orig from shrink_delalloc Josef Bacik
                   ` (18 subsequent siblings)
  19 siblings, 1 reply; 33+ messages in thread
From: Josef Bacik @ 2020-01-29 23:50 UTC (permalink / raw)
  To: linux-btrfs, kernel-team

We have btrfs_wait_ordered_roots(), which takes a u64 for nr, but
btrfs_start_delalloc_roots() takes an int for nr, which makes using
them in conjunction, especially for something like (u64)-1, annoying and
inconsistent.  Fix btrfs_start_delalloc_roots() to take a u64 for nr and
adjust start_delalloc_inodes() and its callers appropriately.

Part of adjusting the callers to this means changing
btrfs_writeback_inodes_sb_nr() to take a u64 for items.  This may seem
unrelated, but the caller of btrfs_writeback_inodes_sb_nr() already
passes in a u64; it's just the function parameter that needs to be
changed.
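The sentinel mismatch is easy to demonstrate in isolation (a standalone sketch, not kernel code): the int-taking API spells "no limit" as -1 while the u64-taking API spells it as (u64)-1, and it is only sign extension that makes the two line up at all.

```c
#include <assert.h>
#include <stdint.h>

/* The int-based API spells "no limit" as -1 ... */
static int is_unlimited_int(int nr)
{
	return nr == -1;
}

/* ... while the u64-based API spells it as (u64)-1, i.e. U64_MAX. */
static int is_unlimited_u64(uint64_t nr)
{
	return nr == UINT64_MAX;
}
```

Converting -1 to uint64_t sign-extends to UINT64_MAX, so callers happen to agree, but carrying one u64 sentinel through the whole call chain removes the conversion entirely.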

Signed-off-by: Josef Bacik <josef@toxicpanda.com>
---
 fs/btrfs/ctree.h       |  2 +-
 fs/btrfs/dev-replace.c |  2 +-
 fs/btrfs/inode.c       | 27 +++++++++++----------------
 fs/btrfs/ioctl.c       |  2 +-
 fs/btrfs/space-info.c  |  2 +-
 5 files changed, 15 insertions(+), 20 deletions(-)

diff --git a/fs/btrfs/ctree.h b/fs/btrfs/ctree.h
index 8a2c1665baad..6afa0885a9bb 100644
--- a/fs/btrfs/ctree.h
+++ b/fs/btrfs/ctree.h
@@ -2884,7 +2884,7 @@ int btrfs_truncate_inode_items(struct btrfs_trans_handle *trans,
 			       u32 min_type);
 
 int btrfs_start_delalloc_snapshot(struct btrfs_root *root);
-int btrfs_start_delalloc_roots(struct btrfs_fs_info *fs_info, int nr);
+int btrfs_start_delalloc_roots(struct btrfs_fs_info *fs_info, u64 nr);
 int btrfs_set_extent_delalloc(struct inode *inode, u64 start, u64 end,
 			      unsigned int extra_bits,
 			      struct extent_state **cached_state);
diff --git a/fs/btrfs/dev-replace.c b/fs/btrfs/dev-replace.c
index f639dde2a679..6ff08eb3c35d 100644
--- a/fs/btrfs/dev-replace.c
+++ b/fs/btrfs/dev-replace.c
@@ -593,7 +593,7 @@ static int btrfs_dev_replace_finishing(struct btrfs_fs_info *fs_info,
 	 * flush all outstanding I/O and inode extent mappings before the
 	 * copy operation is declared as being finished
 	 */
-	ret = btrfs_start_delalloc_roots(fs_info, -1);
+	ret = btrfs_start_delalloc_roots(fs_info, U64_MAX);
 	if (ret) {
 		mutex_unlock(&dev_replace->lock_finishing_cancel_unmount);
 		return ret;
diff --git a/fs/btrfs/inode.c b/fs/btrfs/inode.c
index 9320f13778ce..5c6ce78bff1d 100644
--- a/fs/btrfs/inode.c
+++ b/fs/btrfs/inode.c
@@ -9619,7 +9619,8 @@ static struct btrfs_delalloc_work *btrfs_alloc_delalloc_work(struct inode *inode
  * some fairly slow code that needs optimization. This walks the list
  * of all the inodes with pending delalloc and forces them to disk.
  */
-static int start_delalloc_inodes(struct btrfs_root *root, int nr, bool snapshot)
+static int start_delalloc_inodes(struct btrfs_root *root, u64 *nr,
+				 bool snapshot)
 {
 	struct btrfs_inode *binode;
 	struct inode *inode;
@@ -9659,9 +9660,11 @@ static int start_delalloc_inodes(struct btrfs_root *root, int nr, bool snapshot)
 		list_add_tail(&work->list, &works);
 		btrfs_queue_work(root->fs_info->flush_workers,
 				 &work->work);
-		ret++;
-		if (nr != -1 && ret >= nr)
-			goto out;
+		if (*nr != U64_MAX) {
+			(*nr)--;
+			if (*nr == 0)
+				goto out;
+		}
 		cond_resched();
 		spin_lock(&root->delalloc_lock);
 	}
@@ -9686,18 +9689,15 @@ static int start_delalloc_inodes(struct btrfs_root *root, int nr, bool snapshot)
 int btrfs_start_delalloc_snapshot(struct btrfs_root *root)
 {
 	struct btrfs_fs_info *fs_info = root->fs_info;
-	int ret;
+	u64 nr = U64_MAX;
 
 	if (test_bit(BTRFS_FS_STATE_ERROR, &fs_info->fs_state))
 		return -EROFS;
 
-	ret = start_delalloc_inodes(root, -1, true);
-	if (ret > 0)
-		ret = 0;
-	return ret;
+	return start_delalloc_inodes(root, &nr, true);
 }
 
-int btrfs_start_delalloc_roots(struct btrfs_fs_info *fs_info, int nr)
+int btrfs_start_delalloc_roots(struct btrfs_fs_info *fs_info, u64 nr)
 {
 	struct btrfs_root *root;
 	struct list_head splice;
@@ -9720,15 +9720,10 @@ int btrfs_start_delalloc_roots(struct btrfs_fs_info *fs_info, int nr)
 			       &fs_info->delalloc_roots);
 		spin_unlock(&fs_info->delalloc_root_lock);
 
-		ret = start_delalloc_inodes(root, nr, false);
+		ret = start_delalloc_inodes(root, &nr, false);
 		btrfs_put_root(root);
 		if (ret < 0)
 			goto out;
-
-		if (nr != -1) {
-			nr -= ret;
-			WARN_ON(nr < 0);
-		}
 		spin_lock(&fs_info->delalloc_root_lock);
 	}
 	spin_unlock(&fs_info->delalloc_root_lock);
diff --git a/fs/btrfs/ioctl.c b/fs/btrfs/ioctl.c
index ecb6b188df15..442c89502f06 100644
--- a/fs/btrfs/ioctl.c
+++ b/fs/btrfs/ioctl.c
@@ -5510,7 +5510,7 @@ long btrfs_ioctl(struct file *file, unsigned int
 	case BTRFS_IOC_SYNC: {
 		int ret;
 
-		ret = btrfs_start_delalloc_roots(fs_info, -1);
+		ret = btrfs_start_delalloc_roots(fs_info, U64_MAX);
 		if (ret)
 			return ret;
 		ret = btrfs_sync_fs(inode->i_sb, 1);
diff --git a/fs/btrfs/space-info.c b/fs/btrfs/space-info.c
index 01297c5b2666..edda1ee0455e 100644
--- a/fs/btrfs/space-info.c
+++ b/fs/btrfs/space-info.c
@@ -310,7 +310,7 @@ void btrfs_dump_space_info(struct btrfs_fs_info *fs_info,
 }
 
 static void btrfs_writeback_inodes_sb_nr(struct btrfs_fs_info *fs_info,
-					 unsigned long nr_pages, int nr_items)
+					 unsigned long nr_pages, u64 nr_items)
 {
 	struct super_block *sb = fs_info->sb;
 
-- 
2.24.1



* [PATCH 02/20] btrfs: remove orig from shrink_delalloc
  2020-01-29 23:50 [PATCH 00/20][RFC] Convert data reservations to the ticketing infrastructure Josef Bacik
  2020-01-29 23:50 ` [PATCH 01/20] btrfs: change nr to u64 in btrfs_start_delalloc_roots Josef Bacik
@ 2020-01-29 23:50 ` Josef Bacik
  2020-01-30 11:44   ` Nikolay Borisov
  2020-01-29 23:50 ` [PATCH 03/20] btrfs: handle U64_MAX for shrink_delalloc Josef Bacik
                   ` (17 subsequent siblings)
  19 siblings, 1 reply; 33+ messages in thread
From: Josef Bacik @ 2020-01-29 23:50 UTC (permalink / raw)
  To: linux-btrfs, kernel-team

We don't use orig anywhere inside shrink_delalloc(), so remove it.

Signed-off-by: Josef Bacik <josef@toxicpanda.com>
---
 fs/btrfs/space-info.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/fs/btrfs/space-info.c b/fs/btrfs/space-info.c
index edda1ee0455e..671c3a379224 100644
--- a/fs/btrfs/space-info.c
+++ b/fs/btrfs/space-info.c
@@ -350,7 +350,7 @@ static inline u64 calc_reclaim_items_nr(struct btrfs_fs_info *fs_info,
  * shrink metadata reservation for delalloc
  */
 static void shrink_delalloc(struct btrfs_fs_info *fs_info, u64 to_reclaim,
-			    u64 orig, bool wait_ordered)
+			    bool wait_ordered)
 {
 	struct btrfs_space_info *space_info;
 	struct btrfs_trans_handle *trans;
@@ -569,7 +569,7 @@ static void flush_space(struct btrfs_fs_info *fs_info,
 		break;
 	case FLUSH_DELALLOC:
 	case FLUSH_DELALLOC_WAIT:
-		shrink_delalloc(fs_info, num_bytes * 2, num_bytes,
+		shrink_delalloc(fs_info, num_bytes * 2,
 				state == FLUSH_DELALLOC_WAIT);
 		break;
 	case FLUSH_DELAYED_REFS_NR:
-- 
2.24.1



* [PATCH 03/20] btrfs: handle U64_MAX for shrink_delalloc
  2020-01-29 23:50 [PATCH 00/20][RFC] Convert data reservations to the ticketing infrastructure Josef Bacik
  2020-01-29 23:50 ` [PATCH 01/20] btrfs: change nr to u64 in btrfs_start_delalloc_roots Josef Bacik
  2020-01-29 23:50 ` [PATCH 02/20] btrfs: remove orig from shrink_delalloc Josef Bacik
@ 2020-01-29 23:50 ` Josef Bacik
  2020-01-30 11:46   ` Nikolay Borisov
  2020-01-29 23:50 ` [PATCH 04/20] btrfs: make shrink_delalloc take space_info as an arg Josef Bacik
                   ` (16 subsequent siblings)
  19 siblings, 1 reply; 33+ messages in thread
From: Josef Bacik @ 2020-01-29 23:50 UTC (permalink / raw)
  To: linux-btrfs, kernel-team

Data allocations are going to want to pass in U64_MAX to flush all space;
adjust shrink_delalloc() to handle this properly.
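The guard being added can be sketched like this (a standalone model; the item size matches fs/btrfs/space-info.c, but the rounding in calc_reclaim_items_nr() is an assumption here): U64_MAX has to be special-cased so that the items-to-bytes multiplication does not wrap around.

```c
#include <assert.h>
#include <stdint.h>

#define EXTENT_SIZE_PER_ITEM	(256ULL * 1024)	/* as in fs/btrfs/space-info.c */

/* Sketch of the patched logic: treat U64_MAX as "flush everything"
 * instead of letting items * EXTENT_SIZE_PER_ITEM overflow. */
static uint64_t items_to_flush(uint64_t to_reclaim)
{
	if (to_reclaim == UINT64_MAX)
		return UINT64_MAX;
	/* reclaim 2x as much delalloc as the metadata we need back,
	 * since freeing delalloc doesn't map 1:1 onto metadata space */
	return ((to_reclaim + EXTENT_SIZE_PER_ITEM - 1) / EXTENT_SIZE_PER_ITEM) * 2;
}
```

This is also why the 2x multiplier moves out of flush_space() and into shrink_delalloc() in this patch: doubling U64_MAX at the call site would have wrapped to a nonsense value.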

Signed-off-by: Josef Bacik <josef@toxicpanda.com>
---
 fs/btrfs/space-info.c | 17 ++++++++++++++---
 1 file changed, 14 insertions(+), 3 deletions(-)

diff --git a/fs/btrfs/space-info.c b/fs/btrfs/space-info.c
index 671c3a379224..924cee245e4a 100644
--- a/fs/btrfs/space-info.c
+++ b/fs/btrfs/space-info.c
@@ -363,8 +363,19 @@ static void shrink_delalloc(struct btrfs_fs_info *fs_info, u64 to_reclaim,
 	int loops;
 
 	/* Calc the number of the pages we need flush for space reservation */
-	items = calc_reclaim_items_nr(fs_info, to_reclaim);
-	to_reclaim = items * EXTENT_SIZE_PER_ITEM;
+	if (to_reclaim == U64_MAX) {
+		items = U64_MAX;
+	} else {
+		/*
+		 * to_reclaim is set to however much metadata we need to
+		 * reclaim, but reclaiming that much data doesn't really track
+		 * exactly, so increase the amount to reclaim by 2x in order to
+		 * make sure we're flushing enough delalloc to hopefully reclaim
+		 * some metadata reservations.
+		 */
+		items = calc_reclaim_items_nr(fs_info, to_reclaim) * 2;
+		to_reclaim = items * EXTENT_SIZE_PER_ITEM;
+	}
 
 	trans = (struct btrfs_trans_handle *)current->journal_info;
 	space_info = btrfs_find_space_info(fs_info, BTRFS_BLOCK_GROUP_METADATA);
@@ -569,7 +580,7 @@ static void flush_space(struct btrfs_fs_info *fs_info,
 		break;
 	case FLUSH_DELALLOC:
 	case FLUSH_DELALLOC_WAIT:
-		shrink_delalloc(fs_info, num_bytes * 2,
+		shrink_delalloc(fs_info, num_bytes,
 				state == FLUSH_DELALLOC_WAIT);
 		break;
 	case FLUSH_DELAYED_REFS_NR:
-- 
2.24.1



* [PATCH 04/20] btrfs: make shrink_delalloc take space_info as an arg
  2020-01-29 23:50 [PATCH 00/20][RFC] Convert data reservations to the ticketing infrastructure Josef Bacik
                   ` (2 preceding siblings ...)
  2020-01-29 23:50 ` [PATCH 03/20] btrfs: handle U64_MAX for shrink_delalloc Josef Bacik
@ 2020-01-29 23:50 ` Josef Bacik
  2020-01-30 12:15   ` Nikolay Borisov
  2020-01-30 12:35   ` Nikolay Borisov
  2020-01-29 23:50 ` [PATCH 05/20] btrfs: make ALLOC_CHUNK use the space info flags Josef Bacik
                   ` (15 subsequent siblings)
  19 siblings, 2 replies; 33+ messages in thread
From: Josef Bacik @ 2020-01-29 23:50 UTC (permalink / raw)
  To: linux-btrfs, kernel-team

Currently shrink_delalloc() just looks up the metadata space info, but
this won't work if we're trying to reclaim space for data chunks.  The
space_info we want is already passed into flush_space(), so simply pass
it along to shrink_delalloc().

Signed-off-by: Josef Bacik <josef@toxicpanda.com>
---
 fs/btrfs/space-info.c | 9 ++++-----
 1 file changed, 4 insertions(+), 5 deletions(-)

diff --git a/fs/btrfs/space-info.c b/fs/btrfs/space-info.c
index 924cee245e4a..17e2b5a53cb5 100644
--- a/fs/btrfs/space-info.c
+++ b/fs/btrfs/space-info.c
@@ -349,10 +349,10 @@ static inline u64 calc_reclaim_items_nr(struct btrfs_fs_info *fs_info,
 /*
  * shrink metadata reservation for delalloc
  */
-static void shrink_delalloc(struct btrfs_fs_info *fs_info, u64 to_reclaim,
-			    bool wait_ordered)
+static void shrink_delalloc(struct btrfs_fs_info *fs_info,
+			    struct btrfs_space_info *space_info,
+			    u64 to_reclaim, bool wait_ordered)
 {
-	struct btrfs_space_info *space_info;
 	struct btrfs_trans_handle *trans;
 	u64 delalloc_bytes;
 	u64 dio_bytes;
@@ -378,7 +378,6 @@ static void shrink_delalloc(struct btrfs_fs_info *fs_info, u64 to_reclaim,
 	}
 
 	trans = (struct btrfs_trans_handle *)current->journal_info;
-	space_info = btrfs_find_space_info(fs_info, BTRFS_BLOCK_GROUP_METADATA);
 
 	delalloc_bytes = percpu_counter_sum_positive(
 						&fs_info->delalloc_bytes);
@@ -580,7 +579,7 @@ static void flush_space(struct btrfs_fs_info *fs_info,
 		break;
 	case FLUSH_DELALLOC:
 	case FLUSH_DELALLOC_WAIT:
-		shrink_delalloc(fs_info, num_bytes,
+		shrink_delalloc(fs_info, space_info, num_bytes,
 				state == FLUSH_DELALLOC_WAIT);
 		break;
 	case FLUSH_DELAYED_REFS_NR:
-- 
2.24.1



* [PATCH 05/20] btrfs: make ALLOC_CHUNK use the space info flags
  2020-01-29 23:50 [PATCH 00/20][RFC] Convert data reservations to the ticketing infrastructure Josef Bacik
                   ` (3 preceding siblings ...)
  2020-01-29 23:50 ` [PATCH 04/20] btrfs: make shrink_delalloc take space_info as an arg Josef Bacik
@ 2020-01-29 23:50 ` Josef Bacik
  2020-01-30 13:19   ` Nikolay Borisov
  2020-01-29 23:50 ` [PATCH 06/20] btrfs: call btrfs_try_granting_tickets when freeing reserved bytes Josef Bacik
                   ` (14 subsequent siblings)
  19 siblings, 1 reply; 33+ messages in thread
From: Josef Bacik @ 2020-01-29 23:50 UTC (permalink / raw)
  To: linux-btrfs, kernel-team

We have traditionally used flush_space() to flush metadata space, so
we've been unconditionally using btrfs_metadata_alloc_profile() for our
profile to allocate a chunk.  However if we're going to use this for
data we need to use btrfs_get_alloc_profile() on the space_info we pass
in.

Signed-off-by: Josef Bacik <josef@toxicpanda.com>
---
 fs/btrfs/space-info.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/fs/btrfs/space-info.c b/fs/btrfs/space-info.c
index 17e2b5a53cb5..5a92851af2b3 100644
--- a/fs/btrfs/space-info.c
+++ b/fs/btrfs/space-info.c
@@ -604,7 +604,8 @@ static void flush_space(struct btrfs_fs_info *fs_info,
 			break;
 		}
 		ret = btrfs_chunk_alloc(trans,
-				btrfs_metadata_alloc_profile(fs_info),
+				btrfs_get_alloc_profile(fs_info,
+							space_info->flags),
 				(state == ALLOC_CHUNK) ? CHUNK_ALLOC_NO_FORCE :
 					CHUNK_ALLOC_FORCE);
 		btrfs_end_transaction(trans);
-- 
2.24.1



* [PATCH 06/20] btrfs: call btrfs_try_granting_tickets when freeing reserved bytes
  2020-01-29 23:50 [PATCH 00/20][RFC] Convert data reservations to the ticketing infrastructure Josef Bacik
                   ` (4 preceding siblings ...)
  2020-01-29 23:50 ` [PATCH 05/20] btrfs: make ALLOC_CHUNK use the space info flags Josef Bacik
@ 2020-01-29 23:50 ` Josef Bacik
  2020-01-31 11:16   ` Nikolay Borisov
  2020-01-29 23:50 ` [PATCH 07/20] btrfs: call btrfs_try_granting_tickets when unpinning anything Josef Bacik
                   ` (13 subsequent siblings)
  19 siblings, 1 reply; 33+ messages in thread
From: Josef Bacik @ 2020-01-29 23:50 UTC (permalink / raw)
  To: linux-btrfs, kernel-team

We were missing a call to btrfs_try_granting_tickets in
btrfs_free_reserved_bytes, so add it to handle the case where we're able
to satisfy an allocation because we've freed a pending reservation.

Signed-off-by: Josef Bacik <josef@toxicpanda.com>
---
 fs/btrfs/block-group.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/fs/btrfs/block-group.c b/fs/btrfs/block-group.c
index 77ec0597bd17..616d0dd69394 100644
--- a/fs/btrfs/block-group.c
+++ b/fs/btrfs/block-group.c
@@ -2932,6 +2932,8 @@ void btrfs_free_reserved_bytes(struct btrfs_block_group *cache,
 	if (delalloc)
 		cache->delalloc_bytes -= num_bytes;
 	spin_unlock(&cache->lock);
+
+	btrfs_try_granting_tickets(cache->fs_info, space_info);
 	spin_unlock(&space_info->lock);
 }
 
-- 
2.24.1



* [PATCH 07/20] btrfs: call btrfs_try_granting_tickets when unpinning anything
  2020-01-29 23:50 [PATCH 00/20][RFC] Convert data reservations to the ticketing infrastructure Josef Bacik
                   ` (5 preceding siblings ...)
  2020-01-29 23:50 ` [PATCH 06/20] btrfs: call btrfs_try_granting_tickets when freeing reserved bytes Josef Bacik
@ 2020-01-29 23:50 ` Josef Bacik
  2020-01-31 11:25   ` Nikolay Borisov
  2020-01-29 23:50 ` [PATCH 08/20] btrfs: call btrfs_try_granting_tickets when reserving space Josef Bacik
                   ` (12 subsequent siblings)
  19 siblings, 1 reply; 33+ messages in thread
From: Josef Bacik @ 2020-01-29 23:50 UTC (permalink / raw)
  To: linux-btrfs, kernel-team

When unpinning we were only calling btrfs_try_granting_tickets() if
global_rsv->space_info == space_info, which is problematic because we
use ticketing for SYSTEM chunks, and want to use it for DATA as well.
Fix this by moving this call outside of that if statement.

Signed-off-by: Josef Bacik <josef@toxicpanda.com>
---
 fs/btrfs/extent-tree.c | 7 +++----
 1 file changed, 3 insertions(+), 4 deletions(-)

diff --git a/fs/btrfs/extent-tree.c b/fs/btrfs/extent-tree.c
index 0783341ef2e7..dfa810854850 100644
--- a/fs/btrfs/extent-tree.c
+++ b/fs/btrfs/extent-tree.c
@@ -2885,11 +2885,10 @@ static int unpin_extent_range(struct btrfs_fs_info *fs_info,
 				len -= to_add;
 			}
 			spin_unlock(&global_rsv->lock);
-			/* Add to any tickets we may have */
-			if (len)
-				btrfs_try_granting_tickets(fs_info,
-							   space_info);
 		}
+		/* Add to any tickets we may have */
+		if (!readonly && return_free_space && len)
+			btrfs_try_granting_tickets(fs_info, space_info);
 		spin_unlock(&space_info->lock);
 	}
 
-- 
2.24.1



* [PATCH 08/20] btrfs: call btrfs_try_granting_tickets when reserving space
  2020-01-29 23:50 [PATCH 00/20][RFC] Convert data reservations to the ticketing infrastructure Josef Bacik
                   ` (6 preceding siblings ...)
  2020-01-29 23:50 ` [PATCH 07/20] btrfs: call btrfs_try_granting_tickets when unpinning anything Josef Bacik
@ 2020-01-29 23:50 ` Josef Bacik
  2020-01-31 11:37   ` Nikolay Borisov
  2020-01-29 23:50 ` [PATCH 09/20] btrfs: use the btrfs_space_info_free_bytes_may_use helper for delalloc Josef Bacik
                   ` (11 subsequent siblings)
  19 siblings, 1 reply; 33+ messages in thread
From: Josef Bacik @ 2020-01-29 23:50 UTC (permalink / raw)
  To: linux-btrfs, kernel-team

With compression on we can end up using less space than we reserved, and
the freed-up portion of the reservation may be enough to satisfy a
waiting ticket.  Add the call for this scenario.
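The effect can be modeled in a few lines (toy accounting with hypothetical names, not the kernel's structures): ram_bytes were reserved up front, compression commits only num_bytes, and the slack is exactly what a pending ticket can be granted from.

```c
#include <assert.h>
#include <stdint.h>

struct info {
	uint64_t free;		/* unreserved bytes in the space_info */
	uint64_t ticket_bytes;	/* one pending ticket (0 = none) */
	int ticket_granted;
};

/* Sketch of btrfs_add_reserved_bytes() after the patch: the delta
 * between what was reserved (ram_bytes) and what compression actually
 * used (num_bytes) becomes free space, and tickets get a shot at it. */
static void add_reserved(struct info *si, uint64_t ram_bytes, uint64_t num_bytes)
{
	si->free += ram_bytes - num_bytes;
	/* mirrors the new btrfs_try_granting_tickets() call */
	if (num_bytes < ram_bytes && si->ticket_bytes &&
	    si->ticket_bytes <= si->free) {
		si->free -= si->ticket_bytes;
		si->ticket_granted = 1;
	}
}
```

Without the call, the slack would sit unused until some other event woke the ticket, stalling the waiter for no reason.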

Signed-off-by: Josef Bacik <josef@toxicpanda.com>
---
 fs/btrfs/block-group.c | 7 +++++++
 1 file changed, 7 insertions(+)

diff --git a/fs/btrfs/block-group.c b/fs/btrfs/block-group.c
index 616d0dd69394..aca4510f1f2d 100644
--- a/fs/btrfs/block-group.c
+++ b/fs/btrfs/block-group.c
@@ -2899,6 +2899,13 @@ int btrfs_add_reserved_bytes(struct btrfs_block_group *cache,
 						      space_info, -ram_bytes);
 		if (delalloc)
 			cache->delalloc_bytes += num_bytes;
+
+		/*
+		 * Compression can use less space than we reserved, so wake
+		 * tickets if that happens.
+		 */
+		if (num_bytes < ram_bytes)
+			btrfs_try_granting_tickets(cache->fs_info, space_info);
 	}
 	spin_unlock(&cache->lock);
 	spin_unlock(&space_info->lock);
-- 
2.24.1



* [PATCH 09/20] btrfs: use the btrfs_space_info_free_bytes_may_use helper for delalloc
  2020-01-29 23:50 [PATCH 00/20][RFC] Convert data reservations to the ticketing infrastructure Josef Bacik
                   ` (7 preceding siblings ...)
  2020-01-29 23:50 ` [PATCH 08/20] btrfs: call btrfs_try_granting_tickets when reserving space Josef Bacik
@ 2020-01-29 23:50 ` Josef Bacik
  2020-01-31 11:56   ` Nikolay Borisov
  2020-01-29 23:50 ` [PATCH 10/20] btrfs: add flushing states for handling data reservations Josef Bacik
                   ` (10 subsequent siblings)
  19 siblings, 1 reply; 33+ messages in thread
From: Josef Bacik @ 2020-01-29 23:50 UTC (permalink / raw)
  To: linux-btrfs, kernel-team

We are going to use the ticket infrastructure for data, so use the
btrfs_space_info_free_bytes_may_use() helper in
btrfs_free_reserved_data_space_noquota() so we get the
try_granting_tickets call when we free our reservation.

Signed-off-by: Josef Bacik <josef@toxicpanda.com>
---
 fs/btrfs/delalloc-space.c | 4 +---
 1 file changed, 1 insertion(+), 3 deletions(-)

diff --git a/fs/btrfs/delalloc-space.c b/fs/btrfs/delalloc-space.c
index 4cdac4d834f5..08cfef49f88b 100644
--- a/fs/btrfs/delalloc-space.c
+++ b/fs/btrfs/delalloc-space.c
@@ -179,9 +179,7 @@ void btrfs_free_reserved_data_space_noquota(struct inode *inode, u64 start,
 	start = round_down(start, fs_info->sectorsize);
 
 	data_sinfo = fs_info->data_sinfo;
-	spin_lock(&data_sinfo->lock);
-	btrfs_space_info_update_bytes_may_use(fs_info, data_sinfo, -len);
-	spin_unlock(&data_sinfo->lock);
+	btrfs_space_info_free_bytes_may_use(fs_info, data_sinfo, len);
 }
 
 /*
-- 
2.24.1



* [PATCH 10/20] btrfs: add flushing states for handling data reservations
  2020-01-29 23:50 [PATCH 00/20][RFC] Convert data reservations to the ticketing infrastructure Josef Bacik
                   ` (8 preceding siblings ...)
  2020-01-29 23:50 ` [PATCH 09/20] btrfs: use the btrfs_space_info_free_bytes_may_use helper for delalloc Josef Bacik
@ 2020-01-29 23:50 ` Josef Bacik
  2020-01-29 23:50 ` [PATCH 11/20] btrfs: add btrfs_reserve_data_bytes and use it Josef Bacik
                   ` (9 subsequent siblings)
  19 siblings, 0 replies; 33+ messages in thread
From: Josef Bacik @ 2020-01-29 23:50 UTC (permalink / raw)
  To: linux-btrfs, kernel-team

Currently the way we do data reservations is by seeing if we have enough
space in our space_info.  If we do not and we're a normal inode we'll

1) Attempt to force a chunk allocation.
2) If that fails we'll flush delalloc, then commit the transaction, then
   run the delayed iputs.

If we are a free space inode we're only allowed to force a chunk
allocation.  In order to use the normal flushing mechanism we need to
encode this into a flush state array for each flush type, so add the two
different flush types we'll need and add their respective states that
are valid for trying to make a data reservation.
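The state-array approach being encoded here can be sketched generically (hypothetical handler and stub, not the kernel's reclaim loop): each flush type is just an ordered table that the reclaim code walks until the reservation can be satisfied or the table runs out.

```c
#include <assert.h>
#include <stddef.h>

enum flush_state {
	ALLOC_CHUNK_FORCE,
	FLUSH_DELALLOC_WAIT,
	COMMIT_TRANS,
	RUN_DELAYED_IPUTS,
};

/* The two tables added by the patch, as data. */
static const enum flush_state data_flush_states[] = {
	ALLOC_CHUNK_FORCE, FLUSH_DELALLOC_WAIT, COMMIT_TRANS, RUN_DELAYED_IPUTS,
};
static const enum flush_state free_space_inode_flush_states[] = {
	ALLOC_CHUNK_FORCE,
};

/* Walk a table until a (stubbed) reservation succeeds; returns the
 * number of states tried, or -1 if the table is exhausted (ENOSPC). */
static int run_states(const enum flush_state *states, size_t nr,
		      int (*try_reserve)(enum flush_state))
{
	for (size_t i = 0; i < nr; i++)
		if (try_reserve(states[i]))
			return (int)(i + 1);
	return -1;
}

/* Stub: pretend only a transaction commit frees enough space. */
static int commit_frees_space(enum flush_state s)
{
	return s == COMMIT_TRANS;
}
```

Encoding the policy as tables is what lets data and the free space inode reuse one reclaim loop while keeping their different (and differently sized) flushing repertoires.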

Signed-off-by: Josef Bacik <josef@toxicpanda.com>
---
 fs/btrfs/ctree.h      |  2 ++
 fs/btrfs/space-info.c | 11 +++++++++++
 2 files changed, 13 insertions(+)

diff --git a/fs/btrfs/ctree.h b/fs/btrfs/ctree.h
index 6afa0885a9bb..865b24a1759e 100644
--- a/fs/btrfs/ctree.h
+++ b/fs/btrfs/ctree.h
@@ -2526,6 +2526,8 @@ enum btrfs_reserve_flush_enum {
 	 */
 	BTRFS_RESERVE_FLUSH_LIMIT,
 	BTRFS_RESERVE_FLUSH_EVICT,
+	BTRFS_RESERVE_FLUSH_DATA,
+	BTRFS_RESERVE_FLUSH_FREE_SPACE_INODE,
 	BTRFS_RESERVE_FLUSH_ALL,
 };
 
diff --git a/fs/btrfs/space-info.c b/fs/btrfs/space-info.c
index 5a92851af2b3..066471c0f47c 100644
--- a/fs/btrfs/space-info.c
+++ b/fs/btrfs/space-info.c
@@ -854,6 +854,17 @@ static const enum btrfs_flush_state evict_flush_states[] = {
 	COMMIT_TRANS,
 };
 
+static const enum btrfs_flush_state data_flush_states[] = {
+	ALLOC_CHUNK_FORCE,
+	FLUSH_DELALLOC_WAIT,
+	COMMIT_TRANS,
+	RUN_DELAYED_IPUTS,
+};
+
+static const enum btrfs_flush_state free_space_inode_flush_states[] = {
+	ALLOC_CHUNK_FORCE,
+};
+
 static void priority_reclaim_metadata_space(struct btrfs_fs_info *fs_info,
 				struct btrfs_space_info *space_info,
 				struct reserve_ticket *ticket,
-- 
2.24.1



* [PATCH 11/20] btrfs: add btrfs_reserve_data_bytes and use it
  2020-01-29 23:50 [PATCH 00/20][RFC] Convert data reservations to the ticketing infrastructure Josef Bacik
                   ` (9 preceding siblings ...)
  2020-01-29 23:50 ` [PATCH 10/20] btrfs: add flushing states for handling data reservations Josef Bacik
@ 2020-01-29 23:50 ` Josef Bacik
  2020-01-29 23:50 ` [PATCH 12/20] btrfs: use ticketing for data space reservations Josef Bacik
                   ` (8 subsequent siblings)
  19 siblings, 0 replies; 33+ messages in thread
From: Josef Bacik @ 2020-01-29 23:50 UTC (permalink / raw)
  To: linux-btrfs, kernel-team

Create a new function btrfs_reserve_data_bytes() in order to handle data
reservations.  This uses the new flush types and flush states to handle
making data reservations.

This patch specifically does not change any functionality, and is
purposefully not cleaned up in order to make bisection easier for the
future patches.  The new helper is identical to the old helper in how it
handles data reservations.  We first try to force a chunk allocation,
and then we run through the flush states all at once and in the same
order that they were done with the old helper.

Subsequent patches will clean this up and change the behavior of the
flushing, and it is important to keep those changes separate so we can
easily bisect down to the patch that caused the regression, rather than
the patch that made us start using the new infrastructure.

Signed-off-by: Josef Bacik <josef@toxicpanda.com>
---
 fs/btrfs/delalloc-space.c | 119 ++------------------------------------
 fs/btrfs/space-info.c     |  77 ++++++++++++++++++++++++
 fs/btrfs/space-info.h     |   2 +
 3 files changed, 83 insertions(+), 115 deletions(-)

diff --git a/fs/btrfs/delalloc-space.c b/fs/btrfs/delalloc-space.c
index 08cfef49f88b..c13d8609cc99 100644
--- a/fs/btrfs/delalloc-space.c
+++ b/fs/btrfs/delalloc-space.c
@@ -13,126 +13,15 @@ int btrfs_alloc_data_chunk_ondemand(struct btrfs_inode *inode, u64 bytes)
 {
 	struct btrfs_root *root = inode->root;
 	struct btrfs_fs_info *fs_info = root->fs_info;
-	struct btrfs_space_info *data_sinfo = fs_info->data_sinfo;
-	u64 used;
-	int ret = 0;
-	int need_commit = 2;
-	int have_pinned_space;
+	enum btrfs_reserve_flush_enum flush = BTRFS_RESERVE_FLUSH_DATA;
 
 	/* Make sure bytes are sectorsize aligned */
 	bytes = ALIGN(bytes, fs_info->sectorsize);
 
-	if (btrfs_is_free_space_inode(inode)) {
-		need_commit = 0;
-		ASSERT(current->journal_info);
-	}
-
-again:
-	/* Make sure we have enough space to handle the data first */
-	spin_lock(&data_sinfo->lock);
-	used = btrfs_space_info_used(data_sinfo, true);
-
-	if (used + bytes > data_sinfo->total_bytes) {
-		struct btrfs_trans_handle *trans;
-
-		/*
-		 * If we don't have enough free bytes in this space then we need
-		 * to alloc a new chunk.
-		 */
-		if (!data_sinfo->full) {
-			u64 alloc_target;
-
-			data_sinfo->force_alloc = CHUNK_ALLOC_FORCE;
-			spin_unlock(&data_sinfo->lock);
-
-			alloc_target = btrfs_data_alloc_profile(fs_info);
-			/*
-			 * It is ugly that we don't call nolock join
-			 * transaction for the free space inode case here.
-			 * But it is safe because we only do the data space
-			 * reservation for the free space cache in the
-			 * transaction context, the common join transaction
-			 * just increase the counter of the current transaction
-			 * handler, doesn't try to acquire the trans_lock of
-			 * the fs.
-			 */
-			trans = btrfs_join_transaction(root);
-			if (IS_ERR(trans))
-				return PTR_ERR(trans);
-
-			ret = btrfs_chunk_alloc(trans, alloc_target,
-						CHUNK_ALLOC_NO_FORCE);
-			btrfs_end_transaction(trans);
-			if (ret < 0) {
-				if (ret != -ENOSPC)
-					return ret;
-				else {
-					have_pinned_space = 1;
-					goto commit_trans;
-				}
-			}
-
-			goto again;
-		}
-
-		/*
-		 * If we don't have enough pinned space to deal with this
-		 * allocation, and no removed chunk in current transaction,
-		 * don't bother committing the transaction.
-		 */
-		have_pinned_space = __percpu_counter_compare(
-			&data_sinfo->total_bytes_pinned,
-			used + bytes - data_sinfo->total_bytes,
-			BTRFS_TOTAL_BYTES_PINNED_BATCH);
-		spin_unlock(&data_sinfo->lock);
-
-		/* Commit the current transaction and try again */
-commit_trans:
-		if (need_commit) {
-			need_commit--;
-
-			if (need_commit > 0) {
-				btrfs_start_delalloc_roots(fs_info, -1);
-				btrfs_wait_ordered_roots(fs_info, U64_MAX, 0,
-							 (u64)-1);
-			}
-
-			trans = btrfs_join_transaction(root);
-			if (IS_ERR(trans))
-				return PTR_ERR(trans);
-			if (have_pinned_space >= 0 ||
-			    test_bit(BTRFS_TRANS_HAVE_FREE_BGS,
-				     &trans->transaction->flags) ||
-			    need_commit > 0) {
-				ret = btrfs_commit_transaction(trans);
-				if (ret)
-					return ret;
-				/*
-				 * The cleaner kthread might still be doing iput
-				 * operations. Wait for it to finish so that
-				 * more space is released.  We don't need to
-				 * explicitly run the delayed iputs here because
-				 * the commit_transaction would have woken up
-				 * the cleaner.
-				 */
-				ret = btrfs_wait_on_delayed_iputs(fs_info);
-				if (ret)
-					return ret;
-				goto again;
-			} else {
-				btrfs_end_transaction(trans);
-			}
-		}
-
-		trace_btrfs_space_reservation(fs_info,
-					      "space_info:enospc",
-					      data_sinfo->flags, bytes, 1);
-		return -ENOSPC;
-	}
-	btrfs_space_info_update_bytes_may_use(fs_info, data_sinfo, bytes);
-	spin_unlock(&data_sinfo->lock);
+	if (btrfs_is_free_space_inode(inode))
+		flush = BTRFS_RESERVE_FLUSH_FREE_SPACE_INODE;
 
-	return 0;
+	return btrfs_reserve_data_bytes(fs_info, bytes, flush);
 }
 
 int btrfs_check_data_free_space(struct inode *inode,
diff --git a/fs/btrfs/space-info.c b/fs/btrfs/space-info.c
index 066471c0f47c..371af6d89259 100644
--- a/fs/btrfs/space-info.c
+++ b/fs/btrfs/space-info.c
@@ -1125,3 +1125,80 @@ int btrfs_reserve_metadata_bytes(struct btrfs_root *root,
 	}
 	return ret;
 }
+
+/**
+ * btrfs_reserve_data_bytes - try to reserve data bytes for an allocation
+ * @fs_info - the filesystem we are allocating for
+ * @bytes - the number of bytes we need
+ * @flush - how we are allowed to flush
+ *
+ * This will reserve bytes from the data space info.  If there is not enough
+ * space then we will attempt to flush space as specified by @flush.
+ */
+int btrfs_reserve_data_bytes(struct btrfs_fs_info *fs_info, u64 bytes,
+			     enum btrfs_reserve_flush_enum flush)
+{
+	struct btrfs_space_info *data_sinfo = fs_info->data_sinfo;
+	const enum btrfs_flush_state *states;
+	u64 used;
+	int states_nr;
+	int commit_cycles = 2;
+	int ret = -ENOSPC;
+
+	ASSERT(!current->journal_info || flush != BTRFS_RESERVE_FLUSH_DATA);
+
+	if (flush == BTRFS_RESERVE_FLUSH_DATA) {
+		states = data_flush_states;
+		states_nr = ARRAY_SIZE(data_flush_states);
+	} else {
+		states = free_space_inode_flush_states;
+		states_nr = ARRAY_SIZE(free_space_inode_flush_states);
+		commit_cycles = 0;
+	}
+
+	spin_lock(&data_sinfo->lock);
+again:
+	used = btrfs_space_info_used(data_sinfo, true);
+
+	if (used + bytes > data_sinfo->total_bytes) {
+		u64 prev_total_bytes = data_sinfo->total_bytes;
+		int flush_state = 1;
+
+		spin_unlock(&data_sinfo->lock);
+
+		/*
+		 * Everybody can force chunk allocation, so try this first to
+		 * see if we can just bail here and make our reservation.
+		 */
+		flush_space(fs_info, data_sinfo, bytes, ALLOC_CHUNK_FORCE);
+		spin_lock(&data_sinfo->lock);
+		if (prev_total_bytes < data_sinfo->total_bytes)
+			goto again;
+		spin_unlock(&data_sinfo->lock);
+
+		if (!commit_cycles)
+			goto out;
+
+		/*
+		 * Cycle through the rest of the flushing options for our flush
+		 * type, then try again.
+		 */
+		while (flush_state < states_nr) {
+			flush_space(fs_info, data_sinfo, (u64)-1,
+				    states[flush_state]);
+			flush_state++;
+		}
+		commit_cycles--;
+		spin_lock(&data_sinfo->lock);
+		goto again;
+	}
+	btrfs_space_info_update_bytes_may_use(fs_info, data_sinfo, bytes);
+	ret = 0;
+	spin_unlock(&data_sinfo->lock);
+out:
+	if (ret)
+		trace_btrfs_space_reservation(fs_info,
+					      "space_info:enospc",
+					      data_sinfo->flags, bytes, 1);
+	return ret;
+}
diff --git a/fs/btrfs/space-info.h b/fs/btrfs/space-info.h
index 24514cd2c6c1..179f757c4a6b 100644
--- a/fs/btrfs/space-info.h
+++ b/fs/btrfs/space-info.h
@@ -141,5 +141,7 @@ static inline void btrfs_space_info_free_bytes_may_use(
 	btrfs_try_granting_tickets(fs_info, space_info);
 	spin_unlock(&space_info->lock);
 }
+int btrfs_reserve_data_bytes(struct btrfs_fs_info *fs_info, u64 bytes,
+			     enum btrfs_reserve_flush_enum flush);
 
 #endif /* BTRFS_SPACE_INFO_H */
-- 
2.24.1


^ permalink raw reply related	[flat|nested] 33+ messages in thread

* [PATCH 12/20] btrfs: use ticketing for data space reservations
@ 2020-01-29 23:50 ` Josef Bacik
From: Josef Bacik @ 2020-01-29 23:50 UTC (permalink / raw)
  To: linux-btrfs, kernel-team

Now that we have all the infrastructure in place, use the ticketing
infrastructure to make data allocations.  This still essentially behaves
the same way as before, with the exception that we check after each
flush operation to see if our reservation has been satisfied.

Signed-off-by: Josef Bacik <josef@toxicpanda.com>
---
 fs/btrfs/space-info.c | 90 +++++++++++++++++++++----------------------
 1 file changed, 43 insertions(+), 47 deletions(-)

diff --git a/fs/btrfs/space-info.c b/fs/btrfs/space-info.c
index 371af6d89259..b4c43af7b499 100644
--- a/fs/btrfs/space-info.c
+++ b/fs/btrfs/space-info.c
@@ -865,6 +865,27 @@ static const enum btrfs_flush_state free_space_inode_flush_states[] = {
 	ALLOC_CHUNK_FORCE,
 };
 
+
+static void priority_reclaim_space(struct btrfs_fs_info *fs_info,
+				   struct btrfs_space_info *space_info,
+				   struct reserve_ticket *ticket,
+				   u64 to_reclaim,
+				   const enum btrfs_flush_state *states,
+				   int states_nr)
+{
+	int flush_state = 0;
+	do {
+		flush_space(fs_info, space_info, to_reclaim, states[flush_state]);
+		flush_state++;
+		spin_lock(&space_info->lock);
+		if (ticket->bytes == 0) {
+			spin_unlock(&space_info->lock);
+			return;
+		}
+		spin_unlock(&space_info->lock);
+	} while (flush_state < states_nr);
+}
+
 static void priority_reclaim_metadata_space(struct btrfs_fs_info *fs_info,
 				struct btrfs_space_info *space_info,
 				struct reserve_ticket *ticket,
@@ -872,7 +893,6 @@ static void priority_reclaim_metadata_space(struct btrfs_fs_info *fs_info,
 				int states_nr)
 {
 	u64 to_reclaim;
-	int flush_state;
 
 	spin_lock(&space_info->lock);
 	to_reclaim = btrfs_calc_reclaim_metadata_size(fs_info, space_info);
@@ -881,18 +901,8 @@ static void priority_reclaim_metadata_space(struct btrfs_fs_info *fs_info,
 		return;
 	}
 	spin_unlock(&space_info->lock);
-
-	flush_state = 0;
-	do {
-		flush_space(fs_info, space_info, to_reclaim, states[flush_state]);
-		flush_state++;
-		spin_lock(&space_info->lock);
-		if (ticket->bytes == 0) {
-			spin_unlock(&space_info->lock);
-			return;
-		}
-		spin_unlock(&space_info->lock);
-	} while (flush_state < states_nr);
+	priority_reclaim_space(fs_info, space_info, ticket, to_reclaim,
+			       states, states_nr);
 }
 
 static void wait_reserve_ticket(struct btrfs_fs_info *fs_info,
@@ -960,6 +970,16 @@ static int handle_reserve_ticket(struct btrfs_fs_info *fs_info,
 						evict_flush_states,
 						ARRAY_SIZE(evict_flush_states));
 		break;
+	case BTRFS_RESERVE_FLUSH_DATA:
+		priority_reclaim_space(fs_info, space_info, ticket, U64_MAX,
+				       data_flush_states,
+				       ARRAY_SIZE(data_flush_states));
+		break;
+	case BTRFS_RESERVE_FLUSH_FREE_SPACE_INODE:
+		priority_reclaim_space(fs_info, space_info, ticket, U64_MAX,
+				free_space_inode_flush_states,
+				ARRAY_SIZE(free_space_inode_flush_states));
+		break;
 	default:
 		ASSERT(0);
 		break;
@@ -1139,57 +1159,33 @@ int btrfs_reserve_data_bytes(struct btrfs_fs_info *fs_info, u64 bytes,
 			     enum btrfs_reserve_flush_enum flush)
 {
 	struct btrfs_space_info *data_sinfo = fs_info->data_sinfo;
-	const enum btrfs_flush_state *states;
 	u64 used;
-	int states_nr;
 	int commit_cycles = 2;
 	int ret = -ENOSPC;
 
 	ASSERT(!current->journal_info || flush != BTRFS_RESERVE_FLUSH_DATA);
 
-	if (flush == BTRFS_RESERVE_FLUSH_DATA) {
-		states = data_flush_states;
-		states_nr = ARRAY_SIZE(data_flush_states);
-	} else {
-		states = free_space_inode_flush_states;
-		states_nr = ARRAY_SIZE(free_space_inode_flush_states);
+	if (flush == BTRFS_RESERVE_FLUSH_FREE_SPACE_INODE)
 		commit_cycles = 0;
-	}
 
-	spin_lock(&data_sinfo->lock);
 again:
+	spin_lock(&data_sinfo->lock);
 	used = btrfs_space_info_used(data_sinfo, true);
 
 	if (used + bytes > data_sinfo->total_bytes) {
-		u64 prev_total_bytes = data_sinfo->total_bytes;
-		int flush_state = 1;
+		struct reserve_ticket ticket;
 
+		init_waitqueue_head(&ticket.wait);
+		ticket.bytes = bytes;
+		ticket.error = 0;
+		list_add_tail(&ticket.list, &data_sinfo->priority_tickets);
 		spin_unlock(&data_sinfo->lock);
 
-		/*
-		 * Everybody can force chunk allocation, so try this first to
-		 * see if we can just bail here and make our reservation.
-		 */
-		flush_space(fs_info, data_sinfo, bytes, ALLOC_CHUNK_FORCE);
-		spin_lock(&data_sinfo->lock);
-		if (prev_total_bytes < data_sinfo->total_bytes)
-			goto again;
-		spin_unlock(&data_sinfo->lock);
-
-		if (!commit_cycles)
+		ret = handle_reserve_ticket(fs_info, data_sinfo, &ticket,
+					    flush);
+		if (!ret || !commit_cycles)
 			goto out;
-
-		/*
-		 * Cycle through the rest of the flushing options for our flush
-		 * type, then try again.
-		 */
-		while (flush_state < states_nr) {
-			flush_space(fs_info, data_sinfo, (u64)-1,
-				    states[flush_state]);
-			flush_state++;
-		}
 		commit_cycles--;
-		spin_lock(&data_sinfo->lock);
 		goto again;
 	}
 	btrfs_space_info_update_bytes_may_use(fs_info, data_sinfo, bytes);
-- 
2.24.1



* [PATCH 13/20] btrfs: run delayed iputs before committing the transaction for data
@ 2020-01-29 23:50 ` Josef Bacik
From: Josef Bacik @ 2020-01-29 23:50 UTC (permalink / raw)
  To: linux-btrfs, kernel-team

Previously we were waiting on delayed iputs after we committed the
transaction, but this doesn't really make much sense.  Running the delayed
iputs adds pinned space, which committing the transaction can then reclaim,
so the iputs need to run first for the commit to be able to free that
space.  Fix this by running the delayed iputs before committing the
transaction.

Signed-off-by: Josef Bacik <josef@toxicpanda.com>
---
 fs/btrfs/space-info.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/fs/btrfs/space-info.c b/fs/btrfs/space-info.c
index b4c43af7b499..03e8c45365ea 100644
--- a/fs/btrfs/space-info.c
+++ b/fs/btrfs/space-info.c
@@ -857,8 +857,8 @@ static const enum btrfs_flush_state evict_flush_states[] = {
 static const enum btrfs_flush_state data_flush_states[] = {
 	ALLOC_CHUNK_FORCE,
 	FLUSH_DELALLOC_WAIT,
-	COMMIT_TRANS,
 	RUN_DELAYED_IPUTS,
+	COMMIT_TRANS,
 };
 
 static const enum btrfs_flush_state free_space_inode_flush_states[] = {
-- 
2.24.1



* [PATCH 14/20] btrfs: flush delayed refs when trying to reserve data space
@ 2020-01-29 23:50 ` Josef Bacik
From: Josef Bacik @ 2020-01-29 23:50 UTC (permalink / raw)
  To: linux-btrfs, kernel-team

We can end up with freed extents sitting in the delayed refs, in which
case may_commit_transaction() may not think we have enough pinned space
to commit the transaction and we'll ENOSPC early.  Handle this by
running the delayed refs so that the pinned counter is up to date before
we try to commit the transaction.

Signed-off-by: Josef Bacik <josef@toxicpanda.com>
---
 fs/btrfs/space-info.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/fs/btrfs/space-info.c b/fs/btrfs/space-info.c
index 03e8c45365ea..520c91430f90 100644
--- a/fs/btrfs/space-info.c
+++ b/fs/btrfs/space-info.c
@@ -858,6 +858,7 @@ static const enum btrfs_flush_state data_flush_states[] = {
 	ALLOC_CHUNK_FORCE,
 	FLUSH_DELALLOC_WAIT,
 	RUN_DELAYED_IPUTS,
+	FLUSH_DELAYED_REFS,
 	COMMIT_TRANS,
 };
 
-- 
2.24.1



* [PATCH 15/20] btrfs: serialize data reservations if we are flushing
@ 2020-01-29 23:50 ` Josef Bacik
From: Josef Bacik @ 2020-01-29 23:50 UTC (permalink / raw)
  To: linux-btrfs, kernel-team

Nikolay reported a problem where generic/371 would fail sometimes with a
slow drive.  The gist of the test is that we fallocate a file in
parallel with a pwrite of a different file.  These two files combined
are smaller than the file system, but sometimes the pwrite would ENOSPC.

A fair bit of investigation uncovered the fact that the fallocate
workload was racing in and grabbing the free space that the pwrite
workload was trying to free up so it could make its own reservation.
After a few loops of this eventually the pwrite workload would error out
with an ENOSPC.

We've had the same problem with metadata, and there we serialized all
metadata allocations to address it.  This wasn't usually a problem with
data because data reservations are more straightforward, but it
obviously could still happen.

Fix this by not allowing reservations to occur if there are any pending
tickets waiting to be satisfied on the space info.

Signed-off-by: Josef Bacik <josef@toxicpanda.com>
---
 fs/btrfs/space-info.c | 6 +++++-
 1 file changed, 5 insertions(+), 1 deletion(-)

diff --git a/fs/btrfs/space-info.c b/fs/btrfs/space-info.c
index 520c91430f90..3668c2d4fadf 100644
--- a/fs/btrfs/space-info.c
+++ b/fs/btrfs/space-info.c
@@ -1163,6 +1163,7 @@ int btrfs_reserve_data_bytes(struct btrfs_fs_info *fs_info, u64 bytes,
 	u64 used;
 	int commit_cycles = 2;
 	int ret = -ENOSPC;
+	bool pending_tickets;
 
 	ASSERT(!current->journal_info || flush != BTRFS_RESERVE_FLUSH_DATA);
 
@@ -1172,8 +1173,11 @@ int btrfs_reserve_data_bytes(struct btrfs_fs_info *fs_info, u64 bytes,
 again:
 	spin_lock(&data_sinfo->lock);
 	used = btrfs_space_info_used(data_sinfo, true);
+	pending_tickets = !list_empty(&data_sinfo->tickets) ||
+		!list_empty(&data_sinfo->priority_tickets);
 
-	if (used + bytes > data_sinfo->total_bytes) {
+	if (pending_tickets ||
+	    used + bytes > data_sinfo->total_bytes) {
 		struct reserve_ticket ticket;
 
 		init_waitqueue_head(&ticket.wait);
-- 
2.24.1



* [PATCH 16/20] btrfs: rework chunk allocate for data reservations
@ 2020-01-29 23:50 ` Josef Bacik
From: Josef Bacik @ 2020-01-29 23:50 UTC (permalink / raw)
  To: linux-btrfs, kernel-team

The chunk allocator only allocates min(10% of the fs, 1GiB) at a time,
but if our data reservation is larger than that we
need to allocate more chunks.  Historically this just magically happened
because we looped through the flushing logic a few times for data
reservations, so we were getting lucky.

Fix this by looping trying to allocate enough space to make our
reservation until it's satisfied or until we cannot allocate chunks
anymore.

Since this may result in too many chunk allocations we'll do our normal
flushing series once, and then fall back on allocating data chunks to
satisfy our reservation.

We will keep our initial one-shot chunk allocation in the normal case
for performance reasons, there's no sense in flushing all the things if
we can just allocate a single chunk and carry on.

Signed-off-by: Josef Bacik <josef@toxicpanda.com>
---
 fs/btrfs/space-info.c | 64 ++++++++++++++++++++++++++++++-------------
 1 file changed, 45 insertions(+), 19 deletions(-)

diff --git a/fs/btrfs/space-info.c b/fs/btrfs/space-info.c
index 3668c2d4fadf..7a401f1c3724 100644
--- a/fs/btrfs/space-info.c
+++ b/fs/btrfs/space-info.c
@@ -862,17 +862,12 @@ static const enum btrfs_flush_state data_flush_states[] = {
 	COMMIT_TRANS,
 };
 
-static const enum btrfs_flush_state free_space_inode_flush_states[] = {
-	ALLOC_CHUNK_FORCE,
-};
-
-
-static void priority_reclaim_space(struct btrfs_fs_info *fs_info,
-				   struct btrfs_space_info *space_info,
-				   struct reserve_ticket *ticket,
-				   u64 to_reclaim,
-				   const enum btrfs_flush_state *states,
-				   int states_nr)
+static int priority_reclaim_space(struct btrfs_fs_info *fs_info,
+				  struct btrfs_space_info *space_info,
+				  struct reserve_ticket *ticket,
+				  u64 to_reclaim,
+				  const enum btrfs_flush_state *states,
+				  int states_nr)
 {
 	int flush_state = 0;
 	do {
@@ -881,10 +876,11 @@ static void priority_reclaim_space(struct btrfs_fs_info *fs_info,
 		spin_lock(&space_info->lock);
 		if (ticket->bytes == 0) {
 			spin_unlock(&space_info->lock);
-			return;
+			return 0;
 		}
 		spin_unlock(&space_info->lock);
 	} while (flush_state < states_nr);
+	return -ENOSPC;
 }
 
 static void priority_reclaim_metadata_space(struct btrfs_fs_info *fs_info,
@@ -906,6 +902,41 @@ static void priority_reclaim_metadata_space(struct btrfs_fs_info *fs_info,
 			       states, states_nr);
 }
 
+static void priority_reclaim_data_space(struct btrfs_fs_info *fs_info,
+					struct btrfs_space_info *space_info,
+					struct reserve_ticket *ticket,
+					enum btrfs_reserve_flush_enum flush)
+{
+	int ret;
+
+	/*
+	 * First see if we can reclaim pinned space before forcing a bunch of
+	 * chunk allocations.
+	 */
+	if (flush == BTRFS_RESERVE_FLUSH_DATA) {
+		ret = priority_reclaim_space(fs_info, space_info, ticket,
+					     U64_MAX, data_flush_states,
+					     ARRAY_SIZE(data_flush_states));
+		if (!ret)
+			return;
+	}
+
+	/*
+	 * The chunk allocator will only allocate min(10% of the fs, 1gib) at a
+	 * time, but our reservation may be much larger than that threshold, so
+	 * loop allocating chunks until we're able to satisfy our allocation.
+	 */
+	while (1) {
+		flush_space(fs_info, space_info, U64_MAX, ALLOC_CHUNK_FORCE);
+		spin_lock(&space_info->lock);
+		if (ticket->bytes == 0 || space_info->full) {
+			spin_unlock(&space_info->lock);
+			return;
+		}
+		spin_unlock(&space_info->lock);
+	}
+}
+
 static void wait_reserve_ticket(struct btrfs_fs_info *fs_info,
 				struct btrfs_space_info *space_info,
 				struct reserve_ticket *ticket)
@@ -972,14 +1003,9 @@ static int handle_reserve_ticket(struct btrfs_fs_info *fs_info,
 						ARRAY_SIZE(evict_flush_states));
 		break;
 	case BTRFS_RESERVE_FLUSH_DATA:
-		priority_reclaim_space(fs_info, space_info, ticket, U64_MAX,
-				       data_flush_states,
-				       ARRAY_SIZE(data_flush_states));
-		break;
 	case BTRFS_RESERVE_FLUSH_FREE_SPACE_INODE:
-		priority_reclaim_space(fs_info, space_info, ticket, U64_MAX,
-				free_space_inode_flush_states,
-				ARRAY_SIZE(free_space_inode_flush_states));
+		priority_reclaim_data_space(fs_info, space_info, ticket,
+					    flush);
 		break;
 	default:
 		ASSERT(0);
-- 
2.24.1



* [PATCH 17/20] btrfs: drop the commit_cycles stuff for data reservations
@ 2020-01-29 23:50 ` Josef Bacik
From: Josef Bacik @ 2020-01-29 23:50 UTC (permalink / raw)
  To: linux-btrfs, kernel-team

This was an old wart left over from how we previously did data
reservations.  Previously, other tasks could race in and take a
reservation while we were flushing space, so we needed to make sure we
looped a few times before giving up.  Now that we're using the ticketing
infrastructure we don't have to worry about this and can drop the logic
altogether.

Signed-off-by: Josef Bacik <josef@toxicpanda.com>
---
 fs/btrfs/space-info.c | 16 ++++------------
 1 file changed, 4 insertions(+), 12 deletions(-)

diff --git a/fs/btrfs/space-info.c b/fs/btrfs/space-info.c
index 7a401f1c3724..a456139c698d 100644
--- a/fs/btrfs/space-info.c
+++ b/fs/btrfs/space-info.c
@@ -1187,16 +1187,11 @@ int btrfs_reserve_data_bytes(struct btrfs_fs_info *fs_info, u64 bytes,
 {
 	struct btrfs_space_info *data_sinfo = fs_info->data_sinfo;
 	u64 used;
-	int commit_cycles = 2;
 	int ret = -ENOSPC;
 	bool pending_tickets;
 
 	ASSERT(!current->journal_info || flush != BTRFS_RESERVE_FLUSH_DATA);
 
-	if (flush == BTRFS_RESERVE_FLUSH_FREE_SPACE_INODE)
-		commit_cycles = 0;
-
-again:
 	spin_lock(&data_sinfo->lock);
 	used = btrfs_space_info_used(data_sinfo, true);
 	pending_tickets = !list_empty(&data_sinfo->tickets) ||
@@ -1214,15 +1209,12 @@ int btrfs_reserve_data_bytes(struct btrfs_fs_info *fs_info, u64 bytes,
 
 		ret = handle_reserve_ticket(fs_info, data_sinfo, &ticket,
 					    flush);
-		if (!ret || !commit_cycles)
-			goto out;
-		commit_cycles--;
-		goto again;
+	} else {
+		btrfs_space_info_update_bytes_may_use(fs_info, data_sinfo,
+						      bytes);
+		ret = 0;
 	}
-	btrfs_space_info_update_bytes_may_use(fs_info, data_sinfo, bytes);
-	ret = 0;
 	spin_unlock(&data_sinfo->lock);
-out:
 	if (ret)
 		trace_btrfs_space_reservation(fs_info,
 					      "space_info:enospc",
-- 
2.24.1



* [PATCH 18/20] btrfs: use the same helper for data and metadata reservations
@ 2020-01-29 23:50 ` Josef Bacik
From: Josef Bacik @ 2020-01-29 23:50 UTC (permalink / raw)
  To: linux-btrfs, kernel-team

Now that data reservations follow the same pattern as metadata
reservations we can simply rename __reserve_metadata_bytes to
__reserve_bytes and use that helper for data reservations.

Signed-off-by: Josef Bacik <josef@toxicpanda.com>
---
 fs/btrfs/space-info.c | 49 +++++++++++++------------------------------
 1 file changed, 14 insertions(+), 35 deletions(-)

diff --git a/fs/btrfs/space-info.c b/fs/btrfs/space-info.c
index a456139c698d..bbdf20253180 100644
--- a/fs/btrfs/space-info.c
+++ b/fs/btrfs/space-info.c
@@ -1050,10 +1050,9 @@ static int handle_reserve_ticket(struct btrfs_fs_info *fs_info,
  * regain reservations will be made and this will fail if there is not enough
  * space already.
  */
-static int __reserve_metadata_bytes(struct btrfs_fs_info *fs_info,
-				    struct btrfs_space_info *space_info,
-				    u64 orig_bytes,
-				    enum btrfs_reserve_flush_enum flush)
+static int __reserve_bytes(struct btrfs_fs_info *fs_info,
+			   struct btrfs_space_info *space_info, u64 orig_bytes,
+			   enum btrfs_reserve_flush_enum flush)
 {
 	struct reserve_ticket ticket;
 	u64 used;
@@ -1153,8 +1152,8 @@ int btrfs_reserve_metadata_bytes(struct btrfs_root *root,
 	struct btrfs_block_rsv *global_rsv = &fs_info->global_block_rsv;
 	int ret;
 
-	ret = __reserve_metadata_bytes(fs_info, block_rsv->space_info,
-				       orig_bytes, flush);
+	ret = __reserve_bytes(fs_info, block_rsv->space_info, orig_bytes,
+			      flush);
 	if (ret == -ENOSPC &&
 	    unlikely(root->orphan_cleanup_state == ORPHAN_CLEANUP_STARTED)) {
 		if (block_rsv != global_rsv &&
@@ -1186,38 +1185,18 @@ int btrfs_reserve_data_bytes(struct btrfs_fs_info *fs_info, u64 bytes,
 			     enum btrfs_reserve_flush_enum flush)
 {
 	struct btrfs_space_info *data_sinfo = fs_info->data_sinfo;
-	u64 used;
-	int ret = -ENOSPC;
-	bool pending_tickets;
+	int ret;
 
+	ASSERT(flush == BTRFS_RESERVE_FLUSH_DATA ||
+	       flush == BTRFS_RESERVE_FLUSH_FREE_SPACE_INODE);
 	ASSERT(!current->journal_info || flush != BTRFS_RESERVE_FLUSH_DATA);
 
-	spin_lock(&data_sinfo->lock);
-	used = btrfs_space_info_used(data_sinfo, true);
-	pending_tickets = !list_empty(&data_sinfo->tickets) ||
-		!list_empty(&data_sinfo->priority_tickets);
-
-	if (pending_tickets ||
-	    used + bytes > data_sinfo->total_bytes) {
-		struct reserve_ticket ticket;
-
-		init_waitqueue_head(&ticket.wait);
-		ticket.bytes = bytes;
-		ticket.error = 0;
-		list_add_tail(&ticket.list, &data_sinfo->priority_tickets);
-		spin_unlock(&data_sinfo->lock);
-
-		ret = handle_reserve_ticket(fs_info, data_sinfo, &ticket,
-					    flush);
-	} else {
-		btrfs_space_info_update_bytes_may_use(fs_info, data_sinfo,
-						      bytes);
-		ret = 0;
-	}
-	spin_unlock(&data_sinfo->lock);
-	if (ret)
-		trace_btrfs_space_reservation(fs_info,
-					      "space_info:enospc",
+	ret = __reserve_bytes(fs_info, data_sinfo, bytes, flush);
+	if (ret == -ENOSPC) {
+		trace_btrfs_space_reservation(fs_info, "space_info:enospc",
 					      data_sinfo->flags, bytes, 1);
+		if (btrfs_test_opt(fs_info, ENOSPC_DEBUG))
+			btrfs_dump_space_info(fs_info, data_sinfo, bytes, 0);
+	}
 	return ret;
 }
-- 
2.24.1



* [PATCH 19/20] btrfs: do async reclaim for data reservations
@ 2020-01-29 23:50 ` Josef Bacik
From: Josef Bacik @ 2020-01-29 23:50 UTC (permalink / raw)
  To: linux-btrfs, kernel-team

Now that we have the data ticketing stuff in place, move normal data
reservations to use an async reclaim helper to satisfy tickets.
Previously, multiple tasks could race in and each allocate chunks, resulting
in more data chunks than we would necessarily need.  Serializing these
allocations and making a single thread responsible for flushing will
only allocate chunks as needed, as well as cut down on transaction
commits and other flush related activities.

Priority reservations will still work as they have before, simply
trying to allocate a chunk until they can make their reservation.

Signed-off-by: Josef Bacik <josef@toxicpanda.com>
---
 fs/btrfs/ctree.h      |   3 +-
 fs/btrfs/disk-io.c    |   2 +-
 fs/btrfs/space-info.c | 109 +++++++++++++++++++++++++++++-------------
 3 files changed, 80 insertions(+), 34 deletions(-)

diff --git a/fs/btrfs/ctree.h b/fs/btrfs/ctree.h
index 865b24a1759e..709823a23c62 100644
--- a/fs/btrfs/ctree.h
+++ b/fs/btrfs/ctree.h
@@ -493,7 +493,7 @@ enum btrfs_orphan_cleanup_state {
 	ORPHAN_CLEANUP_DONE	= 2,
 };
 
-void btrfs_init_async_reclaim_work(struct work_struct *work);
+void btrfs_init_async_reclaim_work(struct btrfs_fs_info *fs_info);
 
 /* fs_info */
 struct reloc_control;
@@ -917,6 +917,7 @@ struct btrfs_fs_info {
 
 	/* Used to reclaim the metadata space in the background. */
 	struct work_struct async_reclaim_work;
+	struct work_struct async_data_reclaim_work;
 
 	spinlock_t unused_bgs_lock;
 	struct list_head unused_bgs;
diff --git a/fs/btrfs/disk-io.c b/fs/btrfs/disk-io.c
index 56d0a24aec74..825242f5c3f7 100644
--- a/fs/btrfs/disk-io.c
+++ b/fs/btrfs/disk-io.c
@@ -2753,7 +2753,7 @@ void btrfs_init_fs_info(struct btrfs_fs_info *fs_info)
 	fs_info->check_integrity_print_mask = 0;
 #endif
 	btrfs_init_balance(fs_info);
-	btrfs_init_async_reclaim_work(&fs_info->async_reclaim_work);
+	btrfs_init_async_reclaim_work(fs_info);
 
 	spin_lock_init(&fs_info->block_group_cache_lock);
 	fs_info->block_group_cache_tree = RB_ROOT;
diff --git a/fs/btrfs/space-info.c b/fs/btrfs/space-info.c
index bbdf20253180..b47eca433c62 100644
--- a/fs/btrfs/space-info.c
+++ b/fs/btrfs/space-info.c
@@ -832,9 +832,72 @@ static void btrfs_async_reclaim_metadata_space(struct work_struct *work)
 	} while (flush_state <= COMMIT_TRANS);
 }
 
-void btrfs_init_async_reclaim_work(struct work_struct *work)
+static const enum btrfs_flush_state data_flush_states[] = {
+	ALLOC_CHUNK_FORCE,
+	FLUSH_DELALLOC_WAIT,
+	RUN_DELAYED_IPUTS,
+	FLUSH_DELAYED_REFS,
+	COMMIT_TRANS,
+};
+
+static void btrfs_async_reclaim_data_space(struct work_struct *work)
+{
+	struct btrfs_fs_info *fs_info;
+	struct btrfs_space_info *space_info;
+	u64 last_tickets_id;
+	int flush_state = 0;
+
+	fs_info = container_of(work, struct btrfs_fs_info,
+			       async_data_reclaim_work);
+	space_info = fs_info->data_sinfo;
+
+	spin_lock(&space_info->lock);
+	if (list_empty(&space_info->tickets)) {
+		space_info->flush = 0;
+		spin_unlock(&space_info->lock);
+		return;
+	}
+	last_tickets_id = space_info->tickets_id;
+	spin_unlock(&space_info->lock);
+
+	while (flush_state < ARRAY_SIZE(data_flush_states)) {
+		flush_space(fs_info, space_info, U64_MAX,
+			    data_flush_states[flush_state]);
+		spin_lock(&space_info->lock);
+		if (list_empty(&space_info->tickets)) {
+			space_info->flush = 0;
+			spin_unlock(&space_info->lock);
+			return;
+		}
+
+		if (last_tickets_id == space_info->tickets_id) {
+			flush_state++;
+		} else {
+			last_tickets_id = space_info->tickets_id;
+			flush_state = 0;
+		}
+
+		if (flush_state >= ARRAY_SIZE(data_flush_states)) {
+			if (space_info->full) {
+				if (maybe_fail_all_tickets(fs_info,
+							   space_info))
+					flush_state = 0;
+				else
+					space_info->flush = 0;
+			} else {
+				flush_state = 0;
+			}
+		}
+		spin_unlock(&space_info->lock);
+	}
+}
+
+void btrfs_init_async_reclaim_work(struct btrfs_fs_info *fs_info)
 {
-	INIT_WORK(work, btrfs_async_reclaim_metadata_space);
+	INIT_WORK(&fs_info->async_reclaim_work,
+		  btrfs_async_reclaim_metadata_space);
+	INIT_WORK(&fs_info->async_data_reclaim_work,
+		  btrfs_async_reclaim_data_space);
 }
 
 static const enum btrfs_flush_state priority_flush_states[] = {
@@ -854,14 +917,6 @@ static const enum btrfs_flush_state evict_flush_states[] = {
 	COMMIT_TRANS,
 };
 
-static const enum btrfs_flush_state data_flush_states[] = {
-	ALLOC_CHUNK_FORCE,
-	FLUSH_DELALLOC_WAIT,
-	RUN_DELAYED_IPUTS,
-	FLUSH_DELAYED_REFS,
-	COMMIT_TRANS,
-};
-
 static int priority_reclaim_space(struct btrfs_fs_info *fs_info,
 				  struct btrfs_space_info *space_info,
 				  struct reserve_ticket *ticket,
@@ -904,23 +959,8 @@ static void priority_reclaim_metadata_space(struct btrfs_fs_info *fs_info,
 
 static void priority_reclaim_data_space(struct btrfs_fs_info *fs_info,
 					struct btrfs_space_info *space_info,
-					struct reserve_ticket *ticket,
-					enum btrfs_reserve_flush_enum flush)
+					struct reserve_ticket *ticket)
 {
-	int ret;
-
-	/*
-	 * First see if we can reclaim pinned space before forcing a bunch of
-	 * chunk allocations.
-	 */
-	if (flush == BTRFS_RESERVE_FLUSH_DATA) {
-		ret = priority_reclaim_space(fs_info, space_info, ticket,
-					     U64_MAX, data_flush_states,
-					     ARRAY_SIZE(data_flush_states));
-		if (!ret)
-			return;
-	}
-
 	/*
 	 * The chunk allocator will only allocate min(10% of the fs, 1gib) at a
 	 * time, but our reservation may be much larger than that threshold, so
@@ -989,6 +1029,7 @@ static int handle_reserve_ticket(struct btrfs_fs_info *fs_info,
 	int ret;
 
 	switch (flush) {
+	case BTRFS_RESERVE_FLUSH_DATA:
 	case BTRFS_RESERVE_FLUSH_ALL:
 		wait_reserve_ticket(fs_info, space_info, ticket);
 		break;
@@ -1002,10 +1043,8 @@ static int handle_reserve_ticket(struct btrfs_fs_info *fs_info,
 						evict_flush_states,
 						ARRAY_SIZE(evict_flush_states));
 		break;
-	case BTRFS_RESERVE_FLUSH_DATA:
 	case BTRFS_RESERVE_FLUSH_FREE_SPACE_INODE:
-		priority_reclaim_data_space(fs_info, space_info, ticket,
-					    flush);
+		priority_reclaim_data_space(fs_info, space_info, ticket);
 		break;
 	default:
 		ASSERT(0);
@@ -1054,6 +1093,7 @@ static int __reserve_bytes(struct btrfs_fs_info *fs_info,
 			   struct btrfs_space_info *space_info, u64 orig_bytes,
 			   enum btrfs_reserve_flush_enum flush)
 {
+	struct work_struct *async_work;
 	struct reserve_ticket ticket;
 	u64 used;
 	int ret = 0;
@@ -1062,6 +1102,11 @@ static int __reserve_bytes(struct btrfs_fs_info *fs_info,
 	ASSERT(orig_bytes);
 	ASSERT(!current->journal_info || flush != BTRFS_RESERVE_FLUSH_ALL);
 
+	if (flush == BTRFS_RESERVE_FLUSH_DATA)
+		async_work = &fs_info->async_data_reclaim_work;
+	else
+		async_work = &fs_info->async_reclaim_work;
+
 	spin_lock(&space_info->lock);
 	ret = -ENOSPC;
 	used = btrfs_space_info_used(space_info, true);
@@ -1091,7 +1136,8 @@ static int __reserve_bytes(struct btrfs_fs_info *fs_info,
 		ticket.bytes = orig_bytes;
 		ticket.error = 0;
 		init_waitqueue_head(&ticket.wait);
-		if (flush == BTRFS_RESERVE_FLUSH_ALL) {
+		if (flush == BTRFS_RESERVE_FLUSH_ALL ||
+		    flush == BTRFS_RESERVE_FLUSH_DATA) {
 			list_add_tail(&ticket.list, &space_info->tickets);
 			if (!space_info->flush) {
 				space_info->flush = 1;
@@ -1099,8 +1145,7 @@ static int __reserve_bytes(struct btrfs_fs_info *fs_info,
 							  space_info->flags,
 							  orig_bytes, flush,
 							  "enospc");
-				queue_work(system_unbound_wq,
-					   &fs_info->async_reclaim_work);
+				queue_work(system_unbound_wq, async_work);
 			}
 		} else {
 			list_add_tail(&ticket.list,
-- 
2.24.1


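The worker above is essentially a small state machine: walk the data flush
states in order, restart from the first state whenever flushing made progress
(visible as a change in tickets_id), and only give up once every state has run
without satisfying a ticket.  A minimal user-space model of that loop — kernel
structures replaced by stand-ins, names like flush_space_model invented for the
sketch — looks like this:

```c
#include <assert.h>
#include <stddef.h>

/* Stand-ins for the kernel's data flush states; the order matches the series. */
enum flush_state {
	ALLOC_CHUNK_FORCE,
	FLUSH_DELALLOC_WAIT,
	RUN_DELAYED_IPUTS,
	FLUSH_DELAYED_REFS,
	COMMIT_TRANS,
};

static const enum flush_state data_flush_states[] = {
	ALLOC_CHUNK_FORCE,
	FLUSH_DELALLOC_WAIT,
	RUN_DELAYED_IPUTS,
	FLUSH_DELAYED_REFS,
	COMMIT_TRANS,
};

/* Simplified space_info: no locking, just the fields the loop reads. */
struct space_info_model {
	int pending_tickets;		/* tickets still waiting for space */
	unsigned long tickets_id;	/* bumped each time a ticket is granted */
	int reclaimable;		/* units of space flushing can still find */
};

/* One flush attempt: grants a ticket whenever space can be reclaimed. */
static void flush_space_model(struct space_info_model *s, enum flush_state st)
{
	(void)st;			/* every state behaves the same in this model */
	if (s->reclaimable > 0 && s->pending_tickets > 0) {
		s->reclaimable--;
		s->pending_tickets--;
		s->tickets_id++;	/* progress: the loop restarts at state 0 */
	}
}

/* Returns the number of tickets left unserved; 0 means success. */
static int reclaim_data_space(struct space_info_model *s)
{
	size_t nstates = sizeof(data_flush_states) / sizeof(data_flush_states[0]);
	unsigned long last_tickets_id = s->tickets_id;
	size_t flush_state = 0;

	while (flush_state < nstates) {
		flush_space_model(s, data_flush_states[flush_state]);
		if (s->pending_tickets == 0)
			return 0;
		if (last_tickets_id == s->tickets_id) {
			flush_state++;		/* no progress: escalate */
		} else {
			last_tickets_id = s->tickets_id;
			flush_state = 0;	/* progress: start the cycle over */
		}
	}
	return s->pending_tickets;		/* all states exhausted: ENOSPC path */
}
```

In the kernel version the same progress check drives escalation versus reset,
with the extra wrinkle that a full filesystem fails the remaining tickets via
maybe_fail_all_tickets() instead of returning a count.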
^ permalink raw reply related	[flat|nested] 33+ messages in thread

* [PATCH 20/20] btrfs: kill the priority_reclaim_space helper
  2020-01-29 23:50 [PATCH 00/20][RFC] Convert data reservations to the ticketing infrastructure Josef Bacik
                   ` (18 preceding siblings ...)
  2020-01-29 23:50 ` [PATCH 19/20] btrfs: do async reclaim for data reservations Josef Bacik
@ 2020-01-29 23:50 ` Josef Bacik
  19 siblings, 0 replies; 33+ messages in thread
From: Josef Bacik @ 2020-01-29 23:50 UTC (permalink / raw)
  To: linux-btrfs, kernel-team

Now that only the metadata stuff is using the helper, merge it into the
priority metadata reclaim loop.

Signed-off-by: Josef Bacik <josef@toxicpanda.com>
---
 fs/btrfs/space-info.c | 35 ++++++++++++-----------------------
 1 file changed, 12 insertions(+), 23 deletions(-)

diff --git a/fs/btrfs/space-info.c b/fs/btrfs/space-info.c
index b47eca433c62..6db8ca1419cd 100644
--- a/fs/btrfs/space-info.c
+++ b/fs/btrfs/space-info.c
@@ -917,27 +917,6 @@ static const enum btrfs_flush_state evict_flush_states[] = {
 	COMMIT_TRANS,
 };
 
-static int priority_reclaim_space(struct btrfs_fs_info *fs_info,
-				  struct btrfs_space_info *space_info,
-				  struct reserve_ticket *ticket,
-				  u64 to_reclaim,
-				  const enum btrfs_flush_state *states,
-				  int states_nr)
-{
-	int flush_state = 0;
-	do {
-		flush_space(fs_info, space_info, to_reclaim, states[flush_state]);
-		flush_state++;
-		spin_lock(&space_info->lock);
-		if (ticket->bytes == 0) {
-			spin_unlock(&space_info->lock);
-			return 0;
-		}
-		spin_unlock(&space_info->lock);
-	} while (flush_state < states_nr);
-	return -ENOSPC;
-}
-
 static void priority_reclaim_metadata_space(struct btrfs_fs_info *fs_info,
 				struct btrfs_space_info *space_info,
 				struct reserve_ticket *ticket,
@@ -945,6 +924,7 @@ static void priority_reclaim_metadata_space(struct btrfs_fs_info *fs_info,
 				int states_nr)
 {
 	u64 to_reclaim;
+	int flush_state = 0;
 
 	spin_lock(&space_info->lock);
 	to_reclaim = btrfs_calc_reclaim_metadata_size(fs_info, space_info);
@@ -953,8 +933,17 @@ static void priority_reclaim_metadata_space(struct btrfs_fs_info *fs_info,
 		return;
 	}
 	spin_unlock(&space_info->lock);
-	priority_reclaim_space(fs_info, space_info, ticket, to_reclaim,
-			       states, states_nr);
+
+	do {
+		flush_space(fs_info, space_info, to_reclaim, states[flush_state]);
+		flush_state++;
+		spin_lock(&space_info->lock);
+		if (ticket->bytes == 0) {
+			spin_unlock(&space_info->lock);
+			return;
+		}
+		spin_unlock(&space_info->lock);
+	} while (flush_state < states_nr);
 }
 
 static void priority_reclaim_data_space(struct btrfs_fs_info *fs_info,
-- 
2.24.1



* Re: [PATCH 02/20] btrfs: remove orig from shrink_delalloc
  2020-01-29 23:50 ` [PATCH 02/20] btrfs: remove orig from shrink_delalloc Josef Bacik
@ 2020-01-30 11:44   ` Nikolay Borisov
  0 siblings, 0 replies; 33+ messages in thread
From: Nikolay Borisov @ 2020-01-30 11:44 UTC (permalink / raw)
  To: Josef Bacik, linux-btrfs, kernel-team



On 30.01.20 at 1:50, Josef Bacik wrote:
> We don't use this anywhere inside of shrink_delalloc, remove it.
> 
> Signed-off-by: Josef Bacik <josef@toxicpanda.com>

Reviewed-by: Nikolay Borisov <nborisov@suse.com>, but I'm sure David will
want a bit of history on how this became unused. Turns out Omar removed the
usage in 17024ad0a0fd ("Btrfs: fix early ENOSPC due to delalloc").


* Re: [PATCH 03/20] btrfs: handle U64_MAX for shrink_delalloc
  2020-01-29 23:50 ` [PATCH 03/20] btrfs: handle U64_MAX for shrink_delalloc Josef Bacik
@ 2020-01-30 11:46   ` Nikolay Borisov
  0 siblings, 0 replies; 33+ messages in thread
From: Nikolay Borisov @ 2020-01-30 11:46 UTC (permalink / raw)
  To: Josef Bacik, linux-btrfs, kernel-team



On 30.01.20 at 1:50, Josef Bacik wrote:
> Data allocations are going to want to pass in U64_MAX for flushing
> space, adjust shrink_delalloc to handle this properly.
> 
> Signed-off-by: Josef Bacik <josef@toxicpanda.com>

Reviewed-by: Nikolay Borisov <nborisov@suse.com>


* Re: [PATCH 01/20] btrfs: change nr to u64 in btrfs_start_delalloc_roots
  2020-01-29 23:50 ` [PATCH 01/20] btrfs: change nr to u64 in btrfs_start_delalloc_roots Josef Bacik
@ 2020-01-30 12:06   ` Nikolay Borisov
  2020-01-31 15:07     ` Josef Bacik
  0 siblings, 1 reply; 33+ messages in thread
From: Nikolay Borisov @ 2020-01-30 12:06 UTC (permalink / raw)
  To: Josef Bacik, linux-btrfs, kernel-team



On 30.01.20 at 1:50, Josef Bacik wrote:
> We have btrfs_wait_ordered_roots(), which takes a u64 for nr, but
> btrfs_start_delalloc_roots() takes an int for nr, which makes using
> them in conjunction, especially for something like (u64)-1, annoying and
> inconsistent.  Fix btrfs_start_delalloc_roots() to take a u64 for nr and
> adjust start_delalloc_inodes() and its callers appropriately.

nit: You could include one more sentence to be explicit about the fact
that 'nr' management is now delegated to start_delalloc_inodes, i.e. you
pass it as a pointer to that function, which in turn controls when
btrfs_start_delalloc_roots breaks out of its own loop.

> 
> Part of adjusting the callers to this means changing
> btrfs_writeback_inodes_sb_nr() to take a u64 for items.  This may be
> confusing because it seems unrelated, but the caller of
> btrfs_writeback_inodes_sb_nr() already passes in a u64, it's just the
> function variable that needs to be changed.
> 
> Signed-off-by: Josef Bacik <josef@toxicpanda.com>
> ---
>  fs/btrfs/ctree.h       |  2 +-
>  fs/btrfs/dev-replace.c |  2 +-
>  fs/btrfs/inode.c       | 27 +++++++++++----------------
>  fs/btrfs/ioctl.c       |  2 +-
>  fs/btrfs/space-info.c  |  2 +-
>  5 files changed, 15 insertions(+), 20 deletions(-)
> 

<snip>

> --- a/fs/btrfs/inode.c
> +++ b/fs/btrfs/inode.c
> @@ -9619,7 +9619,8 @@ static struct btrfs_delalloc_work *btrfs_alloc_delalloc_work(struct inode *inode
>   * some fairly slow code that needs optimization. This walks the list
>   * of all the inodes with pending delalloc and forces them to disk.
>   */
> -static int start_delalloc_inodes(struct btrfs_root *root, int nr, bool snapshot)
> +static int start_delalloc_inodes(struct btrfs_root *root, u64 *nr,
> +				 bool snapshot)
>  {
>  	struct btrfs_inode *binode;
>  	struct inode *inode;
> @@ -9659,9 +9660,11 @@ static int start_delalloc_inodes(struct btrfs_root *root, int nr, bool snapshot)
>  		list_add_tail(&work->list, &works);
>  		btrfs_queue_work(root->fs_info->flush_workers,
>  				 &work->work);
> -		ret++;
> -		if (nr != -1 && ret >= nr)
> -			goto out;
> +		if (*nr != U64_MAX) {
> +			(*nr)--;
> +			if (*nr == 0)
> +				goto out;
> +		}
>  		cond_resched();
>  		spin_lock(&root->delalloc_lock);
>  	}
> @@ -9686,18 +9689,15 @@ static int start_delalloc_inodes(struct btrfs_root *root, int nr, bool snapshot)
>  int btrfs_start_delalloc_snapshot(struct btrfs_root *root)
>  {
>  	struct btrfs_fs_info *fs_info = root->fs_info;
> -	int ret;
> +	u64 nr = U64_MAX;

This var is never used past start_delalloc_snapshot so you can remove it
and simply pass U64_MAX to start_delalloc_inodes.

>  
>  	if (test_bit(BTRFS_FS_STATE_ERROR, &fs_info->fs_state))
>  		return -EROFS;
>  
> -	ret = start_delalloc_inodes(root, -1, true);
> -	if (ret > 0)
> -		ret = 0;
> -	return ret;
> +	return start_delalloc_inodes(root, &nr, true);
>  }
>  
> -int btrfs_start_delalloc_roots(struct btrfs_fs_info *fs_info, int nr)
> +int btrfs_start_delalloc_roots(struct btrfs_fs_info *fs_info, u64 nr)
>  {
>  	struct btrfs_root *root;
>  	struct list_head splice;
> @@ -9720,15 +9720,10 @@ int btrfs_start_delalloc_roots(struct btrfs_fs_info *fs_info, int nr)
>  			       &fs_info->delalloc_roots);
>  		spin_unlock(&fs_info->delalloc_root_lock);
>  
> -		ret = start_delalloc_inodes(root, nr, false);
> +		ret = start_delalloc_inodes(root, &nr, false);
>  		btrfs_put_root(root);
>  		if (ret < 0)
>  			goto out;
> -
> -		if (nr != -1) {
> -			nr -= ret;
> -			WARN_ON(nr < 0);
> -		}
>  		spin_lock(&fs_info->delalloc_root_lock);
>  	}
>  	spin_unlock(&fs_info->delalloc_root_lock);

<snip>

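The point of the pointer-to-nr convention discussed above is easiest to see in
isolation.  Here is a hedged user-space sketch — root and inode bookkeeping
replaced by plain counts, function names invented for the sketch — of a shared
budget that the per-root walker decrements in place, so the outer loop over
roots always sees how much remains:

```c
#include <assert.h>
#include <stdint.h>

typedef uint64_t u64;
#define U64_MAX UINT64_MAX

/* Walk one root's delalloc inodes, consuming the shared budget *nr. */
static int start_some_inodes(int inodes_with_delalloc, u64 *nr)
{
	while (inodes_with_delalloc-- > 0) {
		/* ...queue writeback work for one inode... */
		if (*nr != U64_MAX) {
			(*nr)--;
			if (*nr == 0)
				break;		/* budget exhausted */
		}
	}
	return 0;
}

/*
 * Walk every root until the budget runs out.  U64_MAX means "flush
 * everything", replacing the old int nr == -1 convention.  Returns the
 * leftover budget so a test can inspect it.
 */
static u64 start_roots(const int *roots, int nroots, u64 nr)
{
	for (int i = 0; i < nroots && nr > 0; i++)
		start_some_inodes(roots[i], &nr);
	return nr;
}
```

Because the inner walker owns the decrement, the outer loop no longer needs
the `nr -= ret; WARN_ON(nr < 0);` bookkeeping the patch removes.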

* Re: [PATCH 04/20] btrfs: make shrink_delalloc take space_info as an arg
  2020-01-29 23:50 ` [PATCH 04/20] btrfs: make shrink_delalloc take space_info as an arg Josef Bacik
@ 2020-01-30 12:15   ` Nikolay Borisov
  2020-01-30 12:35   ` Nikolay Borisov
  1 sibling, 0 replies; 33+ messages in thread
From: Nikolay Borisov @ 2020-01-30 12:15 UTC (permalink / raw)
  To: Josef Bacik, linux-btrfs, kernel-team



On 30.01.20 at 1:50, Josef Bacik wrote:
> Currently shrink_delalloc just looks up the metadata space info, but
> this won't work if we're trying to reclaim space for data chunks.  We
> get the right space_info we want passed into flush_space, so simply pass
> that along to shrink_delalloc.
> 
> Signed-off-by: Josef Bacik <josef@toxicpanda.com>


Reviewed-by: Nikolay Borisov <nborisov@suse.com>


* Re: [PATCH 04/20] btrfs: make shrink_delalloc take space_info as an arg
  2020-01-29 23:50 ` [PATCH 04/20] btrfs: make shrink_delalloc take space_info as an arg Josef Bacik
  2020-01-30 12:15   ` Nikolay Borisov
@ 2020-01-30 12:35   ` Nikolay Borisov
  1 sibling, 0 replies; 33+ messages in thread
From: Nikolay Borisov @ 2020-01-30 12:35 UTC (permalink / raw)
  To: Josef Bacik, linux-btrfs, kernel-team



On 30.01.20 at 1:50, Josef Bacik wrote:
> Currently shrink_delalloc just looks up the metadata space info, but
> this won't work if we're trying to reclaim space for data chunks.  We
> get the right space_info we want passed into flush_space, so simply pass
> that along to shrink_delalloc.
> 
> Signed-off-by: Josef Bacik <josef@toxicpanda.com>

Reviewed-by: Nikolay Borisov <nborisov@suse.com>


* Re: [PATCH 05/20] btrfs: make ALLOC_CHUNK use the space info flags
  2020-01-29 23:50 ` [PATCH 05/20] btrfs: make ALLOC_CHUNK use the space info flags Josef Bacik
@ 2020-01-30 13:19   ` Nikolay Borisov
  0 siblings, 0 replies; 33+ messages in thread
From: Nikolay Borisov @ 2020-01-30 13:19 UTC (permalink / raw)
  To: Josef Bacik, linux-btrfs, kernel-team



On 30.01.20 at 1:50, Josef Bacik wrote:
> We have traditionally used flush_space() to flush metadata space, so
> we've been unconditionally using btrfs_metadata_alloc_profile() for our
> profile to allocate a chunk.  However if we're going to use this for
> data we need to use btrfs_get_alloc_profile() on the space_info we pass
> in.
> 
> Signed-off-by: Josef Bacik <josef@toxicpanda.com>

Reviewed-by: Nikolay Borisov <nborisov@suse.com>


* Re: [PATCH 06/20] btrfs: call btrfs_try_granting_tickets when freeing reserved bytes
  2020-01-29 23:50 ` [PATCH 06/20] btrfs: call btrfs_try_granting_tickets when freeing reserved bytes Josef Bacik
@ 2020-01-31 11:16   ` Nikolay Borisov
  0 siblings, 0 replies; 33+ messages in thread
From: Nikolay Borisov @ 2020-01-31 11:16 UTC (permalink / raw)
  To: Josef Bacik, linux-btrfs, kernel-team



On 30.01.20 at 1:50, Josef Bacik wrote:
> We were missing a call to btrfs_try_granting_tickets in
> btrfs_free_reserved_bytes, so add it to handle the case where we're able
> to satisfy an allocation because we've freed a pending reservation.
> 
> Signed-off-by: Josef Bacik <josef@toxicpanda.com>

Reviewed-by: Nikolay Borisov <nborisov@suse.com>

> ---
>  fs/btrfs/block-group.c | 2 ++
>  1 file changed, 2 insertions(+)
> 
> diff --git a/fs/btrfs/block-group.c b/fs/btrfs/block-group.c
> index 77ec0597bd17..616d0dd69394 100644
> --- a/fs/btrfs/block-group.c
> +++ b/fs/btrfs/block-group.c
> @@ -2932,6 +2932,8 @@ void btrfs_free_reserved_bytes(struct btrfs_block_group *cache,
>  	if (delalloc)
>  		cache->delalloc_bytes -= num_bytes;
>  	spin_unlock(&cache->lock);
> +
> +	btrfs_try_granting_tickets(cache->fs_info, space_info);
>  	spin_unlock(&space_info->lock);
>  }
>  
> 


* Re: [PATCH 07/20] btrfs: call btrfs_try_granting_tickets when unpinning anything
  2020-01-29 23:50 ` [PATCH 07/20] btrfs: call btrfs_try_granting_tickets when unpinning anything Josef Bacik
@ 2020-01-31 11:25   ` Nikolay Borisov
  0 siblings, 0 replies; 33+ messages in thread
From: Nikolay Borisov @ 2020-01-31 11:25 UTC (permalink / raw)
  To: Josef Bacik, linux-btrfs, kernel-team



On 30.01.20 at 1:50, Josef Bacik wrote:
> When unpinning we were only calling btrfs_try_granting_tickets() if
> global_rsv->space_info == space_info, which is problematic because we
> use ticketing for SYSTEM chunks, and want to use it for DATA as well.
> Fix this by moving this call outside of that if statement.
> 
> Signed-off-by: Josef Bacik <josef@toxicpanda.com>


Reviewed-by: Nikolay Borisov <nborisov@suse.com>


nit: I only wondered if making 'if (!readonly && return_free_space)'
the top-level 'if' and then having two ifs inside it, one being
'if (global_rsv->space_info == space_info)' and the other being
'if (len)', would make the code look better, but it doesn't - it adds
another level of nesting:

diff --git a/fs/btrfs/extent-tree.c b/fs/btrfs/extent-tree.c
index ab354ea09177..4ff9a3140864 100644
--- a/fs/btrfs/extent-tree.c
+++ b/fs/btrfs/extent-tree.c
@@ -2872,26 +2872,25 @@ static int unpin_extent_range(struct btrfs_fs_info *fs_info,
                        readonly = true;
                }
                spin_unlock(&cache->lock);
-               if (!readonly && return_free_space &&
-                   global_rsv->space_info == space_info) {
-                       u64 to_add = len;
-
-                       spin_lock(&global_rsv->lock);
-                       if (!global_rsv->full) {
-                               to_add = min(len, global_rsv->size -
-                                            global_rsv->reserved);
-                               global_rsv->reserved += to_add;
-                               btrfs_space_info_update_bytes_may_use(fs_info,
-                                               space_info, to_add);
-                               if (global_rsv->reserved >= global_rsv->size)
-                                       global_rsv->full = 1;
-                               len -= to_add;
+               if (!readonly && return_free_space) {
+                       if (global_rsv->space_info == space_info) {
+                               spin_lock(&global_rsv->lock);
+                               if (!global_rsv->full) {
+                                       u64 to_add = min(len, global_rsv->size -
+                                                        global_rsv->reserved);
+                                       global_rsv->reserved += to_add;
+                                       btrfs_space_info_update_bytes_may_use(fs_info,
+                                                       space_info, to_add);
+                                       if (global_rsv->reserved >= global_rsv->size)
+                                               global_rsv->full = 1;
+                                       len -= to_add;
+                               }
+                               spin_unlock(&global_rsv->lock);
                        }
-                       spin_unlock(&global_rsv->lock);
+                       if (len)
+                               btrfs_try_granting_tickets(fs_info, space_info);
                }
-               /* Add to any tickets we may have */
-               if (!readonly && return_free_space && len)
-                       btrfs_try_granting_tickets(fs_info, space_info);
                spin_unlock(&space_info->lock);
        }


* Re: [PATCH 08/20] btrfs: call btrfs_try_granting_tickets when reserving space
  2020-01-29 23:50 ` [PATCH 08/20] btrfs: call btrfs_try_granting_tickets when reserving space Josef Bacik
@ 2020-01-31 11:37   ` Nikolay Borisov
  0 siblings, 0 replies; 33+ messages in thread
From: Nikolay Borisov @ 2020-01-31 11:37 UTC (permalink / raw)
  To: Josef Bacik, linux-btrfs, kernel-team



On 30.01.20 at 1:50, Josef Bacik wrote:
> If we have compression on we could free up more space than we reserved,
> and thus be able to make a space reservation.  Add the call for this
> scenario.
> 
> Signed-off-by: Josef Bacik <josef@toxicpanda.com>


Reviewed-by: Nikolay Borisov <nborisov@suse.com>

> ---
>  fs/btrfs/block-group.c | 7 +++++++
>  1 file changed, 7 insertions(+)
> 
> diff --git a/fs/btrfs/block-group.c b/fs/btrfs/block-group.c
> index 616d0dd69394..aca4510f1f2d 100644
> --- a/fs/btrfs/block-group.c
> +++ b/fs/btrfs/block-group.c
> @@ -2899,6 +2899,13 @@ int btrfs_add_reserved_bytes(struct btrfs_block_group *cache,
>  						      space_info, -ram_bytes);
>  		if (delalloc)
>  			cache->delalloc_bytes += num_bytes;
> +
> +		/*
> +		 * Compression can use less space than we reserved, so wake
> +		 * tickets if that happens.
> +		 */
> +		if (num_bytes < ram_bytes)
> +			btrfs_try_granting_tickets(cache->fs_info, space_info);
>  	}
>  	spin_unlock(&cache->lock);
>  	spin_unlock(&space_info->lock);
> 

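To make the scenario in the commit message concrete, here is a small model of
the over-reservation compression creates — reserve ram_bytes up front, commit
only num_bytes after compression, and wake waiters when there is a surplus.
The struct and field names are stand-ins for the kernel's space accounting,
not its actual layout:

```c
#include <assert.h>
#include <stdint.h>

typedef uint64_t u64;

struct space_model {
	u64 bytes_may_use;	/* speculative reservations */
	u64 bytes_reserved;	/* space pinned to a real allocation */
	int tickets_woken;	/* times we tried to grant waiting tickets */
};

static void add_reserved_bytes(struct space_model *s, u64 ram_bytes, u64 num_bytes)
{
	/* pin what the allocation actually needs on disk */
	s->bytes_reserved += num_bytes;
	/* drop the full speculative reservation made at write time */
	s->bytes_may_use -= ram_bytes;
	/* compression used less than reserved: some waiter may now fit */
	if (num_bytes < ram_bytes)
		s->tickets_woken++;	/* stands in for btrfs_try_granting_tickets() */
}
```

The surplus (ram_bytes - num_bytes) flowing back out of bytes_may_use is
exactly why the wakeup is worth doing here.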

* Re: [PATCH 09/20] btrfs: use the btrfs_space_info_free_bytes_may_use helper for delalloc
  2020-01-29 23:50 ` [PATCH 09/20] btrfs: use the btrfs_space_info_free_bytes_may_use helper for delalloc Josef Bacik
@ 2020-01-31 11:56   ` Nikolay Borisov
  0 siblings, 0 replies; 33+ messages in thread
From: Nikolay Borisov @ 2020-01-31 11:56 UTC (permalink / raw)
  To: Josef Bacik, linux-btrfs, kernel-team



On 30.01.20 at 1:50, Josef Bacik wrote:
> We are going to use the ticket infrastructure for data, so use the
> btrfs_space_info_free_bytes_may_use() helper in
> btrfs_free_reserved_data_space_noquota() so we get the
> try_granting_tickets call when we free our reservation.
> 
> Signed-off-by: Josef Bacik <josef@toxicpanda.com>

Reviewed-by: Nikolay Borisov <nborisov@suse.com>


* Re: [PATCH 01/20] btrfs: change nr to u64 in btrfs_start_delalloc_roots
  2020-01-30 12:06   ` Nikolay Borisov
@ 2020-01-31 15:07     ` Josef Bacik
  2020-01-31 15:13       ` Nikolay Borisov
  0 siblings, 1 reply; 33+ messages in thread
From: Josef Bacik @ 2020-01-31 15:07 UTC (permalink / raw)
  To: Nikolay Borisov, linux-btrfs, kernel-team

On 1/30/20 7:06 AM, Nikolay Borisov wrote:
> 
> 
> On 30.01.20 at 1:50, Josef Bacik wrote:
>> We have btrfs_wait_ordered_roots(), which takes a u64 for nr, but
>> btrfs_start_delalloc_roots() takes an int for nr, which makes using
>> them in conjunction, especially for something like (u64)-1, annoying and
>> inconsistent.  Fix btrfs_start_delalloc_roots() to take a u64 for nr and
>> adjust start_delalloc_inodes() and its callers appropriately.
> 
> nit: You could include one more sentence to be explicit about the fact
> that 'nr' management is now delegated to start_delalloc_inodes, i.e. you
> pass it as a pointer to that function, which in turn controls when
> btrfs_start_delalloc_roots breaks out of its own loop.
> 
>>
>> Part of adjusting the callers to this means changing
>> btrfs_writeback_inodes_sb_nr() to take a u64 for items.  This may be
>> confusing because it seems unrelated, but the caller of
>> btrfs_writeback_inodes_sb_nr() already passes in a u64, it's just the
>> function variable that needs to be changed.
>>
>> Signed-off-by: Josef Bacik <josef@toxicpanda.com>
>> ---
>>   fs/btrfs/ctree.h       |  2 +-
>>   fs/btrfs/dev-replace.c |  2 +-
>>   fs/btrfs/inode.c       | 27 +++++++++++----------------
>>   fs/btrfs/ioctl.c       |  2 +-
>>   fs/btrfs/space-info.c  |  2 +-
>>   5 files changed, 15 insertions(+), 20 deletions(-)
>>
> 
> <snip>
> 
>> --- a/fs/btrfs/inode.c
>> +++ b/fs/btrfs/inode.c
>> @@ -9619,7 +9619,8 @@ static struct btrfs_delalloc_work *btrfs_alloc_delalloc_work(struct inode *inode
>>    * some fairly slow code that needs optimization. This walks the list
>>    * of all the inodes with pending delalloc and forces them to disk.
>>    */
>> -static int start_delalloc_inodes(struct btrfs_root *root, int nr, bool snapshot)
>> +static int start_delalloc_inodes(struct btrfs_root *root, u64 *nr,
>> +				 bool snapshot)
>>   {
>>   	struct btrfs_inode *binode;
>>   	struct inode *inode;
>> @@ -9659,9 +9660,11 @@ static int start_delalloc_inodes(struct btrfs_root *root, int nr, bool snapshot)
>>   		list_add_tail(&work->list, &works);
>>   		btrfs_queue_work(root->fs_info->flush_workers,
>>   				 &work->work);
>> -		ret++;
>> -		if (nr != -1 && ret >= nr)
>> -			goto out;
>> +		if (*nr != U64_MAX) {
>> +			(*nr)--;
>> +			if (*nr == 0)
>> +				goto out;
>> +		}
>>   		cond_resched();
>>   		spin_lock(&root->delalloc_lock);
>>   	}
>> @@ -9686,18 +9689,15 @@ static int start_delalloc_inodes(struct btrfs_root *root, int nr, bool snapshot)
>>   int btrfs_start_delalloc_snapshot(struct btrfs_root *root)
>>   {
>>   	struct btrfs_fs_info *fs_info = root->fs_info;
>> -	int ret;
>> +	u64 nr = U64_MAX;
> 
> This var is never used past start_delalloc_snapshot so you can remove it
> and simply pass U64_MAX to start_delalloc_inodes.

Except start_delalloc_inodes() is now taking a pointer, not the value itself. 
Thanks,

Josef


* Re: [PATCH 01/20] btrfs: change nr to u64 in btrfs_start_delalloc_roots
  2020-01-31 15:07     ` Josef Bacik
@ 2020-01-31 15:13       ` Nikolay Borisov
  0 siblings, 0 replies; 33+ messages in thread
From: Nikolay Borisov @ 2020-01-31 15:13 UTC (permalink / raw)
  To: Josef Bacik, linux-btrfs, kernel-team



On 31.01.20 at 17:07, Josef Bacik wrote:
> On 1/30/20 7:06 AM, Nikolay Borisov wrote:
<snip>

> Except start_delalloc_inodes() is now taking a pointer, not the value
> itself. Thanks,

Doh ... you are right :)


> 
> Josef



Thread overview: 33+ messages
2020-01-29 23:50 [PATCH 00/20][RFC] Convert data reservations to the ticketing infrastructure Josef Bacik
2020-01-29 23:50 ` [PATCH 01/20] btrfs: change nr to u64 in btrfs_start_delalloc_roots Josef Bacik
2020-01-30 12:06   ` Nikolay Borisov
2020-01-31 15:07     ` Josef Bacik
2020-01-31 15:13       ` Nikolay Borisov
2020-01-29 23:50 ` [PATCH 02/20] btrfs: remove orig from shrink_delalloc Josef Bacik
2020-01-30 11:44   ` Nikolay Borisov
2020-01-29 23:50 ` [PATCH 03/20] btrfs: handle U64_MAX for shrink_delalloc Josef Bacik
2020-01-30 11:46   ` Nikolay Borisov
2020-01-29 23:50 ` [PATCH 04/20] btrfs: make shrink_delalloc take space_info as an arg Josef Bacik
2020-01-30 12:15   ` Nikolay Borisov
2020-01-30 12:35   ` Nikolay Borisov
2020-01-29 23:50 ` [PATCH 05/20] btrfs: make ALLOC_CHUNK use the space info flags Josef Bacik
2020-01-30 13:19   ` Nikolay Borisov
2020-01-29 23:50 ` [PATCH 06/20] btrfs: call btrfs_try_granting_tickets when freeing reserved bytes Josef Bacik
2020-01-31 11:16   ` Nikolay Borisov
2020-01-29 23:50 ` [PATCH 07/20] btrfs: call btrfs_try_granting_tickets when unpinning anything Josef Bacik
2020-01-31 11:25   ` Nikolay Borisov
2020-01-29 23:50 ` [PATCH 08/20] btrfs: call btrfs_try_granting_tickets when reserving space Josef Bacik
2020-01-31 11:37   ` Nikolay Borisov
2020-01-29 23:50 ` [PATCH 09/20] btrfs: use the btrfs_space_info_free_bytes_may_use helper for delalloc Josef Bacik
2020-01-31 11:56   ` Nikolay Borisov
2020-01-29 23:50 ` [PATCH 10/20] btrfs: add flushing states for handling data reservations Josef Bacik
2020-01-29 23:50 ` [PATCH 11/20] btrfs: add btrfs_reserve_data_bytes and use it Josef Bacik
2020-01-29 23:50 ` [PATCH 12/20] btrfs: use ticketing for data space reservations Josef Bacik
2020-01-29 23:50 ` [PATCH 13/20] btrfs: run delayed iputs before committing the transaction for data Josef Bacik
2020-01-29 23:50 ` [PATCH 14/20] btrfs: flush delayed refs when trying to reserve data space Josef Bacik
2020-01-29 23:50 ` [PATCH 15/20] btrfs: serialize data reservations if we are flushing Josef Bacik
2020-01-29 23:50 ` [PATCH 16/20] btrfs: rework chunk allocate for data reservations Josef Bacik
2020-01-29 23:50 ` [PATCH 17/20] btrfs: drop the commit_cycles stuff " Josef Bacik
2020-01-29 23:50 ` [PATCH 18/20] btrfs: use the same helper for data and metadata reservations Josef Bacik
2020-01-29 23:50 ` [PATCH 19/20] btrfs: do async reclaim for data reservations Josef Bacik
2020-01-29 23:50 ` [PATCH 20/20] btrfs: kill the priority_reclaim_space helper Josef Bacik
