linux-btrfs.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH 0/5][v2] Fix global reserve size and can overcommit
@ 2019-08-22 19:18 Josef Bacik
  2019-08-22 19:19 ` [PATCH 1/5] btrfs: change the minimum global reserve size Josef Bacik
                   ` (5 more replies)
  0 siblings, 6 replies; 12+ messages in thread
From: Josef Bacik @ 2019-08-22 19:18 UTC (permalink / raw)
  To: kernel-team, linux-btrfs

This didn't series didn't change much, but I did move things around a little bit
in previous series so these needed to be updated.

 fs/btrfs/block-rsv.c  | 36 ++++++++++++++++++++++++++----------
 fs/btrfs/space-info.c | 51 ++++++++++++++++++++++++++-------------------------
 2 files changed, 52 insertions(+), 35 deletions(-)

v1->v2:
- Rebased with the new name changes from previous patches.
- Added the reviewed-by's.
- Added a few lockdep_assert_held()'s.
- Patch " btrfs: use btrfs_try_granting_tickets in update_global_rsv" fixed to
  just use btrfs_try_granting_tickets directly since we're already holding the
  space_info->lock.

-- Original email --
We hit a pretty crappy corner case in production that resulted in boxes slowing
down to a crawl.

can_overcommit() will not allow us to overcommit if there is not enough "real"
space to satisfy the global reserve.  This is for hysterical raisins, we used to
not be able to allocate block groups a transaction commit time, so running out
of real space for the global reserve would result in an aborted transaction.

However that limitation was fixed years ago, so we no longer need to impose that
limitation on ourselves and can just overcommit with reckless abandon.

But this change exposed a bunch of corner cases in how we deal with very small
file systems.  A lot of enospc related xfstests make very tiny file systems.
Btrfs by default gives you an 8mib metadata chunk.  We default to having a
minimum global reserve size of 16mib.  This is problematic.

Before we would allocate a new metadata chunk, which actually meant we were
"passing" these tests with 24mib metadata chunks but 256kib of used metadata,
and drastically less data space than was intended.  With my change we started
failing these tests because we were no longer allocating the extra metadata
chunks.

The changes to the global reserve are to make it so we pass these xfstests, as
well as add some real hard logic to the minimum size of the global reserve,
rather than just an arbitrary 16mib.  In practice those changes won't affect
real users because real users use more than 8mib of metadata and will get full
chunks.

These patches fix the original problem I intended to fix, and also have the nice
side-effect of making a bunch of the enospc xfstests finish _much_ faster.  The
diffstat is as follows

 fs/btrfs/block-rsv.c  | 43 ++++++++++++++++++++++++++++++-------------
 fs/btrfs/space-info.c | 51 +++++++++++++++++++++++++--------------------------
 2 files changed, 55 insertions(+), 39 deletions(-)

Thanks,

Josef


^ permalink raw reply	[flat|nested] 12+ messages in thread

* [PATCH 1/5] btrfs: change the minimum global reserve size
  2019-08-22 19:18 [PATCH 0/5][v2] Fix global reserve size and can overcommit Josef Bacik
@ 2019-08-22 19:19 ` Josef Bacik
  2019-08-22 19:19 ` [PATCH 2/5] btrfs: always reserve our entire size for the global reserve Josef Bacik
                   ` (4 subsequent siblings)
  5 siblings, 0 replies; 12+ messages in thread
From: Josef Bacik @ 2019-08-22 19:19 UTC (permalink / raw)
  To: kernel-team, linux-btrfs

It made sense to have the global reserve set at 16M in the past, but
since it is used less nowadays set the minimum size to the number of
items we'll need to update the main trees we update during a transaction
commit, plus some slop area so we can do unlinks if we need to.

In practice this doesn't affect normal file systems, but for xfstests
where we do things like fill up a fs and then rm * it can fall over in
weird ways.  This enables us for more sane behavior at extremely small
file system sizes.

Signed-off-by: Josef Bacik <josef@toxicpanda.com>
---
 fs/btrfs/block-rsv.c | 22 +++++++++++++++++++++-
 1 file changed, 21 insertions(+), 1 deletion(-)

diff --git a/fs/btrfs/block-rsv.c b/fs/btrfs/block-rsv.c
index 60f313888a7d..76be1d978c36 100644
--- a/fs/btrfs/block-rsv.c
+++ b/fs/btrfs/block-rsv.c
@@ -259,6 +259,7 @@ void btrfs_update_global_block_rsv(struct btrfs_fs_info *fs_info)
 	struct btrfs_block_rsv *block_rsv = &fs_info->global_block_rsv;
 	struct btrfs_space_info *sinfo = block_rsv->space_info;
 	u64 num_bytes;
+	unsigned min_items;
 
 	/*
 	 * The global block rsv is based on the size of the extent tree, the
@@ -268,7 +269,26 @@ void btrfs_update_global_block_rsv(struct btrfs_fs_info *fs_info)
 	num_bytes = btrfs_root_used(&fs_info->extent_root->root_item) +
 		btrfs_root_used(&fs_info->csum_root->root_item) +
 		btrfs_root_used(&fs_info->tree_root->root_item);
-	num_bytes = max_t(u64, num_bytes, SZ_16M);
+
+	/*
+	 * We at a minimum are going to modify the csum root, the tree root, and
+	 * the extent root.
+	 */
+	min_items = 3;
+
+	/*
+	 * But we also want to reserve enough space so we can do the fallback
+	 * global reserve for an unlink, which is an additional 5 items (see the
+	 * comment in __unlink_start_trans for what we're modifying.)
+	 *
+	 * But we also need space for the delayed ref updates from the unlink,
+	 * so its 10, 5 for the actual operation, and 5 for the delayed ref
+	 * updates.
+	 */
+	min_items += 10;
+
+	num_bytes = max_t(u64, num_bytes,
+			  btrfs_calc_insert_metadata_size(fs_info, min_items));
 
 	spin_lock(&sinfo->lock);
 	spin_lock(&block_rsv->lock);
-- 
2.21.0


^ permalink raw reply related	[flat|nested] 12+ messages in thread

* [PATCH 2/5] btrfs: always reserve our entire size for the global reserve
  2019-08-22 19:18 [PATCH 0/5][v2] Fix global reserve size and can overcommit Josef Bacik
  2019-08-22 19:19 ` [PATCH 1/5] btrfs: change the minimum global reserve size Josef Bacik
@ 2019-08-22 19:19 ` Josef Bacik
  2019-08-22 19:19 ` [PATCH 3/5] btrfs: use btrfs_try_granting_tickets in update_global_rsv Josef Bacik
                   ` (3 subsequent siblings)
  5 siblings, 0 replies; 12+ messages in thread
From: Josef Bacik @ 2019-08-22 19:19 UTC (permalink / raw)
  To: kernel-team, linux-btrfs; +Cc: Nikolay Borisov

While messing with the overcommit logic I noticed that sometimes we'd
ENOSPC out when really we should have run out of space much earlier.  It
turns out it's because we'll only reserve up to the free amount left in
the space info for the global reserve, but that doesn't make sense with
overcommit because we could be well above our actual size.  This results
in the global reserve not carving out it's entire reservation, and thus
not putting enough pressure on the rest of the infrastructure to do the
right thing and ENOSPC out at a convenient time.  Fix this by always
taking our full reservation amount for the global reserve.

Signed-off-by: Josef Bacik <josef@toxicpanda.com>
Reviewed-by: Nikolay Borisov <nborisov@suse.com>
---
 fs/btrfs/block-rsv.c | 13 ++++---------
 1 file changed, 4 insertions(+), 9 deletions(-)

diff --git a/fs/btrfs/block-rsv.c b/fs/btrfs/block-rsv.c
index 76be1d978c36..67071bb8e433 100644
--- a/fs/btrfs/block-rsv.c
+++ b/fs/btrfs/block-rsv.c
@@ -296,15 +296,10 @@ void btrfs_update_global_block_rsv(struct btrfs_fs_info *fs_info)
 	block_rsv->size = min_t(u64, num_bytes, SZ_512M);
 
 	if (block_rsv->reserved < block_rsv->size) {
-		num_bytes = btrfs_space_info_used(sinfo, true);
-		if (sinfo->total_bytes > num_bytes) {
-			num_bytes = sinfo->total_bytes - num_bytes;
-			num_bytes = min(num_bytes,
-					block_rsv->size - block_rsv->reserved);
-			block_rsv->reserved += num_bytes;
-			btrfs_space_info_update_bytes_may_use(fs_info, sinfo,
-							      num_bytes);
-		}
+		num_bytes = block_rsv->size - block_rsv->reserved;
+		block_rsv->reserved += num_bytes;
+		btrfs_space_info_update_bytes_may_use(fs_info, sinfo,
+						      num_bytes);
 	} else if (block_rsv->reserved > block_rsv->size) {
 		num_bytes = block_rsv->reserved - block_rsv->size;
 		btrfs_space_info_update_bytes_may_use(fs_info, sinfo,
-- 
2.21.0


^ permalink raw reply related	[flat|nested] 12+ messages in thread

* [PATCH 3/5] btrfs: use btrfs_try_granting_tickets in update_global_rsv
  2019-08-22 19:18 [PATCH 0/5][v2] Fix global reserve size and can overcommit Josef Bacik
  2019-08-22 19:19 ` [PATCH 1/5] btrfs: change the minimum global reserve size Josef Bacik
  2019-08-22 19:19 ` [PATCH 2/5] btrfs: always reserve our entire size for the global reserve Josef Bacik
@ 2019-08-22 19:19 ` Josef Bacik
  2019-08-22 19:19 ` [PATCH 4/5] btrfs: do not account global reserve in can_overcommit Josef Bacik
                   ` (2 subsequent siblings)
  5 siblings, 0 replies; 12+ messages in thread
From: Josef Bacik @ 2019-08-22 19:19 UTC (permalink / raw)
  To: kernel-team, linux-btrfs; +Cc: Nikolay Borisov

We have some annoying xfstests tests that will create a very small fs,
fill it up, delete it, and repeat to make sure everything works right.
This trips btrfs up sometimes because we may commit a transaction to
free space, but most of the free metadata space was being reserved by
the global reserve.  So we commit and update the global reserve, but the
space is simply added to bytes_may_use directly, instead of trying to
add it to existing tickets.  This results in ENOSPC when we really did
have space.  Fix this by calling btrfs_try_granting_tickets once we add
back our excess space to wake any pending tickets.

Signed-off-by: Josef Bacik <josef@toxicpanda.com>
Reviewed-by: Nikolay Borisov <nborisov@suse.com>
---
 fs/btrfs/block-rsv.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/fs/btrfs/block-rsv.c b/fs/btrfs/block-rsv.c
index 67071bb8e433..a292866221b0 100644
--- a/fs/btrfs/block-rsv.c
+++ b/fs/btrfs/block-rsv.c
@@ -305,6 +305,7 @@ void btrfs_update_global_block_rsv(struct btrfs_fs_info *fs_info)
 		btrfs_space_info_update_bytes_may_use(fs_info, sinfo,
 						      -num_bytes);
 		block_rsv->reserved = block_rsv->size;
+		btrfs_try_granting_tickets(fs_info, sinfo);
 	}
 
 	if (block_rsv->reserved == block_rsv->size)
-- 
2.21.0


^ permalink raw reply related	[flat|nested] 12+ messages in thread

* [PATCH 4/5] btrfs: do not account global reserve in can_overcommit
  2019-08-22 19:18 [PATCH 0/5][v2] Fix global reserve size and can overcommit Josef Bacik
                   ` (2 preceding siblings ...)
  2019-08-22 19:19 ` [PATCH 3/5] btrfs: use btrfs_try_granting_tickets in update_global_rsv Josef Bacik
@ 2019-08-22 19:19 ` Josef Bacik
  2020-01-23 16:14   ` Anand Jain
  2019-08-22 19:19 ` [PATCH 5/5] btrfs: add enospc debug messages for ticket failure Josef Bacik
  2019-08-23 13:17 ` [PATCH 0/5][v2] Fix global reserve size and can overcommit David Sterba
  5 siblings, 1 reply; 12+ messages in thread
From: Josef Bacik @ 2019-08-22 19:19 UTC (permalink / raw)
  To: kernel-team, linux-btrfs; +Cc: Nikolay Borisov

We ran into a problem in production where a box with plenty of space was
getting wedged doing ENOSPC flushing.  These boxes only had 20% of the
disk allocated, but their metadata space + global reserve was right at
the size of their metadata chunk.

In this case can_overcommit should be allowing allocations without
problem, but there's logic in can_overcommit that doesn't allow us to
overcommit if there's not enough real space to satisfy the global
reserve.

This is for historical reasons.  Before there were only certain places
we could allocate chunks.  We could go to commit the transaction and not
have enough space for our pending delayed refs and such and be unable to
allocate a new chunk.  This would result in a abort because of ENOSPC.
This code was added to solve this problem.

However since then we've gained the ability to always be able to
allocate a chunk.  So we can easily overcommit in these cases without
risking a transaction abort because of ENOSPC.

Also prior to now the global reserve really would be used because that's
the space we relied on for delayed refs.  With delayed refs being
tracked separately we no longer have to worry about running out of
delayed refs space while committing.  We are much less likely to
exhaust our global reserve space during transaction commit.

Fix the can_overcommit code to simply see if our current usage + what we
want is less than our current free space plus whatever slack space we
have in the disk is.  This solves the problem we were seeing in
production and keeps us from flushing as aggressively as we approach our
actual metadata size usage.

Signed-off-by: Josef Bacik <josef@toxicpanda.com>
Reviewed-by: Nikolay Borisov <nborisov@suse.com>
---
 fs/btrfs/space-info.c | 19 +------------------
 1 file changed, 1 insertion(+), 18 deletions(-)

diff --git a/fs/btrfs/space-info.c b/fs/btrfs/space-info.c
index a43f6287074b..3053b3e91b34 100644
--- a/fs/btrfs/space-info.c
+++ b/fs/btrfs/space-info.c
@@ -165,9 +165,7 @@ static int can_overcommit(struct btrfs_fs_info *fs_info,
 			  enum btrfs_reserve_flush_enum flush,
 			  bool system_chunk)
 {
-	struct btrfs_block_rsv *global_rsv = &fs_info->global_block_rsv;
 	u64 profile;
-	u64 space_size;
 	u64 avail;
 	u64 used;
 	int factor;
@@ -181,22 +179,7 @@ static int can_overcommit(struct btrfs_fs_info *fs_info,
 	else
 		profile = btrfs_metadata_alloc_profile(fs_info);
 
-	used = btrfs_space_info_used(space_info, false);
-
-	/*
-	 * We only want to allow over committing if we have lots of actual space
-	 * free, but if we don't have enough space to handle the global reserve
-	 * space then we could end up having a real enospc problem when trying
-	 * to allocate a chunk or some other such important allocation.
-	 */
-	spin_lock(&global_rsv->lock);
-	space_size = calc_global_rsv_need_space(global_rsv);
-	spin_unlock(&global_rsv->lock);
-	if (used + space_size >= space_info->total_bytes)
-		return 0;
-
-	used += space_info->bytes_may_use;
-
+	used = btrfs_space_info_used(space_info, true);
 	avail = atomic64_read(&fs_info->free_chunk_space);
 
 	/*
-- 
2.21.0


^ permalink raw reply related	[flat|nested] 12+ messages in thread

* [PATCH 5/5] btrfs: add enospc debug messages for ticket failure
  2019-08-22 19:18 [PATCH 0/5][v2] Fix global reserve size and can overcommit Josef Bacik
                   ` (3 preceding siblings ...)
  2019-08-22 19:19 ` [PATCH 4/5] btrfs: do not account global reserve in can_overcommit Josef Bacik
@ 2019-08-22 19:19 ` Josef Bacik
  2019-08-23 13:17 ` [PATCH 0/5][v2] Fix global reserve size and can overcommit David Sterba
  5 siblings, 0 replies; 12+ messages in thread
From: Josef Bacik @ 2019-08-22 19:19 UTC (permalink / raw)
  To: kernel-team, linux-btrfs; +Cc: Nikolay Borisov

When debugging weird enospc problems it's handy to be able to dump the
space info when we wake up all tickets, and see what the ticket values
are.  This helped me figure out cases where we were enospc'ing when we
shouldn't have been.

Signed-off-by: Josef Bacik <josef@toxicpanda.com>
Reviewed-by: Nikolay Borisov <nborisov@suse.com>
---
 fs/btrfs/space-info.c | 32 +++++++++++++++++++++++++-------
 1 file changed, 25 insertions(+), 7 deletions(-)

diff --git a/fs/btrfs/space-info.c b/fs/btrfs/space-info.c
index 3053b3e91b34..54d0f34682d8 100644
--- a/fs/btrfs/space-info.c
+++ b/fs/btrfs/space-info.c
@@ -258,14 +258,11 @@ do {									\
 	spin_unlock(&__rsv->lock);					\
 } while (0)
 
-void btrfs_dump_space_info(struct btrfs_fs_info *fs_info,
-			   struct btrfs_space_info *info, u64 bytes,
-			   int dump_block_groups)
+static void __btrfs_dump_space_info(struct btrfs_fs_info *fs_info,
+				    struct btrfs_space_info *info)
 {
-	struct btrfs_block_group_cache *cache;
-	int index = 0;
+	lockdep_assert_held(&info->lock);
 
-	spin_lock(&info->lock);
 	btrfs_info(fs_info, "space_info %llu has %llu free, is %sfull",
 		   info->flags,
 		   info->total_bytes - btrfs_space_info_used(info, true),
@@ -275,7 +272,6 @@ void btrfs_dump_space_info(struct btrfs_fs_info *fs_info,
 		info->total_bytes, info->bytes_used, info->bytes_pinned,
 		info->bytes_reserved, info->bytes_may_use,
 		info->bytes_readonly);
-	spin_unlock(&info->lock);
 
 	DUMP_BLOCK_RSV(fs_info, global_block_rsv);
 	DUMP_BLOCK_RSV(fs_info, trans_block_rsv);
@@ -283,6 +279,19 @@ void btrfs_dump_space_info(struct btrfs_fs_info *fs_info,
 	DUMP_BLOCK_RSV(fs_info, delayed_block_rsv);
 	DUMP_BLOCK_RSV(fs_info, delayed_refs_rsv);
 
+}
+
+void btrfs_dump_space_info(struct btrfs_fs_info *fs_info,
+			   struct btrfs_space_info *info, u64 bytes,
+			   int dump_block_groups)
+{
+	struct btrfs_block_group_cache *cache;
+	int index = 0;
+
+	spin_lock(&info->lock);
+	__btrfs_dump_space_info(fs_info, info);
+	spin_unlock(&info->lock);
+
 	if (!dump_block_groups)
 		return;
 
@@ -681,6 +690,11 @@ static bool maybe_fail_all_tickets(struct btrfs_fs_info *fs_info,
 	u64 tickets_id = space_info->tickets_id;
 	u64 first_ticket_bytes = 0;
 
+	if (btrfs_test_opt(fs_info, ENOSPC_DEBUG)) {
+		btrfs_info(fs_info, "cannot satisfy tickets, dumping space info");
+		__btrfs_dump_space_info(fs_info, space_info);
+	}
+
 	while (!list_empty(&space_info->tickets) &&
 	       tickets_id == space_info->tickets_id) {
 		ticket = list_first_entry(&space_info->tickets,
@@ -701,6 +715,10 @@ static bool maybe_fail_all_tickets(struct btrfs_fs_info *fs_info,
 		else if (first_ticket_bytes > ticket->bytes)
 			return true;
 
+		if (btrfs_test_opt(fs_info, ENOSPC_DEBUG))
+			btrfs_info(fs_info, "failing ticket with %llu bytes",
+				   ticket->bytes);
+
 		list_del_init(&ticket->list);
 		ticket->error = -ENOSPC;
 		wake_up(&ticket->wait);
-- 
2.21.0


^ permalink raw reply related	[flat|nested] 12+ messages in thread

* Re: [PATCH 0/5][v2] Fix global reserve size and can overcommit
  2019-08-22 19:18 [PATCH 0/5][v2] Fix global reserve size and can overcommit Josef Bacik
                   ` (4 preceding siblings ...)
  2019-08-22 19:19 ` [PATCH 5/5] btrfs: add enospc debug messages for ticket failure Josef Bacik
@ 2019-08-23 13:17 ` David Sterba
  2019-08-28 18:03   ` David Sterba
  5 siblings, 1 reply; 12+ messages in thread
From: David Sterba @ 2019-08-23 13:17 UTC (permalink / raw)
  To: Josef Bacik; +Cc: kernel-team, linux-btrfs

On Thu, Aug 22, 2019 at 03:18:59PM -0400, Josef Bacik wrote:
> This didn't series didn't change much, but I did move things around a little bit
> in previous series so these needed to be updated.
> 
>  fs/btrfs/block-rsv.c  | 36 ++++++++++++++++++++++++++----------
>  fs/btrfs/space-info.c | 51 ++++++++++++++++++++++++++-------------------------
>  2 files changed, 52 insertions(+), 35 deletions(-)

Added as topic branch to for-next (based on the ticket rework patchset).

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH 0/5][v2] Fix global reserve size and can overcommit
  2019-08-23 13:17 ` [PATCH 0/5][v2] Fix global reserve size and can overcommit David Sterba
@ 2019-08-28 18:03   ` David Sterba
  0 siblings, 0 replies; 12+ messages in thread
From: David Sterba @ 2019-08-28 18:03 UTC (permalink / raw)
  To: David Sterba; +Cc: Josef Bacik, kernel-team, linux-btrfs

On Fri, Aug 23, 2019 at 03:17:26PM +0200, David Sterba wrote:
> On Thu, Aug 22, 2019 at 03:18:59PM -0400, Josef Bacik wrote:
> > This didn't series didn't change much, but I did move things around a little bit
> > in previous series so these needed to be updated.
> > 
> >  fs/btrfs/block-rsv.c  | 36 ++++++++++++++++++++++++++----------
> >  fs/btrfs/space-info.c | 51 ++++++++++++++++++++++++++-------------------------
> >  2 files changed, 52 insertions(+), 35 deletions(-)
> 
> Added as topic branch to for-next (based on the ticket rework patchset).

Moved to misc-next.

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH 4/5] btrfs: do not account global reserve in can_overcommit
  2019-08-22 19:19 ` [PATCH 4/5] btrfs: do not account global reserve in can_overcommit Josef Bacik
@ 2020-01-23 16:14   ` Anand Jain
  2020-01-24  0:48     ` Josef Bacik
  0 siblings, 1 reply; 12+ messages in thread
From: Anand Jain @ 2020-01-23 16:14 UTC (permalink / raw)
  To: Josef Bacik, kernel-team, linux-btrfs; +Cc: Nikolay Borisov



This patch is causing regression in the test case generic/027
and generic/275 with MKFS_OPTIONS="-n64K" on x86_64.

Both of these tests, test FS behavior at ENOSPC.

In generic/027, it fails to delete a file with ENOSPC

      +rm: cannot remove '/mnt/scratch/testdir/6/file_1598': No space 
left on device

In generic/275, it failed to create at least 128k size file after
deleting 256k sized file. Failure may be fair taking into the cow part,
but any idea why it could be successful before this patch?

     +du: cannot access '/mnt/scratch/tmp1': No such file or directory
     +stat: cannot stat '/mnt/scratch/tmp1': No such file or directory

These fail on misc-next as well.

Thanks, Anand


On 8/23/19 3:19 AM, Josef Bacik wrote:
> We ran into a problem in production where a box with plenty of space was
> getting wedged doing ENOSPC flushing.  These boxes only had 20% of the
> disk allocated, but their metadata space + global reserve was right at
> the size of their metadata chunk.
> 
> In this case can_overcommit should be allowing allocations without
> problem, but there's logic in can_overcommit that doesn't allow us to
> overcommit if there's not enough real space to satisfy the global
> reserve.
> 
> This is for historical reasons.  Before there were only certain places
> we could allocate chunks.  We could go to commit the transaction and not
> have enough space for our pending delayed refs and such and be unable to
> allocate a new chunk.  This would result in a abort because of ENOSPC.
> This code was added to solve this problem.
> 
> However since then we've gained the ability to always be able to
> allocate a chunk.  So we can easily overcommit in these cases without
> risking a transaction abort because of ENOSPC.
> 
> Also prior to now the global reserve really would be used because that's
> the space we relied on for delayed refs.  With delayed refs being
> tracked separately we no longer have to worry about running out of
> delayed refs space while committing.  We are much less likely to
> exhaust our global reserve space during transaction commit.
> 
> Fix the can_overcommit code to simply see if our current usage + what we
> want is less than our current free space plus whatever slack space we
> have in the disk is.  This solves the problem we were seeing in
> production and keeps us from flushing as aggressively as we approach our
> actual metadata size usage.
> 
> Signed-off-by: Josef Bacik <josef@toxicpanda.com>
> Reviewed-by: Nikolay Borisov <nborisov@suse.com>
> ---
>   fs/btrfs/space-info.c | 19 +------------------
>   1 file changed, 1 insertion(+), 18 deletions(-)
> 
> diff --git a/fs/btrfs/space-info.c b/fs/btrfs/space-info.c
> index a43f6287074b..3053b3e91b34 100644
> --- a/fs/btrfs/space-info.c
> +++ b/fs/btrfs/space-info.c
> @@ -165,9 +165,7 @@ static int can_overcommit(struct btrfs_fs_info *fs_info,
>   			  enum btrfs_reserve_flush_enum flush,
>   			  bool system_chunk)
>   {
> -	struct btrfs_block_rsv *global_rsv = &fs_info->global_block_rsv;
>   	u64 profile;
> -	u64 space_size;
>   	u64 avail;
>   	u64 used;
>   	int factor;
> @@ -181,22 +179,7 @@ static int can_overcommit(struct btrfs_fs_info *fs_info,
>   	else
>   		profile = btrfs_metadata_alloc_profile(fs_info);
>   
> -	used = btrfs_space_info_used(space_info, false);
> -
> -	/*
> -	 * We only want to allow over committing if we have lots of actual space
> -	 * free, but if we don't have enough space to handle the global reserve
> -	 * space then we could end up having a real enospc problem when trying
> -	 * to allocate a chunk or some other such important allocation.
> -	 */
> -	spin_lock(&global_rsv->lock);
> -	space_size = calc_global_rsv_need_space(global_rsv);
> -	spin_unlock(&global_rsv->lock);
> -	if (used + space_size >= space_info->total_bytes)
> -		return 0;
> -
> -	used += space_info->bytes_may_use;
> -
> +	used = btrfs_space_info_used(space_info, true);
>   	avail = atomic64_read(&fs_info->free_chunk_space);
>   
>   	/*
> 


^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH 4/5] btrfs: do not account global reserve in can_overcommit
  2020-01-23 16:14   ` Anand Jain
@ 2020-01-24  0:48     ` Josef Bacik
  0 siblings, 0 replies; 12+ messages in thread
From: Josef Bacik @ 2020-01-24  0:48 UTC (permalink / raw)
  To: Anand Jain, kernel-team, linux-btrfs; +Cc: Nikolay Borisov

On 1/23/20 11:14 AM, Anand Jain wrote:
> 
> 
> This patch is causing regression in the test case generic/027
> and generic/275 with MKFS_OPTIONS="-n64K" on x86_64.
> 
> Both of these tests, test FS behavior at ENOSPC.
> 
> In generic/027, it fails to delete a file with ENOSPC
> 
>       +rm: cannot remove '/mnt/scratch/testdir/6/file_1598': No space left on 
> device
> 
> In generic/275, it failed to create at least 128k size file after
> deleting 256k sized file. Failure may be fair taking into the cow part,
> but any idea why it could be successful before this patch?
> 
>      +du: cannot access '/mnt/scratch/tmp1': No such file or directory
>      +stat: cannot stat '/mnt/scratch/tmp1': No such file or directory
> 
> These fail on misc-next as well.

I spot checked this and it appears because before we just weren't getting as 
full before, and now we are.

These failures are the next thing on my list, I'll likely have them fixed early 
next week.  Thanks,

Josef

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH 5/5] btrfs: add enospc debug messages for ticket failure
  2019-08-16 15:20 ` [PATCH 5/5] btrfs: add enospc debug messages for ticket failure Josef Bacik
@ 2019-08-21  6:39   ` Nikolay Borisov
  0 siblings, 0 replies; 12+ messages in thread
From: Nikolay Borisov @ 2019-08-21  6:39 UTC (permalink / raw)
  To: Josef Bacik, kernel-team, linux-btrfs



On 16.08.19 г. 18:20 ч., Josef Bacik wrote:
> When debugging weird enospc problems it's handy to be able to dump the
> space info when we wake up all tickets, and see what the ticket values
> are.  This helped me figure out cases where we were enospc'ing when we
> shouldn't have been.
> 
> Signed-off-by: Josef Bacik <josef@toxicpanda.com>

With the suggestion below implemented you can add:

Reviewed-by: Nikolay Borisov <nborisov@suse.com>

> ---
>  fs/btrfs/space-info.c | 32 ++++++++++++++++++++++++--------
>  1 file changed, 24 insertions(+), 8 deletions(-)
> 
> diff --git a/fs/btrfs/space-info.c b/fs/btrfs/space-info.c
> index 3d3f301bae26..2819fa91c4f0 100644
> --- a/fs/btrfs/space-info.c
> +++ b/fs/btrfs/space-info.c
> @@ -256,14 +256,9 @@ do {									\
>  	spin_unlock(&__rsv->lock);					\
>  } while (0)
>  
> -void btrfs_dump_space_info(struct btrfs_fs_info *fs_info,
> -			   struct btrfs_space_info *info, u64 bytes,
> -			   int dump_block_groups)
> +static void __btrfs_dump_space_info(struct btrfs_fs_info *fs_info,
> +				    struct btrfs_space_info *info)
>  {
> -	struct btrfs_block_group_cache *cache;
> -	int index = 0;
> -
> -	spin_lock(&info->lock);

lockdep_assert_held please

>  	btrfs_info(fs_info, "space_info %llu has %llu free, is %sfull",
>  		   info->flags,
>  		   info->total_bytes - btrfs_space_info_used(info, true),
> @@ -273,7 +268,6 @@ void btrfs_dump_space_info(struct btrfs_fs_info *fs_info,
>  		info->total_bytes, info->bytes_used, info->bytes_pinned,
>  		info->bytes_reserved, info->bytes_may_use,
>  		info->bytes_readonly);
> -	spin_unlock(&info->lock);
>  
>  	DUMP_BLOCK_RSV(fs_info, global_block_rsv);
>  	DUMP_BLOCK_RSV(fs_info, trans_block_rsv);
> @@ -281,6 +275,19 @@ void btrfs_dump_space_info(struct btrfs_fs_info *fs_info,
>  	DUMP_BLOCK_RSV(fs_info, delayed_block_rsv);
>  	DUMP_BLOCK_RSV(fs_info, delayed_refs_rsv);
>  
> +}
> +
> +void btrfs_dump_space_info(struct btrfs_fs_info *fs_info,
> +			   struct btrfs_space_info *info, u64 bytes,
> +			   int dump_block_groups)
> +{
> +	struct btrfs_block_group_cache *cache;
> +	int index = 0;
> +
> +	spin_lock(&info->lock);
> +	__btrfs_dump_space_info(fs_info, info);
> +	spin_unlock(&info->lock);
> +
>  	if (!dump_block_groups)
>  		return;
>  
> @@ -685,6 +692,11 @@ static bool wake_all_tickets(struct btrfs_fs_info *fs_info,
>  	u64 tickets_id = space_info->tickets_id;
>  	u64 first_ticket_bytes = 0;
>  
> +	if (btrfs_test_opt(fs_info, ENOSPC_DEBUG)) {
> +		btrfs_info(fs_info, "cannot satisfy tickets, dumping space info");
> +		__btrfs_dump_space_info(fs_info, space_info);
> +	}
> +
>  	while (!list_empty(&space_info->tickets) &&
>  	       tickets_id == space_info->tickets_id) {
>  		ticket = list_first_entry(&space_info->tickets,
> @@ -705,6 +717,10 @@ static bool wake_all_tickets(struct btrfs_fs_info *fs_info,
>  		else if (first_ticket_bytes > ticket->bytes)
>  			return true;
>  
> +		if (btrfs_test_opt(fs_info, ENOSPC_DEBUG))
> +			btrfs_info(fs_info, "failing ticket with %llu bytes",
> +				   ticket->bytes);
> +
>  		list_del_init(&ticket->list);
>  		ticket->error = -ENOSPC;
>  		wake_up(&ticket->wait);
> 

^ permalink raw reply	[flat|nested] 12+ messages in thread

* [PATCH 5/5] btrfs: add enospc debug messages for ticket failure
  2019-08-16 15:20 [PATCH 0/5] " Josef Bacik
@ 2019-08-16 15:20 ` Josef Bacik
  2019-08-21  6:39   ` Nikolay Borisov
  0 siblings, 1 reply; 12+ messages in thread
From: Josef Bacik @ 2019-08-16 15:20 UTC (permalink / raw)
  To: linux-btrfs, kernel-team

When debugging weird enospc problems it's handy to be able to dump the
space info when we wake up all tickets, and see what the ticket values
are.  This helped me figure out cases where we were enospc'ing when we
shouldn't have been.

Signed-off-by: Josef Bacik <josef@toxicpanda.com>
---
 fs/btrfs/space-info.c | 32 ++++++++++++++++++++++++--------
 1 file changed, 24 insertions(+), 8 deletions(-)

diff --git a/fs/btrfs/space-info.c b/fs/btrfs/space-info.c
index 3d3f301bae26..2819fa91c4f0 100644
--- a/fs/btrfs/space-info.c
+++ b/fs/btrfs/space-info.c
@@ -256,14 +256,9 @@ do {									\
 	spin_unlock(&__rsv->lock);					\
 } while (0)
 
-void btrfs_dump_space_info(struct btrfs_fs_info *fs_info,
-			   struct btrfs_space_info *info, u64 bytes,
-			   int dump_block_groups)
+static void __btrfs_dump_space_info(struct btrfs_fs_info *fs_info,
+				    struct btrfs_space_info *info)
 {
-	struct btrfs_block_group_cache *cache;
-	int index = 0;
-
-	spin_lock(&info->lock);
 	btrfs_info(fs_info, "space_info %llu has %llu free, is %sfull",
 		   info->flags,
 		   info->total_bytes - btrfs_space_info_used(info, true),
@@ -273,7 +268,6 @@ void btrfs_dump_space_info(struct btrfs_fs_info *fs_info,
 		info->total_bytes, info->bytes_used, info->bytes_pinned,
 		info->bytes_reserved, info->bytes_may_use,
 		info->bytes_readonly);
-	spin_unlock(&info->lock);
 
 	DUMP_BLOCK_RSV(fs_info, global_block_rsv);
 	DUMP_BLOCK_RSV(fs_info, trans_block_rsv);
@@ -281,6 +275,19 @@ void btrfs_dump_space_info(struct btrfs_fs_info *fs_info,
 	DUMP_BLOCK_RSV(fs_info, delayed_block_rsv);
 	DUMP_BLOCK_RSV(fs_info, delayed_refs_rsv);
 
+}
+
+void btrfs_dump_space_info(struct btrfs_fs_info *fs_info,
+			   struct btrfs_space_info *info, u64 bytes,
+			   int dump_block_groups)
+{
+	struct btrfs_block_group_cache *cache;
+	int index = 0;
+
+	spin_lock(&info->lock);
+	__btrfs_dump_space_info(fs_info, info);
+	spin_unlock(&info->lock);
+
 	if (!dump_block_groups)
 		return;
 
@@ -685,6 +692,11 @@ static bool wake_all_tickets(struct btrfs_fs_info *fs_info,
 	u64 tickets_id = space_info->tickets_id;
 	u64 first_ticket_bytes = 0;
 
+	if (btrfs_test_opt(fs_info, ENOSPC_DEBUG)) {
+		btrfs_info(fs_info, "cannot satisfy tickets, dumping space info");
+		__btrfs_dump_space_info(fs_info, space_info);
+	}
+
 	while (!list_empty(&space_info->tickets) &&
 	       tickets_id == space_info->tickets_id) {
 		ticket = list_first_entry(&space_info->tickets,
@@ -705,6 +717,10 @@ static bool wake_all_tickets(struct btrfs_fs_info *fs_info,
 		else if (first_ticket_bytes > ticket->bytes)
 			return true;
 
+		if (btrfs_test_opt(fs_info, ENOSPC_DEBUG))
+			btrfs_info(fs_info, "failing ticket with %llu bytes",
+				   ticket->bytes);
+
 		list_del_init(&ticket->list);
 		ticket->error = -ENOSPC;
 		wake_up(&ticket->wait);
-- 
2.21.0


^ permalink raw reply related	[flat|nested] 12+ messages in thread

end of thread, other threads:[~2020-01-24  0:48 UTC | newest]

Thread overview: 12+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2019-08-22 19:18 [PATCH 0/5][v2] Fix global reserve size and can overcommit Josef Bacik
2019-08-22 19:19 ` [PATCH 1/5] btrfs: change the minimum global reserve size Josef Bacik
2019-08-22 19:19 ` [PATCH 2/5] btrfs: always reserve our entire size for the global reserve Josef Bacik
2019-08-22 19:19 ` [PATCH 3/5] btrfs: use btrfs_try_granting_tickets in update_global_rsv Josef Bacik
2019-08-22 19:19 ` [PATCH 4/5] btrfs: do not account global reserve in can_overcommit Josef Bacik
2020-01-23 16:14   ` Anand Jain
2020-01-24  0:48     ` Josef Bacik
2019-08-22 19:19 ` [PATCH 5/5] btrfs: add enospc debug messages for ticket failure Josef Bacik
2019-08-23 13:17 ` [PATCH 0/5][v2] Fix global reserve size and can overcommit David Sterba
2019-08-28 18:03   ` David Sterba
  -- strict thread matches above, loose matches on Subject: below --
2019-08-16 15:20 [PATCH 0/5] " Josef Bacik
2019-08-16 15:20 ` [PATCH 5/5] btrfs: add enospc debug messages for ticket failure Josef Bacik
2019-08-21  6:39   ` Nikolay Borisov

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).