All of lore.kernel.org
 help / color / mirror / Atom feed
From: Josef Bacik <josef@toxicpanda.com>
To: linux-btrfs@vger.kernel.org, kernel-team@fb.com
Subject: [PATCH][v2] btrfs: stop partially refilling tickets when releasing space
Date: Wed, 28 Aug 2019 11:15:24 -0400	[thread overview]
Message-ID: <20190828151524.17706-1-josef@toxicpanda.com> (raw)
In-Reply-To: <20190822191102.13732-5-josef@toxicpanda.com>

btrfs_space_info_add_old_bytes is used when adding the extra space from
an existing reservation back into the space_info to be used by any
waiting tickets.  In order to keep us from overcommitting we check to
make sure that we can still use this space for our reserve ticket, and
if we cannot we'll simply subtract it from space_info->bytes_may_use.

However this is problematic, because it assumes that only changes to
bytes_may_use would affect our ability to make reservations.  Any
changes to bytes_reserved would be missed.  If we were unable to make a
reservation prior because of reserved space, but that reserved space was
free'd due to unlink or truncate and we were allowed to immediately
reclaim that metadata space we would still ENOSPC.

Consider the example where we create a file with a bunch of extents,
using up 2mib of actual space for the new tree blocks.  Then we try to
make a reservation of 2mib but we do not have enough space to make this
reservation.  The iput() occurs in another thread and we remove this
space, and since we did not write the blocks we simply do
space_info->bytes_reserved -= 2mib.  We would never see this because we
do not check our space info used, we just try to re-use the freed
reservations.

To fix this problem, and to greatly simplify the wakeup code, do away
with this partial refilling nonsense.  Use
btrfs_space_info_add_old_bytes to subtract the reservation from
space_info->bytes_may_use, and then check the ticket against the total
used of the space_info the same way we do with the initial reservation
attempt.

This keeps the reservation logic consistent and solves the problem of
early ENOSPC in the case that we free up space in places other than
bytes_may_use and bytes_pinned.  Thanks,

Signed-off-by: Josef Bacik <josef@toxicpanda.com>
---
v1->v2:
- Simply updated the changelog, what I was describing couldn't actually happen.
  I went back and re-ran tests and added in tracing and realized it was
  bytes_reserved that was changing without telling anybody, not that we were
  removing more of bytes_may_use than expected.  Updated the changelog to
  reflect this as well as hopefully make it clearer the motivation for the
  change.

 fs/btrfs/space-info.c | 43 ++++++++++++++++---------------------------
 1 file changed, 16 insertions(+), 27 deletions(-)

diff --git a/fs/btrfs/space-info.c b/fs/btrfs/space-info.c
index a0a36d5768e1..357fe7548e07 100644
--- a/fs/btrfs/space-info.c
+++ b/fs/btrfs/space-info.c
@@ -233,52 +233,41 @@ void btrfs_space_info_add_old_bytes(struct btrfs_fs_info *fs_info,
 				    struct btrfs_space_info *space_info,
 				    u64 num_bytes)
 {
-	struct reserve_ticket *ticket;
 	struct list_head *head;
-	u64 used;
 	enum btrfs_reserve_flush_enum flush = BTRFS_RESERVE_NO_FLUSH;
-	bool check_overcommit = false;
 
 	spin_lock(&space_info->lock);
 	head = &space_info->priority_tickets;
+	btrfs_space_info_update_bytes_may_use(fs_info, space_info, -num_bytes);
 
-	/*
-	 * If we are over our limit then we need to check and see if we can
-	 * overcommit, and if we can't then we just need to free up our space
-	 * and not satisfy any requests.
-	 */
-	used = btrfs_space_info_used(space_info, true);
-	if (used - num_bytes >= space_info->total_bytes)
-		check_overcommit = true;
 again:
-	while (!list_empty(head) && num_bytes) {
-		ticket = list_first_entry(head, struct reserve_ticket,
-					  list);
-		/*
-		 * We use 0 bytes because this space is already reserved, so
-		 * adding the ticket space would be a double count.
-		 */
-		if (check_overcommit &&
-		    !can_overcommit(fs_info, space_info, 0, flush, false))
-			break;
-		if (num_bytes >= ticket->bytes) {
+	while (!list_empty(head)) {
+		struct reserve_ticket *ticket;
+		u64 used = btrfs_space_info_used(space_info, true);
+
+		ticket = list_first_entry(head, struct reserve_ticket, list);
+
+		/* Check and see if our ticket can be satisified now. */
+		if ((used + ticket->bytes <= space_info->total_bytes) ||
+		    can_overcommit(fs_info, space_info, ticket->bytes, flush,
+				   false)) {
+			btrfs_space_info_update_bytes_may_use(fs_info,
+							      space_info,
+							      ticket->bytes);
 			list_del_init(&ticket->list);
-			num_bytes -= ticket->bytes;
 			ticket->bytes = 0;
 			space_info->tickets_id++;
 			wake_up(&ticket->wait);
 		} else {
-			ticket->bytes -= num_bytes;
-			num_bytes = 0;
+			break;
 		}
 	}
 
-	if (num_bytes && head == &space_info->priority_tickets) {
+	if (head == &space_info->priority_tickets) {
 		head = &space_info->tickets;
 		flush = BTRFS_RESERVE_FLUSH_ALL;
 		goto again;
 	}
-	btrfs_space_info_update_bytes_may_use(fs_info, space_info, -num_bytes);
 	spin_unlock(&space_info->lock);
 }
 
-- 
2.21.0


  parent reply	other threads:[~2019-08-28 15:15 UTC|newest]

Thread overview: 23+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2019-08-22 19:10 [PATCH 0/9][v3] Rework reserve ticket handling Josef Bacik
2019-08-22 19:10 ` [PATCH 1/9] btrfs: do not allow reservations if we have pending tickets Josef Bacik
2019-08-23  7:33   ` Nikolay Borisov
2019-08-22 19:10 ` [PATCH 2/9] btrfs: roll tracepoint into btrfs_space_info_update helper Josef Bacik
2019-08-23 12:12   ` David Sterba
2019-08-22 19:10 ` [PATCH 3/9] btrfs: add space reservation tracepoint for reserved bytes Josef Bacik
2019-08-23 12:17   ` David Sterba
2019-08-22 19:10 ` [PATCH 4/9] btrfs: rework btrfs_space_info_add_old_bytes Josef Bacik
2019-08-23  7:48   ` Nikolay Borisov
2019-08-23 12:30     ` David Sterba
2019-08-28 15:15   ` Josef Bacik [this message]
2019-08-22 19:10 ` [PATCH 5/9] btrfs: refactor the ticket wakeup code Josef Bacik
2019-08-22 19:10 ` [PATCH 6/9] btrfs: rework wake_all_tickets Josef Bacik
2019-08-23  8:17   ` Nikolay Borisov
2019-08-27 13:04     ` David Sterba
2019-08-28 15:12   ` [PATCH][v2] " Josef Bacik
2019-08-22 19:11 ` [PATCH 7/9] btrfs: fix may_commit_transaction to deal with no partial filling Josef Bacik
2019-08-23  8:18   ` Nikolay Borisov
2019-08-22 19:11 ` [PATCH 8/9] btrfs: remove orig_bytes from reserve_ticket Josef Bacik
2019-08-22 19:11 ` [PATCH 9/9] btrfs: rename btrfs_space_info_add_old_bytes Josef Bacik
2019-08-23  8:18   ` Nikolay Borisov
2019-08-23 12:55 ` [PATCH 0/9][v3] Rework reserve ticket handling David Sterba
2019-08-28 18:02   ` David Sterba

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20190828151524.17706-1-josef@toxicpanda.com \
    --to=josef@toxicpanda.com \
    --cc=kernel-team@fb.com \
    --cc=linux-btrfs@vger.kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.