From: Qu Wenruo <quwenruo@cn.fujitsu.com>
To: linux-btrfs@vger.kernel.org, clm@fb.com
Cc: Wang Xiaoguang <wangxg.fnst@cn.fujitsu.com>
Subject: [PATCH v7 19/20] btrfs: try more times to alloc metadata reserve space
Date: Thu, 18 Feb 2016 13:42:57 +0800
Message-ID: <1455774178-3595-20-git-send-email-quwenruo@cn.fujitsu.com>
In-Reply-To: <1455774178-3595-1-git-send-email-quwenruo@cn.fujitsu.com>

From: Wang Xiaoguang <wangxg.fnst@cn.fujitsu.com>

In btrfs_delalloc_reserve_metadata(), the number of metadata bytes we try
to reserve is calculated from the difference between outstanding_extents
and reserved_extents.

When reserve_metadata_bytes() fails to reserve the desired metadata space,
it has already done some reclaim work, such as writing out ordered extents.

In that case, outstanding_extents and reserved_extents may have already
changed, so a further attempt may be able to reserve enough metadata space.

So this patch retries the reservation up to 3 times after an ENOSPC
failure, to make sure we have really run out of space before giving up.
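
The retry pattern can be modelled outside the kernel as follows. This is
only a sketch under stated assumptions: reserve_fn, reserve_with_retry(),
struct flaky and flaky_reserve() are hypothetical helpers, and the real
function also re-takes its locks and recomputes the reservation size on
each pass.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

#define MAX_RETRIES 3	/* same arbitrary bound as in the patch */

/* Hypothetical reservation callback: 0 on success, -ENOSPC on failure. */
typedef int (*reserve_fn)(void *ctx);

/* Call try_reserve once, then retry on -ENOSPC up to MAX_RETRIES times. */
static int reserve_with_retry(reserve_fn try_reserve, void *ctx, int *calls)
{
	int loops = 0;
	int ret;

	assert(try_reserve != NULL);
again:
	ret = try_reserve(ctx);
	if (calls)
		(*calls)++;
	/* Only -ENOSPC is retried; other results are returned at once. */
	if (ret == -ENOSPC && loops++ < MAX_RETRIES)
		goto again;
	return ret;
}

/* Hypothetical reserver that fails a fixed number of times, then succeeds. */
struct flaky {
	int failures_left;
};

static int flaky_reserve(void *ctx)
{
	struct flaky *f = ctx;

	if (f->failures_left > 0) {
		f->failures_left--;
		return -ENOSPC;
	}
	return 0;
}
```

With a post-incremented counter compared against 3, the reservation is
attempted once and then retried at most three more times, so a reserver
that always fails is called four times in total before -ENOSPC is
returned.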

Such false ENOSPC errors are mainly caused by small file extents combined
with time-consuming delalloc functions, so they mainly affect in-band
de-duplication. (Compression should also be affected, but LZO/zlib are
faster than SHA256, so it is harder to trigger there than with dedup.)

Signed-off-by: Wang Xiaoguang <wangxg.fnst@cn.fujitsu.com>
---
 fs/btrfs/extent-tree.c | 23 +++++++++++++++++++++--
 1 file changed, 21 insertions(+), 2 deletions(-)

diff --git a/fs/btrfs/extent-tree.c b/fs/btrfs/extent-tree.c
index 2a17c88..c60e24a 100644
--- a/fs/btrfs/extent-tree.c
+++ b/fs/btrfs/extent-tree.c
@@ -5669,6 +5669,7 @@ int btrfs_delalloc_reserve_metadata(struct inode *inode, u64 num_bytes)
 	bool delalloc_lock = true;
 	u64 to_free = 0;
 	unsigned dropped;
+	int loops = 0;
 
 	/* If we are a free space inode we need to not flush since we will be in
 	 * the middle of a transaction commit.  We also don't need the delalloc
@@ -5684,11 +5685,12 @@ int btrfs_delalloc_reserve_metadata(struct inode *inode, u64 num_bytes)
 	    btrfs_transaction_in_commit(root->fs_info))
 		schedule_timeout(1);
 
+	num_bytes = ALIGN(num_bytes, root->sectorsize);
+
+again:
 	if (delalloc_lock)
 		mutex_lock(&BTRFS_I(inode)->delalloc_mutex);
 
-	num_bytes = ALIGN(num_bytes, root->sectorsize);
-
 	spin_lock(&BTRFS_I(inode)->lock);
 	nr_extents = (unsigned)div64_u64(num_bytes +
 					 BTRFS_MAX_EXTENT_SIZE - 1,
@@ -5809,6 +5811,23 @@ out_fail:
 	}
 	if (delalloc_lock)
 		mutex_unlock(&BTRFS_I(inode)->delalloc_mutex);
+	/*
+	 * The number of metadata bytes to reserve is calculated from the
+	 * difference between outstanding_extents and reserved_extents. Even
+	 * when reserve_metadata_bytes() fails to reserve the wanted bytes, it
+	 * has already done some work to reclaim metadata space, so both
+	 * outstanding_extents and reserved_extents may have changed, and the
+	 * number of bytes we need to reserve may have changed too (it may be
+	 * smaller). So try to reserve again. This is especially useful for
+	 * online dedup, which can easily eat almost all metadata space.
+	 *
+	 * XXX: The limit of 3 is chosen arbitrarily. It's a good workaround
+	 * for online dedup; later we should find a better method to avoid the
+	 * dedup ENOSPC issue.
+	 */
+	if (unlikely(ret == -ENOSPC && loops++ < 3))
+		goto again;
+
 	return ret;
 }
 
-- 
2.7.1