linux-btrfs.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH 0/3] btrfs: trim latency improvements
@ 2018-09-06 21:18 jeffm
  2018-09-06 21:18 ` [PATCH 1/3] btrfs: use ->devices list instead of ->alloc_list in btrfs_trim_fs jeffm
                   ` (3 more replies)
  0 siblings, 4 replies; 8+ messages in thread
From: jeffm @ 2018-09-06 21:18 UTC (permalink / raw)
  To: linux-btrfs; +Cc: Jeff Mahoney

From: Jeff Mahoney <jeffm@suse.com>

This patch set fixes a few issues with trim.

1) Fix device list iteration.  We're iterating the ->alloc_list while
   holding the device_list_mutex.  The ->alloc_list is protected by
   the chunk mutex and we don't want to hold it across the entire
   trim execution.  Instead, use the ->devices list, which is protected
   by the device_list_mutex.

2) Skip trim on devices that don't support it.  Rather than letting
   the block layer reject it, bounce out early.

3) Don't keep the commit_root_sem locked and the transaction pinned
   across the block layer component of trim.  We only need these to
   ensure the pending chunks list doesn't go away underneath us, so
   it's safe to drop across the trim itself.  Historically, this
   caused issues when fstrim and balance would run at the same time
   since balance would produce lots of transactions and would
   have to wait constantly, causing problems for everything else that
   wanted to start a transaction.

-Jeff
---

Jeff Mahoney (3):
  btrfs: use ->devices list instead of ->alloc_list in btrfs_trim_fs
  btrfs: don't attempt to trim devices that don't support it
  btrfs: keep trim from interfering with transaction commits

 fs/btrfs/extent-tree.c | 33 +++++++++++++++++++++++----------
 1 file changed, 23 insertions(+), 10 deletions(-)

-- 
2.12.3

^ permalink raw reply	[flat|nested] 8+ messages in thread

* [PATCH 1/3] btrfs: use ->devices list instead of ->alloc_list in btrfs_trim_fs
  2018-09-06 21:18 [PATCH 0/3] btrfs: trim latency improvements jeffm
@ 2018-09-06 21:18 ` jeffm
  2018-09-14 16:22   ` David Sterba
  2018-09-06 21:18 ` [PATCH 2/3] btrfs: don't attempt to trim devices that don't support it jeffm
                   ` (2 subsequent siblings)
  3 siblings, 1 reply; 8+ messages in thread
From: jeffm @ 2018-09-06 21:18 UTC (permalink / raw)
  To: linux-btrfs; +Cc: Jeff Mahoney

From: Jeff Mahoney <jeffm@suse.com>

btrfs_trim_fs iterates over the fs_devices->alloc_list while holding
the device_list_mutex.  The problem is that ->alloc_list is protected
by the chunk mutex.  We don't want to hold the chunk mutex over
the trim of the entire file system.  Fortunately, the ->dev_list
list is protected by the dev_list mutex and while it will give us
all devices, including read-only devices, we already just skip the
read-only devices.  Then we can continue to take and release the chunk
mutex while scanning each device.

Fixes: 499f377f49f (btrfs: iterate over unused chunk space in FITRIM)
Signed-off-by: Jeff Mahoney <jeffm@suse.com>
---
 fs/btrfs/extent-tree.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/fs/btrfs/extent-tree.c b/fs/btrfs/extent-tree.c
index 3d9fe58c0080..a0e82589c3e8 100644
--- a/fs/btrfs/extent-tree.c
+++ b/fs/btrfs/extent-tree.c
@@ -11008,8 +11008,8 @@ int btrfs_trim_fs(struct btrfs_fs_info *fs_info, struct fstrim_range *range)
 	}
 
 	mutex_lock(&fs_info->fs_devices->device_list_mutex);
-	devices = &fs_info->fs_devices->alloc_list;
-	list_for_each_entry(device, devices, dev_alloc_list) {
+	devices = &fs_info->fs_devices->devices;
+	list_for_each_entry(device, devices, dev_list) {
 		ret = btrfs_trim_free_extents(device, range->minlen,
 					      &group_trimmed);
 		if (ret)
-- 
2.12.3

^ permalink raw reply related	[flat|nested] 8+ messages in thread

* [PATCH 2/3] btrfs: don't attempt to trim devices that don't support it
  2018-09-06 21:18 [PATCH 0/3] btrfs: trim latency improvements jeffm
  2018-09-06 21:18 ` [PATCH 1/3] btrfs: use ->devices list instead of ->alloc_list in btrfs_trim_fs jeffm
@ 2018-09-06 21:18 ` jeffm
  2018-09-14 16:27   ` David Sterba
  2018-09-06 21:18 ` [PATCH 3/3] btrfs: keep trim from interfering with transaction commits jeffm
  2018-09-14 16:33 ` [PATCH 0/3] btrfs: trim latency improvements David Sterba
  3 siblings, 1 reply; 8+ messages in thread
From: jeffm @ 2018-09-06 21:18 UTC (permalink / raw)
  To: linux-btrfs; +Cc: Jeff Mahoney

From: Jeff Mahoney <jeffm@suse.com>

We check whether any device the file system is using supports discard
in the ioctl call, but then we attempt to trim free extents on every
device regardless of whether discard is supported.  Due to the way
we mask off EOPNOTSUPP, we can end up issuing the trim operations
on each free range on devices that don't support it, just wasting time.

Fixes: 499f377f49f08 (btrfs: iterate over unused chunk space in FITRIM)
Signed-off-by: Jeff Mahoney <jeffm@suse.com>
---
 fs/btrfs/extent-tree.c | 4 ++++
 1 file changed, 4 insertions(+)

diff --git a/fs/btrfs/extent-tree.c b/fs/btrfs/extent-tree.c
index a0e82589c3e8..92e5e9fd9bdd 100644
--- a/fs/btrfs/extent-tree.c
+++ b/fs/btrfs/extent-tree.c
@@ -10887,6 +10887,10 @@ static int btrfs_trim_free_extents(struct btrfs_device *device,
 
 	*trimmed = 0;
 
+	/* Discard not supported = nothing to do. */
+	if (!blk_queue_discard(bdev_get_queue(device->bdev)))
+		return 0;
+
 	/* Not writeable = nothing to do. */
 	if (!test_bit(BTRFS_DEV_STATE_WRITEABLE, &device->dev_state))
 		return 0;
-- 
2.12.3

^ permalink raw reply related	[flat|nested] 8+ messages in thread

* [PATCH 3/3] btrfs: keep trim from interfering with transaction commits
  2018-09-06 21:18 [PATCH 0/3] btrfs: trim latency improvements jeffm
  2018-09-06 21:18 ` [PATCH 1/3] btrfs: use ->devices list instead of ->alloc_list in btrfs_trim_fs jeffm
  2018-09-06 21:18 ` [PATCH 2/3] btrfs: don't attempt to trim devices that don't support it jeffm
@ 2018-09-06 21:18 ` jeffm
  2018-09-14 16:28   ` David Sterba
  2018-09-14 16:33 ` [PATCH 0/3] btrfs: trim latency improvements David Sterba
  3 siblings, 1 reply; 8+ messages in thread
From: jeffm @ 2018-09-06 21:18 UTC (permalink / raw)
  To: linux-btrfs; +Cc: Jeff Mahoney

From: Jeff Mahoney <jeffm@suse.com>

Commit 499f377f49f08 (btrfs: iterate over unused chunk space in FITRIM)
fixed free space trimming, but introduced latency when it was running.
This is due to it pinning the transaction using both a incremented
refcount and holding the commit root sem for the duration of a single
trim operation.

This was to ensure safety but it's unnecessary.  We already hold the the
chunk mutex so we know that the chunk we're using can't be allocated
while we're trimming it.

In order to check against chunks allocated already in this transaction,
we need to check the pending chunks list.  To to that safely without
joining the transaction (or attaching than then having to commit it) we
need to ensure that the dev root's commit root doesn't change underneath
us and the pending chunk lists stays around until we're done with it.

We can ensure the former by holding the commit root sem and the latter
by pinning the transaction.  We do this now, but the critical section
covers the trim operation itself and we don't need to do that.

This patch moves the pinning and unpinning logic into helpers and
unpins the transaction after performing the search and check for
pending chunks.

Limiting the critical section of the transaction pinning improves
the latency substantially on slower storage (e.g. image files over NFS).

Fixes: 499f377f49f08 (btrfs: iterate over unused chunk space in FITRIM
Signed-off-by: Jeff Mahoney <jeffm@suse.com>
---
 fs/btrfs/extent-tree.c | 25 +++++++++++++++++--------
 1 file changed, 17 insertions(+), 8 deletions(-)

diff --git a/fs/btrfs/extent-tree.c b/fs/btrfs/extent-tree.c
index 92e5e9fd9bdd..8dc8e090667c 100644
--- a/fs/btrfs/extent-tree.c
+++ b/fs/btrfs/extent-tree.c
@@ -10870,14 +10870,16 @@ int btrfs_error_unpin_extent_range(struct btrfs_fs_info *fs_info,
  * We don't want a transaction for this since the discard may take a
  * substantial amount of time.  We don't require that a transaction be
  * running, but we do need to take a running transaction into account
- * to ensure that we're not discarding chunks that were released in
- * the current transaction.
+ * to ensure that we're not discarding chunks that were released or
+ * allocated in the current transaction.
  *
  * Holding the chunks lock will prevent other threads from allocating
  * or releasing chunks, but it won't prevent a running transaction
  * from committing and releasing the memory that the pending chunks
  * list head uses.  For that, we need to take a reference to the
- * transaction.
+ * transaction and hold the commit root sem.  We only need to hold
+ * it while performing the free space search since we have already
+ * held back allocations.
  */
 static int btrfs_trim_free_extents(struct btrfs_device *device,
 				   u64 minlen, u64 *trimmed)
@@ -10908,9 +10910,13 @@ static int btrfs_trim_free_extents(struct btrfs_device *device,
 
 		ret = mutex_lock_interruptible(&fs_info->chunk_mutex);
 		if (ret)
-			return ret;
+			break;
 
-		down_read(&fs_info->commit_root_sem);
+		ret = down_read_killable(&fs_info->commit_root_sem);
+		if (ret) {
+			mutex_unlock(&fs_info->chunk_mutex);
+			break;
+		}
 
 		spin_lock(&fs_info->trans_lock);
 		trans = fs_info->running_transaction;
@@ -10918,13 +10924,17 @@ static int btrfs_trim_free_extents(struct btrfs_device *device,
 			refcount_inc(&trans->use_count);
 		spin_unlock(&fs_info->trans_lock);
 
+		if (!trans)
+			up_read(&fs_info->commit_root_sem);
+
 		ret = find_free_dev_extent_start(trans, device, minlen, start,
 						 &start, &len);
-		if (trans)
+		if (trans) {
+			up_read(&fs_info->commit_root_sem);
 			btrfs_put_transaction(trans);
+		}
 
 		if (ret) {
-			up_read(&fs_info->commit_root_sem);
 			mutex_unlock(&fs_info->chunk_mutex);
 			if (ret == -ENOSPC)
 				ret = 0;
@@ -10932,7 +10942,6 @@ static int btrfs_trim_free_extents(struct btrfs_device *device,
 		}
 
 		ret = btrfs_issue_discard(device->bdev, start, len, &bytes);
-		up_read(&fs_info->commit_root_sem);
 		mutex_unlock(&fs_info->chunk_mutex);
 
 		if (ret)
-- 
2.12.3

^ permalink raw reply related	[flat|nested] 8+ messages in thread

* Re: [PATCH 1/3] btrfs: use ->devices list instead of ->alloc_list in btrfs_trim_fs
  2018-09-06 21:18 ` [PATCH 1/3] btrfs: use ->devices list instead of ->alloc_list in btrfs_trim_fs jeffm
@ 2018-09-14 16:22   ` David Sterba
  0 siblings, 0 replies; 8+ messages in thread
From: David Sterba @ 2018-09-14 16:22 UTC (permalink / raw)
  To: jeffm; +Cc: linux-btrfs

On Thu, Sep 06, 2018 at 05:18:14PM -0400, jeffm@suse.com wrote:
> From: Jeff Mahoney <jeffm@suse.com>
> 
> btrfs_trim_fs iterates over the fs_devices->alloc_list while holding
> the device_list_mutex.  The problem is that ->alloc_list is protected
> by the chunk mutex.  We don't want to hold the chunk mutex over
> the trim of the entire file system.  Fortunately, the ->dev_list
> list is protected by the dev_list mutex and while it will give us
> all devices, including read-only devices, we already just skip the
> read-only devices.  Then we can continue to take and release the chunk
> mutex while scanning each device.
> 
> Fixes: 499f377f49f (btrfs: iterate over unused chunk space in FITRIM)
> Signed-off-by: Jeff Mahoney <jeffm@suse.com>

Reviewed-by: David Sterba <dsterba@suse.com>

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH 2/3] btrfs: don't attempt to trim devices that don't support it
  2018-09-06 21:18 ` [PATCH 2/3] btrfs: don't attempt to trim devices that don't support it jeffm
@ 2018-09-14 16:27   ` David Sterba
  0 siblings, 0 replies; 8+ messages in thread
From: David Sterba @ 2018-09-14 16:27 UTC (permalink / raw)
  To: jeffm; +Cc: linux-btrfs

On Thu, Sep 06, 2018 at 05:18:15PM -0400, jeffm@suse.com wrote:
> From: Jeff Mahoney <jeffm@suse.com>
> 
> We check whether any device the file system is using supports discard
> in the ioctl call, but then we attempt to trim free extents on every
> device regardless of whether discard is supported.  Due to the way
> we mask off EOPNOTSUPP, we can end up issuing the trim operations
> on each free range on devices that don't support it, just wasting time.
> 
> Fixes: 499f377f49f08 (btrfs: iterate over unused chunk space in FITRIM)
> Signed-off-by: Jeff Mahoney <jeffm@suse.com>

Reviewed-by: David Sterba <dsterba@suse.com>

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH 3/3] btrfs: keep trim from interfering with transaction commits
  2018-09-06 21:18 ` [PATCH 3/3] btrfs: keep trim from interfering with transaction commits jeffm
@ 2018-09-14 16:28   ` David Sterba
  0 siblings, 0 replies; 8+ messages in thread
From: David Sterba @ 2018-09-14 16:28 UTC (permalink / raw)
  To: jeffm; +Cc: linux-btrfs

On Thu, Sep 06, 2018 at 05:18:16PM -0400, jeffm@suse.com wrote:
> From: Jeff Mahoney <jeffm@suse.com>
> 
> Commit 499f377f49f08 (btrfs: iterate over unused chunk space in FITRIM)
> fixed free space trimming, but introduced latency when it was running.
> This is due to it pinning the transaction using both a incremented
> refcount and holding the commit root sem for the duration of a single
> trim operation.
> 
> This was to ensure safety but it's unnecessary.  We already hold the the
> chunk mutex so we know that the chunk we're using can't be allocated
> while we're trimming it.
> 
> In order to check against chunks allocated already in this transaction,
> we need to check the pending chunks list.  To to that safely without
> joining the transaction (or attaching than then having to commit it) we
> need to ensure that the dev root's commit root doesn't change underneath
> us and the pending chunk lists stays around until we're done with it.
> 
> We can ensure the former by holding the commit root sem and the latter
> by pinning the transaction.  We do this now, but the critical section
> covers the trim operation itself and we don't need to do that.
> 
> This patch moves the pinning and unpinning logic into helpers and
> unpins the transaction after performing the search and check for
> pending chunks.
> 
> Limiting the critical section of the transaction pinning improves
> the latency substantially on slower storage (e.g. image files over NFS).
> 
> Fixes: 499f377f49f08 (btrfs: iterate over unused chunk space in FITRIM
> Signed-off-by: Jeff Mahoney <jeffm@suse.com>

Reviewed-by: David Sterba <dsterba@suse.com>

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH 0/3] btrfs: trim latency improvements
  2018-09-06 21:18 [PATCH 0/3] btrfs: trim latency improvements jeffm
                   ` (2 preceding siblings ...)
  2018-09-06 21:18 ` [PATCH 3/3] btrfs: keep trim from interfering with transaction commits jeffm
@ 2018-09-14 16:33 ` David Sterba
  3 siblings, 0 replies; 8+ messages in thread
From: David Sterba @ 2018-09-14 16:33 UTC (permalink / raw)
  To: jeffm; +Cc: linux-btrfs

On Thu, Sep 06, 2018 at 05:18:13PM -0400, jeffm@suse.com wrote:
> From: Jeff Mahoney <jeffm@suse.com>
> 
> This patch set fixes a few issues with trim.
> 
> 1) Fix device list iteration.  We're iterating the ->alloc_list while
>    holding the device_list_mutex.  The ->alloc_list is protected by
>    the chunk mutex and we don't want to hold it across the entire
>    trim execution.  Instead, use the ->devices list, which is protected
>    by the device_list_mutex.
> 
> 2) Skip trim on devices that don't support it.  Rather than letting
>    the block layer reject it, bounce out early.
> 
> 3) Don't keep the commit_root_sem locked and the transaction pinned
>    across the block layer component of trim.  We only need these to
>    ensure the pending chunks list doesn't go away underneath us, so
>    it's safe to drop across the trim itself.  Historically, this
>    caused issues when fstrim and balance would run at the same time
>    since balance would produce lots of transactions and would
>    have to wait constantly, causing problems for everything else that
>    wanted to start a transaction.
> 
> -Jeff
> ---
> 
> Jeff Mahoney (3):
>   btrfs: use ->devices list instead of ->alloc_list in btrfs_trim_fs
>   btrfs: don't attempt to trim devices that don't support it
>   btrfs: keep trim from interfering with transaction commits

The patches have been in for-next for some time, I'm moving them to
misc-next now and will probably forward them to the next rc (5) with
other trim fixes from Qu.

Please note that the commit subject in the Fixes: tag should be in
"...", I've fixed that.

^ permalink raw reply	[flat|nested] 8+ messages in thread

end of thread, other threads:[~2018-09-14 21:48 UTC | newest]

Thread overview: 8+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2018-09-06 21:18 [PATCH 0/3] btrfs: trim latency improvements jeffm
2018-09-06 21:18 ` [PATCH 1/3] btrfs: use ->devices list instead of ->alloc_list in btrfs_trim_fs jeffm
2018-09-14 16:22   ` David Sterba
2018-09-06 21:18 ` [PATCH 2/3] btrfs: don't attempt to trim devices that don't support it jeffm
2018-09-14 16:27   ` David Sterba
2018-09-06 21:18 ` [PATCH 3/3] btrfs: keep trim from interfering with transaction commits jeffm
2018-09-14 16:28   ` David Sterba
2018-09-14 16:33 ` [PATCH 0/3] btrfs: trim latency improvements David Sterba

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).