* [PATCH v3 for-next 0/3] btrfs async discard fixes & improvements
@ 2020-12-06 15:56 Pavel Begunkov
  2020-12-06 15:56 ` [PATCH v3 1/3] btrfs: fix async discard stall Pavel Begunkov
                   ` (3 more replies)
  0 siblings, 4 replies; 5+ messages in thread
From: Pavel Begunkov @ 2020-12-06 15:56 UTC (permalink / raw)
  To: Chris Mason, Josef Bacik, David Sterba, linux-btrfs

Fix async discard stalls with the first patch, and address other minor
things.

v2: fix async discard stalls, see patch [1/3]
v3: if now == bg->discard_eligible_time, the discard state and index
    fail to be initialised. Always initialise them if peek returns a
    non-NULL bg, which is more resilient.

Pavel Begunkov (3):
  btrfs: fix async discard stall
  btrfs: fix racy access to discard_ctl data
  btrfs: don't overabuse discard lock

 fs/btrfs/discard.c | 70 ++++++++++++++++++++++++----------------------
 1 file changed, 36 insertions(+), 34 deletions(-)

-- 
2.24.0



* [PATCH v3 1/3] btrfs: fix async discard stall
  2020-12-06 15:56 [PATCH v3 for-next 0/3] btrfs async discard fixes & improvements Pavel Begunkov
@ 2020-12-06 15:56 ` Pavel Begunkov
  2020-12-06 15:56 ` [PATCH v3 2/3] btrfs: fix racy access to discard_ctl data Pavel Begunkov
                   ` (2 subsequent siblings)
  3 siblings, 0 replies; 5+ messages in thread
From: Pavel Begunkov @ 2020-12-06 15:56 UTC (permalink / raw)
  To: Chris Mason, Josef Bacik, David Sterba, linux-btrfs

It might happen that bg->discard_eligible_time was changed without
rescheduling, so btrfs_discard_workfn() wakes up earlier than that new
time, peek_discard_list() returns NULL, and all work halts and goes to
sleep without further rescheduling, even though there are block groups
to discard.

This happens pretty often, but is not that visible from userspace
because after some time the work will usually be kicked off anyway by
someone else calling btrfs_discard_schedule_work().

Fix it by continuing to reschedule if the block group discard lists are
not empty.
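
For illustration, a simplified sketch of the stalling flow before this
patch (not verbatim kernel code, details elided):

  /*
   * Pre-patch flow, simplified: an early wakeup makes peek_discard_list()
   * return NULL, and the workfn bails out without re-arming the delayed
   * work, so queued block groups sit idle until something else happens to
   * call btrfs_discard_schedule_work().
   */
  static void discard_workfn_before_fix(struct btrfs_discard_ctl *discard_ctl)
  {
          struct btrfs_block_group *block_group;
          enum btrfs_discard_state discard_state;
          int discard_index;

          block_group = peek_discard_list(discard_ctl, &discard_state,
                                          &discard_index);
          if (!block_group || !btrfs_run_discard_work(discard_ctl))
                  return; /* woke before discard_eligible_time: no reschedule */

          /* ... perform discarding ... */
  }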

Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
---
 fs/btrfs/discard.c | 17 ++++++++++-------
 1 file changed, 10 insertions(+), 7 deletions(-)

diff --git a/fs/btrfs/discard.c b/fs/btrfs/discard.c
index 1db966bf85b2..36431d7e1334 100644
--- a/fs/btrfs/discard.c
+++ b/fs/btrfs/discard.c
@@ -199,16 +199,15 @@ static struct btrfs_block_group *find_next_block_group(
 static struct btrfs_block_group *peek_discard_list(
 					struct btrfs_discard_ctl *discard_ctl,
 					enum btrfs_discard_state *discard_state,
-					int *discard_index)
+					int *discard_index, u64 now)
 {
 	struct btrfs_block_group *block_group;
-	const u64 now = ktime_get_ns();
 
 	spin_lock(&discard_ctl->lock);
 again:
 	block_group = find_next_block_group(discard_ctl, now);
 
-	if (block_group && now > block_group->discard_eligible_time) {
+	if (block_group && now >= block_group->discard_eligible_time) {
 		if (block_group->discard_index == BTRFS_DISCARD_INDEX_UNUSED &&
 		    block_group->used != 0) {
 			if (btrfs_is_block_group_data_only(block_group))
@@ -222,12 +221,11 @@ static struct btrfs_block_group *peek_discard_list(
 			block_group->discard_state = BTRFS_DISCARD_EXTENTS;
 		}
 		discard_ctl->block_group = block_group;
+	}
+	if (block_group) {
 		*discard_state = block_group->discard_state;
 		*discard_index = block_group->discard_index;
-	} else {
-		block_group = NULL;
 	}
-
 	spin_unlock(&discard_ctl->lock);
 
 	return block_group;
@@ -438,13 +436,18 @@ static void btrfs_discard_workfn(struct work_struct *work)
 	int discard_index = 0;
 	u64 trimmed = 0;
 	u64 minlen = 0;
+	u64 now = ktime_get_ns();
 
 	discard_ctl = container_of(work, struct btrfs_discard_ctl, work.work);
 
 	block_group = peek_discard_list(discard_ctl, &discard_state,
-					&discard_index);
+					&discard_index, now);
 	if (!block_group || !btrfs_run_discard_work(discard_ctl))
 		return;
+	if (now < block_group->discard_eligible_time) {
+		btrfs_discard_schedule_work(discard_ctl, false);
+		return;
+	}
 
 	/* Perform discarding */
 	minlen = discard_minlen[discard_index];
-- 
2.24.0



* [PATCH v3 2/3] btrfs: fix racy access to discard_ctl data
  2020-12-06 15:56 [PATCH v3 for-next 0/3] btrfs async discard fixes & improvements Pavel Begunkov
  2020-12-06 15:56 ` [PATCH v3 1/3] btrfs: fix async discard stall Pavel Begunkov
@ 2020-12-06 15:56 ` Pavel Begunkov
  2020-12-06 15:56 ` [PATCH v3 3/3] btrfs: don't overabuse discard lock Pavel Begunkov
  2020-12-14 17:37 ` [PATCH v3 for-next 0/3] btrfs async discard fixes & improvements David Sterba
  3 siblings, 0 replies; 5+ messages in thread
From: Pavel Begunkov @ 2020-12-06 15:56 UTC (permalink / raw)
  To: Chris Mason, Josef Bacik, David Sterba, linux-btrfs

Because only one discard worker may be running at any given point, it
could have been safe to modify ->prev_discard, etc. without
synchronisation, if not for the @override flag in
btrfs_discard_schedule_work() and delayed_work_pending() returning false
while the workfn is running.

That may lead to torn reads of u64 on some architectures, but that's
not a big problem as it only slightly affects the discard rate.
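
As a rough illustration of the torn read, a sketch assuming a 32-bit
architecture where a plain u64 store is not single-copy atomic (the names
are illustrative and only stand in for ->prev_discard and friends):

  /* stands in for discard_ctl->prev_discard, unsynchronised before this patch */
  static u64 shared_counter;

  static void writer(u64 trimmed)
  {
          /* may compile into two 32-bit stores on a 32-bit machine */
          shared_counter = trimmed;
  }

  static u64 reader(void)
  {
          /*
           * Can observe the high half of the new value together with the
           * low half of the old one, i.e. a torn value.
           */
          return shared_counter;
  }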

Suggested-by: Josef Bacik <josef@toxicpanda.com>
Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
---
 fs/btrfs/discard.c | 10 +++-------
 1 file changed, 3 insertions(+), 7 deletions(-)

diff --git a/fs/btrfs/discard.c b/fs/btrfs/discard.c
index 36431d7e1334..d641f451f840 100644
--- a/fs/btrfs/discard.c
+++ b/fs/btrfs/discard.c
@@ -477,13 +477,6 @@ static void btrfs_discard_workfn(struct work_struct *work)
 		discard_ctl->discard_extent_bytes += trimmed;
 	}
 
-	/*
-	 * Updated without locks as this is inside the workfn and nothing else
-	 * is reading the values
-	 */
-	discard_ctl->prev_discard = trimmed;
-	discard_ctl->prev_discard_time = ktime_get_ns();
-
 	/* Determine next steps for a block_group */
 	if (block_group->discard_cursor >= btrfs_block_group_end(block_group)) {
 		if (discard_state == BTRFS_DISCARD_BITMAPS) {
@@ -499,7 +492,10 @@ static void btrfs_discard_workfn(struct work_struct *work)
 		}
 	}
 
+	now = ktime_get_ns();
 	spin_lock(&discard_ctl->lock);
+	discard_ctl->prev_discard = trimmed;
+	discard_ctl->prev_discard_time = now;
 	discard_ctl->block_group = NULL;
 	spin_unlock(&discard_ctl->lock);
 
-- 
2.24.0



* [PATCH v3 3/3] btrfs: don't overabuse discard lock
  2020-12-06 15:56 [PATCH v3 for-next 0/3] btrfs async discard fixes & improvements Pavel Begunkov
  2020-12-06 15:56 ` [PATCH v3 1/3] btrfs: fix async discard stall Pavel Begunkov
  2020-12-06 15:56 ` [PATCH v3 2/3] btrfs: fix racy access to discard_ctl data Pavel Begunkov
@ 2020-12-06 15:56 ` Pavel Begunkov
  2020-12-14 17:37 ` [PATCH v3 for-next 0/3] btrfs async discard fixes & improvements David Sterba
  3 siblings, 0 replies; 5+ messages in thread
From: Pavel Begunkov @ 2020-12-06 15:56 UTC (permalink / raw)
  To: Chris Mason, Josef Bacik, David Sterba, linux-btrfs

btrfs_discard_workfn() drops discard_ctl->lock just to take it again a
moment later in btrfs_discard_schedule_work(). Avoid that, and also
reuse the already fetched ktime value.
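
For reference, a simplified sketch of the pre-patch pattern, assembled
from the old paths (not verbatim): the workfn releases the lock only for
btrfs_discard_schedule_work() to take the same lock again and re-read the
clock:

  spin_lock(&discard_ctl->lock);
  discard_ctl->block_group = NULL;
  spin_unlock(&discard_ctl->lock);

  /* re-takes discard_ctl->lock and calls ktime_get_ns() again */
  btrfs_discard_schedule_work(discard_ctl, false);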

Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
---
 fs/btrfs/discard.c | 43 +++++++++++++++++++++++--------------------
 1 file changed, 23 insertions(+), 20 deletions(-)

diff --git a/fs/btrfs/discard.c b/fs/btrfs/discard.c
index d641f451f840..9d6ab92391a3 100644
--- a/fs/btrfs/discard.c
+++ b/fs/btrfs/discard.c
@@ -328,28 +328,15 @@ void btrfs_discard_queue_work(struct btrfs_discard_ctl *discard_ctl,
 		btrfs_discard_schedule_work(discard_ctl, false);
 }
 
-/**
- * btrfs_discard_schedule_work - responsible for scheduling the discard work
- * @discard_ctl: discard control
- * @override: override the current timer
- *
- * Discards are issued by a delayed workqueue item.  @override is used to
- * update the current delay as the baseline delay interval is reevaluated on
- * transaction commit.  This is also maxed with any other rate limit.
- */
-void btrfs_discard_schedule_work(struct btrfs_discard_ctl *discard_ctl,
-				 bool override)
+static void __btrfs_discard_schedule_work(struct btrfs_discard_ctl *discard_ctl,
+					  u64 now, bool override)
 {
 	struct btrfs_block_group *block_group;
-	const u64 now = ktime_get_ns();
-
-	spin_lock(&discard_ctl->lock);
 
 	if (!btrfs_run_discard_work(discard_ctl))
-		goto out;
-
+		return;
 	if (!override && delayed_work_pending(&discard_ctl->work))
-		goto out;
+		return;
 
 	block_group = find_next_block_group(discard_ctl, now);
 	if (block_group) {
@@ -391,7 +378,24 @@ void btrfs_discard_schedule_work(struct btrfs_discard_ctl *discard_ctl,
 		mod_delayed_work(discard_ctl->discard_workers,
 				 &discard_ctl->work, nsecs_to_jiffies(delay));
 	}
-out:
+}
+
+/**
+ * btrfs_discard_schedule_work - responsible for scheduling the discard work
+ * @discard_ctl: discard control
+ * @override: override the current timer
+ *
+ * Discards are issued by a delayed workqueue item.  @override is used to
+ * update the current delay as the baseline delay interval is reevaluated on
+ * transaction commit.  This is also maxed with any other rate limit.
+ */
+void btrfs_discard_schedule_work(struct btrfs_discard_ctl *discard_ctl,
+				 bool override)
+{
+	u64 now = ktime_get_ns();
+
+	spin_lock(&discard_ctl->lock);
+	__btrfs_discard_schedule_work(discard_ctl, now, override);
 	spin_unlock(&discard_ctl->lock);
 }
 
@@ -497,9 +501,8 @@ static void btrfs_discard_workfn(struct work_struct *work)
 	discard_ctl->prev_discard = trimmed;
 	discard_ctl->prev_discard_time = now;
 	discard_ctl->block_group = NULL;
+	__btrfs_discard_schedule_work(discard_ctl, now, false);
 	spin_unlock(&discard_ctl->lock);
-
-	btrfs_discard_schedule_work(discard_ctl, false);
 }
 
 /**
-- 
2.24.0



* Re: [PATCH v3 for-next 0/3] btrfs async discard fixes & improvements
  2020-12-06 15:56 [PATCH v3 for-next 0/3] btrfs async discard fixes & improvements Pavel Begunkov
                   ` (2 preceding siblings ...)
  2020-12-06 15:56 ` [PATCH v3 3/3] btrfs: don't overabuse discard lock Pavel Begunkov
@ 2020-12-14 17:37 ` David Sterba
  3 siblings, 0 replies; 5+ messages in thread
From: David Sterba @ 2020-12-14 17:37 UTC (permalink / raw)
  To: Pavel Begunkov; +Cc: Chris Mason, Josef Bacik, David Sterba, linux-btrfs

On Sun, Dec 06, 2020 at 03:56:19PM +0000, Pavel Begunkov wrote:
> Fix async discard stalls with the first patch, and address other minor
> things.
> 
> v2: fix async discard stalls, see patch [1/3]
> v3: if now == bg->discard_eligible_time, the discard state and index
>     fail to be initialised. Always initialise them if peek returns a
>     non-NULL bg, which is more resilient.
> 
> Pavel Begunkov (3):
>   btrfs: fix async discard stall
>   btrfs: fix racy access to discard_ctl data
>   btrfs: don't overabuse discard lock

Thanks.  I'll add the patchset to for-next and will send it with a
post-rc1 pull request.

