All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH 0/4] fixes for btrfs async discards
@ 2020-11-04  9:45 Pavel Begunkov
  2020-11-04  9:45 ` [PATCH 1/4] btrfs: discard: speed up discard up to iops_limit Pavel Begunkov
                   ` (4 more replies)
  0 siblings, 5 replies; 24+ messages in thread
From: Pavel Begunkov @ 2020-11-04  9:45 UTC (permalink / raw)
  To: Chris Mason, Josef Bacik, David Sterba, linux-btrfs

Several fixes for async discards. The first patch might increase discard
rate, drastically in some cases. That may be a suprise for those
assuming that hitting iops_limit is rare and rarther outliers. Though,
it still stays in allowed range, so should be fine.

Pavel Begunkov (4):
  btrfs: discard: speed up discard up to iops_limit
  btrfs: discard: save discard delay as ns not jiffy
  btrfs: don't miss discards after override-schedule
  btrfs: discard: reschedule work after param update

 fs/btrfs/ctree.h   |  3 ++-
 fs/btrfs/discard.c | 35 +++++++++++++++++++++++------------
 fs/btrfs/sysfs.c   |  5 +++--
 3 files changed, 28 insertions(+), 15 deletions(-)

-- 
2.24.0


^ permalink raw reply	[flat|nested] 24+ messages in thread

* [PATCH 1/4] btrfs: discard: speed up discard up to iops_limit
  2020-11-04  9:45 [PATCH 0/4] fixes for btrfs async discards Pavel Begunkov
@ 2020-11-04  9:45 ` Pavel Begunkov
  2020-11-04 15:29   ` Amy Parker
  2020-11-04 20:52   ` Josef Bacik
  2020-11-04  9:45 ` [PATCH 2/4] btrfs: discard: save discard delay as ns not jiffy Pavel Begunkov
                   ` (3 subsequent siblings)
  4 siblings, 2 replies; 24+ messages in thread
From: Pavel Begunkov @ 2020-11-04  9:45 UTC (permalink / raw)
  To: Chris Mason, Josef Bacik, David Sterba, linux-btrfs

Instead of using iops_limit only for cutting off extremes, calculate the
discard delay directly from it, so it closely follows iops_limit and
doesn't under-discarding even though quotas are not saturated.

Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
---
 fs/btrfs/discard.c | 10 +++++-----
 1 file changed, 5 insertions(+), 5 deletions(-)

diff --git a/fs/btrfs/discard.c b/fs/btrfs/discard.c
index 741c7e19c32f..76796a90e88d 100644
--- a/fs/btrfs/discard.c
+++ b/fs/btrfs/discard.c
@@ -519,7 +519,6 @@ void btrfs_discard_calc_delay(struct btrfs_discard_ctl *discard_ctl)
 	s64 discardable_bytes;
 	u32 iops_limit;
 	unsigned long delay;
-	unsigned long lower_limit = BTRFS_DISCARD_MIN_DELAY_MSEC;
 
 	discardable_extents = atomic_read(&discard_ctl->discardable_extents);
 	if (!discardable_extents)
@@ -550,11 +549,12 @@ void btrfs_discard_calc_delay(struct btrfs_discard_ctl *discard_ctl)
 
 	iops_limit = READ_ONCE(discard_ctl->iops_limit);
 	if (iops_limit)
-		lower_limit = max_t(unsigned long, lower_limit,
-				    MSEC_PER_SEC / iops_limit);
+		delay = MSEC_PER_SEC / iops_limit;
+	else
+		delay = BTRFS_DISCARD_TARGET_MSEC / discardable_extents;
 
-	delay = BTRFS_DISCARD_TARGET_MSEC / discardable_extents;
-	delay = clamp(delay, lower_limit, BTRFS_DISCARD_MAX_DELAY_MSEC);
+	delay = clamp(delay, BTRFS_DISCARD_MIN_DELAY_MSEC,
+		      BTRFS_DISCARD_MAX_DELAY_MSEC);
 	discard_ctl->delay = msecs_to_jiffies(delay);
 
 	spin_unlock(&discard_ctl->lock);
-- 
2.24.0


^ permalink raw reply related	[flat|nested] 24+ messages in thread

* [PATCH 2/4] btrfs: discard: save discard delay as ns not jiffy
  2020-11-04  9:45 [PATCH 0/4] fixes for btrfs async discards Pavel Begunkov
  2020-11-04  9:45 ` [PATCH 1/4] btrfs: discard: speed up discard up to iops_limit Pavel Begunkov
@ 2020-11-04  9:45 ` Pavel Begunkov
  2020-11-04 15:35   ` Amy Parker
  2020-11-04 20:54   ` Josef Bacik
  2020-11-04  9:45 ` [PATCH 3/4] btrfs: don't miss discards after override-schedule Pavel Begunkov
                   ` (2 subsequent siblings)
  4 siblings, 2 replies; 24+ messages in thread
From: Pavel Begunkov @ 2020-11-04  9:45 UTC (permalink / raw)
  To: Chris Mason, Josef Bacik, David Sterba, linux-btrfs

Most of calculations are done in ns or ms, so store discard_ctl->delay
in ms and convert the final delay to jiffies only in the end.

Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
---
 fs/btrfs/ctree.h   |  2 +-
 fs/btrfs/discard.c | 14 +++++++-------
 2 files changed, 8 insertions(+), 8 deletions(-)

diff --git a/fs/btrfs/ctree.h b/fs/btrfs/ctree.h
index aac3d6f4e35b..d43a82dcdfc0 100644
--- a/fs/btrfs/ctree.h
+++ b/fs/btrfs/ctree.h
@@ -472,7 +472,7 @@ struct btrfs_discard_ctl {
 	atomic_t discardable_extents;
 	atomic64_t discardable_bytes;
 	u64 max_discard_size;
-	unsigned long delay;
+	u64 delay_ms;
 	u32 iops_limit;
 	u32 kbps_limit;
 	u64 discard_extent_bytes;
diff --git a/fs/btrfs/discard.c b/fs/btrfs/discard.c
index 76796a90e88d..b6c68e5711f0 100644
--- a/fs/btrfs/discard.c
+++ b/fs/btrfs/discard.c
@@ -355,7 +355,7 @@ void btrfs_discard_schedule_work(struct btrfs_discard_ctl *discard_ctl,
 
 	block_group = find_next_block_group(discard_ctl, now);
 	if (block_group) {
-		unsigned long delay = discard_ctl->delay;
+		u64 delay = discard_ctl->delay_ms * NSEC_PER_MSEC;
 		u32 kbps_limit = READ_ONCE(discard_ctl->kbps_limit);
 
 		/*
@@ -366,9 +366,9 @@ void btrfs_discard_schedule_work(struct btrfs_discard_ctl *discard_ctl,
 		if (kbps_limit && discard_ctl->prev_discard) {
 			u64 bps_limit = ((u64)kbps_limit) * SZ_1K;
 			u64 bps_delay = div64_u64(discard_ctl->prev_discard *
-						  MSEC_PER_SEC, bps_limit);
+						  NSEC_PER_SEC, bps_limit);
 
-			delay = max(delay, msecs_to_jiffies(bps_delay));
+			delay = max(delay, bps_delay);
 		}
 
 		/*
@@ -378,11 +378,11 @@ void btrfs_discard_schedule_work(struct btrfs_discard_ctl *discard_ctl,
 		if (now < block_group->discard_eligible_time) {
 			u64 bg_timeout = block_group->discard_eligible_time - now;
 
-			delay = max(delay, nsecs_to_jiffies(bg_timeout));
+			delay = max(delay, bg_timeout);
 		}
 
 		mod_delayed_work(discard_ctl->discard_workers,
-				 &discard_ctl->work, delay);
+				 &discard_ctl->work, nsecs_to_jiffies(delay));
 	}
 out:
 	spin_unlock(&discard_ctl->lock);
@@ -555,7 +555,7 @@ void btrfs_discard_calc_delay(struct btrfs_discard_ctl *discard_ctl)
 
 	delay = clamp(delay, BTRFS_DISCARD_MIN_DELAY_MSEC,
 		      BTRFS_DISCARD_MAX_DELAY_MSEC);
-	discard_ctl->delay = msecs_to_jiffies(delay);
+	discard_ctl->delay_ms = delay;
 
 	spin_unlock(&discard_ctl->lock);
 }
@@ -687,7 +687,7 @@ void btrfs_discard_init(struct btrfs_fs_info *fs_info)
 	atomic_set(&discard_ctl->discardable_extents, 0);
 	atomic64_set(&discard_ctl->discardable_bytes, 0);
 	discard_ctl->max_discard_size = BTRFS_ASYNC_DISCARD_DEFAULT_MAX_SIZE;
-	discard_ctl->delay = BTRFS_DISCARD_MAX_DELAY_MSEC;
+	discard_ctl->delay_ms = BTRFS_DISCARD_MAX_DELAY_MSEC;
 	discard_ctl->iops_limit = BTRFS_DISCARD_MAX_IOPS;
 	discard_ctl->kbps_limit = 0;
 	discard_ctl->discard_extent_bytes = 0;
-- 
2.24.0


^ permalink raw reply related	[flat|nested] 24+ messages in thread

* [PATCH 3/4] btrfs: don't miss discards after override-schedule
  2020-11-04  9:45 [PATCH 0/4] fixes for btrfs async discards Pavel Begunkov
  2020-11-04  9:45 ` [PATCH 1/4] btrfs: discard: speed up discard up to iops_limit Pavel Begunkov
  2020-11-04  9:45 ` [PATCH 2/4] btrfs: discard: save discard delay as ns not jiffy Pavel Begunkov
@ 2020-11-04  9:45 ` Pavel Begunkov
  2020-11-04 20:59   ` Josef Bacik
  2020-11-04  9:45 ` [PATCH 4/4] btrfs: discard: reschedule work after param update Pavel Begunkov
  2020-11-05 22:23 ` [PATCH 0/4] fixes for btrfs async discards David Sterba
  4 siblings, 1 reply; 24+ messages in thread
From: Pavel Begunkov @ 2020-11-04  9:45 UTC (permalink / raw)
  To: Chris Mason, Josef Bacik, David Sterba, linux-btrfs

If btrfs_discard_schedule_work() is called with override=true, it sets
delay anew regardless how much time left until the timer should have
fired. If delays are long (that can happen, for example, with low
kbps_limit), they might be constantly overriden without having a chance
to run the discard work.

Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
---
 fs/btrfs/ctree.h   |  1 +
 fs/btrfs/discard.c | 11 +++++++++++
 2 files changed, 12 insertions(+)

diff --git a/fs/btrfs/ctree.h b/fs/btrfs/ctree.h
index d43a82dcdfc0..ad71c8c769de 100644
--- a/fs/btrfs/ctree.h
+++ b/fs/btrfs/ctree.h
@@ -469,6 +469,7 @@ struct btrfs_discard_ctl {
 	struct btrfs_block_group *block_group;
 	struct list_head discard_list[BTRFS_NR_DISCARD_LISTS];
 	u64 prev_discard;
+	u64 prev_discard_time;
 	atomic_t discardable_extents;
 	atomic64_t discardable_bytes;
 	u64 max_discard_size;
diff --git a/fs/btrfs/discard.c b/fs/btrfs/discard.c
index b6c68e5711f0..c9018b9ccf99 100644
--- a/fs/btrfs/discard.c
+++ b/fs/btrfs/discard.c
@@ -381,6 +381,15 @@ void btrfs_discard_schedule_work(struct btrfs_discard_ctl *discard_ctl,
 			delay = max(delay, bg_timeout);
 		}
 
+		if (override && discard_ctl->prev_discard) {
+			u64 elapsed = now - discard_ctl->prev_discard_time;
+
+			if (delay > elapsed)
+				delay -= elapsed;
+			else
+				delay = 0;
+		}
+
 		mod_delayed_work(discard_ctl->discard_workers,
 				 &discard_ctl->work, nsecs_to_jiffies(delay));
 	}
@@ -466,6 +475,7 @@ static void btrfs_discard_workfn(struct work_struct *work)
 	}
 
 	discard_ctl->prev_discard = trimmed;
+	discard_ctl->prev_discard_time = ktime_get_ns();
 
 	/* Determine next steps for a block_group */
 	if (block_group->discard_cursor >= btrfs_block_group_end(block_group)) {
@@ -684,6 +694,7 @@ void btrfs_discard_init(struct btrfs_fs_info *fs_info)
 		INIT_LIST_HEAD(&discard_ctl->discard_list[i]);
 
 	discard_ctl->prev_discard = 0;
+	discard_ctl->prev_discard_time = 0;
 	atomic_set(&discard_ctl->discardable_extents, 0);
 	atomic64_set(&discard_ctl->discardable_bytes, 0);
 	discard_ctl->max_discard_size = BTRFS_ASYNC_DISCARD_DEFAULT_MAX_SIZE;
-- 
2.24.0


^ permalink raw reply related	[flat|nested] 24+ messages in thread

* [PATCH 4/4] btrfs: discard: reschedule work after param update
  2020-11-04  9:45 [PATCH 0/4] fixes for btrfs async discards Pavel Begunkov
                   ` (2 preceding siblings ...)
  2020-11-04  9:45 ` [PATCH 3/4] btrfs: don't miss discards after override-schedule Pavel Begunkov
@ 2020-11-04  9:45 ` Pavel Begunkov
  2020-11-04 21:00   ` Josef Bacik
  2020-11-05 22:23 ` [PATCH 0/4] fixes for btrfs async discards David Sterba
  4 siblings, 1 reply; 24+ messages in thread
From: Pavel Begunkov @ 2020-11-04  9:45 UTC (permalink / raw)
  To: Chris Mason, Josef Bacik, David Sterba, linux-btrfs

After sysfs updates discard's iops_limit or kbps_limit it also needs to
adjust current timer through rescheduling, otherwise the discard work
may wait for a long time for the previous timer to expier or bumped by
someone else.

Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
---
 fs/btrfs/sysfs.c | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/fs/btrfs/sysfs.c b/fs/btrfs/sysfs.c
index 279d9262b676..65410d3939f2 100644
--- a/fs/btrfs/sysfs.c
+++ b/fs/btrfs/sysfs.c
@@ -433,7 +433,8 @@ static ssize_t btrfs_discard_iops_limit_store(struct kobject *kobj,
 		return -EINVAL;
 
 	WRITE_ONCE(discard_ctl->iops_limit, iops_limit);
-
+	btrfs_discard_calc_delay(discard_ctl);
+	btrfs_discard_schedule_work(discard_ctl, true);
 	return len;
 }
 BTRFS_ATTR_RW(discard, iops_limit, btrfs_discard_iops_limit_show,
@@ -463,7 +464,7 @@ static ssize_t btrfs_discard_kbps_limit_store(struct kobject *kobj,
 		return -EINVAL;
 
 	WRITE_ONCE(discard_ctl->kbps_limit, kbps_limit);
-
+	btrfs_discard_schedule_work(discard_ctl, true);
 	return len;
 }
 BTRFS_ATTR_RW(discard, kbps_limit, btrfs_discard_kbps_limit_show,
-- 
2.24.0


^ permalink raw reply related	[flat|nested] 24+ messages in thread

* Re: [PATCH 1/4] btrfs: discard: speed up discard up to iops_limit
  2020-11-04  9:45 ` [PATCH 1/4] btrfs: discard: speed up discard up to iops_limit Pavel Begunkov
@ 2020-11-04 15:29   ` Amy Parker
  2020-11-04 17:19     ` Pavel Begunkov
  2020-11-04 20:52   ` Josef Bacik
  1 sibling, 1 reply; 24+ messages in thread
From: Amy Parker @ 2020-11-04 15:29 UTC (permalink / raw)
  To: Pavel Begunkov; +Cc: Chris Mason, Josef Bacik, David Sterba, Btrfs BTRFS

On Wed, Nov 4, 2020 at 1:50 AM Pavel Begunkov <asml.silence@gmail.com> wrote:
>
> Instead of using iops_limit only for cutting off extremes, calculate the
> discard delay directly from it, so it closely follows iops_limit and
> doesn't under-discarding even though quotas are not saturated.

This sounds like it potentially be a great performance boost, do you
have any performance metrics regarding this patch?

>
> Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
> ---
>  fs/btrfs/discard.c | 10 +++++-----
>  1 file changed, 5 insertions(+), 5 deletions(-)
>
> diff --git a/fs/btrfs/discard.c b/fs/btrfs/discard.c
> index 741c7e19c32f..76796a90e88d 100644
> --- a/fs/btrfs/discard.c
> +++ b/fs/btrfs/discard.c
> @@ -519,7 +519,6 @@ void btrfs_discard_calc_delay(struct btrfs_discard_ctl *discard_ctl)
>         s64 discardable_bytes;
>         u32 iops_limit;
>         unsigned long delay;
> -       unsigned long lower_limit = BTRFS_DISCARD_MIN_DELAY_MSEC;
>
>         discardable_extents = atomic_read(&discard_ctl->discardable_extents);
>         if (!discardable_extents)
> @@ -550,11 +549,12 @@ void btrfs_discard_calc_delay(struct btrfs_discard_ctl *discard_ctl)
>
>         iops_limit = READ_ONCE(discard_ctl->iops_limit);
>         if (iops_limit)
> -               lower_limit = max_t(unsigned long, lower_limit,
> -                                   MSEC_PER_SEC / iops_limit);
> +               delay = MSEC_PER_SEC / iops_limit;
> +       else
> +               delay = BTRFS_DISCARD_TARGET_MSEC / discardable_extents;

Looks good to me. I wonder why there wasn't handling of if iops_limit
was unfindable
before?

>
> -       delay = BTRFS_DISCARD_TARGET_MSEC / discardable_extents;
> -       delay = clamp(delay, lower_limit, BTRFS_DISCARD_MAX_DELAY_MSEC);
> +       delay = clamp(delay, BTRFS_DISCARD_MIN_DELAY_MSEC,
> +                     BTRFS_DISCARD_MAX_DELAY_MSEC);
>         discard_ctl->delay = msecs_to_jiffies(delay);
>
>         spin_unlock(&discard_ctl->lock);
> --
> 2.24.0
>

This patch looks all great to me.

Best regards,
Amy Parker
(they/them)

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH 2/4] btrfs: discard: save discard delay as ns not jiffy
  2020-11-04  9:45 ` [PATCH 2/4] btrfs: discard: save discard delay as ns not jiffy Pavel Begunkov
@ 2020-11-04 15:35   ` Amy Parker
  2020-11-04 15:48     ` Pavel Begunkov
  2020-11-04 20:54   ` Josef Bacik
  1 sibling, 1 reply; 24+ messages in thread
From: Amy Parker @ 2020-11-04 15:35 UTC (permalink / raw)
  To: Pavel Begunkov; +Cc: Chris Mason, Josef Bacik, David Sterba, Btrfs BTRFS

On Wed, Nov 4, 2020 at 1:52 AM Pavel Begunkov <asml.silence@gmail.com> wrote:
>
> Most of calculations are done in ns or ms, so store discard_ctl->delay
> in ms and convert the final delay to jiffies only in the end.

Great idea.

>
> Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
> ---
>  fs/btrfs/ctree.h   |  2 +-
>  fs/btrfs/discard.c | 14 +++++++-------
>  2 files changed, 8 insertions(+), 8 deletions(-)
>
> diff --git a/fs/btrfs/ctree.h b/fs/btrfs/ctree.h
> index aac3d6f4e35b..d43a82dcdfc0 100644
> --- a/fs/btrfs/ctree.h
> +++ b/fs/btrfs/ctree.h
> @@ -472,7 +472,7 @@ struct btrfs_discard_ctl {
>         atomic_t discardable_extents;
>         atomic64_t discardable_bytes;
>         u64 max_discard_size;
> -       unsigned long delay;
> +       u64 delay_ms;

Thanks for converting this from the ambiguous unsigned long to the
more specific u64.

>         u32 iops_limit;
>         u32 kbps_limit;
>         u64 discard_extent_bytes;
> diff --git a/fs/btrfs/discard.c b/fs/btrfs/discard.c
> index 76796a90e88d..b6c68e5711f0 100644
> --- a/fs/btrfs/discard.c
> +++ b/fs/btrfs/discard.c
> @@ -355,7 +355,7 @@ void btrfs_discard_schedule_work(struct btrfs_discard_ctl *discard_ctl,
>
>         block_group = find_next_block_group(discard_ctl, now);
>         if (block_group) {
> -               unsigned long delay = discard_ctl->delay;
> +               u64 delay = discard_ctl->delay_ms * NSEC_PER_MSEC;

I worry about a potential performance impact with this, but it should be
minimal at most.

>                 u32 kbps_limit = READ_ONCE(discard_ctl->kbps_limit);
>
>                 /*
> @@ -366,9 +366,9 @@ void btrfs_discard_schedule_work(struct btrfs_discard_ctl *discard_ctl,
>                 if (kbps_limit && discard_ctl->prev_discard) {
>                         u64 bps_limit = ((u64)kbps_limit) * SZ_1K;
>                         u64 bps_delay = div64_u64(discard_ctl->prev_discard *
> -                                                 MSEC_PER_SEC, bps_limit);
> +                                                 NSEC_PER_SEC, bps_limit);
>
> -                       delay = max(delay, msecs_to_jiffies(bps_delay));
> +                       delay = max(delay, bps_delay);

Great that we got this down to just passing max() a value. Same thing on
the instance below.

>                 }
>
>                 /*
> @@ -378,11 +378,11 @@ void btrfs_discard_schedule_work(struct btrfs_discard_ctl *discard_ctl,
>                 if (now < block_group->discard_eligible_time) {
>                         u64 bg_timeout = block_group->discard_eligible_time - now;
>
> -                       delay = max(delay, nsecs_to_jiffies(bg_timeout));
> +                       delay = max(delay, bg_timeout);
>                 }
>
>                 mod_delayed_work(discard_ctl->discard_workers,
> -                                &discard_ctl->work, delay);
> +                                &discard_ctl->work, nsecs_to_jiffies(delay));
>         }
>  out:
>         spin_unlock(&discard_ctl->lock);
> @@ -555,7 +555,7 @@ void btrfs_discard_calc_delay(struct btrfs_discard_ctl *discard_ctl)
>
>         delay = clamp(delay, BTRFS_DISCARD_MIN_DELAY_MSEC,
>                       BTRFS_DISCARD_MAX_DELAY_MSEC);
> -       discard_ctl->delay = msecs_to_jiffies(delay);
> +       discard_ctl->delay_ms = delay;
>
>         spin_unlock(&discard_ctl->lock);
>  }
> @@ -687,7 +687,7 @@ void btrfs_discard_init(struct btrfs_fs_info *fs_info)
>         atomic_set(&discard_ctl->discardable_extents, 0);
>         atomic64_set(&discard_ctl->discardable_bytes, 0);
>         discard_ctl->max_discard_size = BTRFS_ASYNC_DISCARD_DEFAULT_MAX_SIZE;
> -       discard_ctl->delay = BTRFS_DISCARD_MAX_DELAY_MSEC;
> +       discard_ctl->delay_ms = BTRFS_DISCARD_MAX_DELAY_MSEC;
>         discard_ctl->iops_limit = BTRFS_DISCARD_MAX_IOPS;
>         discard_ctl->kbps_limit = 0;
>         discard_ctl->discard_extent_bytes = 0;
> --
> 2.24.0
>

Looks all fine to me.

Best regards,
Amy Parker
(they/them)

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH 2/4] btrfs: discard: save discard delay as ns not jiffy
  2020-11-04 15:35   ` Amy Parker
@ 2020-11-04 15:48     ` Pavel Begunkov
  2020-11-04 16:46       ` Amy Parker
  0 siblings, 1 reply; 24+ messages in thread
From: Pavel Begunkov @ 2020-11-04 15:48 UTC (permalink / raw)
  To: Amy Parker; +Cc: Chris Mason, Josef Bacik, David Sterba, Btrfs BTRFS

On 04/11/2020 15:35, Amy Parker wrote:
> On Wed, Nov 4, 2020 at 1:52 AM Pavel Begunkov <asml.silence@gmail.com> wrote:
>>
>> Most of calculations are done in ns or ms, so store discard_ctl->delay
>> in ms and convert the final delay to jiffies only in the end.
> 
> Great idea.
> 
>>
>> Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
>> ---
>>  fs/btrfs/ctree.h   |  2 +-
>>  fs/btrfs/discard.c | 14 +++++++-------
>>  2 files changed, 8 insertions(+), 8 deletions(-)
>>
>> diff --git a/fs/btrfs/ctree.h b/fs/btrfs/ctree.h
>> index aac3d6f4e35b..d43a82dcdfc0 100644
>> --- a/fs/btrfs/ctree.h
>> +++ b/fs/btrfs/ctree.h
>> @@ -472,7 +472,7 @@ struct btrfs_discard_ctl {
>>         atomic_t discardable_extents;
>>         atomic64_t discardable_bytes;
>>         u64 max_discard_size;
>> -       unsigned long delay;
>> +       u64 delay_ms;
> 
> Thanks for converting this from the ambiguous unsigned long to the
> more specific u64.
> 
>>         u32 iops_limit;
>>         u32 kbps_limit;
>>         u64 discard_extent_bytes;
>> diff --git a/fs/btrfs/discard.c b/fs/btrfs/discard.c
>> index 76796a90e88d..b6c68e5711f0 100644
>> --- a/fs/btrfs/discard.c
>> +++ b/fs/btrfs/discard.c
>> @@ -355,7 +355,7 @@ void btrfs_discard_schedule_work(struct btrfs_discard_ctl *discard_ctl,
>>
>>         block_group = find_next_block_group(discard_ctl, now);
>>         if (block_group) {
>> -               unsigned long delay = discard_ctl->delay;
>> +               u64 delay = discard_ctl->delay_ms * NSEC_PER_MSEC;
> 
> I worry about a potential performance impact with this, but it should be
> minimal at most.

That's nothing, nsecs_to_jiffies() in the end is heavier, but this is not
in a hot path and by all means it's heavily amortised by actual discarding.

> 
>>                 u32 kbps_limit = READ_ONCE(discard_ctl->kbps_limit);
>>
>>                 /*
>> @@ -366,9 +366,9 @@ void btrfs_discard_schedule_work(struct btrfs_discard_ctl *discard_ctl,
>>                 if (kbps_limit && discard_ctl->prev_discard) {
>>                         u64 bps_limit = ((u64)kbps_limit) * SZ_1K;
>>                         u64 bps_delay = div64_u64(discard_ctl->prev_discard *
>> -                                                 MSEC_PER_SEC, bps_limit);
>> +                                                 NSEC_PER_SEC, bps_limit);
>>
>> -                       delay = max(delay, msecs_to_jiffies(bps_delay));
>> +                       delay = max(delay, bps_delay);
> 
> Great that we got this down to just passing max() a value. Same thing on
> the instance below.
> 
>>                 }
>>
>>                 /*
>> @@ -378,11 +378,11 @@ void btrfs_discard_schedule_work(struct btrfs_discard_ctl *discard_ctl,
>>                 if (now < block_group->discard_eligible_time) {
>>                         u64 bg_timeout = block_group->discard_eligible_time - now;
>>
>> -                       delay = max(delay, nsecs_to_jiffies(bg_timeout));
>> +                       delay = max(delay, bg_timeout);
>>                 }
>>
>>                 mod_delayed_work(discard_ctl->discard_workers,
>> -                                &discard_ctl->work, delay);
>> +                                &discard_ctl->work, nsecs_to_jiffies(delay));
>>         }
>>  out:
>>         spin_unlock(&discard_ctl->lock);
>> @@ -555,7 +555,7 @@ void btrfs_discard_calc_delay(struct btrfs_discard_ctl *discard_ctl)
>>
>>         delay = clamp(delay, BTRFS_DISCARD_MIN_DELAY_MSEC,
>>                       BTRFS_DISCARD_MAX_DELAY_MSEC);
>> -       discard_ctl->delay = msecs_to_jiffies(delay);
>> +       discard_ctl->delay_ms = delay;
>>
>>         spin_unlock(&discard_ctl->lock);
>>  }
>> @@ -687,7 +687,7 @@ void btrfs_discard_init(struct btrfs_fs_info *fs_info)
>>         atomic_set(&discard_ctl->discardable_extents, 0);
>>         atomic64_set(&discard_ctl->discardable_bytes, 0);
>>         discard_ctl->max_discard_size = BTRFS_ASYNC_DISCARD_DEFAULT_MAX_SIZE;
>> -       discard_ctl->delay = BTRFS_DISCARD_MAX_DELAY_MSEC;
>> +       discard_ctl->delay_ms = BTRFS_DISCARD_MAX_DELAY_MSEC;
>>         discard_ctl->iops_limit = BTRFS_DISCARD_MAX_IOPS;
>>         discard_ctl->kbps_limit = 0;
>>         discard_ctl->discard_extent_bytes = 0;
>> --
>> 2.24.0
>>
> 
> Looks all fine to me.


-- 
Pavel Begunkov

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH 2/4] btrfs: discard: save discard delay as ns not jiffy
  2020-11-04 15:48     ` Pavel Begunkov
@ 2020-11-04 16:46       ` Amy Parker
  0 siblings, 0 replies; 24+ messages in thread
From: Amy Parker @ 2020-11-04 16:46 UTC (permalink / raw)
  To: Pavel Begunkov; +Cc: Chris Mason, Josef Bacik, David Sterba, Btrfs BTRFS

On Wed, Nov 4, 2020 at 7:51 AM Pavel Begunkov <asml.silence@gmail.com> wrote:
>
> On 04/11/2020 15:35, Amy Parker wrote:
> > On Wed, Nov 4, 2020 at 1:52 AM Pavel Begunkov <asml.silence@gmail.com> wrote:
> >>
> >> Most of calculations are done in ns or ms, so store discard_ctl->delay
> >> in ms and convert the final delay to jiffies only in the end.
> >
> > Great idea.
> >
> >>
> >> Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
> >> ---
> >>  fs/btrfs/ctree.h   |  2 +-
> >>  fs/btrfs/discard.c | 14 +++++++-------
> >>  2 files changed, 8 insertions(+), 8 deletions(-)
> >>
> >> diff --git a/fs/btrfs/ctree.h b/fs/btrfs/ctree.h
> >> index aac3d6f4e35b..d43a82dcdfc0 100644
> >> --- a/fs/btrfs/ctree.h
> >> +++ b/fs/btrfs/ctree.h
> >> @@ -472,7 +472,7 @@ struct btrfs_discard_ctl {
> >>         atomic_t discardable_extents;
> >>         atomic64_t discardable_bytes;
> >>         u64 max_discard_size;
> >> -       unsigned long delay;
> >> +       u64 delay_ms;
> >
> > Thanks for converting this from the ambiguous unsigned long to the
> > more specific u64.
> >
> >>         u32 iops_limit;
> >>         u32 kbps_limit;
> >>         u64 discard_extent_bytes;
> >> diff --git a/fs/btrfs/discard.c b/fs/btrfs/discard.c
> >> index 76796a90e88d..b6c68e5711f0 100644
> >> --- a/fs/btrfs/discard.c
> >> +++ b/fs/btrfs/discard.c
> >> @@ -355,7 +355,7 @@ void btrfs_discard_schedule_work(struct btrfs_discard_ctl *discard_ctl,
> >>
> >>         block_group = find_next_block_group(discard_ctl, now);
> >>         if (block_group) {
> >> -               unsigned long delay = discard_ctl->delay;
> >> +               u64 delay = discard_ctl->delay_ms * NSEC_PER_MSEC;
> >
> > I worry about a potential performance impact with this, but it should be
> > minimal at most.
>
> That's nothing, nsecs_to_jiffies() in the end is heavier, but this is not
> in a hot path and by all means it's heavily amortised by actual discarding.

Alright, sounds good.

>
> >
> >>                 u32 kbps_limit = READ_ONCE(discard_ctl->kbps_limit);
> >>
> >>                 /*
> >> @@ -366,9 +366,9 @@ void btrfs_discard_schedule_work(struct btrfs_discard_ctl *discard_ctl,
> >>                 if (kbps_limit && discard_ctl->prev_discard) {
> >>                         u64 bps_limit = ((u64)kbps_limit) * SZ_1K;
> >>                         u64 bps_delay = div64_u64(discard_ctl->prev_discard *
> >> -                                                 MSEC_PER_SEC, bps_limit);
> >> +                                                 NSEC_PER_SEC, bps_limit);
> >>
> >> -                       delay = max(delay, msecs_to_jiffies(bps_delay));
> >> +                       delay = max(delay, bps_delay);
> >
> > Great that we got this down to just passing max() a value. Same thing on
> > the instance below.
> >
> >>                 }
> >>
> >>                 /*
> >> @@ -378,11 +378,11 @@ void btrfs_discard_schedule_work(struct btrfs_discard_ctl *discard_ctl,
> >>                 if (now < block_group->discard_eligible_time) {
> >>                         u64 bg_timeout = block_group->discard_eligible_time - now;
> >>
> >> -                       delay = max(delay, nsecs_to_jiffies(bg_timeout));
> >> +                       delay = max(delay, bg_timeout);
> >>                 }
> >>
> >>                 mod_delayed_work(discard_ctl->discard_workers,
> >> -                                &discard_ctl->work, delay);
> >> +                                &discard_ctl->work, nsecs_to_jiffies(delay));
> >>         }
> >>  out:
> >>         spin_unlock(&discard_ctl->lock);
> >> @@ -555,7 +555,7 @@ void btrfs_discard_calc_delay(struct btrfs_discard_ctl *discard_ctl)
> >>
> >>         delay = clamp(delay, BTRFS_DISCARD_MIN_DELAY_MSEC,
> >>                       BTRFS_DISCARD_MAX_DELAY_MSEC);
> >> -       discard_ctl->delay = msecs_to_jiffies(delay);
> >> +       discard_ctl->delay_ms = delay;
> >>
> >>         spin_unlock(&discard_ctl->lock);
> >>  }
> >> @@ -687,7 +687,7 @@ void btrfs_discard_init(struct btrfs_fs_info *fs_info)
> >>         atomic_set(&discard_ctl->discardable_extents, 0);
> >>         atomic64_set(&discard_ctl->discardable_bytes, 0);
> >>         discard_ctl->max_discard_size = BTRFS_ASYNC_DISCARD_DEFAULT_MAX_SIZE;
> >> -       discard_ctl->delay = BTRFS_DISCARD_MAX_DELAY_MSEC;
> >> +       discard_ctl->delay_ms = BTRFS_DISCARD_MAX_DELAY_MSEC;
> >>         discard_ctl->iops_limit = BTRFS_DISCARD_MAX_IOPS;
> >>         discard_ctl->kbps_limit = 0;
> >>         discard_ctl->discard_extent_bytes = 0;
> >> --
> >> 2.24.0
> >>
> >
> > Looks all fine to me.
>
>
> --
> Pavel Begunkov

In this case, I see nothing more to consider with this.

Best regards,
Amy Parker
(they/them)

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH 1/4] btrfs: discard: speed up discard up to iops_limit
  2020-11-04 15:29   ` Amy Parker
@ 2020-11-04 17:19     ` Pavel Begunkov
  2020-11-04 17:33       ` Amy Parker
  0 siblings, 1 reply; 24+ messages in thread
From: Pavel Begunkov @ 2020-11-04 17:19 UTC (permalink / raw)
  To: Amy Parker; +Cc: Chris Mason, Josef Bacik, David Sterba, Btrfs BTRFS

On 04/11/2020 15:29, Amy Parker wrote:
> On Wed, Nov 4, 2020 at 1:50 AM Pavel Begunkov <asml.silence@gmail.com> wrote:
>>
>> Instead of using iops_limit only for cutting off extremes, calculate the
>> discard delay directly from it, so it closely follows iops_limit and
>> doesn't under-discarding even though quotas are not saturated.
> 
> This sounds like it potentially be a great performance boost, do you
> have any performance metrics regarding this patch?

Boosting the discard rate and so reaping stalling blocks may be nice, but
unless it holds too much memory creating lack of space it shouldn't affect
throughput. Though, it's better to ask people with deeper understanding
of the fs.
What I've seen is that in some cases there are extents staying queued for
discarding for _too_ long. E.g. reaping a small number of very fat extents
keeps delay at max and doesn't allow to reap them effectively. That could
be a problem with fast drives.

> 
>>
>> Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
>> ---
>>  fs/btrfs/discard.c | 10 +++++-----
>>  1 file changed, 5 insertions(+), 5 deletions(-)
>>
>> diff --git a/fs/btrfs/discard.c b/fs/btrfs/discard.c
>> index 741c7e19c32f..76796a90e88d 100644
>> --- a/fs/btrfs/discard.c
>> +++ b/fs/btrfs/discard.c
>> @@ -519,7 +519,6 @@ void btrfs_discard_calc_delay(struct btrfs_discard_ctl *discard_ctl)
>>         s64 discardable_bytes;
>>         u32 iops_limit;
>>         unsigned long delay;
>> -       unsigned long lower_limit = BTRFS_DISCARD_MIN_DELAY_MSEC;
>>
>>         discardable_extents = atomic_read(&discard_ctl->discardable_extents);
>>         if (!discardable_extents)
>> @@ -550,11 +549,12 @@ void btrfs_discard_calc_delay(struct btrfs_discard_ctl *discard_ctl)
>>
>>         iops_limit = READ_ONCE(discard_ctl->iops_limit);
>>         if (iops_limit)
>> -               lower_limit = max_t(unsigned long, lower_limit,
>> -                                   MSEC_PER_SEC / iops_limit);
>> +               delay = MSEC_PER_SEC / iops_limit;
>> +       else
>> +               delay = BTRFS_DISCARD_TARGET_MSEC / discardable_extents;
> 
> Looks good to me. I wonder why there wasn't handling of if iops_limit
> was unfindable
> before?

Not sure what you mean by unfindable, but async discard is relatively new,
might be that everyone just have their hands full.

> 
>>
>> -       delay = BTRFS_DISCARD_TARGET_MSEC / discardable_extents;
>> -       delay = clamp(delay, lower_limit, BTRFS_DISCARD_MAX_DELAY_MSEC);
>> +       delay = clamp(delay, BTRFS_DISCARD_MIN_DELAY_MSEC,
>> +                     BTRFS_DISCARD_MAX_DELAY_MSEC);
>>         discard_ctl->delay = msecs_to_jiffies(delay);
>>
>>         spin_unlock(&discard_ctl->lock);
>> --
>> 2.24.0
>>
> 
> This patch looks all great to me.
> 
> Best regards,
> Amy Parker
> (they/them)
> 

-- 
Pavel Begunkov

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH 1/4] btrfs: discard: speed up discard up to iops_limit
  2020-11-04 17:19     ` Pavel Begunkov
@ 2020-11-04 17:33       ` Amy Parker
  2020-11-04 17:47         ` Pavel Begunkov
  0 siblings, 1 reply; 24+ messages in thread
From: Amy Parker @ 2020-11-04 17:33 UTC (permalink / raw)
  To: Pavel Begunkov; +Cc: Chris Mason, Josef Bacik, David Sterba, Btrfs BTRFS

On Wed, Nov 4, 2020 at 9:22 AM Pavel Begunkov <asml.silence@gmail.com> wrote:
>
> On 04/11/2020 15:29, Amy Parker wrote:
> > On Wed, Nov 4, 2020 at 1:50 AM Pavel Begunkov <asml.silence@gmail.com> wrote:
> >>
> >> Instead of using iops_limit only for cutting off extremes, calculate the
> >> discard delay directly from it, so it closely follows iops_limit and
> >> doesn't under-discarding even though quotas are not saturated.
> >
> > This sounds like it potentially be a great performance boost, do you
> > have any performance metrics regarding this patch?
>
> Boosting the discard rate and so reaping stalling blocks may be nice, but
> unless it holds too much memory creating lack of space it shouldn't affect
> throughput. Though, it's better to ask people with deeper understanding
> of the fs.

Alright, thanks for the clarification.

> What I've seen is that in some cases there are extents staying queued for
> discarding for _too_ long. E.g. reaping a small number of very fat extents
> keeps delay at max and doesn't allow to reap them effectively. That could
> be a problem with fast drives.

Ah, yep. Seen this personally to a smaller extent.

>
> >
> >>
> >> Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
> >> ---
> >>  fs/btrfs/discard.c | 10 +++++-----
> >>  1 file changed, 5 insertions(+), 5 deletions(-)
> >>
> >> diff --git a/fs/btrfs/discard.c b/fs/btrfs/discard.c
> >> index 741c7e19c32f..76796a90e88d 100644
> >> --- a/fs/btrfs/discard.c
> >> +++ b/fs/btrfs/discard.c
> >> @@ -519,7 +519,6 @@ void btrfs_discard_calc_delay(struct btrfs_discard_ctl *discard_ctl)
> >>         s64 discardable_bytes;
> >>         u32 iops_limit;
> >>         unsigned long delay;
> >> -       unsigned long lower_limit = BTRFS_DISCARD_MIN_DELAY_MSEC;
> >>
> >>         discardable_extents = atomic_read(&discard_ctl->discardable_extents);
> >>         if (!discardable_extents)
> >> @@ -550,11 +549,12 @@ void btrfs_discard_calc_delay(struct btrfs_discard_ctl *discard_ctl)
> >>
> >>         iops_limit = READ_ONCE(discard_ctl->iops_limit);
> >>         if (iops_limit)
> >> -               lower_limit = max_t(unsigned long, lower_limit,
> >> -                                   MSEC_PER_SEC / iops_limit);
> >> +               delay = MSEC_PER_SEC / iops_limit;
> >> +       else
> >> +               delay = BTRFS_DISCARD_TARGET_MSEC / discardable_extents;
> >
> > Looks good to me. I wonder why there wasn't handling of if iops_limit
> > was unfindable
> > before?
>
> Not sure what you mean by unfindable, but async discard is relatively new,
> might be that everyone just have their hands full.

By unfindable I mean if iops_limit turned up as null when reading it
from discard_ctl.
Async discard was added in 5.6, correct? So yeah, makes sense then that people
just had their hands full. Thanks for adding it.

>
> >
> >>
> >> -       delay = BTRFS_DISCARD_TARGET_MSEC / discardable_extents;
> >> -       delay = clamp(delay, lower_limit, BTRFS_DISCARD_MAX_DELAY_MSEC);
> >> +       delay = clamp(delay, BTRFS_DISCARD_MIN_DELAY_MSEC,
> >> +                     BTRFS_DISCARD_MAX_DELAY_MSEC);
> >>         discard_ctl->delay = msecs_to_jiffies(delay);
> >>
> >>         spin_unlock(&discard_ctl->lock);
> >> --
> >> 2.24.0
> >>
> >
> > This patch looks all great to me.
> >
> > Best regards,
> > Amy Parker
> > (they/them)
> >
>
> --
> Pavel Begunkov

Best regards,
Amy Parker
(they/them)

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH 1/4] btrfs: discard: speed up discard up to iops_limit
  2020-11-04 17:33       ` Amy Parker
@ 2020-11-04 17:47         ` Pavel Begunkov
  2020-11-04 17:55           ` Amy Parker
  0 siblings, 1 reply; 24+ messages in thread
From: Pavel Begunkov @ 2020-11-04 17:47 UTC (permalink / raw)
  To: Amy Parker; +Cc: Chris Mason, Josef Bacik, David Sterba, Btrfs BTRFS

On 04/11/2020 17:33, Amy Parker wrote:
> On Wed, Nov 4, 2020 at 9:22 AM Pavel Begunkov <asml.silence@gmail.com> wrote:
>>
>> On 04/11/2020 15:29, Amy Parker wrote:
>>> On Wed, Nov 4, 2020 at 1:50 AM Pavel Begunkov <asml.silence@gmail.com> wrote:
>>>>
>>>> Instead of using iops_limit only for cutting off extremes, calculate the
>>>> discard delay directly from it, so it closely follows iops_limit and
>>>> doesn't under-discarding even though quotas are not saturated.
>>>
>>> This sounds like it potentially be a great performance boost, do you
>>> have any performance metrics regarding this patch?
>>
>> Boosting the discard rate and so reaping stalling blocks may be nice, but
>> unless it holds too much memory creating lack of space it shouldn't affect
>> throughput. Though, it's better to ask people with deeper understanding
>> of the fs.
> 
> Alright, thanks for the clarification.
> 
>> What I've seen is that in some cases there are extents staying queued for
>> discarding for _too_ long. E.g. reaping a small number of very fat extents
>> keeps delay at max and doesn't allow to reap them effectively. That could
>> be a problem with fast drives.
> 
> Ah, yep. Seen this personally to a smaller extent.
> 
>>
>>>
>>>>
>>>> Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
>>>> ---
>>>>  fs/btrfs/discard.c | 10 +++++-----
>>>>  1 file changed, 5 insertions(+), 5 deletions(-)
>>>>
>>>> diff --git a/fs/btrfs/discard.c b/fs/btrfs/discard.c
>>>> index 741c7e19c32f..76796a90e88d 100644
>>>> --- a/fs/btrfs/discard.c
>>>> +++ b/fs/btrfs/discard.c
>>>> @@ -519,7 +519,6 @@ void btrfs_discard_calc_delay(struct btrfs_discard_ctl *discard_ctl)
>>>>         s64 discardable_bytes;
>>>>         u32 iops_limit;
>>>>         unsigned long delay;
>>>> -       unsigned long lower_limit = BTRFS_DISCARD_MIN_DELAY_MSEC;
>>>>
>>>>         discardable_extents = atomic_read(&discard_ctl->discardable_extents);
>>>>         if (!discardable_extents)
>>>> @@ -550,11 +549,12 @@ void btrfs_discard_calc_delay(struct btrfs_discard_ctl *discard_ctl)
>>>>
>>>>         iops_limit = READ_ONCE(discard_ctl->iops_limit);
>>>>         if (iops_limit)
>>>> -               lower_limit = max_t(unsigned long, lower_limit,
>>>> -                                   MSEC_PER_SEC / iops_limit);
>>>> +               delay = MSEC_PER_SEC / iops_limit;
>>>> +       else
>>>> +               delay = BTRFS_DISCARD_TARGET_MSEC / discardable_extents;
>>>
>>> Looks good to me. I wonder why there wasn't handling of if iops_limit
>>> was unfindable
>>> before?
>>
>> Not sure what you mean by unfindable, but async discard is relatively new,
>> might be that everyone just have their hands full.
> 
> By unfindable I mean if iops_limit turned up as null when reading it
> from discard_ctl.

Ahh, ok. It's handled and I left it as it was, that BTW is still a problem.

First it calculates a delay based on number of queued extents and than clamps
it to (BTRFS_DISCARD_MIN_DELAY_MSEC, BTRFS_DISCARD_MAX_DELAY_MSEC). Without
this patch it did the same but the lower bound was calculated from iops_limit.

> Async discard was added in 5.6, correct? So yeah, makes sense then that people
> just had their hands full. Thanks for adding it.

b0643e59cfa609c4b5f ("btrfs: add the beginning of async discard, discard
workqueue"). Dec 2019, so less than a year

> 
>>
>>>
>>>>
>>>> -       delay = BTRFS_DISCARD_TARGET_MSEC / discardable_extents;
>>>> -       delay = clamp(delay, lower_limit, BTRFS_DISCARD_MAX_DELAY_MSEC);
>>>> +       delay = clamp(delay, BTRFS_DISCARD_MIN_DELAY_MSEC,
>>>> +                     BTRFS_DISCARD_MAX_DELAY_MSEC);
>>>>         discard_ctl->delay = msecs_to_jiffies(delay);
>>>>
>>>>         spin_unlock(&discard_ctl->lock);
>>>> --
>>>> 2.24.0
>>>>
>>>
>>> This patch looks all great to me.

-- 
Pavel Begunkov

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH 1/4] btrfs: discard: speed up discard up to iops_limit
  2020-11-04 17:47         ` Pavel Begunkov
@ 2020-11-04 17:55           ` Amy Parker
  2020-11-04 18:06             ` Pavel Begunkov
  0 siblings, 1 reply; 24+ messages in thread
From: Amy Parker @ 2020-11-04 17:55 UTC (permalink / raw)
  To: Pavel Begunkov; +Cc: Chris Mason, Josef Bacik, David Sterba, Btrfs BTRFS

On Wed, Nov 4, 2020 at 9:50 AM Pavel Begunkov <asml.silence@gmail.com> wrote:
>
> On 04/11/2020 17:33, Amy Parker wrote:
> > On Wed, Nov 4, 2020 at 9:22 AM Pavel Begunkov <asml.silence@gmail.com> wrote:
> >>
> >> On 04/11/2020 15:29, Amy Parker wrote:
> >>> On Wed, Nov 4, 2020 at 1:50 AM Pavel Begunkov <asml.silence@gmail.com> wrote:
> >>>>
> >>>> Instead of using iops_limit only for cutting off extremes, calculate the
> >>>> discard delay directly from it, so it closely follows iops_limit and
> >>>> doesn't under-discarding even though quotas are not saturated.
> >>>
> >>> This sounds like it potentially be a great performance boost, do you
> >>> have any performance metrics regarding this patch?
> >>
> >> Boosting the discard rate and so reaping stalling blocks may be nice, but
> >> unless it holds too much memory creating lack of space it shouldn't affect
> >> throughput. Though, it's better to ask people with deeper understanding
> >> of the fs.
> >
> > Alright, thanks for the clarification.
> >
> >> What I've seen is that in some cases there are extents staying queued for
> >> discarding for _too_ long. E.g. reaping a small number of very fat extents
> >> keeps delay at max and doesn't allow to reap them effectively. That could
> >> be a problem with fast drives.
> >
> > Ah, yep. Seen this personally to a smaller extent.
> >
> >>
> >>>
> >>>>
> >>>> Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
> >>>> ---
> >>>>  fs/btrfs/discard.c | 10 +++++-----
> >>>>  1 file changed, 5 insertions(+), 5 deletions(-)
> >>>>
> >>>> diff --git a/fs/btrfs/discard.c b/fs/btrfs/discard.c
> >>>> index 741c7e19c32f..76796a90e88d 100644
> >>>> --- a/fs/btrfs/discard.c
> >>>> +++ b/fs/btrfs/discard.c
> >>>> @@ -519,7 +519,6 @@ void btrfs_discard_calc_delay(struct btrfs_discard_ctl *discard_ctl)
> >>>>         s64 discardable_bytes;
> >>>>         u32 iops_limit;
> >>>>         unsigned long delay;
> >>>> -       unsigned long lower_limit = BTRFS_DISCARD_MIN_DELAY_MSEC;
> >>>>
> >>>>         discardable_extents = atomic_read(&discard_ctl->discardable_extents);
> >>>>         if (!discardable_extents)
> >>>> @@ -550,11 +549,12 @@ void btrfs_discard_calc_delay(struct btrfs_discard_ctl *discard_ctl)
> >>>>
> >>>>         iops_limit = READ_ONCE(discard_ctl->iops_limit);
> >>>>         if (iops_limit)
> >>>> -               lower_limit = max_t(unsigned long, lower_limit,
> >>>> -                                   MSEC_PER_SEC / iops_limit);
> >>>> +               delay = MSEC_PER_SEC / iops_limit;
> >>>> +       else
> >>>> +               delay = BTRFS_DISCARD_TARGET_MSEC / discardable_extents;
> >>>
> >>> Looks good to me. I wonder why there wasn't handling of if iops_limit
> >>> was unfindable
> >>> before?
> >>
> >> Not sure what you mean by unfindable, but async discard is relatively new,
> >> might be that everyone just have their hands full.
> >
> > By unfindable I mean if iops_limit turned up as null when reading it
> > from discard_ctl.
>
> Ahh, ok. It's handled and I left it as it was, that BTW is still a problem.

How often is iops_limit unfindable?

>
> First it calculates a delay based on number of queued extents and than clamps
> it to (BTRFS_DISCARD_MIN_DELAY_MSEC, BTRFS_DISCARD_MAX_DELAY_MSEC). Without
> this patch it did the same but the lower bound was calculated from iops_limit.

Thanks for clarifying.

>
> > Async discard was added in 5.6, correct? So yeah, makes sense then that people
> > just had their hands full. Thanks for adding it.
>
> b0643e59cfa609c4b5f ("btrfs: add the beginning of async discard, discard
> workqueue"). Dec 2019, so less than a year

Thanks for finding the commit.

>
> >
> >>
> >>>
> >>>>
> >>>> -       delay = BTRFS_DISCARD_TARGET_MSEC / discardable_extents;
> >>>> -       delay = clamp(delay, lower_limit, BTRFS_DISCARD_MAX_DELAY_MSEC);
> >>>> +       delay = clamp(delay, BTRFS_DISCARD_MIN_DELAY_MSEC,
> >>>> +                     BTRFS_DISCARD_MAX_DELAY_MSEC);
> >>>>         discard_ctl->delay = msecs_to_jiffies(delay);
> >>>>
> >>>>         spin_unlock(&discard_ctl->lock);
> >>>> --
> >>>> 2.24.0
> >>>>
> >>>
> >>> This patch looks all great to me.
>
> --
> Pavel Begunkov

Best regards,
Amy Parker
(they/them)

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH 1/4] btrfs: discard: speed up discard up to iops_limit
  2020-11-04 17:55           ` Amy Parker
@ 2020-11-04 18:06             ` Pavel Begunkov
  2020-11-04 18:14               ` Amy Parker
  0 siblings, 1 reply; 24+ messages in thread
From: Pavel Begunkov @ 2020-11-04 18:06 UTC (permalink / raw)
  To: Amy Parker; +Cc: Chris Mason, Josef Bacik, David Sterba, Btrfs BTRFS

On 04/11/2020 17:55, Amy Parker wrote:
> On Wed, Nov 4, 2020 at 9:50 AM Pavel Begunkov <asml.silence@gmail.com> wrote:
>>
>> On 04/11/2020 17:33, Amy Parker wrote:
>>> On Wed, Nov 4, 2020 at 9:22 AM Pavel Begunkov <asml.silence@gmail.com> wrote:
>>>>
>>>> On 04/11/2020 15:29, Amy Parker wrote:
>>>>> On Wed, Nov 4, 2020 at 1:50 AM Pavel Begunkov <asml.silence@gmail.com> wrote:
>>>>>>
>>>>>> Instead of using iops_limit only for cutting off extremes, calculate the
>>>>>> discard delay directly from it, so it closely follows iops_limit and
>>>>>> doesn't under-discarding even though quotas are not saturated.
>>>>>
>>>>> This sounds like it potentially be a great performance boost, do you
>>>>> have any performance metrics regarding this patch?
>>>>
>>>> Boosting the discard rate and so reaping stalling blocks may be nice, but
>>>> unless it holds too much memory creating lack of space it shouldn't affect
>>>> throughput. Though, it's better to ask people with deeper understanding
>>>> of the fs.
>>>
>>> Alright, thanks for the clarification.
>>>
>>>> What I've seen is that in some cases there are extents staying queued for
>>>> discarding for _too_ long. E.g. reaping a small number of very fat extents
>>>> keeps delay at max and doesn't allow to reap them effectively. That could
>>>> be a problem with fast drives.
>>>
>>> Ah, yep. Seen this personally to a smaller extent.
>>>
>>>>
>>>>>
>>>>>>
>>>>>> Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
>>>>>> ---
>>>>>>  fs/btrfs/discard.c | 10 +++++-----
>>>>>>  1 file changed, 5 insertions(+), 5 deletions(-)
>>>>>>
>>>>>> diff --git a/fs/btrfs/discard.c b/fs/btrfs/discard.c
>>>>>> index 741c7e19c32f..76796a90e88d 100644
>>>>>> --- a/fs/btrfs/discard.c
>>>>>> +++ b/fs/btrfs/discard.c
>>>>>> @@ -519,7 +519,6 @@ void btrfs_discard_calc_delay(struct btrfs_discard_ctl *discard_ctl)
>>>>>>         s64 discardable_bytes;
>>>>>>         u32 iops_limit;
>>>>>>         unsigned long delay;
>>>>>> -       unsigned long lower_limit = BTRFS_DISCARD_MIN_DELAY_MSEC;
>>>>>>
>>>>>>         discardable_extents = atomic_read(&discard_ctl->discardable_extents);
>>>>>>         if (!discardable_extents)
>>>>>> @@ -550,11 +549,12 @@ void btrfs_discard_calc_delay(struct btrfs_discard_ctl *discard_ctl)
>>>>>>
>>>>>>         iops_limit = READ_ONCE(discard_ctl->iops_limit);
>>>>>>         if (iops_limit)
>>>>>> -               lower_limit = max_t(unsigned long, lower_limit,
>>>>>> -                                   MSEC_PER_SEC / iops_limit);
>>>>>> +               delay = MSEC_PER_SEC / iops_limit;
>>>>>> +       else
>>>>>> +               delay = BTRFS_DISCARD_TARGET_MSEC / discardable_extents;
>>>>>
>>>>> Looks good to me. I wonder why there wasn't handling of if iops_limit
>>>>> was unfindable
>>>>> before?
>>>>
>>>> Not sure what you mean by unfindable, but async discard is relatively new,
>>>> might be that everyone just have their hands full.
>>>
>>> By unfindable I mean if iops_limit turned up as null when reading it
>>> from discard_ctl.
>>
>> Ahh, ok. It's handled and I left it as it was, that BTW is still a problem.
> 
> How often is iops_limit unfindable?

I don't know, but the default is 10, so shouldn't be too ubiquitous.
Maybe someone here knows statistics.

> 
>>
>> First it calculates a delay based on number of queued extents and than clamps
>> it to (BTRFS_DISCARD_MIN_DELAY_MSEC, BTRFS_DISCARD_MAX_DELAY_MSEC). Without
>> this patch it did the same but the lower bound was calculated from iops_limit.
> 
> Thanks for clarifying.
> 
>>
>>> Async discard was added in 5.6, correct? So yeah, makes sense then that people
>>> just had their hands full. Thanks for adding it.
>>
>> b0643e59cfa609c4b5f ("btrfs: add the beginning of async discard, discard
>> workqueue"). Dec 2019, so less than a year
> 
> Thanks for finding the commit.
> 
>>
>>>
>>>>
>>>>>
>>>>>>
>>>>>> -       delay = BTRFS_DISCARD_TARGET_MSEC / discardable_extents;
>>>>>> -       delay = clamp(delay, lower_limit, BTRFS_DISCARD_MAX_DELAY_MSEC);
>>>>>> +       delay = clamp(delay, BTRFS_DISCARD_MIN_DELAY_MSEC,
>>>>>> +                     BTRFS_DISCARD_MAX_DELAY_MSEC);
>>>>>>         discard_ctl->delay = msecs_to_jiffies(delay);
>>>>>>
>>>>>>         spin_unlock(&discard_ctl->lock);
>>>>>> --
>>>>>> 2.24.0
>>>>>>
>>>>>
>>>>> This patch looks all great to me.
>>
>> --
>> Pavel Begunkov
> 
> Best regards,
> Amy Parker
> (they/them)
> 

-- 
Pavel Begunkov

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH 1/4] btrfs: discard: speed up discard up to iops_limit
  2020-11-04 18:06             ` Pavel Begunkov
@ 2020-11-04 18:14               ` Amy Parker
  0 siblings, 0 replies; 24+ messages in thread
From: Amy Parker @ 2020-11-04 18:14 UTC (permalink / raw)
  To: Pavel Begunkov; +Cc: Chris Mason, Josef Bacik, David Sterba, Btrfs BTRFS

On Wed, Nov 4, 2020 at 10:09 AM Pavel Begunkov <asml.silence@gmail.com> wrote:
>
> On 04/11/2020 17:55, Amy Parker wrote:
> > On Wed, Nov 4, 2020 at 9:50 AM Pavel Begunkov <asml.silence@gmail.com> wrote:
> >>
> >> On 04/11/2020 17:33, Amy Parker wrote:
> >>> On Wed, Nov 4, 2020 at 9:22 AM Pavel Begunkov <asml.silence@gmail.com> wrote:
> >>>>
> >>>> On 04/11/2020 15:29, Amy Parker wrote:
> >>>>> On Wed, Nov 4, 2020 at 1:50 AM Pavel Begunkov <asml.silence@gmail.com> wrote:
> >>>>>>
> >>>>>> Instead of using iops_limit only for cutting off extremes, calculate the
> >>>>>> discard delay directly from it, so it closely follows iops_limit and
> >>>>>> doesn't under-discarding even though quotas are not saturated.
> >>>>>
> >>>>> This sounds like it potentially be a great performance boost, do you
> >>>>> have any performance metrics regarding this patch?
> >>>>
> >>>> Boosting the discard rate and so reaping stalling blocks may be nice, but
> >>>> unless it holds too much memory creating lack of space it shouldn't affect
> >>>> throughput. Though, it's better to ask people with deeper understanding
> >>>> of the fs.
> >>>
> >>> Alright, thanks for the clarification.
> >>>
> >>>> What I've seen is that in some cases there are extents staying queued for
> >>>> discarding for _too_ long. E.g. reaping a small number of very fat extents
> >>>> keeps delay at max and doesn't allow to reap them effectively. That could
> >>>> be a problem with fast drives.
> >>>
> >>> Ah, yep. Seen this personally to a smaller extent.
> >>>
> >>>>
> >>>>>
> >>>>>>
> >>>>>> Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
> >>>>>> ---
> >>>>>>  fs/btrfs/discard.c | 10 +++++-----
> >>>>>>  1 file changed, 5 insertions(+), 5 deletions(-)
> >>>>>>
> >>>>>> diff --git a/fs/btrfs/discard.c b/fs/btrfs/discard.c
> >>>>>> index 741c7e19c32f..76796a90e88d 100644
> >>>>>> --- a/fs/btrfs/discard.c
> >>>>>> +++ b/fs/btrfs/discard.c
> >>>>>> @@ -519,7 +519,6 @@ void btrfs_discard_calc_delay(struct btrfs_discard_ctl *discard_ctl)
> >>>>>>         s64 discardable_bytes;
> >>>>>>         u32 iops_limit;
> >>>>>>         unsigned long delay;
> >>>>>> -       unsigned long lower_limit = BTRFS_DISCARD_MIN_DELAY_MSEC;
> >>>>>>
> >>>>>>         discardable_extents = atomic_read(&discard_ctl->discardable_extents);
> >>>>>>         if (!discardable_extents)
> >>>>>> @@ -550,11 +549,12 @@ void btrfs_discard_calc_delay(struct btrfs_discard_ctl *discard_ctl)
> >>>>>>
> >>>>>>         iops_limit = READ_ONCE(discard_ctl->iops_limit);
> >>>>>>         if (iops_limit)
> >>>>>> -               lower_limit = max_t(unsigned long, lower_limit,
> >>>>>> -                                   MSEC_PER_SEC / iops_limit);
> >>>>>> +               delay = MSEC_PER_SEC / iops_limit;
> >>>>>> +       else
> >>>>>> +               delay = BTRFS_DISCARD_TARGET_MSEC / discardable_extents;
> >>>>>
> >>>>> Looks good to me. I wonder why there wasn't handling of if iops_limit
> >>>>> was unfindable
> >>>>> before?
> >>>>
> >>>> Not sure what you mean by unfindable, but async discard is relatively new,
> >>>> might be that everyone just have their hands full.
> >>>
> >>> By unfindable I mean if iops_limit turned up as null when reading it
> >>> from discard_ctl.
> >>
> >> Ahh, ok. It's handled and I left it as it was, that BTW is still a problem.
> >
> > How often is iops_limit unfindable?
>
> I don't know, but the default is 10, so shouldn't be too ubiquitous.
> Maybe someone here knows statistics.

So it isn't a major issue right now, and we can just have it looked at whenever
someone has the time to.

>
> >
> >>
> >> First it calculates a delay based on number of queued extents and than clamps
> >> it to (BTRFS_DISCARD_MIN_DELAY_MSEC, BTRFS_DISCARD_MAX_DELAY_MSEC). Without
> >> this patch it did the same but the lower bound was calculated from iops_limit.
> >
> > Thanks for clarifying.
> >
> >>
> >>> Async discard was added in 5.6, correct? So yeah, makes sense then that people
> >>> just had their hands full. Thanks for adding it.
> >>
> >> b0643e59cfa609c4b5f ("btrfs: add the beginning of async discard, discard
> >> workqueue"). Dec 2019, so less than a year
> >
> > Thanks for finding the commit.
> >
> >>
> >>>
> >>>>
> >>>>>
> >>>>>>
> >>>>>> -       delay = BTRFS_DISCARD_TARGET_MSEC / discardable_extents;
> >>>>>> -       delay = clamp(delay, lower_limit, BTRFS_DISCARD_MAX_DELAY_MSEC);
> >>>>>> +       delay = clamp(delay, BTRFS_DISCARD_MIN_DELAY_MSEC,
> >>>>>> +                     BTRFS_DISCARD_MAX_DELAY_MSEC);
> >>>>>>         discard_ctl->delay = msecs_to_jiffies(delay);
> >>>>>>
> >>>>>>         spin_unlock(&discard_ctl->lock);
> >>>>>> --
> >>>>>> 2.24.0
> >>>>>>
> >>>>>
> >>>>> This patch looks all great to me.
> >>
> >> --
> >> Pavel Begunkov
> >
> > Best regards,
> > Amy Parker
> > (they/them)
> >
>
> --
> Pavel Begunkov

Best regards,
Amy Parker
(they/them)

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH 1/4] btrfs: discard: speed up discard up to iops_limit
  2020-11-04  9:45 ` [PATCH 1/4] btrfs: discard: speed up discard up to iops_limit Pavel Begunkov
  2020-11-04 15:29   ` Amy Parker
@ 2020-11-04 20:52   ` Josef Bacik
  1 sibling, 0 replies; 24+ messages in thread
From: Josef Bacik @ 2020-11-04 20:52 UTC (permalink / raw)
  To: Pavel Begunkov, Chris Mason, David Sterba, linux-btrfs

On 11/4/20 4:45 AM, Pavel Begunkov wrote:
> Instead of using iops_limit only for cutting off extremes, calculate the
> discard delay directly from it, so it closely follows iops_limit and
> doesn't under-discarding even though quotas are not saturated.
> 
> Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>

Reviewed-by: Josef Bacik <josef@toxicpanda.com>

Thanks,

Josef

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH 2/4] btrfs: discard: save discard delay as ns not jiffy
  2020-11-04  9:45 ` [PATCH 2/4] btrfs: discard: save discard delay as ns not jiffy Pavel Begunkov
  2020-11-04 15:35   ` Amy Parker
@ 2020-11-04 20:54   ` Josef Bacik
  1 sibling, 0 replies; 24+ messages in thread
From: Josef Bacik @ 2020-11-04 20:54 UTC (permalink / raw)
  To: Pavel Begunkov, Chris Mason, David Sterba, linux-btrfs

On 11/4/20 4:45 AM, Pavel Begunkov wrote:
> Most of calculations are done in ns or ms, so store discard_ctl->delay
> in ms and convert the final delay to jiffies only in the end.
> 
> Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>

Reviewed-by: Josef Bacik <josef@toxicpanda.com>

Thanks,

Josef

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH 3/4] btrfs: don't miss discards after override-schedule
  2020-11-04  9:45 ` [PATCH 3/4] btrfs: don't miss discards after override-schedule Pavel Begunkov
@ 2020-11-04 20:59   ` Josef Bacik
  2020-11-04 21:23     ` Pavel Begunkov
  0 siblings, 1 reply; 24+ messages in thread
From: Josef Bacik @ 2020-11-04 20:59 UTC (permalink / raw)
  To: Pavel Begunkov, Chris Mason, David Sterba, linux-btrfs

On 11/4/20 4:45 AM, Pavel Begunkov wrote:
> If btrfs_discard_schedule_work() is called with override=true, it sets
> delay anew regardless how much time left until the timer should have
> fired. If delays are long (that can happen, for example, with low
> kbps_limit), they might be constantly overriden without having a chance
> to run the discard work.
> 
> Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
> ---
>   fs/btrfs/ctree.h   |  1 +
>   fs/btrfs/discard.c | 11 +++++++++++
>   2 files changed, 12 insertions(+)
> 
> diff --git a/fs/btrfs/ctree.h b/fs/btrfs/ctree.h
> index d43a82dcdfc0..ad71c8c769de 100644
> --- a/fs/btrfs/ctree.h
> +++ b/fs/btrfs/ctree.h
> @@ -469,6 +469,7 @@ struct btrfs_discard_ctl {
>   	struct btrfs_block_group *block_group;
>   	struct list_head discard_list[BTRFS_NR_DISCARD_LISTS];
>   	u64 prev_discard;
> +	u64 prev_discard_time;
>   	atomic_t discardable_extents;
>   	atomic64_t discardable_bytes;
>   	u64 max_discard_size;
> diff --git a/fs/btrfs/discard.c b/fs/btrfs/discard.c
> index b6c68e5711f0..c9018b9ccf99 100644
> --- a/fs/btrfs/discard.c
> +++ b/fs/btrfs/discard.c
> @@ -381,6 +381,15 @@ void btrfs_discard_schedule_work(struct btrfs_discard_ctl *discard_ctl,
>   			delay = max(delay, bg_timeout);
>   		}
>   
> +		if (override && discard_ctl->prev_discard) {
> +			u64 elapsed = now - discard_ctl->prev_discard_time;
> +
> +			if (delay > elapsed)
> +				delay -= elapsed;
> +			else
> +				delay = 0;
> +		}
> +
>   		mod_delayed_work(discard_ctl->discard_workers,
>   				 &discard_ctl->work, nsecs_to_jiffies(delay));
>   	}
> @@ -466,6 +475,7 @@ static void btrfs_discard_workfn(struct work_struct *work)
>   	}
>   
>   	discard_ctl->prev_discard = trimmed;
> +	discard_ctl->prev_discard_time = ktime_get_ns();

I noticed these weren't protected by the discard_ctl->lock, so I went to look at 
if that was ok.  It appears to be ok, since this is the workfn, and we only read 
them if there's no pending work, so we're protected there.  Just a note for 
anybody else who finds it weird, though I wouldn't argue with protecting it with 
a lock just to remove any ambiguity.

Reviewed-by: Josef Bacik <josef@toxicpanda.com>

Thanks,

Josef

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH 4/4] btrfs: discard: reschedule work after param update
  2020-11-04  9:45 ` [PATCH 4/4] btrfs: discard: reschedule work after param update Pavel Begunkov
@ 2020-11-04 21:00   ` Josef Bacik
  0 siblings, 0 replies; 24+ messages in thread
From: Josef Bacik @ 2020-11-04 21:00 UTC (permalink / raw)
  To: Pavel Begunkov, Chris Mason, David Sterba, linux-btrfs

On 11/4/20 4:45 AM, Pavel Begunkov wrote:
> After sysfs updates discard's iops_limit or kbps_limit it also needs to
> adjust current timer through rescheduling, otherwise the discard work
> may wait for a long time for the previous timer to expier or bumped by
> someone else.
> 
> Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>

Reviewed-by: Josef Bacik <josef@toxicpanda.com>

Thanks,

Josef

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH 3/4] btrfs: don't miss discards after override-schedule
  2020-11-04 20:59   ` Josef Bacik
@ 2020-11-04 21:23     ` Pavel Begunkov
  0 siblings, 0 replies; 24+ messages in thread
From: Pavel Begunkov @ 2020-11-04 21:23 UTC (permalink / raw)
  To: Josef Bacik, Chris Mason, David Sterba, linux-btrfs

On 04/11/2020 20:59, Josef Bacik wrote:
> On 11/4/20 4:45 AM, Pavel Begunkov wrote:
>> If btrfs_discard_schedule_work() is called with override=true, it sets
>> delay anew regardless how much time left until the timer should have
>> fired. If delays are long (that can happen, for example, with low
>> kbps_limit), they might be constantly overriden without having a chance
>> to run the discard work.
>>
>> Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
>> ---
>>   fs/btrfs/ctree.h   |  1 +
>>   fs/btrfs/discard.c | 11 +++++++++++
>>   2 files changed, 12 insertions(+)
>>
>> diff --git a/fs/btrfs/ctree.h b/fs/btrfs/ctree.h
>> index d43a82dcdfc0..ad71c8c769de 100644
>> --- a/fs/btrfs/ctree.h
>> +++ b/fs/btrfs/ctree.h
>> @@ -469,6 +469,7 @@ struct btrfs_discard_ctl {
>>       struct btrfs_block_group *block_group;
>>       struct list_head discard_list[BTRFS_NR_DISCARD_LISTS];
>>       u64 prev_discard;
>> +    u64 prev_discard_time;
>>       atomic_t discardable_extents;
>>       atomic64_t discardable_bytes;
>>       u64 max_discard_size;
>> diff --git a/fs/btrfs/discard.c b/fs/btrfs/discard.c
>> index b6c68e5711f0..c9018b9ccf99 100644
>> --- a/fs/btrfs/discard.c
>> +++ b/fs/btrfs/discard.c
>> @@ -381,6 +381,15 @@ void btrfs_discard_schedule_work(struct btrfs_discard_ctl *discard_ctl,
>>               delay = max(delay, bg_timeout);
>>           }
>>   +        if (override && discard_ctl->prev_discard) {
>> +            u64 elapsed = now - discard_ctl->prev_discard_time;
>> +
>> +            if (delay > elapsed)
>> +                delay -= elapsed;
>> +            else
>> +                delay = 0;
>> +        }
>> +
>>           mod_delayed_work(discard_ctl->discard_workers,
>>                    &discard_ctl->work, nsecs_to_jiffies(delay));
>>       }
>> @@ -466,6 +475,7 @@ static void btrfs_discard_workfn(struct work_struct *work)
>>       }
>>         discard_ctl->prev_discard = trimmed;
>> +    discard_ctl->prev_discard_time = ktime_get_ns();
> 
> I noticed these weren't protected by the discard_ctl->lock, so I went to look at if that was ok.  It appears to be ok, since this is the workfn, and we only read them if there's no pending work, so we're protected there.  Just a note for anybody else who finds it weird, though I wouldn't argue with protecting it with a lock just to remove any ambiguity.

Agree, together with ->prev_discard. Or at least there should be
a comment.

> 
> Reviewed-by: Josef Bacik <josef@toxicpanda.com>
Thanks

-- 
Pavel Begunkov

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH 0/4] fixes for btrfs async discards
  2020-11-04  9:45 [PATCH 0/4] fixes for btrfs async discards Pavel Begunkov
                   ` (3 preceding siblings ...)
  2020-11-04  9:45 ` [PATCH 4/4] btrfs: discard: reschedule work after param update Pavel Begunkov
@ 2020-11-05 22:23 ` David Sterba
  2020-11-06 13:20   ` Pavel Begunkov
  4 siblings, 1 reply; 24+ messages in thread
From: David Sterba @ 2020-11-05 22:23 UTC (permalink / raw)
  To: Pavel Begunkov; +Cc: Chris Mason, Josef Bacik, David Sterba, linux-btrfs

On Wed, Nov 04, 2020 at 09:45:50AM +0000, Pavel Begunkov wrote:
> Several fixes for async discards. The first patch might increase discard
> rate, drastically in some cases. That may be a suprise for those
> assuming that hitting iops_limit is rare and rarther outliers. Though,
> it still stays in allowed range, so should be fine.

I think this highly depends on the workload, if you really need to issue
the discards fast because of the rate of the change in the regular data.
That was the point of the async discard and the knobs, the defaults
should be ok for most users and allow adjusting for specific loads.

My testing of the original discard patchset was tailored towards the
default usecase and likely intense than yours. I did not observe the
corner cases where the work queue scheduling was off, or changing the
sysfs values did not poke the right kthreads.

Patches look ok to me and I've added them to topic branch for testing,
going to misc-next later. Thanks.

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH 0/4] fixes for btrfs async discards
  2020-11-05 22:23 ` [PATCH 0/4] fixes for btrfs async discards David Sterba
@ 2020-11-06 13:20   ` Pavel Begunkov
  2020-11-06 13:56     ` David Sterba
  2020-11-06 14:19     ` Chris Mason
  0 siblings, 2 replies; 24+ messages in thread
From: Pavel Begunkov @ 2020-11-06 13:20 UTC (permalink / raw)
  To: dsterba, Chris Mason, Josef Bacik, David Sterba, linux-btrfs

On 05/11/2020 22:23, David Sterba wrote:
> On Wed, Nov 04, 2020 at 09:45:50AM +0000, Pavel Begunkov wrote:
>> Several fixes for async discards. The first patch might increase discard
>> rate, drastically in some cases. That may be a suprise for those
>> assuming that hitting iops_limit is rare and rarther outliers. Though,
>> it still stays in allowed range, so should be fine.
> 
> I think this highly depends on the workload, if you really need to issue
> the discards fast because of the rate of the change in the regular data.
> That was the point of the async discard and the knobs, the defaults
> should be ok for most users and allow adjusting for specific loads.

Chris mentioned that _there are_ problems with faster drives though.
The problem is that this iops_limit knot just clamps the chosen delay.
Ultimately, I want to find later a better delay function than

delay = CONSTANT_INTERVAL_MS / nr_extents.

But that will take some thinking.

For instance, one of the cases I've seen is recycling large extents
like deletion of subvolumes. There we have a small number of extents
but each takes a lot of space, so there are, say, 100+GB queued to be
discarded. But because there are few extents, delay is calculated
to >10s that's then clamped to a constant max limit.
That was taking a long to recycle. Not sure though how many bytes/extents
are discarded on each iteration of btrfs_discard_workfn().

> 
> My testing of the original discard patchset was tailored towards the
> default usecase and likely intense than yours. I did not observe the
> corner cases where the work queue scheduling was off, or changing the
> sysfs values did not poke the right kthreads.
> 
> Patches look ok to me and I've added them to topic branch for testing,
> going to misc-next later. Thanks.
> 

-- 
Pavel Begunkov

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH 0/4] fixes for btrfs async discards
  2020-11-06 13:20   ` Pavel Begunkov
@ 2020-11-06 13:56     ` David Sterba
  2020-11-06 14:19     ` Chris Mason
  1 sibling, 0 replies; 24+ messages in thread
From: David Sterba @ 2020-11-06 13:56 UTC (permalink / raw)
  To: Pavel Begunkov
  Cc: dsterba, Chris Mason, Josef Bacik, David Sterba, linux-btrfs

On Fri, Nov 06, 2020 at 01:20:25PM +0000, Pavel Begunkov wrote:
> On 05/11/2020 22:23, David Sterba wrote:
> > On Wed, Nov 04, 2020 at 09:45:50AM +0000, Pavel Begunkov wrote:
> >> Several fixes for async discards. The first patch might increase discard
> >> rate, drastically in some cases. That may be a suprise for those
> >> assuming that hitting iops_limit is rare and rarther outliers. Though,
> >> it still stays in allowed range, so should be fine.
> > 
> > I think this highly depends on the workload, if you really need to issue
> > the discards fast because of the rate of the change in the regular data.
> > That was the point of the async discard and the knobs, the defaults
> > should be ok for most users and allow adjusting for specific loads.
> 
> Chris mentioned that _there are_ problems with faster drives though.
> The problem is that this iops_limit knot just clamps the chosen delay.
> Ultimately, I want to find later a better delay function than
> 
> delay = CONSTANT_INTERVAL_MS / nr_extents.
> 
> But that will take some thinking.
> 
> For instance, one of the cases I've seen is recycling large extents
> like deletion of subvolumes. There we have a small number of extents
> but each takes a lot of space, so there are, say, 100+GB queued to be
> discarded. But because there are few extents, delay is calculated
> to >10s that's then clamped to a constant max limit.
> That was taking a long to recycle. Not sure though how many bytes/extents
> are discarded on each iteration of btrfs_discard_workfn().

BTRFS_ASYNC_DISCARD_DEFAULT_MAX_SIZE 64M

So the few large extents do not fit current scheme. I thought some
translation to "logical" units could do the same discard io scheduling.
100G of size would become N times maximum discard extent units
(N = 100G / max) and submitted for discard until N is 0 for a given
input range.

But if you know you'll have ranges that are orders of magnitude larger
than the extent size, then setting the sysfs value accordingly seems
like the right approach. I'm not sure if the freed ranges are coalesced
before adding them to the discard list. If not, then icreasing the max
size should work, otherwise the coalescing could be adjusted.

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH 0/4] fixes for btrfs async discards
  2020-11-06 13:20   ` Pavel Begunkov
  2020-11-06 13:56     ` David Sterba
@ 2020-11-06 14:19     ` Chris Mason
  1 sibling, 0 replies; 24+ messages in thread
From: Chris Mason @ 2020-11-06 14:19 UTC (permalink / raw)
  To: Pavel Begunkov; +Cc: dsterba, Josef Bacik, David Sterba, linux-btrfs

On 6 Nov 2020, at 8:20, Pavel Begunkov wrote:

> On 05/11/2020 22:23, David Sterba wrote:
>> On Wed, Nov 04, 2020 at 09:45:50AM +0000, Pavel Begunkov wrote:
>>> Several fixes for async discards. The first patch might increase 
>>> discard
>>> rate, drastically in some cases. That may be a suprise for those
>>> assuming that hitting iops_limit is rare and rarther outliers. 
>>> Though,
>>> it still stays in allowed range, so should be fine.
>>
>> I think this highly depends on the workload, if you really need to 
>> issue
>> the discards fast because of the rate of the change in the regular 
>> data.
>> That was the point of the async discard and the knobs, the defaults
>> should be ok for most users and allow adjusting for specific loads.
>
> Chris mentioned that _there are_ problems with faster drives though.
> The problem is that this iops_limit knot just clamps the chosen delay.
> Ultimately, I want to find later a better delay function than
>
> delay = CONSTANT_INTERVAL_MS / nr_extents.
>
> But that will take some thinking.

Yeah, we have drives where latencies increase as the drive fills up.  
Async discard moves the knee on utilization percentage before the 
latencies increase, but without patches like these, we do still have 
garbage collection related stalls.  We still need to measure if the GC 
related stalls turn into discard-saturation stalls, but at least now we 
can try.

-chris

^ permalink raw reply	[flat|nested] 24+ messages in thread

end of thread, other threads:[~2020-11-06 14:19 UTC | newest]

Thread overview: 24+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2020-11-04  9:45 [PATCH 0/4] fixes for btrfs async discards Pavel Begunkov
2020-11-04  9:45 ` [PATCH 1/4] btrfs: discard: speed up discard up to iops_limit Pavel Begunkov
2020-11-04 15:29   ` Amy Parker
2020-11-04 17:19     ` Pavel Begunkov
2020-11-04 17:33       ` Amy Parker
2020-11-04 17:47         ` Pavel Begunkov
2020-11-04 17:55           ` Amy Parker
2020-11-04 18:06             ` Pavel Begunkov
2020-11-04 18:14               ` Amy Parker
2020-11-04 20:52   ` Josef Bacik
2020-11-04  9:45 ` [PATCH 2/4] btrfs: discard: save discard delay as ns not jiffy Pavel Begunkov
2020-11-04 15:35   ` Amy Parker
2020-11-04 15:48     ` Pavel Begunkov
2020-11-04 16:46       ` Amy Parker
2020-11-04 20:54   ` Josef Bacik
2020-11-04  9:45 ` [PATCH 3/4] btrfs: don't miss discards after override-schedule Pavel Begunkov
2020-11-04 20:59   ` Josef Bacik
2020-11-04 21:23     ` Pavel Begunkov
2020-11-04  9:45 ` [PATCH 4/4] btrfs: discard: reschedule work after param update Pavel Begunkov
2020-11-04 21:00   ` Josef Bacik
2020-11-05 22:23 ` [PATCH 0/4] fixes for btrfs async discards David Sterba
2020-11-06 13:20   ` Pavel Begunkov
2020-11-06 13:56     ` David Sterba
2020-11-06 14:19     ` Chris Mason

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.