* [PATCH 0/7] Preemptive flushing improvements
@ 2021-04-28 17:38 Josef Bacik
  2021-04-28 17:38 ` [PATCH 1/7] btrfs: check worker before need_preemptive_reclaim Josef Bacik
                   ` (8 more replies)
  0 siblings, 9 replies; 21+ messages in thread
From: Josef Bacik @ 2021-04-28 17:38 UTC (permalink / raw)
  To: linux-btrfs, kernel-team

The following patch series is a set of improvements around our preemptive
flushing infrastructure.  A user reported[1] a problem where the preemptive
flushing was running non-stop on his full file system.  I was not able to
reproduce this behavior, but I noticed a few fundamental issues with how we
decide to do preemptive flushing.  The main one is that we were not taking
the global reserve into account when deciding whether we needed to
preemptively flush.  In cases such as the user's file system, where the
majority of the free metadata space is taken up by the global reserve, we
could potentially always want to flush, which is annoying and not what we
want.
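
Concretely, the "plain full" check this series adjusts looks roughly like
the following (a simplified before/after sketch, not the exact code; see
patch 3 below):

	/* before: the unusable global reserve is ignored */
	if ((space_info->bytes_used + space_info->bytes_reserved) >= thresh)
		return false;	/* plain full, preemptive flush won't help */

	/* after: count the global reserve as used space */
	if ((space_info->bytes_used + space_info->bytes_reserved +
	     fs_info->global_block_rsv.reserved) >= thresh)
		return false;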

Furthermore, I noticed issues where we would clamp entirely too quickly, and
where we made some poor decisions around delalloc.  None of these are
groundbreaking or huge improvements, but they offer somewhat better
performance in some of the fsperf test cases.  These are the results of my
most recent run with the whole patchset.  You'll notice a "regression" in
the 16g buffered write; this is mostly an artifact of the test itself: if
you look at the nightly performance test results you'll see this test
varies more than the others.  Re-testing with just that test didn't show
the same regression after multiple runs.  I could have run all of the tests
multiple times and averaged the results, but that takes a fair bit of time,
and individual runs of that test showed no regression and often showed an
improvement, so I feel comfortable calling it noise.  The full results are
as follows:

dbench60 results
  metric     baseline   current         diff
==================================================
qpathinfo       11.53     13.25    14.90%
throughput     446.23    434.52    -2.62%
flush         2502.92   1682.43   -32.78%
qfileinfo        0.92      1.17    27.29%
ntcreatex     1359.76    519.42   -61.80%
qfsinfo          1.77      3.76   112.64%
close            1.90      1.64   -13.91%
sfileinfo      209.76     76.43   -63.56%
rename        1110.08    518.40   -53.30%
find            13.84     13.13    -5.15%
unlink        1192.89    521.53   -56.28%
writex        1713.39   1321.39   -22.88%
deltree        280.34    296.33     5.70%
readx            3.16      2.91    -8.10%
mkdir            0.03      0.02   -46.67%
lockx            0.78      0.20   -73.89%
unlockx          0.16      0.12   -23.81%

emptyfiles500k results
     metric         baseline   current         diff
=========================================================
write_io_kbytes       125000     125000    0.00%
read_clat_ns_p99           0          0    0.00%
write_bw_bytes      1.79e+08   1.85e+08    3.04%
read_iops                  0          0    0.00%
write_clat_ns_p50      17728      17280   -2.53%
read_io_kbytes             0          0    0.00%
read_io_bytes              0          0    0.00%
write_clat_ns_p99      72704      68096   -6.34%
read_bw_bytes              0          0    0.00%
elapsed                    1          1    0.00%
write_lat_ns_min           0          0    0.00%
sys_cpu                91.23      89.16   -2.27%
write_lat_ns_max           0          0    0.00%
read_lat_ns_min            0          0    0.00%
write_iops          43763.97   45093.80    3.04%
read_lat_ns_max            0          0    0.00%
read_clat_ns_p50           0          0    0.00%

smallfiles100k results
     metric         baseline   current         diff
=========================================================
write_io_kbytes     2.04e+08   2.04e+08    0.00%
read_clat_ns_p99           0          0    0.00%
write_bw_bytes      1.40e+08   1.40e+08    0.50%
read_iops                  0          0    0.00%
write_clat_ns_p50       6712       6944    3.46%
read_io_kbytes             0          0    0.00%
read_io_bytes              0          0    0.00%
write_clat_ns_p99      16000      16512    3.20%
read_bw_bytes              0          0    0.00%
elapsed              1498.88       1491   -0.53%
write_lat_ns_min     2858.38       2919    2.12%
sys_cpu                 6.22       6.51    4.61%
write_lat_ns_max    1.31e+08   1.27e+08   -2.77%
read_lat_ns_min            0          0    0.00%
write_iops          34081.44   34253.51    0.50%
read_lat_ns_max            0          0    0.00%
read_clat_ns_p50           0          0    0.00%

dio4kbs16threads results
     metric          baseline     current          diff
=============================================================
write_io_kbytes         4360879    5312908    21.83%
read_clat_ns_p99              0          0     0.00%
write_bw_bytes      74302497.38   90667585    22.02%
read_iops                     0          0     0.00%
write_clat_ns_p50        243968     238592    -2.20%
read_io_kbytes                0          0     0.00%
read_io_bytes                 0          0     0.00%
write_clat_ns_p99      21094400   15007744   -28.85%
read_bw_bytes                 0          0     0.00%
elapsed                      61         61     0.00%
write_lat_ns_min       38183.25      37949    -0.61%
sys_cpu                    4.03       4.72    17.11%
write_lat_ns_max       1.68e+09   8.46e+08   -49.55%
read_lat_ns_min               0          0     0.00%
write_iops             18140.26   22135.64    22.02%
read_lat_ns_max               0          0     0.00%
read_clat_ns_p50              0          0     0.00%

randwrite2xram results
     metric          baseline     current          diff
=============================================================
write_io_kbytes        27720434   36563300    31.90%
read_clat_ns_p99              0          0     0.00%
write_bw_bytes      93268100.75   1.16e+08    24.83%
read_iops                     0          0     0.00%
write_clat_ns_p50         13168      16512    25.39%
read_io_kbytes                0          0     0.00%
read_io_bytes                 0          0     0.00%
write_clat_ns_p99         39360     125440   218.70%
read_bw_bytes                 0          0     0.00%
elapsed                  334.25        333    -0.37%
write_lat_ns_min        5436.75       5682     4.51%
sys_cpu                    8.22      12.57    52.96%
write_lat_ns_max       4.05e+10   2.73e+10   -32.65%
read_lat_ns_min               0          0     0.00%
write_iops             22770.53   28425.17    24.83%
read_lat_ns_max               0          0     0.00%
read_clat_ns_p50              0          0     0.00%

untarfirefox results
metric    baseline   current        diff
==============================================
elapsed      47.23     46.82   -0.87%

I'm still waiting on feedback from the user to make sure the patches fix the
reported problem, but they're worthwhile on their own even if they do not
resolve the originally reported issue.  Thanks,

Josef

[1] https://bugzilla.kernel.org/show_bug.cgi?id=212185

Josef Bacik (7):
  btrfs: check worker before need_preemptive_reclaim
  btrfs: only clamp the first time we have to start flushing
  btrfs: take into account global rsv in need_preemptive_reclaim
  btrfs: use the global rsv size in the preemptive thresh calculation
  btrfs: don't include the global rsv size in the preemptive used amount
  btrfs: only ignore delalloc if delalloc is much smaller than ordered
  btrfs: handle preemptive delalloc flushing slightly differently

 fs/btrfs/space-info.c | 56 +++++++++++++++++++++++++++++--------------
 1 file changed, 38 insertions(+), 18 deletions(-)

-- 
2.26.3



* [PATCH 1/7] btrfs: check worker before need_preemptive_reclaim
  2021-04-28 17:38 [PATCH 0/7] Preemptive flushing improvements Josef Bacik
@ 2021-04-28 17:38 ` Josef Bacik
  2021-04-29 13:35   ` Johannes Thumshirn
  2021-04-28 17:38 ` [PATCH 2/7] btrfs: only clamp the first time we have to start flushing Josef Bacik
                   ` (7 subsequent siblings)
  8 siblings, 1 reply; 21+ messages in thread
From: Josef Bacik @ 2021-04-28 17:38 UTC (permalink / raw)
  To: linux-btrfs, kernel-team

need_preemptive_reclaim() does some calculations which, while not heavy,
are pointless if we're already running preemptive reclaim.  Re-order the
checks so that we skip the calculations when reclaim is already in
progress.
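
In C, && evaluates left to right and stops at the first false operand, so
the reordered condition only pays for the calculations when the worker is
idle.  Schematically (hypothetical helper names, not the btrfs code):

	/* cheap flag test first, heavier math only if the worker is idle */
	if (!worker_is_busy() && reclaim_needed())
		kick_worker();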

Signed-off-by: Josef Bacik <josef@toxicpanda.com>
---
 fs/btrfs/space-info.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/fs/btrfs/space-info.c b/fs/btrfs/space-info.c
index 2dc674b7c3b1..c9a5e003bcfa 100644
--- a/fs/btrfs/space-info.c
+++ b/fs/btrfs/space-info.c
@@ -1588,8 +1588,8 @@ static int __reserve_bytes(struct btrfs_fs_info *fs_info,
 		 * the async reclaim as we will panic.
 		 */
 		if (!test_bit(BTRFS_FS_LOG_RECOVERING, &fs_info->flags) &&
-		    need_preemptive_reclaim(fs_info, space_info) &&
-		    !work_busy(&fs_info->preempt_reclaim_work)) {
+		    !work_busy(&fs_info->preempt_reclaim_work) &&
+		    need_preemptive_reclaim(fs_info, space_info)) {
 			trace_btrfs_trigger_flush(fs_info, space_info->flags,
 						  orig_bytes, flush, "preempt");
 			queue_work(system_unbound_wq,
-- 
2.26.3



* [PATCH 2/7] btrfs: only clamp the first time we have to start flushing
  2021-04-28 17:38 [PATCH 0/7] Preemptive flushing improvements Josef Bacik
  2021-04-28 17:38 ` [PATCH 1/7] btrfs: check worker before need_preemptive_reclaim Josef Bacik
@ 2021-04-28 17:38 ` Josef Bacik
  2021-04-28 17:38 ` [PATCH 3/7] btrfs: take into account global rsv in need_preemptive_reclaim Josef Bacik
                   ` (6 subsequent siblings)
  8 siblings, 0 replies; 21+ messages in thread
From: Josef Bacik @ 2021-04-28 17:38 UTC (permalink / raw)
  To: linux-btrfs, kernel-team

We were clamping the threshold for preemptive reclaim any time we added a
ticket to wait on, which means that with a lot of threads we'd essentially
max out the clamp the first time we started to flush.  Instead, only clamp
when we actually have to start flushing.  This makes us ramp up gradually
rather than jumping straight to maximum clamping as soon as we start
needing to flush.
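
For intuition: the preemptive threshold is later scaled by
"thresh >>= space_info->clamp", so every increment of the clamp halves the
threshold.  If maybe_clamp_preempt() bumps the clamp once per ticket (the
old behavior), a burst of N threads adding tickets could cut the threshold
by up to a factor of 2^N in one go, subject to whatever maximum the helper
enforces; bumping it once per flushing episode halves the threshold per
episode instead.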

Signed-off-by: Josef Bacik <josef@toxicpanda.com>
---
 fs/btrfs/space-info.c | 17 +++++++++--------
 1 file changed, 9 insertions(+), 8 deletions(-)

diff --git a/fs/btrfs/space-info.c b/fs/btrfs/space-info.c
index c9a5e003bcfa..33edab17af0d 100644
--- a/fs/btrfs/space-info.c
+++ b/fs/btrfs/space-info.c
@@ -1561,6 +1561,15 @@ static int __reserve_bytes(struct btrfs_fs_info *fs_info,
 		    flush == BTRFS_RESERVE_FLUSH_DATA) {
 			list_add_tail(&ticket.list, &space_info->tickets);
 			if (!space_info->flush) {
+				/*
+				 * We were forced to add a reserve ticket, so
+				 * our preemptive flushing is unable to keep
+				 * up.  Clamp down on the threshold for the
+				 * preemptive flushing in order to keep up with
+				 * the workload.
+				 */
+				maybe_clamp_preempt(fs_info, space_info);
+
 				space_info->flush = 1;
 				trace_btrfs_trigger_flush(fs_info,
 							  space_info->flags,
@@ -1572,14 +1581,6 @@ static int __reserve_bytes(struct btrfs_fs_info *fs_info,
 			list_add_tail(&ticket.list,
 				      &space_info->priority_tickets);
 		}
-
-		/*
-		 * We were forced to add a reserve ticket, so our preemptive
-		 * flushing is unable to keep up.  Clamp down on the threshold
-		 * for the preemptive flushing in order to keep up with the
-		 * workload.
-		 */
-		maybe_clamp_preempt(fs_info, space_info);
 	} else if (!ret && space_info->flags & BTRFS_BLOCK_GROUP_METADATA) {
 		used += orig_bytes;
 		/*
-- 
2.26.3



* [PATCH 3/7] btrfs: take into account global rsv in need_preemptive_reclaim
  2021-04-28 17:38 [PATCH 0/7] Preemptive flushing improvements Josef Bacik
  2021-04-28 17:38 ` [PATCH 1/7] btrfs: check worker before need_preemptive_reclaim Josef Bacik
  2021-04-28 17:38 ` [PATCH 2/7] btrfs: only clamp the first time we have to start flushing Josef Bacik
@ 2021-04-28 17:38 ` Josef Bacik
  2021-04-29 13:50   ` Johannes Thumshirn
  2021-04-29 20:03   ` David Sterba
  2021-04-28 17:38 ` [PATCH 4/7] btrfs: use the global rsv size in the preemptive thresh calculation Josef Bacik
                   ` (5 subsequent siblings)
  8 siblings, 2 replies; 21+ messages in thread
From: Josef Bacik @ 2021-04-28 17:38 UTC (permalink / raw)
  To: linux-btrfs, kernel-team

The global rsv can't be used for normal allocations, and on very full file
systems we can end up trying to async flush constantly even though there's
really not a lot of space to reclaim.  Deal with this by including the
global block rsv size in the "total used" calculation.
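
As a rough worked example (numbers made up): on a 10GiB metadata space_info
the 98% threshold is 9.8GiB.  With bytes_used + bytes_reserved at 9GiB the
old check would not trip, so we could keep flushing preemptively; adding a
1GiB global reserve brings the sum to 10GiB >= 9.8GiB, so we now correctly
treat the file system as plain full and skip the preemptive flushing.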

Signed-off-by: Josef Bacik <josef@toxicpanda.com>
---
 fs/btrfs/space-info.c | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/fs/btrfs/space-info.c b/fs/btrfs/space-info.c
index 33edab17af0d..e341f995a7dd 100644
--- a/fs/btrfs/space-info.c
+++ b/fs/btrfs/space-info.c
@@ -792,12 +792,14 @@ btrfs_calc_reclaim_metadata_size(struct btrfs_fs_info *fs_info,
 static bool need_preemptive_reclaim(struct btrfs_fs_info *fs_info,
 				    struct btrfs_space_info *space_info)
 {
+	u64 global_rsv_size = fs_info->global_block_rsv.reserved;
 	u64 ordered, delalloc;
 	u64 thresh = div_factor_fine(space_info->total_bytes, 98);
 	u64 used;
 
 	/* If we're just plain full then async reclaim just slows us down. */
-	if ((space_info->bytes_used + space_info->bytes_reserved) >= thresh)
+	if ((space_info->bytes_used + space_info->bytes_reserved +
+	     global_rsv_size) >= thresh)
 		return false;
 
 	/*
-- 
2.26.3



* [PATCH 4/7] btrfs: use the global rsv size in the preemptive thresh calculation
  2021-04-28 17:38 [PATCH 0/7] Preemptive flushing improvements Josef Bacik
                   ` (2 preceding siblings ...)
  2021-04-28 17:38 ` [PATCH 3/7] btrfs: take into account global rsv in need_preemptive_reclaim Josef Bacik
@ 2021-04-28 17:38 ` Josef Bacik
  2021-04-29 14:04   ` Johannes Thumshirn
  2021-04-28 17:38 ` [PATCH 5/7] btrfs: don't include the global rsv size in the preemptive used amount Josef Bacik
                   ` (4 subsequent siblings)
  8 siblings, 1 reply; 21+ messages in thread
From: Josef Bacik @ 2021-04-28 17:38 UTC (permalink / raw)
  To: linux-btrfs, kernel-team

We calculate the amount of "free" space available for normal reservations
by taking the total space and subtracting out the hard used space, that is,
the readonly, used, and reserved space.  However, we weren't taking into
account the global block rsv, which is essentially hard used space as well.
Handle this by subtracting it from the available free space, so that our
threshold more closely mirrors reality.
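
Note the new "if (used < space_info->total_bytes)" guard in the hunk below:
these counters are u64, and on tiny file systems adding the global reserve
can push "used" above total_bytes (see the discussion later in this
thread), in which case an unconditional "total_bytes - used" would
underflow and produce a huge threshold.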

Signed-off-by: Josef Bacik <josef@toxicpanda.com>
---
 fs/btrfs/space-info.c | 6 ++++--
 1 file changed, 4 insertions(+), 2 deletions(-)

diff --git a/fs/btrfs/space-info.c b/fs/btrfs/space-info.c
index e341f995a7dd..b0dd9b57d352 100644
--- a/fs/btrfs/space-info.c
+++ b/fs/btrfs/space-info.c
@@ -840,8 +840,10 @@ static bool need_preemptive_reclaim(struct btrfs_fs_info *fs_info,
 
 	thresh = calc_available_free_space(fs_info, space_info,
 					   BTRFS_RESERVE_FLUSH_ALL);
-	thresh += (space_info->total_bytes - space_info->bytes_used -
-		   space_info->bytes_reserved - space_info->bytes_readonly);
+	used = space_info->bytes_used + space_info->bytes_reserved +
+		space_info->bytes_readonly + global_rsv_size;
+	if (used < space_info->total_bytes)
+		thresh += space_info->total_bytes - used;
 	thresh >>= space_info->clamp;
 
 	used = space_info->bytes_pinned;
-- 
2.26.3



* [PATCH 5/7] btrfs: don't include the global rsv size in the preemptive used amount
  2021-04-28 17:38 [PATCH 0/7] Preemptive flushing improvements Josef Bacik
                   ` (3 preceding siblings ...)
  2021-04-28 17:38 ` [PATCH 4/7] btrfs: use the global rsv size in the preemptive thresh calculation Josef Bacik
@ 2021-04-28 17:38 ` Josef Bacik
  2021-04-29 14:19   ` Johannes Thumshirn
  2021-04-28 17:38 ` [PATCH 6/7] btrfs: only ignore delalloc if delalloc is much smaller than ordered Josef Bacik
                   ` (3 subsequent siblings)
  8 siblings, 1 reply; 21+ messages in thread
From: Josef Bacik @ 2021-04-28 17:38 UTC (permalink / raw)
  To: linux-btrfs, kernel-team

When deciding if we should preemptively flush space, we will add in the
amount of space used by all block rsvs.  However, this also includes the
global block rsv, which isn't flushable and so shouldn't be accounted for
in this calculation.  If we use ->bytes_may_use in the used calculation, we
need to subtract the global rsv size from it so that it more closely
matches the flushable space.
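
As a rough illustration (numbers made up): if ->bytes_may_use is 3GiB and
512MiB of that is the global reserve, only about 2.5GiB is actually
reclaimable by flushing.  Comparing the full 3GiB against the threshold
would overstate the reclaimable pressure and could keep the preemptive
flusher chasing space it can never free.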

Signed-off-by: Josef Bacik <josef@toxicpanda.com>
---
 fs/btrfs/space-info.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/fs/btrfs/space-info.c b/fs/btrfs/space-info.c
index b0dd9b57d352..4e3857474cfd 100644
--- a/fs/btrfs/space-info.c
+++ b/fs/btrfs/space-info.c
@@ -871,7 +871,7 @@ static bool need_preemptive_reclaim(struct btrfs_fs_info *fs_info,
 		used += fs_info->delayed_refs_rsv.reserved +
 			fs_info->delayed_block_rsv.reserved;
 	else
-		used += space_info->bytes_may_use;
+		used += space_info->bytes_may_use - global_rsv_size;
 
 	return (used >= thresh && !btrfs_fs_closing(fs_info) &&
 		!test_bit(BTRFS_FS_STATE_REMOUNTING, &fs_info->fs_state));
-- 
2.26.3



* [PATCH 6/7] btrfs: only ignore delalloc if delalloc is much smaller than ordered
  2021-04-28 17:38 [PATCH 0/7] Preemptive flushing improvements Josef Bacik
                   ` (4 preceding siblings ...)
  2021-04-28 17:38 ` [PATCH 5/7] btrfs: don't include the global rsv size in the preemptive used amount Josef Bacik
@ 2021-04-28 17:38 ` Josef Bacik
  2021-04-28 17:38 ` [PATCH 7/7] btrfs: handle preemptive delalloc flushing slightly differently Josef Bacik
                   ` (2 subsequent siblings)
  8 siblings, 0 replies; 21+ messages in thread
From: Josef Bacik @ 2021-04-28 17:38 UTC (permalink / raw)
  To: linux-btrfs, kernel-team

While testing heavy delalloc workloads I noticed that we'd sometimes just
stop preemptively flushing while we still had loads of delalloc available
to flush.  This is because we skip preemptive flushing if delalloc <=
ordered.  However, if we start with, say, 4GiB of delalloc and flush 2GiB
of it, we'll stop flushing there, even though we still have 2GiB of
delalloc left to flush.

Instead, adjust the ordered bytes down by half.  This way, if 2/3 of our
outstanding delalloc reservations are tied up by ordered extents, we don't
bother preemptively flushing, as we're getting close to the state where we
need to wait on ordered extents.
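
To check the math: with ordered halved, we only take the ordered-heavy path
when ordered / 2 >= delalloc, i.e. ordered >= 2 * delalloc, which means
ordered extents hold at least 2/3 of the combined ordered + delalloc bytes.
In the example above, after flushing 2GiB we would have
delalloc == ordered == 2GiB; the halved comparison (1GiB >= 2GiB) is false,
so we keep flushing, whereas the old comparison (2GiB >= 2GiB) would have
stopped us.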

Signed-off-by: Josef Bacik <josef@toxicpanda.com>
---
 fs/btrfs/space-info.c | 8 +++++++-
 1 file changed, 7 insertions(+), 1 deletion(-)

diff --git a/fs/btrfs/space-info.c b/fs/btrfs/space-info.c
index 4e3857474cfd..cf09b23f3448 100644
--- a/fs/btrfs/space-info.c
+++ b/fs/btrfs/space-info.c
@@ -864,8 +864,14 @@ static bool need_preemptive_reclaim(struct btrfs_fs_info *fs_info,
 	 * clearly be heavy enough to warrant preemptive flushing.  In the case
 	 * of heavy DIO or ordered reservations, preemptive flushing will just
 	 * waste time and cause us to slow down.
+	 *
+	 * We want to make sure we truly are maxed out on ordered however, so
+	 * cut ordered in half, and if it's still higher than delalloc then we
+	 * can keep flushing.  This is to avoid the case where we start
+	 * flushing, and now delalloc == ordered and we stop preemptively
+	 * flushing when we could still have several gigs of delalloc to flush.
 	 */
-	ordered = percpu_counter_read_positive(&fs_info->ordered_bytes);
+	ordered = percpu_counter_read_positive(&fs_info->ordered_bytes) >> 1;
 	delalloc = percpu_counter_read_positive(&fs_info->delalloc_bytes);
 	if (ordered >= delalloc)
 		used += fs_info->delayed_refs_rsv.reserved +
-- 
2.26.3



* [PATCH 7/7] btrfs: handle preemptive delalloc flushing slightly differently
  2021-04-28 17:38 [PATCH 0/7] Preemptive flushing improvements Josef Bacik
                   ` (5 preceding siblings ...)
  2021-04-28 17:38 ` [PATCH 6/7] btrfs: only ignore delalloc if delalloc is much smaller than ordered Josef Bacik
@ 2021-04-28 17:38 ` Josef Bacik
       [not found] ` <20210429094852.DAC3.409509F4@e16-tech.com>
  2021-04-30 15:47 ` David Sterba
  8 siblings, 0 replies; 21+ messages in thread
From: Josef Bacik @ 2021-04-28 17:38 UTC (permalink / raw)
  To: linux-btrfs, kernel-team

If we decide to flush delalloc from the preemptive flusher, we really do
not want to wait on ordered extents, as it gains us nothing.  However,
there was logic to go ahead and wait on ordered extents if there were more
ordered bytes than delalloc bytes.  We do not want this behavior, so pass
down whether this flushing is for preemption, and do not wait for ordered
extents in that case.  Also break out of the shrink loop after the first
flush, since for preemption we only want a one-shot shrink of delalloc.
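
After this change the two cases look like the following (a sketch of the
resulting calls, matching the hunks below):

	/* ticketed flushing: may loop and wait on ordered extents */
	shrink_delalloc(fs_info, space_info, to_reclaim,
			state == FLUSH_DELALLOC_WAIT, false);

	/* preemptive flushing: one pass, never waits on ordered extents */
	shrink_delalloc(fs_info, space_info, to_reclaim,
			state == FLUSH_DELALLOC_WAIT, true);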

Signed-off-by: Josef Bacik <josef@toxicpanda.com>
---
 fs/btrfs/space-info.c | 15 ++++++++++++---
 1 file changed, 12 insertions(+), 3 deletions(-)

diff --git a/fs/btrfs/space-info.c b/fs/btrfs/space-info.c
index cf09b23f3448..b2d834b92811 100644
--- a/fs/btrfs/space-info.c
+++ b/fs/btrfs/space-info.c
@@ -495,7 +495,8 @@ static inline u64 calc_reclaim_items_nr(struct btrfs_fs_info *fs_info,
  */
 static void shrink_delalloc(struct btrfs_fs_info *fs_info,
 			    struct btrfs_space_info *space_info,
-			    u64 to_reclaim, bool wait_ordered)
+			    u64 to_reclaim, bool wait_ordered,
+			    bool for_preempt)
 {
 	struct btrfs_trans_handle *trans;
 	u64 delalloc_bytes;
@@ -532,7 +533,7 @@ static void shrink_delalloc(struct btrfs_fs_info *fs_info,
 	 * ordered extents, otherwise we'll waste time trying to flush delalloc
 	 * that likely won't give us the space back we need.
 	 */
-	if (ordered_bytes > delalloc_bytes)
+	if (ordered_bytes > delalloc_bytes && !for_preempt)
 		wait_ordered = true;
 
 	loops = 0;
@@ -551,6 +552,14 @@ static void shrink_delalloc(struct btrfs_fs_info *fs_info,
 				break;
 		}
 
+		/*
+		 * If we are for preemption we just want a one-shot of delalloc
+		 * flushing so we can stop flushing if we decide we don't need
+		 * to anymore.
+		 */
+		if (for_preempt)
+			break;
+
 		spin_lock(&space_info->lock);
 		if (list_empty(&space_info->tickets) &&
 		    list_empty(&space_info->priority_tickets)) {
@@ -702,7 +711,7 @@ static void flush_space(struct btrfs_fs_info *fs_info,
 	case FLUSH_DELALLOC:
 	case FLUSH_DELALLOC_WAIT:
 		shrink_delalloc(fs_info, space_info, num_bytes,
-				state == FLUSH_DELALLOC_WAIT);
+				state == FLUSH_DELALLOC_WAIT, for_preempt);
 		break;
 	case FLUSH_DELAYED_REFS_NR:
 	case FLUSH_DELAYED_REFS:
-- 
2.26.3



* Re: [PATCH 1/7] btrfs: check worker before need_preemptive_reclaim
  2021-04-28 17:38 ` [PATCH 1/7] btrfs: check worker before need_preemptive_reclaim Josef Bacik
@ 2021-04-29 13:35   ` Johannes Thumshirn
  0 siblings, 0 replies; 21+ messages in thread
From: Johannes Thumshirn @ 2021-04-29 13:35 UTC (permalink / raw)
  To: Josef Bacik, linux-btrfs, kernel-team

Looks good,
Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>


* Re: [PATCH 3/7] btrfs: take into account global rsv in need_preemptive_reclaim
  2021-04-28 17:38 ` [PATCH 3/7] btrfs: take into account global rsv in need_preemptive_reclaim Josef Bacik
@ 2021-04-29 13:50   ` Johannes Thumshirn
  2021-04-29 20:03   ` David Sterba
  1 sibling, 0 replies; 21+ messages in thread
From: Johannes Thumshirn @ 2021-04-29 13:50 UTC (permalink / raw)
  To: Josef Bacik, linux-btrfs, kernel-team

Looks good,
Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>


* Re: [PATCH 4/7] btrfs: use the global rsv size in the preemptive thresh calculation
  2021-04-28 17:38 ` [PATCH 4/7] btrfs: use the global rsv size in the preemptive thresh calculation Josef Bacik
@ 2021-04-29 14:04   ` Johannes Thumshirn
  2021-04-29 15:19     ` Josef Bacik
  0 siblings, 1 reply; 21+ messages in thread
From: Johannes Thumshirn @ 2021-04-29 14:04 UTC (permalink / raw)
  To: Josef Bacik, linux-btrfs, kernel-team

On 28/04/2021 19:40, Josef Bacik wrote:
> -	thresh += (space_info->total_bytes - space_info->bytes_used -
> -		   space_info->bytes_reserved - space_info->bytes_readonly);
> +	used = space_info->bytes_used + space_info->bytes_reserved +
> +		space_info->bytes_readonly + global_rsv_size;
> +	if (used < space_info->total_bytes)
> +		thresh += space_info->total_bytes - used;
>  	thresh >>= space_info->clamp;


I don't quite understand why you introduced the 'if' here.  Now we're only
adding the remaining free space to the threshold if we're using less than
our total space, which kinda makes sense, since we shouldn't be going over
our total space anyway.


* Re: [PATCH 5/7] btrfs: don't include the global rsv size in the preemptive used amount
  2021-04-28 17:38 ` [PATCH 5/7] btrfs: don't include the global rsv size in the preemptive used amount Josef Bacik
@ 2021-04-29 14:19   ` Johannes Thumshirn
  0 siblings, 0 replies; 21+ messages in thread
From: Johannes Thumshirn @ 2021-04-29 14:19 UTC (permalink / raw)
  To: Josef Bacik, linux-btrfs, kernel-team

Looks good,
Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>


* Re: [PATCH 0/7] Preemptive flushing improvements
       [not found] ` <20210429094852.DAC3.409509F4@e16-tech.com>
@ 2021-04-29 15:06   ` Josef Bacik
  2021-04-29 21:40     ` Wang Yugui
  0 siblings, 1 reply; 21+ messages in thread
From: Josef Bacik @ 2021-04-29 15:06 UTC (permalink / raw)
  To: Wang Yugui; +Cc: linux-btrfs, kernel-team

On 4/28/21 9:48 PM, Wang Yugui wrote:
> Hi,
> 
> xfstests generic/562 failed with this patch.
> 
> This is the file 562.out.bad.
> 

It's not failing for me; what is your config like?  Both the local.config
and the type/size of your devices.  Thanks,

Josef


* Re: [PATCH 4/7] btrfs: use the global rsv size in the preemptive thresh calculation
  2021-04-29 14:04   ` Johannes Thumshirn
@ 2021-04-29 15:19     ` Josef Bacik
  0 siblings, 0 replies; 21+ messages in thread
From: Josef Bacik @ 2021-04-29 15:19 UTC (permalink / raw)
  To: Johannes Thumshirn, linux-btrfs, kernel-team

On 4/29/21 10:04 AM, Johannes Thumshirn wrote:
> On 28/04/2021 19:40, Josef Bacik wrote:
>> -	thresh += (space_info->total_bytes - space_info->bytes_used -
>> -		   space_info->bytes_reserved - space_info->bytes_readonly);
>> +	used = space_info->bytes_used + space_info->bytes_reserved +
>> +		space_info->bytes_readonly + global_rsv_size;
>> +	if (used < space_info->total_bytes)
>> +		thresh += space_info->total_bytes - used;
>>   	thresh >>= space_info->clamp;
> 
> 
> I don't quite understand why you introduced the 'if' here.  Now we're only
> adding the remaining free space to the threshold if we're using less than
> our total space, which kinda makes sense, since we shouldn't be going over
> our total space anyway.
> 

Because it's possible for global_rsv_size + used to be > total_bytes, since
sometimes the global reserve can end up being calculated as larger than the
available size (think stupidly small fs'es where we only have the original
8MiB chunk of metadata).  It doesn't usually happen, but that sort of
thinking gets me into trouble, so this is safer.
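
A rough worked example with made-up numbers: total_bytes = 8MiB,
bytes_used + bytes_reserved + bytes_readonly = 2MiB and global_rsv_size =
7MiB gives used = 9MiB > total_bytes, at which point the u64 subtraction
"total_bytes - used" would wrap around and produce an enormous threshold.
Thanks,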

Josef


* Re: [PATCH 3/7] btrfs: take into account global rsv in need_preemptive_reclaim
  2021-04-28 17:38 ` [PATCH 3/7] btrfs: take into account global rsv in need_preemptive_reclaim Josef Bacik
  2021-04-29 13:50   ` Johannes Thumshirn
@ 2021-04-29 20:03   ` David Sterba
  2021-04-29 20:05     ` David Sterba
  1 sibling, 1 reply; 21+ messages in thread
From: David Sterba @ 2021-04-29 20:03 UTC (permalink / raw)
  To: Josef Bacik; +Cc: linux-btrfs, kernel-team

On Wed, Apr 28, 2021 at 01:38:44PM -0400, Josef Bacik wrote:
> Global rsv can't be used for normal allocations, and for very full file
> systems we can decide to try and async flush constantly even though
> there's really not a lot of space to reclaim.  Deal with this by
> including the global block rsv size in the "total used" calculation.
> 
> Signed-off-by: Josef Bacik <josef@toxicpanda.com>
> ---
>  fs/btrfs/space-info.c | 4 +++-
>  1 file changed, 3 insertions(+), 1 deletion(-)
> 
> diff --git a/fs/btrfs/space-info.c b/fs/btrfs/space-info.c
> index 33edab17af0d..e341f995a7dd 100644
> --- a/fs/btrfs/space-info.c
> +++ b/fs/btrfs/space-info.c
> @@ -792,12 +792,14 @@ btrfs_calc_reclaim_metadata_size(struct btrfs_fs_info *fs_info,
>  static bool need_preemptive_reclaim(struct btrfs_fs_info *fs_info,
>  				    struct btrfs_space_info *space_info)
>  {
> +	u64 global_rsv_size = fs_info->global_block_rsv.reserved;
>  	u64 ordered, delalloc;
>  	u64 thresh = div_factor_fine(space_info->total_bytes, 98);
>  	u64 used;
>  
>  	/* If we're just plain full then async reclaim just slows us down. */
> -	if ((space_info->bytes_used + space_info->bytes_reserved) >= thresh)
> +	if ((space_info->bytes_used + space_info->bytes_reserved +
> +	     global_rsv_size) >= thresh)

That's an extra variable for one-time use.  Do you have plans to use it in
further patches?  Regarding readability, it has to be put on a separate
line anyway, so there's space for the whole identifier; but if you intend
to use it later then it's ok to keep it.


* Re: [PATCH 3/7] btrfs: take into account global rsv in need_preemptive_reclaim
  2021-04-29 20:03   ` David Sterba
@ 2021-04-29 20:05     ` David Sterba
  0 siblings, 0 replies; 21+ messages in thread
From: David Sterba @ 2021-04-29 20:05 UTC (permalink / raw)
  To: dsterba, Josef Bacik, linux-btrfs, kernel-team

On Thu, Apr 29, 2021 at 10:03:10PM +0200, David Sterba wrote:
> On Wed, Apr 28, 2021 at 01:38:44PM -0400, Josef Bacik wrote:
> > [ ... ]
> > @@ -792,12 +792,14 @@ btrfs_calc_reclaim_metadata_size(struct btrfs_fs_info *fs_info,
> >  static bool need_preemptive_reclaim(struct btrfs_fs_info *fs_info,
> >  				    struct btrfs_space_info *space_info)
> >  {
> > +	u64 global_rsv_size = fs_info->global_block_rsv.reserved;
> >  	u64 ordered, delalloc;
> >  	u64 thresh = div_factor_fine(space_info->total_bytes, 98);
> >  	u64 used;
> >  
> >  	/* If we're just plain full then async reclaim just slows us down. */
> > -	if ((space_info->bytes_used + space_info->bytes_reserved) >= thresh)
> > +	if ((space_info->bytes_used + space_info->bytes_reserved +
> > +	     global_rsv_size) >= thresh)
> 
> That's an extra variable for one-time use.  Do you have plans to use it in
> further patches?  Regarding readability, it has to be put on a separate
> line anyway, so there's space for the whole identifier; but if you intend
> to use it later then it's ok to keep it.

Never mind, I got fooled by diff -p, whose hunk context shows the name of
the previous function (btrfs_calc_reclaim_metadata_size) rather than the
actual function the code is being added to (need_preemptive_reclaim).


* Re: [PATCH 0/7] Preemptive flushing improvements
  2021-04-29 15:06   ` [PATCH 0/7] Preemptive flushing improvements Josef Bacik
@ 2021-04-29 21:40     ` Wang Yugui
  2021-04-29 22:23       ` Wang Yugui
  0 siblings, 1 reply; 21+ messages in thread
From: Wang Yugui @ 2021-04-29 21:40 UTC (permalink / raw)
  To: Josef Bacik; +Cc: linux-btrfs, kernel-team

Hi,

> On 4/28/21 9:48 PM, Wang Yugui wrote:
> > Hi,
> >
> > xfstests generic/562 failed with this patch.
> >
> > This is the file 562.out.bad.
> > 
> It's not failing for me; what is your config like?  Both the local.config and the type/size of your devices.  Thanks,
> 

# cat xfstests/local.config

dev_base_disk=/dev/disk/by-id/nvme-Dell_Express_Flash_NVMe_SM1715_800GB_SFF__S29HNYAH500207

export TEST_DIR=/mnt/test

export TEST_DEV=${dev_base_disk}-part6
export TEST_DEV=$(realpath $TEST_DEV)


export SCRATCH_MNT=/mnt/scratch

export SCRATCH_DEV_POOL="${dev_base_disk}-part1 ${dev_base_disk}-part2
${dev_base_disk}-part3 ${dev_base_disk}-part4 ${dev_base_disk}-part5"
export SCRATCH_DEV_POOL=$(realpath $SCRATCH_DEV_POOL | xargs /bin/echo )

#LOGWRITES_DEV=


# parted /dev/disk/by-id/nvme-Dell_Express_Flash_NVMe_SM1715_800GB_SFF__S29HNYAH500207 unit s print
Model: NVMe Device (nvme)
Disk /dev/nvme0n1: 1562824368s
Sector size (logical/physical): 512B/512B
Partition Table: gpt
Disk Flags:

Number  Start       End         Size       File system  Name     Flags
 1      2097152s    54525951s   52428800s  btrfs        primary
 2      54525952s   106954751s  52428800s               primary
 3      106954752s  159383551s  52428800s               primary
 4      159383552s  211812351s  52428800s               primary
 5      211812352s  264241151s  52428800s               primary
 6      264241152s  316669951s  52428800s  btrfs        primary

Best Regards
Wang Yugui (wangyugui@e16-tech.com)
2021/04/30




* Re: [PATCH 0/7] Preemptive flushing improvements
  2021-04-29 21:40     ` Wang Yugui
@ 2021-04-29 22:23       ` Wang Yugui
  2021-04-30 15:28         ` David Sterba
  0 siblings, 1 reply; 21+ messages in thread
From: Wang Yugui @ 2021-04-29 22:23 UTC (permalink / raw)
  To: Josef Bacik, linux-btrfs, kernel-team

Hi,

> [ previous reply quoted in full: local.config and partition layout trimmed ]


By the way, we enable '-O no-holes -R free-space-tree' by default for
mkfs.btrfs.

Best Regards
Wang Yugui (wangyugui@e16-tech.com)
2021/04/30




* Re: [PATCH 0/7] Preemptive flushing improvements
  2021-04-29 22:23       ` Wang Yugui
@ 2021-04-30 15:28         ` David Sterba
  2021-04-30 15:43           ` Wang Yugui
  0 siblings, 1 reply; 21+ messages in thread
From: David Sterba @ 2021-04-30 15:28 UTC (permalink / raw)
  To: Wang Yugui; +Cc: Josef Bacik, linux-btrfs, kernel-team

On Fri, Apr 30, 2021 at 06:23:08AM +0800, Wang Yugui wrote:
> > > > This is the file 562.out.bad.

Can you please post again the error?

> By the way, we enable '-O no-holes -R free-space-tree' by default for
> mkfs.btrfs.

The fstests cases are not robust enough to catch differences in the
output when various features are enabled, so it depends on the exact
error.


* Re: [PATCH 0/7] Preemptive flushing improvements
  2021-04-30 15:28         ` David Sterba
@ 2021-04-30 15:43           ` Wang Yugui
  0 siblings, 0 replies; 21+ messages in thread
From: Wang Yugui @ 2021-04-30 15:43 UTC (permalink / raw)
  To: dsterba, Wang Yugui, Josef Bacik, linux-btrfs, kernel-team

Hi,

> On Fri, Apr 30, 2021 at 06:23:08AM +0800, Wang Yugui wrote:
> > > > > This is the file 562.out.bad.
> 
> Can you please post again the error?

This is the head of 562.out.bad (the whole file is 2.55M).

QA output created by 562
XFS_IOC_CLONE: No space left on device
File foo data after cloning and remount:
0000000 c7 c7 c7 c7 c7 c7 c7 c7 c7 c7 c7 c7 c7 c7 c7 c7
*
25509888 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
*
25513984 e5 e5 e5 e5 e5 e5 e5 e5 e5 e5 e5 e5 e5 e5 e5 e5
*
25518080 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
*
25522176 e5 e5 e5 e5 e5 e5 e5 e5 e5 e5 e5 e5 e5 e5 e5 e5
<repeat of 00 00.../e5 e5...>

> > By the way, we enable '-O no-holes -R free-space-tree' by default for
> > mkfs.btrfs.
> 
> The fstests cases are not robust enough to catch differences in the
> output when various features are enabled, so it depends on the exact
> error.

Best Regards
Wang Yugui (wangyugui@e16-tech.com)
2021/04/30




* Re: [PATCH 0/7] Preemptive flushing improvements
  2021-04-28 17:38 [PATCH 0/7] Preemptive flushing improvements Josef Bacik
                   ` (7 preceding siblings ...)
       [not found] ` <20210429094852.DAC3.409509F4@e16-tech.com>
@ 2021-04-30 15:47 ` David Sterba
  8 siblings, 0 replies; 21+ messages in thread
From: David Sterba @ 2021-04-30 15:47 UTC (permalink / raw)
  To: Josef Bacik; +Cc: linux-btrfs, kernel-team

On Wed, Apr 28, 2021 at 01:38:41PM -0400, Josef Bacik wrote:
> The following patch series is a set of improvements around our preemptive
> flushing infrastructure.  A user reported[1] a problem where the preemptive
> flushing was running non-stop on his full file system.  I was not able to
> reproduce this behavior, but I noticed a few fundamental issues with how we
> decide to do preemptive flushing.  The main one is that we were not taking
> the global reserve into account when deciding whether we needed to
> preemptively flush.  In cases such as the user's file system, where the
> majority of the free metadata space is taken up by the global reserve, we
> could potentially always want to flush, which is annoying and not what we
> want.
> 
> [ remainder of cover letter, fsperf results, and patch list trimmed ]

It would be good to summarize the noticeable improvements over all the
tests and put that at least into the last patch, assuming that's where
the improvements were produced.

The tests passed for me here so I'll add the patches to misc-next for
more coverage.


end of thread

Thread overview: 21+ messages
2021-04-28 17:38 [PATCH 0/7] Preemptive flushing improvements Josef Bacik
2021-04-28 17:38 ` [PATCH 1/7] btrfs: check worker before need_preemptive_reclaim Josef Bacik
2021-04-29 13:35   ` Johannes Thumshirn
2021-04-28 17:38 ` [PATCH 2/7] btrfs: only clamp the first time we have to start flushing Josef Bacik
2021-04-28 17:38 ` [PATCH 3/7] btrfs: take into account global rsv in need_preemptive_reclaim Josef Bacik
2021-04-29 13:50   ` Johannes Thumshirn
2021-04-29 20:03   ` David Sterba
2021-04-29 20:05     ` David Sterba
2021-04-28 17:38 ` [PATCH 4/7] btrfs: use the global rsv size in the preemptive thresh calculation Josef Bacik
2021-04-29 14:04   ` Johannes Thumshirn
2021-04-29 15:19     ` Josef Bacik
2021-04-28 17:38 ` [PATCH 5/7] btrfs: don't include the global rsv size in the preemptive used amount Josef Bacik
2021-04-29 14:19   ` Johannes Thumshirn
2021-04-28 17:38 ` [PATCH 6/7] btrfs: only ignore delalloc if delalloc is much smaller than ordered Josef Bacik
2021-04-28 17:38 ` [PATCH 7/7] btrfs: handle preemptive delalloc flushing slightly differently Josef Bacik
     [not found] ` <20210429094852.DAC3.409509F4@e16-tech.com>
2021-04-29 15:06   ` [PATCH 0/7] Preemptive flushing improvements Josef Bacik
2021-04-29 21:40     ` Wang Yugui
2021-04-29 22:23       ` Wang Yugui
2021-04-30 15:28         ` David Sterba
2021-04-30 15:43           ` Wang Yugui
2021-04-30 15:47 ` David Sterba
