fio.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH 2/5] ioengines: don't record issue_time if ioengines already do it
       [not found]   ` <CGME20220614155830uscas1p15aaa5d3bc8eec3da42c78a48520c6dd9@uscas1p1.samsung.com>
@ 2022-06-14 15:58     ` Vincent Fu
  2022-06-15 15:26       ` Nick Neumann
  0 siblings, 1 reply; 10+ messages in thread
From: Vincent Fu @ 2022-06-14 15:58 UTC (permalink / raw)
  To: axboe, fio; +Cc: Vincent Fu

io_uring, io_uring_cmd, and libaio record issue_time inside the ioengine
code when their commit functions are called. So we don't need to record
issue_time again for these ioengines in td_io_queue.

If we do fill issue_time twice, then mean(slat) + mean(clat) !=
mean(lat):

user@ubuntu:~/fio-dev$ fio-canonical/fio --name=test --ioengine=io_uring --number_ios=1 --rw=randread --size=1M
test: (g=0): rw=randread, bs=(R) 4096B-4096B, (W) 4096B-4096B, (T) 4096B-4096B, ioengine=io_uring, iodepth=1
fio-3.30-48-g26fa
Starting 1 process

test: (groupid=0, jobs=1): err= 0: pid=172145: Mon Jun  6 16:12:42 2022
  read: IOPS=1000, BW=4000KiB/s (4096kB/s)(4096B/1msec)
    slat (nsec): min=61424, max=61424, avg=61424.00, stdev= 0.00
    clat (nsec): min=242709, max=242709, avg=242709.00, stdev= 0.00
     lat (nsec): min=308346, max=308346, avg=308346.00, stdev= 0.00

61424 + 242709 = 304133 != 308346

If we fill issue_time only once, then the equality will hold (as it
should):

user@ubuntu:~/fio-dev$ fio-latency/fio --name=test --ioengine=io_uring --number_ios=1 --rw=randread --size=1M
test: (g=0): rw=randread, bs=(R) 4096B-4096B, (W) 4096B-4096B, (T) 4096B-4096B, ioengine=io_uring, iodepth=1
fio-3.30-48-g26fa-dirty
Starting 1 process

test: (groupid=0, jobs=1): err= 0: pid=172220: Mon Jun  6 16:12:47 2022
  read: IOPS=1000, BW=4000KiB/s (4096kB/s)(4096B/1msec)
    slat (nsec): min=53701, max=53701, avg=53701.00, stdev= 0.00
    clat (nsec): min=259566, max=259566, avg=259566.00, stdev= 0.00
     lat (nsec): min=313267, max=313267, avg=313267.00, stdev= 0.00

53701 + 259566 = 313267

Signed-off-by: Vincent Fu <vincent.fu@samsung.com>
---
 engines/io_uring.c | 7 +++++--
 engines/libaio.c   | 3 ++-
 ioengines.c        | 3 ++-
 ioengines.h        | 2 ++
 4 files changed, 11 insertions(+), 4 deletions(-)

diff --git a/engines/io_uring.c b/engines/io_uring.c
index cceafe69..474d215c 100644
--- a/engines/io_uring.c
+++ b/engines/io_uring.c
@@ -1191,7 +1191,8 @@ static int fio_ioring_cmd_get_max_open_zones(struct thread_data *td,
 static struct ioengine_ops ioengine_uring = {
 	.name			= "io_uring",
 	.version		= FIO_IOOPS_VERSION,
-	.flags			= FIO_ASYNCIO_SYNC_TRIM | FIO_NO_OFFLOAD,
+	.flags			= FIO_ASYNCIO_SYNC_TRIM | FIO_NO_OFFLOAD |
+					FIO_ASYNCIO_SETS_ISSUE_TIME,
 	.init			= fio_ioring_init,
 	.post_init		= fio_ioring_post_init,
 	.io_u_init		= fio_ioring_io_u_init,
@@ -1211,7 +1212,9 @@ static struct ioengine_ops ioengine_uring = {
 static struct ioengine_ops ioengine_uring_cmd = {
 	.name			= "io_uring_cmd",
 	.version		= FIO_IOOPS_VERSION,
-	.flags			= FIO_ASYNCIO_SYNC_TRIM | FIO_NO_OFFLOAD | FIO_MEMALIGN | FIO_RAWIO,
+	.flags			= FIO_ASYNCIO_SYNC_TRIM | FIO_NO_OFFLOAD |
+					FIO_MEMALIGN | FIO_RAWIO |
+					FIO_ASYNCIO_SETS_ISSUE_TIME,
 	.init			= fio_ioring_init,
 	.post_init		= fio_ioring_cmd_post_init,
 	.io_u_init		= fio_ioring_io_u_init,
diff --git a/engines/libaio.c b/engines/libaio.c
index 9c278d06..da5279f4 100644
--- a/engines/libaio.c
+++ b/engines/libaio.c
@@ -511,7 +511,8 @@ static int fio_libaio_init(struct thread_data *td)
 FIO_STATIC struct ioengine_ops ioengine = {
 	.name			= "libaio",
 	.version		= FIO_IOOPS_VERSION,
-	.flags			= FIO_ASYNCIO_SYNC_TRIM,
+	.flags			= FIO_ASYNCIO_SYNC_TRIM |
+					FIO_ASYNCIO_SETS_ISSUE_TIME,
 	.init			= fio_libaio_init,
 	.post_init		= fio_libaio_post_init,
 	.prep			= fio_libaio_prep,
diff --git a/ioengines.c b/ioengines.c
index 280da3c8..e4ad698c 100644
--- a/ioengines.c
+++ b/ioengines.c
@@ -442,7 +442,8 @@ enum fio_q_status td_io_queue(struct thread_data *td, struct io_u *io_u)
 
 	if (!td_ioengine_flagged(td, FIO_SYNCIO) &&
 		!async_ioengine_sync_trim(td, io_u)) {
-		if (fio_fill_issue_time(td))
+		if (fio_fill_issue_time(td) &&
+			!td_ioengine_flagged(td, FIO_ASYNCIO_SETS_ISSUE_TIME))
 			fio_gettime(&io_u->issue_time, NULL);
 
 		/*
diff --git a/ioengines.h b/ioengines.h
index acdb0071..fafa1e48 100644
--- a/ioengines.h
+++ b/ioengines.h
@@ -83,6 +83,8 @@ enum fio_ioengine_flags {
 	FIO_ASYNCIO_SYNC_TRIM
 			= 1 << 14,	/* io engine has async ->queue except for trim */
 	FIO_NO_OFFLOAD	= 1 << 15,	/* no async offload */
+	FIO_ASYNCIO_SETS_ISSUE_TIME
+			= 1 << 16,	/* async ioengine with commit function that sets issue_time */
 };
 
 /*
-- 
2.25.1

^ permalink raw reply related	[flat|nested] 10+ messages in thread

* [PATCH 5/5] ioengines: clean up latency accounting for 3 ioengines
       [not found]   ` <CGME20220614155831uscas1p16ab7450c13a1587f2058617130f4f122@uscas1p1.samsung.com>
@ 2022-06-14 15:58     ` Vincent Fu
  0 siblings, 0 replies; 10+ messages in thread
From: Vincent Fu @ 2022-06-14 15:58 UTC (permalink / raw)
  To: axboe, fio; +Cc: Vincent Fu

The librpma_apm_client, librpma_gpspm_client, and rdma ioengines have
commit functions that record submission latency. In order to avoid
setting issue_time twice add the FIO_ASYNCIO_SETS_ISSUE_TIME flag. Also
add code to update the iolog issue time when needed.

I don't have the means to test this patch.

Signed-off-by: Vincent Fu <vincent.fu@samsung.com>
---
 engines/librpma_apm.c   | 2 +-
 engines/librpma_fio.c   | 9 ++++++++-
 engines/librpma_gpspm.c | 2 +-
 engines/rdma.c          | 9 ++++++++-
 4 files changed, 18 insertions(+), 4 deletions(-)

diff --git a/engines/librpma_apm.c b/engines/librpma_apm.c
index d1166ad8..896240dd 100644
--- a/engines/librpma_apm.c
+++ b/engines/librpma_apm.c
@@ -208,7 +208,7 @@ FIO_STATIC struct ioengine_ops ioengine_client = {
 	.errdetails		= librpma_fio_client_errdetails,
 	.close_file		= librpma_fio_file_nop,
 	.cleanup		= client_cleanup,
-	.flags			= FIO_DISKLESSIO,
+	.flags			= FIO_DISKLESSIO | FIO_ASYNCIO_SETS_ISSUE_TIME,
 	.options		= librpma_fio_options,
 	.option_struct_size	= sizeof(struct librpma_fio_options_values),
 };
diff --git a/engines/librpma_fio.c b/engines/librpma_fio.c
index 34818904..a78a1e57 100644
--- a/engines/librpma_fio.c
+++ b/engines/librpma_fio.c
@@ -621,9 +621,16 @@ int librpma_fio_client_commit(struct thread_data *td)
 		}
 	}
 
-	if ((fill_time = fio_fill_issue_time(td)))
+	if ((fill_time = fio_fill_issue_time(td))) {
 		fio_gettime(&now, NULL);
 
+		/*
+		 * only used for iolog
+		 */
+		if (td->o.read_iolog_file)
+			memcpy(&td->last_issue, &now, sizeof(now));
+
+	}
 	/* move executed io_us from queued[] to flight[] */
 	for (i = 0; i < ccd->io_u_queued_nr; i++) {
 		struct io_u *io_u = ccd->io_us_queued[i];
diff --git a/engines/librpma_gpspm.c b/engines/librpma_gpspm.c
index 5cf97472..f00717a7 100644
--- a/engines/librpma_gpspm.c
+++ b/engines/librpma_gpspm.c
@@ -352,7 +352,7 @@ FIO_STATIC struct ioengine_ops ioengine_client = {
 	.errdetails		= librpma_fio_client_errdetails,
 	.close_file		= librpma_fio_file_nop,
 	.cleanup		= client_cleanup,
-	.flags			= FIO_DISKLESSIO,
+	.flags			= FIO_DISKLESSIO | FIO_ASYNCIO_SETS_ISSUE_TIME,
 	.options		= librpma_fio_options,
 	.option_struct_size	= sizeof(struct librpma_fio_options_values),
 };
diff --git a/engines/rdma.c b/engines/rdma.c
index 4eb86652..e3bb2567 100644
--- a/engines/rdma.c
+++ b/engines/rdma.c
@@ -832,6 +832,12 @@ static void fio_rdmaio_queued(struct thread_data *td, struct io_u **io_us,
 		memcpy(&io_u->issue_time, &now, sizeof(now));
 		io_u_queued(td, io_u);
 	}
+
+	/*
+	 * only used for iolog
+	 */
+	if (td->o.read_iolog_file)
+		memcpy(&td->last_issue, &now, sizeof(now));
 }
 
 static int fio_rdmaio_commit(struct thread_data *td)
@@ -1404,7 +1410,8 @@ FIO_STATIC struct ioengine_ops ioengine = {
 	.cleanup		= fio_rdmaio_cleanup,
 	.open_file		= fio_rdmaio_open_file,
 	.close_file		= fio_rdmaio_close_file,
-	.flags			= FIO_DISKLESSIO | FIO_UNIDIR | FIO_PIPEIO,
+	.flags			= FIO_DISKLESSIO | FIO_UNIDIR | FIO_PIPEIO |
+					FIO_ASYNCIO_SETS_ISSUE_TIME,
 	.options		= options,
 	.option_struct_size	= sizeof(struct rdmaio_options),
 };
-- 
2.25.1

^ permalink raw reply related	[flat|nested] 10+ messages in thread

* [PATCH 3/5] HOWTO: improve description of latency measures
       [not found]   ` <CGME20220614155831uscas1p2312cb623924e5df60abefb210affa1f4@uscas1p2.samsung.com>
@ 2022-06-14 15:58     ` Vincent Fu
  0 siblings, 0 replies; 10+ messages in thread
From: Vincent Fu @ 2022-06-14 15:58 UTC (permalink / raw)
  To: axboe, fio; +Cc: Vincent Fu

- clarify how submission latency is calculated for async ioengines
- it is slat (not clat) that is near zero for sync ioengines
- Note that submission latency + completion latency = total latency

Signed-off-by: Vincent Fu <vincent.fu@samsung.com>
---
 HOWTO.rst | 29 ++++++++++++++++++-----------
 1 file changed, 18 insertions(+), 11 deletions(-)

diff --git a/HOWTO.rst b/HOWTO.rst
index 28ac2b7c..470777e2 100644
--- a/HOWTO.rst
+++ b/HOWTO.rst
@@ -4165,24 +4165,31 @@ writes in the example above).  In the order listed, they denote:
 **slat**
 		Submission latency (**min** being the minimum, **max** being the
 		maximum, **avg** being the average, **stdev** being the standard
-		deviation).  This is the time it took to submit the I/O.  For
-		sync I/O this row is not displayed as the slat is really the
-		completion latency (since queue/complete is one operation there).
-		This value can be in nanoseconds, microseconds or milliseconds ---
-		fio will choose the most appropriate base and print that (in the
-		example above nanoseconds was the best scale).  Note: in :option:`--minimal` mode
-		latencies are always expressed in microseconds.
+                deviation).  This is the time from when fio initialized the I/O
+                to submission.  For synchronous ioengines this includes the time
+                up until just before the ioengine's queue function is called.
+                For asynchronous ioengines this includes the time up through the
+                completion of the ioengine's queue function (and commit function
+                if it is defined). For sync I/O this row is not displayed as the
+                slat is negligible.  This value can be in nanoseconds,
+                microseconds or milliseconds --- fio will choose the most
+                appropriate base and print that (in the example above
+                nanoseconds was the best scale).  Note: in :option:`--minimal`
+                mode latencies are always expressed in microseconds.
 
 **clat**
 		Completion latency. Same names as slat, this denotes the time from
-		submission to completion of the I/O pieces. For sync I/O, clat will
-		usually be equal (or very close) to 0, as the time from submit to
-		complete is basically just CPU time (I/O has already been done, see slat
-		explanation).
+                submission to completion of the I/O pieces. For sync I/O, this
+                represents the time from when the I/O was submitted to the
+                operating system to when it was completed. For asynchronous
+                ioengines this is the time from when the ioengine's queue (and
+                commit if available) functions were completed to when the I/O's
+                completion was reaped by fio.
 
 **lat**
 		Total latency. Same names as slat and clat, this denotes the time from
 		when fio created the I/O unit to completion of the I/O operation.
+                It is the sum of submission and completion latency.
 
 **bw**
 		Bandwidth statistics based on samples. Same names as the xlat stats,
-- 
2.25.1

^ permalink raw reply related	[flat|nested] 10+ messages in thread

* [PATCH 4/5] ioengines: update last_issue if we set issue_time
       [not found]   ` <CGME20220614155831uscas1p12c7b262538a47a417dc783a45e8df049@uscas1p1.samsung.com>
@ 2022-06-14 15:58     ` Vincent Fu
  0 siblings, 0 replies; 10+ messages in thread
From: Vincent Fu @ 2022-06-14 15:58 UTC (permalink / raw)
  To: axboe, fio; +Cc: Vincent Fu

If we're not updating issue_time it doesn't make sense to update
last_issue. We should also be updating last_issue in libaio and io_uring
when we record issue_time.

Signed-off-by: Vincent Fu <vincent.fu@samsung.com>
---
 engines/io_uring.c |  6 ++++++
 engines/libaio.c   |  6 ++++++
 ioengines.c        | 30 ++++++++++++++++--------------
 3 files changed, 28 insertions(+), 14 deletions(-)

diff --git a/engines/io_uring.c b/engines/io_uring.c
index 474d215c..cffc7371 100644
--- a/engines/io_uring.c
+++ b/engines/io_uring.c
@@ -608,6 +608,12 @@ static void fio_ioring_queued(struct thread_data *td, int start, int nr)
 
 		start++;
 	}
+
+	/*
+	 * only used for iolog
+	 */
+	if (td->o.read_iolog_file)
+		memcpy(&td->last_issue, &now, sizeof(now));
 }
 
 static int fio_ioring_commit(struct thread_data *td)
diff --git a/engines/libaio.c b/engines/libaio.c
index da5279f4..33b8c12f 100644
--- a/engines/libaio.c
+++ b/engines/libaio.c
@@ -368,6 +368,12 @@ static void fio_libaio_queued(struct thread_data *td, struct io_u **io_us,
 		memcpy(&io_u->issue_time, &now, sizeof(now));
 		io_u_queued(td, io_u);
 	}
+
+	/*
+	 * only used for iolog
+	 */
+	if (td->o.read_iolog_file)
+		memcpy(&td->last_issue, &now, sizeof(now));
 }
 
 static int fio_libaio_commit(struct thread_data *td)
diff --git a/ioengines.c b/ioengines.c
index e4ad698c..e2316ee4 100644
--- a/ioengines.c
+++ b/ioengines.c
@@ -358,15 +358,16 @@ enum fio_q_status td_io_queue(struct thread_data *td, struct io_u *io_u)
 
 	if (td_ioengine_flagged(td, FIO_SYNCIO) ||
 		async_ioengine_sync_trim(td, io_u)) {
-		if (fio_fill_issue_time(td))
+		if (fio_fill_issue_time(td)) {
 			fio_gettime(&io_u->issue_time, NULL);
 
-		/*
-		 * only used for iolog
-		 */
-		if (td->o.read_iolog_file)
-			memcpy(&td->last_issue, &io_u->issue_time,
-					sizeof(io_u->issue_time));
+			/*
+			 * only used for iolog
+			 */
+			if (td->o.read_iolog_file)
+				memcpy(&td->last_issue, &io_u->issue_time,
+						sizeof(io_u->issue_time));
+		}
 	}
 
 
@@ -443,15 +444,16 @@ enum fio_q_status td_io_queue(struct thread_data *td, struct io_u *io_u)
 	if (!td_ioengine_flagged(td, FIO_SYNCIO) &&
 		!async_ioengine_sync_trim(td, io_u)) {
 		if (fio_fill_issue_time(td) &&
-			!td_ioengine_flagged(td, FIO_ASYNCIO_SETS_ISSUE_TIME))
+			!td_ioengine_flagged(td, FIO_ASYNCIO_SETS_ISSUE_TIME)) {
 			fio_gettime(&io_u->issue_time, NULL);
 
-		/*
-		 * only used for iolog
-		 */
-		if (td->o.read_iolog_file)
-			memcpy(&td->last_issue, &io_u->issue_time,
-					sizeof(io_u->issue_time));
+			/*
+			 * only used for iolog
+			 */
+			if (td->o.read_iolog_file)
+				memcpy(&td->last_issue, &io_u->issue_time,
+						sizeof(io_u->issue_time));
+		}
 	}
 
 	return ret;
-- 
2.25.1

^ permalink raw reply related	[flat|nested] 10+ messages in thread

* [PATCH 0/5] Latency cleanups
       [not found] <CGME20220614155830uscas1p1e1d6595b29fb6960b405d5b295a32ffc@uscas1p1.samsung.com>
@ 2022-06-14 15:58 ` Vincent Fu
       [not found]   ` <CGME20220614155830uscas1p15aaa5d3bc8eec3da42c78a48520c6dd9@uscas1p1.samsung.com>
                     ` (5 more replies)
  0 siblings, 6 replies; 10+ messages in thread
From: Vincent Fu @ 2022-06-14 15:58 UTC (permalink / raw)
  To: axboe, fio; +Cc: Vincent Fu

Jens, here are a few latency cleanup patches. The main improvement is to
avoid recording issue_time twice for the ioengines that record
issue_time themselves when their commit hook is called. Recording
issue_time twice underestimates clat since the first issue_time value is
used for slat and then the later second issue_time is used when
calculating clat.

I'm not able to test the rdma and librpma ioengines but the fix for
libaio and io_uring appears to work properly.

Vincent Fu (5):
  ioengines: add helper for trims with async ioengines
  ioengines: don't record issue_time if ioengines already do it
  HOWTO: improve description of latency measures
  ioengines: update last_issue if we set issue_time
  ioengines: clean up latency accounting for 3 ioengines

 HOWTO.rst               | 29 ++++++++++++++++-----------
 engines/io_uring.c      | 13 ++++++++++--
 engines/libaio.c        |  9 ++++++++-
 engines/librpma_apm.c   |  2 +-
 engines/librpma_fio.c   |  9 ++++++++-
 engines/librpma_gpspm.c |  2 +-
 engines/rdma.c          |  9 ++++++++-
 ioengines.c             | 44 ++++++++++++++++++++++++-----------------
 ioengines.h             |  2 ++
 9 files changed, 83 insertions(+), 36 deletions(-)

-- 
2.25.1

^ permalink raw reply	[flat|nested] 10+ messages in thread

* [PATCH 1/5] ioengines: add helper for trims with async ioengines
       [not found]   ` <CGME20220614155830uscas1p2586c5e612430bdd2ce455a1081ce08e4@uscas1p2.samsung.com>
@ 2022-06-14 15:58     ` Vincent Fu
  0 siblings, 0 replies; 10+ messages in thread
From: Vincent Fu @ 2022-06-14 15:58 UTC (permalink / raw)
  To: axboe, fio; +Cc: Vincent Fu

Async ioengines support trim commands but trims are synchronous
operations. We need to provide special handling when measuring latency
for these commands. Create a helper function to help us identify when an
async ioengine is issuing a sync trim command. This makes the code more
readable.

Signed-off-by: Vincent Fu <vincent.fu@samsung.com>
---
 ioengines.c | 13 +++++++++----
 1 file changed, 9 insertions(+), 4 deletions(-)

diff --git a/ioengines.c b/ioengines.c
index 68f307e5..280da3c8 100644
--- a/ioengines.c
+++ b/ioengines.c
@@ -24,6 +24,13 @@
 
 static FLIST_HEAD(engine_list);
 
+static inline bool async_ioengine_sync_trim(struct thread_data *td,
+					    struct io_u	*io_u)
+{
+	return td_ioengine_flagged(td, FIO_ASYNCIO_SYNC_TRIM) &&
+		io_u->ddir == DDIR_TRIM;
+}
+
 static bool check_engine_ops(struct thread_data *td, struct ioengine_ops *ops)
 {
 	if (ops->version != FIO_IOOPS_VERSION) {
@@ -350,8 +357,7 @@ enum fio_q_status td_io_queue(struct thread_data *td, struct io_u *io_u)
 	io_u->resid = 0;
 
 	if (td_ioengine_flagged(td, FIO_SYNCIO) ||
-		(td_ioengine_flagged(td, FIO_ASYNCIO_SYNC_TRIM) && 
-		io_u->ddir == DDIR_TRIM)) {
+		async_ioengine_sync_trim(td, io_u)) {
 		if (fio_fill_issue_time(td))
 			fio_gettime(&io_u->issue_time, NULL);
 
@@ -435,8 +441,7 @@ enum fio_q_status td_io_queue(struct thread_data *td, struct io_u *io_u)
 	}
 
 	if (!td_ioengine_flagged(td, FIO_SYNCIO) &&
-		(!td_ioengine_flagged(td, FIO_ASYNCIO_SYNC_TRIM) ||
-		 io_u->ddir != DDIR_TRIM)) {
+		!async_ioengine_sync_trim(td, io_u)) {
 		if (fio_fill_issue_time(td))
 			fio_gettime(&io_u->issue_time, NULL);
 
-- 
2.25.1

^ permalink raw reply related	[flat|nested] 10+ messages in thread

* Re: [PATCH 2/5] ioengines: don't record issue_time if ioengines already do it
  2022-06-14 15:58     ` [PATCH 2/5] ioengines: don't record issue_time if ioengines already do it Vincent Fu
@ 2022-06-15 15:26       ` Nick Neumann
  2022-06-15 17:06         ` Vincent Fu
  0 siblings, 1 reply; 10+ messages in thread
From: Nick Neumann @ 2022-06-15 15:26 UTC (permalink / raw)
  To: Vincent Fu; +Cc: axboe, fio

On Tue, Jun 14, 2022 at 10:59 AM Vincent Fu <vincent.fu@samsung.com> wrote:
>
> io_uring, io_uring_cmd, and libaio record issue_time inside the ioengine
> code when their commit functions are called. So we don't need to record
> issue_time again for these ioengines in td_io_queue.
>
> If we do fill issue_time twice, then mean(slat) + mean(clat) !=
> mean(lat):

I'm a little bit confused though why not recording issue_time again in
td_io_queue is considered the right fix. By having some ioengines that
record issue_time in their commit functions, and others that do not
(and thus have it recorded in td_io_queue), comparing clat between two
different engines isn't an apples-to-apples comparison. An ioengine
that does not record issue_time in its commit gets a "discount" of
sorts in its measured clat, since its issue_time is not recorded until
later. Does that make sense, or am I missing something? (Has been a
few months since I've been in the code in detail, so that's possible.)

^ permalink raw reply	[flat|nested] 10+ messages in thread

* RE: [PATCH 2/5] ioengines: don't record issue_time if ioengines already do it
  2022-06-15 15:26       ` Nick Neumann
@ 2022-06-15 17:06         ` Vincent Fu
  2022-06-15 17:51           ` Nick Neumann
  0 siblings, 1 reply; 10+ messages in thread
From: Vincent Fu @ 2022-06-15 17:06 UTC (permalink / raw)
  To: Nick Neumann; +Cc: axboe, fio

> -----Original Message-----
> From: Nick Neumann [mailto:nick@pcpartpicker.com]
> Sent: Wednesday, June 15, 2022 11:27 AM
> To: Vincent Fu <vincent.fu@samsung.com>
> Cc: axboe@kernel.dk; fio@vger.kernel.org
> Subject: Re: [PATCH 2/5] ioengines: don't record issue_time if ioengines
> already do it
> 
> On Tue, Jun 14, 2022 at 10:59 AM Vincent Fu <vincent.fu@samsung.com>
> wrote:
> >
> > io_uring, io_uring_cmd, and libaio record issue_time inside the
> ioengine
> > code when their commit functions are called. So we don't need to
> record
> > issue_time again for these ioengines in td_io_queue.
> >
> > If we do fill issue_time twice, then mean(slat) + mean(clat) !=
> > mean(lat):
> 
> I'm a little bit confused though why not recording issue_time again in
> td_io_queue is considered the right fix. By having some ioengines that
> record issue_time in their commit functions, and others that do not
> (and thus have it recorded in td_io_queue), comparing clat between two
> different engines isn't an apples-to-apples comparison. An ioengine
> that does not record issue_time in its commit gets a "discount" of
> sorts in its measured clat, since its issue_time is not recorded until
> later. Does that make sense, or am I missing something? (Has been a
> few months since I've been in the code in detail, so that's possible.)

You make a good point and I'm glad you brought this up for discussion.

We currently have three latency pathways for async ioengines:

(1) async w/o commit: record issue_time in td_io_queue after queue
(2) async w/commit that *does not* record issue_time: records issue_time in
td_io_queue after queue (slat not reported)
(3) async w/commit that *does* record issue_time: issue_time recorded in
both td_io_queue and commit

I think the right long-term solution is to make the ioengines in (2) conform to
(1) or (3). Then we can have fair comparisons. But what 'queue' and 'commit'
mean for them doesn't completely match what the two mean for libaio and
io_uring.

For libaio and io_uring 'queue' prepares an IO and 'commit' actually sends it
to the OS. So it's appropriate to record issue_time after commit. But, for
example, take a look at the nfs ioengine's commit:

https://github.com/axboe/fio/blob/master/engines/nfs.c#L285

nfs isn't doing what libaio and io_uring are doing with commit.

There are four ioengines that fall into category (2): ime_aio, nfs, null, and
cpp_null. It would take some time to test the changes but probably it would be
straightforward to make nfs conform to (1) and {null, cpp_null} conform to (3)
but I have no idea about ime_aio.

I think the current patch is an improvement over what we currently have.  Plus
it really bothers me that mean(slat) + mean(clat) != mean(lat) and this
resolves that issue.  If someone comes along to fix everything up then we can
get rid of FIO_ASYNCIO_SETS_ISSUE_TIME and instead of td_io_queue checking for
that flag before setting issue_time it can instead check for the presence of a
commit hook.


^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH 2/5] ioengines: don't record issue_time if ioengines already do it
  2022-06-15 17:06         ` Vincent Fu
@ 2022-06-15 17:51           ` Nick Neumann
  0 siblings, 0 replies; 10+ messages in thread
From: Nick Neumann @ 2022-06-15 17:51 UTC (permalink / raw)
  To: Vincent Fu; +Cc: axboe, fio

On Wed, Jun 15, 2022 at 12:06 PM Vincent Fu <vincent.fu@samsung.com> wrote:
>
> > -----Original Message-----
> > From: Nick Neumann [mailto:nick@pcpartpicker.com]
> > Sent: Wednesday, June 15, 2022 11:27 AM
> > To: Vincent Fu <vincent.fu@samsung.com>
> > Cc: axboe@kernel.dk; fio@vger.kernel.org
> > Subject: Re: [PATCH 2/5] ioengines: don't record issue_time if ioengines
> > already do it
> >
> > On Tue, Jun 14, 2022 at 10:59 AM Vincent Fu <vincent.fu@samsung.com>
> > wrote:
> > >
> > > io_uring, io_uring_cmd, and libaio record issue_time inside the
> > ioengine
> > > code when their commit functions are called. So we don't need to
> > record
> > > issue_time again for these ioengines in td_io_queue.
> > >
> > > If we do fill issue_time twice, then mean(slat) + mean(clat) !=
> > > mean(lat):
> >
> > I'm a little bit confused though why not recording issue_time again in
> > td_io_queue is considered the right fix. By having some ioengines that
> > record issue_time in their commit functions, and others that do not
> > (and thus have it recorded in td_io_queue), comparing clat between two
> > different engines isn't an apples-to-apples comparison. An ioengine
> > that does not record issue_time in its commit gets a "discount" of
> > sorts in its measured clat, since its issue_time is not recorded until
> > later. Does that make sense, or am I missing something? (Has been a
> > few months since I've been in the code in detail, so that's possible.)
>
> You make a good point and I'm glad you brought this up for discussion.
>
> We currently have three latency pathways for async ioengines:
>
> (1) async w/o commit: record issue_time in td_io_queue after queue
> (2) async w/commit that *does not* record issue_time: records issue_time in
> td_io_queue after queue (slat not reported)
> (3) async w/commit that *does* record issue_time: issue_time recorded in
> both td_io_queue and commit
>
> I think the right long-term solution is to make the ioengines in (2) conform to
> (1) or (3). Then we can have fair comparisons. But what 'queue' and 'commit'
> mean for them doesn't completely match what the two mean for libaio and
> io_uring.
>
> For libaio and io_uring 'queue' prepares an IO and 'commit' actually sends it
> to the OS. So it's appropriate to record issue_time after commit. But, for
> example, take a look at the nfs ioengine's commit:
>
> https://github.com/axboe/fio/blob/master/engines/nfs.c#L285
>
> nfs isn't doing what libaio and io_uring are doing with commit.
>
> There are four ioengines that fall into category (2): ime_aio, nfs, null, and
> cpp_null. It would take some time to test the changes but probably it would be
> straightforward to make nfs conform to (1) and {null, cpp_null} conform to (3)
> but I have no idea about ime_aio.
>
> I think the current patch is an improvement over what we currently have.  Plus
> it really bothers me that mean(slat) + mean(clat) != mean(lat) and this
> resolves that issue.  If someone comes along to fix everything up then we can
> get rid of FIO_ASYNCIO_SETS_ISSUE_TIME and instead of td_io_queue checking for
> that flag before setting issue_time it can instead check for the presence of a
> commit hook.

Thanks for the detailed explanation. I actually read your blog post
about how fio measures latency too, and that was quite helpful as
well.

You've definitely convinced me that your change is an improvement, and
that there isn't necessarily a one-size-fits-all solution because the
different engines do different kinds of things in commit.

The fact that, before your change, mean(slat)+mean(clat) != mean(lat)
is very surprising. I (and I imagine many others) just assumed it was,
and it definitely should be, especially with libaio and io_uring,
which I imagine are the main async engines used to test raw ssd
performance. I'm glad you caught that the sum was not equal to the
total, and grateful you figured out why and fixed it. Thanks again.

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH 0/5] Latency cleanups
  2022-06-14 15:58 ` [PATCH 0/5] Latency cleanups Vincent Fu
                     ` (4 preceding siblings ...)
       [not found]   ` <CGME20220614155831uscas1p16ab7450c13a1587f2058617130f4f122@uscas1p1.samsung.com>
@ 2022-06-15 21:30   ` Jens Axboe
  5 siblings, 0 replies; 10+ messages in thread
From: Jens Axboe @ 2022-06-15 21:30 UTC (permalink / raw)
  To: vincent.fu, fio

On Tue, 14 Jun 2022 15:58:29 +0000, Vincent Fu wrote:
> Jens, here are a few latency cleanup patches. The main improvement is to
> avoid recording issue_time twice for the ioengines that record
> issue_time themselves when their commit hook is called. Recording
> issue_time twice underestimates clat since the first issue_time value is
> used for slat and then the later second issue_time is used when
> calculating clat.
> 
> [...]

Applied, thanks!

[1/5] ioengines: add helper for trims with async ioengines
      commit: d46b4565dfdaddbfdd91976a57b2564aa7a52f62
[2/5] ioengines: don't record issue_time if ioengines already do it
      commit: 4e7e78980ff32627e12b48d72496b701dd200a42
[3/5] HOWTO: improve description of latency measures
      commit: 13ddd98b2a70c55657f952096785ccc64479f0eb
[4/5] ioengines: update last_issue if we set issue_time
      commit: 39f56400f63e43fbe33e391d52f24eb6c79d0098
[5/5] ioengines: clean up latency accounting for 3 ioengines
      commit: 2b82135e42fc5872338ee16e16d317b16abfe308

Best regards,
-- 
Jens Axboe



^ permalink raw reply	[flat|nested] 10+ messages in thread

end of thread, other threads:[~2022-06-15 21:30 UTC | newest]

Thread overview: 10+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
     [not found] <CGME20220614155830uscas1p1e1d6595b29fb6960b405d5b295a32ffc@uscas1p1.samsung.com>
2022-06-14 15:58 ` [PATCH 0/5] Latency cleanups Vincent Fu
     [not found]   ` <CGME20220614155830uscas1p15aaa5d3bc8eec3da42c78a48520c6dd9@uscas1p1.samsung.com>
2022-06-14 15:58     ` [PATCH 2/5] ioengines: don't record issue_time if ioengines already do it Vincent Fu
2022-06-15 15:26       ` Nick Neumann
2022-06-15 17:06         ` Vincent Fu
2022-06-15 17:51           ` Nick Neumann
     [not found]   ` <CGME20220614155831uscas1p2312cb623924e5df60abefb210affa1f4@uscas1p2.samsung.com>
2022-06-14 15:58     ` [PATCH 3/5] HOWTO: improve description of latency measures Vincent Fu
     [not found]   ` <CGME20220614155831uscas1p12c7b262538a47a417dc783a45e8df049@uscas1p1.samsung.com>
2022-06-14 15:58     ` [PATCH 4/5] ioengines: update last_issue if we set issue_time Vincent Fu
     [not found]   ` <CGME20220614155830uscas1p2586c5e612430bdd2ce455a1081ce08e4@uscas1p2.samsung.com>
2022-06-14 15:58     ` [PATCH 1/5] ioengines: add helper for trims with async ioengines Vincent Fu
     [not found]   ` <CGME20220614155831uscas1p16ab7450c13a1587f2058617130f4f122@uscas1p1.samsung.com>
2022-06-14 15:58     ` [PATCH 5/5] ioengines: clean up latency accounting for 3 ioengines Vincent Fu
2022-06-15 21:30   ` [PATCH 0/5] Latency cleanups Jens Axboe

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).