* [PATCH 0/9] v1 Patchset : Simple Copy Command support
       [not found] <CGME20201201114048epcas5p3e12de26128ce442bbe8406082eaccde9@epcas5p3.samsung.com>
@ 2020-12-01 11:40 ` Krishna Kanth Reddy
       [not found]   ` <CGME20201201114051epcas5p4b9c67cd0ad4b55fc9334dfde59ae349c@epcas5p4.samsung.com>
                     ` (8 more replies)
  0 siblings, 9 replies; 13+ messages in thread
From: Krishna Kanth Reddy @ 2020-12-01 11:40 UTC (permalink / raw)
  To: axboe; +Cc: fio, Krishna Kanth Reddy

This patchset adds support for TP 4065a ("Simple Copy Command"), v2020.05.04 ("Ratified").

The specification can be found at the following link:
https://nvmexpress.org/wp-content/uploads/NVM-Express-1.4-Ratified-TPs-1.zip

The simple copy command is a copy-offload operation used to copy multiple
contiguous ranges (source ranges) of LBAs to a single destination LBA
within the device, reducing traffic between the host and the device.

This implementation introduces a new copy operation. It can be enabled by passing
rw=copy (generates sequential source ranges)
rw=randcopy (generates random source ranges)

The copy operation requires two new options:
num_range - number of source ranges per copy operation.
dest_offset - start of the destination offset.

The existing FIO options are reused as follows:
offset - starting offset of the source ranges.
blocksize - number of logical blocks per source range.

Introduced a new synchronous ioengine, "sctl".
This is a basic ioengine that supports only the copy command.
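For illustration, a hypothetical job file combining the options above might look as follows. This is a sketch, not the examples/fio-copy.fio shipped with this series; the device path and values are assumptions:

```ini
; hypothetical simple-copy job (sketch; device path is an example)
[simple-copy]
filename=/dev/nvme0n1
ioengine=sctl
rw=copy
bs=4k            ; size of each source range
num_range=8      ; source ranges per copy command
offset=0         ; start of the source ranges
dest_offset=1g   ; start of the destination
```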

Added basic sequential copy support for zoned block devices.

This is an RFC; the corresponding Linux kernel interface is under submission and review:
https://lore.kernel.org/linux-nvme/20201201053949.143175-1-selvakuma.s1@samsung.com/

Feedback and comments are welcome and will help improve this further.

Ankit Kumar (9):
  Adding the necessary ddir changes to introduce copy operation.
  Introducing new offsets for the copy operation.
  Added support for printing of stats and estimate time for copy
    operation.
  Adding a new copy operation.
  Added the changes for copy operation support in FIO.
  New ioctl based synchronous IO engine. Only supports copy command
  Example configuration for simple copy command
  Support copy operation for zoned block devices.
  Add a new test case to test the copy operation.

 HOWTO                  |  28 +++++-
 Makefile               |   2 +-
 backend.c              |  25 ++++-
 cconv.c                |   6 ++
 engines/sctl.c         | 215 +++++++++++++++++++++++++++++++++++++++++
 eta.c                  |  31 ++++--
 examples/fio-copy.fio  |  32 ++++++
 file.h                 |   7 ++
 filesetup.c            |  79 +++++++++++++++
 fio.1                  |  25 ++++-
 fio.h                  |  12 +++
 init.c                 |  44 ++++++---
 io_ddir.h              |  22 +++--
 io_u.c                 | 105 ++++++++++++++++----
 io_u.h                 |   3 +
 options.c              |  64 +++++++++++-
 os/os-linux.h          |   1 +
 parse.c                |  28 ++++++
 parse.h                |   2 +
 rate-submit.c          |   2 +
 stat.c                 |  20 +++-
 stat.h                 |   1 -
 t/zbd/functions        |   9 ++
 t/zbd/test-zbd-support |  54 +++++++++++
 thread_options.h       |   9 +-
 zbd.c                  |  90 +++++++++++++++--
 26 files changed, 842 insertions(+), 74 deletions(-)
 create mode 100644 engines/sctl.c
 create mode 100644 examples/fio-copy.fio

-- 
2.17.1




* [PATCH 1/9] Adding the necessary ddir changes to introduce copy operation.
       [not found]   ` <CGME20201201114051epcas5p4b9c67cd0ad4b55fc9334dfde59ae349c@epcas5p4.samsung.com>
@ 2020-12-01 11:40     ` Krishna Kanth Reddy
  0 siblings, 0 replies; 13+ messages in thread
From: Krishna Kanth Reddy @ 2020-12-01 11:40 UTC (permalink / raw)
  To: axboe; +Cc: fio, Ankit Kumar, Krishna Kanth Reddy

From: Ankit Kumar <ankit.kumar@samsung.com>

Signed-off-by: Krishna Kanth Reddy <krish.reddy@samsung.com>
---
 io_ddir.h | 22 +++++++++++++++-------
 1 file changed, 15 insertions(+), 7 deletions(-)

diff --git a/io_ddir.h b/io_ddir.h
index a42da97a..22e95f70 100644
--- a/io_ddir.h
+++ b/io_ddir.h
@@ -5,22 +5,23 @@ enum fio_ddir {
 	DDIR_READ = 0,
 	DDIR_WRITE = 1,
 	DDIR_TRIM = 2,
-	DDIR_SYNC = 3,
+	DDIR_COPY = 3,
+	DDIR_SYNC = 4,
 	DDIR_DATASYNC,
 	DDIR_SYNC_FILE_RANGE,
 	DDIR_WAIT,
 	DDIR_LAST,
 	DDIR_INVAL = -1,
 
-	DDIR_RWDIR_CNT = 3,
-	DDIR_RWDIR_SYNC_CNT = 4,
+	DDIR_RWDIR_CNT = 4,
+	DDIR_RWDIR_SYNC_CNT = 5,
 };
 
 #define for_each_rw_ddir(ddir)	for (enum fio_ddir ddir = 0; ddir < DDIR_RWDIR_CNT; ddir++)
 
 static inline const char *io_ddir_name(enum fio_ddir ddir)
 {
-	static const char *name[] = { "read", "write", "trim", "sync",
+	static const char *name[] = { "read", "write", "trim", "copy", "sync",
 					"datasync", "sync_file_range",
 					"wait", };
 
@@ -35,17 +36,20 @@ enum td_ddir {
 	TD_DDIR_WRITE		= 1 << 1,
 	TD_DDIR_RAND		= 1 << 2,
 	TD_DDIR_TRIM		= 1 << 3,
+	TD_DDIR_COPY		= 1 << 4,
 	TD_DDIR_RW		= TD_DDIR_READ | TD_DDIR_WRITE,
 	TD_DDIR_RANDREAD	= TD_DDIR_READ | TD_DDIR_RAND,
 	TD_DDIR_RANDWRITE	= TD_DDIR_WRITE | TD_DDIR_RAND,
 	TD_DDIR_RANDRW		= TD_DDIR_RW | TD_DDIR_RAND,
 	TD_DDIR_RANDTRIM	= TD_DDIR_TRIM | TD_DDIR_RAND,
 	TD_DDIR_TRIMWRITE	= TD_DDIR_TRIM | TD_DDIR_WRITE,
+	TD_DDIR_RANDCOPY	= TD_DDIR_COPY | TD_DDIR_RAND,
 };
 
 #define td_read(td)		((td)->o.td_ddir & TD_DDIR_READ)
 #define td_write(td)		((td)->o.td_ddir & TD_DDIR_WRITE)
 #define td_trim(td)		((td)->o.td_ddir & TD_DDIR_TRIM)
+#define td_copy(td)		((td)->o.td_ddir & TD_DDIR_COPY)
 #define td_rw(td)		(((td)->o.td_ddir & TD_DDIR_RW) == TD_DDIR_RW)
 #define td_random(td)		((td)->o.td_ddir & TD_DDIR_RAND)
 #define file_randommap(td, f)	(!(td)->o.norandommap && fio_file_axmap((f)))
@@ -60,19 +64,23 @@ static inline int ddir_sync(enum fio_ddir ddir)
 
 static inline int ddir_rw(enum fio_ddir ddir)
 {
-	return ddir == DDIR_READ || ddir == DDIR_WRITE || ddir == DDIR_TRIM;
+	return ddir == DDIR_READ || ddir == DDIR_WRITE || ddir == DDIR_TRIM ||
+	       ddir == DDIR_COPY;
 }
 
 static inline const char *ddir_str(enum td_ddir ddir)
 {
 	static const char *__str[] = { NULL, "read", "write", "rw", "rand",
 				"randread", "randwrite", "randrw",
-				"trim", NULL, "trimwrite", NULL, "randtrim" };
+				"trim", NULL, "trimwrite", NULL, "randtrim",
+				NULL, NULL, NULL, "copy", NULL, NULL,
+				NULL, "randcopy" };
 
 	return __str[ddir];
 }
 
 #define ddir_rw_sum(arr)	\
-	((arr)[DDIR_READ] + (arr)[DDIR_WRITE] + (arr)[DDIR_TRIM])
+	((arr)[DDIR_READ] + (arr)[DDIR_WRITE] + (arr)[DDIR_TRIM] \
+	 + (arr)[DDIR_COPY])
 
 #endif
-- 
2.17.1




* [PATCH 2/9] Introducing new offsets for the copy operation.
       [not found]   ` <CGME20201201114054epcas5p2cf4bff491f02c4a29a386ae44d5e42c7@epcas5p2.samsung.com>
@ 2020-12-01 11:40     ` Krishna Kanth Reddy
  0 siblings, 0 replies; 13+ messages in thread
From: Krishna Kanth Reddy @ 2020-12-01 11:40 UTC (permalink / raw)
  To: axboe; +Cc: fio, Ankit Kumar, Krishna Kanth Reddy

From: Ankit Kumar <ankit.kumar@samsung.com>

Signed-off-by: Krishna Kanth Reddy <krish.reddy@samsung.com>
---
 options.c | 11 ++++++++++-
 parse.c   | 28 ++++++++++++++++++++++++++++
 parse.h   |  2 ++
 3 files changed, 40 insertions(+), 1 deletion(-)

diff --git a/options.c b/options.c
index 1e91b3e9..f68ae8c2 100644
--- a/options.c
+++ b/options.c
@@ -2154,6 +2154,7 @@ struct fio_option fio_options[FIO_MAX_OPTS] = {
 		.off1	= offsetof(struct thread_options, bs[DDIR_READ]),
 		.off2	= offsetof(struct thread_options, bs[DDIR_WRITE]),
 		.off3	= offsetof(struct thread_options, bs[DDIR_TRIM]),
+		.off4	= offsetof(struct thread_options, bs[DDIR_COPY]),
 		.minval = 1,
 		.help	= "Block size unit",
 		.def	= "4096",
@@ -2171,6 +2172,7 @@ struct fio_option fio_options[FIO_MAX_OPTS] = {
 		.off1	= offsetof(struct thread_options, ba[DDIR_READ]),
 		.off2	= offsetof(struct thread_options, ba[DDIR_WRITE]),
 		.off3	= offsetof(struct thread_options, ba[DDIR_TRIM]),
+		.off4	= offsetof(struct thread_options, ba[DDIR_COPY]),
 		.minval	= 1,
 		.help	= "IO block offset alignment",
 		.parent	= "rw",
@@ -2190,6 +2192,8 @@ struct fio_option fio_options[FIO_MAX_OPTS] = {
 		.off4	= offsetof(struct thread_options, max_bs[DDIR_WRITE]),
 		.off5	= offsetof(struct thread_options, min_bs[DDIR_TRIM]),
 		.off6	= offsetof(struct thread_options, max_bs[DDIR_TRIM]),
+		.off7	= offsetof(struct thread_options, min_bs[DDIR_COPY]),
+		.off8	= offsetof(struct thread_options, max_bs[DDIR_COPY]),
 		.minval = 1,
 		.help	= "Set block size range (in more detail than bs)",
 		.parent = "rw",
@@ -2349,9 +2353,10 @@ struct fio_option fio_options[FIO_MAX_OPTS] = {
 		.off1	= offsetof(struct thread_options, perc_rand[DDIR_READ]),
 		.off2	= offsetof(struct thread_options, perc_rand[DDIR_WRITE]),
 		.off3	= offsetof(struct thread_options, perc_rand[DDIR_TRIM]),
+		.off4	= offsetof(struct thread_options, perc_rand[DDIR_COPY]),
 		.maxval	= 100,
 		.help	= "Percentage of seq/random mix that should be random",
-		.def	= "100,100,100",
+		.def	= "100,100,100,100",
 		.interval = 5,
 		.inverse = "percentage_sequential",
 		.category = FIO_OPT_C_IO,
@@ -3579,6 +3584,7 @@ struct fio_option fio_options[FIO_MAX_OPTS] = {
 		.off1	= offsetof(struct thread_options, rate[DDIR_READ]),
 		.off2	= offsetof(struct thread_options, rate[DDIR_WRITE]),
 		.off3	= offsetof(struct thread_options, rate[DDIR_TRIM]),
+		.off4	= offsetof(struct thread_options, rate[DDIR_COPY]),
 		.help	= "Set bandwidth rate",
 		.category = FIO_OPT_C_IO,
 		.group	= FIO_OPT_G_RATE,
@@ -3591,6 +3597,7 @@ struct fio_option fio_options[FIO_MAX_OPTS] = {
 		.off1	= offsetof(struct thread_options, ratemin[DDIR_READ]),
 		.off2	= offsetof(struct thread_options, ratemin[DDIR_WRITE]),
 		.off3	= offsetof(struct thread_options, ratemin[DDIR_TRIM]),
+		.off4	= offsetof(struct thread_options, ratemin[DDIR_COPY]),
 		.help	= "Job must meet this rate or it will be shutdown",
 		.parent	= "rate",
 		.hide	= 1,
@@ -3604,6 +3611,7 @@ struct fio_option fio_options[FIO_MAX_OPTS] = {
 		.off1	= offsetof(struct thread_options, rate_iops[DDIR_READ]),
 		.off2	= offsetof(struct thread_options, rate_iops[DDIR_WRITE]),
 		.off3	= offsetof(struct thread_options, rate_iops[DDIR_TRIM]),
+		.off4	= offsetof(struct thread_options, rate_iops[DDIR_COPY]),
 		.help	= "Limit IO used to this number of IO operations/sec",
 		.hide	= 1,
 		.category = FIO_OPT_C_IO,
@@ -3616,6 +3624,7 @@ struct fio_option fio_options[FIO_MAX_OPTS] = {
 		.off1	= offsetof(struct thread_options, rate_iops_min[DDIR_READ]),
 		.off2	= offsetof(struct thread_options, rate_iops_min[DDIR_WRITE]),
 		.off3	= offsetof(struct thread_options, rate_iops_min[DDIR_TRIM]),
+		.off4	= offsetof(struct thread_options, rate_iops_min[DDIR_COPY]),
 		.help	= "Job must meet this rate or it will be shut down",
 		.parent	= "rate_iops",
 		.hide	= 1,
diff --git a/parse.c b/parse.c
index f4cefcf6..84d24435 100644
--- a/parse.c
+++ b/parse.c
@@ -669,6 +669,10 @@ static int __handle_option(const struct fio_option *o, const char *ptr,
 					if (o->off3)
 						val_store(ilp, ull, o->off3, 0, data, o);
 				}
+				if (curr == 3) {
+					if (o->off4)
+						val_store(ilp, ull, o->off4, 0, data, o);
+				}
 				if (!more) {
 					if (curr < 1) {
 						if (o->off2)
@@ -678,6 +682,10 @@ static int __handle_option(const struct fio_option *o, const char *ptr,
 						if (o->off3)
 							val_store(ilp, ull, o->off3, 0, data, o);
 					}
+					if (curr < 3) {
+						if (o->off4)
+							val_store(ilp, ull, o->off4, 0, data, o);
+					}
 				}
 			} else if (o->type == FIO_OPT_ULL) {
 				if (first)
@@ -690,6 +698,10 @@ static int __handle_option(const struct fio_option *o, const char *ptr,
 					if (o->off3)
 						val_store(ullp, ull, o->off3, 0, data, o);
 				}
+				if (curr == 3) {
+					if (o->off4)
+						val_store(ullp, ull, o->off4, 0, data, o);
+				}
 				if (!more) {
 					if (curr < 1) {
 						if (o->off2)
@@ -699,6 +711,10 @@ static int __handle_option(const struct fio_option *o, const char *ptr,
 						if (o->off3)
 							val_store(ullp, ull, o->off3, 0, data, o);
 					}
+					if (curr < 3) {
+						if (o->off4)
+							val_store(ullp, ull, o->off4, 0, data, o);
+					}
 				}
 			} else {
 				if (first)
@@ -879,6 +895,12 @@ static int __handle_option(const struct fio_option *o, const char *ptr,
 					val_store(ullp, ull2, o->off6, 0, data, o);
 				}
 			}
+			if (curr == 3) {
+				if (o->off7 && o->off8) {
+					val_store(ullp, ull1, o->off7, 0, data, o);
+					val_store(ullp, ull2, o->off8, 0, data, o);
+				}
+			}
 			if (!more) {
 				if (curr < 1) {
 					if (o->off3 && o->off4) {
@@ -892,6 +914,12 @@ static int __handle_option(const struct fio_option *o, const char *ptr,
 						val_store(ullp, ull2, o->off6, 0, data, o);
 					}
 				}
+				if (curr < 3) {
+					if (o->off7 && o->off8) {
+						val_store(ullp, ull1, o->off7, 0, data, o);
+						val_store(ullp, ull2, o->off8, 0, data, o);
+					}
+				}
 			}
 		}
 
diff --git a/parse.h b/parse.h
index e6663ed4..643cd8dc 100644
--- a/parse.h
+++ b/parse.h
@@ -54,6 +54,8 @@ struct fio_option {
 	unsigned int off4;
 	unsigned int off5;
 	unsigned int off6;
+	unsigned int off7;
+	unsigned int off8;
 	unsigned long long maxval;		/* max and min value */
 	int minval;
 	double maxfp;			/* max and min floating value */
-- 
2.17.1




* [PATCH 3/9] Added support for printing of stats and estimate time for copy operation.
       [not found]   ` <CGME20201201114057epcas5p1aa9d8e1a56197e55251191a0a5985e3d@epcas5p1.samsung.com>
@ 2020-12-01 11:40     ` Krishna Kanth Reddy
  0 siblings, 0 replies; 13+ messages in thread
From: Krishna Kanth Reddy @ 2020-12-01 11:40 UTC (permalink / raw)
  To: axboe; +Cc: fio, Ankit Kumar, Krishna Kanth Reddy

From: Ankit Kumar <ankit.kumar@samsung.com>

Signed-off-by: Krishna Kanth Reddy <krish.reddy@samsung.com>
---
 backend.c |  3 +++
 eta.c     | 31 ++++++++++++++++++++++---------
 stat.c    | 20 +++++++++++++++-----
 stat.h    |  1 -
 4 files changed, 40 insertions(+), 15 deletions(-)

diff --git a/backend.c b/backend.c
index 2e6a377c..2f4d6ac4 100644
--- a/backend.c
+++ b/backend.c
@@ -1828,6 +1828,8 @@ static void *thread_main(void *data)
 			update_runtime(td, elapsed_us, DDIR_WRITE);
 		if (td_trim(td) && td->io_bytes[DDIR_TRIM])
 			update_runtime(td, elapsed_us, DDIR_TRIM);
+		if (td_copy(td) && td->io_bytes[DDIR_COPY])
+			update_runtime(td, elapsed_us, DDIR_COPY);
 		fio_gettime(&td->start, NULL);
 		fio_sem_up(stat_sem);
 
@@ -2491,6 +2493,7 @@ int fio_backend(struct sk_out *sk_out)
 		setup_log(&agg_io_log[DDIR_READ], &p, "agg-read_bw.log");
 		setup_log(&agg_io_log[DDIR_WRITE], &p, "agg-write_bw.log");
 		setup_log(&agg_io_log[DDIR_TRIM], &p, "agg-trim_bw.log");
+		setup_log(&agg_io_log[DDIR_COPY], &p, "agg-copy_bw.log");
 	}
 
 	startup_sem = fio_sem_init(FIO_SEM_LOCKED);
diff --git a/eta.c b/eta.c
index d1c9449f..cb89013c 100644
--- a/eta.c
+++ b/eta.c
@@ -82,6 +82,11 @@ static void check_str_update(struct thread_data *td)
 				c = 'w';
 			else
 				c = 'W';
+		} else if (td_copy(td)) {
+			if (td_random(td))
+				c = 'q';
+			else
+				c = 'Q';
 		} else {
 			if (td_random(td))
 				c = 'd';
@@ -292,6 +297,8 @@ static unsigned long thread_eta(struct thread_data *td)
 			rate_bytes += td->o.rate[DDIR_WRITE];
 		if (td_trim(td))
 			rate_bytes += td->o.rate[DDIR_TRIM];
+		if (td_copy(td))
+			rate_bytes += td->o.rate[DDIR_COPY];
 
 		if (rate_bytes) {
 			r_eta = bytes_total / rate_bytes;
@@ -445,6 +452,12 @@ bool calc_thread_status(struct jobs_eta *je, int force)
 				je->m_rate[2] += td->o.ratemin[DDIR_TRIM];
 				je->m_iops[2] += td->o.rate_iops_min[DDIR_TRIM];
 			}
+			if (td_copy(td)) {
+				je->t_rate[3] += td->o.rate[DDIR_COPY];
+				je->t_iops[3] += td->o.rate_iops[DDIR_COPY];
+				je->m_rate[3] += td->o.ratemin[DDIR_COPY];
+				je->m_iops[3] += td->o.rate_iops_min[DDIR_COPY];
+			}
 
 			je->files_open += td->nr_open_files;
 		} else if (td->runstate == TD_RAMP) {
@@ -534,7 +547,7 @@ bool calc_thread_status(struct jobs_eta *je, int force)
 static int gen_eta_str(struct jobs_eta *je, char *p, size_t left,
 		       char **rate_str, char **iops_str)
 {
-	static const char c[DDIR_RWDIR_CNT] = {'r', 'w', 't'};
+	static const char c[DDIR_RWDIR_CNT] = {'r', 'w', 't', 'c'};
 	bool has[DDIR_RWDIR_CNT];
 	bool has_any = false;
 	const char *sep;
@@ -594,23 +607,23 @@ void display_thread_status(struct jobs_eta *je)
 	p += sprintf(p, "Jobs: %d (f=%d)", je->nr_running, je->files_open);
 
 	/* rate limits, if any */
-	if (je->m_rate[0] || je->m_rate[1] || je->m_rate[2] ||
-	    je->t_rate[0] || je->t_rate[1] || je->t_rate[2]) {
+	if (je->m_rate[0] || je->m_rate[1] || je->m_rate[2] || je->m_rate[3] ||
+	    je->t_rate[0] || je->t_rate[1] || je->t_rate[2] || je->t_rate[3]) {
 		char *tr, *mr;
 
-		mr = num2str(je->m_rate[0] + je->m_rate[1] + je->m_rate[2],
+		mr = num2str(je->m_rate[0] + je->m_rate[1] + je->m_rate[2] + je->m_rate[3],
 				je->sig_figs, 0, je->is_pow2, N2S_BYTEPERSEC);
-		tr = num2str(je->t_rate[0] + je->t_rate[1] + je->t_rate[2],
+		tr = num2str(je->t_rate[0] + je->t_rate[1] + je->t_rate[2] + je->t_rate[3],
 				je->sig_figs, 0, je->is_pow2, N2S_BYTEPERSEC);
 
 		p += sprintf(p, ", %s-%s", mr, tr);
 		free(tr);
 		free(mr);
-	} else if (je->m_iops[0] || je->m_iops[1] || je->m_iops[2] ||
-		   je->t_iops[0] || je->t_iops[1] || je->t_iops[2]) {
+	} else if (je->m_iops[0] || je->m_iops[1] || je->m_iops[2] || je->m_iops[3] ||
+		   je->t_iops[0] || je->t_iops[1] || je->t_iops[2] || je->t_iops[3]) {
 		p += sprintf(p, ", %d-%d IOPS",
-					je->m_iops[0] + je->m_iops[1] + je->m_iops[2],
-					je->t_iops[0] + je->t_iops[1] + je->t_iops[2]);
+					je->m_iops[0] + je->m_iops[1] + je->m_iops[2] + je->m_iops[3],
+					je->t_iops[0] + je->t_iops[1] + je->t_iops[2] + je->t_iops[3]);
 	}
 
 	/* current run string, % done, bandwidth, iops, eta */
diff --git a/stat.c b/stat.c
index eb40bd7f..0e348bf6 100644
--- a/stat.c
+++ b/stat.c
@@ -286,7 +286,7 @@ void show_group_stats(struct group_run_stats *rs, struct buf_output *out)
 {
 	char *io, *agg, *min, *max;
 	char *ioalt, *aggalt, *minalt, *maxalt;
-	const char *str[] = { "   READ", "  WRITE" , "   TRIM"};
+	const char *str[] = { "   READ", "  WRITE", "   TRIM", "   COPY" };
 	int i;
 
 	log_buf(out, "\nRun status group %d (all jobs):\n", rs->groupid);
@@ -1124,19 +1124,22 @@ static void show_thread_status_normal(struct thread_stat *ts,
 					io_u_dist[1], io_u_dist[2],
 					io_u_dist[3], io_u_dist[4],
 					io_u_dist[5], io_u_dist[6]);
-	log_buf(out, "     issued rwts: total=%llu,%llu,%llu,%llu"
-				 " short=%llu,%llu,%llu,0"
-				 " dropped=%llu,%llu,%llu,0\n",
+	log_buf(out, "     issued rwts: total=%llu,%llu,%llu,%llu,%llu"
+				 " short=%llu,%llu,%llu,%llu,0"
+				 " dropped=%llu,%llu,%llu,%llu,0\n",
 					(unsigned long long) ts->total_io_u[0],
 					(unsigned long long) ts->total_io_u[1],
 					(unsigned long long) ts->total_io_u[2],
 					(unsigned long long) ts->total_io_u[3],
+					(unsigned long long) ts->total_io_u[4],
 					(unsigned long long) ts->short_io_u[0],
 					(unsigned long long) ts->short_io_u[1],
 					(unsigned long long) ts->short_io_u[2],
+					(unsigned long long) ts->short_io_u[3],
 					(unsigned long long) ts->drop_io_u[0],
 					(unsigned long long) ts->drop_io_u[1],
-					(unsigned long long) ts->drop_io_u[2]);
+					(unsigned long long) ts->drop_io_u[2],
+					(unsigned long long) ts->drop_io_u[3]);
 	if (ts->continue_on_error) {
 		log_buf(out, "     errors    : total=%llu, first_error=%d/<%s>\n",
 					(unsigned long long)ts->total_err_count,
@@ -1442,6 +1445,8 @@ static void show_thread_status_terse_all(struct thread_stat *ts,
 	/* Log Trim Status */
 	if (ver == 2 || ver == 4 || ver == 5)
 		show_ddir_status_terse(ts, rs, DDIR_TRIM, ver, out);
+	/* Log Copy Status */
+	show_ddir_status_terse(ts, rs, DDIR_COPY, ver, out);
 
 	/* CPU Usage */
 	if (ts->total_run_time) {
@@ -1545,6 +1550,7 @@ static struct json_object *show_thread_status_json(struct thread_stat *ts,
 	add_ddir_status_json(ts, rs, DDIR_READ, root);
 	add_ddir_status_json(ts, rs, DDIR_WRITE, root);
 	add_ddir_status_json(ts, rs, DDIR_TRIM, root);
+	add_ddir_status_json(ts, rs, DDIR_COPY, root);
 	add_ddir_status_json(ts, rs, DDIR_SYNC, root);
 
 	/* CPU Usage */
@@ -2325,6 +2331,8 @@ int __show_running_run_stats(void)
 			td->ts.runtime[DDIR_WRITE] += rt[i];
 		if (td_trim(td) && td->ts.io_bytes[DDIR_TRIM])
 			td->ts.runtime[DDIR_TRIM] += rt[i];
+		if (td_copy(td) && td->ts.io_bytes[DDIR_COPY])
+			td->ts.runtime[DDIR_COPY] += rt[i];
 	}
 
 	for_each_td(td, i) {
@@ -2346,6 +2354,8 @@ int __show_running_run_stats(void)
 			td->ts.runtime[DDIR_WRITE] -= rt[i];
 		if (td_trim(td) && td->ts.io_bytes[DDIR_TRIM])
 			td->ts.runtime[DDIR_TRIM] -= rt[i];
+		if (td_copy(td) && td->ts.io_bytes[DDIR_COPY])
+			td->ts.runtime[DDIR_COPY] -= rt[i];
 	}
 
 	free(rt);
diff --git a/stat.h b/stat.h
index 6dd5ef74..3d439e24 100644
--- a/stat.h
+++ b/stat.h
@@ -281,7 +281,6 @@ struct thread_stat {
 	uint32_t m_iops[DDIR_RWDIR_CNT];				\
 	uint32_t t_iops[DDIR_RWDIR_CNT];				\
 	uint32_t iops[DDIR_RWDIR_CNT];					\
-	uint32_t pad;							\
 	uint64_t elapsed_sec;						\
 	uint64_t eta_sec;						\
 	uint32_t is_pow2;						\
-- 
2.17.1




* [PATCH 4/9] Adding a new copy operation.
       [not found]   ` <CGME20201201114100epcas5p2f02995779f5172f711cc6ad3d362d50a@epcas5p2.samsung.com>
@ 2020-12-01 11:40     ` Krishna Kanth Reddy
  0 siblings, 0 replies; 13+ messages in thread
From: Krishna Kanth Reddy @ 2020-12-01 11:40 UTC (permalink / raw)
  To: axboe; +Cc: fio, Ankit Kumar, Krishna Kanth Reddy

From: Ankit Kumar <ankit.kumar@samsung.com>

Changes for rw=copy and rw=randcopy.

The copy operation requires two new FIO options:
dest_offset : the starting destination offset for the copy operation.
num_range : the number of source ranges for each copy operation.
The existing FIO option offset becomes the starting source offset.

The source ranges are generated sequentially or at random, according
to whether rw=copy or rw=randcopy is used; the destination of the
copy operation is always written sequentially.

Signed-off-by: Krishna Kanth Reddy <krish.reddy@samsung.com>
---
 HOWTO            | 24 ++++++++++++++++++++++--
 cconv.c          |  6 ++++++
 fio.1            | 21 ++++++++++++++++++++-
 options.c        | 48 ++++++++++++++++++++++++++++++++++++++++++++++++
 thread_options.h |  9 +++++++--
 5 files changed, 103 insertions(+), 5 deletions(-)

diff --git a/HOWTO b/HOWTO
index 386fd12a..5cbd46a2 100644
--- a/HOWTO
+++ b/HOWTO
@@ -177,8 +177,8 @@ Command line options
 	``--readonly`` option is an extra safety guard to prevent users from
 	accidentally starting a write or trim workload when that is not desired.
 	Fio will only modify the device under test if
-	`rw=write/randwrite/rw/randrw/trim/randtrim/trimwrite` is given.  This
-	safety net can be used as an extra precaution.
+	`rw=write/randwrite/rw/randrw/trim/randtrim/trimwrite/copy/randcopy` is given.
+	This safety net can be used as an extra precaution.
 
 .. option:: --eta=when
 
@@ -1096,6 +1096,9 @@ I/O type
 		**trim**
 				Sequential trims (Linux block devices and SCSI
 				character devices only).
+		**copy**
+				Sequential copy (Linux NVMe block devices only;
+				source offsets will be sequential).
 		**randread**
 				Random reads.
 		**randwrite**
@@ -1103,6 +1106,9 @@ I/O type
 		**randtrim**
 				Random trims (Linux block devices and SCSI
 				character devices only).
+		**randcopy**
+				Random copy (Linux NVMe block devices only;
+				source offsets will be random).
 		**rw,readwrite**
 				Sequential mixed reads and writes.
 		**randrw**
@@ -1280,6 +1286,21 @@ I/O type
 	If a percentage is given, the generated offset will be aligned to the minimum
 	``blocksize`` or to the value of ``offset_align`` if provided.
 
+.. option:: dest_offset=int
+
+	Destination offset in the file for the copy command, given as either a fixed
+	size in bytes or a percentage. The option 'offset' is used as the source
+	offset for the copy command. If a percentage is given, the generated offset
+	will be aligned to the minimum ``blocksize`` or to the value of
+	``offset_align`` if provided. A percentage can be specified by a number
+	between 1 and 100 followed by '%', for example, ``dest_offset=40%``.
+
+.. option:: num_range=int
+
+	For the copy command this must be the number of source ranges to copy at a
+	time. The number of logical blocks per source range is determined by the
+	option ``bs`` and must be a multiple of the logical block size.
+
 .. option:: number_ios=int
 
 	Fio will normally perform I/Os until it has exhausted the size of the region
diff --git a/cconv.c b/cconv.c
index 488dd799..08174af0 100644
--- a/cconv.c
+++ b/cconv.c
@@ -101,6 +101,7 @@ void convert_thread_options_to_cpu(struct thread_options *o,
 	o->serialize_overlap = le32_to_cpu(top->serialize_overlap);
 	o->size = le64_to_cpu(top->size);
 	o->io_size = le64_to_cpu(top->io_size);
+	o->num_range = le32_to_cpu(top->num_range);
 	o->size_percent = le32_to_cpu(top->size_percent);
 	o->io_size_percent = le32_to_cpu(top->io_size_percent);
 	o->fill_device = le32_to_cpu(top->fill_device);
@@ -110,6 +111,8 @@ void convert_thread_options_to_cpu(struct thread_options *o,
 	o->start_offset = le64_to_cpu(top->start_offset);
 	o->start_offset_align = le64_to_cpu(top->start_offset_align);
 	o->start_offset_percent = le32_to_cpu(top->start_offset_percent);
+	o->dest_offset = le64_to_cpu(top->dest_offset);
+	o->dest_offset_percent = le32_to_cpu(top->dest_offset_percent);
 
 	for (i = 0; i < DDIR_RWDIR_CNT; i++) {
 		o->bs[i] = le64_to_cpu(top->bs[i]);
@@ -553,6 +556,7 @@ void convert_thread_options_to_net(struct thread_options_pack *top,
 
 	top->size = __cpu_to_le64(o->size);
 	top->io_size = __cpu_to_le64(o->io_size);
+	top->num_range = __cpu_to_le32(o->num_range);
 	top->verify_backlog = __cpu_to_le64(o->verify_backlog);
 	top->start_delay = __cpu_to_le64(o->start_delay);
 	top->start_delay_high = __cpu_to_le64(o->start_delay_high);
@@ -574,6 +578,8 @@ void convert_thread_options_to_net(struct thread_options_pack *top,
 	top->start_offset = __cpu_to_le64(o->start_offset);
 	top->start_offset_align = __cpu_to_le64(o->start_offset_align);
 	top->start_offset_percent = __cpu_to_le32(o->start_offset_percent);
+	top->dest_offset = __cpu_to_le64(o->dest_offset);
+	top->dest_offset_percent = __cpu_to_le32(o->dest_offset_percent);
 	top->trim_backlog = __cpu_to_le64(o->trim_backlog);
 	top->offset_increment_percent = __cpu_to_le32(o->offset_increment_percent);
 	top->offset_increment = __cpu_to_le64(o->offset_increment);
diff --git a/fio.1 b/fio.1
index 48119325..d4da1b0c 100644
--- a/fio.1
+++ b/fio.1
@@ -74,7 +74,7 @@ Convert \fIjobfile\fR to a set of command\-line options.
 Turn on safety read\-only checks, preventing writes and trims. The \fB\-\-readonly\fR
 option is an extra safety guard to prevent users from accidentally starting
 a write or trim workload when that is not desired. Fio will only modify the
-device under test if `rw=write/randwrite/rw/randrw/trim/randtrim/trimwrite'
+device under test if `rw=write/randwrite/rw/randrw/trim/randtrim/trimwrite/copy/randcopy'
 is given. This safety net can be used as an extra precaution.
 .TP
 .BI \-\-eta \fR=\fPwhen
@@ -860,6 +860,9 @@ Sequential writes.
 .B trim
 Sequential trims (Linux block devices and SCSI character devices only).
 .TP
+.B copy
+Sequential copy (Linux NVMe block devices only; source offsets will be sequential).
+.TP
 .B randread
 Random reads.
 .TP
@@ -869,6 +872,9 @@ Random writes.
 .B randtrim
 Random trims (Linux block devices and SCSI character devices only).
 .TP
+.B randcopy
+Random copy (Linux NVMe block devices only; source offsets will be random).
+.TP
 .B rw,readwrite
 Sequential mixed reads and writes.
 .TP
@@ -1055,6 +1061,19 @@ spacing between the starting points. Percentages can be used for this option.
 If a percentage is given, the generated offset will be aligned to the minimum
 \fBblocksize\fR or to the value of \fBoffset_align\fR if provided.
 .TP
+.BI dest_offset \fR=\fPint
+Destination offset in the file for the copy command, given as either a fixed size
+in bytes or a percentage. The \fBoffset\fR option is used as the source offset.
+If a percentage is given, the generated destination offset will be aligned to the
+minimum \fBblocksize\fR or to the value of \fBoffset_align\fR if provided.
+A percentage can be specified by a number between 1 and 100 followed by '%',
+for example, `dest_offset=40%' to specify 40%.
+.TP
+.BI num_range \fR=\fPint
+For the copy command this must be the number of source ranges to copy at a time.
+The number of logical blocks per source range is determined by the option \fBbs\fR
+and must be a multiple of the logical block size.
+.TP
 .BI number_ios \fR=\fPint
 Fio will normally perform I/Os until it has exhausted the size of the region
 set by \fBsize\fR, or if it exhaust the allocated time (or hits an error
diff --git a/options.c b/options.c
index f68ae8c2..e085ab25 100644
--- a/options.c
+++ b/options.c
@@ -1460,6 +1460,22 @@ static int str_offset_increment_cb(void *data, unsigned long long *__val)
 	return 0;
 }
 
+static int str_dest_offset_cb(void *data, unsigned long long *__val)
+{
+	struct thread_data *td = cb_data_to_td(data);
+	unsigned long long v = *__val;
+
+	if (parse_is_percent(v)) {
+		td->o.dest_offset = 0;
+		td->o.dest_offset_percent = -1ULL - v;
+		dprint(FD_PARSE, "SET dest_offset_percent %d\n",
+					td->o.dest_offset_percent);
+	} else
+		td->o.dest_offset = v;
+
+	return 0;
+}
+
 static int str_size_cb(void *data, unsigned long long *__val)
 {
 	struct thread_data *td = cb_data_to_td(data);
@@ -1737,6 +1753,10 @@ struct fio_option fio_options[FIO_MAX_OPTS] = {
 			    .oval = TD_DDIR_TRIM,
 			    .help = "Sequential trim",
 			  },
+			  { .ival = "copy",
+			    .oval = TD_DDIR_COPY,
+			    .help = "Sequential copy",
+			  },
 			  { .ival = "randread",
 			    .oval = TD_DDIR_RANDREAD,
 			    .help = "Random read",
@@ -1749,6 +1769,10 @@ struct fio_option fio_options[FIO_MAX_OPTS] = {
 			    .oval = TD_DDIR_RANDTRIM,
 			    .help = "Random trim",
 			  },
+			  { .ival = "randcopy",
+			    .oval = TD_DDIR_RANDCOPY,
+			    .help = "Random copy",
+			  },
 			  { .ival = "rw",
 			    .oval = TD_DDIR_RW,
 			    .help = "Sequential read and write mix",
@@ -2111,6 +2135,30 @@ struct fio_option fio_options[FIO_MAX_OPTS] = {
 		.category = FIO_OPT_C_IO,
 		.group	= FIO_OPT_G_INVALID,
 	},
+	{
+		.name	= "dest_offset",
+		.lname	= "Destination offset",
+		.type	= FIO_OPT_STR_VAL,
+		.cb	= str_dest_offset_cb,
+		.off1	= offsetof(struct thread_options, dest_offset),
+		.help	= "Start of the destination offset",
+		.def	= "0",
+		.interval = 1024 * 1024,
+		.category = FIO_OPT_C_IO,
+		.group	= FIO_OPT_G_INVALID,
+	},
+	{
+		.name	= "num_range",
+		.lname	= "Number of ranges",
+		.type	= FIO_OPT_INT,
+		.off1	= offsetof(struct thread_options, num_range),
+		.help	= "Number of ranges for copy command",
+		.minval = 0,
+		.interval = 1,
+		.def	= "0",
+		.category = FIO_OPT_C_IO,
+		.group	= FIO_OPT_G_IO_BASIC,
+	},
 	{
 		.name	= "offset_align",
 		.lname	= "IO offset alignment",
diff --git a/thread_options.h b/thread_options.h
index 97c400fe..dac82144 100644
--- a/thread_options.h
+++ b/thread_options.h
@@ -77,6 +77,7 @@ struct thread_options {
 	unsigned int iodepth_batch_complete_min;
 	unsigned int iodepth_batch_complete_max;
 	unsigned int serialize_overlap;
+	unsigned int num_range;
 
 	unsigned int unique_filename;
 
@@ -90,6 +91,7 @@ struct thread_options {
 	unsigned long long file_size_high;
 	unsigned long long start_offset;
 	unsigned long long start_offset_align;
+	unsigned long long dest_offset;
 
 	unsigned long long bs[DDIR_RWDIR_CNT];
 	unsigned long long ba[DDIR_RWDIR_CNT];
@@ -217,6 +219,7 @@ struct thread_options {
 	char *numa_memnodes;
 	unsigned int gpu_dev_id;
 	unsigned int start_offset_percent;
+	unsigned int dest_offset_percent;
 
 	unsigned int iolog;
 	unsigned int rwmixcycle;
@@ -379,6 +382,7 @@ struct thread_options_pack {
 	uint32_t serialize_overlap;
 	uint32_t pad;
 
+	uint32_t num_range;
 	uint64_t size;
 	uint64_t io_size;
 	uint32_t size_percent;
@@ -390,6 +394,7 @@ struct thread_options_pack {
 	uint64_t file_size_high;
 	uint64_t start_offset;
 	uint64_t start_offset_align;
+	uint64_t dest_offset;
 
 	uint64_t bs[DDIR_RWDIR_CNT];
 	uint64_t ba[DDIR_RWDIR_CNT];
@@ -462,8 +467,6 @@ struct thread_options_pack {
 	struct zone_split zone_split[DDIR_RWDIR_CNT][ZONESPLIT_MAX];
 	uint32_t zone_split_nr[DDIR_RWDIR_CNT];
 
-	uint8_t pad1[4];
-
 	fio_fp64_t zipf_theta;
 	fio_fp64_t pareto_h;
 	fio_fp64_t gauss_dev;
@@ -514,6 +517,7 @@ struct thread_options_pack {
 #endif
 	uint32_t gpu_dev_id;
 	uint32_t start_offset_percent;
+	uint32_t dest_offset_percent;
 	uint32_t cpus_allowed_policy;
 	uint32_t iolog;
 	uint32_t rwmixcycle;
@@ -552,6 +556,7 @@ struct thread_options_pack {
 	uint32_t lat_percentiles;
 	uint32_t slat_percentiles;
 	uint32_t percentile_precision;
+	uint32_t pad3;
 	fio_fp64_t percentile_list[FIO_IO_U_LIST_MAX_LEN];
 
 	uint8_t read_iolog_file[FIO_TOP_STR_MAX];
-- 
2.17.1



^ permalink raw reply related	[flat|nested] 13+ messages in thread

* [PATCH 5/9] Added the changes for copy operation support in FIO.
       [not found]   ` <CGME20201201114103epcas5p1bbf3d8ca05252935c14fed68f44dab2d@epcas5p1.samsung.com>
@ 2020-12-01 11:40     ` Krishna Kanth Reddy
  0 siblings, 0 replies; 13+ messages in thread
From: Krishna Kanth Reddy @ 2020-12-01 11:40 UTC (permalink / raw)
  To: axboe; +Cc: fio, Ankit Kumar, Krishna Kanth Reddy

From: Ankit Kumar <ankit.kumar@samsung.com>

The source ranges for the copy operation use the existing offset
generation algorithm. As each copy operation has num_range source
ranges, get_next_offset() is called that many times. The data buffer
now contains a range list, with each entry holding a start offset
and the length in bytes of that source range.
A new function generates the destination offset.
Each successful copy operation copies num_range * block_size
bytes of data.
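For clarity, here is a standalone sketch of the buffer layout this patch
builds for a copy io_u, and of the bytes-done accounting. The struct
layout mirrors the source_range added to fio.h; the helper functions are
illustrative only and not part of the patch (offsets[] stands in for
successive get_next_offset() results):

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Mirrors the source_range layout this series adds to fio.h. */
struct source_range {
	uint64_t start;
	uint64_t len;
};

/*
 * Pack num_range entries into buf the way fill_io_u() does for
 * DDIR_COPY. Returns the resulting xfer_buflen.
 */
static uint64_t pack_copy_ranges(uint8_t *buf, const uint64_t *offsets,
				 uint64_t num_range, uint64_t bs)
{
	uint64_t i;

	for (i = 0; i < num_range; i++) {
		struct source_range entry = { .start = offsets[i], .len = bs };

		memcpy(buf + i * sizeof(entry), &entry, sizeof(entry));
	}
	return num_range * sizeof(struct source_range);
}

/* Bytes moved by one completed copy, as accounted in io_completed(). */
static uint64_t copy_bytes_done(uint64_t xfer_buflen, uint64_t bs)
{
	return xfer_buflen * bs / sizeof(struct source_range);
}
```

So a copy io_u with num_range=8 and bs=16k carries a 128-byte range list
but accounts for 8 * 16k of data moved.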

Signed-off-by: Krishna Kanth Reddy <krish.reddy@samsung.com>
---
 backend.c     |  22 +++++++++--
 file.h        |   3 ++
 filesetup.c   |  61 ++++++++++++++++++++++++++++++
 fio.h         |  12 ++++++
 init.c        |  39 +++++++++++--------
 io_u.c        | 103 +++++++++++++++++++++++++++++++++++++++++---------
 rate-submit.c |   2 +
 zbd.c         |   2 +
 8 files changed, 207 insertions(+), 37 deletions(-)

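(Note for reviewers: the destination alignment in get_dest_offset()
below reduces to a ceiling-division round-up; a minimal standalone form
of that computation, for reference only:)

```c
#include <assert.h>
#include <stdint.h>

/*
 * get_dest_offset() block-aligns the raw destination offset upward to
 * the next align_bs boundary: ceiling(offset / align_bs) * align_bs.
 */
static uint64_t align_up(uint64_t offset, uint64_t align_bs)
{
	return (offset / align_bs + (offset % align_bs != 0)) * align_bs;
}
```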
diff --git a/backend.c b/backend.c
index 2f4d6ac4..c0569c59 100644
--- a/backend.c
+++ b/backend.c
@@ -521,8 +521,13 @@ sync_done:
 		 */
 		if (td->io_ops->commit == NULL)
 			io_u_queued(td, io_u);
-		if (bytes_issued)
-			*bytes_issued += io_u->xfer_buflen;
+		if (bytes_issued) {
+			if (io_u->ddir == DDIR_COPY) {
+				*bytes_issued += (((io_u->xfer_buflen) * td->o.bs[DDIR_COPY]) /
+						   sizeof(struct source_range));
+			} else
+				*bytes_issued += io_u->xfer_buflen;
+		}
 		break;
 	case FIO_Q_BUSY:
 		if (!from_verify)
@@ -721,6 +726,10 @@ static void do_verify(struct thread_data *td, uint64_t verify_bytes)
 					io_u->ddir = DDIR_READ;
 					populate_verify_io_u(td, io_u);
 					break;
+				} else if (io_u->ddir == DDIR_COPY) {
+					td->io_issues[DDIR_COPY]++;
+					put_io_u(td, io_u);
+					continue;
 				} else {
 					put_io_u(td, io_u);
 					continue;
@@ -802,6 +811,8 @@ static bool io_bytes_exceeded(struct thread_data *td, uint64_t *this_bytes)
 		bytes = this_bytes[DDIR_WRITE];
 	else if (td_read(td))
 		bytes = this_bytes[DDIR_READ];
+	else if (td_copy(td))
+		bytes = this_bytes[DDIR_COPY];
 	else
 		bytes = this_bytes[DDIR_TRIM];
 
@@ -1278,7 +1289,8 @@ int init_io_u_buffers(struct thread_data *td)
 	td->orig_buffer_size = (unsigned long long) max_bs
 					* (unsigned long long) max_units;
 
-	if (td_ioengine_flagged(td, FIO_NOIO) || !(td_read(td) || td_write(td)))
+	if (td_ioengine_flagged(td, FIO_NOIO) || !(td_read(td) ||
+	    td_write(td) || td_copy(td)))
 		data_xfer = 0;
 
 	/*
@@ -1751,13 +1763,15 @@ static void *thread_main(void *data)
 	memcpy(&td->ss.prev_time, &td->epoch, sizeof(td->epoch));
 
 	if (o->ratemin[DDIR_READ] || o->ratemin[DDIR_WRITE] ||
-			o->ratemin[DDIR_TRIM]) {
+	    o->ratemin[DDIR_TRIM] || o->ratemin[DDIR_COPY]) {
 	        memcpy(&td->lastrate[DDIR_READ], &td->bw_sample_time,
 					sizeof(td->bw_sample_time));
 	        memcpy(&td->lastrate[DDIR_WRITE], &td->bw_sample_time,
 					sizeof(td->bw_sample_time));
 	        memcpy(&td->lastrate[DDIR_TRIM], &td->bw_sample_time,
 					sizeof(td->bw_sample_time));
+	        memcpy(&td->lastrate[DDIR_COPY], &td->bw_sample_time,
+					sizeof(td->bw_sample_time));
 	}
 
 	memset(bytes_done, 0, sizeof(bytes_done));
diff --git a/file.h b/file.h
index 493ec04a..f5a794e4 100644
--- a/file.h
+++ b/file.h
@@ -99,6 +99,7 @@ struct fio_file {
 	 */
 	uint64_t real_file_size;
 	uint64_t file_offset;
+	uint64_t file_dest_offset;
 	uint64_t io_size;
 
 	/*
@@ -113,6 +114,7 @@ struct fio_file {
 	 * Track last end and last start of IO for a given data direction
 	 */
 	uint64_t last_pos[DDIR_RWDIR_CNT];
+	uint64_t last_pos_dest[DDIR_RWDIR_CNT];
 	uint64_t last_start[DDIR_RWDIR_CNT];
 
 	uint64_t first_write;
@@ -199,6 +201,7 @@ struct thread_data;
 extern void close_files(struct thread_data *);
 extern void close_and_free_files(struct thread_data *);
 extern uint64_t get_start_offset(struct thread_data *, struct fio_file *);
+extern uint64_t get_dest_offset(struct thread_data *, struct fio_file *);
 extern int __must_check setup_files(struct thread_data *);
 extern int __must_check file_invalidate_cache(struct thread_data *, struct fio_file *);
 #ifdef __cplusplus
diff --git a/filesetup.c b/filesetup.c
index 42c5f630..68a21fac 100644
--- a/filesetup.c
+++ b/filesetup.c
@@ -679,6 +679,14 @@ open_again:
 		else
 			flags |= O_RDONLY;
 
+		if (is_std)
+			f->fd = dup(STDIN_FILENO);
+		else
+			from_hash = file_lookup_open(f, flags);
+	} else if (td_copy(td)) {
+		if (!read_only)
+			flags |= O_RDWR;
+
 		if (is_std)
 			f->fd = dup(STDIN_FILENO);
 		else
@@ -911,6 +919,54 @@ uint64_t get_start_offset(struct thread_data *td, struct fio_file *f)
 	return offset;
 }
 
+uint64_t get_dest_offset(struct thread_data *td, struct fio_file *f)
+{
+	bool align = false;
+	struct thread_options *o = &td->o;
+	unsigned long long align_bs;
+	unsigned long long offset;
+	unsigned long long increment;
+
+	if (o->offset_increment_percent) {
+		assert(!o->offset_increment);
+		increment = o->offset_increment_percent * f->real_file_size / 100;
+		align = true;
+	} else
+		increment = o->offset_increment;
+
+	if (o->dest_offset_percent > 0) {
+		/* calculate the raw offset */
+		offset = (f->real_file_size * o->dest_offset_percent / 100) +
+			(td->subjob_number * increment);
+
+		align = true;
+	} else {
+		/* start_offset_percent not set */
+		offset = o->dest_offset +
+				td->subjob_number * increment;
+	}
+
+	if (align) {
+		/*
+		 * if offset_align is provided, use it
+		 */
+		if (fio_option_is_set(o, start_offset_align)) {
+			align_bs = o->start_offset_align;
+		} else {
+			/* else take the minimum block size */
+			align_bs = td_min_bs(td);
+		}
+
+		/*
+		 * block align the offset at the next available boundary at
+		 * ceiling(offset / align_bs) * align_bs
+		 */
+		offset = (offset / align_bs + (offset % align_bs != 0)) * align_bs;
+	}
+
+	return offset;
+}
+
 /*
  * Find longest path component that exists and return its length
  */
@@ -1172,6 +1228,9 @@ int setup_files(struct thread_data *td)
 				    td_ioengine_flagged(td, FIO_FAKEIO)))
 				f->real_file_size = f->io_size + f->file_offset;
 		}
+
+		if (td_copy(td))
+			f->file_dest_offset = get_dest_offset(td, f);
 	}
 
 	if (td->o.block_error_hist) {
@@ -1310,6 +1369,7 @@ static void __init_rand_distribution(struct thread_data *td, struct fio_file *f)
 	uint64_t fsize;
 
 	range_size = min(td->o.min_bs[DDIR_READ], td->o.min_bs[DDIR_WRITE]);
+	range_size = min((unsigned long long)range_size, td->o.min_bs[DDIR_COPY]);
 	fsize = min(f->real_file_size, f->io_size);
 
 	nranges = (fsize + range_size - 1ULL) / range_size;
@@ -1956,6 +2016,7 @@ void fio_file_reset(struct thread_data *td, struct fio_file *f)
 	for (i = 0; i < DDIR_RWDIR_CNT; i++) {
 		f->last_pos[i] = f->file_offset;
 		f->last_start[i] = -1ULL;
+		f->last_pos_dest[i] = f->file_dest_offset;
 	}
 
 	if (fio_file_axmap(f))
diff --git a/fio.h b/fio.h
index fffec001..5c1a7c88 100644
--- a/fio.h
+++ b/fio.h
@@ -70,6 +70,14 @@
 
 struct fio_sem;
 
+/*
+ * Source range data for copy command
+ */
+struct source_range {
+       uint64_t  start;
+       uint64_t  len;
+};
+
 /*
  * offset generator types
  */
@@ -123,6 +131,7 @@ enum {
 	FIO_RAND_BS_OFF		= 0,
 	FIO_RAND_BS1_OFF,
 	FIO_RAND_BS2_OFF,
+	FIO_RAND_BS3_OFF,
 	FIO_RAND_VER_OFF,
 	FIO_RAND_MIX_OFF,
 	FIO_RAND_FILE_OFF,
@@ -133,6 +142,7 @@ enum {
 	FIO_RAND_SEQ_RAND_READ_OFF,
 	FIO_RAND_SEQ_RAND_WRITE_OFF,
 	FIO_RAND_SEQ_RAND_TRIM_OFF,
+	FIO_RAND_SEQ_RAND_COPY_OFF,
 	FIO_RAND_START_DELAY,
 	FIO_DEDUPE_OFF,
 	FIO_RAND_POISSON_OFF,
@@ -774,6 +784,7 @@ static inline unsigned long long td_max_bs(struct thread_data *td)
 	unsigned long long max_bs;
 
 	max_bs = max(td->o.max_bs[DDIR_READ], td->o.max_bs[DDIR_WRITE]);
+	max_bs = max(td->o.max_bs[DDIR_COPY], max_bs);
 	return max(td->o.max_bs[DDIR_TRIM], max_bs);
 }
 
@@ -782,6 +793,7 @@ static inline unsigned long long td_min_bs(struct thread_data *td)
 	unsigned long long min_bs;
 
 	min_bs = min(td->o.min_bs[DDIR_READ], td->o.min_bs[DDIR_WRITE]);
+	min_bs = min(td->o.min_bs[DDIR_COPY], min_bs);
 	return min(td->o.min_bs[DDIR_TRIM], min_bs);
 }
 
diff --git a/init.c b/init.c
index f9c20bdb..e5835b7b 100644
--- a/init.c
+++ b/init.c
@@ -592,8 +592,10 @@ static int fixed_block_size(struct thread_options *o)
 	return o->min_bs[DDIR_READ] == o->max_bs[DDIR_READ] &&
 		o->min_bs[DDIR_WRITE] == o->max_bs[DDIR_WRITE] &&
 		o->min_bs[DDIR_TRIM] == o->max_bs[DDIR_TRIM] &&
+		o->min_bs[DDIR_COPY] == o->max_bs[DDIR_COPY] &&
 		o->min_bs[DDIR_READ] == o->min_bs[DDIR_WRITE] &&
-		o->min_bs[DDIR_READ] == o->min_bs[DDIR_TRIM];
+		o->min_bs[DDIR_READ] == o->min_bs[DDIR_TRIM] &&
+		o->min_bs[DDIR_READ] == o->min_bs[DDIR_COPY];
 }
 
 /*
@@ -616,8 +618,8 @@ static int fixup_options(struct thread_data *td)
 	struct thread_options *o = &td->o;
 	int ret = 0;
 
-	if (read_only && (td_write(td) || td_trim(td))) {
-		log_err("fio: trim and write operations are not allowed"
+	if (read_only && (td_write(td) || td_trim(td) || td_copy(td))) {
+		log_err("fio: trim, copy and write operations are not allowed"
 			 " with the --readonly parameter.\n");
 		ret |= 1;
 	}
@@ -670,9 +672,9 @@ static int fixup_options(struct thread_data *td)
 		o->zone_range = o->zone_size;
 
 	/*
-	 * Reads can do overwrites, we always need to pre-create the file
+	 * Reads and copies can do overwrites; we always need to pre-create the file
 	 */
-	if (td_read(td))
+	if (td_read(td) || td_copy(td))
 		o->overwrite = 1;
 
 	for_each_rw_ddir(ddir) {
@@ -697,7 +699,8 @@ static int fixup_options(struct thread_data *td)
 
 	if ((o->ba[DDIR_READ] != o->min_bs[DDIR_READ] ||
 	    o->ba[DDIR_WRITE] != o->min_bs[DDIR_WRITE] ||
-	    o->ba[DDIR_TRIM] != o->min_bs[DDIR_TRIM]) &&
+	    o->ba[DDIR_TRIM] != o->min_bs[DDIR_TRIM] ||
+	    o->ba[DDIR_COPY] != o->min_bs[DDIR_COPY]) &&
 	    !o->norandommap) {
 		log_err("fio: Any use of blockalign= turns off randommap\n");
 		o->norandommap = 1;
@@ -765,10 +768,10 @@ static int fixup_options(struct thread_data *td)
 	if (o->open_files > o->nr_files || !o->open_files)
 		o->open_files = o->nr_files;
 
-	if (((o->rate[DDIR_READ] + o->rate[DDIR_WRITE] + o->rate[DDIR_TRIM]) &&
-	    (o->rate_iops[DDIR_READ] + o->rate_iops[DDIR_WRITE] + o->rate_iops[DDIR_TRIM])) ||
-	    ((o->ratemin[DDIR_READ] + o->ratemin[DDIR_WRITE] + o->ratemin[DDIR_TRIM]) &&
-	    (o->rate_iops_min[DDIR_READ] + o->rate_iops_min[DDIR_WRITE] + o->rate_iops_min[DDIR_TRIM]))) {
+	if (((o->rate[DDIR_READ] + o->rate[DDIR_WRITE] + o->rate[DDIR_TRIM] + o->rate[DDIR_COPY]) &&
+	    (o->rate_iops[DDIR_READ] + o->rate_iops[DDIR_WRITE] + o->rate_iops[DDIR_TRIM] + o->rate_iops[DDIR_COPY])) ||
+	    ((o->ratemin[DDIR_READ] + o->ratemin[DDIR_WRITE] + o->ratemin[DDIR_TRIM] + o->ratemin[DDIR_COPY]) &&
+	    (o->rate_iops_min[DDIR_READ] + o->rate_iops_min[DDIR_WRITE] + o->rate_iops_min[DDIR_TRIM] + o->rate_iops_min[DDIR_COPY]))) {
 		log_err("fio: rate and rate_iops are mutually exclusive\n");
 		ret |= 1;
 	}
@@ -1000,6 +1003,7 @@ static void td_fill_rand_seeds_internal(struct thread_data *td, bool use64)
 	uint64_t read_seed = td->rand_seeds[FIO_RAND_BS_OFF];
 	uint64_t write_seed = td->rand_seeds[FIO_RAND_BS1_OFF];
 	uint64_t trim_seed = td->rand_seeds[FIO_RAND_BS2_OFF];
+	uint64_t copy_seed = td->rand_seeds[FIO_RAND_BS3_OFF];
 	int i;
 
 	/*
@@ -1017,6 +1021,7 @@ static void td_fill_rand_seeds_internal(struct thread_data *td, bool use64)
 	init_rand_seed(&td->bsrange_state[DDIR_READ], read_seed, use64);
 	init_rand_seed(&td->bsrange_state[DDIR_WRITE], write_seed, use64);
 	init_rand_seed(&td->bsrange_state[DDIR_TRIM], trim_seed, use64);
+	init_rand_seed(&td->bsrange_state[DDIR_COPY], copy_seed, use64);
 
 	td_fill_verify_state_seed(td);
 	init_rand_seed(&td->rwmix_state, td->rand_seeds[FIO_RAND_MIX_OFF], false);
@@ -1675,7 +1680,7 @@ static int add_job(struct thread_data *td, const char *jobname, int job_add_num,
 				fio_server_send_add_job(td);
 
 			if (!td_ioengine_flagged(td, FIO_NOIO)) {
-				char *c1, *c2, *c3, *c4;
+				char *c1, *c2, *c3, *c4, *c7, *c8;
 				char *c5 = NULL, *c6 = NULL;
 				int i2p = is_power_of_2(o->kb_base);
 				struct buf_output out;
@@ -1684,6 +1689,8 @@ static int add_job(struct thread_data *td, const char *jobname, int job_add_num,
 				c2 = num2str(o->max_bs[DDIR_READ], o->sig_figs, 1, i2p, N2S_BYTE);
 				c3 = num2str(o->min_bs[DDIR_WRITE], o->sig_figs, 1, i2p, N2S_BYTE);
 				c4 = num2str(o->max_bs[DDIR_WRITE], o->sig_figs, 1, i2p, N2S_BYTE);
+				c7 = num2str(o->min_bs[DDIR_COPY], o->sig_figs, 1, i2p, N2S_BYTE);
+				c8 = num2str(o->max_bs[DDIR_COPY], o->sig_figs, 1, i2p, N2S_BYTE);
 
 				if (!o->bs_is_seq_rand) {
 					c5 = num2str(o->min_bs[DDIR_TRIM], o->sig_figs, 1, i2p, N2S_BYTE);
@@ -1696,11 +1703,11 @@ static int add_job(struct thread_data *td, const char *jobname, int job_add_num,
 							ddir_str(o->td_ddir));
 
 				if (o->bs_is_seq_rand)
-					__log_buf(&out, "bs=(R) %s-%s, (W) %s-%s, bs_is_seq_rand, ",
-							c1, c2, c3, c4);
+					__log_buf(&out, "bs=(R) %s-%s, (W) %s-%s, (C) %s-%s, bs_is_seq_rand, ",
+							c1, c2, c3, c4, c7, c8);
 				else
-					__log_buf(&out, "bs=(R) %s-%s, (W) %s-%s, (T) %s-%s, ",
-							c1, c2, c3, c4, c5, c6);
+					__log_buf(&out, "bs=(R) %s-%s, (W) %s-%s, (T) %s-%s, (C) %s-%s, ",
+							c1, c2, c3, c4, c5, c6, c7, c8);
 
 				__log_buf(&out, "ioengine=%s, iodepth=%u\n",
 						td->io_ops->name, o->iodepth);
@@ -1713,6 +1720,8 @@ static int add_job(struct thread_data *td, const char *jobname, int job_add_num,
 				free(c4);
 				free(c5);
 				free(c6);
+				free(c7);
+				free(c8);
 			}
 		} else if (job_add_num == 1)
 			log_info("...\n");
diff --git a/io_u.c b/io_u.c
index f30fc037..83c7960a 100644
--- a/io_u.c
+++ b/io_u.c
@@ -405,6 +405,29 @@ static int get_next_seq_offset(struct thread_data *td, struct fio_file *f,
 	return 1;
 }
 
+static void get_next_dest_seq_offset(struct thread_data *td, struct fio_file *f,
+				     enum fio_ddir ddir, uint64_t num_range,
+				     uint64_t *offset)
+{
+	struct thread_options *o = &td->o;
+
+	assert(ddir_rw(ddir));
+
+	if (f->last_pos_dest[ddir] >= f->io_size + f->file_dest_offset &&
+	    o->time_based) {
+		f->last_pos_dest[ddir] =  f->file_dest_offset;
+		loop_cache_invalidate(td, f);
+	}
+	*offset = f->last_pos_dest[ddir];
+	if (f->last_pos_dest[ddir] >= f->real_file_size)
+		f->last_pos_dest[ddir] = f->file_dest_offset;
+	else {
+		f->last_pos_dest[ddir] += (num_range) * (td->o.bs[ddir]);
+		if (f->last_pos_dest[ddir] >= f->real_file_size)
+			f->last_pos_dest[ddir] =  f->file_dest_offset;
+	}
+}
+
 static int get_next_block(struct thread_data *td, struct io_u *io_u,
 			  enum fio_ddir ddir, int rw_seq,
 			  bool *is_random)
@@ -752,6 +775,8 @@ static enum fio_ddir get_rw_ddir(struct thread_data *td)
 		ddir = DDIR_WRITE;
 	else if (td_trim(td))
 		ddir = DDIR_TRIM;
+	else if (td_copy(td))
+		ddir = DDIR_COPY;
 	else
 		ddir = DDIR_INVAL;
 
@@ -905,8 +930,12 @@ static void setup_strided_zone_mode(struct thread_data *td, struct io_u *io_u)
 static int fill_io_u(struct thread_data *td, struct io_u *io_u)
 {
 	bool is_random;
-	uint64_t offset;
+	uint64_t offset, dest_offset, i = 0;
 	enum io_u_action ret;
+	struct fio_file *f = io_u->file;
+	enum fio_ddir ddir = io_u->ddir;
+	uint8_t *buf_point;
+	struct source_range entry;
 
 	if (td_ioengine_flagged(td, FIO_NOIO))
 		goto out;
@@ -928,22 +957,52 @@ static int fill_io_u(struct thread_data *td, struct io_u *io_u)
 	else if (td->o.zone_mode == ZONE_MODE_ZBD)
 		setup_zbd_zone_mode(td, io_u);
 
-	/*
-	 * No log, let the seq/rand engine retrieve the next buflen and
-	 * position.
-	 */
-	if (get_next_offset(td, io_u, &is_random)) {
-		dprint(FD_IO, "io_u %p, failed getting offset\n", io_u);
-		return 1;
-	}
+	if (io_u->ddir == DDIR_COPY) {
+		buf_point = io_u->buf;
+		offset = 0;
 
-	io_u->buflen = get_next_buflen(td, io_u, is_random);
-	if (!io_u->buflen) {
-		dprint(FD_IO, "io_u %p, failed getting buflen\n", io_u);
-		return 1;
+		while (i < td->o.num_range) {
+			if (get_next_offset(td, io_u, &is_random)) {
+				dprint(FD_IO, "io_u %p, failed getting offset\n",
+				       io_u);
+				return 1;
+			}
+
+			offset = io_u->offset;
+			entry.start = offset;
+			entry.len = td->o.bs[ddir];
+			memcpy(buf_point, &entry, sizeof(struct source_range));
+			buf_point += sizeof(struct source_range);
+			f->last_start[io_u->ddir] = io_u->offset;
+			f->last_pos[io_u->ddir] = io_u->offset + entry.len;
+			i++;
+
+			if (td_random(td) && file_randommap(td, io_u->file))
+				mark_random_map(td, io_u, offset, td->o.bs[ddir]);
+		}
+		get_next_dest_seq_offset(td, f, io_u->ddir, td->o.num_range, &dest_offset);
+		io_u->offset = dest_offset;
+
+		io_u->buflen = i * sizeof(struct source_range);
+	} else {
+		/*
+		 * No log, let the seq/rand engine retrieve the next buflen and
+		 * position.
+		 */
+		if (get_next_offset(td, io_u, &is_random)) {
+			dprint(FD_IO, "io_u %p, failed getting offset\n", io_u);
+			return 1;
+		}
+
+		io_u->buflen = get_next_buflen(td, io_u, is_random);
+		if (!io_u->buflen) {
+			dprint(FD_IO, "io_u %p, failed getting buflen\n", io_u);
+			return 1;
+		}
+
+		offset = io_u->offset;
 	}
 
-	offset = io_u->offset;
 	if (td->o.zone_mode == ZONE_MODE_ZBD) {
 		ret = zbd_adjust_block(td, io_u);
 		if (ret == io_u_eof)
@@ -961,13 +1020,16 @@ static int fill_io_u(struct thread_data *td, struct io_u *io_u)
 	/*
 	 * mark entry before potentially trimming io_u
 	 */
-	if (td_random(td) && file_randommap(td, io_u->file))
+	if (io_u->ddir != DDIR_COPY && td_random(td) && file_randommap(td, io_u->file))
 		io_u->buflen = mark_random_map(td, io_u, offset, io_u->buflen);
 
 out:
 	dprint_io_u(io_u, "fill");
 	io_u->verify_offset = io_u->offset;
-	td->zone_bytes += io_u->buflen;
+	if (io_u->ddir != DDIR_COPY)
+		td->zone_bytes += io_u->buflen;
+	else
+		td->zone_bytes += (td->o.num_range * td->o.bs[DDIR_COPY]);
 	return 0;
 }
 
@@ -1759,7 +1821,7 @@ struct io_u *get_io_u(struct thread_data *td)
 
 	assert(fio_file_open(f));
 
-	if (ddir_rw(io_u->ddir)) {
+	if (ddir_rw(io_u->ddir) && io_u->ddir != DDIR_COPY) {
 		if (!io_u->buflen && !td_ioengine_flagged(td, FIO_NOIO)) {
 			dprint(FD_IO, "get_io_u: zero buflen on %p\n", io_u);
 			goto err_put;
@@ -1982,9 +2044,14 @@ static void io_completed(struct thread_data *td, struct io_u **io_u_ptr,
 	td->last_ddir = ddir;
 
 	if (!io_u->error && ddir_rw(ddir)) {
-		unsigned long long bytes = io_u->xfer_buflen - io_u->resid;
+		unsigned long long bytes;
 		int ret;
 
+		if (io_u->ddir == DDIR_COPY)
+			bytes = (((io_u->xfer_buflen) * td->o.bs[DDIR_COPY]) /
+						   sizeof(struct source_range));
+		else
+			bytes = io_u->xfer_buflen - io_u->resid;
 		/*
 		 * Make sure we notice short IO from here, and requeue them
 		 * appropriately!
diff --git a/rate-submit.c b/rate-submit.c
index 13dbe7a2..de99906e 100644
--- a/rate-submit.c
+++ b/rate-submit.c
@@ -269,6 +269,8 @@ static void io_workqueue_update_acct_fn(struct submit_worker *sw)
 		sum_ddir(dst, src, DDIR_WRITE);
 	if (td_trim(src))
 		sum_ddir(dst, src, DDIR_TRIM);
+	if (td_copy(src))
+		sum_ddir(dst, src, DDIR_COPY);
 
 }
 
diff --git a/zbd.c b/zbd.c
index 9327816a..58fed98e 100644
--- a/zbd.c
+++ b/zbd.c
@@ -1682,6 +1682,8 @@ enum io_u_action zbd_adjust_block(struct thread_data *td, struct io_u *io_u)
 	case DDIR_LAST:
 	case DDIR_INVAL:
 		goto accept;
+	case DDIR_COPY:
+		goto eof;
 	}
 
 	assert(false);
-- 
2.17.1



^ permalink raw reply related	[flat|nested] 13+ messages in thread

* [PATCH 6/9] New ioctl based synchronous IO engine. Only supports copy command
       [not found]   ` <CGME20201201114105epcas5p4b99d6a66a543152a377461875aedf342@epcas5p4.samsung.com>
@ 2020-12-01 11:40     ` Krishna Kanth Reddy
  0 siblings, 0 replies; 13+ messages in thread
From: Krishna Kanth Reddy @ 2020-12-01 11:40 UTC (permalink / raw)
  To: axboe; +Cc: fio, Ankit Kumar, Krishna Kanth Reddy

From: Ankit Kumar <ankit.kumar@samsung.com>

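Add a new synchronous ioengine "sctl" that issues the BLKCOPY ioctl for
rw=copy / rw=randcopy jobs. The prep step packs the per-io_u range list
into a copy_range descriptor; a standalone sketch of that computation
follows (struct field names mirror fio_sctl_prep() and the in-flight
kernel RFC, where the fields are __u64, and may change upstream):

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Mirrors the source_range layout this series adds to fio.h. */
struct source_range {
	uint64_t start;
	uint64_t len;
};

/* Field layout follows the BLKCOPY RFC interface this series targets. */
struct copy_range {
	uint64_t dest;		/* destination offset */
	uint64_t nr_range;	/* number of source ranges */
	uint64_t range_list;	/* user pointer to source_range[] */
};

/* What fio_sctl_prep() computes before ioctl(fd, BLKCOPY, cr). */
static void sctl_prep(struct copy_range *cr, uint64_t dest_offset,
		      void *xfer_buf, uint64_t xfer_buflen)
{
	memset(cr, 0, sizeof(*cr));
	cr->dest = dest_offset;
	cr->nr_range = xfer_buflen / sizeof(struct source_range);
	cr->range_list = (uint64_t)(uintptr_t)xfer_buf;
}
```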
Signed-off-by: Krishna Kanth Reddy <krish.reddy@samsung.com>
---
 HOWTO          |   4 +
 Makefile       |   2 +-
 engines/sctl.c | 215 +++++++++++++++++++++++++++++++++++++++++++++++++
 fio.1          |   4 +
 io_u.h         |   3 +
 options.c      |   5 ++
 os/os-linux.h  |   1 +
 7 files changed, 233 insertions(+), 1 deletion(-)
 create mode 100644 engines/sctl.c

diff --git a/HOWTO b/HOWTO
index 5cbd46a2..76b04918 100644
--- a/HOWTO
+++ b/HOWTO
@@ -1913,6 +1913,10 @@ I/O engine
 			character devices. This engine supports trim operations.
 			The sg engine includes engine specific options.
 
+		**sctl**
+			Synchronous ioengine. This engine supports only the ``rw=copy`` and
+			``rw=randcopy`` operations. The target should be a block device file.
+
 		**null**
 			Doesn't transfer any data, just pretends to.  This is mainly used to
 			exercise fio itself and for debugging/testing purposes.
diff --git a/Makefile b/Makefile
index ecfaa3e0..3dad92ab 100644
--- a/Makefile
+++ b/Makefile
@@ -189,7 +189,7 @@ endif
 
 ifeq ($(CONFIG_TARGET_OS), Linux)
   SOURCE += diskutil.c fifo.c blktrace.c cgroup.c trim.c engines/sg.c \
-		oslib/linux-dev-lookup.c engines/io_uring.c
+		oslib/linux-dev-lookup.c engines/io_uring.c engines/sctl.c
 ifdef CONFIG_HAS_BLKZONED
   SOURCE += oslib/linux-blkzoned.c
 endif
diff --git a/engines/sctl.c b/engines/sctl.c
new file mode 100644
index 00000000..7c51e9b0
--- /dev/null
+++ b/engines/sctl.c
@@ -0,0 +1,215 @@
+/*
+ * sctl engine
+ *
+ * IO engine using the Linux ioctl based interface for NVMe device
+ * This ioengine operates in sync mode with block devices (/dev/nvmeX)
+ *
+ */
+#include <sys/stat.h>
+#include "../fio.h"
+
+struct sctl_data {
+	struct copy_range *cr;
+};
+
+static enum fio_q_status fio_sctl_queue(struct thread_data *td,
+					struct io_u *io_u)
+{
+	struct copy_range *cr = &io_u->cr;
+	struct fio_file *f = io_u->file;
+	int ret;
+
+	ret = ioctl(f->fd, BLKCOPY, cr);
+
+	if (ret < 0)
+		io_u->error = errno;
+	else {
+		io_u_mark_submit(td, 1);
+		io_u_queued(td, io_u);
+	}
+
+	if (io_u->error) {
+		td_verror(td, io_u->error, "xfer");
+		return FIO_Q_COMPLETED;
+	}
+
+	return FIO_Q_COMPLETED;
+}
+
+static int fio_sctl_prep(struct thread_data *td, struct io_u *io_u)
+{
+	struct copy_range *cr = &io_u->cr;
+
+	memset(cr, 0, sizeof(*cr));
+
+	cr->dest = io_u->offset;
+	cr->nr_range = io_u->xfer_buflen / sizeof(struct source_range);
+	cr->range_list = (__u64)io_u->xfer_buf;
+
+	return 0;
+}
+
+static void fio_sctl_cleanup(struct thread_data *td)
+{
+	struct sctl_data *sd = td->io_ops_data;
+
+	if (sd) {
+		free(sd->cr);
+		free(sd);
+	}
+}
+
+static int fio_sctl_init(struct thread_data *td)
+{
+	struct sctl_data *sd;
+
+	sd = calloc(1, sizeof(*sd));
+	sd->cr = calloc(td->o.iodepth, sizeof(struct copy_range));
+
+	td->io_ops_data = sd;
+
+	return 0;
+}
+
+static int fio_sctl_type_check(struct thread_data *td, struct fio_file *f) {
+	char cpath[PATH_MAX];
+	FILE *sfile;
+	uint32_t copy_sectors, num_ranges, copy_range_sectors;
+	struct stat st;
+	int rc;
+
+	if (f->filetype != FIO_TYPE_BLOCK)
+		return -EINVAL;
+
+	rc = stat(f->file_name, &st);
+	if (rc < 0) {
+		log_err("%s: failed to stat file %s (%s)\n",
+			td->o.name, f->file_name, strerror(errno));
+		return -errno;
+	}
+
+	snprintf(cpath, PATH_MAX, "/sys/dev/block/%d:%d/queue/max_copy_sectors",
+		 major(st.st_rdev), minor(st.st_rdev));
+
+	sfile = fopen(cpath, "r");
+	if (!sfile) {
+		log_err("%s: fopen on %s failed (%s)\n",
+			td->o.name, cpath, strerror(errno));
+		return 1;
+	}
+
+	rc = fscanf(sfile, "%u", &copy_sectors);
+	if (rc < 0) {
+		log_err("%s: fscanf on %s failed (%s)\n",
+			td->o.name, cpath, strerror(errno));
+		fclose(sfile);
+		return 1;
+	}
+
+	if (!copy_sectors) {
+		log_err("%s: Device doesn't support copy operation\n",
+			td->o.name);
+		fclose(sfile);
+		return 1;
+	}
+
+	fclose(sfile);
+
+	snprintf(cpath, PATH_MAX, "/sys/dev/block/%d:%d/queue/max_copy_nr_ranges",
+		 major(st.st_rdev), minor(st.st_rdev));
+
+	sfile = fopen(cpath, "r");
+	if (!sfile) {
+		log_err("%s: fopen on %s failed (%s)\n",
+			td->o.name, cpath, strerror(errno));
+		return 1;
+	}
+
+	rc = fscanf(sfile, "%u", &num_ranges);
+	if (rc < 0) {
+		log_err("%s: fscanf on %s failed (%s)\n",
+			td->o.name, cpath, strerror(errno));
+		fclose(sfile);
+		return 1;
+	}
+
+	if (td->o.num_range > num_ranges) {
+		log_err("%s: number of copy ranges exceeds the device limit"
+			" (%u > %u)\n", td->o.name,
+			td->o.num_range,
+			num_ranges);
+		fclose(sfile);
+		return 1;
+	}
+
+	fclose(sfile);
+
+	snprintf(cpath, PATH_MAX, "/sys/dev/block/%d:%d/queue/max_copy_range_sectors",
+		 major(st.st_rdev), minor(st.st_rdev));
+
+	sfile = fopen(cpath, "r");
+	if (!sfile) {
+		log_err("%s: fopen on %s failed (%s)\n",
+			td->o.name, cpath, strerror(errno));
+		return 1;
+	}
+
+	rc = fscanf(sfile, "%u", &copy_range_sectors);
+	if (rc < 0) {
+		log_err("%s: fscanf on %s failed (%s)\n",
+			td->o.name, cpath, strerror(errno));
+		fclose(sfile);
+		return 1;
+	}
+
+	if (td->o.bs[DDIR_COPY] > (((unsigned long long) copy_range_sectors) << 9)) {
+		log_err("%s: copy range size exceeds the device limit"
+			" (%llu > %llu)\n", td->o.name,
+			td->o.bs[DDIR_COPY],
+			((unsigned long long) copy_range_sectors) << 9);
+		fclose(sfile);
+		return 1;
+	}
+
+	fclose(sfile);
+
+	return 0;
+}
+
+static int fio_sctl_open(struct thread_data *td, struct fio_file *f) {
+	int ret;
+
+	ret = generic_open_file(td, f);
+	if (ret)
+		return ret;
+
+	if (fio_sctl_type_check(td, f)) {
+		generic_close_file(td, f);
+		return 1;
+	}
+
+	return 0;
+}
+
+static struct ioengine_ops ioengine = {
+	.name			= "sctl",
+	.version		= FIO_IOOPS_VERSION,
+	.init			= fio_sctl_init,
+	.prep			= fio_sctl_prep,
+	.queue			= fio_sctl_queue,
+	.cleanup		= fio_sctl_cleanup,
+	.open_file		= fio_sctl_open,
+	.close_file		= generic_close_file,
+	.get_file_size		= generic_get_file_size,
+	.flags			= FIO_SYNCIO
+};
+
+static void fio_init fio_sctl_register(void)
+{
+	register_ioengine(&ioengine);
+}
+
+static void fio_exit fio_sctl_unregister(void)
+{
+	unregister_ioengine(&ioengine);
+}
diff --git a/fio.1 b/fio.1
index d4da1b0c..c09d0288 100644
--- a/fio.1
+++ b/fio.1
@@ -1684,6 +1684,10 @@ I/O. Requires \fBfilename\fR option to specify either block or
 character devices. This engine supports trim operations. The
 sg engine includes engine specific options.
 .TP
+.B sctl
+Synchronous ioengine. This engine only supports `rw=copy' and
+`rw=randcopy' operations. The target should be a block device file.
+.TP
 .B libzbc
 Synchronous I/O engine for SMR hard-disks using the \fBlibzbc\fR
 library. The target can be either an sg character device or
diff --git a/io_u.h b/io_u.h
index d4c5be43..e0b10595 100644
--- a/io_u.h
+++ b/io_u.h
@@ -127,6 +127,9 @@ struct io_u {
 #endif
 #ifdef CONFIG_RDMA
 		struct ibv_mr *mr;
+#endif
+#ifdef FIO_HAVE_SCTL
+		struct copy_range cr;
 #endif
 		void *mmap_data;
 	};
diff --git a/options.c b/options.c
index e085ab25..1b3bed3b 100644
--- a/options.c
+++ b/options.c
@@ -1886,6 +1886,11 @@ struct fio_option fio_options[FIO_MAX_OPTS] = {
 			  { .ival = "sg",
 			    .help = "SCSI generic v3 IO",
 			  },
+#endif
+#ifdef FIO_HAVE_SCTL
+			  { .ival = "sctl",
+			    .help = "Simple copy IO engine",
+			  },
 #endif
 			  { .ival = "null",
 			    .help = "Testing engine (no data transfer)",
diff --git a/os/os-linux.h b/os/os-linux.h
index 5562b0da..d0a620cc 100644
--- a/os/os-linux.h
+++ b/os/os-linux.h
@@ -36,6 +36,7 @@
 #define FIO_HAVE_CPU_AFFINITY
 #define FIO_HAVE_DISK_UTIL
 #define FIO_HAVE_SGIO
+#define FIO_HAVE_SCTL
 #define FIO_HAVE_IOPRIO
 #define FIO_HAVE_IOPRIO_CLASS
 #define FIO_HAVE_IOSCHED_SWITCH
-- 
2.17.1



^ permalink raw reply related	[flat|nested] 13+ messages in thread

* [PATCH 7/9] Example configuration for simple copy command
       [not found]   ` <CGME20201201114107epcas5p4694adc6b50a123a06a06411393395636@epcas5p4.samsung.com>
@ 2020-12-01 11:40     ` Krishna Kanth Reddy
  0 siblings, 0 replies; 13+ messages in thread
From: Krishna Kanth Reddy @ 2020-12-01 11:40 UTC (permalink / raw)
  To: axboe; +Cc: fio, Ankit Kumar, Krishna Kanth Reddy

From: Ankit Kumar <ankit.kumar@samsung.com>

Signed-off-by: Krishna Kanth Reddy <krish.reddy@samsung.com>
---
 examples/fio-copy.fio | 32 ++++++++++++++++++++++++++++++++
 1 file changed, 32 insertions(+)
 create mode 100644 examples/fio-copy.fio

diff --git a/examples/fio-copy.fio b/examples/fio-copy.fio
new file mode 100644
index 00000000..02078147
--- /dev/null
+++ b/examples/fio-copy.fio
@@ -0,0 +1,32 @@
+[global]
+threads=1
+filename=fio-copy
+size=128m
+
+[writer]
+rw=write
+bs=128k
+numjobs=1
+offset=0
+buffer_pattern=0xdeadbeef
+
+[copier]
+new_group
+wait_for=writer
+rw=copy
+bs=16k
+offset=0
+dest_offset=256m
+num_range=8
+numjobs=1
+ioengine=sctl
+
+[reader]
+new_group
+wait_for=copier
+rw=read
+offset=256m
+bs=128k
+numjobs=1
+verify=pattern
+verify_pattern=0xdeadbeef
-- 
2.17.1



^ permalink raw reply related	[flat|nested] 13+ messages in thread

* [PATCH 8/9] Support copy operation for zoned block devices.
       [not found]   ` <CGME20201201114110epcas5p34032161c14f467576734346712d0c3db@epcas5p3.samsung.com>
@ 2020-12-01 11:40     ` Krishna Kanth Reddy
  2020-12-01 12:11       ` Damien Le Moal
  0 siblings, 1 reply; 13+ messages in thread
From: Krishna Kanth Reddy @ 2020-12-01 11:40 UTC (permalink / raw)
  To: axboe; +Cc: fio, Ankit Kumar, Krishna Kanth Reddy

From: Ankit Kumar <ankit.kumar@samsung.com>

Added a check so that source and destination zones don't overlap.
Source and destination offsets are aligned to zone start.
The source range zone data is copied sequentially to the destination
zones.
Added a function to reset the destination zones. Source zones won't
be reset.
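The overlap check added to setup_files() reduces to the predicate
below; this standalone form is illustrative only, with the source area
[file_offset, file_offset + io_size) and the destination area
[file_dest_offset, file_dest_offset + io_size):

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * For zonemode=zbd copy, the source and destination areas (each io_size
 * bytes long) must not overlap: the distance between their start
 * offsets has to be at least io_size.
 */
static bool zbd_copy_areas_overlap(uint64_t src_off, uint64_t dst_off,
				   uint64_t io_size)
{
	if (src_off > dst_off)
		return src_off - dst_off < io_size;
	return dst_off - src_off < io_size;
}
```

For example, a 128m source at offset 0 with the destination at 256m is
accepted, while a destination at 64m is rejected.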

Signed-off-by: Krishna Kanth Reddy <krish.reddy@samsung.com>
---
 file.h      |  4 +++
 filesetup.c | 18 +++++++++++
 init.c      |  5 +++
 io_u.c      |  2 +-
 zbd.c       | 88 ++++++++++++++++++++++++++++++++++++++++++++++++-----
 5 files changed, 109 insertions(+), 8 deletions(-)

diff --git a/file.h b/file.h
index f5a794e4..23012753 100644
--- a/file.h
+++ b/file.h
@@ -110,6 +110,10 @@ struct fio_file {
 	uint32_t min_zone;	/* inclusive */
 	uint32_t max_zone;	/* exclusive */
 
+	/* zonemode=zbd copy destination working area */
+	uint32_t min_dest_zone;	/* inclusive */
+	uint32_t max_dest_zone;	/* exclusive */
+
 	/*
 	 * Track last end and last start of IO for a given data direction
 	 */
diff --git a/filesetup.c b/filesetup.c
index 68a21fac..54752511 100644
--- a/filesetup.c
+++ b/filesetup.c
@@ -1231,6 +1231,24 @@ int setup_files(struct thread_data *td)
 
 		if (td_copy(td))
 			f->file_dest_offset = get_dest_offset(td, f);
+
+		if (td_copy(td) && (td->o.zone_mode == ZONE_MODE_ZBD)) {
+			if (f->file_offset > f->file_dest_offset) {
+				if (f->file_offset - f->file_dest_offset < f->io_size) {
+					log_err("%s: For copy operation on ZBD device "
+						 "source and destination area shouldn't overlap\n",
+						 o->name);
+					goto err_out;
+				}
+			} else {
+				if (f->file_dest_offset - f->file_offset < f->io_size) {
+					log_err("%s: For copy operation on ZBD device "
+						 "source and destination area shouldn't overlap\n",
+						 o->name);
+					goto err_out;
+				}
+			}
+		}
 	}
 
 	if (td->o.block_error_hist) {
diff --git a/init.c b/init.c
index e5835b7b..b5db65af 100644
--- a/init.c
+++ b/init.c
@@ -671,6 +671,11 @@ static int fixup_options(struct thread_data *td)
 	if (o->zone_mode == ZONE_MODE_STRIDED && !o->zone_range)
 		o->zone_range = o->zone_size;
 
+	if (o->zone_mode == ZONE_MODE_ZBD && td_copy(td) && td_random(td)) {
+		log_err("fio: --zonemode=zbd supports copy operation only in sequential mode.\n");
+		ret |= 1;
+	}
+
 	/*
 	 * Reads and copies can do overwrites, we always need to pre-create the file
 	 */
diff --git a/io_u.c b/io_u.c
index 83c7960a..2de91f2a 100644
--- a/io_u.c
+++ b/io_u.c
@@ -1003,7 +1003,7 @@ static int fill_io_u(struct thread_data *td, struct io_u *io_u)
 		offset = io_u->offset;
 	}
 
-	if (td->o.zone_mode == ZONE_MODE_ZBD) {
+	if ((td->o.zone_mode == ZONE_MODE_ZBD) && !(td_copy(td))) {
 		ret = zbd_adjust_block(td, io_u);
 		if (ret == io_u_eof)
 			return 1;
diff --git a/zbd.c b/zbd.c
index 58fed98e..8201665b 100644
--- a/zbd.c
+++ b/zbd.c
@@ -246,11 +246,11 @@ static bool zbd_is_seq_job(struct fio_file *f)
  */
 static bool zbd_verify_sizes(void)
 {
-	const struct fio_zone_info *z;
+	const struct fio_zone_info *z, *zd;
 	struct thread_data *td;
 	struct fio_file *f;
 	uint64_t new_offset, new_end;
-	uint32_t zone_idx;
+	uint32_t zone_idx, zone_didx;
 	int i, j;
 
 	for_each_td(td, i) {
@@ -259,6 +259,9 @@ static bool zbd_verify_sizes(void)
 				continue;
 			if (f->file_offset >= f->real_file_size)
 				continue;
+			if ((td->o.td_ddir == TD_DDIR_COPY) &&
+			    (f->file_dest_offset >= f->real_file_size))
+				continue;
 			if (!zbd_is_seq_job(f))
 				continue;
 
@@ -301,6 +304,15 @@ static bool zbd_verify_sizes(void)
 				f->io_size -= (new_offset - f->file_offset);
 				f->file_offset = new_offset;
 			}
+			if (td->o.td_ddir == TD_DDIR_COPY) {
+				zone_didx = zbd_zone_idx(f, f->file_dest_offset);
+				zd = &f->zbd_info->zone_info[zone_didx];
+				if (f->file_dest_offset != zd->start) {
+					new_offset = zbd_zone_end(zd);
+					f->file_dest_offset = new_offset;
+				}
+			}
+
 			zone_idx = zbd_zone_idx(f, f->file_offset + f->io_size);
 			z = &f->zbd_info->zone_info[zone_idx];
 			new_end = z->start;
@@ -320,6 +332,12 @@ static bool zbd_verify_sizes(void)
 			f->min_zone = zbd_zone_idx(f, f->file_offset);
 			f->max_zone = zbd_zone_idx(f, f->file_offset + f->io_size);
 			assert(f->min_zone < f->max_zone);
+
+			if (td->o.td_ddir == TD_DDIR_COPY) {
+				f->min_dest_zone = zbd_zone_idx(f, f->file_dest_offset);
+				f->max_dest_zone = zbd_zone_idx(f, f->file_dest_offset + f->io_size);
+				assert(f->min_dest_zone < f->max_dest_zone);
+			}
 		}
 	}
 
@@ -823,6 +841,42 @@ static int zbd_reset_zones(struct thread_data *td, struct fio_file *f,
 	return res;
 }
 
+/*
+ * Reset the destination zones of a copy operation.
+ * @td: fio thread data.
+ * @f: fio file for which to reset the destination zones.
+ */
+static void zbd_reset_dest_zones(struct thread_data *td, struct fio_file *f)
+{
+	struct fio_zone_info *z, *zb, *ze;
+	int ret = 0;
+	uint64_t offset, length;
+
+	zb = &f->zbd_info->zone_info[f->min_dest_zone];
+	ze = &f->zbd_info->zone_info[f->max_dest_zone];
+
+	for (z = zb; z < ze; z++) {
+		offset = z->start;
+		length = (z+1)->start - offset;
+
+		dprint(FD_ZBD, "%s: resetting wp of zone %u.\n", f->file_name,
+		       zbd_zone_nr(f->zbd_info, z));
+		switch (f->zbd_info->model) {
+		case ZBD_HOST_AWARE:
+		case ZBD_HOST_MANAGED:
+			ret = zbd_reset_wp(td, f, offset, length);
+			break;
+		default:
+			break;
+		}
+
+		if (ret < 0)
+			continue;
+
+		td->ts.nr_zone_resets++;
+	}
+}
+
 /*
  * Reset zbd_info.write_cnt, the counter that counts down towards the next
  * zone reset.
@@ -924,9 +978,14 @@ void zbd_file_reset(struct thread_data *td, struct fio_file *f)
 {
 	struct fio_zone_info *zb, *ze;
 
-	if (!f->zbd_info || !td_write(td))
+	if (!f->zbd_info || !(td_write(td) || td_copy(td)))
 		return;
 
+	if (td_copy(td)) {
+		zbd_reset_dest_zones(td, f);
+		return;
+	}
+
 	zb = &f->zbd_info->zone_info[f->min_zone];
 	ze = &f->zbd_info->zone_info[f->max_zone];
 	zbd_init_swd(f);
@@ -1410,8 +1469,8 @@ void setup_zbd_zone_mode(struct thread_data *td, struct io_u *io_u)
 {
 	struct fio_file *f = io_u->file;
 	enum fio_ddir ddir = io_u->ddir;
-	struct fio_zone_info *z;
-	uint32_t zone_idx;
+	struct fio_zone_info *z, *zd;
+	uint32_t zone_idx, zone_didx;
 
 	assert(td->o.zone_mode == ZONE_MODE_ZBD);
 	assert(td->o.zone_size);
@@ -1419,13 +1478,18 @@ void setup_zbd_zone_mode(struct thread_data *td, struct io_u *io_u)
 	zone_idx = zbd_zone_idx(f, f->last_pos[ddir]);
 	z = &f->zbd_info->zone_info[zone_idx];
 
+	if (ddir == DDIR_COPY) {
+		zone_didx = zbd_zone_idx(f, f->last_pos_dest[ddir]);
+		zd = &f->zbd_info->zone_info[zone_didx];
+	}
+
 	/*
 	 * When the zone capacity is smaller than the zone size and the I/O is
-	 * sequential write, skip to zone end if the latest position is at the
+	 * sequential write or copy, skip to zone end if the latest position is at the
 	 * zone capacity limit.
 	 */
 	if (z->capacity < f->zbd_info->zone_size && !td_random(td) &&
-	    ddir == DDIR_WRITE &&
+	    (ddir == DDIR_WRITE || ddir == DDIR_COPY) &&
 	    f->last_pos[ddir] >= zbd_zone_capacity_end(z)) {
 		dprint(FD_ZBD,
 		       "%s: Jump from zone capacity limit to zone end:"
@@ -1436,6 +1500,8 @@ void setup_zbd_zone_mode(struct thread_data *td, struct io_u *io_u)
 		       (unsigned long long) z->capacity);
 		td->io_skip_bytes += zbd_zone_end(z) - f->last_pos[ddir];
 		f->last_pos[ddir] = zbd_zone_end(z);
+		if (ddir == DDIR_COPY)
+			f->last_pos_dest[ddir] = zbd_zone_end(zd);
 	}
 
 	/*
@@ -1461,13 +1527,21 @@ void setup_zbd_zone_mode(struct thread_data *td, struct io_u *io_u)
 		td->zone_bytes = 0;
 		f->file_offset += td->o.zone_size + td->o.zone_skip;
 
+		if (ddir == DDIR_COPY)
+			f->file_dest_offset += td->o.zone_size + td->o.zone_skip;
 		/*
 		 * Wrap from the beginning, if we exceed the file size
 		 */
 		if (f->file_offset >= f->real_file_size)
 			f->file_offset = get_start_offset(td, f);
 
+		if ((ddir == DDIR_COPY) && f->file_dest_offset >= f->real_file_size)
+			f->file_dest_offset = get_dest_offset(td, f);
+
 		f->last_pos[ddir] = f->file_offset;
+		if (ddir == DDIR_COPY)
+			f->last_pos_dest[io_u->ddir] = f->file_dest_offset;
+
 		td->io_skip_bytes += td->o.zone_skip;
 	}
 }
-- 
2.17.1



^ permalink raw reply related	[flat|nested] 13+ messages in thread

* [PATCH 9/9] Add a new test case to test the copy operation.
       [not found]   ` <CGME20201201114113epcas5p328eabe564e0bda24c29e285bba8d8fd2@epcas5p3.samsung.com>
@ 2020-12-01 11:40     ` Krishna Kanth Reddy
  0 siblings, 0 replies; 13+ messages in thread
From: Krishna Kanth Reddy @ 2020-12-01 11:40 UTC (permalink / raw)
  To: axboe; +Cc: fio, Ankit Kumar, Krishna Kanth Reddy

From: Ankit Kumar <ankit.kumar@samsung.com>

Signed-off-by: Krishna Kanth Reddy <krish.reddy@samsung.com>
---
 t/zbd/functions        |  9 +++++++
 t/zbd/test-zbd-support | 54 ++++++++++++++++++++++++++++++++++++++++++
 2 files changed, 63 insertions(+)

diff --git a/t/zbd/functions b/t/zbd/functions
index 1a64a215..52eb697d 100644
--- a/t/zbd/functions
+++ b/t/zbd/functions
@@ -26,6 +26,15 @@ blkzone_reports_capacity() {
 		"${blkzone}" report -c 1 -o 0 "${dev}" | grep -q 'cap '
 }
 
+# Check whether or not $1 (/dev/...) supports copy operation
+check_copy_support() {
+	local copy_support
+
+	copy_support=$(<"/sys/block/$(basename "${1}")/queue/max_copy_sectors")
+
+	[[ "$copy_support" -gt 0 ]]
+}
+
 # Whether or not $1 (/dev/...) is a NVME ZNS device.
 is_nvme_zns() {
 	local s
diff --git a/t/zbd/test-zbd-support b/t/zbd/test-zbd-support
index acde3b3a..0857034d 100755
--- a/t/zbd/test-zbd-support
+++ b/t/zbd/test-zbd-support
@@ -948,6 +948,60 @@ test49() {
     check_read $((capacity * 2)) || return $?
 }
 
+# Sequential write, copy and read test
+test50() {
+    local i off opts=() size zonesize dest_off capacity
+
+    if ! check_copy_support "$dev"; then
+	echo "$dev doesn't support copy operation" \
+	     >>"${logfile}.${test_number}"
+	return 0
+    fi
+
+    for ((i=0;i<8;i++)); do
+	[ -n "$is_zbd" ] &&
+	    reset_zone "$dev" $((first_sequential_zone_sector +
+				 i*sectors_per_zone))
+    done
+
+    size=$((zone_size * 8))
+    zonesize=$((zone_size * 1))
+    off=$((first_sequential_zone_sector * 512))
+    dest_off=$((off + zone_size * 16))
+    capacity=$(total_zone_capacity 8 $off $dev)
+
+    # Start with writing to 8 zones (0 - 7).
+    opts+=("--name=$dev" "--filename=$dev" "--thread=1" "--direct=1")
+    opts+=("--offset=${off}" "--size=$size" "--ioengine=libaio" "--rw=write" "--bs=16K")
+    opts+=("--buffer_pattern=0xdeadbeef" "--iodepth=32")
+    opts+=("--zonemode=zbd" "--group_reporting=1")
+    "$(dirname "$0")/../../fio" "${opts[@]}" >> "${logfile}.${test_number}" 2>&1 ||
+	    return $?
+    check_written $capacity || return $?
+
+    # Next, run the copy operation: 8 ranges per copy command, each range 16K.
+    # loops=2 wraps around the source and destination offsets once.
+    opts=()
+    opts+=("--name=$dev" "--filename=$dev" "--thread=1" "--direct=1")
+    opts+=("--offset=${off}" "--size=$size" "--ioengine=sctl" "--rw=copy" "--bs=16K")
+    opts+=("--iodepth=1" "--num_range=8")
+    opts+=("--dest_offset=${dest_off}" "--loops=2")
+    opts+=("--zonemode=zbd" "--zonesize=${zonesize}" "--group_reporting=1")
+    run_fio "${opts[@]}" >> "${logfile}.${test_number}" 2>&1 || return $?
+
+    # Read and verify the copied data pattern. This cannot run with zonemode=zbd.
+    off=$((first_sequential_zone_sector * 512 + zone_size * 16))
+
+    opts=()
+    opts+=("--name=$dev" "--filename=$dev" "--thread=1" "--direct=1")
+    opts+=("--offset=${off}" "--size=$size" "--ioengine=libaio" "--rw=read" "--bs=4K")
+    opts+=("--iodepth=32" "--group_reporting=1")
+    opts+=("--do_verify=1" "--verify=pattern" "--verify_pattern=0xdeadbeef")
+    run_fio "${opts[@]}" >> "${logfile}.${test_number}" 2>&1 || return $?
+    check_read $capacity || return $?
+}
+
+
 tests=()
 dynamic_analyzer=()
 reset_all_zones=
-- 
2.17.1



^ permalink raw reply related	[flat|nested] 13+ messages in thread

* Re: [PATCH 8/9] Support copy operation for zoned block devices.
  2020-12-01 11:40     ` [PATCH 8/9] Support copy operation for zoned block devices Krishna Kanth Reddy
@ 2020-12-01 12:11       ` Damien Le Moal
  2020-12-10 14:14         ` Krishna Kanth Reddy
  0 siblings, 1 reply; 13+ messages in thread
From: Damien Le Moal @ 2020-12-01 12:11 UTC (permalink / raw)
  To: Krishna Kanth Reddy, axboe; +Cc: fio, Ankit Kumar

On 2020/12/01 20:44, Krishna Kanth Reddy wrote:
> From: Ankit Kumar <ankit.kumar@samsung.com>
> 
> Added a check so that source and destination zones don't overlap.
> Source and destination offsets are aligned to zone start.
> The source range zone data is copied sequentially to the destination
> zones.
> Added a function to reset the destination zones. Source zones won't
> be reset.

I do not see how this can work correctly without relying on the zone locking
mechanism. Other jobs may be writing to the same zones too. That will either
result in the copy failing or the writes failing due to the zone wp unexpectedly
changing.

> [...]


-- 
Damien Le Moal
Western Digital Research


^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH 8/9] Support copy operation for zoned block devices.
  2020-12-01 12:11       ` Damien Le Moal
@ 2020-12-10 14:14         ` Krishna Kanth Reddy
  2020-12-11  0:37           ` Damien Le Moal
  0 siblings, 1 reply; 13+ messages in thread
From: Krishna Kanth Reddy @ 2020-12-10 14:14 UTC (permalink / raw)
  To: Damien Le Moal; +Cc: axboe, fio, Ankit Kumar

On Tue, Dec 01, 2020 at 12:11:48PM +0000, Damien Le Moal wrote:
>On 2020/12/01 20:44, Krishna Kanth Reddy wrote:
>> From: Ankit Kumar <ankit.kumar@samsung.com>
>>
>> Added a check so that source and destination zones don't overlap.
>> Source and destination offsets are aligned to zone start.
>> The source range zone data is copied sequentially to the destination
>> zones.
>> Added a function to reset the destination zones. Source zones won't
>> be reset.
>
>I do not see how this can work correctly without relying on the zone locking
>mechanism. Other jobs may be writing to the same zones too. That will either
>result in the copy failing or the writes failing due to the zone wp unexpectedly
>changing.
>
Sorry for the delayed response; I was hoping for comments on the other patches in this patchset too.

Yes, you are right. There are no locks in the current implementation.
Our initial focus is to introduce a new copy workload and get the infrastructure ready for fio.

We will modify the existing patch to add the zone locking mechanism, so that it can be used in multi-threaded scenarios.
Kindly provide your feedback on the other patches in this patchset as well.

>> [...]



^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH 8/9] Support copy operation for zoned block devices.
  2020-12-10 14:14         ` Krishna Kanth Reddy
@ 2020-12-11  0:37           ` Damien Le Moal
  0 siblings, 0 replies; 13+ messages in thread
From: Damien Le Moal @ 2020-12-11  0:37 UTC (permalink / raw)
  To: Krishna Kanth Reddy; +Cc: axboe, fio, Ankit Kumar

On 2020/12/10 23:14, Krishna Kanth Reddy wrote:
> On Tue, Dec 01, 2020 at 12:11:48PM +0000, Damien Le Moal wrote:
>> On 2020/12/01 20:44, Krishna Kanth Reddy wrote:
>>> From: Ankit Kumar <ankit.kumar@samsung.com>
>>>
>>> Added a check so that source and destination zones don't overlap.
>>> Source and destination offsets are aligned to zone start.
>>> The source range zone data is copied sequentially to the destination
>>> zones.
>>> Added a function to reset the destination zones. Source zones won't
>>> be reset.
>>
>> I do not see how this can work correctly without relying on the zone locking
>> mechanism. Other jobs may be writing to the same zones too. That will either
>> result in the copy failing or the writes failing due to the zone wp unexpectedly
>> changing.
>>
> Sorry for the delayed response; I was hoping for comments on the other patches in this patchset too.
> 
> Yes, you are right. There are no locks in the current implementation.
> Our initial focus is to introduce a new copy workload and get the infrastructure ready for FIO.
> 
> We will modify the existing patch to add a zone locking mechanism, so that it can be used in multi-threaded scenarios.
> Kindly provide your valuable feedback on the other patches in this patchset too.

I find it very difficult to review something that does not yet have a stable,
well-defined kernel interface. So I would prefer to first wait for more progress
with the block copy implementation on the kernel side. Reviewing the fio patches
will then make more sense.

> 
>>>
>>> Signed-off-by: Krishna Kanth Reddy <krish.reddy@samsung.com>
>>> ---
>>>  file.h      |  4 +++
>>>  filesetup.c | 18 +++++++++++
>>>  init.c      |  5 +++
>>>  io_u.c      |  2 +-
>>>  zbd.c       | 88 ++++++++++++++++++++++++++++++++++++++++++++++++-----
>>>  5 files changed, 109 insertions(+), 8 deletions(-)
>>>
>>> diff --git a/file.h b/file.h
>>> index f5a794e4..23012753 100644
>>> --- a/file.h
>>> +++ b/file.h
>>> @@ -110,6 +110,10 @@ struct fio_file {
>>>  	uint32_t min_zone;	/* inclusive */
>>>  	uint32_t max_zone;	/* exclusive */
>>>
>>> +	/* zonemode=zbd copy destination working area */
>>> +	uint32_t min_dest_zone;	/* inclusive */
>>> +	uint32_t max_dest_zone;	/* exclusive */
>>> +
>>>  	/*
>>>  	 * Track last end and last start of IO for a given data direction
>>>  	 */
>>> diff --git a/filesetup.c b/filesetup.c
>>> index 68a21fac..54752511 100644
>>> --- a/filesetup.c
>>> +++ b/filesetup.c
>>> @@ -1231,6 +1231,24 @@ int setup_files(struct thread_data *td)
>>>
>>>  		if (td_copy(td))
>>>  			f->file_dest_offset = get_dest_offset(td, f);
>>> +
>>> +		if (td_copy(td) && (td->o.zone_mode == ZONE_MODE_ZBD)) {
>>> +			if (f->file_offset > f->file_dest_offset) {
>>> +				if (f->file_offset - f->file_dest_offset < f->io_size) {
>>> +					log_err("%s: For copy operation on ZBD device "
>>> +						 "source and destination area shouldn't overlap\n",
>>> +						 o->name);
>>> +					goto err_out;
>>> +				}
>>> +			} else {
>>> +				if (f->file_dest_offset - f->file_offset < f->io_size) {
>>> +					log_err("%s: For copy operation on ZBD device "
>>> +						 "source and destination area shouldn't overlap\n",
>>> +						 o->name);
>>> +					goto err_out;
>>> +				}
>>> +			}
>>> +		}
>>>  	}
>>>
>>>  	if (td->o.block_error_hist) {
>>> diff --git a/init.c b/init.c
>>> index e5835b7b..b5db65af 100644
>>> --- a/init.c
>>> +++ b/init.c
>>> @@ -671,6 +671,11 @@ static int fixup_options(struct thread_data *td)
>>>  	if (o->zone_mode == ZONE_MODE_STRIDED && !o->zone_range)
>>>  		o->zone_range = o->zone_size;
>>>
>>> +	if (o->zone_mode == ZONE_MODE_ZBD && td_copy(td) && td_random(td)) {
>>> +		log_err("fio: --zonemode=zbd supports copy operation only in sequential mode.\n");
>>> +		ret |= 1;
>>> +	}
>>> +
>>>  	/*
>>>  	 * Reads and copies can do overwrites, we always need to pre-create the file
>>>  	 */
>>> diff --git a/io_u.c b/io_u.c
>>> index 83c7960a..2de91f2a 100644
>>> --- a/io_u.c
>>> +++ b/io_u.c
>>> @@ -1003,7 +1003,7 @@ static int fill_io_u(struct thread_data *td, struct io_u *io_u)
>>>  		offset = io_u->offset;
>>>  	}
>>>
>>> -	if (td->o.zone_mode == ZONE_MODE_ZBD) {
>>> +	if ((td->o.zone_mode == ZONE_MODE_ZBD) && !(td_copy(td))) {
>>>  		ret = zbd_adjust_block(td, io_u);
>>>  		if (ret == io_u_eof)
>>>  			return 1;
>>> diff --git a/zbd.c b/zbd.c
>>> index 58fed98e..8201665b 100644
>>> --- a/zbd.c
>>> +++ b/zbd.c
>>> @@ -246,11 +246,11 @@ static bool zbd_is_seq_job(struct fio_file *f)
>>>   */
>>>  static bool zbd_verify_sizes(void)
>>>  {
>>> -	const struct fio_zone_info *z;
>>> +	const struct fio_zone_info *z, *zd;
>>>  	struct thread_data *td;
>>>  	struct fio_file *f;
>>>  	uint64_t new_offset, new_end;
>>> -	uint32_t zone_idx;
>>> +	uint32_t zone_idx, zone_didx;
>>>  	int i, j;
>>>
>>>  	for_each_td(td, i) {
>>> @@ -259,6 +259,9 @@ static bool zbd_verify_sizes(void)
>>>  				continue;
>>>  			if (f->file_offset >= f->real_file_size)
>>>  				continue;
>>> +			if ((td->o.td_ddir == TD_DDIR_COPY) &&
>>> +			    (f->file_dest_offset >= f->real_file_size))
>>> +				continue;
>>>  			if (!zbd_is_seq_job(f))
>>>  				continue;
>>>
>>> @@ -301,6 +304,15 @@ static bool zbd_verify_sizes(void)
>>>  				f->io_size -= (new_offset - f->file_offset);
>>>  				f->file_offset = new_offset;
>>>  			}
>>> +			if (td->o.td_ddir == TD_DDIR_COPY) {
>>> +				zone_didx = zbd_zone_idx(f, f->file_dest_offset);
>>> +				zd = &f->zbd_info->zone_info[zone_didx];
>>> +				if (f->file_dest_offset != zd->start) {
>>> +					new_offset = zbd_zone_end(zd);
>>> +					f->file_dest_offset = new_offset;
>>> +				}
>>> +			}
>>> +
>>>  			zone_idx = zbd_zone_idx(f, f->file_offset + f->io_size);
>>>  			z = &f->zbd_info->zone_info[zone_idx];
>>>  			new_end = z->start;
>>> @@ -320,6 +332,12 @@ static bool zbd_verify_sizes(void)
>>>  			f->min_zone = zbd_zone_idx(f, f->file_offset);
>>>  			f->max_zone = zbd_zone_idx(f, f->file_offset + f->io_size);
>>>  			assert(f->min_zone < f->max_zone);
>>> +
>>> +			if (td->o.td_ddir == TD_DDIR_COPY) {
>>> +				f->min_dest_zone = zbd_zone_idx(f, f->file_dest_offset);
>>> +				f->max_dest_zone = zbd_zone_idx(f, f->file_dest_offset + f->io_size);
>>> +				assert(f->min_dest_zone < f->max_dest_zone);
>>> +			}
>>>  		}
>>>  	}
>>>
>>> @@ -823,6 +841,42 @@ static int zbd_reset_zones(struct thread_data *td, struct fio_file *f,
>>>  	return res;
>>>  }
>>>
>>> +/*
>>> + * Reset a range of zones.
>>> + * @td: fio thread data.
>>> + * @f: fio file for which to reset zones
>>> + */
>>> +static void zbd_reset_dest_zones(struct thread_data *td, struct fio_file *f)
>>> +{
>>> +	struct fio_zone_info *z, *zb, *ze;
>>> +	int ret = 0;
>>> +	uint64_t offset, length;
>>> +
>>> +	zb = &f->zbd_info->zone_info[f->min_dest_zone];
>>> +	ze = &f->zbd_info->zone_info[f->max_dest_zone];
>>> +
>>> +	for (z = zb; z < ze; z++) {
>>> +		offset = z->start;
>>> +		length = (z+1)->start - offset;
>>> +
>>> +		dprint(FD_ZBD, "%s: resetting wp of zone %u.\n", f->file_name,
>>> +		       zbd_zone_nr(f->zbd_info, z));
>>> +		switch (f->zbd_info->model) {
>>> +		case ZBD_HOST_AWARE:
>>> +		case ZBD_HOST_MANAGED:
>>> +			ret = zbd_reset_wp(td, f, offset, length);
>>> +			break;
>>> +		default:
>>> +			break;
>>> +		}
>>> +
>>> +		if (ret < 0)
>>> +			continue;
>>> +
>>> +		td->ts.nr_zone_resets++;
>>> +	}
>>> +}
>>> +
>>>  /*
>>>   * Reset zbd_info.write_cnt, the counter that counts down towards the next
>>>   * zone reset.
>>> @@ -924,9 +978,14 @@ void zbd_file_reset(struct thread_data *td, struct fio_file *f)
>>>  {
>>>  	struct fio_zone_info *zb, *ze;
>>>
>>> -	if (!f->zbd_info || !td_write(td))
>>> +	if (!f->zbd_info || !(td_write(td) || td_copy(td)))
>>>  		return;
>>>
>>> +	if (td_copy(td)) {
>>> +		zbd_reset_dest_zones(td, f);
>>> +		return;
>>> +	}
>>> +
>>>  	zb = &f->zbd_info->zone_info[f->min_zone];
>>>  	ze = &f->zbd_info->zone_info[f->max_zone];
>>>  	zbd_init_swd(f);
>>> @@ -1410,8 +1469,8 @@ void setup_zbd_zone_mode(struct thread_data *td, struct io_u *io_u)
>>>  {
>>>  	struct fio_file *f = io_u->file;
>>>  	enum fio_ddir ddir = io_u->ddir;
>>> -	struct fio_zone_info *z;
>>> -	uint32_t zone_idx;
>>> +	struct fio_zone_info *z, *zd;
>>> +	uint32_t zone_idx, zone_didx;
>>>
>>>  	assert(td->o.zone_mode == ZONE_MODE_ZBD);
>>>  	assert(td->o.zone_size);
>>> @@ -1419,13 +1478,18 @@ void setup_zbd_zone_mode(struct thread_data *td, struct io_u *io_u)
>>>  	zone_idx = zbd_zone_idx(f, f->last_pos[ddir]);
>>>  	z = &f->zbd_info->zone_info[zone_idx];
>>>
>>> +	if (ddir == DDIR_COPY) {
>>> +		zone_didx = zbd_zone_idx(f, f->last_pos_dest[ddir]);
>>> +		zd = &f->zbd_info->zone_info[zone_didx];
>>> +	}
>>> +
>>>  	/*
>>>  	 * When the zone capacity is smaller than the zone size and the I/O is
>>> -	 * sequential write, skip to zone end if the latest position is at the
>>> +	 * sequential write or copy, skip to zone end if the latest position is at the
>>>  	 * zone capacity limit.
>>>  	 */
>>>  	if (z->capacity < f->zbd_info->zone_size && !td_random(td) &&
>>> -	    ddir == DDIR_WRITE &&
>>> +	    (ddir == DDIR_WRITE || ddir == DDIR_COPY) &&
>>>  	    f->last_pos[ddir] >= zbd_zone_capacity_end(z)) {
>>>  		dprint(FD_ZBD,
>>>  		       "%s: Jump from zone capacity limit to zone end:"
>>> @@ -1436,6 +1500,8 @@ void setup_zbd_zone_mode(struct thread_data *td, struct io_u *io_u)
>>>  		       (unsigned long long) z->capacity);
>>>  		td->io_skip_bytes += zbd_zone_end(z) - f->last_pos[ddir];
>>>  		f->last_pos[ddir] = zbd_zone_end(z);
>>> +		if (ddir == DDIR_COPY)
>>> +			f->last_pos_dest[ddir] = zbd_zone_end(zd);
>>>  	}
>>>
>>>  	/*
>>> @@ -1461,13 +1527,21 @@ void setup_zbd_zone_mode(struct thread_data *td, struct io_u *io_u)
>>>  		td->zone_bytes = 0;
>>>  		f->file_offset += td->o.zone_size + td->o.zone_skip;
>>>
>>> +		if (ddir == DDIR_COPY)
>>> +			f->file_dest_offset += td->o.zone_size + td->o.zone_skip;
>>>  		/*
>>>  		 * Wrap from the beginning, if we exceed the file size
>>>  		 */
>>>  		if (f->file_offset >= f->real_file_size)
>>>  			f->file_offset = get_start_offset(td, f);
>>>
>>> +		if ((ddir == DDIR_COPY) && f->file_dest_offset >= f->real_file_size)
>>> +			f->file_dest_offset = get_dest_offset(td, f);
>>> +
>>>  		f->last_pos[ddir] = f->file_offset;
>>> +		if (ddir == DDIR_COPY)
>>> +			f->last_pos_dest[io_u->ddir] = f->file_dest_offset;
>>> +
>>>  		td->io_skip_bytes += td->o.zone_skip;
>>>  	}
>>>  }
>>>
>>
>>
>> -- 
>> Damien Le Moal
>> Western Digital Research
>>


-- 
Damien Le Moal
Western Digital Research



end of thread, other threads:[~2020-12-11  0:37 UTC | newest]

Thread overview: 13+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
     [not found] <CGME20201201114048epcas5p3e12de26128ce442bbe8406082eaccde9@epcas5p3.samsung.com>
2020-12-01 11:40 ` [PATCH 0/9] v1 Patchset : Simple Copy Command support Krishna Kanth Reddy
     [not found]   ` <CGME20201201114051epcas5p4b9c67cd0ad4b55fc9334dfde59ae349c@epcas5p4.samsung.com>
2020-12-01 11:40     ` [PATCH 1/9] Adding the necessary ddir changes to introduce copy operation Krishna Kanth Reddy
     [not found]   ` <CGME20201201114054epcas5p2cf4bff491f02c4a29a386ae44d5e42c7@epcas5p2.samsung.com>
2020-12-01 11:40     ` [PATCH 2/9] Introducing new offsets for the " Krishna Kanth Reddy
     [not found]   ` <CGME20201201114057epcas5p1aa9d8e1a56197e55251191a0a5985e3d@epcas5p1.samsung.com>
2020-12-01 11:40     ` [PATCH 3/9] Added support for printing of stats and estimate time for " Krishna Kanth Reddy
     [not found]   ` <CGME20201201114100epcas5p2f02995779f5172f711cc6ad3d362d50a@epcas5p2.samsung.com>
2020-12-01 11:40     ` [PATCH 4/9] Adding a new " Krishna Kanth Reddy
     [not found]   ` <CGME20201201114103epcas5p1bbf3d8ca05252935c14fed68f44dab2d@epcas5p1.samsung.com>
2020-12-01 11:40     ` [PATCH 5/9] Added the changes for copy operation support in FIO Krishna Kanth Reddy
     [not found]   ` <CGME20201201114105epcas5p4b99d6a66a543152a377461875aedf342@epcas5p4.samsung.com>
2020-12-01 11:40     ` [PATCH 6/9] New ioctl based synchronous IO engine. Only supports copy command Krishna Kanth Reddy
     [not found]   ` <CGME20201201114107epcas5p4694adc6b50a123a06a06411393395636@epcas5p4.samsung.com>
2020-12-01 11:40     ` [PATCH 7/9] Example configuration for simple " Krishna Kanth Reddy
     [not found]   ` <CGME20201201114110epcas5p34032161c14f467576734346712d0c3db@epcas5p3.samsung.com>
2020-12-01 11:40     ` [PATCH 8/9] Support copy operation for zoned block devices Krishna Kanth Reddy
2020-12-01 12:11       ` Damien Le Moal
2020-12-10 14:14         ` Krishna Kanth Reddy
2020-12-11  0:37           ` Damien Le Moal
     [not found]   ` <CGME20201201114113epcas5p328eabe564e0bda24c29e285bba8d8fd2@epcas5p3.samsung.com>
2020-12-01 11:40     ` [PATCH 9/9] Add a new test case to test the copy operation Krishna Kanth Reddy
