fio.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH v3 00/13] zbd: fix unaligned block size issue and verify issues
@ 2022-10-28 11:53 Shin'ichiro Kawasaki
  2022-10-28 11:53 ` [PATCH v3 01/13] oslib: blkzoned: add blkzoned_finish_zone() helper function Shin'ichiro Kawasaki
                   ` (12 more replies)
  0 siblings, 13 replies; 16+ messages in thread
From: Shin'ichiro Kawasaki @ 2022-10-28 11:53 UTC (permalink / raw)
  To: fio, Jens Axboe, Vincent Fu
  Cc: Damien Le Moal, Dmitry Fomichev, Niklas Cassel, Shin'ichiro Kawasaki

When zonemode is zbd and block size is not divisor of zone size, write target
zone selection does not work as expected. First four patches in this series fix
the issue by introducing zone finish operation. First three patches do
preparation and the 4th patch does the fix.

This series also fix four more issues related to verify with zonemode zbd. The
first issue is that verify is allowed only when block size is divisor of zone
size. The 5th patch removes this limitation. The second and third issue are
caused by zone reset, which erases data pattern for verify. The 6th and 7th
patches adjust verify and zone reset timing to avoid verify errors by the erased
data pattern. The last issue is experimental verify, which does not work with
zonemode=zbd. The 8th patch checks options to error out when experimental
verify is specified together with zonemode=zbd.

Five more patches follow to make test cases in t/zbd match with the fixes.

Changes from v2:
* Improved commit message and block comment in 3rd patch
* Fixed a comment typo in last patch
* Added Tested-by and Reviewed-by tags

Changes from v1:
* Separated zbd_zone_remainder() addition to 3rd patch and added suggested hunks
* Added last 13th patch for a new test case to test experimental_verify option
* Reflected other comments on the list
* Added Tested-by and Reviewed-by tags

Shin'ichiro Kawasaki (13):
  oslib: blkzoned: add blkzoned_finish_zone() helper function
  engines/libzbc: add libzbc_finish_zone() helper function
  zbd: add zbd_zone_remainder() helper function
  zbd: finish zones with remainder smaller than minimum write block size
  zbd: allow block size not divisor of zone size
  zbd: verify before zone reset for zone_reset_threshold option
  zbd: fix zone reset condition for verify_backlog option
  zbd: prevent experimental verify with zonemode=zbd
  t/zbd: fix test case #33 for block size unaligned to zone size
  t/zbd: modify test case #34 for block size unaligned to zone size
  t/zbd: add test case to check zone_reset_threshold option with verify
  t/zbd: remove experimental_verify option from test case #54
  t/zbd: add test case to check experimental_verify option

 engines/libzbc.c       |  34 +++++++++
 ioengines.h            |   2 +
 oslib/blkzoned.h       |   8 +++
 oslib/linux-blkzoned.c |  37 ++++++++++
 t/zbd/test-zbd-support |  52 ++++++++++----
 zbd.c                  | 159 ++++++++++++++++++++++++++---------------
 zbd.h                  |   2 -
 7 files changed, 223 insertions(+), 71 deletions(-)

-- 
2.37.1


^ permalink raw reply	[flat|nested] 16+ messages in thread

* [PATCH v3 01/13] oslib: blkzoned: add blkzoned_finish_zone() helper function
  2022-10-28 11:53 [PATCH v3 00/13] zbd: fix unaligned block size issue and verify issues Shin'ichiro Kawasaki
@ 2022-10-28 11:53 ` Shin'ichiro Kawasaki
  2022-10-28 11:53 ` [PATCH v3 02/13] engines/libzbc: add libzbc_finish_zone() " Shin'ichiro Kawasaki
                   ` (11 subsequent siblings)
  12 siblings, 0 replies; 16+ messages in thread
From: Shin'ichiro Kawasaki @ 2022-10-28 11:53 UTC (permalink / raw)
  To: fio, Jens Axboe, Vincent Fu
  Cc: Damien Le Moal, Dmitry Fomichev, Niklas Cassel, Shin'ichiro Kawasaki

Add the helper function blkzoned_finish_zone() to support zone finish
operation to zoned block devices through ioctl. This feature will be
used to change status of zones which is not yet full but does not have
enough size to write next block, so that such zones do not exceed max
active zones limit. This function does zone finish only when kernel
supports the ioctl BLKFINISHZONE. Otherwise, it does nothing. This
should be fine since the kernel without BLKFINISHZONE does not report
max active zone limit through sysfs to user space.

Signed-off-by: Shin'ichiro Kawasaki <shinichiro.kawasaki@wdc.com>
Tested-by: Dmitry Fomichev <dmitry.fomichev@wdc.com>
Reviewed-by: Dmitry Fomichev <dmitry.fomichev@wdc.com>
---
 oslib/blkzoned.h       |  8 ++++++++
 oslib/linux-blkzoned.c | 37 +++++++++++++++++++++++++++++++++++++
 2 files changed, 45 insertions(+)

diff --git a/oslib/blkzoned.h b/oslib/blkzoned.h
index 719b041d..29fb034f 100644
--- a/oslib/blkzoned.h
+++ b/oslib/blkzoned.h
@@ -18,6 +18,8 @@ extern int blkzoned_reset_wp(struct thread_data *td, struct fio_file *f,
 				uint64_t offset, uint64_t length);
 extern int blkzoned_get_max_open_zones(struct thread_data *td, struct fio_file *f,
 				       unsigned int *max_open_zones);
+extern int blkzoned_finish_zone(struct thread_data *td, struct fio_file *f,
+				uint64_t offset, uint64_t length);
 #else
 /*
  * Define stubs for systems that do not have zoned block device support.
@@ -51,6 +53,12 @@ static inline int blkzoned_get_max_open_zones(struct thread_data *td, struct fio
 {
 	return -EIO;
 }
+static inline int blkzoned_finish_zone(struct thread_data *td,
+				       struct fio_file *f,
+				       uint64_t offset, uint64_t length)
+{
+	return -EIO;
+}
 #endif
 
 #endif /* FIO_BLKZONED_H */
diff --git a/oslib/linux-blkzoned.c b/oslib/linux-blkzoned.c
index 185bd501..c3130d0e 100644
--- a/oslib/linux-blkzoned.c
+++ b/oslib/linux-blkzoned.c
@@ -308,3 +308,40 @@ int blkzoned_reset_wp(struct thread_data *td, struct fio_file *f,
 
 	return ret;
 }
+
+int blkzoned_finish_zone(struct thread_data *td, struct fio_file *f,
+			 uint64_t offset, uint64_t length)
+{
+#ifdef BLKFINISHZONE
+	struct blk_zone_range zr = {
+		.sector         = offset >> 9,
+		.nr_sectors     = length >> 9,
+	};
+	int fd, ret = 0;
+
+	/* If the file is not yet opened, open it for this function. */
+	fd = f->fd;
+	if (fd < 0) {
+		fd = open(f->file_name, O_RDWR | O_LARGEFILE);
+		if (fd < 0)
+			return -errno;
+	}
+
+	if (ioctl(fd, BLKFINISHZONE, &zr) < 0)
+		ret = -errno;
+
+	if (f->fd < 0)
+		close(fd);
+
+	return ret;
+#else
+	/*
+	 * Kernel versions older than 5.5 does not support BLKFINISHZONE. These
+	 * old kernels assumed zones are closed automatically at max_open_zones
+	 * limit. Also they did not support max_active_zones limit. Then there
+	 * was no need to finish zones to avoid errors caused by max_open_zones
+	 * or max_active_zones. For those old versions, just do nothing.
+	 */
+	return 0;
+#endif
+}
-- 
2.37.1


^ permalink raw reply related	[flat|nested] 16+ messages in thread

* [PATCH v3 02/13] engines/libzbc: add libzbc_finish_zone() helper function
  2022-10-28 11:53 [PATCH v3 00/13] zbd: fix unaligned block size issue and verify issues Shin'ichiro Kawasaki
  2022-10-28 11:53 ` [PATCH v3 01/13] oslib: blkzoned: add blkzoned_finish_zone() helper function Shin'ichiro Kawasaki
@ 2022-10-28 11:53 ` Shin'ichiro Kawasaki
  2022-10-28 11:53 ` [PATCH v3 03/13] zbd: add zbd_zone_remainder() " Shin'ichiro Kawasaki
                   ` (10 subsequent siblings)
  12 siblings, 0 replies; 16+ messages in thread
From: Shin'ichiro Kawasaki @ 2022-10-28 11:53 UTC (permalink / raw)
  To: fio, Jens Axboe, Vincent Fu
  Cc: Damien Le Moal, Dmitry Fomichev, Niklas Cassel, Shin'ichiro Kawasaki

To support zone finish operation to ZBC drives through libzbc, add
finish_zone() callback to struct ioengine_ops, and implement in libzbc
IO engine. This feature is used to keep the same zone handling by
zonemode=zbd for libzbc engine as other engines.

Signed-off-by: Shin'ichiro Kawasaki <shinichiro.kawasaki@wdc.com>
Tested-by: Dmitry Fomichev <dmitry.fomichev@wdc.com>
Reviewed-by: Dmitry Fomichev <dmitry.fomichev@wdc.com>
---
 engines/libzbc.c | 34 ++++++++++++++++++++++++++++++++++
 ioengines.h      |  2 ++
 2 files changed, 36 insertions(+)

diff --git a/engines/libzbc.c b/engines/libzbc.c
index 2bc2c7e0..2b63ef1a 100644
--- a/engines/libzbc.c
+++ b/engines/libzbc.c
@@ -332,6 +332,39 @@ err:
 	return -ret;
 }
 
+static int libzbc_finish_zone(struct thread_data *td, struct fio_file *f,
+			      uint64_t offset, uint64_t length)
+{
+	struct libzbc_data *ld = td->io_ops_data;
+	uint64_t sector = offset >> 9;
+	unsigned int nr_zones;
+	struct zbc_errno err;
+	int i, ret;
+
+	assert(ld);
+	assert(ld->zdev);
+
+	nr_zones = (length + td->o.zone_size - 1) / td->o.zone_size;
+	assert(nr_zones > 0);
+
+	for (i = 0; i < nr_zones; i++, sector += td->o.zone_size >> 9) {
+		ret = zbc_finish_zone(ld->zdev, sector, 0);
+		if (ret)
+			goto err;
+	}
+
+	return 0;
+
+err:
+	zbc_errno(ld->zdev, &err);
+	td_verror(td, errno, "zbc_finish_zone failed");
+	if (err.sk)
+		log_err("%s: finish zone failed %s:%s\n",
+			f->file_name,
+			zbc_sk_str(err.sk), zbc_asc_ascq_str(err.asc_ascq));
+	return -ret;
+}
+
 static int libzbc_get_max_open_zones(struct thread_data *td, struct fio_file *f,
 				     unsigned int *max_open_zones)
 {
@@ -434,6 +467,7 @@ FIO_STATIC struct ioengine_ops ioengine = {
 	.report_zones		= libzbc_report_zones,
 	.reset_wp		= libzbc_reset_wp,
 	.get_max_open_zones	= libzbc_get_max_open_zones,
+	.finish_zone		= libzbc_finish_zone,
 	.queue			= libzbc_queue,
 	.flags			= FIO_SYNCIO | FIO_NOEXTEND | FIO_RAWIO,
 };
diff --git a/ioengines.h b/ioengines.h
index fafa1e48..11d2115c 100644
--- a/ioengines.h
+++ b/ioengines.h
@@ -61,6 +61,8 @@ struct ioengine_ops {
 			uint64_t, uint64_t);
 	int (*get_max_open_zones)(struct thread_data *, struct fio_file *,
 				  unsigned int *);
+	int (*finish_zone)(struct thread_data *, struct fio_file *,
+			   uint64_t, uint64_t);
 	int option_struct_size;
 	struct fio_option *options;
 };
-- 
2.37.1


^ permalink raw reply related	[flat|nested] 16+ messages in thread

* [PATCH v3 03/13] zbd: add zbd_zone_remainder() helper function
  2022-10-28 11:53 [PATCH v3 00/13] zbd: fix unaligned block size issue and verify issues Shin'ichiro Kawasaki
  2022-10-28 11:53 ` [PATCH v3 01/13] oslib: blkzoned: add blkzoned_finish_zone() helper function Shin'ichiro Kawasaki
  2022-10-28 11:53 ` [PATCH v3 02/13] engines/libzbc: add libzbc_finish_zone() " Shin'ichiro Kawasaki
@ 2022-10-28 11:53 ` Shin'ichiro Kawasaki
  2022-10-28 11:53 ` [PATCH v3 04/13] zbd: finish zones with remainder smaller than minimum write block size Shin'ichiro Kawasaki
                   ` (9 subsequent siblings)
  12 siblings, 0 replies; 16+ messages in thread
From: Shin'ichiro Kawasaki @ 2022-10-28 11:53 UTC (permalink / raw)
  To: fio, Jens Axboe, Vincent Fu
  Cc: Damien Le Moal, Dmitry Fomichev, Niklas Cassel, Shin'ichiro Kawasaki

Add the helper function zbd_zone_remainder(), which returns the number
of bytes that are still available for writing before the zone gets full.
Use this function to improve readability. It will also be used in the
following patch.

Signed-off-by: Shin'ichiro Kawasaki <shinichiro.kawasaki@wdc.com>
Tested-by: Dmitry Fomichev <dmitry.fomichev@wdc.com>
Reviewed-by: Dmitry Fomichev <dmitry.fomichev@wdc.com>
---
 zbd.c | 22 +++++++++++++++++-----
 1 file changed, 17 insertions(+), 5 deletions(-)

diff --git a/zbd.c b/zbd.c
index 627fb968..26a6404d 100644
--- a/zbd.c
+++ b/zbd.c
@@ -70,6 +70,19 @@ static inline uint64_t zbd_zone_capacity_end(const struct fio_zone_info *z)
 	return z->start + z->capacity;
 }
 
+/**
+ * zbd_zone_remainder - Return the number of bytes that are still available for
+ *                      writing before the zone gets full
+ * @z: zone info pointer.
+ */
+static inline uint64_t zbd_zone_remainder(struct fio_zone_info *z)
+{
+	if (z->wp >= zbd_zone_capacity_end(z))
+		return 0;
+
+	return zbd_zone_capacity_end(z) - z->wp;
+}
+
 /**
  * zbd_zone_full - verify whether a minimum number of bytes remain in a zone
  * @f: file pointer.
@@ -83,8 +96,7 @@ static bool zbd_zone_full(const struct fio_file *f, struct fio_zone_info *z,
 {
 	assert((required & 511) == 0);
 
-	return z->has_wp &&
-		z->wp + required > zbd_zone_capacity_end(z);
+	return z->has_wp && required > zbd_zone_remainder(z);
 }
 
 static void zone_lock(struct thread_data *td, const struct fio_file *f,
@@ -440,7 +452,7 @@ static bool zbd_open_zone(struct thread_data *td, const struct fio_file *f,
 		 * already in-flight, handle it as a full zone instead of an
 		 * open zone.
 		 */
-		if (z->wp >= zbd_zone_capacity_end(z))
+		if (!zbd_zone_remainder(z))
 			res = false;
 		goto out;
 	}
@@ -1368,7 +1380,7 @@ found_candidate_zone:
 	/* Both z->mutex and zbdi->mutex are held. */
 
 examine_zone:
-	if (z->wp + min_bs <= zbd_zone_capacity_end(z)) {
+	if (zbd_zone_remainder(z) >= min_bs) {
 		pthread_mutex_unlock(&zbdi->mutex);
 		goto out;
 	}
@@ -1433,7 +1445,7 @@ retry:
 		z = zbd_get_zone(f, zone_idx);
 
 		zone_lock(td, f, z);
-		if (z->wp + min_bs <= zbd_zone_capacity_end(z))
+		if (zbd_zone_remainder(z) >= min_bs)
 			goto out;
 		pthread_mutex_lock(&zbdi->mutex);
 	}
-- 
2.37.1


^ permalink raw reply related	[flat|nested] 16+ messages in thread

* [PATCH v3 04/13] zbd: finish zones with remainder smaller than minimum write block size
  2022-10-28 11:53 [PATCH v3 00/13] zbd: fix unaligned block size issue and verify issues Shin'ichiro Kawasaki
                   ` (2 preceding siblings ...)
  2022-10-28 11:53 ` [PATCH v3 03/13] zbd: add zbd_zone_remainder() " Shin'ichiro Kawasaki
@ 2022-10-28 11:53 ` Shin'ichiro Kawasaki
  2022-10-28 11:53 ` [PATCH v3 05/13] zbd: allow block size not divisor of zone size Shin'ichiro Kawasaki
                   ` (8 subsequent siblings)
  12 siblings, 0 replies; 16+ messages in thread
From: Shin'ichiro Kawasaki @ 2022-10-28 11:53 UTC (permalink / raw)
  To: fio, Jens Axboe, Vincent Fu
  Cc: Damien Le Moal, Dmitry Fomichev, Niklas Cassel, Shin'ichiro Kawasaki

When zonemode is zbd and block size is not divisor of zone size, write
target zone selection does not work as expected. When the write is
random write and the device has max open zone limit, the random write is
repeated to the zones selected up to the max open zone limit. All writes
are repeated only to the zones. When the write is sequential write,
write is done only for the first zone. The cause of such unexpected zone
selection is current write target zone selection logic. It selects write
target zones within open zones. When block size is not divisor of zone
size, the selected open zone has only remainder of writable blocks
smaller than the block size. Fio resets such zone after zone selection
and continues writing to it. This zone reset is required not to exceed
the limit of max_open_zones option or max_active_zone limit of the zoned
device, but it does not simulate the workload.

To avoid the zone reset and unexpected write to same zone, fix write
target zone handling of zones with remainder smaller than write block
size. Do not reset but finish such zone not to exceed the max_open_zones
option and max_active_zone limit. Then choose the zone next to the
finished zone as write target. To implement this, add the helper
function zbd_finish_zone().

Signed-off-by: Shin'ichiro Kawasaki <shinichiro.kawasaki@wdc.com>
Tested-by: Dmitry Fomichev <dmitry.fomichev@wdc.com>
Reviewed-by: Dmitry Fomichev <dmitry.fomichev@wdc.com>
---
 zbd.c | 69 +++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 69 insertions(+)

diff --git a/zbd.c b/zbd.c
index 26a6404d..9ab78e2e 100644
--- a/zbd.c
+++ b/zbd.c
@@ -334,6 +334,44 @@ static void zbd_close_zone(struct thread_data *td, const struct fio_file *f,
 	z->open = 0;
 }
 
+/**
+ * zbd_finish_zone - finish the specified zone
+ * @td: FIO thread data.
+ * @f: FIO file for which to finish a zone
+ * @z: Zone to finish.
+ *
+ * Finish the zone at @offset with open or close status.
+ */
+static int zbd_finish_zone(struct thread_data *td, struct fio_file *f,
+			   struct fio_zone_info *z)
+{
+	uint64_t offset = z->start;
+	uint64_t length = f->zbd_info->zone_size;
+	int ret = 0;
+
+	switch (f->zbd_info->model) {
+	case ZBD_HOST_AWARE:
+	case ZBD_HOST_MANAGED:
+		if (td->io_ops && td->io_ops->finish_zone)
+			ret = td->io_ops->finish_zone(td, f, offset, length);
+		else
+			ret = blkzoned_finish_zone(td, f, offset, length);
+		break;
+	default:
+		break;
+	}
+
+	if (ret < 0) {
+		td_verror(td, errno, "finish zone failed");
+		log_err("%s: finish zone at sector %"PRIu64" failed (%d).\n",
+			f->file_name, offset >> 9, errno);
+	} else {
+		z->wp = (z+1)->start;
+	}
+
+	return ret;
+}
+
 /**
  * zbd_reset_zones - Reset a range of zones.
  * @td: fio thread data.
@@ -1953,6 +1991,33 @@ enum io_u_action zbd_adjust_block(struct thread_data *td, struct io_u *io_u)
 			goto eof;
 		}
 
+retry:
+		if (zbd_zone_remainder(zb) > 0 &&
+		    zbd_zone_remainder(zb) < min_bs) {
+			pthread_mutex_lock(&f->zbd_info->mutex);
+			zbd_close_zone(td, f, zb);
+			pthread_mutex_unlock(&f->zbd_info->mutex);
+			dprint(FD_ZBD,
+			       "%s: finish zone %d\n",
+			       f->file_name, zbd_zone_idx(f, zb));
+			io_u_quiesce(td);
+			zbd_finish_zone(td, f, zb);
+			if (zbd_zone_idx(f, zb) + 1 >= f->max_zone) {
+				if (!td_random(td))
+					goto eof;
+			}
+			zone_unlock(zb);
+
+			/* Find the next write pointer zone */
+			do {
+				zb++;
+				if (zbd_zone_idx(f, zb) >= f->max_zone)
+					zb = zbd_get_zone(f, f->min_zone);
+			} while (!zb->has_wp);
+
+			zone_lock(td, f, zb);
+		}
+
 		if (!zbd_open_zone(td, f, zb)) {
 			zone_unlock(zb);
 			zb = zbd_convert_to_open_zone(td, io_u);
@@ -1963,6 +2028,10 @@ enum io_u_action zbd_adjust_block(struct thread_data *td, struct io_u *io_u)
 			}
 		}
 
+		if (zbd_zone_remainder(zb) > 0 &&
+		    zbd_zone_remainder(zb) < min_bs)
+			goto retry;
+
 		/* Check whether the zone reset threshold has been exceeded */
 		if (td->o.zrf.u.f) {
 			if (zbdi->wp_sectors_with_data >= f->io_size * td->o.zrt.u.f &&
-- 
2.37.1


^ permalink raw reply related	[flat|nested] 16+ messages in thread

* [PATCH v3 05/13] zbd: allow block size not divisor of zone size
  2022-10-28 11:53 [PATCH v3 00/13] zbd: fix unaligned block size issue and verify issues Shin'ichiro Kawasaki
                   ` (3 preceding siblings ...)
  2022-10-28 11:53 ` [PATCH v3 04/13] zbd: finish zones with remainder smaller than minimum write block size Shin'ichiro Kawasaki
@ 2022-10-28 11:53 ` Shin'ichiro Kawasaki
  2022-10-28 11:53 ` [PATCH v3 06/13] zbd: verify before zone reset for zone_reset_threshold option Shin'ichiro Kawasaki
                   ` (7 subsequent siblings)
  12 siblings, 0 replies; 16+ messages in thread
From: Shin'ichiro Kawasaki @ 2022-10-28 11:53 UTC (permalink / raw)
  To: fio, Jens Axboe, Vincent Fu
  Cc: Damien Le Moal, Dmitry Fomichev, Niklas Cassel, Shin'ichiro Kawasaki

Current implementation checks that block size is divisor of zone size
when verify work load is specified. After the recent fix of block size
unaligned to zone, this check is no longer valid. Remove the check.

The check had been valid since such block size left unwritten area at
each zone end and keeps the zones in open/active status until verify
read is done. It easily hit max open/active zones limitation. After the
fix, the zones with unwritten area are finished then they do not hit the
limitation.

Signed-off-by: Shin'ichiro Kawasaki <shinichiro.kawasaki@wdc.com>
Tested-by: Dmitry Fomichev <dmitry.fomichev@wdc.com>
Reviewed-by: Dmitry Fomichev <dmitry.fomichev@wdc.com>
---
 zbd.c | 11 +----------
 1 file changed, 1 insertion(+), 10 deletions(-)

diff --git a/zbd.c b/zbd.c
index 9ab78e2e..6f4e5ab2 100644
--- a/zbd.c
+++ b/zbd.c
@@ -652,7 +652,7 @@ static bool zbd_verify_bs(void)
 {
 	struct thread_data *td;
 	struct fio_file *f;
-	int i, j, k;
+	int i, j;
 
 	for_each_td(td, i) {
 		if (td_trim(td) &&
@@ -674,15 +674,6 @@ static bool zbd_verify_bs(void)
 					 zone_size);
 				return false;
 			}
-			for (k = 0; k < FIO_ARRAY_SIZE(td->o.bs); k++) {
-				if (td->o.verify != VERIFY_NONE &&
-				    zone_size % td->o.bs[k] != 0) {
-					log_info("%s: block size %llu is not a divisor of the zone size %"PRIu64"\n",
-						 f->file_name, td->o.bs[k],
-						 zone_size);
-					return false;
-				}
-			}
 		}
 	}
 	return true;
-- 
2.37.1


^ permalink raw reply related	[flat|nested] 16+ messages in thread

* [PATCH v3 06/13] zbd: verify before zone reset for zone_reset_threshold option
  2022-10-28 11:53 [PATCH v3 00/13] zbd: fix unaligned block size issue and verify issues Shin'ichiro Kawasaki
                   ` (4 preceding siblings ...)
  2022-10-28 11:53 ` [PATCH v3 05/13] zbd: allow block size not divisor of zone size Shin'ichiro Kawasaki
@ 2022-10-28 11:53 ` Shin'ichiro Kawasaki
  2022-11-01 16:08   ` Vincent Fu
  2022-10-28 11:53 ` [PATCH v3 07/13] zbd: fix zone reset condition for verify_backlog option Shin'ichiro Kawasaki
                   ` (6 subsequent siblings)
  12 siblings, 1 reply; 16+ messages in thread
From: Shin'ichiro Kawasaki @ 2022-10-28 11:53 UTC (permalink / raw)
  To: fio, Jens Axboe, Vincent Fu
  Cc: Damien Le Moal, Dmitry Fomichev, Niklas Cassel, Shin'ichiro Kawasaki

When verify option is specified together with zonemode=zbd and
zone_reset_threshold option, zone reset happens to zones not full. This
erases data for verify and causes verify failure. Current implementation
avoids this scenario by assert.

To allow zone_reset_threshold option together with zonemode=zbd and
verify option, do verify before the zone reset. When zone reset is
required to an open zone with verify data, call get_next_verify() to
get verify read unit and return it from zbd_adjust_block().

Signed-off-by: Shin'ichiro Kawasaki <shinichiro.kawasaki@wdc.com>
Tested-by: Dmitry Fomichev <dmitry.fomichev@wdc.com>
Reviewed-by: Dmitry Fomichev <dmitry.fomichev@wdc.com>
---
 zbd.c | 5 ++++-
 1 file changed, 4 insertions(+), 1 deletion(-)

diff --git a/zbd.c b/zbd.c
index 6f4e5ab2..af761c17 100644
--- a/zbd.c
+++ b/zbd.c
@@ -2032,7 +2032,10 @@ retry:
 
 		/* Reset the zone pointer if necessary */
 		if (zb->reset_zone || zbd_zone_full(f, zb, min_bs)) {
-			assert(td->o.verify == VERIFY_NONE);
+			if (td->o.verify != VERIFY_NONE &&
+			    !get_next_verify(td, io_u))
+				goto accept;
+
 			/*
 			 * Since previous write requests may have been submitted
 			 * asynchronously and since we will submit the zone
-- 
2.37.1


^ permalink raw reply related	[flat|nested] 16+ messages in thread

* [PATCH v3 07/13] zbd: fix zone reset condition for verify_backlog option
  2022-10-28 11:53 [PATCH v3 00/13] zbd: fix unaligned block size issue and verify issues Shin'ichiro Kawasaki
                   ` (5 preceding siblings ...)
  2022-10-28 11:53 ` [PATCH v3 06/13] zbd: verify before zone reset for zone_reset_threshold option Shin'ichiro Kawasaki
@ 2022-10-28 11:53 ` Shin'ichiro Kawasaki
  2022-10-28 11:53 ` [PATCH v3 08/13] zbd: prevent experimental verify with zonemode=zbd Shin'ichiro Kawasaki
                   ` (5 subsequent siblings)
  12 siblings, 0 replies; 16+ messages in thread
From: Shin'ichiro Kawasaki @ 2022-10-28 11:53 UTC (permalink / raw)
  To: fio, Jens Axboe, Vincent Fu
  Cc: Damien Le Moal, Dmitry Fomichev, Niklas Cassel, Shin'ichiro Kawasaki

When data verification is requested, zbd_file_reset() resets zones only
when td status is not TD_VERIFYING so that the data to read back for
verify is not wiped out. However, when verify_backlog option is set, the
verify data can be left even if td status is not TD_VERIFYING. This
causes verify failure.

Fix this by improving the condition to reset zones in zbd_file_reset().
Refer not only td status but also verify_batch and io_hist_len values to
avoid zone reset when verify_backlog option is set. This is same check
as check_get_verify().

Signed-off-by: Shin'ichiro Kawasaki <shinichiro.kawasaki@wdc.com>
Tested-by: Dmitry Fomichev <dmitry.fomichev@wdc.com>
Reviewed-by: Dmitry Fomichev <dmitry.fomichev@wdc.com>
---
 zbd.c | 6 +++++-
 1 file changed, 5 insertions(+), 1 deletion(-)

diff --git a/zbd.c b/zbd.c
index af761c17..54bbc82b 100644
--- a/zbd.c
+++ b/zbd.c
@@ -1249,6 +1249,7 @@ void zbd_file_reset(struct thread_data *td, struct fio_file *f)
 {
 	struct fio_zone_info *zb, *ze;
 	uint64_t swd;
+	bool verify_ongoing;
 
 	if (!f->zbd_info || !td_write(td))
 		return;
@@ -1265,7 +1266,10 @@ void zbd_file_reset(struct thread_data *td, struct fio_file *f)
 	 * writing any data to avoid that a zone reset has to be issued while
 	 * writing data, which causes data loss.
 	 */
-	if (td->o.verify != VERIFY_NONE && td->runstate != TD_VERIFYING)
+	verify_ongoing = td->runstate == TD_VERIFYING || td->verify_batch;
+	if (td->io_hist_len && td->o.verify_backlog)
+		verify_ongoing = td->io_hist_len % td->o.verify_backlog;
+	if (td->o.verify != VERIFY_NONE && !verify_ongoing)
 		zbd_reset_zones(td, f, zb, ze);
 	zbd_reset_write_cnt(td, f);
 }
-- 
2.37.1


^ permalink raw reply related	[flat|nested] 16+ messages in thread

* [PATCH v3 08/13] zbd: prevent experimental verify with zonemode=zbd
  2022-10-28 11:53 [PATCH v3 00/13] zbd: fix unaligned block size issue and verify issues Shin'ichiro Kawasaki
                   ` (6 preceding siblings ...)
  2022-10-28 11:53 ` [PATCH v3 07/13] zbd: fix zone reset condition for verify_backlog option Shin'ichiro Kawasaki
@ 2022-10-28 11:53 ` Shin'ichiro Kawasaki
  2022-10-28 11:53 ` [PATCH v3 09/13] t/zbd: fix test case #33 for block size unaligned to zone size Shin'ichiro Kawasaki
                   ` (4 subsequent siblings)
  12 siblings, 0 replies; 16+ messages in thread
From: Shin'ichiro Kawasaki @ 2022-10-28 11:53 UTC (permalink / raw)
  To: fio, Jens Axboe, Vincent Fu
  Cc: Damien Le Moal, Dmitry Fomichev, Niklas Cassel, Shin'ichiro Kawasaki

When the experimental_verify option is specified, fio does not record
normal I/O metadata to create verify read unit. Instead, fio resets file
status after normal I/O and before verify start, then replay I/Os to
regenerate write unit and adjust it to verify read. This I/O replay does
not work for zonemode=zbd since zone status needs to be restored at
verify start to the status before the normal I/O. However, such status
restore moves write pointers and erases data pattern for verify.

Check that the options experimental_verify and zonemode=zbd are not
specified together and error out in case they are both specified. Also
remove the helper function zbd_replay_write_order() which is called from
zbd_adjust_block(). This function adjusts verify read unit to meet zone
write restrictions for experimental verify, but such adjustment does not
work because zone status is not restored before verify start.

Signed-off-by: Shin'ichiro Kawasaki <shinichiro.kawasaki@wdc.com>
Tested-by: Dmitry Fomichev <dmitry.fomichev@wdc.com>
Reviewed-by: Dmitry Fomichev <dmitry.fomichev@wdc.com>
---
 zbd.c | 46 ++++++----------------------------------------
 zbd.h |  2 --
 2 files changed, 6 insertions(+), 42 deletions(-)

diff --git a/zbd.c b/zbd.c
index 54bbc82b..6bc6a5dc 100644
--- a/zbd.c
+++ b/zbd.c
@@ -291,7 +291,6 @@ static int zbd_reset_zone(struct thread_data *td, struct fio_file *f,
 	pthread_mutex_unlock(&f->zbd_info->mutex);
 
 	z->wp = z->start;
-	z->verify_block = 0;
 
 	td->ts.nr_zone_resets++;
 
@@ -1085,6 +1084,11 @@ int zbd_setup_files(struct thread_data *td)
 	if (!zbd_verify_bs())
 		return 1;
 
+	if (td->o.experimental_verify) {
+		log_err("zonemode=zbd does not support experimental verify\n");
+		return 1;
+	}
+
 	for_each_file(td, f, i) {
 		struct zoned_block_device_info *zbd = f->zbd_info;
 		struct fio_zone_info *z;
@@ -1521,42 +1525,6 @@ out:
 	return z;
 }
 
-/* The caller must hold z->mutex. */
-static struct fio_zone_info *zbd_replay_write_order(struct thread_data *td,
-						    struct io_u *io_u,
-						    struct fio_zone_info *z)
-{
-	const struct fio_file *f = io_u->file;
-	const uint64_t min_bs = td->o.min_bs[DDIR_WRITE];
-
-	if (!zbd_open_zone(td, f, z)) {
-		zone_unlock(z);
-		z = zbd_convert_to_open_zone(td, io_u);
-		assert(z);
-	}
-
-	if (z->verify_block * min_bs >= z->capacity) {
-		log_err("%s: %d * %"PRIu64" >= %"PRIu64"\n",
-			f->file_name, z->verify_block, min_bs, z->capacity);
-		/*
-		 * If the assertion below fails during a test run, adding
-		 * "--experimental_verify=1" to the command line may help.
-		 */
-		assert(false);
-	}
-
-	io_u->offset = z->start + z->verify_block * min_bs;
-	if (io_u->offset + io_u->buflen >= zbd_zone_capacity_end(z)) {
-		log_err("%s: %llu + %llu >= %"PRIu64"\n",
-			f->file_name, io_u->offset, io_u->buflen,
-			zbd_zone_capacity_end(z));
-		assert(false);
-	}
-	z->verify_block += io_u->buflen / min_bs;
-
-	return z;
-}
-
 /*
  * Find another zone which has @min_bytes of readable data. Search in zones
  * @zb + 1 .. @zl. For random workload, also search in zones @zb - 1 .. @zf.
@@ -1907,10 +1875,8 @@ enum io_u_action zbd_adjust_block(struct thread_data *td, struct io_u *io_u)
 
 	switch (io_u->ddir) {
 	case DDIR_READ:
-		if (td->runstate == TD_VERIFYING && td_write(td)) {
-			zb = zbd_replay_write_order(td, io_u, zb);
+		if (td->runstate == TD_VERIFYING && td_write(td))
 			goto accept;
-		}
 
 		/*
 		 * Check that there is enough written data in the zone to do an
diff --git a/zbd.h b/zbd.h
index 0a73b41d..d425707e 100644
--- a/zbd.h
+++ b/zbd.h
@@ -25,7 +25,6 @@ enum io_u_action {
  * @start: zone start location (bytes)
  * @wp: zone write pointer location (bytes)
  * @capacity: maximum size usable from the start of a zone (bytes)
- * @verify_block: number of blocks that have been verified for this zone
  * @mutex: protects the modifiable members in this structure
  * @type: zone type (BLK_ZONE_TYPE_*)
  * @cond: zone state (BLK_ZONE_COND_*)
@@ -39,7 +38,6 @@ struct fio_zone_info {
 	uint64_t		start;
 	uint64_t		wp;
 	uint64_t		capacity;
-	uint32_t		verify_block;
 	enum zbd_zone_type	type:2;
 	enum zbd_zone_cond	cond:4;
 	unsigned int		has_wp:1;
-- 
2.37.1


^ permalink raw reply related	[flat|nested] 16+ messages in thread

* [PATCH v3 09/13] t/zbd: fix test case #33 for block size unaligned to zone size
  2022-10-28 11:53 [PATCH v3 00/13] zbd: fix unaligned block size issue and verify issues Shin'ichiro Kawasaki
                   ` (7 preceding siblings ...)
  2022-10-28 11:53 ` [PATCH v3 08/13] zbd: prevent experimental verify with zonemode=zbd Shin'ichiro Kawasaki
@ 2022-10-28 11:53 ` Shin'ichiro Kawasaki
  2022-10-28 11:53 ` [PATCH v3 10/13] t/zbd: modify test case #34 " Shin'ichiro Kawasaki
                   ` (3 subsequent siblings)
  12 siblings, 0 replies; 16+ messages in thread
From: Shin'ichiro Kawasaki @ 2022-10-28 11:53 UTC (permalink / raw)
  To: fio, Jens Axboe, Vincent Fu
  Cc: Damien Le Moal, Dmitry Fomichev, Niklas Cassel, Shin'ichiro Kawasaki

Recent fix of zonemode=zbd support for block size unaligned to zone size
unveiled that the test case #33 has two issues.

First issue is test preparation. Before the fix, write was done to only
to the first zone due to a bug in zone selection which happens when
block size is not a divisor of zone size. Then, status of second zone
did not affect. After the fix, write count to the zones may vary if the
second zone has almost full status since the zone is skipped from write
target. Fix this by resetting the write target zones.

Second issue is expected written data size. The test case checks that
the written data size is larger than the io_size option value. This
expectation was fine before the fix because data write was repeated in
do_io() and the limit was checked with io_issue_bytes_exceeded(),
which triggers loop break when written data is larger than io_size.
However, after the fix, the limit is checked with keep_running() in
thread_main(). According to code and block comment in keep_running(),
fio job terminates even when written data size is smaller than io_size
if the gap is smaller than maximum IO size. Then the expected written
data size is the largest multiple of block size smaller than or equal to
the io_size. This io_size check change resulted in the test case
failure. Avoid the failure by fixing the expected written data size
calculation.

Signed-off-by: Shin'ichiro Kawasaki <shinichiro.kawasaki@wdc.com>
Tested-by: Dmitry Fomichev <dmitry.fomichev@wdc.com>
Reviewed-by: Dmitry Fomichev <dmitry.fomichev@wdc.com>
---
 t/zbd/test-zbd-support | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/t/zbd/test-zbd-support b/t/zbd/test-zbd-support
index cdc03f28..debe3763 100755
--- a/t/zbd/test-zbd-support
+++ b/t/zbd/test-zbd-support
@@ -813,7 +813,8 @@ test33() {
     local bs io_size size
     local off capacity=0;
 
-    prep_write
+    [ -n "$is_zbd" ] && reset_zone "$dev" -1
+
     off=$((first_sequential_zone_sector * 512))
     capacity=$(total_zone_capacity 1 $off $dev)
     size=$((2 * zone_size))
@@ -822,7 +823,7 @@ test33() {
     run_fio_on_seq "$(ioengine "psync")" --iodepth=1 --rw=write	\
 		   --size=$size --io_size=$io_size --bs=$bs	\
 		   >> "${logfile}.${test_number}" 2>&1 || return $?
-    check_written $(((io_size + bs - 1) / bs * bs)) || return $?
+    check_written $((io_size / bs * bs)) || return $?
 }
 
 # Write to sequential zones with a block size that is not a divisor of the
-- 
2.37.1


^ permalink raw reply related	[flat|nested] 16+ messages in thread

* [PATCH v3 10/13] t/zbd: modify test case #34 for block size unaligned to zone size
  2022-10-28 11:53 [PATCH v3 00/13] zbd: fix unaligned block size issue and verify issues Shin'ichiro Kawasaki
                   ` (8 preceding siblings ...)
  2022-10-28 11:53 ` [PATCH v3 09/13] t/zbd: fix test case #33 for block size unaligned to zone size Shin'ichiro Kawasaki
@ 2022-10-28 11:53 ` Shin'ichiro Kawasaki
  2022-10-28 11:53 ` [PATCH v3 11/13] t/zbd: add test case to check zone_reset_threshold option with verify Shin'ichiro Kawasaki
                   ` (2 subsequent siblings)
  12 siblings, 0 replies; 16+ messages in thread
From: Shin'ichiro Kawasaki @ 2022-10-28 11:53 UTC (permalink / raw)
  To: fio, Jens Axboe, Vincent Fu
  Cc: Damien Le Moal, Dmitry Fomichev, Niklas Cassel, Shin'ichiro Kawasaki

The test case #34 confirmed that the block size unaligned to zone size
is handled as an error. After recent fix, now fio is able to handle such
block sizes, then the check for the error is no longer required.

Instead of removing this unnecessary test case, change it to cover
verify with complex workload. It runs read-write mix workload with high
queue depth with verify and block size unaligned to zone size. This test
is same as test case #57 except the verify option.

Signed-off-by: Shin'ichiro Kawasaki <shinichiro.kawasaki@wdc.com>
Tested-by: Dmitry Fomichev <dmitry.fomichev@wdc.com>
Reviewed-by: Dmitry Fomichev <dmitry.fomichev@wdc.com>
---
 t/zbd/test-zbd-support | 24 +++++++++++++++---------
 1 file changed, 15 insertions(+), 9 deletions(-)

diff --git a/t/zbd/test-zbd-support b/t/zbd/test-zbd-support
index debe3763..e17a81de 100755
--- a/t/zbd/test-zbd-support
+++ b/t/zbd/test-zbd-support
@@ -826,17 +826,23 @@ test33() {
     check_written $((io_size / bs * bs)) || return $?
 }
 
-# Write to sequential zones with a block size that is not a divisor of the
-# zone size and with data verification enabled.
+# Test repeated async write job with verify.
 test34() {
-    local size
+	local bs off
 
-    prep_write
-    size=$((2 * zone_size))
-    run_fio_on_seq "$(ioengine "psync")" --iodepth=1 --rw=write --size=$size \
-		   --do_verify=1 --verify=md5 --bs=$((3 * zone_size / 4)) \
-		   >> "${logfile}.${test_number}" 2>&1 && return 1
-    grep -q 'not a divisor of' "${logfile}.${test_number}"
+	require_zbd || return $SKIP_TESTCASE
+	prep_write
+
+	bs=$((4096 * 7))
+	off=$((first_sequential_zone_sector * 512))
+
+	run_fio --name=job --filename="${dev}" --rw=randwrite --bs="${bs}" \
+		--offset="${off}" --size=$((4 * zone_size)) --iodepth=256 \
+		"$(ioengine "libaio")" --time_based=1 --runtime=30s \
+		--zonemode=zbd --direct=1 --zonesize="${zone_size}" \
+		--verify=crc32c --do_verify=1 \
+		${job_var_opts[@]} \
+		>> "${logfile}.${test_number}" 2>&1 || return $?
 }
 
 # Test 1/4 for the I/O boundary rounding code: $size < $zone_size.
-- 
2.37.1


^ permalink raw reply related	[flat|nested] 16+ messages in thread

* [PATCH v3 11/13] t/zbd: add test case to check zone_reset_threshold option with verify
  2022-10-28 11:53 [PATCH v3 00/13] zbd: fix unaligned block size issue and verify issues Shin'ichiro Kawasaki
                   ` (9 preceding siblings ...)
  2022-10-28 11:53 ` [PATCH v3 10/13] t/zbd: modify test case #34 " Shin'ichiro Kawasaki
@ 2022-10-28 11:53 ` Shin'ichiro Kawasaki
  2022-10-28 11:53 ` [PATCH v3 12/13] t/zbd: remove experimental_verify option from test case #54 Shin'ichiro Kawasaki
  2022-10-28 11:53 ` [PATCH v3 13/13] t/zbd: add test case to check experimental_verify option Shin'ichiro Kawasaki
  12 siblings, 0 replies; 16+ messages in thread
From: Shin'ichiro Kawasaki @ 2022-10-28 11:53 UTC (permalink / raw)
  To: fio, Jens Axboe, Vincent Fu
  Cc: Damien Le Moal, Dmitry Fomichev, Niklas Cassel, Shin'ichiro Kawasaki

Recent commit fix assertion failure observed with zone_reset_threshold
option with verify. Add a test case to confirm the fix.

Signed-off-by: Shin'ichiro Kawasaki <shinichiro.kawasaki@wdc.com>
Tested-by: Dmitry Fomichev <dmitry.fomichev@wdc.com>
Reviewed-by: Dmitry Fomichev <dmitry.fomichev@wdc.com>
---
 t/zbd/test-zbd-support | 14 ++++++++++++++
 1 file changed, 14 insertions(+)

diff --git a/t/zbd/test-zbd-support b/t/zbd/test-zbd-support
index e17a81de..0e185997 100755
--- a/t/zbd/test-zbd-support
+++ b/t/zbd/test-zbd-support
@@ -1276,6 +1276,20 @@ test58() {
 	    >>"${logfile}.${test_number}" 2>&1
 }
 
+# Test zone_reset_threshold with verify.
+test59() {
+	local off bs loops=2 size=$((zone_size))
+
+	prep_write
+	off=$((first_sequential_zone_sector * 512))
+
+	bs=$(min $((256*1024)) "$zone_size")
+	run_fio_on_seq "$(ioengine "psync")" --rw=write --bs="$bs" \
+		       --size=$size --loops=$loops --do_verify=1 --verify=md5 \
+		       --zone_reset_frequency=.9 --zone_reset_threshold=.1 \
+		       >> "${logfile}.${test_number}" 2>&1 || return $?
+}
+
 SECONDS=0
 tests=()
 dynamic_analyzer=()
-- 
2.37.1


^ permalink raw reply related	[flat|nested] 16+ messages in thread

* [PATCH v3 12/13] t/zbd: remove experimental_verify option from test case #54
  2022-10-28 11:53 [PATCH v3 00/13] zbd: fix unaligned block size issue and verify issues Shin'ichiro Kawasaki
                   ` (10 preceding siblings ...)
  2022-10-28 11:53 ` [PATCH v3 11/13] t/zbd: add test case to check zone_reset_threshold option with verify Shin'ichiro Kawasaki
@ 2022-10-28 11:53 ` Shin'ichiro Kawasaki
  2022-10-28 11:53 ` [PATCH v3 13/13] t/zbd: add test case to check experimental_verify option Shin'ichiro Kawasaki
  12 siblings, 0 replies; 16+ messages in thread
From: Shin'ichiro Kawasaki @ 2022-10-28 11:53 UTC (permalink / raw)
  To: fio, Jens Axboe, Vincent Fu
  Cc: Damien Le Moal, Dmitry Fomichev, Niklas Cassel, Shin'ichiro Kawasaki

The option experimental_verify does not work with zonemode=zbd. Remove
it from the test case #54.

Signed-off-by: Shin'ichiro Kawasaki <shinichiro.kawasaki@wdc.com>
Tested-by: Dmitry Fomichev <dmitry.fomichev@wdc.com>
Reviewed-by: Dmitry Fomichev <dmitry.fomichev@wdc.com>
---
 t/zbd/test-zbd-support | 1 -
 1 file changed, 1 deletion(-)

diff --git a/t/zbd/test-zbd-support b/t/zbd/test-zbd-support
index 0e185997..ce4a5916 100755
--- a/t/zbd/test-zbd-support
+++ b/t/zbd/test-zbd-support
@@ -1178,7 +1178,6 @@ test54() {
 		--rw=randrw:2 --rwmixwrite=25 --bsrange=4k-${zone_size} \
 		--zonemode=zbd --zonesize=${zone_size} \
 		--verify=crc32c --do_verify=1 --verify_backlog=2 \
-		--experimental_verify=1 \
 		--alloc-size=65536 --random_generator=tausworthe64 \
 		${job_var_opts[@]} --debug=zbd \
 		>> "${logfile}.${test_number}" 2>&1 || return $?
-- 
2.37.1


^ permalink raw reply related	[flat|nested] 16+ messages in thread

* [PATCH v3 13/13] t/zbd: add test case to check experimental_verify option
  2022-10-28 11:53 [PATCH v3 00/13] zbd: fix unaligned block size issue and verify issues Shin'ichiro Kawasaki
                   ` (11 preceding siblings ...)
  2022-10-28 11:53 ` [PATCH v3 12/13] t/zbd: remove experimental_verify option from test case #54 Shin'ichiro Kawasaki
@ 2022-10-28 11:53 ` Shin'ichiro Kawasaki
  12 siblings, 0 replies; 16+ messages in thread
From: Shin'ichiro Kawasaki @ 2022-10-28 11:53 UTC (permalink / raw)
  To: fio, Jens Axboe, Vincent Fu
  Cc: Damien Le Moal, Dmitry Fomichev, Niklas Cassel, Shin'ichiro Kawasaki

The option experimental_verify does not work with zonemode=zbd. Add a
test case to check that fio errors out when the both options are
specified.

Signed-off-by: Shin'ichiro Kawasaki <shinichiro.kawasaki@wdc.com>
Tested-by: Dmitry Fomichev <dmitry.fomichev@wdc.com>
Reviewed-by: Dmitry Fomichev <dmitry.fomichev@wdc.com>
---
 t/zbd/test-zbd-support | 8 ++++++++
 1 file changed, 8 insertions(+)

diff --git a/t/zbd/test-zbd-support b/t/zbd/test-zbd-support
index ce4a5916..4bc29d1b 100755
--- a/t/zbd/test-zbd-support
+++ b/t/zbd/test-zbd-support
@@ -1289,6 +1289,14 @@ test59() {
 		       >> "${logfile}.${test_number}" 2>&1 || return $?
 }
 
+# Test fio errors out experimental_verify option with zonemode=zbd.
+test60() {
+	run_fio_on_seq "$(ioengine "psync")" --rw=write --size=$zone_size \
+		       --do_verify=1 --verify=md5 --experimental_verify=1 \
+		       >> "${logfile}.${test_number}" 2>&1 && return 1
+	grep -q 'not support experimental verify' "${logfile}.${test_number}"
+}
+
 SECONDS=0
 tests=()
 dynamic_analyzer=()
-- 
2.37.1


^ permalink raw reply related	[flat|nested] 16+ messages in thread

* RE: [PATCH v3 06/13] zbd: verify before zone reset for zone_reset_threshold option
  2022-10-28 11:53 ` [PATCH v3 06/13] zbd: verify before zone reset for zone_reset_threshold option Shin'ichiro Kawasaki
@ 2022-11-01 16:08   ` Vincent Fu
  2022-11-02  7:35     ` Shinichiro Kawasaki
  0 siblings, 1 reply; 16+ messages in thread
From: Vincent Fu @ 2022-11-01 16:08 UTC (permalink / raw)
  To: Shin'ichiro Kawasaki, fio, Jens Axboe, Vincent Fu
  Cc: Damien Le Moal, Dmitry Fomichev, Niklas Cassel

> -----Original Message-----
> From: Shin'ichiro Kawasaki [mailto:shinichiro.kawasaki@wdc.com]
> Sent: Friday, October 28, 2022 7:54 AM
> To: fio@vger.kernel.org; Jens Axboe <axboe@kernel.dk>; Vincent Fu
> <vincentfu@gmail.com>
> Cc: Damien Le Moal <damien.lemoal@opensource.wdc.com>; Dmitry
> Fomichev <Dmitry.Fomichev@wdc.com>; Niklas Cassel
> <niklas.cassel@wdc.com>; Shin'ichiro Kawasaki
> <shinichiro.kawasaki@wdc.com>
> Subject: [PATCH v3 06/13] zbd: verify before zone reset for
> zone_reset_threshold option
> 
> When verify option is specified together with zonemode=zbd and
> zone_reset_threshold option, zone reset happens to zones not full. This
> erases data for verify and causes verify failure. Current implementation
> avoids this scenario by assert.
> 
> To allow zone_reset_threshold option together with zonemode=zbd and
> verify option, do verify before the zone reset. When zone reset is
> required to an open zone with verify data, call get_next_verify() to
> get verify read unit and return it from zbd_adjust_block().
> 
> Signed-off-by: Shin'ichiro Kawasaki <shinichiro.kawasaki@wdc.com>
> Tested-by: Dmitry Fomichev <dmitry.fomichev@wdc.com>
> Reviewed-by: Dmitry Fomichev <dmitry.fomichev@wdc.com>
> ---
>  zbd.c | 5 ++++-
>  1 file changed, 4 insertions(+), 1 deletion(-)
> 
> diff --git a/zbd.c b/zbd.c
> index 6f4e5ab2..af761c17 100644
> --- a/zbd.c
> +++ b/zbd.c
> @@ -2032,7 +2032,10 @@ retry:
> 
>  		/* Reset the zone pointer if necessary */
>  		if (zb->reset_zone || zbd_zone_full(f, zb, min_bs)) {
> -			assert(td->o.verify == VERIFY_NONE);
> +			if (td->o.verify != VERIFY_NONE &&
> +			    !get_next_verify(td, io_u))
> +				goto accept;
> +
>  			/*
>  			 * Since previous write requests may have been
> submitted
>  			 * asynchronously and since we will submit the
> zone
> --
> 2.37.1

The new test case 59 assesses the above functionality with a sequential write
workload. But when I try a random write workload, it seems like randomly
generated write offsets may no longer be changed to a zone write pointer.

Am I missing something?

# uname -a
Linux ubuntu 6.0.0 #1 SMP PREEMPT_DYNAMIC Tue Nov 1 13:49:21 UTC 2022 x86_64 x86_64 x86_64 GNU/Linux
# modprobe null_blk zoned=1 gb=1 memory_backed=1
# ./fio --name=test --filename=/dev/nullb0 --zonemode=zbd --bs=64M --rw=randwrite --direct=1 --verify=crc32c --zone_reset_threshold=0.1 --zone_reset_frequency=1 --debug=io,zbd
fio: set debug option io
fio: set debug option zbd
test: (g=0): rw=randwrite, bs=(R) 64.0MiB-64.0MiB, (W) 64.0MiB-64.0MiB, (T) 64.0MiB-64.0MiB, ioengine=psync, iodepth=1
fio-3.30-223-g33796
Starting 1 process
zbd      4343  /dev/nullb0: zbd model string: host-managed
zbd      4343  Device /dev/nullb0 has 4 zones of size 262144 KB
zbd      4343  /dev/nullb0: max open zones supported by device: 0
zbd      4343  /dev/nullb0: using max open zones limit: 0
zbd      4345  zbd_file_reset(/dev/nullb0): swd = 134217728
zbd      4345  /dev/nullb0: examining zones 0 .. 4
zbd      4345  /dev/nullb0: resetting zone 0
zbd      4345  /dev/nullb0: resetting wp of zone 0.
zbd      4345  /dev/nullb0: resetting zone 2
zbd      4345  /dev/nullb0: resetting wp of zone 2.
io       4345  drop page cache /dev/nullb0
io       4345  fill: io_u 0x55cc663f3180: off=0x0,len=0x4000000,ddir=1,file=/dev/nullb0
io       4345  prep: io_u 0x55cc663f3180: off=0x0,len=0x4000000,ddir=1,file=/dev/nullb0
io       4345  queue: io_u 0x55cc663f3180: off=0x0,len=0x4000000,ddir=1,file=/dev/nullb0
zbd      4345  /dev/nullb0: queued I/O (0, 67108864) for zone 0
io       4345  complete: io_u 0x55cc663f3180: off=0x0,len=0x4000000,ddir=1,file=/dev/nullb0
io       4345  fill: io_u 0x55cc663f3180: off=0x20000000,len=0x4000000,ddir=1,file=/dev/nullb0
io       4345  prep: io_u 0x55cc663f3180: off=0x20000000,len=0x4000000,ddir=1,file=/dev/nullb0
io       4345  queue: io_u 0x55cc663f3180: off=0x20000000,len=0x4000000,ddir=1,file=/dev/nullb0
zbd      4345  /dev/nullb0: queued I/O (536870912, 67108864) for zone 2
io       4345  complete: io_u 0x55cc663f3180: off=0x20000000,len=0x4000000,ddir=1,file=/dev/nullb0
io       4345  fill: io_u 0x55cc663f3180: off=0x34000000,len=0x4000000,ddir=1,file=/dev/nullb0
io       4345  prep: io_u 0x55cc663f3180: off=0x34000000,len=0x4000000,ddir=1,file=/dev/nullb0
io       4345  queue: io_u 0x55cc663f3180: off=0x34000000,len=0x4000000,ddir=1,file=/dev/nullb0
fio: io_u error on file /dev/nullb0: Input/output error: write offset=872415232, buflen=67108864
io       4345  io_u_queued_complete: min=0
io       4345  getevents: 0
fio: pid=4345, err=5/file:io_u.c:1876, func=io_u error, error=Input/output error
io       4345  close ioengine psync
io       4345  free ioengine psync

test: (groupid=0, jobs=1): err= 5 (file:io_u.c:1876, func=io_u error, error=Input/output error): pid=4345: Tue Nov  1 14:15:31 2022
  write: IOPS=10, BW=432MiB/s (453MB/s)(128MiB/296msec); 2 zone resets
    clat (usec): min=73204, max=79962, avg=76583.41, stdev=4778.37
     lat (msec): min=102, max=112, avg=107.51, stdev= 7.04
    clat percentiles (usec):
     |  1.00th=[72877],  5.00th=[72877], 10.00th=[72877], 20.00th=[72877],
     | 30.00th=[72877], 40.00th=[72877], 50.00th=[72877], 60.00th=[80217],
     | 70.00th=[80217], 80.00th=[80217], 90.00th=[80217], 95.00th=[80217],
     | 99.00th=[80217], 99.50th=[80217], 99.90th=[80217], 99.95th=[80217],
     | 99.99th=[80217]
  lat (msec)   : 100=66.67%
  cpu          : usr=29.49%, sys=4.41%, ctx=133, majf=0, minf=18
  IO depths    : 1=100.0%, 2=0.0%, 4=0.0%, 8=0.0%, 16=0.0%, 32=0.0%, >=64=0.0%
     submit    : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     complete  : 0=25.0%, 4=75.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     issued rwts: total=0,3,0,0 short=0,0,0,0 dropped=0,0,0,0
     latency   : target=0, window=0, percentile=100.00%, depth=1

Run status group 0 (all jobs):
  WRITE: bw=432MiB/s (453MB/s), 432MiB/s-432MiB/s (453MB/s-453MB/s), io=128MiB (134MB), run=296-296msec

Disk stats (read/write):
  nullb0: ios=0/91, merge=0/0, ticks=0/435, in_queue=435, util=19.92%

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH v3 06/13] zbd: verify before zone reset for zone_reset_threshold option
  2022-11-01 16:08   ` Vincent Fu
@ 2022-11-02  7:35     ` Shinichiro Kawasaki
  0 siblings, 0 replies; 16+ messages in thread
From: Shinichiro Kawasaki @ 2022-11-02  7:35 UTC (permalink / raw)
  To: Vincent Fu
  Cc: fio, Jens Axboe, Vincent Fu, Damien Le Moal, Dmitry Fomichev,
	Niklas Cassel

On Nov 01, 2022 / 16:08, Vincent Fu wrote:
> > -----Original Message-----
> > From: Shin'ichiro Kawasaki [mailto:shinichiro.kawasaki@wdc.com]
> > Sent: Friday, October 28, 2022 7:54 AM
> > To: fio@vger.kernel.org; Jens Axboe <axboe@kernel.dk>; Vincent Fu
> > <vincentfu@gmail.com>
> > Cc: Damien Le Moal <damien.lemoal@opensource.wdc.com>; Dmitry
> > Fomichev <Dmitry.Fomichev@wdc.com>; Niklas Cassel
> > <niklas.cassel@wdc.com>; Shin'ichiro Kawasaki
> > <shinichiro.kawasaki@wdc.com>
> > Subject: [PATCH v3 06/13] zbd: verify before zone reset for
> > zone_reset_threshold option
> > 
> > When verify option is specified together with zonemode=zbd and
> > zone_reset_threshold option, zone reset happens to zones not full. This
> > erases data for verify and causes verify failure. Current implementation
> > avoids this scenario by assert.
> > 
> > To allow zone_reset_threshold option together with zonemode=zbd and
> > verify option, do verify before the zone reset. When zone reset is
> > required to an open zone with verify data, call get_next_verify() to
> > get verify read unit and return it from zbd_adjust_block().
> > 
> > Signed-off-by: Shin'ichiro Kawasaki <shinichiro.kawasaki@wdc.com>
> > Tested-by: Dmitry Fomichev <dmitry.fomichev@wdc.com>
> > Reviewed-by: Dmitry Fomichev <dmitry.fomichev@wdc.com>
> > ---
> >  zbd.c | 5 ++++-
> >  1 file changed, 4 insertions(+), 1 deletion(-)
> > 
> > diff --git a/zbd.c b/zbd.c
> > index 6f4e5ab2..af761c17 100644
> > --- a/zbd.c
> > +++ b/zbd.c
> > @@ -2032,7 +2032,10 @@ retry:
> > 
> >  		/* Reset the zone pointer if necessary */
> >  		if (zb->reset_zone || zbd_zone_full(f, zb, min_bs)) {
> > -			assert(td->o.verify == VERIFY_NONE);
> > +			if (td->o.verify != VERIFY_NONE &&
> > +			    !get_next_verify(td, io_u))
> > +				goto accept;
> > +
> >  			/*
> >  			 * Since previous write requests may have been
> > submitted
> >  			 * asynchronously and since we will submit the
> > zone
> > --
> > 2.37.1
> 
> The new test case 59 assesses the above functionality with a sequential write
> workload. But when I try a random write workload, it seems like randomly
> generated write offsets may no longer be changed to a zone write pointer.
> 
> Am I missing something?

No, you caught a bug. Thank you!

I noticed that get_next_verify() does nothing to io_u when io-u>file is set. So
the hunk above does not return verify read io_u from zbd_adjust_block(). Just
returns the original write io_u without adjustment to write pointer. Hence the
weird symptom you observed. (I checked verify debug option output, but
overlooked the returned io_u was not read but write. My bad.)

I modified the hunk to unset io_u->file before calling get_next_verify(), then
observed the verify read happens as expected. I also found "goto accept" in the
hunk was not good either, since it does not unlock the current selected zone.
The hunk [1] should work good.

On top of the workloads write and randwrite, I also tried rw and randrw. They
reported another failure symptom "rand_seed verify check failure". It happened
because random seed was not reset for the verify read when zone_reset_threshold
option is specified since all write operations have not yet completed. This is
same story as verify_backlog option. I created another hunk [2] which skips the
rand_seed check in same manner as verify_backlog. It avoided the failure.

With the changes [1] and [2], the zone_reset_threshold option looks working good
with verify. After some more testing, I will post v4 with the changes. I will
also improve the test case #59 to cover the various workloads, write, randwrite,
rw and randrw.


[1]

diff --git a/zbd.c b/zbd.c
index 6f4e5ab2..fadeb458 100644
--- a/zbd.c
+++ b/zbd.c
@@ -2032,7 +2032,19 @@ retry:
 
 		/* Reset the zone pointer if necessary */
 		if (zb->reset_zone || zbd_zone_full(f, zb, min_bs)) {
-			assert(td->o.verify == VERIFY_NONE);
+			if (td->o.verify != VERIFY_NONE) {
+				/*
+				 * Unset io-u->file to tell get_next_verify()
+				 * that this IO is not requeue.
+				 */
+				io_u->file = NULL;
+				if (!get_next_verify(td, io_u)) {
+					zone_unlock(zb);
+					return io_u_accept;
+				}
+				io_u->file = f;
+			}
+
 			/*
 			 * Since previous write requests may have been submitted
 			 * asynchronously and since we will submit the zone



[2]

diff --git a/verify.c b/verify.c
index d6a229ca..ddfadcc8 100644
--- a/verify.c
+++ b/verify.c
@@ -917,9 +917,11 @@ int verify_io_u(struct thread_data *td, struct io_u **io_u_ptr)
 		hdr = p;
 
 		/*
-		 * Make rand_seed check pass when have verify_backlog.
+		 * Make rand_seed check pass when have verify_backlog or
+		 * zone reset frequency for zonemode=zbd.
 		 */
-		if (!td_rw(td) || (td->flags & TD_F_VER_BACKLOG))
+		if (!td_rw(td) || (td->flags & TD_F_VER_BACKLOG) ||
+		    td->o.zrf.u.f)
 			io_u->rand_seed = hdr->rand_seed;
 
 		if (td->o.verify != VERIFY_PATTERN_NO_HDR) {

-- 
Shin'ichiro Kawasaki

^ permalink raw reply related	[flat|nested] 16+ messages in thread

end of thread, other threads:[~2022-11-02  7:35 UTC | newest]

Thread overview: 16+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2022-10-28 11:53 [PATCH v3 00/13] zbd: fix unaligned block size issue and verify issues Shin'ichiro Kawasaki
2022-10-28 11:53 ` [PATCH v3 01/13] oslib: blkzoned: add blkzoned_finish_zone() helper function Shin'ichiro Kawasaki
2022-10-28 11:53 ` [PATCH v3 02/13] engines/libzbc: add libzbc_finish_zone() " Shin'ichiro Kawasaki
2022-10-28 11:53 ` [PATCH v3 03/13] zbd: add zbd_zone_remainder() " Shin'ichiro Kawasaki
2022-10-28 11:53 ` [PATCH v3 04/13] zbd: finish zones with remainder smaller than minimum write block size Shin'ichiro Kawasaki
2022-10-28 11:53 ` [PATCH v3 05/13] zbd: allow block size not divisor of zone size Shin'ichiro Kawasaki
2022-10-28 11:53 ` [PATCH v3 06/13] zbd: verify before zone reset for zone_reset_threshold option Shin'ichiro Kawasaki
2022-11-01 16:08   ` Vincent Fu
2022-11-02  7:35     ` Shinichiro Kawasaki
2022-10-28 11:53 ` [PATCH v3 07/13] zbd: fix zone reset condition for verify_backlog option Shin'ichiro Kawasaki
2022-10-28 11:53 ` [PATCH v3 08/13] zbd: prevent experimental verify with zonemode=zbd Shin'ichiro Kawasaki
2022-10-28 11:53 ` [PATCH v3 09/13] t/zbd: fix test case #33 for block size unaligned to zone size Shin'ichiro Kawasaki
2022-10-28 11:53 ` [PATCH v3 10/13] t/zbd: modify test case #34 " Shin'ichiro Kawasaki
2022-10-28 11:53 ` [PATCH v3 11/13] t/zbd: add test case to check zone_reset_threshold option with verify Shin'ichiro Kawasaki
2022-10-28 11:53 ` [PATCH v3 12/13] t/zbd: remove experimental_verify option from test case #54 Shin'ichiro Kawasaki
2022-10-28 11:53 ` [PATCH v3 13/13] t/zbd: add test case to check experimental_verify option Shin'ichiro Kawasaki

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).