All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH v2] zbd: Fix I/O direction adjustment step for random read/write
@ 2020-04-16 11:30 Shin'ichiro Kawasaki
  2020-04-17  4:58 ` Damien Le Moal
  2020-04-17 14:32 ` Jens Axboe
  0 siblings, 2 replies; 7+ messages in thread
From: Shin'ichiro Kawasaki @ 2020-04-16 11:30 UTC (permalink / raw)
  To: fio, Jens Axboe; +Cc: Alexey Dobriyan, Damien Le Moal, Shinichiro Kawasaki

Commit fb0259fb ("zbd: Ensure first I/O is write for random read/write to
sequential zones") introduced a step to change direction of io_u from
read to write when that is the first I/O of the random read/write
workload to zoned block devices. However, such direction adjustment
results in inconsistent I/O length when read block size and write block
size are different.

To avoid the inconsistency between I/O direction and I/O length,
adjust the I/O direction before the I/O length is set. Move the step
from zbd_adjust_block() to set_rw_ddir(). To minimize changes in
set_rw_ddir(), introduce zbd_adjust_ddir() helper function.

Fixes: fb0259fb ("zbd: Ensure first I/O is write for random read/write to sequential zones")
Signed-off-by: Shin'ichiro Kawasaki <shinichiro.kawasaki@wdc.com>
---
 io_u.c |  2 ++
 zbd.c  | 40 ++++++++++++++++++++++++++++++----------
 zbd.h  |  2 ++
 3 files changed, 34 insertions(+), 10 deletions(-)

diff --git a/io_u.c b/io_u.c
index 5d62a76c..18e94617 100644
--- a/io_u.c
+++ b/io_u.c
@@ -746,6 +746,8 @@ static void set_rw_ddir(struct thread_data *td, struct io_u *io_u)
 {
 	enum fio_ddir ddir = get_rw_ddir(td);
 
+	ddir = zbd_adjust_ddir(td, io_u, ddir);
+
 	if (td_trimwrite(td)) {
 		struct fio_file *f = io_u->file;
 		if (f->last_pos[DDIR_WRITE] == f->last_pos[DDIR_TRIM])
diff --git a/zbd.c b/zbd.c
index de0c5bf4..8dc3c397 100644
--- a/zbd.c
+++ b/zbd.c
@@ -1331,6 +1331,36 @@ void setup_zbd_zone_mode(struct thread_data *td, struct io_u *io_u)
 	}
 }
 
+/**
+ * zbd_adjust_ddir - Adjust an I/O direction for zonedmode=zbd.
+ *
+ * @td: FIO thread data.
+ * @io_u: FIO I/O unit.
+ * @ddir: I/O direction before adjustment.
+ *
+ * Return adjusted I/O direction.
+ */
+enum fio_ddir zbd_adjust_ddir(struct thread_data *td, struct io_u *io_u,
+			      enum fio_ddir ddir)
+{
+	/*
+	 * In case read direction is chosen for the first random I/O, fio with
+	 * zonemode=zbd stops because no data can be read from zoned block
+	 * devices with all empty zones. Overwrite the first I/O direction as
+	 * write to make sure data to read exists.
+	 */
+	if (td->o.zone_mode != ZONE_MODE_ZBD ||
+	    ddir != DDIR_READ ||
+	    !td_rw(td))
+		return ddir;
+
+	if (io_u->file->zbd_info->sectors_with_data ||
+	    td->o.read_beyond_wp)
+		return DDIR_READ;
+
+	return DDIR_WRITE;
+}
+
 /**
  * zbd_adjust_block - adjust the offset and length as necessary for ZBD drives
  * @td: FIO thread data.
@@ -1364,16 +1394,6 @@ enum io_u_action zbd_adjust_block(struct thread_data *td, struct io_u *io_u)
 	if (!zbd_zone_swr(zb))
 		return io_u_accept;
 
-	/*
-	 * In case read direction is chosen for the first random I/O, fio with
-	 * zonemode=zbd stops because no data can be read from zoned block
-	 * devices with all empty zones. Overwrite the first I/O direction as
-	 * write to make sure data to read exists.
-	 */
-	if (td_rw(td) && !f->zbd_info->sectors_with_data
-	    && !td->o.read_beyond_wp)
-		io_u->ddir = DDIR_WRITE;
-
 	/*
 	 * Accept the I/O offset for reads if reading beyond the write pointer
 	 * is enabled.
diff --git a/zbd.h b/zbd.h
index 4eaf902e..5a660399 100644
--- a/zbd.h
+++ b/zbd.h
@@ -82,6 +82,8 @@ int zbd_init(struct thread_data *td);
 void zbd_file_reset(struct thread_data *td, struct fio_file *f);
 bool zbd_unaligned_write(int error_code);
 void setup_zbd_zone_mode(struct thread_data *td, struct io_u *io_u);
+enum fio_ddir zbd_adjust_ddir(struct thread_data *td, struct io_u *io_u,
+			      enum fio_ddir ddir);
 enum io_u_action zbd_adjust_block(struct thread_data *td, struct io_u *io_u);
 char *zbd_write_status(const struct thread_stat *ts);
 
-- 
2.25.2



^ permalink raw reply related	[flat|nested] 7+ messages in thread

* Re: [PATCH v2] zbd: Fix I/O direction adjustment step for random read/write
  2020-04-16 11:30 [PATCH v2] zbd: Fix I/O direction adjustment step for random read/write Shin'ichiro Kawasaki
@ 2020-04-17  4:58 ` Damien Le Moal
  2020-04-17 14:32 ` Jens Axboe
  1 sibling, 0 replies; 7+ messages in thread
From: Damien Le Moal @ 2020-04-17  4:58 UTC (permalink / raw)
  To: Shinichiro Kawasaki, fio, Jens Axboe; +Cc: Alexey Dobriyan

On 2020/04/16 20:30, Shin'ichiro Kawasaki wrote:
> Commit fb0259fb ("zbd: Ensure first I/O is write for random read/write to
> sequential zones") introduced a step to change direction of io_u from
> read to write when that is the first I/O of the random read/write
> workload to zoned block devices. However, such direction adjustment
> results in inconsistent I/O length when read block size and write block
> size are different.
> 
> To avoid the inconsistency between I/O direction and I/O length,
> adjust the I/O direction before the I/O length is set. Move the step
> from zbd_adjust_block() to set_rw_ddir(). To minimize changes in
> set_rw_ddir(), introduce zbd_adjust_ddir() helper function.
> 
> Fixes: fb0259fb ("zbd: Ensure first I/O is write for random read/write to sequential zones")
> Signed-off-by: Shin'ichiro Kawasaki <shinichiro.kawasaki@wdc.com>

Looks good to me.

Reviewed-by: Damien Le Moal <damien.lemoal@wdc.com>


> ---
>  io_u.c |  2 ++
>  zbd.c  | 40 ++++++++++++++++++++++++++++++----------
>  zbd.h  |  2 ++
>  3 files changed, 34 insertions(+), 10 deletions(-)
> 
> diff --git a/io_u.c b/io_u.c
> index 5d62a76c..18e94617 100644
> --- a/io_u.c
> +++ b/io_u.c
> @@ -746,6 +746,8 @@ static void set_rw_ddir(struct thread_data *td, struct io_u *io_u)
>  {
>  	enum fio_ddir ddir = get_rw_ddir(td);
>  
> +	ddir = zbd_adjust_ddir(td, io_u, ddir);
> +
>  	if (td_trimwrite(td)) {
>  		struct fio_file *f = io_u->file;
>  		if (f->last_pos[DDIR_WRITE] == f->last_pos[DDIR_TRIM])
> diff --git a/zbd.c b/zbd.c
> index de0c5bf4..8dc3c397 100644
> --- a/zbd.c
> +++ b/zbd.c
> @@ -1331,6 +1331,36 @@ void setup_zbd_zone_mode(struct thread_data *td, struct io_u *io_u)
>  	}
>  }
>  
> +/**
> + * zbd_adjust_ddir - Adjust an I/O direction for zonedmode=zbd.
> + *
> + * @td: FIO thread data.
> + * @io_u: FIO I/O unit.
> + * @ddir: I/O direction before adjustment.
> + *
> + * Return adjusted I/O direction.
> + */
> +enum fio_ddir zbd_adjust_ddir(struct thread_data *td, struct io_u *io_u,
> +			      enum fio_ddir ddir)
> +{
> +	/*
> +	 * In case read direction is chosen for the first random I/O, fio with
> +	 * zonemode=zbd stops because no data can be read from zoned block
> +	 * devices with all empty zones. Overwrite the first I/O direction as
> +	 * write to make sure data to read exists.
> +	 */
> +	if (td->o.zone_mode != ZONE_MODE_ZBD ||
> +	    ddir != DDIR_READ ||
> +	    !td_rw(td))
> +		return ddir;
> +
> +	if (io_u->file->zbd_info->sectors_with_data ||
> +	    td->o.read_beyond_wp)
> +		return DDIR_READ;
> +
> +	return DDIR_WRITE;
> +}
> +
>  /**
>   * zbd_adjust_block - adjust the offset and length as necessary for ZBD drives
>   * @td: FIO thread data.
> @@ -1364,16 +1394,6 @@ enum io_u_action zbd_adjust_block(struct thread_data *td, struct io_u *io_u)
>  	if (!zbd_zone_swr(zb))
>  		return io_u_accept;
>  
> -	/*
> -	 * In case read direction is chosen for the first random I/O, fio with
> -	 * zonemode=zbd stops because no data can be read from zoned block
> -	 * devices with all empty zones. Overwrite the first I/O direction as
> -	 * write to make sure data to read exists.
> -	 */
> -	if (td_rw(td) && !f->zbd_info->sectors_with_data
> -	    && !td->o.read_beyond_wp)
> -		io_u->ddir = DDIR_WRITE;
> -
>  	/*
>  	 * Accept the I/O offset for reads if reading beyond the write pointer
>  	 * is enabled.
> diff --git a/zbd.h b/zbd.h
> index 4eaf902e..5a660399 100644
> --- a/zbd.h
> +++ b/zbd.h
> @@ -82,6 +82,8 @@ int zbd_init(struct thread_data *td);
>  void zbd_file_reset(struct thread_data *td, struct fio_file *f);
>  bool zbd_unaligned_write(int error_code);
>  void setup_zbd_zone_mode(struct thread_data *td, struct io_u *io_u);
> +enum fio_ddir zbd_adjust_ddir(struct thread_data *td, struct io_u *io_u,
> +			      enum fio_ddir ddir);
>  enum io_u_action zbd_adjust_block(struct thread_data *td, struct io_u *io_u);
>  char *zbd_write_status(const struct thread_stat *ts);
>  
> 


-- 
Damien Le Moal
Western Digital Research


^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH v2] zbd: Fix I/O direction adjustment step for random read/write
  2020-04-16 11:30 [PATCH v2] zbd: Fix I/O direction adjustment step for random read/write Shin'ichiro Kawasaki
  2020-04-17  4:58 ` Damien Le Moal
@ 2020-04-17 14:32 ` Jens Axboe
  2020-04-17 17:30   ` Alexey Dobriyan
  2020-04-20  2:33   ` Damien Le Moal
  1 sibling, 2 replies; 7+ messages in thread
From: Jens Axboe @ 2020-04-17 14:32 UTC (permalink / raw)
  To: Shin'ichiro Kawasaki, fio; +Cc: Alexey Dobriyan, Damien Le Moal

On 4/16/20 5:30 AM, Shin'ichiro Kawasaki wrote:
> Commit fb0259fb ("zbd: Ensure first I/O is write for random read/write to
> sequential zones") introduced a step to change direction of io_u from
> read to write when that is the first I/O of the random read/write
> workload to zoned block devices. However, such direction adjustment
> results in inconsistent I/O length when read block size and write block
> size are different.
> 
> To avoid the inconsistency between I/O direction and I/O length,
> adjust the I/O direction before the I/O length is set. Move the step
> from zbd_adjust_block() to set_rw_ddir(). To minimize changes in
> set_rw_ddir(), introduce zbd_adjust_ddir() helper function.

I'm not a huge fan of this zbd sprinkling everywhere, but that's not the
fault of your fix. But I would really prefer if zbd got disentangled
from the core/hot paths completely, maybe by adding a io engine fn for
whatever it needs to do. At least that reduces the hot path overhead to
check a pointer and calling that function instead of having various
checks here and there.

Really not great to have a somewhat esoteric feature like zbd support
add _any_ overhead to the hot path at all.

-- 
Jens Axboe



^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH v2] zbd: Fix I/O direction adjustment step for random read/write
  2020-04-17 14:32 ` Jens Axboe
@ 2020-04-17 17:30   ` Alexey Dobriyan
  2020-04-20  2:41     ` Damien Le Moal
  2020-04-20  2:33   ` Damien Le Moal
  1 sibling, 1 reply; 7+ messages in thread
From: Alexey Dobriyan @ 2020-04-17 17:30 UTC (permalink / raw)
  To: Jens Axboe; +Cc: Shin'ichiro Kawasaki, fio, Damien Le Moal

On Fri, Apr 17, 2020 at 08:32:05AM -0600, Jens Axboe wrote:
> I'm not a huge fan of this zbd sprinkling everywhere, but that's not the
> fault of your fix. But I would really prefer if zbd got disentangled
> from the core/hot paths completely, maybe by adding a io engine fn for
> whatever it needs to do.

I think it should be done at generator level. zonemode=strided/zbd is
really a specialized random generator. Such generator would issue first
write in ZBD mode and read/write in normal mode.

zbd_adjust_block() is strange. For example "norandommap" in ZBD mode
doesn't guarantee full LBA space coverage even with 1 thread but it
would guarantee it with 1 thread and generator which shuffles zone
numbers once and issues writes with any QD.

But the combinatorial explosion given the number of options scaries me.
And then things will become even more entertaining once QD > 1 zone appends
start to show up!


^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH v2] zbd: Fix I/O direction adjustment step for random read/write
  2020-04-17 14:32 ` Jens Axboe
  2020-04-17 17:30   ` Alexey Dobriyan
@ 2020-04-20  2:33   ` Damien Le Moal
  1 sibling, 0 replies; 7+ messages in thread
From: Damien Le Moal @ 2020-04-20  2:33 UTC (permalink / raw)
  To: Jens Axboe, Shinichiro Kawasaki, fio; +Cc: Alexey Dobriyan

Jens,

On 2020/04/17 23:32, Jens Axboe wrote:
> On 4/16/20 5:30 AM, Shin'ichiro Kawasaki wrote:
>> Commit fb0259fb ("zbd: Ensure first I/O is write for random read/write to
>> sequential zones") introduced a step to change direction of io_u from
>> read to write when that is the first I/O of the random read/write
>> workload to zoned block devices. However, such direction adjustment
>> results in inconsistent I/O length when read block size and write block
>> size are different.
>>
>> To avoid the inconsistency between I/O direction and I/O length,
>> adjust the I/O direction before the I/O length is set. Move the step
>> from zbd_adjust_block() to set_rw_ddir(). To minimize changes in
>> set_rw_ddir(), introduce zbd_adjust_ddir() helper function.
> 
> I'm not a huge fan of this zbd sprinkling everywhere, but that's not the
> fault of your fix. But I would really prefer if zbd got disentangled
> from the core/hot paths completely, maybe by adding a io engine fn for
> whatever it needs to do. At least that reduces the hot path overhead to
> check a pointer and calling that function instead of having various
> checks here and there.

Understood. We will work on that.

> 
> Really not great to have a somewhat esoteric feature like zbd support
> add _any_ overhead to the hot path at all.

-- 
Damien Le Moal
Western Digital Research


^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH v2] zbd: Fix I/O direction adjustment step for random read/write
  2020-04-17 17:30   ` Alexey Dobriyan
@ 2020-04-20  2:41     ` Damien Le Moal
  2020-04-20 18:10       ` Alexey Dobriyan
  0 siblings, 1 reply; 7+ messages in thread
From: Damien Le Moal @ 2020-04-20  2:41 UTC (permalink / raw)
  To: Alexey Dobriyan, Jens Axboe; +Cc: Shinichiro Kawasaki, fio

On 2020/04/18 2:31, Alexey Dobriyan wrote:
> On Fri, Apr 17, 2020 at 08:32:05AM -0600, Jens Axboe wrote:
>> I'm not a huge fan of this zbd sprinkling everywhere, but that's not the
>> fault of your fix. But I would really prefer if zbd got disentangled
>> from the core/hot paths completely, maybe by adding a io engine fn for
>> whatever it needs to do.
> 
> I think it should be done at generator level. zonemode=strided/zbd is
> really a specialized random generator. Such generator would issue first
> write in ZBD mode and read/write in normal mode.
> 
> zbd_adjust_block() is strange. For example "norandommap" in ZBD mode
> doesn't guarantee full LBA space coverage even with 1 thread but it
> would guarantee it with 1 thread and generator which shuffles zone
> numbers once and issues writes with any QD.

I do not understand your point here: norandommap does *not* guarantee full block
space coverage at all, no matter the zonemode=xxx or no zonemode. For guaranteed
full block space coverage, norandommap should not be specified so that fio keeps
the history of past accesses, no ?

> 
> But the combinatorial explosion given the number of options scaries me.
> And then things will become even more entertaining once QD > 1 zone appends
> start to show up!

Indeed, zone append will be hard to support if we also want to have --verify
working, which is I think mandatory. But we need the kernel interface first. Not
there yet.


-- 
Damien Le Moal
Western Digital Research


^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH v2] zbd: Fix I/O direction adjustment step for random read/write
  2020-04-20  2:41     ` Damien Le Moal
@ 2020-04-20 18:10       ` Alexey Dobriyan
  0 siblings, 0 replies; 7+ messages in thread
From: Alexey Dobriyan @ 2020-04-20 18:10 UTC (permalink / raw)
  To: Damien Le Moal; +Cc: Jens Axboe, Shinichiro Kawasaki, fio

On Mon, Apr 20, 2020 at 02:41:06AM +0000, Damien Le Moal wrote:
> On 2020/04/18 2:31, Alexey Dobriyan wrote:
> > On Fri, Apr 17, 2020 at 08:32:05AM -0600, Jens Axboe wrote:
> >> I'm not a huge fan of this zbd sprinkling everywhere, but that's not the
> >> fault of your fix. But I would really prefer if zbd got disentangled
> >> from the core/hot paths completely, maybe by adding a io engine fn for
> >> whatever it needs to do.
> > 
> > I think it should be done at generator level. zonemode=strided/zbd is
> > really a specialized random generator. Such generator would issue first
> > write in ZBD mode and read/write in normal mode.
> > 
> > zbd_adjust_block() is strange. For example "norandommap" in ZBD mode
> > doesn't guarantee full LBA space coverage even with 1 thread but it
> > would guarantee it with 1 thread and generator which shuffles zone
> > numbers once and issues writes with any QD.
> 
> I do not understand your point here: norandommap does *not* guarantee full block
> space coverage at all, no matter the zonemode=xxx or no zonemode. For guaranteed
> full block space coverage, norandommap should not be specified so that fio keeps
> the history of past accesses, no ?

I always forget which one is which.

Right now adjusting in zbd_adjust_block() doesn't guarantee full LBA
space coverage in cases where deleting "zonemode=" would have done it,
even with 1 thread:

[global]
filename=/dev/loop0	# 1GB
direct=1
zonemode=zbd
zonesize=1M
randseed=1
thread
bs=256K
rw=randwrite
ioengine=libaio
iodepth=1
[j]
max_open_zones=1
size=27M

	t 1     ZR 1
	t 1     ZR 2
	t 1     ZR 3
	t 1     ZR 4
	t 1     ZR 5
	t 1     ZR 6
	t 1     ZR 7
	t 1     ZR 8
	t 1     ZR 9
	t 1     ZR 10
	t 1     ZR 11
	t 1     ZR 12
	t 1     ZR 13
	t 1     ZR 14
	t 1     ZR 15
	t 1     ZR 16
	t 1     ZR 17
	t 1     ZR 18
	t 1     ZR 19
	t 1     ZR 20
	t 1     ZR 21
	t 1     ZR 22
	t 1     ZR 22	<===
	t 1     ZR 23
	t 1     ZR 24
	t 1     ZR 25
	t 1     ZR 26

Non-randomness is separate bug -- "z++" in zbd_convert_to_open_zone().


^ permalink raw reply	[flat|nested] 7+ messages in thread

end of thread, other threads:[~2020-04-20 18:10 UTC | newest]

Thread overview: 7+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2020-04-16 11:30 [PATCH v2] zbd: Fix I/O direction adjustment step for random read/write Shin'ichiro Kawasaki
2020-04-17  4:58 ` Damien Le Moal
2020-04-17 14:32 ` Jens Axboe
2020-04-17 17:30   ` Alexey Dobriyan
2020-04-20  2:41     ` Damien Le Moal
2020-04-20 18:10       ` Alexey Dobriyan
2020-04-20  2:33   ` Damien Le Moal

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.