From: Jens Axboe <axboe@kernel.dk>
To: Bart Van Assche <Bart.VanAssche@wdc.com>,
"mb@lightnvm.io" <mb@lightnvm.io>,
"loberman@redhat.com" <loberman@redhat.com>
Cc: "linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>,
"linux-block@vger.kernel.org" <linux-block@vger.kernel.org>,
Damien Le Moal <Damien.LeMoal@wdc.com>
Subject: Re: [PATCH 0/2] null_blk: zone support
Date: Tue, 10 Jul 2018 12:45:47 -0600
Message-ID: <9225abd8-35de-641d-2d2b-7ed566fb9956@kernel.dk>
In-Reply-To: <a183979f10fda180176f671702f69efe338f56e7.camel@wdc.com>
On 7/10/18 10:47 AM, Bart Van Assche wrote:
> On Tue, 2018-07-10 at 08:46 -0600, Jens Axboe wrote:
>> On 7/9/18 6:05 PM, Bart Van Assche wrote:
>>> On Mon, 2018-07-09 at 10:34 -0600, Jens Axboe wrote:
>>>> In the spirit of making some progress on this, I just don't like how
>>>> it's done. For example, it should not be necessary to adjust what
>>>> comes out of the block generator, instead the block generator should
>>>> be told to do what we need on zbc. This is a key concept. The workload
>>>> should be defined such that it works for zoned devices.
>>>
>>> How would you like to see block generation work? I don't see an
>>> alternative for random I/O other than starting from the output of a random
>>> generator and translating that output into something that is
>>> appropriate for a zoned block device. Random reads must happen below
>>> the zone pointer if fio is configured to read below the zone pointer.
>>> Random writes must happen at the write pointer. The only way I see to
>>> implement such an I/O pattern is to start from the output of a random
>>> generator and to adjust the output of that random generator. However,
>>> I don't have a strong opinion whether adjusting the output of a random
>>> generator should happen by the caller of get_next_buflen() or inside
>>> get_next_buflen(). Or is your concern perhaps that the current
>>> approach interferes with fio job options like bs_unaligned?
>>
>> The main issue I have with that approach is that the core of fio is
>> generating the IO patterns, and then you are just changing them as you
>> see fit. This means that the workload definition and the resulting IO
>> operations are no longer matched up, since they now also depend on what
>> you are running on. If I take one workload and run it on a zoned drive,
>> and then run it on a non-zoned drive, I can't compare the results at
>> all. This is a showstopper.
>>
>> There should be no adjusting of the output, rather it should be possible
>> to write zoned friendly job definitions. It should be possible to run
>> the same job on a non-zoned drive, and vice versa, and the resulting IO
>> patterns must be the same.
>>
>> Fio already has some notion of zones. Maybe that could be extended to
>> hard zones, and some control of open zones, and patterns within those
>> zones?
>
> Hello Jens,
>
> How about adding a job option that makes it possible to use the zoned
> block device (ZBD) I/O pattern on non-ZBD devices, requiring that the
> zone size is set explicitly for non-ZBD devices and maintaining a write
> pointer not only when performing I/O to a ZBD device but also if a
> ZBD-style I/O pattern is applied to a non-ZBD disk? This should make it
> possible to apply exactly the same workload to a non-ZBD disk as to a
> ZBD disk.

It just doesn't make any sense to me. The source of truth is the
generator of the IO, which does exactly what it is told by the job
definition. You're proposing to mangle that somehow, to fit some
restrictions that the underlying device has. That very concept is
foreign, and adding an option to be able to do the same on some other
device is misleading. The difference between the job file and the
workload run can be huge. Consider something really basic:

[randwrites]
bs=4k
rw=randwrite

which would be 100% random 4k writes. If I run this on a zoned device,
then that'd turn into 100% sequential writes. That makes no sense at
all. And if I run it on a different device, I'd get 100% random writes.
Except if I set some magic option. Sorry, but that concept is just too
ugly to live, it makes zero sense. Put down your zoned hat for a bit and
think about it.

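
To make the mismatch concrete, here is a minimal Python sketch. It is not
fio code and not the patch under discussion, only a simplified model of
the adjustment being argued about. The zone size, zone count, and both
function names are made up for the illustration. Random 4k offsets are
generated as the job file asks, then each one is moved to the write
pointer of the zone it falls in:

```python
import random

ZONE_SIZE = 1 << 20   # hypothetical 1 MiB zones (assumption)
NUM_ZONES = 4         # hypothetical device with four zones (assumption)
BS = 4096             # bs=4k, as in the job file above


def random_offsets(n, seed=0):
    """What the job asks for: uniformly random, block-aligned offsets."""
    rng = random.Random(seed)
    blocks = ZONE_SIZE * NUM_ZONES // BS
    return [rng.randrange(blocks) * BS for _ in range(n)]


def clamp_to_write_pointer(offsets):
    """Simplified model: move every write to its zone's write pointer."""
    wp = [z * ZONE_SIZE for z in range(NUM_ZONES)]
    issued = []
    for off in offsets:
        zone = off // ZONE_SIZE
        if wp[zone] + BS > (zone + 1) * ZONE_SIZE:
            wp[zone] = zone * ZONE_SIZE   # zone full: model a zone reset
        issued.append(wp[zone])
        wp[zone] += BS
    return issued


requested = random_offsets(16)
issued = clamp_to_write_pointer(requested)
for req, iss in zip(requested, issued):
    print(f"requested {req:>8}  issued {iss:>8}")
```

Within each zone the issued offsets advance strictly in steps of 4k from
the start of the zone, whatever offsets were requested. The job file
still says rw=randwrite, but that is no longer what reaches the device.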
> What I derived from the fio source code is as follows (please correct me
> if I got anything wrong):
> * The purpose of the zonesize, zonerange and zoneskip job options is to
> limit the I/O range to a single zone with size "zonesize". The I/O
> pattern for zoned block devices is different: I/O happens in multiple
> zones simultaneously. The number of zones to which I/O happens is
> called the number of open zones.

The only difference is that fio currently only has one zone active. When
it finishes one, it goes to the next. See my above suggestion on adding
the notion of open zones, which would extend this to more than 1.

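
As a rough Python sketch of what "more than one open zone" could mean when
it is part of the workload definition: the ZonedJob fields and the
generate() function below are hypothetical and do not correspond to
existing fio options or code. The pattern is derived only from the job
parameters, so it comes out the same whatever device it is run against:

```python
import random
from dataclasses import dataclass


@dataclass
class ZonedJob:
    """Hypothetical job definition; field names are illustrative only."""
    zone_size: int   # bytes per zone, stated in the job, not read from a device
    open_zones: int  # number of zones written concurrently
    bs: int          # block size in bytes
    seed: int = 0


def generate(job, n):
    """Return n write offsets: a random pick among the open zones,
    sequential within each zone. Device capacity is ignored here."""
    rng = random.Random(job.seed)
    # Each slot tracks [zone index, bytes already written in that zone].
    active = [[z, 0] for z in range(job.open_zones)]
    next_zone = job.open_zones
    offsets = []
    for _ in range(n):
        slot = rng.randrange(job.open_zones)
        zone, written = active[slot]
        offsets.append(zone * job.zone_size + written)
        written += job.bs
        if written + job.bs > job.zone_size:
            # Zone finished: this slot moves on to the next unused zone.
            active[slot] = [next_zone, 0]
            next_zone += 1
        else:
            active[slot][1] = written
    return offsets


job = ZonedJob(zone_size=16384, open_zones=2, bs=4096, seed=1)
for off in generate(job, 12):
    print(off)
```

With open_zones=1 this degenerates to the current behaviour described
above: one active zone, and when it is finished the next one is started.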
> * The purpose of the random_distribution=zoned{_abs} job option is to
> allow the user to skew a uniform random distribution. This is a
> different workload pattern from the typical pattern for ZBD drives.

Fio's zones were never intended to be for zoned devices. Don't get hung
up on current use cases, think about what kind of definitions would make
sense for zoned devices.

--
Jens Axboe