From: "Javier González" <javier@javigon.com>
To: Matias Bjorling <Matias.Bjorling@wdc.com>
Cc: "lsf-pc@lists.linux-foundation.org"
<lsf-pc@lists.linux-foundation.org>,
"linux-fsdevel@vger.kernel.org" <linux-fsdevel@vger.kernel.org>,
"linux-block@vger.kernel.org" <linux-block@vger.kernel.org>,
"linux-ide@vger.kernel.org" <linux-ide@vger.kernel.org>,
"linux-scsi@vger.kernel.org" <linux-scsi@vger.kernel.org>,
"linux-nvme@lists.infradead.org" <linux-nvme@lists.infradead.org>,
Damien Le Moal <Damien.LeMoal@wdc.com>
Subject: Re: [LSF/MM TOPIC] Zoned Block Devices
Date: Tue, 29 Jan 2019 09:25:17 +0100
Message-ID: <F88F3B28-4CC8-4B54-AE2F-341936C0C184@javigon.com>
In-Reply-To: <714fc666-c562-83c2-c1a3-19f1dd47d1d9@wdc.com>
> On 28 Jan 2019, at 13.56, Matias Bjorling <Matias.Bjorling@wdc.com> wrote:
>
> Hi,
>
> Damien and I would like to propose a couple of topics centering around
> zoned block devices:
>
> 1) Zoned block devices require that writes to a zone are sequential. If
> the writes are dispatched to the device out of order, the drive rejects
> the write with a write failure.
>
> So far it has been the responsibility of the deadline I/O scheduler to
> serialize writes to zones to avoid intra-zone write command reordering.
> This I/O scheduler based approach has worked so far for HDDs, but we can
> do better for multi-queue devices. NVMe has support for multiple queues,
> and one could dedicate a single queue to writes alone. Furthermore, the
> queue is processed in-order, enabling the host to serialize writes on
> the queue, instead of issuing them one by one. We would like to gather
> feedback on this approach (new HCTX_TYPE_WRITE).
>
> 2) Adoption of Zone Append in file-systems and user-space applications.
>
> A Zone Append command, together with Zoned Namespaces, is being defined
> in the NVMe workgroup. The new command allows one to automatically
> direct writes to a zone write pointer position, similarly to writing to
> a file opened with O_APPEND. With this Zone Append command, the drive
> returns where the data was written in the zone. This provides two benefits:
>
> (A) It moves the fine-grained logical block allocation in file-systems
> to the device side. A file-system continues to do coarse-grained logical
> block allocation, but the specific LBAs where data is written are chosen
> and reported by the device, which improves file-system performance. The
> current target is XFS, but we would like to hear the feasibility of it
> being used in other file-systems.
>
> (B) It lets the host issue multiple outstanding write I/Os to a zone
> without having to maintain I/O order. This improves the performance of
> the drive and also reduces the need for zone locking on the host side.
>
> Are there other use cases for this, and would an interface like this be
> valuable in the kernel? If the interface is successful, we would expect
> the interface to move to ATA/SCSI for standardization as well.
>
> Thanks, Matias
This topic is of interest to me as well.

For the append command, I think we also need to discuss the error model,
as writes should be able to fail (e.g., a zone has shrunk due to
previous, hidden write errors and the host has not updated the zone
metadata).
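
To make the failure case concrete, here is a minimal sketch of the
scenario in Python. It is a toy model, not kernel or NVMe code: the
names Zone, ZoneAppendError, append() and shrink() are all hypothetical,
and the assumption that a device shrinks a zone by lowering its
writable capacity is only for illustration.

```python
from dataclasses import dataclass


class ZoneAppendError(Exception):
    """Hypothetical error: the append does not fit in the zone."""


@dataclass
class Zone:
    start_lba: int          # first LBA of the zone
    capacity: int           # writable blocks the device currently offers
    write_pointer: int = 0  # next write offset, relative to start_lba

    def append(self, nblocks: int) -> int:
        """Place nblocks at the write pointer and return the LBA used."""
        if self.write_pointer + nblocks > self.capacity:
            raise ZoneAppendError("append exceeds zone capacity")
        lba = self.start_lba + self.write_pointer
        self.write_pointer += nblocks
        return lba

    def shrink(self, lost_blocks: int) -> None:
        """Assumed device behaviour: retire blocks after hidden write errors."""
        self.capacity = max(self.write_pointer, self.capacity - lost_blocks)


device_zone = Zone(start_lba=4096, capacity=8)
host_view_capacity = device_zone.capacity  # host caches the zone metadata

first = device_zone.append(4)  # device picks the LBA and reports it back
device_zone.shrink(2)          # zone shrinks; the host is not told

# The host still believes 4 blocks are free, so this append must fail.
remaining_by_host = host_view_capacity - device_zone.write_pointer
try:
    device_zone.append(remaining_by_host)
    failed = False
except ZoneAppendError:
    failed = True
    host_view_capacity = device_zone.capacity  # host refreshes its metadata
```

The point of the sketch is that the host only learns about the shrunk
zone through the failed append, so the error model needs to say what
status is returned and how the host is expected to resynchronize.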
Thanks,
Javier