linux-fsdevel.vger.kernel.org archive mirror
From: "Javier González" <javier@javigon.com>
To: Matias Bjorling <Matias.Bjorling@wdc.com>
Cc: "lsf-pc@lists.linux-foundation.org" 
	<lsf-pc@lists.linux-foundation.org>,
	"linux-fsdevel@vger.kernel.org" <linux-fsdevel@vger.kernel.org>,
	"linux-block@vger.kernel.org" <linux-block@vger.kernel.org>,
	"linux-ide@vger.kernel.org" <linux-ide@vger.kernel.org>,
	"linux-scsi@vger.kernel.org" <linux-scsi@vger.kernel.org>,
	"linux-nvme@lists.infradead.org" <linux-nvme@lists.infradead.org>,
	Damien Le Moal <Damien.LeMoal@wdc.com>
Subject: Re: [LSF/MM TOPIC] Zoned Block Devices
Date: Tue, 29 Jan 2019 09:25:17 +0100
Message-ID: <F88F3B28-4CC8-4B54-AE2F-341936C0C184@javigon.com>
In-Reply-To: <714fc666-c562-83c2-c1a3-19f1dd47d1d9@wdc.com>


> On 28 Jan 2019, at 13.56, Matias Bjorling <Matias.Bjorling@wdc.com> wrote:
> 
> Hi,
> 
> Damien and I would like to propose a couple of topics centering around
> zoned block devices:
> 
> 1) Zoned block devices require that writes to a zone are sequential. If
> the writes are dispatched to the device out of order, the drive rejects
> the write with a write failure.
> 
> So far it has been the responsibility of the deadline I/O scheduler to
> serialize writes to zones to avoid intra-zone write command reordering.
> This I/O scheduler based approach has worked so far for HDDs, but we can
> do better for multi-queue devices. NVMe has support for multiple queues,
> and one could dedicate a single queue to writes alone. Furthermore, the
> queue is processed in-order, enabling the host to serialize writes on
> the queue, instead of issuing them one by one. We would like to gather
> feedback on this approach (new HCTX_TYPE_WRITE).
> 
> 2) Adoption of Zone Append in file-systems and user-space applications.
> 
> A Zone Append command, together with Zoned Namespaces, is being defined
> in the NVMe workgroup. The new command allows one to automatically
> direct writes to a zone write pointer position, similarly to writing to
> a file opened with O_APPEND. With this write append command, the drive
> returns where the data was written in the zone. This provides two benefits:
> 
> (A) It moves the fine-grained logical block allocation in file-systems
> to the device side. A file-system continues to do coarse-grained logical
> block allocation, but the specific LBAs where data is written are chosen
> and reported by the device, thus improving file-system performance. The
> current target is XFS but we would like to hear the feasibility of it
> being used in other file-systems.
> 
> (B) It lets the host issue multiple outstanding write I/Os to a zone
> without having to maintain I/O order. This improves the performance of
> the drive and also reduces the need for zone locking on the host side.
> 
> Are there other use cases for this, and would an interface like this be
> valuable in the kernel? If the interface is successful, we would expect
> it to move to ATA/SCSI for standardization as well.
> 
> Thanks, Matias

This topic is of interest to me as well.

For the append command, I think we also need to discuss the error model,
as writes should be able to fail (e.g., a zone has shrunk due to
previous, hidden write errors and the host has not updated its zone
metadata).
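
A toy model may help frame that discussion. The sketch below is not
NVMe-defined behaviour: the Zone class, its fields, and ZoneAppendError
are hypothetical names chosen for illustration, and the "shrink" is
simulated by lowering the capacity by hand. It only shows the shape of
the problem: the device picks and returns the LBA on append, and an
append can fail when the host's cached view of the zone is stale.

```python
from dataclasses import dataclass


class ZoneAppendError(Exception):
    """Hypothetical error: the append could not be placed in the zone."""


@dataclass
class Zone:
    start_lba: int          # first LBA of the zone
    capacity: int           # blocks the device can currently accept
    write_pointer: int = 0  # offset of the next write, relative to start_lba

    def append(self, nblocks: int) -> int:
        """Place nblocks at the write pointer and return the LBA used."""
        if self.write_pointer + nblocks > self.capacity:
            raise ZoneAppendError("zone full or smaller than the host expects")
        lba = self.start_lba + self.write_pointer
        self.write_pointer += nblocks
        return lba


zone = Zone(start_lba=1000, capacity=8)
host_view_capacity = zone.capacity  # host caches the zone metadata

first = zone.append(4)  # device chooses the LBA and reports it back
zone.capacity = 6       # simulated device-side shrink, unseen by the host

try:
    zone.append(4)      # host still believes 4 blocks are free
    failed = False
except ZoneAppendError:
    failed = True       # host would need to refresh metadata and retry elsewhere
```

In this model the second append fails even though the host's cached
capacity says it should fit, which is the case the error model would
have to define (what is reported, and what state the zone is left in).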

Thanks,
Javier


Thread overview: 4+ messages
2019-01-28 12:56 [LSF/MM TOPIC] Zoned Block Devices Matias Bjorling
2019-01-28 15:07 ` Bart Van Assche
2019-01-28 18:40   ` Matias Bjorling
2019-01-29  8:25 ` Javier González [this message]
