Linux-NVME Archive on lore.kernel.org
 help / color / Atom feed
From: Damien Le Moal <Damien.LeMoal@wdc.com>
To: "dgilbert@interlog.com" <dgilbert@interlog.com>,
	Christoph Hellwig <hch@lst.de>
Cc: "dm-devel@redhat.com" <dm-devel@redhat.com>,
	Mike Snitzer <snitzer@redhat.com>,
	"linux-block@vger.kernel.org" <linux-block@vger.kernel.org>,
	Jens Axboe <axboe@kernel.dk>,
	"linux-nvme@lists.infradead.org" <linux-nvme@lists.infradead.org>,
	"linux-scsi@vger.kernel.org" <linux-scsi@vger.kernel.org>,
	"Martin K . Petersen" <martin.petersen@oracle.com>,
	"linux-fsdevel@vger.kernel.org" <linux-fsdevel@vger.kernel.org>,
	Johannes Thumshirn <Johannes.Thumshirn@wdc.com>,
	Shinichiro Kawasaki <shinichiro.kawasaki@wdc.com>
Subject: Re: [PATCH v2 3/3] zonefs: fix synchronous write to sequential zone files
Date: Tue, 20 Apr 2021 01:35:21 +0000
Message-ID: <BL0PR04MB651451976F15C55D5578C131E7489@BL0PR04MB6514.namprd04.prod.outlook.com> (raw)
In-Reply-To: <9a4d5090-1a70-129a-72f7-3699db0038a1@interlog.com>

On 2021/04/20 10:20, Douglas Gilbert wrote:
> On 2021-04-19 2:45 a.m., Christoph Hellwig wrote:
>> On Sat, Apr 17, 2021 at 11:33:23AM +0900, Damien Le Moal wrote:
>>> Synchronous writes to sequential zone files cannot use zone append
>>> operations if the underlying zoned device queue limit
>>> max_zone_append_sectors is 0, indicating that the device does not
>>> support this operation. In this case, fall back to using regular write
>>> operations.
>>
>> Zone append is a mandatory feature of the zoned device API.
> 
> So a hack required for ZNS and not needed by ZBC and ZAC becomes
> a "mandatory feature" in a Linux API. Like many hacks, that one might
> come back to bite you :-)

Zone append is not a hack in ZNS. It is a write interface that fits very well
with the multi-queue nature of NVMe. The "hack" is the emulation in scsi.

We decided on having this mandatory for zoned devices (all types) to make sure
that file systems do not have to implement different IO paths for sequential
writing to zones. Zone append does simplify a lot of things and allows to get
the best performance from ZNS drives. Zone write locking/serialization of writes
per zones using regular writes is much harder to implement, make a mess of the
file system code, and would kill write performance on ZNS.


-- 
Damien Le Moal
Western Digital Research

_______________________________________________
Linux-nvme mailing list
Linux-nvme@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-nvme

  reply index

Thread overview: 15+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-04-17  2:33 [PATCH v2 0/3] Fix dm-crypt zoned block device support Damien Le Moal
2021-04-17  2:33 ` [PATCH v2 1/3] dm: Introduce zone append support control Damien Le Moal
2021-04-17  2:33 ` [PATCH v2 2/3] dm crypt: Fix zoned block device support Damien Le Moal
2021-04-17 10:39   ` Johannes Thumshirn
2021-04-18 11:00   ` [dm-devel] " Milan Broz
2021-04-17  2:33 ` [PATCH v2 3/3] zonefs: fix synchronous write to sequential zone files Damien Le Moal
2021-04-19  6:45   ` Christoph Hellwig
2021-04-19  7:08     ` Damien Le Moal
2021-04-19  7:10       ` Christoph Hellwig
2021-04-20  1:20     ` Douglas Gilbert
2021-04-20  1:35       ` Damien Le Moal [this message]
2021-04-19 12:52 ` [dm-devel] [PATCH v2 0/3] Fix dm-crypt zoned block device support Mikulas Patocka
2021-04-19 13:02   ` Damien Le Moal
2021-04-19 13:55     ` Mikulas Patocka
2021-04-19 14:31       ` Milan Broz

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=BL0PR04MB651451976F15C55D5578C131E7489@BL0PR04MB6514.namprd04.prod.outlook.com \
    --to=damien.lemoal@wdc.com \
    --cc=Johannes.Thumshirn@wdc.com \
    --cc=axboe@kernel.dk \
    --cc=dgilbert@interlog.com \
    --cc=dm-devel@redhat.com \
    --cc=hch@lst.de \
    --cc=linux-block@vger.kernel.org \
    --cc=linux-fsdevel@vger.kernel.org \
    --cc=linux-nvme@lists.infradead.org \
    --cc=linux-scsi@vger.kernel.org \
    --cc=martin.petersen@oracle.com \
    --cc=shinichiro.kawasaki@wdc.com \
    --cc=snitzer@redhat.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Linux-NVME Archive on lore.kernel.org

Archives are clonable:
	git clone --mirror https://lore.kernel.org/linux-nvme/0 linux-nvme/git/0.git

	# If you have public-inbox 1.1+ installed, you may
	# initialize and index your mirror using the following commands:
	public-inbox-init -V2 linux-nvme linux-nvme/ https://lore.kernel.org/linux-nvme \
		linux-nvme@lists.infradead.org
	public-inbox-index linux-nvme

Example config snippet for mirrors

Newsgroup available over NNTP:
	nntp://nntp.lore.kernel.org/org.infradead.lists.linux-nvme


AGPL code for this site: git clone https://public-inbox.org/public-inbox.git