From: David Sterba <dsterba@suse.cz>
To: Johannes Thumshirn <Johannes.Thumshirn@wdc.com>
Cc: "Javier González" <javier@javigon.com>,
"Christoph Hellwig" <hch@lst.de>,
"Matias Bjørling" <Matias.Bjorling@wdc.com>,
"Damien Le Moal" <damien.lemoal@opensource.wdc.com>,
"Luis Chamberlain" <mcgrof@kernel.org>,
"Keith Busch" <kbusch@kernel.org>,
"Pankaj Raghav" <p.raghav@samsung.com>,
"Adam Manzanares" <a.manzanares@samsung.com>,
"jiangbo.365@bytedance.com" <jiangbo.365@bytedance.com>,
"kanchan Joshi" <joshi.k@samsung.com>,
"Jens Axboe" <axboe@kernel.dk>,
"Sagi Grimberg" <sagi@grimberg.me>,
"Pankaj Raghav" <pankydev8@gmail.com>,
"Kanchan Joshi" <joshiiitr@gmail.com>,
"linux-block@vger.kernel.org" <linux-block@vger.kernel.org>,
"linux-nvme@lists.infradead.org" <linux-nvme@lists.infradead.org>,
"linux-btrfs @ vger . kernel . org" <linux-btrfs@vger.kernel.org>
Subject: Re: [PATCH 0/6] power_of_2 emulation support for NVMe ZNS devices
Date: Tue, 15 Mar 2022 15:27:40 +0100
Message-ID: <20220315142740.GU12643@twin.jikos.cz>
In-Reply-To: <PH0PR04MB74167377D7D86C60C290DAB29B109@PH0PR04MB7416.namprd04.prod.outlook.com>
On Tue, Mar 15, 2022 at 02:14:23PM +0000, Johannes Thumshirn wrote:
> On 15/03/2022 14:52, Javier González wrote:
> > On 15.03.2022 14:30, Christoph Hellwig wrote:
> >> On Tue, Mar 15, 2022 at 02:26:11PM +0100, Javier González wrote:
> >>> but we do not see a usage for ZNS in F2FS, as it is a mobile
> >>> file-system. As other interfaces arrive, this work will become natural.
> >>>
> >>> ZoneFS and btrfs are good targets for ZNS and these we can do. I would
> >>> still do the work in phases to make sure we have enough early feedback
> >>> from the community.
> >>>
> >>> Since this thread has been very active, I will wait some time for
> >>> Christoph and others to catch up before we start sending code.
> >>
> >> Can someone summarize where we stand? Between the lack of quoting
> >> from hell and overly long lines from corporate mail clients I've
> >> mostly stopped reading this thread because it takes too much effort
> >> to actually extract the information.
> >
> > Let me give it a try:
> >
> > - PO2 emulation in NVMe is a no-go. Drop this.
> >
> > - The arguments against supporting PO2 are:
> >     - It makes ZNS depart from a SMR assumption of PO2 zone sizes. This
> >       can create confusion for users of both SMR and ZNS
> >
> >     - Existing applications assume PO2 zone sizes, and probably do
> >       optimizations for these. These applications, if wanting to use
> >       ZNS will have to change the calculations
> >
> >     - There is a fear of performance regressions.
> >
> >     - It adds more work to you and other maintainers
> >
> > - The arguments in favour of PO2 are:
> >     - Unmapped LBAs create holes that applications need to deal with.
> >       This affects mapping and performance due to splits. Bo explained
> >       this in a thread from Bytedance's perspective. I explained in an
> >       answer to Matias how we are not letting zones transition to
> >       offline in order to simplify the host stack. Not sure if this is
> >       something we want to bring to NVMe.
> >
> >     - As ZNS adds more features and other protocols add support for
> >       zoned devices we will have more use-cases for the zoned block
> >       device. We will have to deal with this fragmentation at some
> >       point.
> >
> >     - This is used in production workloads in Linux hosts. I would
> >       advocate for this not being off-tree as it will be a headache for
> >       all in the future.
> >
> > - If you agree that removing PO2 is an option, we can do the following:
> >     - Remove the constraint in the block layer and add ZoneFS support
> >       in a first patch.
> >
> >     - Add btrfs support in a later patch
>
> (+ linux-btrfs )
>
> Please also make sure to support btrfs and not only throw some patches
> over the fence. Zoned device support in btrfs is complex enough and has
> quite some special casing vs regular btrfs, which we're working on getting
> rid of. So having a non-power-of-2 zone size would also mean having NPO2
> block-groups (and thus block-groups not aligned to the stripe size).
>
> Just thinking of this and knowing I need to support it gives me a
> headache.

PO2 is really easy to work with, and I guess allocation on the physical
device could also benefit from that. I'm still puzzled why NPO2 is
even proposed.

We can possibly hide the calculations behind some API, so I hope in the
end it should be bearable. The size of block groups is flexible; we only
want some reasonable alignment.

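
To illustrate what hiding the calculations behind an API could look like,
here is a minimal userspace sketch. It is not code from this thread or from
the kernel: struct zone_geom and the zone_*() helpers are hypothetical names,
and sectors are assumed to be 512 bytes. Callers ask for a zone number or an
offset and never see whether a shift/mask (PO2) or a division/modulo (NPO2)
is used underneath.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical geometry descriptor; not an existing kernel structure. */
struct zone_geom {
	uint64_t zone_sectors;	/* zone size in 512-byte sectors */
	unsigned int shift;	/* log2(zone_sectors), meaningful only if po2 */
	bool po2;		/* true when zone_sectors is a power of two */
};

static inline void zone_geom_init(struct zone_geom *g, uint64_t zone_sectors)
{
	assert(zone_sectors != 0);
	g->zone_sectors = zone_sectors;
	g->po2 = (zone_sectors & (zone_sectors - 1)) == 0;
	g->shift = 0;
	if (g->po2)
		while ((UINT64_C(1) << g->shift) < zone_sectors)
			g->shift++;
}

/* Index of the zone containing @sector. */
static inline uint64_t zone_no(const struct zone_geom *g, uint64_t sector)
{
	/* PO2: a shift; NPO2: a 64-bit division. */
	return g->po2 ? sector >> g->shift : sector / g->zone_sectors;
}

/* Offset of @sector from the start of its zone. */
static inline uint64_t zone_offset(const struct zone_geom *g, uint64_t sector)
{
	/* PO2: a mask; NPO2: a 64-bit modulo. */
	return g->po2 ? sector & (g->zone_sectors - 1)
		      : sector % g->zone_sectors;
}

/* First sector of the zone containing @sector. */
static inline uint64_t zone_start(const struct zone_geom *g, uint64_t sector)
{
	return sector - zone_offset(g, sector);
}
```

With a 256 MiB zone (524288 sectors) the helpers take the shift/mask path;
with a 96 MiB zone (196608 sectors) they fall back to division. The callers
look the same either way, which is the point of the wrapper. Whether the
extra branch or the division matters in practice would have to be measured.
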
> Also please consult the rest of the btrfs developers for thoughts on this.
> After all btrfs has full zoned support (including ZNS, not saying it's
> perfect) and is also the default FS for at least two Linux distributions.

I haven't read the whole thread yet, but my impression is that some hardware
is deliberately breaking existing assumptions about zoned devices and in
turn breaking btrfs support. I hope I'm wrong on that, or at least that
it's possible to work around it.