From: David Sterba <dsterba@suse.cz>
To: Pankaj Raghav <p.raghav@samsung.com>
Cc: jaegeuk@kernel.org, hare@suse.de, dsterba@suse.com, axboe@kernel.dk,
	hch@lst.de, damien.lemoal@opensource.wdc.com, snitzer@kernel.org,
	Chris Mason <clm@fb.com>, Josef Bacik <josef@toxicpanda.com>,
	bvanassche@acm.org, linux-fsdevel@vger.kernel.org,
	matias.bjorling@wdc.com, Jens Axboe <axboe@fb.com>,
	gost.dev@samsung.com, jonathan.derrick@linux.dev,
	jiangbo.365@bytedance.com, linux-nvme@lists.infradead.org,
	dm-devel@redhat.com, Naohiro Aota <naohiro.aota@wdc.com>,
	linux-kernel@vger.kernel.org, Johannes Thumshirn <jth@kernel.org>,
	Sagi Grimberg <sagi@grimberg.me>, Alasdair Kergon <agk@redhat.com>,
	linux-block@vger.kernel.org, Chaitanya Kulkarni <kch@nvidia.com>,
	Keith Busch <kbusch@kernel.org>, linux-btrfs@vger.kernel.org
Subject: Re: [PATCH v3 00/11] support non power of 2 zoned devices
Date: Fri, 6 May 2022 12:00:55 +0200
Message-ID: <20220506100054.GZ18596@suse.cz>
In-Reply-To: <20220506081105.29134-1-p.raghav@samsung.com>

On Fri, May 06, 2022 at 10:10:54AM +0200, Pankaj Raghav wrote:
> - Open issue:
>   * btrfs superblock location for zoned devices is expected to be in 0,
>     512GB(mirror) and 4TB(mirror) in the device. Zoned devices with po2
>     zone size will naturally align with these superblock location but non
>     po2 devices will not align with 512GB and 4TB offset.
>
>     The current approach for npo2 devices is to place the superblock mirror
>     zones near 512GB and 4TB that is **aligned to the zone size**.

I don't like that, the offsets have been chosen so the values are fixed
and also future proof in case the zone size increases significantly. The
natural alignment of the pow2 zones makes it fairly trivial.

If I understand correctly what you suggest, it would mean that if zone
is eg. 5G and starts at 510G then the superblock should start at 510G,
right? And with another device that has 7G zone size the nearest
multiple is 511G. And so on.

That makes it all less predictable, depending on the physical device
constraints that are affecting the logical data structures of the
filesystem. We tried to avoid that with pow2, the only thing that
depends on the device is that the range from the super block offsets is
always 2 zones.

I really want to keep the offsets for all zoned devices the same and
adapt the code that's handling the writes. This is possible with the
non-pow2 too, the first write is set to the expected offset, leaving the
beginning of the zone unused.

>     This
>     is of no issue for normal operation as we keep track where the superblock
>     mirror are placed but this can cause an issue with recovery tools for
>     zoned devices as they expect mirror superblock to be in 512GB and 4TB.

Yeah the tools need to be updated, btrfs-progs and suite of blk* in
util-linux.

>     Note that ATM, recovery tools such as `btrfs check` does not work for
>     image dumps for zoned devices even for po2 zone sizes.

I thought this worked, but if you find something that does not please
report that to Johannes or Naohiro.
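
The two placements discussed above can be compared with a little arithmetic. The following is only an illustrative Python sketch, not code from the patch series or from btrfs. The function names are hypothetical, sizes are assumed to be binary units (GiB), and "aligned to the zone size" is assumed to mean rounding the nominal offset down to a zone boundary, which matches the 5G and 7G examples in the reply.

```python
GiB = 1 << 30

# Nominal superblock offsets for zoned devices as stated in the thread:
# 0, 512G (mirror) and 4T (mirror). Binary units are an assumption here.
SB_OFFSETS = (0, 512 * GiB, 4096 * GiB)


def zone_aligned_sb_start(offset, zone_size):
    """Placement the reply objects to (hypothetical helper).

    The superblock start moves to the zone boundary at or below the
    nominal offset, so it varies with the device's zone size.
    """
    return (offset // zone_size) * zone_size


def fixed_offset_sb_placement(offset, zone_size):
    """Placement the reply prefers (hypothetical helper).

    The superblock stays at the nominal offset. Returns the start of the
    zone containing it and the number of bytes left unused at the
    beginning of that zone.
    """
    zone_start = (offset // zone_size) * zone_size
    return zone_start, offset - zone_start


if __name__ == "__main__":
    for zone_gib in (5, 7):
        zone_size = zone_gib * GiB
        moved = zone_aligned_sb_start(512 * GiB, zone_size)
        zone_start, unused = fixed_offset_sb_placement(512 * GiB, zone_size)
        print(f"{zone_gib}G zones: zone-aligned start {moved // GiB}G; "
              f"fixed 512G start leaves {unused // GiB}G unused "
              f"in the zone at {zone_start // GiB}G")
```

With a power-of-2 zone size both helpers agree, since 512G and 4T are already zone boundaries and no space is skipped. With 5G or 7G zones the first helper yields a device-dependent location (510G or 511G), while the second keeps 512G and only changes how much of the zone's head is left unwritten.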