From: David Sterba <dsterba@suse.cz>
To: Pankaj Raghav <p.raghav@samsung.com>
Cc: dsterba@suse.cz, jaegeuk@kernel.org, hare@suse.de, dsterba@suse.com,
    axboe@kernel.dk, hch@lst.de, damien.lemoal@opensource.wdc.com,
    snitzer@kernel.org, Chris Mason <clm@fb.com>,
    Josef Bacik <josef@toxicpanda.com>, bvanassche@acm.org,
    linux-fsdevel@vger.kernel.org, matias.bjorling@wdc.com,
    Jens Axboe <axboe@fb.com>, gost.dev@samsung.com,
    jonathan.derrick@linux.dev, jiangbo.365@bytedance.com,
    linux-nvme@lists.infradead.org, dm-devel@redhat.com,
    Naohiro Aota <naohiro.aota@wdc.com>, linux-kernel@vger.kernel.org,
    Johannes Thumshirn <jth@kernel.org>, Sagi Grimberg <sagi@grimberg.me>,
    Alasdair Kergon <agk@redhat.com>, linux-block@vger.kernel.org,
    Chaitanya Kulkarni <kch@nvidia.com>, Keith Busch <kbusch@kernel.org>,
    linux-btrfs@vger.kernel.org, Luis Chamberlain <mcgrof@kernel.org>
Subject: Re: [PATCH v3 11/11] dm-zoned: ensure only power of 2 zone sizes are allowed
Date: Wed, 11 May 2022 18:00:02 +0200
Message-ID: <20220511160001.GQ18596@twin.jikos.cz>
In-Reply-To: <d8e86c32-f122-01df-168e-648179766c55@samsung.com>

On Wed, May 11, 2022 at 04:39:17PM +0200, Pankaj Raghav wrote:
> Hi David,
>
> On 2022-05-09 20:54, David Sterba wrote:
> >> diff --git a/drivers/md/dm-zone.c b/drivers/md/dm-zone.c
> >> index 3e7b1fe15..27dc4ddf2 100644
> >> --- a/drivers/md/dm-zone.c
> >> +++ b/drivers/md/dm-zone.c
> >> @@ -231,6 +231,18 @@ static int dm_revalidate_zones(struct mapped_device *md, struct dm_table *t)
> >>  	struct request_queue *q = md->queue;
> >>  	unsigned int noio_flag;
> >>  	int ret;
> >> +	struct block_device *bdev = md->disk->part0;
> >> +	sector_t zone_sectors;
> >> +	char bname[BDEVNAME_SIZE];
> >> +
> >> +	zone_sectors = bdev_zone_sectors(bdev);
> >> +
> >> +	if (!is_power_of_2(zone_sectors)) {
> >
> > is_power_of_2 takes 'unsigned long' and sector_t is u64, so this is not
> > 32bit clean and we had an actual bug where value 1<<48 was not
> > recognized as power of 2.
> >
> Good catch. Now I understand why btrfs has a helper for is_power_of_two_u64.
>
> But the zone size can never be more than 32bit value so the zone size
> sect will never greater than unsigned long.

We've set the maximum supported zone size in btrfs to 8G, which is a
lot and should be sufficient for some time, but it also means that the
value is larger than the 32bit maximum. I have actually tested btrfs on
top of such an emulated zoned device via TCMU, which is not dm-zoned, so
it's up to you to make sure that a silent overflow won't happen.

> With that said, we have two options:
>
> 1.) We can put a comment explaining that even though it is 32 bit
> unsafe, zone size sect can never be a 32bit value

This is probably part of the protocol and specification of the zoned
devices; the filesystem either accepts the spec or makes some room for
larger values in case it's not too costly.

> or
>
> 2) We should move the btrfs only helper `is_power_of_two_u64` to some
> common header and use it everywhere.

Yeah, that can be done independently. With some macro magic it can be
made type-safe for any argument while preserving the 'is_power_of_2'
name.
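
The truncation problem discussed above can be reproduced in user space. The following is a minimal sketch, not kernel code: `is_pow2_u32`, `is_pow2_u64`, `narrow_check`, and `IS_POW2_ANY` are hypothetical names introduced for illustration, and `uint32_t` stands in for 'unsigned long' as it would be on a 32-bit architecture. The macro is only one simple guess at what "macro magic" could look like; it is not the helper that was eventually proposed or merged.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumption: models a power-of-2 check whose parameter is only 32 bits
 * wide, as 'unsigned long' is on a 32-bit architecture. */
static bool is_pow2_u32(uint32_t n)
{
	return n != 0 && (n & (n - 1)) == 0;
}

/* Same test performed at full 64-bit width (hypothetical stand-in for a
 * u64-specific helper). */
static bool is_pow2_u64(uint64_t n)
{
	return n != 0 && (n & (n - 1)) == 0;
}

/* Makes the silent narrowing explicit: the upper 32 bits are discarded
 * before the check runs, just as they would be at the call site. */
static bool narrow_check(uint64_t v)
{
	return is_pow2_u32((uint32_t)v);
}

/* Hypothetical width-preserving macro: the arithmetic is done in the
 * argument's own type, so no bits are lost. It evaluates its argument
 * more than once, so it must not be used with side effects. */
#define IS_POW2_ANY(n) ((n) != 0 && (((n) & ((n) - 1)) == 0))
```

With these definitions, `narrow_check(1ULL << 48)` is false because the value truncates to 0, while `is_pow2_u64(1ULL << 48)` is true. The opposite error is also possible: `(1ULL << 48) + 4096` truncates to 4096, so the narrow check accepts a value that is not a power of 2.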