From: Pankaj Raghav <p.raghav@samsung.com>
To: axboe@kernel.dk, damien.lemoal@opensource.wdc.com, pankydev8@gmail.com, dsterba@suse.com, hch@lst.de
Cc: linux-nvme@lists.infradead.org, linux-fsdevel@vger.kernel.org, linux-btrfs@vger.kernel.org, jiangbo.365@bytedance.com, linux-block@vger.kernel.org, gost.dev@samsung.com, p.raghav@samsung.com, linux-kernel@vger.kernel.org, dm-devel@redhat.com
Subject: [PATCH v4 08/13] btrfs:zoned: make sb for npo2 zone devices align with sb log offsets
Date: Mon, 16 May 2022 18:54:11 +0200
Message-ID: <20220516165416.171196-9-p.raghav@samsung.com>
In-Reply-To: <20220516165416.171196-1-p.raghav@samsung.com>

Superblocks for zoned devices occupy 2 zones each at fixed offsets: 0,
512GB and 4TB. They are fixed at these locations so that recovery tools
can reliably retrieve the superblocks even if one of the mirrors gets
corrupted.

Power of 2 zone sizes align at these offsets irrespective of their
value, but non power of 2 zone sizes will not align.

To make sure the first zone at mirror 1 and mirror 2 aligns, a write
zeroes operation is performed to move the write pointer of the first
zone to the expected offset. This operation is performed only after a
zone reset of the first zone, i.e., when the second zone that contains
the sb is FULL.

Signed-off-by: Pankaj Raghav <p.raghav@samsung.com>
---
 fs/btrfs/zoned.c | 68 ++++++++++++++++++++++++++++++++++++++++++++----
 1 file changed, 63 insertions(+), 5 deletions(-)

diff --git a/fs/btrfs/zoned.c b/fs/btrfs/zoned.c
index 3023c871e..805aeaa76 100644
--- a/fs/btrfs/zoned.c
+++ b/fs/btrfs/zoned.c
@@ -760,11 +760,44 @@ int btrfs_check_mountopts_zoned(struct btrfs_fs_info *info)
 	return 0;
 }
 
+static int fill_sb_wp_offset(struct block_device *bdev, struct blk_zone *zone,
+			     int mirror, u64 *wp_ret)
+{
+	u64 offset = 0;
+	int ret = 0;
+
+	ASSERT(!is_power_of_two_u64(zone->len));
+	ASSERT(zone->wp == zone->start);
+	ASSERT(mirror != 0);
+
+	switch (mirror) {
+	case 1:
+		div64_u64_rem(BTRFS_SB_LOG_FIRST_OFFSET >> SECTOR_SHIFT,
+			      zone->len, &offset);
+		break;
+	case 2:
+		div64_u64_rem(BTRFS_SB_LOG_SECOND_OFFSET >> SECTOR_SHIFT,
+			      zone->len, &offset);
+		break;
+	}
+
+	ret = blkdev_issue_zeroout(bdev, zone->start, offset, GFP_NOFS, 0);
+	if (ret)
+		return ret;
+
+	zone->wp += offset;
+	zone->cond = BLK_ZONE_COND_IMP_OPEN;
+	*wp_ret = zone->wp << SECTOR_SHIFT;
+
+	return 0;
+}
+
 static int sb_log_location(struct block_device *bdev, struct blk_zone *zones,
-			   int rw, u64 *bytenr_ret)
+			   int rw, int mirror, u64 *bytenr_ret)
 {
 	u64 wp;
 	int ret;
+	bool zones_empty = false;
 
 	if (zones[0].type == BLK_ZONE_TYPE_CONVENTIONAL) {
 		*bytenr_ret = zones[0].start << SECTOR_SHIFT;
@@ -775,13 +808,31 @@ static int sb_log_location(struct block_device *bdev, struct blk_zone *zones,
 	if (ret != -ENOENT && ret < 0)
 		return ret;
 
+	if (ret == -ENOENT)
+		zones_empty = true;
+
 	if (rw == WRITE) {
 		struct blk_zone *reset = NULL;
+		bool is_sb_offset_write_req = false;
+		u32 reset_zone_nr = -1;
 
-		if (wp == zones[0].start << SECTOR_SHIFT)
+		if (wp == zones[0].start << SECTOR_SHIFT) {
 			reset = &zones[0];
-		else if (wp == zones[1].start << SECTOR_SHIFT)
+			reset_zone_nr = 0;
+		} else if (wp == zones[1].start << SECTOR_SHIFT) {
 			reset = &zones[1];
+			reset_zone_nr = 1;
+		}
+
+		/*
+		 * Non po2 zone sizes will not align naturally at
+		 * mirror 1 (512GB) and mirror 2 (4TB). The wp of the
+		 * 1st zone in those superblock mirrors need to be
+		 * moved to align at those offsets.
+		 */
+		is_sb_offset_write_req =
+			(zones_empty || (reset_zone_nr == 0)) && mirror &&
+			!is_power_of_2(zones[0].len);
 
 		if (reset && reset->cond != BLK_ZONE_COND_EMPTY) {
 			ASSERT(sb_zone_is_full(reset));
@@ -795,6 +846,13 @@ static int sb_log_location(struct block_device *bdev, struct blk_zone *zones,
 			reset->cond = BLK_ZONE_COND_EMPTY;
 			reset->wp = reset->start;
 		}
+
+		if (is_sb_offset_write_req) {
+			ret = fill_sb_wp_offset(bdev, &zones[0], mirror, &wp);
+			if (ret)
+				return ret;
+		}
+
 	} else if (ret != -ENOENT) {
 		/*
 		 * For READ, we want the previous one. Move write pointer to
@@ -851,7 +909,7 @@ int btrfs_sb_log_location_bdev(struct block_device *bdev, int mirror, int rw,
 	if (ret != BTRFS_NR_SB_LOG_ZONES)
 		return -EIO;
 
-	return sb_log_location(bdev, zones, rw, bytenr_ret);
+	return sb_log_location(bdev, zones, rw, mirror, bytenr_ret);
 }
 
 int btrfs_sb_log_location(struct btrfs_device *device, int mirror, int rw,
@@ -877,7 +935,7 @@ int btrfs_sb_log_location(struct btrfs_device *device, int mirror, int rw,
 
 	return sb_log_location(device->bdev,
 			       &zinfo->sb_zones[BTRFS_NR_SB_LOG_ZONES * mirror],
-			       rw, bytenr_ret);
+			       rw, mirror, bytenr_ret);
 }
 
 static inline bool is_sb_log_zone(struct btrfs_zoned_device_info *zinfo,
-- 
2.25.1