linux-kernel.vger.kernel.org archive mirror
From: Pankaj Raghav <p.raghav@samsung.com>
To: <dsterba@suse.cz>
Cc: <axboe@kernel.dk>, <damien.lemoal@opensource.wdc.com>,
	<pankydev8@gmail.com>, <dsterba@suse.com>, <hch@lst.de>,
	<linux-nvme@lists.infradead.org>, <linux-fsdevel@vger.kernel.org>,
	<linux-btrfs@vger.kernel.org>, <jiangbo.365@bytedance.com>,
	<linux-block@vger.kernel.org>, <gost.dev@samsung.com>,
	<linux-kernel@vger.kernel.org>, <dm-devel@redhat.com>
Subject: Re: [PATCH v4 08/13] btrfs:zoned: make sb for npo2 zone devices align with sb log offsets
Date: Wed, 18 May 2022 11:15:52 +0200
Message-ID: <717a2c83-0678-9310-4c75-9ad5da0472f6@samsung.com>
In-Reply-To: <20220517124257.GD18596@twin.jikos.cz>

On 2022-05-17 14:42, David Sterba wrote:
> On Mon, May 16, 2022 at 06:54:11PM +0200, Pankaj Raghav wrote:
>> Superblocks for zoned devices are fixed as 2 zones at 0, 512GB and 4TB.
>> These are fixed at these locations so that recovery tools can reliably
>> retrieve the superblocks even if one of the mirrors gets corrupted.
>>
>> Power of 2 zone sizes align at these offsets irrespective of their
>> value but non power of 2 zone sizes will not align.
>>
>> To make sure the first zone at mirror 1 and mirror 2 align, a write
>> zeroes operation is performed to move the write pointer of the first
>> zone to the expected offset. This operation is performed only after a
>> zone reset of the first zone, i.e., when the second zone that contains
>> the sb is FULL.
> 
> Is it a good idea to do the "write zeros", instead of a plain "set write
> pointer"? I assume setting write pointer is instant, while writing
> potentially hundreds of megabytes may take significant time. As the
> functions may be called from random contexts, the increased time may
> become a problem.
> 
Unfortunately, zoned devices offer no way to just move the WP. The only
alternative I could find is to issue write zeroes, which some devices
such as ZNS support natively. I would be glad to hear if someone has a
better solution than writing zeroes to the zone.
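To put a rough number on the cost raised above, here is a minimal
user-space sketch (not kernel code) of the remainder calculation the
patch performs with div64_u64_rem(). The macro names are local stand-ins
for BTRFS_SB_LOG_FIRST_OFFSET and BTRFS_SB_LOG_SECOND_OFFSET, the helper
names are hypothetical, and I am assuming "512GB" and "4TB" mean 512 GiB
and 4 TiB with 512-byte sectors.

```c
#include <assert.h>
#include <stdint.h>

#define SECTOR_SHIFT 9
/* Assumed superblock log locations, taken from the commit message. */
#define SB_LOG_FIRST_OFFSET  (512ULL << 30)   /* 512 GiB, mirror 1 */
#define SB_LOG_SECOND_OFFSET (4096ULL << 30)  /* 4 TiB, mirror 2 */

/*
 * Sectors between the start of the zone that contains target_bytes and
 * target_bytes itself. This is the amount that would have to be
 * zero-filled to bring the write pointer of that zone to the target.
 */
static uint64_t wp_fill_sectors(uint64_t target_bytes, uint64_t zone_sectors)
{
	assert(zone_sectors != 0);
	return (target_bytes >> SECTOR_SHIFT) % zone_sectors;
}

/* Convert a sector count to whole MiB for reporting. */
static uint64_t sectors_to_mib(uint64_t sectors)
{
	return (sectors << SECTOR_SHIFT) >> 20;
}
```

For a hypothetical 96 MiB zone (196608 sectors) this gives 65536 sectors
(32 MiB) at mirror 1 and 131072 sectors (64 MiB) at mirror 2. A 128 MiB
zone gives 0 for both. The fill is always smaller than one zone, so the
zone size bounds the cost.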

>> Signed-off-by: Pankaj Raghav <p.raghav@samsung.com>
>> ---
>>  fs/btrfs/zoned.c | 68 ++++++++++++++++++++++++++++++++++++++++++++----
>>  1 file changed, 63 insertions(+), 5 deletions(-)
>>
>> diff --git a/fs/btrfs/zoned.c b/fs/btrfs/zoned.c
>> index 3023c871e..805aeaa76 100644
>> --- a/fs/btrfs/zoned.c
>> +++ b/fs/btrfs/zoned.c
>> @@ -760,11 +760,44 @@ int btrfs_check_mountopts_zoned(struct btrfs_fs_info *info)
>>  	return 0;
>>  }
>>  
>> +static int fill_sb_wp_offset(struct block_device *bdev, struct blk_zone *zone,
>> +			     int mirror, u64 *wp_ret)
>> +{
>> +	u64 offset = 0;
>> +	int ret = 0;
>> +
>> +	ASSERT(!is_power_of_two_u64(zone->len));
>> +	ASSERT(zone->wp == zone->start);
>> +	ASSERT(mirror != 0);
> 
> This could simply accept 0 as the mirror offset too, the calculation is
> trivial.
> 
Ok. I will fix it up!
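
One possible shape for that fix, as a user-space sketch only: index a
small table by mirror number so that mirror 0 (offset 0) falls out of
the same remainder calculation and the switch goes away. The table,
SB_MIRROR_MAX and the function name are hypothetical and not what the
next revision will necessarily contain.

```c
#include <assert.h>
#include <stdint.h>

#define SECTOR_SHIFT 9
#define SB_MIRROR_MAX 3

/* Assumed byte offsets of the sb log per mirror: 0, 512 GiB, 4 TiB. */
static const uint64_t sb_log_offset[SB_MIRROR_MAX] = {
	0,
	512ULL << 30,
	4096ULL << 30,
};

/*
 * Offset (in sectors) of the sb log location inside the zone that
 * contains it. Mirror 0 sits at offset 0, so the result is 0 for any
 * zone size and no special case is needed.
 */
static uint64_t sb_wp_offset_sectors(int mirror, uint64_t zone_sectors)
{
	assert(mirror >= 0 && mirror < SB_MIRROR_MAX);
	assert(zone_sectors != 0);
	return (sb_log_offset[mirror] >> SECTOR_SHIFT) % zone_sectors;
}
```
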
>> +
>> +	switch (mirror) {
>> +	case 1:
>> +		div64_u64_rem(BTRFS_SB_LOG_FIRST_OFFSET >> SECTOR_SHIFT,
>> +			      zone->len, &offset);
>> +		break;
>> +	case 2:
>> +		div64_u64_rem(BTRFS_SB_LOG_SECOND_OFFSET >> SECTOR_SHIFT,
>> +			      zone->len, &offset);
>> +		break;
>> +	}
>> +
>> +	ret =  blkdev_issue_zeroout(bdev, zone->start, offset, GFP_NOFS, 0);
>> +	if (ret)
>> +		return ret;
>> +
>> +		/*
>> +		 * Non po2 zone sizes will not align naturally at
>> +		 * mirror 1 (512GB) and mirror 2 (4TB). The wp of the
>> +		 * 1st zone in those superblock mirrors needs to be
>> +		 * moved to align at those offsets.
>> +		 */
> 
> Please move this comment to the helper fill_sb_wp_offset itself, there
> it's more discoverable.
> 
Ok.

>> +		is_sb_offset_write_req =
>> +			(zones_empty || (reset_zone_nr == 0)) && mirror &&
>> +			!is_power_of_2(zones[0].len);
> 
> Accepting 0 as the mirror number would also get rid of this wild
> expression substituting an 'if'.
> 
>>  
>>  		if (reset && reset->cond != BLK_ZONE_COND_EMPTY) {
>>  			ASSERT(sb_zone_is_full(reset));
>> @@ -795,6 +846,13 @@ static int sb_log_location(struct block_device *bdev, struct blk_zone *zones,
>>  			reset->cond = BLK_ZONE_COND_EMPTY;
>>  			reset->wp = reset->start;
>>  		}
>> +
>> +		if (is_sb_offset_write_req) {
> 
> And get rid of the conditional. The point of supporting both po2 and
> nonpo2 is to hide implementation details in wrappers as much as
> possible.
> 
Alright. I will move the logic to the wrapper instead of having the
conditional in this function.
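
As a rough illustration of what such a wrapper could decide on its own,
here is a user-space sketch under my own assumptions: struct zone_desc
is a hypothetical stand-in for the blk_zone fields used, and "zone is
empty" is modelled as wp == start. Because the remainder is already 0
for mirror 0 and for po2 zone sizes, the caller would need neither the
mirror check nor the is_power_of_2() check.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define SECTOR_SHIFT 9

/* Hypothetical stand-in for the struct blk_zone fields used here. */
struct zone_desc {
	uint64_t start; /* first sector of the zone */
	uint64_t len;   /* zone size in sectors */
	uint64_t wp;    /* write pointer, in sectors */
};

/* Assumed byte offsets of the sb log per mirror: 0, 512 GiB, 4 TiB. */
static const uint64_t sb_log_offset[3] = { 0, 512ULL << 30, 4096ULL << 30 };

/*
 * Store in *nr_sects how far the sb log offset lies inside the zone and
 * return true only when the zone is empty and that distance is non-zero,
 * i.e. when the write pointer would have to be advanced by zero-filling.
 */
static bool sb_wp_fill_needed(const struct zone_desc *first, int mirror,
			      uint64_t *nr_sects)
{
	assert(mirror >= 0 && mirror < 3);
	assert(first->len != 0);

	*nr_sects = (sb_log_offset[mirror] >> SECTOR_SHIFT) % first->len;
	/* Only an empty zone (wp at its start) can be pre-filled. */
	return first->wp == first->start && *nr_sects != 0;
}
```

The kernel helper would then issue blkdev_issue_zeroout() only when this
kind of check says so, and sb_log_location() would call it
unconditionally.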
>> +			ret = fill_sb_wp_offset(bdev, &zones[0], mirror, &wp);
>> +			if (ret)
>> +				return ret;
>> +		}
>> +
>>  	} else if (ret != -ENOENT) {
>>  		/*
>>  		 * For READ, we want the previous one. Move write pointer to

Thanks for your comments.
