From: Damien Le Moal <Damien.LeMoal@wdc.com>
To: Qu Wenruo <quwenruo.btrfs@gmx.com>,
	Johannes Thumshirn <Johannes.Thumshirn@wdc.com>,
	David Sterba <dsterba@suse.com>,
	"linux-btrfs@vger.kernel.org" <linux-btrfs@vger.kernel.org>
Subject: Re: [PATCH v2 2/3] btrfs: zoned: fix compressed writes
Date: Thu, 10 Jun 2021 07:36:59 +0000
Message-ID: <DM6PR04MB70813C91EEB7952FC3EF917EE7359@DM6PR04MB7081.namprd04.prod.outlook.com>
In-Reply-To: 9464ea87-6d50-9015-a6f5-c7b3d61458ca@gmx.com

On 2021/06/10 16:28, Qu Wenruo wrote:
> 
> 
> On 2021/5/18 11:40 PM, Johannes Thumshirn wrote:
>> When multiple processes write data to the same block group on a compressed
>> zoned filesystem, the underlying device could report I/O errors and data
>> corruption is possible.
>>
>> This happens because on a zoned file system, compressed data writes were
>> sent to the device via a REQ_OP_WRITE instead of a REQ_OP_ZONE_APPEND
>> operation. But with REQ_OP_WRITE and parallel submission it cannot be
>> guaranteed that the data is always submitted aligned to the underlying
>> zone's write pointer.
>>
>> The change to using REQ_OP_ZONE_APPEND instead of REQ_OP_WRITE on a zoned
>> filesystem is non-intrusive on a regular file system or when submitting to
>> a conventional zone on a zoned filesystem, as it is guarded by
>> btrfs_use_zone_append.
>>
>> Reported-by: David Sterba <dsterba@suse.com>
>> Fixes: 9d294a685fbc ("btrfs: zoned: enable to mount ZONED incompat flag")
>> Signed-off-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>
> 
> While working on compression support for subpage, I noticed some
> strange code behavior; I'm not sure whether it is by design or just a typo.
> 
> So please correct me if I'm wrong.
> 
> [...]
>>
>>   	bio = btrfs_bio_alloc(first_byte);
>> -	bio->bi_opf = REQ_OP_WRITE | write_flags;
>> +	bio->bi_opf = bio_op | write_flags;
>>   	bio->bi_private = cb;
>>   	bio->bi_end_io = end_compressed_bio_write;
>>
>> +	if (use_append) {
>> +		struct extent_map *em;
>> +		struct map_lookup *map;
>> +		struct block_device *bdev;
>> +
>> +		em = btrfs_get_chunk_map(fs_info, disk_start, PAGE_SIZE);
>> +		if (IS_ERR(em)) {
>> +			kfree(cb);
>> +			bio_put(bio);
>> +			return BLK_STS_NOTSUPP;
>> +		}
>> +
>> +		map = em->map_lookup;
>> +		/* We only support single profile for now */
>> +		ASSERT(map->num_stripes == 1);
>> +		bdev = map->stripes[0].dev->bdev;

This variable seems rather useless: map->stripes[0].dev->bdev could be passed
to bio_set_dev() directly.

>> +
>> +		bio_set_dev(bio, bdev);
>> +		free_extent_map(em);
>> +	}
>> +
> 
> Here, for the newly created bio, we call bio_set_dev() on it
> (although a later patch refactors this part a little).
> 
> So far so good.
> 
>>   	if (blkcg_css) {
>>   		bio->bi_opf |= REQ_CGROUP_PUNT;
>>   		kthread_associate_blkcg(blkcg_css);
>> @@ -432,6 +458,7 @@ blk_status_t btrfs_submit_compressed_write(struct btrfs_inode *inode, u64 start,
>>   	bytes_left = compressed_len;
>>   	for (pg_index = 0; pg_index < cb->nr_pages; pg_index++) {
>>   		int submit = 0;
>> +		int len;
>>
>>   		page = compressed_pages[pg_index];
>>   		page->mapping = inode->vfs_inode.i_mapping;
>> @@ -439,9 +466,13 @@ blk_status_t btrfs_submit_compressed_write(struct btrfs_inode *inode, u64 start,
>>   			submit = btrfs_bio_fits_in_stripe(page, PAGE_SIZE, bio,
>>   							  0);
>>
>> +		if (pg_index == 0 && use_append)
>> +			len = bio_add_zone_append_page(bio, page, PAGE_SIZE, 0);
>> +		else
>> +			len = bio_add_page(bio, page, PAGE_SIZE, 0);
>> +
>>   		page->mapping = NULL;
>> -		if (submit || bio_add_page(bio, page, PAGE_SIZE, 0) <
>> -		    PAGE_SIZE) {
>> +		if (submit || len < PAGE_SIZE) {
>>   			/*
>>   			 * inc the count before we submit the bio so
>>   			 * we know the end IO handler won't happen before
>> @@ -465,11 +496,15 @@ blk_status_t btrfs_submit_compressed_write(struct btrfs_inode *inode, u64 start,
>>   			}
>>
>>   			bio = btrfs_bio_alloc(first_byte);
>> -			bio->bi_opf = REQ_OP_WRITE | write_flags;
>> +			bio->bi_opf = bio_op | write_flags;
> 
> But here, for the newly allocated bio, we didn't call bio_set_dev() at all.
> 
> Don't all zoned write bios need that bio_set_dev() call?

Yep, bio->bi_bdev must be set before bio_add_zone_append_page() is called.
Otherwise, there will be a crash (the first line of bio_add_zone_append_page()
gets the request queue from bio->bi_bdev). I wonder why we do not see a NULL
pointer oops here... Johannes?
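
To illustrate the point, here is a minimal userspace sketch. It is not kernel
code: the mock_bdev and mock_bio structures and the mock_* helpers are made up
for illustration, and they only model the one invariant discussed here, namely
that the zone append add-page path needs bio->bi_bdev before anything else.
Routing every allocation, including the one after a split, through a single
helper keeps that invariant in one place and avoids the intermediate bdev
variable:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-ins for the kernel structures; not the real layouts. */
struct mock_bdev {
	unsigned int max_append_bytes;	/* models the zone append size limit */
};

struct mock_bio {
	struct mock_bdev *bi_bdev;	/* NULL until a device is assigned */
	unsigned int bi_size;		/* bytes added so far */
};

/*
 * Models the zone append add-page path: the device is looked up first to
 * obtain the limit. Returns the number of bytes added, 0 if the bio is full,
 * or -1 where the real code would dereference a NULL bi_bdev.
 */
static int mock_add_zone_append_page(struct mock_bio *bio, unsigned int len)
{
	if (!bio->bi_bdev)
		return -1;
	if (bio->bi_size + len > bio->bi_bdev->max_append_bytes)
		return 0;
	bio->bi_size += len;
	return (int)len;
}

/*
 * Single allocation helper, meant to be used for the first bio and for
 * every bio allocated after a split, so the device is never forgotten.
 */
static struct mock_bio *mock_alloc_compressed_bio(struct mock_bdev *bdev,
						  bool use_append)
{
	struct mock_bio *bio = calloc(1, sizeof(*bio));

	if (!bio)
		return NULL;
	if (use_append)
		bio->bi_bdev = bdev;	/* set directly, no temporary */
	return bio;
}
```

With this shape, a bio obtained from mock_alloc_compressed_bio() with
use_append set can always take a zone append page, while a bio that skipped
the helper hits the -1 path that stands in for the oops.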

> 
> I guess since most compressed extents are pretty small, it's really hard
> to hit a case where we need to split the bio due to stripe boundary,
> thus very hard to hit anything wrong.
> 
> Anyway, since I'm working on the compression code to make compressed writes
> follow the same boundary check as extent_io.c, I can definitely
> refactor the bio allocation code to add the calls that zoned mode needs.
> 
> Thanks,
> Qu
> 
>>   			bio->bi_private = cb;
>>   			bio->bi_end_io = end_compressed_bio_write;
>>   			if (blkcg_css)
>>   				bio->bi_opf |= REQ_CGROUP_PUNT;
>> +			/*
>> +			 * Use bio_add_page() to ensure the bio has at least one
>> +			 * page.
>> +			 */
>>   			bio_add_page(bio, page, PAGE_SIZE, 0);
>>   		}
>>   		if (bytes_left < PAGE_SIZE) {
>>
> 


-- 
Damien Le Moal
Western Digital Research
