* [PATCH 0/2] btrfs: zoned: fix writes on a compressed zoned filesystem
@ 2021-05-12 14:01 Johannes Thumshirn
  2021-05-12 14:01 ` [PATCH 1/2] btrfs: zoned: pass start block to btrfs_use_zone_append Johannes Thumshirn
  2021-05-12 14:01 ` [PATCH 2/2] btrfs: zoned: fix compressed writes Johannes Thumshirn
  0 siblings, 2 replies; 9+ messages in thread
From: Johannes Thumshirn @ 2021-05-12 14:01 UTC (permalink / raw)
  To: David Sterba; +Cc: Johannes Thumshirn, linux-btrfs

David reported that I/O errors get thrown on a zoned filesystem with
compression enabled.

This happens because compressed writes are submitted as regular writes instead
of zone append writes, and with regular writes and increased parallelism we
cannot guarantee that the data placement requirements of sequential zones are
met.

This series switches the compressed I/O submission path to zone append writing
on a zoned filesystem.

Johannes Thumshirn (2):
  btrfs: zoned: pass start block to btrfs_use_zone_append
  btrfs: zoned: fix compressed writes

 fs/btrfs/compression.c | 44 ++++++++++++++++++++++++++++++++++++++----
 fs/btrfs/extent_io.c   |  2 +-
 fs/btrfs/inode.c       |  2 +-
 fs/btrfs/zoned.c       |  4 ++--
 fs/btrfs/zoned.h       |  5 ++---
 5 files changed, 46 insertions(+), 11 deletions(-)

-- 
2.31.1



* [PATCH 1/2] btrfs: zoned: pass start block to btrfs_use_zone_append
  2021-05-12 14:01 [PATCH 0/2] btrfs: zoned: fix writes on a compressed zoned filesystem Johannes Thumshirn
@ 2021-05-12 14:01 ` Johannes Thumshirn
  2021-05-12 14:01 ` [PATCH 2/2] btrfs: zoned: fix compressed writes Johannes Thumshirn
  1 sibling, 0 replies; 9+ messages in thread
From: Johannes Thumshirn @ 2021-05-12 14:01 UTC (permalink / raw)
  To: David Sterba; +Cc: Johannes Thumshirn, linux-btrfs

btrfs_use_zone_append only needs the passed-in extent_map's block_start
member, so there is no need to pass in the full extent_map.

This also enables the use of btrfs_use_zone_append in places where we only
have a start byte but no extent_map.

Signed-off-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>
---
 fs/btrfs/extent_io.c | 2 +-
 fs/btrfs/inode.c     | 2 +-
 fs/btrfs/zoned.c     | 4 ++--
 fs/btrfs/zoned.h     | 5 ++---
 4 files changed, 6 insertions(+), 7 deletions(-)

diff --git a/fs/btrfs/extent_io.c b/fs/btrfs/extent_io.c
index 0ce419512ed4..74ba2e1a3927 100644
--- a/fs/btrfs/extent_io.c
+++ b/fs/btrfs/extent_io.c
@@ -3753,7 +3753,7 @@ static noinline_for_stack int __extent_writepage_io(struct btrfs_inode *inode,
 		/* Note that em_end from extent_map_end() is exclusive */
 		iosize = min(em_end, end + 1) - cur;
 
-		if (btrfs_use_zone_append(inode, em))
+		if (btrfs_use_zone_append(inode, em->block_start))
 			opf = REQ_OP_ZONE_APPEND;
 
 		free_extent_map(em);
diff --git a/fs/btrfs/inode.c b/fs/btrfs/inode.c
index c6164ae16e2a..33f14573f2ec 100644
--- a/fs/btrfs/inode.c
+++ b/fs/btrfs/inode.c
@@ -7786,7 +7786,7 @@ static int btrfs_dio_iomap_begin(struct inode *inode, loff_t start,
 	iomap->bdev = fs_info->fs_devices->latest_bdev;
 	iomap->length = len;
 
-	if (write && btrfs_use_zone_append(BTRFS_I(inode), em))
+	if (write && btrfs_use_zone_append(BTRFS_I(inode), em->block_start))
 		iomap->flags |= IOMAP_F_ZONE_APPEND;
 
 	free_extent_map(em);
diff --git a/fs/btrfs/zoned.c b/fs/btrfs/zoned.c
index c41373a92476..b9d5579a578d 100644
--- a/fs/btrfs/zoned.c
+++ b/fs/btrfs/zoned.c
@@ -1296,7 +1296,7 @@ void btrfs_free_redirty_list(struct btrfs_transaction *trans)
 	spin_unlock(&trans->releasing_ebs_lock);
 }
 
-bool btrfs_use_zone_append(struct btrfs_inode *inode, struct extent_map *em)
+bool btrfs_use_zone_append(struct btrfs_inode *inode, u64 start)
 {
 	struct btrfs_fs_info *fs_info = inode->root->fs_info;
 	struct btrfs_block_group *cache;
@@ -1311,7 +1311,7 @@ bool btrfs_use_zone_append(struct btrfs_inode *inode, struct extent_map *em)
 	if (!is_data_inode(&inode->vfs_inode))
 		return false;
 
-	cache = btrfs_lookup_block_group(fs_info, em->block_start);
+	cache = btrfs_lookup_block_group(fs_info, start);
 	ASSERT(cache);
 	if (!cache)
 		return false;
diff --git a/fs/btrfs/zoned.h b/fs/btrfs/zoned.h
index 5e41a74a9cb2..e55d32595c2c 100644
--- a/fs/btrfs/zoned.h
+++ b/fs/btrfs/zoned.h
@@ -53,7 +53,7 @@ void btrfs_calc_zone_unusable(struct btrfs_block_group *cache);
 void btrfs_redirty_list_add(struct btrfs_transaction *trans,
 			    struct extent_buffer *eb);
 void btrfs_free_redirty_list(struct btrfs_transaction *trans);
-bool btrfs_use_zone_append(struct btrfs_inode *inode, struct extent_map *em);
+bool btrfs_use_zone_append(struct btrfs_inode *inode, u64 start);
 void btrfs_record_physical_zoned(struct inode *inode, u64 file_offset,
 				 struct bio *bio);
 void btrfs_rewrite_logical_zoned(struct btrfs_ordered_extent *ordered);
@@ -152,8 +152,7 @@ static inline void btrfs_redirty_list_add(struct btrfs_transaction *trans,
 					  struct extent_buffer *eb) { }
 static inline void btrfs_free_redirty_list(struct btrfs_transaction *trans) { }
 
-static inline bool btrfs_use_zone_append(struct btrfs_inode *inode,
-					 struct extent_map *em)
+static inline bool btrfs_use_zone_append(struct btrfs_inode *inode, u64 start)
 {
 	return false;
 }
-- 
2.31.1



* [PATCH 2/2] btrfs: zoned: fix compressed writes
  2021-05-12 14:01 [PATCH 0/2] btrfs: zoned: fix writes on a compressed zoned filesystem Johannes Thumshirn
  2021-05-12 14:01 ` [PATCH 1/2] btrfs: zoned: pass start block to btrfs_use_zone_append Johannes Thumshirn
@ 2021-05-12 14:01 ` Johannes Thumshirn
  2021-05-12 14:42   ` David Sterba
  1 sibling, 1 reply; 9+ messages in thread
From: Johannes Thumshirn @ 2021-05-12 14:01 UTC (permalink / raw)
  To: David Sterba; +Cc: Johannes Thumshirn, linux-btrfs

When multiple processes write data to the same block group on a compressed
zoned filesystem, the underlying device could report I/O errors and data
corruption is possible.

This happens because on a zoned filesystem, compressed data writes are sent
to the device via a REQ_OP_WRITE instead of a REQ_OP_ZONE_APPEND operation.
But with REQ_OP_WRITE and parallel submission it cannot be guaranteed that
the data is always submitted aligned to the underlying zone's write pointer.

The change to using REQ_OP_ZONE_APPEND instead of REQ_OP_WRITE is
non-intrusive on a regular filesystem or when submitting to a conventional
zone on a zoned filesystem, as it is guarded by btrfs_use_zone_append.

Reported-by: David Sterba <dsterba@suse.com>
Signed-off-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>
---
 fs/btrfs/compression.c | 44 ++++++++++++++++++++++++++++++++++++++----
 1 file changed, 40 insertions(+), 4 deletions(-)

diff --git a/fs/btrfs/compression.c b/fs/btrfs/compression.c
index 2bea01d23a5b..d27205791483 100644
--- a/fs/btrfs/compression.c
+++ b/fs/btrfs/compression.c
@@ -28,6 +28,7 @@
 #include "compression.h"
 #include "extent_io.h"
 #include "extent_map.h"
+#include "zoned.h"
 
 static const char* const btrfs_compress_types[] = { "", "zlib", "lzo", "zstd" };
 
@@ -349,6 +350,7 @@ static void end_compressed_bio_write(struct bio *bio)
 	 */
 	inode = cb->inode;
 	cb->compressed_pages[0]->mapping = cb->inode->i_mapping;
+	btrfs_record_physical_zoned(inode, cb->start, bio);
 	btrfs_writepage_endio_finish_ordered(cb->compressed_pages[0],
 			cb->start, cb->start + cb->len - 1,
 			bio->bi_status == BLK_STS_OK);
@@ -401,6 +403,10 @@ blk_status_t btrfs_submit_compressed_write(struct btrfs_inode *inode, u64 start,
 	u64 first_byte = disk_start;
 	blk_status_t ret;
 	int skip_sum = inode->flags & BTRFS_INODE_NODATASUM;
+	struct block_device *bdev;
+	const bool use_append = btrfs_use_zone_append(inode, disk_start);
+	const unsigned int bio_op =
+		use_append ? REQ_OP_ZONE_APPEND : REQ_OP_WRITE;
 
 	WARN_ON(!PAGE_ALIGNED(start));
 	cb = kmalloc(compressed_bio_size(fs_info, compressed_len), GFP_NOFS);
@@ -418,10 +424,31 @@ blk_status_t btrfs_submit_compressed_write(struct btrfs_inode *inode, u64 start,
 	cb->nr_pages = nr_pages;
 
 	bio = btrfs_bio_alloc(first_byte);
-	bio->bi_opf = REQ_OP_WRITE | write_flags;
+	bio->bi_opf = bio_op | write_flags;
 	bio->bi_private = cb;
 	bio->bi_end_io = end_compressed_bio_write;
 
+	if (use_append) {
+		struct extent_map *em;
+		struct map_lookup *map;
+
+		em = btrfs_get_chunk_map(fs_info, disk_start, PAGE_SIZE);
+		if (IS_ERR(em)) {
+			kfree(cb);
+			bio_put(bio);
+			return BLK_STS_NOTSUPP;
+		}
+
+		map = em->map_lookup;
+		/* We only support single profile for now */
+		ASSERT(map->num_stripes == 1);
+		bdev = map->stripes[0].dev->bdev;
+
+		free_extent_map(em);
+
+		bio_set_dev(bio, bdev);
+	}
+
 	if (blkcg_css) {
 		bio->bi_opf |= REQ_CGROUP_PUNT;
 		kthread_associate_blkcg(blkcg_css);
@@ -432,6 +459,7 @@ blk_status_t btrfs_submit_compressed_write(struct btrfs_inode *inode, u64 start,
 	bytes_left = compressed_len;
 	for (pg_index = 0; pg_index < cb->nr_pages; pg_index++) {
 		int submit = 0;
+		int len;
 
 		page = compressed_pages[pg_index];
 		page->mapping = inode->vfs_inode.i_mapping;
@@ -439,9 +467,13 @@ blk_status_t btrfs_submit_compressed_write(struct btrfs_inode *inode, u64 start,
 			submit = btrfs_bio_fits_in_stripe(page, PAGE_SIZE, bio,
 							  0);
 
+		if (pg_index == 0 && use_append)
+			len = bio_add_zone_append_page(bio, page, PAGE_SIZE, 0);
+		else
+			len = bio_add_page(bio, page, PAGE_SIZE, 0);
+
 		page->mapping = NULL;
-		if (submit || bio_add_page(bio, page, PAGE_SIZE, 0) <
-		    PAGE_SIZE) {
+		if (submit || len < PAGE_SIZE) {
 			/*
 			 * inc the count before we submit the bio so
 			 * we know the end IO handler won't happen before
@@ -465,11 +497,15 @@ blk_status_t btrfs_submit_compressed_write(struct btrfs_inode *inode, u64 start,
 			}
 
 			bio = btrfs_bio_alloc(first_byte);
-			bio->bi_opf = REQ_OP_WRITE | write_flags;
+			bio->bi_opf = bio_op | write_flags;
 			bio->bi_private = cb;
 			bio->bi_end_io = end_compressed_bio_write;
 			if (blkcg_css)
 				bio->bi_opf |= REQ_CGROUP_PUNT;
+			/*
+			 * Use bio_add_page() to ensure the bio has at least one
+			 * page.
+			 */
 			bio_add_page(bio, page, PAGE_SIZE, 0);
 		}
 		if (bytes_left < PAGE_SIZE) {
-- 
2.31.1



* Re: [PATCH 2/2] btrfs: zoned: fix compressed writes
  2021-05-12 14:01 ` [PATCH 2/2] btrfs: zoned: fix compressed writes Johannes Thumshirn
@ 2021-05-12 14:42   ` David Sterba
  2021-05-17  7:07     ` Johannes Thumshirn
  0 siblings, 1 reply; 9+ messages in thread
From: David Sterba @ 2021-05-12 14:42 UTC (permalink / raw)
  To: Johannes Thumshirn; +Cc: David Sterba, linux-btrfs

On Wed, May 12, 2021 at 11:01:40PM +0900, Johannes Thumshirn wrote:
> When multiple processes write data to the same block group on a compressed
> zoned filesystem, the underlying device could report I/O errors and data
> corruption is possible.
> 
> This happens because on a zoned file system, compressed data writes where
> sent to the device via a REQ_OP_WRITE instead of a REQ_OP_ZONE_APPEND
> operation. But with REQ_OP_WRITE and parallel submission it cannot be
> guaranteed that the data is always submitted aligned to the underlying
> zone's write pointer.
> 
> The change to using REQ_OP_ZONE_APPEND instead of REQ_OP_WRITE on a zoned
> filesystem is non intrusive on a regular file system or when submitting to
> a conventional zone on a zoned filesystem, as it is guarded by
> btrfs_use_zone_append.
> 
> Reported-by: David Sterba <dsterba@suse.com>
> Signed-off-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>
> ---
>  fs/btrfs/compression.c | 44 ++++++++++++++++++++++++++++++++++++++----
>  1 file changed, 40 insertions(+), 4 deletions(-)
> 
> diff --git a/fs/btrfs/compression.c b/fs/btrfs/compression.c
> index 2bea01d23a5b..d27205791483 100644
> --- a/fs/btrfs/compression.c
> +++ b/fs/btrfs/compression.c
> @@ -28,6 +28,7 @@
>  #include "compression.h"
>  #include "extent_io.h"
>  #include "extent_map.h"
> +#include "zoned.h"
>  
>  static const char* const btrfs_compress_types[] = { "", "zlib", "lzo", "zstd" };
>  
> @@ -349,6 +350,7 @@ static void end_compressed_bio_write(struct bio *bio)
>  	 */
>  	inode = cb->inode;
>  	cb->compressed_pages[0]->mapping = cb->inode->i_mapping;
> +	btrfs_record_physical_zoned(inode, cb->start, bio);
>  	btrfs_writepage_endio_finish_ordered(cb->compressed_pages[0],
>  			cb->start, cb->start + cb->len - 1,
>  			bio->bi_status == BLK_STS_OK);
> @@ -401,6 +403,10 @@ blk_status_t btrfs_submit_compressed_write(struct btrfs_inode *inode, u64 start,
>  	u64 first_byte = disk_start;
>  	blk_status_t ret;
>  	int skip_sum = inode->flags & BTRFS_INODE_NODATASUM;
> +	struct block_device *bdev;
> +	const bool use_append = btrfs_use_zone_append(inode, disk_start);
> +	const unsigned int bio_op =
> +		use_append ? REQ_OP_ZONE_APPEND : REQ_OP_WRITE;
>  
>  	WARN_ON(!PAGE_ALIGNED(start));
>  	cb = kmalloc(compressed_bio_size(fs_info, compressed_len), GFP_NOFS);
> @@ -418,10 +424,31 @@ blk_status_t btrfs_submit_compressed_write(struct btrfs_inode *inode, u64 start,
>  	cb->nr_pages = nr_pages;
>  
>  	bio = btrfs_bio_alloc(first_byte);
> -	bio->bi_opf = REQ_OP_WRITE | write_flags;
> +	bio->bi_opf = bio_op | write_flags;
>  	bio->bi_private = cb;
>  	bio->bi_end_io = end_compressed_bio_write;
>  
> +	if (use_append) {
> +		struct extent_map *em;
> +		struct map_lookup *map;
> +
> +		em = btrfs_get_chunk_map(fs_info, disk_start, PAGE_SIZE);

The caller already does the em lookup, so this one is a duplicate, allocating
memory, taking locks and doing a tree lookup. All of this happens on the
writeout path, so it seems heavy.

> +		if (IS_ERR(em)) {
> +			kfree(cb);
> +			bio_put(bio);
> +			return BLK_STS_NOTSUPP;
> +		}
> +
> +		map = em->map_lookup;
> +		/* We only support single profile for now */
> +		ASSERT(map->num_stripes == 1);
> +		bdev = map->stripes[0].dev->bdev;
> +
> +		free_extent_map(em);
> +
> +		bio_set_dev(bio, bdev);

bdev seems to be used just to set it on the bio, so it does not need to be
declared at function scope (or kept in a variable for a single use at all).

The same sequence of calls is done in submit_extent_page, so this should be
in a helper.
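
A minimal sketch of such a helper, with a hypothetical name and shape (this
is not the actual patch, just the idea, reusing the calls from the hunk
above):

static blk_status_t btrfs_bio_set_zoned_dev(struct btrfs_fs_info *fs_info,
					    struct bio *bio, u64 disk_start)
{
	struct extent_map *em;
	struct map_lookup *map;

	em = btrfs_get_chunk_map(fs_info, disk_start, PAGE_SIZE);
	if (IS_ERR(em))
		return BLK_STS_NOTSUPP;

	map = em->map_lookup;
	/* Zoned mode only supports the single profile for now */
	ASSERT(map->num_stripes == 1);
	bio_set_dev(bio, map->stripes[0].dev->bdev);
	free_extent_map(em);

	return BLK_STS_OK;
}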

> +	}
> +
>  	if (blkcg_css) {
>  		bio->bi_opf |= REQ_CGROUP_PUNT;
>  		kthread_associate_blkcg(blkcg_css);


* Re: [PATCH 2/2] btrfs: zoned: fix compressed writes
  2021-05-12 14:42   ` David Sterba
@ 2021-05-17  7:07     ` Johannes Thumshirn
  2021-05-17  9:12       ` David Sterba
  2021-05-17 11:21       ` Johannes Thumshirn
  0 siblings, 2 replies; 9+ messages in thread
From: Johannes Thumshirn @ 2021-05-17  7:07 UTC (permalink / raw)
  To: dsterba; +Cc: David Sterba, linux-btrfs

On 12/05/2021 16:44, David Sterba wrote:
> On Wed, May 12, 2021 at 11:01:40PM +0900, Johannes Thumshirn wrote:
>> When multiple processes write data to the same block group on a compressed
>> zoned filesystem, the underlying device could report I/O errors and data
>> corruption is possible.
>>
>> This happens because on a zoned file system, compressed data writes where
>> sent to the device via a REQ_OP_WRITE instead of a REQ_OP_ZONE_APPEND
>> operation. But with REQ_OP_WRITE and parallel submission it cannot be
>> guaranteed that the data is always submitted aligned to the underlying
>> zone's write pointer.
>>
>> The change to using REQ_OP_ZONE_APPEND instead of REQ_OP_WRITE on a zoned
>> filesystem is non intrusive on a regular file system or when submitting to
>> a conventional zone on a zoned filesystem, as it is guarded by
>> btrfs_use_zone_append.
>>
>> Reported-by: David Sterba <dsterba@suse.com>
>> Signed-off-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>
>> ---
>>  fs/btrfs/compression.c | 44 ++++++++++++++++++++++++++++++++++++++----
>>  1 file changed, 40 insertions(+), 4 deletions(-)
>>
>> diff --git a/fs/btrfs/compression.c b/fs/btrfs/compression.c
>> index 2bea01d23a5b..d27205791483 100644
>> --- a/fs/btrfs/compression.c
>> +++ b/fs/btrfs/compression.c
>> @@ -28,6 +28,7 @@
>>  #include "compression.h"
>>  #include "extent_io.h"
>>  #include "extent_map.h"
>> +#include "zoned.h"
>>  
>>  static const char* const btrfs_compress_types[] = { "", "zlib", "lzo", "zstd" };
>>  
>> @@ -349,6 +350,7 @@ static void end_compressed_bio_write(struct bio *bio)
>>  	 */
>>  	inode = cb->inode;
>>  	cb->compressed_pages[0]->mapping = cb->inode->i_mapping;
>> +	btrfs_record_physical_zoned(inode, cb->start, bio);
>>  	btrfs_writepage_endio_finish_ordered(cb->compressed_pages[0],
>>  			cb->start, cb->start + cb->len - 1,
>>  			bio->bi_status == BLK_STS_OK);
>> @@ -401,6 +403,10 @@ blk_status_t btrfs_submit_compressed_write(struct btrfs_inode *inode, u64 start,
>>  	u64 first_byte = disk_start;
>>  	blk_status_t ret;
>>  	int skip_sum = inode->flags & BTRFS_INODE_NODATASUM;
>> +	struct block_device *bdev;
>> +	const bool use_append = btrfs_use_zone_append(inode, disk_start);
>> +	const unsigned int bio_op =
>> +		use_append ? REQ_OP_ZONE_APPEND : REQ_OP_WRITE;
>>  
>>  	WARN_ON(!PAGE_ALIGNED(start));
>>  	cb = kmalloc(compressed_bio_size(fs_info, compressed_len), GFP_NOFS);
>> @@ -418,10 +424,31 @@ blk_status_t btrfs_submit_compressed_write(struct btrfs_inode *inode, u64 start,
>>  	cb->nr_pages = nr_pages;
>>  
>>  	bio = btrfs_bio_alloc(first_byte);
>> -	bio->bi_opf = REQ_OP_WRITE | write_flags;
>> +	bio->bi_opf = bio_op | write_flags;
>>  	bio->bi_private = cb;
>>  	bio->bi_end_io = end_compressed_bio_write;
>>  
>> +	if (use_append) {
>> +		struct extent_map *em;
>> +		struct map_lookup *map;
>> +
>> +		em = btrfs_get_chunk_map(fs_info, disk_start, PAGE_SIZE);
> 
> The caller already does the em lookup, so this is duplicate, allocating
> memory, taking locks and doing a tree lookup. All happening on write out
> path so this seems heavy.

Right, I did not check this, sorry. Is it OK to add another preparation patch
swapping some of the parameters to btrfs_submit_compressed_write() for values
taken from the em? Otherwise btrfs_submit_compressed_write() will have 10
parameters, which sounds awful.

> 
>> +		if (IS_ERR(em)) {
>> +			kfree(cb);
>> +			bio_put(bio);
>> +			return BLK_STS_NOTSUPP;
>> +		}
>> +
>> +		map = em->map_lookup;
>> +		/* We only support single profile for now */
>> +		ASSERT(map->num_stripes == 1);
>> +		bdev = map->stripes[0].dev->bdev;
>> +
>> +		free_extent_map(em);
>> +
>> +		bio_set_dev(bio, bdev);
> 
> bdev seems to be used just to set it for the bio, so it does not need to
> be declared in the function scope (or for one-time use at all)

Oops, that's a leftover from an earlier version.

> The same sequence of calls is done in submit_extent_page so this should
> be in a helper.

Sure.
 


* Re: [PATCH 2/2] btrfs: zoned: fix compressed writes
  2021-05-17  7:07     ` Johannes Thumshirn
@ 2021-05-17  9:12       ` David Sterba
  2021-05-17  9:20         ` Johannes Thumshirn
  2021-05-17 11:21       ` Johannes Thumshirn
  1 sibling, 1 reply; 9+ messages in thread
From: David Sterba @ 2021-05-17  9:12 UTC (permalink / raw)
  To: Johannes Thumshirn; +Cc: David Sterba, linux-btrfs

On Mon, May 17, 2021 at 07:07:04AM +0000, Johannes Thumshirn wrote:
> On 12/05/2021 16:44, David Sterba wrote:
> > On Wed, May 12, 2021 at 11:01:40PM +0900, Johannes Thumshirn wrote:
> >> +	if (use_append) {
> >> +		struct extent_map *em;
> >> +		struct map_lookup *map;
> >> +
> >> +		em = btrfs_get_chunk_map(fs_info, disk_start, PAGE_SIZE);
> > 
> > The caller already does the em lookup, so this is duplicate, allocating
> > memory, taking locks and doing a tree lookup. All happening on write out
> > path so this seems heavy.
> 
> Right, I did not check this, sorry. Is it OK to add another patch as 
> preparation swapping some of the parameters to btrfs_submit_compressed_write()
> from the em?

That would be another prep patch for the fix; I can't say now whether this
would still be suitable for stable.

> Otherwise btrfs_submit_compressed_write() will have 10 parameters
> which sounds awefull.

In case the fix has to be in one patch, extending the parameter count to 10
would be acceptable, if followed by a reduction cleanup (that won't have to
be backported).

If you check the only caller of btrfs_submit_compressed_write, four
parameters come from async_extent and two from async_chunk, where
async_extent = list_entry(async_chunk->extents.next, ...), so that should
be easy to reduce.
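
To make that direction concrete, a reduced prototype could look roughly like
the following. This is a sketch only: the grouping is an assumption, and
async_extent/async_chunk are currently private to inode.c, so they would
first have to be moved to a header or wrapped:

/*
 * Hypothetical sketch, not an upstream prototype. start, len (ram_size),
 * compressed_pages and nr_pages would come from @async_extent; write_flags
 * and blkcg_css from @async_chunk; the allocated disk location stays
 * explicit.
 */
blk_status_t btrfs_submit_compressed_write(struct btrfs_inode *inode,
					   const struct async_extent *async_extent,
					   const struct async_chunk *async_chunk,
					   u64 disk_start,
					   unsigned long compressed_len);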


* Re: [PATCH 2/2] btrfs: zoned: fix compressed writes
  2021-05-17  9:12       ` David Sterba
@ 2021-05-17  9:20         ` Johannes Thumshirn
  0 siblings, 0 replies; 9+ messages in thread
From: Johannes Thumshirn @ 2021-05-17  9:20 UTC (permalink / raw)
  To: dsterba; +Cc: David Sterba, linux-btrfs

On 17/05/2021 11:15, David Sterba wrote:
> On Mon, May 17, 2021 at 07:07:04AM +0000, Johannes Thumshirn wrote:
>> On 12/05/2021 16:44, David Sterba wrote:
>>> On Wed, May 12, 2021 at 11:01:40PM +0900, Johannes Thumshirn wrote:
>>>> +	if (use_append) {
>>>> +		struct extent_map *em;
>>>> +		struct map_lookup *map;
>>>> +
>>>> +		em = btrfs_get_chunk_map(fs_info, disk_start, PAGE_SIZE);
>>>
>>> The caller already does the em lookup, so this is duplicate, allocating
>>> memory, taking locks and doing a tree lookup. All happening on write out
>>> path so this seems heavy.
>>
>> Right, I did not check this, sorry. Is it OK to add another patch as 
>> preparation swapping some of the parameters to btrfs_submit_compressed_write()
>> from the em?
> 
> That would be another prep patch for the fix, I can't say now if this
> would be still suitable for stable.
> 
>> Otherwise btrfs_submit_compressed_write() will have 10 parameters
>> which sounds awefull.
> 
> In case the fix would have to be in one patch, extending the parameters
> to 10 would be acceptable, if followed by reduction cleanup (that won't
> have to be backported).

OK, then I'll do that.

> If you check the only caller of btrfs_submit_compressed_write, four
> parameters are from async_extent, and two are async_chunk, where
> async_extent = list_entry(async_chunk->extents.next, ...) so that should
> be easy to reduce.
> 

Right 


* Re: [PATCH 2/2] btrfs: zoned: fix compressed writes
  2021-05-17  7:07     ` Johannes Thumshirn
  2021-05-17  9:12       ` David Sterba
@ 2021-05-17 11:21       ` Johannes Thumshirn
  2021-05-17 11:39         ` David Sterba
  1 sibling, 1 reply; 9+ messages in thread
From: Johannes Thumshirn @ 2021-05-17 11:21 UTC (permalink / raw)
  To: dsterba; +Cc: David Sterba, linux-btrfs

On 17/05/2021 09:07, Johannes Thumshirn wrote:
>>> +	if (use_append) {
>>> +		struct extent_map *em;
>>> +		struct map_lookup *map;
>>> +
>>> +		em = btrfs_get_chunk_map(fs_info, disk_start, PAGE_SIZE);
>> The caller already does the em lookup, so this is duplicate, allocating
>> memory, taking locks and doing a tree lookup. All happening on write out
>> path so this seems heavy.
> Right, I did not check this, sorry. Is it OK to add another patch as 
> preparation swapping some of the parameters to btrfs_submit_compressed_write()
> from the em? Otherwise btrfs_submit_compressed_write() will have 10 parameters
> which sounds awefull.
> 

Actually I can't do that. The caller calls create_io_em() while this patch
needs to call btrfs_get_chunk_map(). The 'em' returned by create_io_em() does
not have em->map_lookup populated, and we need the stripe's block device from
em->map_lookup.

So it looks like we need to live with the additional memory allocation and locks.


* Re: [PATCH 2/2] btrfs: zoned: fix compressed writes
  2021-05-17 11:21       ` Johannes Thumshirn
@ 2021-05-17 11:39         ` David Sterba
  0 siblings, 0 replies; 9+ messages in thread
From: David Sterba @ 2021-05-17 11:39 UTC (permalink / raw)
  To: Johannes Thumshirn; +Cc: dsterba, David Sterba, linux-btrfs

On Mon, May 17, 2021 at 11:21:49AM +0000, Johannes Thumshirn wrote:
> On 17/05/2021 09:07, Johannes Thumshirn wrote:
> >>> +	if (use_append) {
> >>> +		struct extent_map *em;
> >>> +		struct map_lookup *map;
> >>> +
> >>> +		em = btrfs_get_chunk_map(fs_info, disk_start, PAGE_SIZE);
> >> The caller already does the em lookup, so this is duplicate, allocating
> >> memory, taking locks and doing a tree lookup. All happening on write out
> >> path so this seems heavy.
> > Right, I did not check this, sorry. Is it OK to add another patch as 
> > preparation swapping some of the parameters to btrfs_submit_compressed_write()
> > from the em? Otherwise btrfs_submit_compressed_write() will have 10 parameters
> > which sounds awefull.
> > 
> 
> Actually I can't do that. The caller does calls create_io_em() while this patch
> needs to call brtfs_get_chunk_map(). The 'em' returned by create_io_em() does not
> have em->map_lookup populated and we need the stripe's block device from 
> em->map_lookup.
> 
> So it looks like we need to live with the additional memory allocation and locks.

Ok then, it's limited to zoned mode so the allocation won't affect
regular mode.
