linux-btrfs.vger.kernel.org archive mirror
* [PATCH v2 0/2] btrfs: parameter refactors for data and metadata endio callbacks
@ 2020-11-10  2:09 Qu Wenruo
  2020-11-10  2:09 ` [PATCH v2 1/2] btrfs: remove the phy_offset parameter for btrfs_validate_metadata_buffer() Qu Wenruo
  2020-11-10  2:09 ` [PATCH v2 2/2] btrfs: pass disk_bytenr directly for check_data_csum() Qu Wenruo
  0 siblings, 2 replies; 5+ messages in thread
From: Qu Wenruo @ 2020-11-10  2:09 UTC
  To: linux-btrfs

This is another cleanup that surfaced while I was fixing my subpage
patchset.

Dating back to the days when we still had hooks for data/metadata endio,
both hooks took a parameter called @phy_offset.

@phy_offset is the offset of the current range from the bio's on-disk
bytenr. For data it is converted into a sector index and used to grab
the csum from btrfs_io_bio.

This is far from straightforward, and costs readers a lot of time to
grasp the basics.

This patchset changes that by:
- Removing phy_offset completely for metadata
  Since metadata doesn't use btrfs_io_bio::csums[] at all, there is no
  need for it.

- Using @disk_bytenr to replace @phy_offset/@icsum
  Let the callee, check_data_csum(), calculate the csum offset from
  @disk_bytenr and the bio.
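
The second item above can be illustrated with a small userspace sketch.
This is not the kernel code: csum_for_sector() and its parameter list
are hypothetical names chosen for the example, and the csum buffer is
assumed to be a packed array holding one csum_size-byte checksum per
sector of the bio.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Userspace sketch of the csum lookup, not the kernel implementation.
 *
 * csums:           packed array, one csum_size-byte checksum per sector
 * bio_disk_bytenr: byte address where the bio starts
 * disk_bytenr:     byte address of the sector being verified
 */
static const uint8_t *csum_for_sector(const uint8_t *csums,
				      uint64_t bio_disk_bytenr,
				      uint64_t disk_bytenr,
				      uint32_t sectorsize_bits,
				      uint32_t csum_size)
{
	uint64_t offset_sectors;

	/* The sector being verified must not start before the bio does. */
	assert(disk_bytenr >= bio_disk_bytenr);

	/* Byte distance from the bio start, converted to a sector index. */
	offset_sectors = (disk_bytenr - bio_disk_bytenr) >> sectorsize_bits;

	/* Each sector owns one csum_size-byte slot in the array. */
	return csums + offset_sectors * csum_size;
}
```

With 4K sectors (sectorsize_bits == 12) and 4-byte csums, a sector that
starts 8192 bytes into the bio maps to the slot at byte offset 8 of the
csum array. The caller only has to track a byte address; the sector
index is derived in one place.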

Changelog:
v2:
- Update the commit message to remove the wrong comment on
  btrfs_io_bio->logical
  That logical is a mess; it has different meanings for different use
  cases.
  What we should refer to is bio->bi_iter.bi_sector.

- Remove the false-alert-prone ASSERT()
  Even at endio time, bio->bi_iter.bi_size can change due to incoming
  finished IOs.
  This means we can't really rely on bio->bi_iter.bi_size to check if
  our disk_bytenr is still valid.

Qu Wenruo (2):
  btrfs: remove the phy_offset parameter for
    btrfs_validate_metadata_buffer()
  btrfs: pass disk_bytenr directly for check_data_csum()

 fs/btrfs/disk-io.c   |  2 +-
 fs/btrfs/disk-io.h   |  2 +-
 fs/btrfs/extent_io.c | 16 +++++++++-------
 fs/btrfs/inode.c     | 26 +++++++++++++++++---------
 4 files changed, 28 insertions(+), 18 deletions(-)

-- 
2.29.2



* [PATCH v2 1/2] btrfs: remove the phy_offset parameter for btrfs_validate_metadata_buffer()
  2020-11-10  2:09 [PATCH v2 0/2] btrfs: parameter refactors for data and metadata endio callbacks Qu Wenruo
@ 2020-11-10  2:09 ` Qu Wenruo
  2020-11-10  2:09 ` [PATCH v2 2/2] btrfs: pass disk_bytenr directly for check_data_csum() Qu Wenruo
  1 sibling, 0 replies; 5+ messages in thread
From: Qu Wenruo @ 2020-11-10  2:09 UTC
  To: linux-btrfs

Parameter @phy_offset is the offset relative to io_bio->logical (which
is the disk bytenr).

@phy_offset is mostly used by data IO to look up the csum in
btrfs_io_bio.

But for metadata it's completely useless, as metadata stores its own
csum in its btrfs_header.

Remove this useless parameter from btrfs_validate_metadata_buffer().

As an extra note on parameters @start and @end: they are not used at
all for the current sectorsize == PAGE_SIZE case, as we can grab the eb
directly from the page.

But those two parameters are very important for later subpage support,
thus @start/@end are not touched here.

Signed-off-by: Qu Wenruo <wqu@suse.com>
---
 fs/btrfs/disk-io.c   | 2 +-
 fs/btrfs/disk-io.h   | 2 +-
 fs/btrfs/extent_io.c | 2 +-
 3 files changed, 3 insertions(+), 3 deletions(-)

diff --git a/fs/btrfs/disk-io.c b/fs/btrfs/disk-io.c
index c70a52b44ceb..bd6e357dd280 100644
--- a/fs/btrfs/disk-io.c
+++ b/fs/btrfs/disk-io.c
@@ -524,7 +524,7 @@ static int check_tree_block_fsid(struct extent_buffer *eb)
 	return 1;
 }
 
-int btrfs_validate_metadata_buffer(struct btrfs_io_bio *io_bio, u64 phy_offset,
+int btrfs_validate_metadata_buffer(struct btrfs_io_bio *io_bio,
 				   struct page *page, u64 start, u64 end,
 				   int mirror)
 {
diff --git a/fs/btrfs/disk-io.h b/fs/btrfs/disk-io.h
index 238b45223f2e..76ede62737fd 100644
--- a/fs/btrfs/disk-io.h
+++ b/fs/btrfs/disk-io.h
@@ -79,7 +79,7 @@ void btrfs_btree_balance_dirty(struct btrfs_fs_info *fs_info);
 void btrfs_btree_balance_dirty_nodelay(struct btrfs_fs_info *fs_info);
 void btrfs_drop_and_free_fs_root(struct btrfs_fs_info *fs_info,
 				 struct btrfs_root *root);
-int btrfs_validate_metadata_buffer(struct btrfs_io_bio *io_bio, u64 phy_offset,
+int btrfs_validate_metadata_buffer(struct btrfs_io_bio *io_bio,
 				   struct page *page, u64 start, u64 end,
 				   int mirror);
 blk_status_t btrfs_submit_metadata_bio(struct inode *inode, struct bio *bio,
diff --git a/fs/btrfs/extent_io.c b/fs/btrfs/extent_io.c
index b4381d7ca52c..bd5a22bfee68 100644
--- a/fs/btrfs/extent_io.c
+++ b/fs/btrfs/extent_io.c
@@ -2928,7 +2928,7 @@ static void end_bio_extent_readpage(struct bio *bio)
 							     start, end, mirror);
 			else
 				ret = btrfs_validate_metadata_buffer(io_bio,
-					offset, page, start, end, mirror);
+					page, start, end, mirror);
 			if (ret)
 				uptodate = 0;
 			else
-- 
2.29.2



* [PATCH v2 2/2] btrfs: pass disk_bytenr directly for check_data_csum()
  2020-11-10  2:09 [PATCH v2 0/2] btrfs: parameter refactors for data and metadata endio callbacks Qu Wenruo
  2020-11-10  2:09 ` [PATCH v2 1/2] btrfs: remove the phy_offset parameter for btrfs_validate_metadata_buffer() Qu Wenruo
@ 2020-11-10  2:09 ` Qu Wenruo
  2020-11-10 15:30   ` Josef Bacik
  1 sibling, 1 reply; 5+ messages in thread
From: Qu Wenruo @ 2020-11-10  2:09 UTC
  To: linux-btrfs

Parameter @icsum for check_data_csum() is a little hard to understand.
So is @phy_offset for btrfs_verify_data_csum().

Both parameters are calculated values used for csum lookup.

Instead of passing a calculated value, just pass @disk_bytenr and let
the final and only user, check_data_csum(), calculate whatever it needs.

Signed-off-by: Qu Wenruo <wqu@suse.com>
---
 fs/btrfs/extent_io.c | 14 ++++++++------
 fs/btrfs/inode.c     | 26 +++++++++++++++++---------
 2 files changed, 25 insertions(+), 15 deletions(-)

diff --git a/fs/btrfs/extent_io.c b/fs/btrfs/extent_io.c
index bd5a22bfee68..f8b5d3d4e5b0 100644
--- a/fs/btrfs/extent_io.c
+++ b/fs/btrfs/extent_io.c
@@ -2878,7 +2878,7 @@ static void end_bio_extent_readpage(struct bio *bio)
 	struct btrfs_io_bio *io_bio = btrfs_io_bio(bio);
 	struct extent_io_tree *tree, *failure_tree;
 	struct processed_extent processed = { 0 };
-	u64 offset = 0;
+	u64 disk_bytenr = (bio->bi_iter.bi_sector << 9);
 	u64 start;
 	u64 end;
 	u64 len;
@@ -2924,8 +2924,9 @@ static void end_bio_extent_readpage(struct bio *bio)
 		mirror = io_bio->mirror_num;
 		if (likely(uptodate)) {
 			if (is_data_inode(inode))
-				ret = btrfs_verify_data_csum(io_bio, offset, page,
-							     start, end, mirror);
+				ret = btrfs_verify_data_csum(io_bio,
+						disk_bytenr, page, start, end,
+						mirror);
 			else
 				ret = btrfs_validate_metadata_buffer(io_bio,
 					page, start, end, mirror);
@@ -2953,12 +2954,13 @@ static void end_bio_extent_readpage(struct bio *bio)
 			 * If it can't handle the error it will return -EIO and
 			 * we remain responsible for that page.
 			 */
-			if (!btrfs_submit_read_repair(inode, bio, offset, page,
+			if (!btrfs_submit_read_repair(inode, bio, disk_bytenr,
+						page,
 						start - page_offset(page),
 						start, end, mirror,
 						btrfs_submit_data_bio)) {
 				uptodate = !bio->bi_status;
-				offset += len;
+				disk_bytenr += len;
 				continue;
 			}
 		} else {
@@ -2983,7 +2985,7 @@ static void end_bio_extent_readpage(struct bio *bio)
 			if (page->index == end_index && off)
 				zero_user_segment(page, off, PAGE_SIZE);
 		}
-		offset += len;
+		disk_bytenr += len;
 
 		endio_readpage_update_page_status(page, uptodate);
 		endio_readpage_release_extent(&processed, BTRFS_I(inode),
diff --git a/fs/btrfs/inode.c b/fs/btrfs/inode.c
index c54e0ed0b938..e1d309bfc693 100644
--- a/fs/btrfs/inode.c
+++ b/fs/btrfs/inode.c
@@ -2843,19 +2843,23 @@ void btrfs_writepage_endio_finish_ordered(struct page *page, u64 start,
  * The length of such check is always one sector size.
  */
 static int check_data_csum(struct inode *inode, struct btrfs_io_bio *io_bio,
-			   int icsum, struct page *page, int pgoff)
+			   u64 disk_bytenr, struct page *page, int pgoff)
 {
 	struct btrfs_fs_info *fs_info = btrfs_sb(inode->i_sb);
 	SHASH_DESC_ON_STACK(shash, fs_info->csum_shash);
 	char *kaddr;
 	u32 len = fs_info->sectorsize;
 	const u32 csum_size = fs_info->csum_size;
+	u64 bio_disk_bytenr = (io_bio->bio.bi_iter.bi_sector << 9);
+	int offset_sectors;
 	u8 *csum_expected;
 	u8 csum[BTRFS_CSUM_SIZE];
 
 	ASSERT(pgoff + len <= PAGE_SIZE);
 
-	csum_expected = ((u8 *)io_bio->csum) + icsum * csum_size;
+	offset_sectors = (disk_bytenr - bio_disk_bytenr) >>
+			 fs_info->sectorsize_bits;
+	csum_expected = ((u8 *)io_bio->csum) + offset_sectors * csum_size;
 
 	kaddr = kmap_atomic(page);
 	shash->tfm = fs_info->csum_shash;
@@ -2883,8 +2887,13 @@ static int check_data_csum(struct inode *inode, struct btrfs_io_bio *io_bio,
  * when reads are done, we need to check csums to verify the data is correct
  * if there's a match, we allow the bio to finish.  If not, the code in
  * extent_io.c will try to find good copies for us.
+ *
+ * @disk_bytenr: The on-disk bytenr of the range start
+ * @start:	 The file offset of the range start
+ * @end:	 The file offset of the range end (inclusive)
+ * @mirror:	 The mirror number
  */
-int btrfs_verify_data_csum(struct btrfs_io_bio *io_bio, u64 phy_offset,
+int btrfs_verify_data_csum(struct btrfs_io_bio *io_bio, u64 disk_bytenr,
 			   struct page *page, u64 start, u64 end, int mirror)
 {
 	size_t offset = start - page_offset(page);
@@ -2909,8 +2918,7 @@ int btrfs_verify_data_csum(struct btrfs_io_bio *io_bio, u64 phy_offset,
 		return 0;
 	}
 
-	phy_offset >>= root->fs_info->sectorsize_bits;
-	return check_data_csum(inode, io_bio, phy_offset, page, offset);
+	return check_data_csum(inode, io_bio, disk_bytenr, page, offset);
 }
 
 /*
@@ -7616,7 +7624,7 @@ static blk_status_t btrfs_check_read_dio_bio(struct inode *inode,
 	struct bio_vec bvec;
 	struct bvec_iter iter;
 	u64 start = io_bio->logical;
-	int icsum = 0;
+	u64 disk_bytenr = (io_bio->bio.bi_iter.bi_sector << 9);
 	blk_status_t err = BLK_STS_OK;
 
 	__bio_for_each_segment(bvec, &io_bio->bio, iter, io_bio->iter) {
@@ -7627,8 +7635,8 @@ static blk_status_t btrfs_check_read_dio_bio(struct inode *inode,
 		for (i = 0; i < nr_sectors; i++) {
 			ASSERT(pgoff < PAGE_SIZE);
 			if (uptodate &&
-			    (!csum || !check_data_csum(inode, io_bio, icsum,
-						       bvec.bv_page, pgoff))) {
+			    (!csum || !check_data_csum(inode, io_bio,
+					disk_bytenr, bvec.bv_page, pgoff))) {
 				clean_io_failure(fs_info, failure_tree, io_tree,
 						 start, bvec.bv_page,
 						 btrfs_ino(BTRFS_I(inode)),
@@ -7648,7 +7656,7 @@ static blk_status_t btrfs_check_read_dio_bio(struct inode *inode,
 					err = status;
 			}
 			start += sectorsize;
-			icsum++;
+			disk_bytenr += sectorsize;
 			pgoff += sectorsize;
 		}
 	}
-- 
2.29.2



* Re: [PATCH v2 2/2] btrfs: pass disk_bytenr directly for check_data_csum()
  2020-11-10  2:09 ` [PATCH v2 2/2] btrfs: pass disk_bytenr directly for check_data_csum() Qu Wenruo
@ 2020-11-10 15:30   ` Josef Bacik
  2020-11-11  0:05     ` Qu Wenruo
  0 siblings, 1 reply; 5+ messages in thread
From: Josef Bacik @ 2020-11-10 15:30 UTC
  To: Qu Wenruo, linux-btrfs

On 11/9/20 9:09 PM, Qu Wenruo wrote:
> Parameter @icsum for check_data_csum() is a little hard to understand.
> So is the @phy_offset for btrfs_verify_data_csum().
> 
> Both parameters are calculated values for csum lookup.
> 
> Instead of some calculated value, just pass @disk_bytenr and let the
> final and only user, check_data_csum(), to calculate whatever it needs.
> 
> Signed-off-by: Qu Wenruo <wqu@suse.com>
> ---
>   fs/btrfs/extent_io.c | 14 ++++++++------
>   fs/btrfs/inode.c     | 26 +++++++++++++++++---------
>   2 files changed, 25 insertions(+), 15 deletions(-)
> 
> diff --git a/fs/btrfs/extent_io.c b/fs/btrfs/extent_io.c
> index bd5a22bfee68..f8b5d3d4e5b0 100644
> --- a/fs/btrfs/extent_io.c
> +++ b/fs/btrfs/extent_io.c
> @@ -2878,7 +2878,7 @@ static void end_bio_extent_readpage(struct bio *bio)
>   	struct btrfs_io_bio *io_bio = btrfs_io_bio(bio);
>   	struct extent_io_tree *tree, *failure_tree;
>   	struct processed_extent processed = { 0 };
> -	u64 offset = 0;
> +	u64 disk_bytenr = (bio->bi_iter.bi_sector << 9);

This doesn't work: bi_sector can be remapped based on the underlying
device, and thus can be different between submit and endio.  To
illustrate this point, make 2 partitions on a single device, mkfs the
second partition, and then run xfstests with this patch applied; all
sorts of fun will happen.

In fact we should probably add such a test to xfstests to catch anybody
relying on bi_sector to stay the same.  Thanks,

Josef


* Re: [PATCH v2 2/2] btrfs: pass disk_bytenr directly for check_data_csum()
  2020-11-10 15:30   ` Josef Bacik
@ 2020-11-11  0:05     ` Qu Wenruo
  0 siblings, 0 replies; 5+ messages in thread
From: Qu Wenruo @ 2020-11-11  0:05 UTC
  To: Josef Bacik, Qu Wenruo, linux-btrfs


On 2020/11/10 11:30 PM, Josef Bacik wrote:
> On 11/9/20 9:09 PM, Qu Wenruo wrote:
>> Parameter @icsum for check_data_csum() is a little hard to understand.
>> So is the @phy_offset for btrfs_verify_data_csum().
>>
>> Both parameters are calculated values for csum lookup.
>>
>> Instead of some calculated value, just pass @disk_bytenr and let the
>> final and only user, check_data_csum(), to calculate whatever it needs.
>>
>> Signed-off-by: Qu Wenruo <wqu@suse.com>
>> ---
>>   fs/btrfs/extent_io.c | 14 ++++++++------
>>   fs/btrfs/inode.c     | 26 +++++++++++++++++---------
>>   2 files changed, 25 insertions(+), 15 deletions(-)
>>
>> diff --git a/fs/btrfs/extent_io.c b/fs/btrfs/extent_io.c
>> index bd5a22bfee68..f8b5d3d4e5b0 100644
>> --- a/fs/btrfs/extent_io.c
>> +++ b/fs/btrfs/extent_io.c
>> @@ -2878,7 +2878,7 @@ static void end_bio_extent_readpage(struct bio
>> *bio)
>>       struct btrfs_io_bio *io_bio = btrfs_io_bio(bio);
>>       struct extent_io_tree *tree, *failure_tree;
>>       struct processed_extent processed = { 0 };
>> -    u64 offset = 0;
>> +    u64 disk_bytenr = (bio->bi_iter.bi_sector << 9);
> 
> This doesn't work, bi_sector can be remapped based on the underlying
> device, and thus can be different between submit and endio.  To
> illustrate this point, make 2 partitions on a single device, mkfs the
> second partition, and then run xfstests with this patch applied, all
> sorts of fun will happen.

Then it still doesn't matter.

The important thing is, we only use that "disk_bytenr" to calculate the
offset against the beginning of the bio.

Thus the result is the same.
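
A minimal userspace sketch of that argument follows. It rests on two
assumptions that are not established by this thread: the remap is
modeled as adding a constant partition start to every address, and both
values are sampled after the remap has happened. The helper names
remapped_bytenr() and offset_in_bio() are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Assumption of this sketch: a partition remap adds the partition start
 * (given in 512-byte sectors) to every byte address on the device.
 */
static uint64_t remapped_bytenr(uint64_t bytenr, uint64_t part_start_sector)
{
	return bytenr + (part_start_sector << 9);
}

/* Byte offset of a range relative to the start of its bio. */
static uint64_t offset_in_bio(uint64_t bio_bytenr, uint64_t range_bytenr)
{
	assert(range_bytenr >= bio_bytenr);
	return range_bytenr - bio_bytenr;
}
```

If the bio start and the range start are both shifted by the same
partition start, the shift cancels in the subtraction, so the offset
used for the csum lookup is unchanged. The sketch says nothing about
remaps that are not a constant shift.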

Although the new name would then be a little confusing, since it's not
really the disk_bytenr used by btrfs.

In that case, if we want the real disk_bytenr (and I believe we do),
btrfs_io_bio would be the correct place to add this member.

> 
> In fact we should probably add such a test to xfstests to catch anybody
> relying on bi_sector to stay the same.  Thanks,

Thankfully, not for this patch.

Thanks,
Qu

> 
> Josef



