* [PATCH v2 0/2] btrfs: parameter refactors for data and metadata endio callbacks
From: Qu Wenruo @ 2020-11-10 2:09 UTC (permalink / raw)
To: linux-btrfs
This is another cleanup that surfaced while I was fixing my subpage patchset.
Dating back to the time when we still had hooks for data/metadata
endio, we have a parameter called @phy_offset for both hooks.
That @phy_offset is the offset relative to the bio's on-disk bytenr; the
data hook converts it to a number of sectors and uses it to grab the csum
from btrfs_io_bio.
This is far from straightforward, and costs readers a lot of time to grasp
the basics.
This patchset changes it by:
- Removing phy_offset completely for metadata
  Since metadata doesn't use btrfs_io_bio::csum at all, there is no
  need for it.
- Using @disk_bytenr to replace @phy_offset/@icsum
  Let the callee, check_data_csum(), calculate the csum offset from
  @disk_bytenr and the bio.
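The lookup described in the second item can be sketched in plain user-space C. This is only an illustration of the arithmetic, not the kernel code: struct csum_ctx and csum_for_sector() are hypothetical names standing in for the few fields of btrfs_io_bio and fs_info that the calculation needs.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the btrfs_io_bio / fs_info fields used below. */
struct csum_ctx {
	uint64_t bio_disk_bytenr;  /* start of the bio in bytes (bi_sector << 9) */
	uint32_t sectorsize_bits;  /* log2 of the sector size, e.g. 12 for 4K */
	uint32_t csum_size;        /* bytes per checksum, e.g. 4 for crc32c */
	const uint8_t *csums;      /* one checksum per sector covered by the bio */
};

/*
 * Return the expected checksum for the sector starting at @disk_bytenr.
 * The caller passes an absolute position; the callee derives the index.
 */
static const uint8_t *csum_for_sector(const struct csum_ctx *ctx,
				      uint64_t disk_bytenr)
{
	uint64_t offset_sectors =
		(disk_bytenr - ctx->bio_disk_bytenr) >> ctx->sectorsize_bits;

	return ctx->csums + offset_sectors * ctx->csum_size;
}
```

With a 4K sector size and 4-byte checksums, the third sector of a bio maps to byte offset 8 in the checksum array, which is the value the old @icsum path reached by counting sectors in the caller.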
Changelog:
v2:
- Update commit message to remove the wrong comment on
  btrfs_io_bio->logical
  That logical is a mess; it has different meanings for different use
  cases.
  What we should refer to is bio->bi_iter.bi_sector.
- Remove the false-alert prone ASSERT()
  Even at endio time, bio->bi_iter.bi_size can change due to incoming
  finished IOs.
  This means we can't really rely on bio->bi_iter.bi_size to check if
  our disk_bytenr is still valid.
Qu Wenruo (2):
btrfs: remove the phy_offset parameter for
btrfs_validate_metadata_buffer()
btrfs: pass disk_bytenr directly for check_data_csum()
fs/btrfs/disk-io.c | 2 +-
fs/btrfs/disk-io.h | 2 +-
fs/btrfs/extent_io.c | 16 +++++++++-------
fs/btrfs/inode.c | 26 +++++++++++++++++---------
4 files changed, 28 insertions(+), 18 deletions(-)
--
2.29.2
* [PATCH v2 1/2] btrfs: remove the phy_offset parameter for btrfs_validate_metadata_buffer()
From: Qu Wenruo @ 2020-11-10 2:09 UTC (permalink / raw)
To: linux-btrfs
Parameter @phy_offset is the offset against the io_bio->logical (which
is the disk bytenr).
@phy_offset is mostly for data IO to look up the csum in btrfs_io_bio.
But for metadata, it's completely useless, as metadata stores its own
csum in its btrfs_header.
Remove this useless parameter from btrfs_validate_metadata_buffer().
Just an extra note on parameters @start and @end: they are not utilized
at all for the current sectorsize == PAGE_SIZE case, as we can grab the
eb directly from the page.
But those two parameters are very important for later subpage support,
thus @start/@end are not touched here.
Signed-off-by: Qu Wenruo <wqu@suse.com>
---
fs/btrfs/disk-io.c | 2 +-
fs/btrfs/disk-io.h | 2 +-
fs/btrfs/extent_io.c | 2 +-
3 files changed, 3 insertions(+), 3 deletions(-)
diff --git a/fs/btrfs/disk-io.c b/fs/btrfs/disk-io.c
index c70a52b44ceb..bd6e357dd280 100644
--- a/fs/btrfs/disk-io.c
+++ b/fs/btrfs/disk-io.c
@@ -524,7 +524,7 @@ static int check_tree_block_fsid(struct extent_buffer *eb)
return 1;
}
-int btrfs_validate_metadata_buffer(struct btrfs_io_bio *io_bio, u64 phy_offset,
+int btrfs_validate_metadata_buffer(struct btrfs_io_bio *io_bio,
struct page *page, u64 start, u64 end,
int mirror)
{
diff --git a/fs/btrfs/disk-io.h b/fs/btrfs/disk-io.h
index 238b45223f2e..76ede62737fd 100644
--- a/fs/btrfs/disk-io.h
+++ b/fs/btrfs/disk-io.h
@@ -79,7 +79,7 @@ void btrfs_btree_balance_dirty(struct btrfs_fs_info *fs_info);
void btrfs_btree_balance_dirty_nodelay(struct btrfs_fs_info *fs_info);
void btrfs_drop_and_free_fs_root(struct btrfs_fs_info *fs_info,
struct btrfs_root *root);
-int btrfs_validate_metadata_buffer(struct btrfs_io_bio *io_bio, u64 phy_offset,
+int btrfs_validate_metadata_buffer(struct btrfs_io_bio *io_bio,
struct page *page, u64 start, u64 end,
int mirror);
blk_status_t btrfs_submit_metadata_bio(struct inode *inode, struct bio *bio,
diff --git a/fs/btrfs/extent_io.c b/fs/btrfs/extent_io.c
index b4381d7ca52c..bd5a22bfee68 100644
--- a/fs/btrfs/extent_io.c
+++ b/fs/btrfs/extent_io.c
@@ -2928,7 +2928,7 @@ static void end_bio_extent_readpage(struct bio *bio)
start, end, mirror);
else
ret = btrfs_validate_metadata_buffer(io_bio,
- offset, page, start, end, mirror);
+ page, start, end, mirror);
if (ret)
uptodate = 0;
else
--
2.29.2
* [PATCH v2 2/2] btrfs: pass disk_bytenr directly for check_data_csum()
From: Qu Wenruo @ 2020-11-10 2:09 UTC (permalink / raw)
To: linux-btrfs
Parameter @icsum for check_data_csum() is a little hard to understand.
So is the @phy_offset for btrfs_verify_data_csum().
Both parameters are calculated values for csum lookup.
Instead of some calculated value, just pass @disk_bytenr and let the
final and only user, check_data_csum(), calculate whatever it needs.
Signed-off-by: Qu Wenruo <wqu@suse.com>
---
fs/btrfs/extent_io.c | 14 ++++++++------
fs/btrfs/inode.c | 26 +++++++++++++++++---------
2 files changed, 25 insertions(+), 15 deletions(-)
diff --git a/fs/btrfs/extent_io.c b/fs/btrfs/extent_io.c
index bd5a22bfee68..f8b5d3d4e5b0 100644
--- a/fs/btrfs/extent_io.c
+++ b/fs/btrfs/extent_io.c
@@ -2878,7 +2878,7 @@ static void end_bio_extent_readpage(struct bio *bio)
struct btrfs_io_bio *io_bio = btrfs_io_bio(bio);
struct extent_io_tree *tree, *failure_tree;
struct processed_extent processed = { 0 };
- u64 offset = 0;
+ u64 disk_bytenr = (bio->bi_iter.bi_sector << 9);
u64 start;
u64 end;
u64 len;
@@ -2924,8 +2924,9 @@ static void end_bio_extent_readpage(struct bio *bio)
mirror = io_bio->mirror_num;
if (likely(uptodate)) {
if (is_data_inode(inode))
- ret = btrfs_verify_data_csum(io_bio, offset, page,
- start, end, mirror);
+ ret = btrfs_verify_data_csum(io_bio,
+ disk_bytenr, page, start, end,
+ mirror);
else
ret = btrfs_validate_metadata_buffer(io_bio,
page, start, end, mirror);
@@ -2953,12 +2954,13 @@ static void end_bio_extent_readpage(struct bio *bio)
* If it can't handle the error it will return -EIO and
* we remain responsible for that page.
*/
- if (!btrfs_submit_read_repair(inode, bio, offset, page,
+ if (!btrfs_submit_read_repair(inode, bio, disk_bytenr,
+ page,
start - page_offset(page),
start, end, mirror,
btrfs_submit_data_bio)) {
uptodate = !bio->bi_status;
- offset += len;
+ disk_bytenr += len;
continue;
}
} else {
@@ -2983,7 +2985,7 @@ static void end_bio_extent_readpage(struct bio *bio)
if (page->index == end_index && off)
zero_user_segment(page, off, PAGE_SIZE);
}
- offset += len;
+ disk_bytenr += len;
endio_readpage_update_page_status(page, uptodate);
endio_readpage_release_extent(&processed, BTRFS_I(inode),
diff --git a/fs/btrfs/inode.c b/fs/btrfs/inode.c
index c54e0ed0b938..e1d309bfc693 100644
--- a/fs/btrfs/inode.c
+++ b/fs/btrfs/inode.c
@@ -2843,19 +2843,23 @@ void btrfs_writepage_endio_finish_ordered(struct page *page, u64 start,
* The length of such check is always one sector size.
*/
static int check_data_csum(struct inode *inode, struct btrfs_io_bio *io_bio,
- int icsum, struct page *page, int pgoff)
+ u64 disk_bytenr, struct page *page, int pgoff)
{
struct btrfs_fs_info *fs_info = btrfs_sb(inode->i_sb);
SHASH_DESC_ON_STACK(shash, fs_info->csum_shash);
char *kaddr;
u32 len = fs_info->sectorsize;
const u32 csum_size = fs_info->csum_size;
+ u64 bio_disk_bytenr = (io_bio->bio.bi_iter.bi_sector << 9);
+ int offset_sectors;
u8 *csum_expected;
u8 csum[BTRFS_CSUM_SIZE];
ASSERT(pgoff + len <= PAGE_SIZE);
- csum_expected = ((u8 *)io_bio->csum) + icsum * csum_size;
+ offset_sectors = (disk_bytenr - bio_disk_bytenr) >>
+ fs_info->sectorsize_bits;
+ csum_expected = ((u8 *)io_bio->csum) + offset_sectors * csum_size;
kaddr = kmap_atomic(page);
shash->tfm = fs_info->csum_shash;
@@ -2883,8 +2887,13 @@ static int check_data_csum(struct inode *inode, struct btrfs_io_bio *io_bio,
* when reads are done, we need to check csums to verify the data is correct
* if there's a match, we allow the bio to finish. If not, the code in
* extent_io.c will try to find good copies for us.
+ *
+ * @disk_bytenr: The on-disk bytenr of the range start
+ * @start: The file offset of the range start
+ * @end: The file offset of the range end (inclusive)
+ * @mirror: The mirror number
*/
-int btrfs_verify_data_csum(struct btrfs_io_bio *io_bio, u64 phy_offset,
+int btrfs_verify_data_csum(struct btrfs_io_bio *io_bio, u64 disk_bytenr,
struct page *page, u64 start, u64 end, int mirror)
{
size_t offset = start - page_offset(page);
@@ -2909,8 +2918,7 @@ int btrfs_verify_data_csum(struct btrfs_io_bio *io_bio, u64 phy_offset,
return 0;
}
- phy_offset >>= root->fs_info->sectorsize_bits;
- return check_data_csum(inode, io_bio, phy_offset, page, offset);
+ return check_data_csum(inode, io_bio, disk_bytenr, page, offset);
}
/*
@@ -7616,7 +7624,7 @@ static blk_status_t btrfs_check_read_dio_bio(struct inode *inode,
struct bio_vec bvec;
struct bvec_iter iter;
u64 start = io_bio->logical;
- int icsum = 0;
+ u64 disk_bytenr = (io_bio->bio.bi_iter.bi_sector << 9);
blk_status_t err = BLK_STS_OK;
__bio_for_each_segment(bvec, &io_bio->bio, iter, io_bio->iter) {
@@ -7627,8 +7635,8 @@ static blk_status_t btrfs_check_read_dio_bio(struct inode *inode,
for (i = 0; i < nr_sectors; i++) {
ASSERT(pgoff < PAGE_SIZE);
if (uptodate &&
- (!csum || !check_data_csum(inode, io_bio, icsum,
- bvec.bv_page, pgoff))) {
+ (!csum || !check_data_csum(inode, io_bio,
+ disk_bytenr, bvec.bv_page, pgoff))) {
clean_io_failure(fs_info, failure_tree, io_tree,
start, bvec.bv_page,
btrfs_ino(BTRFS_I(inode)),
@@ -7648,7 +7656,7 @@ static blk_status_t btrfs_check_read_dio_bio(struct inode *inode,
err = status;
}
start += sectorsize;
- icsum++;
+ disk_bytenr += sectorsize;
pgoff += sectorsize;
}
}
--
2.29.2
* Re: [PATCH v2 2/2] btrfs: pass disk_bytenr directly for check_data_csum()
From: Josef Bacik @ 2020-11-10 15:30 UTC (permalink / raw)
To: Qu Wenruo, linux-btrfs
On 11/9/20 9:09 PM, Qu Wenruo wrote:
> Parameter @icsum for check_data_csum() is a little hard to understand.
> So is the @phy_offset for btrfs_verify_data_csum().
>
> Both parameters are calculated values for csum lookup.
>
> Instead of some calculated value, just pass @disk_bytenr and let the
> final and only user, check_data_csum(), to calculate whatever it needs.
>
> Signed-off-by: Qu Wenruo <wqu@suse.com>
> ---
> fs/btrfs/extent_io.c | 14 ++++++++------
> fs/btrfs/inode.c | 26 +++++++++++++++++---------
> 2 files changed, 25 insertions(+), 15 deletions(-)
>
> diff --git a/fs/btrfs/extent_io.c b/fs/btrfs/extent_io.c
> index bd5a22bfee68..f8b5d3d4e5b0 100644
> --- a/fs/btrfs/extent_io.c
> +++ b/fs/btrfs/extent_io.c
> @@ -2878,7 +2878,7 @@ static void end_bio_extent_readpage(struct bio *bio)
> struct btrfs_io_bio *io_bio = btrfs_io_bio(bio);
> struct extent_io_tree *tree, *failure_tree;
> struct processed_extent processed = { 0 };
> - u64 offset = 0;
> + u64 disk_bytenr = (bio->bi_iter.bi_sector << 9);
This doesn't work, bi_sector can be remapped based on the underlying device, and
thus can be different between submit and endio. To illustrate this point, make
2 partitions on a single device, mkfs the second partition, and then run
xfstests with this patch applied, all sorts of fun will happen.
In fact we should probably add such a test to xfstests to catch anybody relying
on bi_sector to stay the same. Thanks,
Josef
* Re: [PATCH v2 2/2] btrfs: pass disk_bytenr directly for check_data_csum()
From: Qu Wenruo @ 2020-11-11 0:05 UTC (permalink / raw)
To: Josef Bacik, Qu Wenruo, linux-btrfs
On 2020/11/10 11:30 PM, Josef Bacik wrote:
> On 11/9/20 9:09 PM, Qu Wenruo wrote:
>> Parameter @icsum for check_data_csum() is a little hard to understand.
>> So is the @phy_offset for btrfs_verify_data_csum().
>>
>> Both parameters are calculated values for csum lookup.
>>
>> Instead of some calculated value, just pass @disk_bytenr and let the
>> final and only user, check_data_csum(), to calculate whatever it needs.
>>
>> Signed-off-by: Qu Wenruo <wqu@suse.com>
>> ---
>> fs/btrfs/extent_io.c | 14 ++++++++------
>> fs/btrfs/inode.c | 26 +++++++++++++++++---------
>> 2 files changed, 25 insertions(+), 15 deletions(-)
>>
>> diff --git a/fs/btrfs/extent_io.c b/fs/btrfs/extent_io.c
>> index bd5a22bfee68..f8b5d3d4e5b0 100644
>> --- a/fs/btrfs/extent_io.c
>> +++ b/fs/btrfs/extent_io.c
>> @@ -2878,7 +2878,7 @@ static void end_bio_extent_readpage(struct bio
>> *bio)
>> struct btrfs_io_bio *io_bio = btrfs_io_bio(bio);
>> struct extent_io_tree *tree, *failure_tree;
>> struct processed_extent processed = { 0 };
>> - u64 offset = 0;
>> + u64 disk_bytenr = (bio->bi_iter.bi_sector << 9);
>
> This doesn't work, bi_sector can be remapped based on the underlying
> device, and thus can be different between submit and endio. To
> illustrate this point, make 2 partitions on a single device, mkfs the
> second partition, and then run xfstests with this patch applied, all
> sorts of fun will happen.
Then it still doesn't matter.
The important thing is, we only use that "disk_bytenr" to calculate the
offset against the beginning of the bio.
Thus the result is the same.
Although the new naming would be a little confusing then, as it's not
really the disk_bytenr used by btrfs.
In that case, if we want (and I believe we want) the real disk_bytenr,
btrfs_io_bio would be the correct place to add this member.
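To make the argument above concrete, here is a small user-space sketch. It assumes the remapping shifts every sector of the bio by the same constant (for example a partition start), which is an assumption of this illustration and not something verified against the block layer. csum_index() is a hypothetical helper, not a kernel function.

```c
#include <stdint.h>

#define SECTOR_SHIFT 9 /* bi_sector counts 512-byte units */

/*
 * Hypothetical helper: index of the btrfs sector at @cur_sector relative
 * to the first sector of the bio, @bio_start_sector. Both arguments are
 * in 512-byte units, as bi_sector is.
 */
static uint64_t csum_index(uint64_t bio_start_sector, uint64_t cur_sector,
			   uint32_t sectorsize_bits)
{
	uint64_t bio_bytenr = bio_start_sector << SECTOR_SHIFT;
	uint64_t cur_bytenr = cur_sector << SECTOR_SHIFT;

	/* Only the difference is used, so a common shift cancels out. */
	return (cur_bytenr - bio_bytenr) >> sectorsize_bits;
}
```

Calling csum_index(2048, 2064, 12) and csum_index(2048 + shift, 2064 + shift, 12) gives the same index for any constant shift, which is why only the relative offset matters for the csum lookup, even though the value is no longer the bytenr btrfs submitted.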
>
> In fact we should probably add such a test to xfstests to catch anybody
> relying on bi_sector to stay the same. Thanks,
Thankfully, not for this patch.
Thanks,
Qu
>
> Josef