* [PATCH v2] btrfs: populate extent_map::generation when reading from disk
@ 2022-02-08 5:31 Qu Wenruo
2022-02-08 16:39 ` Filipe Manana
2022-02-08 18:42 ` David Sterba
0 siblings, 2 replies; 3+ messages in thread
From: Qu Wenruo @ 2022-02-08 5:31 UTC (permalink / raw)
To: linux-btrfs
[WEIRD BEHAVIOR]
When btrfs_get_extent() tries to get some file extent from disk, it
never populates extent_map::generation , leaving the value to be 0.
On the other hand, for extent map generated by IO, it will get its
generation properly set at finish_ordered_io()
finish_ordered_io()
|- unpin_extent_cache(gen = trans->transid)
|- em->generation = gen;
[CAUSE]
Since extent_map::generation is mostly used by fsync code, and for fsync
they only care about modified extents, which all have their em::generation > 0.
Thus it's fine to not populate em read from disk for fsync.
[CORNER CASE]
However autodefrag also relies on em::geneartion to determine if one extent
needs to be defragged.
This unpopulated extent_map::geneartion can prevent the following autodefrag
case from working:
mkfs.btrfs -f $dev
mount $dev $mnt -o autodefrag
# initial write to queue the inode for autodefrag
xfs_io -f -c "pwrite 0 4k" $mnt/file
sync
# Real fragmented write
xfs_io -f -s -c "pwrite -b 4096 0 32k" $mnt/file
sync
echo "=== before autodefrag ==="
xfs_io -c "fiemap -v" $mnt/file
# Drop cache to force em to be read from disk
echo 3 > /proc/sys/vm/drop_caches
mount -o remount,commit=1 $mnt
sleep 3
sync
echo "=== After autodefrag ==="
xfs_io -c "fiemap -v" $mnt/file
umount $mnt
The result looks like this:
=== before autodefrag ===
/mnt/btrfs/file:
EXT: FILE-OFFSET BLOCK-RANGE TOTAL FLAGS
0: [0..15]: 26672..26687 16 0x0
1: [16..31]: 26656..26671 16 0x0
2: [32..47]: 26640..26655 16 0x0
3: [48..63]: 26624..26639 16 0x1
=== After autodefrag ===
/mnt/btrfs/file:
EXT: FILE-OFFSET BLOCK-RANGE TOTAL FLAGS
0: [0..15]: 26672..26687 16 0x0
1: [16..31]: 26656..26671 16 0x0
2: [32..47]: 26640..26655 16 0x0
3: [48..63]: 26624..26639 16 0x1
This fragmented 32K will not be defragged by autodefrag.
[FIX]
To make things less weird, just populate extent_map::generation when
reading file extents from disk.
This would make above fragmented extents to be properly defragged:
== before autodefrag ===
/mnt/btrfs/file:
EXT: FILE-OFFSET BLOCK-RANGE TOTAL FLAGS
0: [0..15]: 26672..26687 16 0x0
1: [16..31]: 26656..26671 16 0x0
2: [32..47]: 26640..26655 16 0x0
3: [48..63]: 26624..26639 16 0x1
=== After autodefrag ===
/mnt/btrfs/file:
EXT: FILE-OFFSET BLOCK-RANGE TOTAL FLAGS
0: [0..63]: 26688..26751 64 0x1
Signed-off-by: Qu Wenruo <wqu@suse.com>
---
Changelog:
v2:
- Update the commit message to include a reproducer
Although this is not what we want (to reduce autodefrag IO),
the behavior still worthy fixing anyway.
---
fs/btrfs/file-item.c | 1 +
1 file changed, 1 insertion(+)
diff --git a/fs/btrfs/file-item.c b/fs/btrfs/file-item.c
index 90c5c38836ab..9a3de652ada8 100644
--- a/fs/btrfs/file-item.c
+++ b/fs/btrfs/file-item.c
@@ -1211,6 +1211,7 @@ void btrfs_extent_item_to_extent_map(struct btrfs_inode *inode,
extent_start = key.offset;
extent_end = btrfs_file_extent_end(path);
em->ram_bytes = btrfs_file_extent_ram_bytes(leaf, fi);
+ em->generation = btrfs_file_extent_generation(leaf, fi);
if (type == BTRFS_FILE_EXTENT_REG ||
type == BTRFS_FILE_EXTENT_PREALLOC) {
em->start = extent_start;
--
2.35.0
^ permalink raw reply related [flat|nested] 3+ messages in thread
* Re: [PATCH v2] btrfs: populate extent_map::generation when reading from disk
2022-02-08 5:31 [PATCH v2] btrfs: populate extent_map::generation when reading from disk Qu Wenruo
@ 2022-02-08 16:39 ` Filipe Manana
2022-02-08 18:42 ` David Sterba
1 sibling, 0 replies; 3+ messages in thread
From: Filipe Manana @ 2022-02-08 16:39 UTC (permalink / raw)
To: Qu Wenruo; +Cc: linux-btrfs
On Tue, Feb 08, 2022 at 01:31:19PM +0800, Qu Wenruo wrote:
> [WEIRD BEHAVIOR]
>
> When btrfs_get_extent() tries to get some file extent from disk, it
> never populates extent_map::generation , leaving the value to be 0.
>
> On the other hand, for extent map generated by IO, it will get its
> generation properly set at finish_ordered_io()
>
> finish_ordered_io()
> |- unpin_extent_cache(gen = trans->transid)
> |- em->generation = gen;
>
> [CAUSE]
> Since extent_map::generation is mostly used by fsync code, and for fsync
> they only care about modified extents, which all have their em::generation > 0.
>
> Thus it's fine to not populate em read from disk for fsync.
>
> [CORNER CASE]
> However autodefrag also relies on em::geneartion to determine if one extent
> needs to be defragged.
em::geneartion -> em::generation
>
> This unpopulated extent_map::geneartion can prevent the following autodefrag
> case from working:
Same here.
>
> mkfs.btrfs -f $dev
> mount $dev $mnt -o autodefrag
>
> # initial write to queue the inode for autodefrag
> xfs_io -f -c "pwrite 0 4k" $mnt/file
> sync
>
> # Real fragmented write
> xfs_io -f -s -c "pwrite -b 4096 0 32k" $mnt/file
> sync
> echo "=== before autodefrag ==="
> xfs_io -c "fiemap -v" $mnt/file
>
> # Drop cache to force em to be read from disk
> echo 3 > /proc/sys/vm/drop_caches
> mount -o remount,commit=1 $mnt
> sleep 3
> sync
>
> echo "=== After autodefrag ==="
> xfs_io -c "fiemap -v" $mnt/file
> umount $mnt
>
> The result looks like this:
>
> === before autodefrag ===
> /mnt/btrfs/file:
> EXT: FILE-OFFSET BLOCK-RANGE TOTAL FLAGS
> 0: [0..15]: 26672..26687 16 0x0
> 1: [16..31]: 26656..26671 16 0x0
> 2: [32..47]: 26640..26655 16 0x0
> 3: [48..63]: 26624..26639 16 0x1
> === After autodefrag ===
> /mnt/btrfs/file:
> EXT: FILE-OFFSET BLOCK-RANGE TOTAL FLAGS
> 0: [0..15]: 26672..26687 16 0x0
> 1: [16..31]: 26656..26671 16 0x0
> 2: [32..47]: 26640..26655 16 0x0
> 3: [48..63]: 26624..26639 16 0x1
>
> This fragmented 32K will not be defragged by autodefrag.
>
> [FIX]
> To make things less weird, just populate extent_map::generation when
> reading file extents from disk.
>
> This would make above fragmented extents to be properly defragged:
>
> == before autodefrag ===
> /mnt/btrfs/file:
> EXT: FILE-OFFSET BLOCK-RANGE TOTAL FLAGS
> 0: [0..15]: 26672..26687 16 0x0
> 1: [16..31]: 26656..26671 16 0x0
> 2: [32..47]: 26640..26655 16 0x0
> 3: [48..63]: 26624..26639 16 0x1
> === After autodefrag ===
> /mnt/btrfs/file:
> EXT: FILE-OFFSET BLOCK-RANGE TOTAL FLAGS
> 0: [0..63]: 26688..26751 64 0x1
>
> Signed-off-by: Qu Wenruo <wqu@suse.com>
I don't want to make you send yet another version because only of
a typo in the changelog, so:
Reviewed-by: Filipe Manana <fdmanana@suse.com>
Thanks.
> ---
> Changelog:
> v2:
> - Update the commit message to include a reproducer
> Although this is not what we want (to reduce autodefrag IO),
> the behavior still worthy fixing anyway.
> ---
> fs/btrfs/file-item.c | 1 +
> 1 file changed, 1 insertion(+)
>
> diff --git a/fs/btrfs/file-item.c b/fs/btrfs/file-item.c
> index 90c5c38836ab..9a3de652ada8 100644
> --- a/fs/btrfs/file-item.c
> +++ b/fs/btrfs/file-item.c
> @@ -1211,6 +1211,7 @@ void btrfs_extent_item_to_extent_map(struct btrfs_inode *inode,
> extent_start = key.offset;
> extent_end = btrfs_file_extent_end(path);
> em->ram_bytes = btrfs_file_extent_ram_bytes(leaf, fi);
> + em->generation = btrfs_file_extent_generation(leaf, fi);
> if (type == BTRFS_FILE_EXTENT_REG ||
> type == BTRFS_FILE_EXTENT_PREALLOC) {
> em->start = extent_start;
> --
> 2.35.0
>
^ permalink raw reply [flat|nested] 3+ messages in thread
* Re: [PATCH v2] btrfs: populate extent_map::generation when reading from disk
2022-02-08 5:31 [PATCH v2] btrfs: populate extent_map::generation when reading from disk Qu Wenruo
2022-02-08 16:39 ` Filipe Manana
@ 2022-02-08 18:42 ` David Sterba
1 sibling, 0 replies; 3+ messages in thread
From: David Sterba @ 2022-02-08 18:42 UTC (permalink / raw)
To: Qu Wenruo; +Cc: linux-btrfs
On Tue, Feb 08, 2022 at 01:31:19PM +0800, Qu Wenruo wrote:
> [WEIRD BEHAVIOR]
>
> When btrfs_get_extent() tries to get some file extent from disk, it
> never populates extent_map::generation , leaving the value to be 0.
>
> On the other hand, for extent map generated by IO, it will get its
> generation properly set at finish_ordered_io()
>
> finish_ordered_io()
> |- unpin_extent_cache(gen = trans->transid)
> |- em->generation = gen;
>
> [CAUSE]
> Since extent_map::generation is mostly used by fsync code, and for fsync
> they only care about modified extents, which all have their em::generation > 0.
>
> Thus it's fine to not populate em read from disk for fsync.
>
> [CORNER CASE]
> However autodefrag also relies on em::geneartion to determine if one extent
> needs to be defragged.
>
> This unpopulated extent_map::geneartion can prevent the following autodefrag
> case from working:
>
> mkfs.btrfs -f $dev
> mount $dev $mnt -o autodefrag
>
> # initial write to queue the inode for autodefrag
> xfs_io -f -c "pwrite 0 4k" $mnt/file
> sync
>
> # Real fragmented write
> xfs_io -f -s -c "pwrite -b 4096 0 32k" $mnt/file
> sync
> echo "=== before autodefrag ==="
> xfs_io -c "fiemap -v" $mnt/file
>
> # Drop cache to force em to be read from disk
> echo 3 > /proc/sys/vm/drop_caches
> mount -o remount,commit=1 $mnt
> sleep 3
> sync
>
> echo "=== After autodefrag ==="
> xfs_io -c "fiemap -v" $mnt/file
> umount $mnt
>
> The result looks like this:
>
> === before autodefrag ===
> /mnt/btrfs/file:
> EXT: FILE-OFFSET BLOCK-RANGE TOTAL FLAGS
> 0: [0..15]: 26672..26687 16 0x0
> 1: [16..31]: 26656..26671 16 0x0
> 2: [32..47]: 26640..26655 16 0x0
> 3: [48..63]: 26624..26639 16 0x1
> === After autodefrag ===
> /mnt/btrfs/file:
> EXT: FILE-OFFSET BLOCK-RANGE TOTAL FLAGS
> 0: [0..15]: 26672..26687 16 0x0
> 1: [16..31]: 26656..26671 16 0x0
> 2: [32..47]: 26640..26655 16 0x0
> 3: [48..63]: 26624..26639 16 0x1
>
> This fragmented 32K will not be defragged by autodefrag.
>
> [FIX]
> To make things less weird, just populate extent_map::generation when
> reading file extents from disk.
>
> This would make above fragmented extents to be properly defragged:
>
> == before autodefrag ===
> /mnt/btrfs/file:
> EXT: FILE-OFFSET BLOCK-RANGE TOTAL FLAGS
> 0: [0..15]: 26672..26687 16 0x0
> 1: [16..31]: 26656..26671 16 0x0
> 2: [32..47]: 26640..26655 16 0x0
> 3: [48..63]: 26624..26639 16 0x1
> === After autodefrag ===
> /mnt/btrfs/file:
> EXT: FILE-OFFSET BLOCK-RANGE TOTAL FLAGS
> 0: [0..63]: 26688..26751 64 0x1
>
> Signed-off-by: Qu Wenruo <wqu@suse.com>
> ---
> Changelog:
> v2:
> - Update the commit message to include a reproducer
> Although this is not what we want (to reduce autodefrag IO),
> the behavior still worthy fixing anyway.
With the typos fixed, added to misc-next, thanks.
^ permalink raw reply [flat|nested] 3+ messages in thread
end of thread, other threads:[~2022-02-08 18:46 UTC | newest]
Thread overview: 3+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2022-02-08 5:31 [PATCH v2] btrfs: populate extent_map::generation when reading from disk Qu Wenruo
2022-02-08 16:39 ` Filipe Manana
2022-02-08 18:42 ` David Sterba
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).