* [PATCH 1/4] Btrfs: fix defragmentation regression
@ 2011-09-02 7:56 Li Zefan
2011-09-02 7:56 ` [PATCH 2/4] Btrfs: use i_size_read() in btrfs_defrag_file() Li Zefan
` (3 more replies)
0 siblings, 4 replies; 9+ messages in thread
From: Li Zefan @ 2011-09-02 7:56 UTC (permalink / raw)
To: linux-btrfs
There's an off-by-one bug:
# create a file with lots of 4K file extents
# btrfs fi defrag /mnt/file
# sync
# filefrag -v /mnt/file
Filesystem type is: 9123683e
File size of /mnt/file is 1228800 (300 blocks, blocksize 4096)
ext logical physical expected length flags
0 0 3372 64
1 64 3136 3435 1
2 65 3436 3136 64
3 129 3201 3499 1
4 130 3500 3201 64
5 194 3266 3563 1
6 195 3564 3266 64
7 259 3331 3627 1
8 260 3628 3331 40 eof
After this patch:
...
# filefrag -v /mnt/file
Filesystem type is: 9123683e
File size of /mnt/file is 1228800 (300 blocks, blocksize 4096)
ext logical physical expected length flags
0 0 3372 300 eof
/mnt/file: 1 extent found
Signed-off-by: Li Zefan <lizf@cn.fujitsu.com>
---
fs/btrfs/ioctl.c | 6 ++++--
1 files changed, 4 insertions(+), 2 deletions(-)
diff --git a/fs/btrfs/ioctl.c b/fs/btrfs/ioctl.c
index 970977a..31fe6d4 100644
--- a/fs/btrfs/ioctl.c
+++ b/fs/btrfs/ioctl.c
@@ -1081,7 +1081,6 @@ int btrfs_defrag_file(struct inode *inode, struct file *file,
defrag_count += ret;
balance_dirty_pages_ratelimited_nr(inode->i_mapping, ret);
- i += ret;
if (newer_than) {
if (newer_off == (u64)-1)
@@ -1101,7 +1100,10 @@ int btrfs_defrag_file(struct inode *inode, struct file *file,
break;
}
} else {
- i++;
+ if (ret > 0)
+ i += ret;
+ else
+ i++;
}
}
--
1.7.3.1
^ permalink raw reply related [flat|nested] 9+ messages in thread
* [PATCH 2/4] Btrfs: use i_size_read() in btrfs_defrag_file()
2011-09-02 7:56 [PATCH 1/4] Btrfs: fix defragmentation regression Li Zefan
@ 2011-09-02 7:56 ` Li Zefan
2011-09-22 10:58 ` David Sterba
2011-09-02 7:56 ` [PATCH 3/4] Btrfs: fix wrong max_to_defrag " Li Zefan
` (2 subsequent siblings)
3 siblings, 1 reply; 9+ messages in thread
From: Li Zefan @ 2011-09-02 7:56 UTC (permalink / raw)
To: linux-btrfs
Don't use inode->i_size directly, since we're not holding i_mutex.
This also fixes another bug, that i_size can change after it's checked
against 0 and then (i_size - 1) can be negative.
Signed-off-by: Li Zefan <lizf@cn.fujitsu.com>
---
fs/btrfs/ioctl.c | 7 ++++---
1 files changed, 4 insertions(+), 3 deletions(-)
diff --git a/fs/btrfs/ioctl.c b/fs/btrfs/ioctl.c
index 31fe6d4..6f2b257 100644
--- a/fs/btrfs/ioctl.c
+++ b/fs/btrfs/ioctl.c
@@ -972,6 +972,7 @@ int btrfs_defrag_file(struct inode *inode, struct file *file,
struct btrfs_super_block *disk_super;
struct file_ra_state *ra = NULL;
unsigned long last_index;
+ u64 isize = i_size_read(inode);
u64 features;
u64 last_len = 0;
u64 skip = 0;
@@ -997,7 +998,7 @@ int btrfs_defrag_file(struct inode *inode, struct file *file,
compress_type = range->compress_type;
}
- if (inode->i_size == 0)
+ if (isize == 0)
return 0;
/*
@@ -1022,10 +1023,10 @@ int btrfs_defrag_file(struct inode *inode, struct file *file,
/* find the last page to defrag */
if (range->start + range->len > range->start) {
- last_index = min_t(u64, inode->i_size - 1,
+ last_index = min_t(u64, isize - 1,
range->start + range->len - 1) >> PAGE_CACHE_SHIFT;
} else {
- last_index = (inode->i_size - 1) >> PAGE_CACHE_SHIFT;
+ last_index = (isize - 1) >> PAGE_CACHE_SHIFT;
}
if (newer_than) {
--
1.7.3.1
^ permalink raw reply related [flat|nested] 9+ messages in thread
* [PATCH 3/4] Btrfs: fix wrong max_to_defrag in btrfs_defrag_file()
2011-09-02 7:56 [PATCH 1/4] Btrfs: fix defragmentation regression Li Zefan
2011-09-02 7:56 ` [PATCH 2/4] Btrfs: use i_size_read() in btrfs_defrag_file() Li Zefan
@ 2011-09-02 7:56 ` Li Zefan
2011-09-22 10:42 ` David Sterba
2011-09-02 7:57 ` [PATCH 4/4] Btrfs: honor extent thresh during defragmentation Li Zefan
2011-09-02 8:42 ` [PATCH 1/4] Btrfs: fix defragmentation regression Christoph Hellwig
3 siblings, 1 reply; 9+ messages in thread
From: Li Zefan @ 2011-09-02 7:56 UTC (permalink / raw)
To: linux-btrfs
It's off-by-one, and thus we may skip the last page while defragmenting.
An example case:
# create /mnt/file with 2 4K file extents
# btrfs fi defrag /mnt/file
# sync
# filefrag /mnt/file
/mnt/file: 2 extents found
So it's not defragmented.
Signed-off-by: Li Zefan <lizf@cn.fujitsu.com>
---
fs/btrfs/ioctl.c | 2 +-
1 files changed, 1 insertions(+), 1 deletions(-)
diff --git a/fs/btrfs/ioctl.c b/fs/btrfs/ioctl.c
index 6f2b257..57aa5b7 100644
--- a/fs/btrfs/ioctl.c
+++ b/fs/btrfs/ioctl.c
@@ -1046,7 +1046,7 @@ int btrfs_defrag_file(struct inode *inode, struct file *file,
i = range->start >> PAGE_CACHE_SHIFT;
}
if (!max_to_defrag)
- max_to_defrag = last_index - 1;
+ max_to_defrag = last_index;
while (i <= last_index && defrag_count < max_to_defrag) {
/*
--
1.7.3.1
^ permalink raw reply related [flat|nested] 9+ messages in thread
* [PATCH 4/4] Btrfs: honor extent thresh during defragmentation
2011-09-02 7:56 [PATCH 1/4] Btrfs: fix defragmentation regression Li Zefan
2011-09-02 7:56 ` [PATCH 2/4] Btrfs: use i_size_read() in btrfs_defrag_file() Li Zefan
2011-09-02 7:56 ` [PATCH 3/4] Btrfs: fix wrong max_to_defrag " Li Zefan
@ 2011-09-02 7:57 ` Li Zefan
2011-09-02 8:42 ` [PATCH 1/4] Btrfs: fix defragmentation regression Christoph Hellwig
3 siblings, 0 replies; 9+ messages in thread
From: Li Zefan @ 2011-09-02 7:57 UTC (permalink / raw)
To: linux-btrfs
We won't defrag an extent, if it's bigger than the threshold we
specified and there's no small extent before it, but actually
the code doesn't work this way.
There are three bugs:
- When should_defrag_range() decides we should keep on defragmenting
an extent, last_len is not incremented. (old bug)
- The length that passes to should_defrag_range() is not the length
we're going to defrag. (new bug)
- We always defrag 256K bytes data, and a big extent can be part of
this range. (new bug)
For a file with 4 extents:
| 4K | 4K | 256K | 256K |
The result of defrag with (the default) 256K extent thresh should be:
| 264K | 256K |
but with those bugs, we'll get:
| 520K |
Signed-off-by: Li Zefan <lizf@cn.fujitsu.com>
---
fs/btrfs/ioctl.c | 37 ++++++++++++++++++++++++++-----------
1 files changed, 26 insertions(+), 11 deletions(-)
diff --git a/fs/btrfs/ioctl.c b/fs/btrfs/ioctl.c
index 57aa5b7..90d7904 100644
--- a/fs/btrfs/ioctl.c
+++ b/fs/btrfs/ioctl.c
@@ -760,7 +760,7 @@ static int should_defrag_range(struct inode *inode, u64 start, u64 len,
int ret = 1;
/*
- * make sure that once we start defragging and extent, we keep on
+ * make sure that once we start defragging an extent, we keep on
* defragging it
*/
if (start < *defrag_end)
@@ -805,7 +805,6 @@ static int should_defrag_range(struct inode *inode, u64 start, u64 len,
* extent will force at least part of that big extent to be defragged.
*/
if (ret) {
- *last_len += len;
*defrag_end = extent_map_end(em);
} else {
*last_len = 0;
@@ -978,13 +977,14 @@ int btrfs_defrag_file(struct inode *inode, struct file *file,
u64 skip = 0;
u64 defrag_end = 0;
u64 newer_off = range->start;
- int newer_left = 0;
unsigned long i;
+ unsigned long ra_index = 0;
int ret;
int defrag_count = 0;
int compress_type = BTRFS_COMPRESS_ZLIB;
int extent_thresh = range->extent_thresh;
- int newer_cluster = (256 * 1024) >> PAGE_CACHE_SHIFT;
+ int max_cluster = (256 * 1024) >> PAGE_CACHE_SHIFT;
+ int cluster = max_cluster;
u64 new_align = ~((u64)128 * 1024 - 1);
struct page **pages = NULL;
@@ -1014,7 +1014,7 @@ int btrfs_defrag_file(struct inode *inode, struct file *file,
ra = &file->f_ra;
}
- pages = kmalloc(sizeof(struct page *) * newer_cluster,
+ pages = kmalloc(sizeof(struct page *) * max_cluster,
GFP_NOFS);
if (!pages) {
ret = -ENOMEM;
@@ -1039,7 +1039,6 @@ int btrfs_defrag_file(struct inode *inode, struct file *file,
* the extents in the file evenly spaced
*/
i = (newer_off & new_align) >> PAGE_CACHE_SHIFT;
- newer_left = newer_cluster;
} else
goto out_ra;
} else {
@@ -1071,12 +1070,26 @@ int btrfs_defrag_file(struct inode *inode, struct file *file,
i = max(i + 1, next);
continue;
}
+
+ if (!newer_than) {
+ cluster = (PAGE_CACHE_ALIGN(defrag_end) >>
+ PAGE_CACHE_SHIFT) - i;
+ cluster = min(cluster, max_cluster);
+ } else {
+ cluster = max_cluster;
+ }
+
if (range->flags & BTRFS_DEFRAG_RANGE_COMPRESS)
BTRFS_I(inode)->force_compress = compress_type;
- btrfs_force_ra(inode->i_mapping, ra, file, i, newer_cluster);
+ if (i + cluster > ra_index) {
+ ra_index = max(i, ra_index);
+ btrfs_force_ra(inode->i_mapping, ra, file, ra_index,
+ cluster);
+ ra_index += max_cluster;
+ }
- ret = cluster_pages_for_defrag(inode, pages, i, newer_cluster);
+ ret = cluster_pages_for_defrag(inode, pages, i, cluster);
if (ret < 0)
goto out_ra;
@@ -1096,15 +1109,17 @@ int btrfs_defrag_file(struct inode *inode, struct file *file,
if (!ret) {
range->start = newer_off;
i = (newer_off & new_align) >> PAGE_CACHE_SHIFT;
- newer_left = newer_cluster;
} else {
break;
}
} else {
- if (ret > 0)
+ if (ret > 0) {
i += ret;
- else
+ last_len += ret << PAGE_CACHE_SHIFT;
+ } else {
i++;
+ last_len = 0;
+ }
}
}
--
1.7.3.1
^ permalink raw reply related [flat|nested] 9+ messages in thread
* Re: [PATCH 1/4] Btrfs: fix defragmentation regression
2011-09-02 7:56 [PATCH 1/4] Btrfs: fix defragmentation regression Li Zefan
` (2 preceding siblings ...)
2011-09-02 7:57 ` [PATCH 4/4] Btrfs: honor extent thresh during defragmentation Li Zefan
@ 2011-09-02 8:42 ` Christoph Hellwig
2011-10-18 8:48 ` Dan Merillat
3 siblings, 1 reply; 9+ messages in thread
From: Christoph Hellwig @ 2011-09-02 8:42 UTC (permalink / raw)
To: Li Zefan; +Cc: linux-btrfs
On Fri, Sep 02, 2011 at 03:56:25PM +0800, Li Zefan wrote:
> There's an off-by-one bug:
>
> # create a file with lots of 4K file extents
> # btrfs fi defrag /mnt/file
> # sync
> # filefrag -v /mnt/file
> Filesystem type is: 9123683e
> File size of /mnt/file is 1228800 (300 blocks, blocksize 4096)
> ext logical physical expected length flags
> 0 0 3372 64
> 1 64 3136 3435 1
> 2 65 3436 3136 64
> 3 129 3201 3499 1
> 4 130 3500 3201 64
> 5 194 3266 3563 1
> 6 195 3564 3266 64
> 7 259 3331 3627 1
> 8 260 3628 3331 40 eof
>
> After this patch:
Can you please create an xfstests testcase for this?
^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [PATCH 3/4] Btrfs: fix wrong max_to_defrag in btrfs_defrag_file()
2011-09-02 7:56 ` [PATCH 3/4] Btrfs: fix wrong max_to_defrag " Li Zefan
@ 2011-09-22 10:42 ` David Sterba
0 siblings, 0 replies; 9+ messages in thread
From: David Sterba @ 2011-09-22 10:42 UTC (permalink / raw)
To: Li Zefan; +Cc: linux-btrfs
On Fri, Sep 02, 2011 at 03:56:55PM +0800, Li Zefan wrote:
> It's off-by-one, and thus we may skip the last page while defragmenting.
Good catch.
>
> An example case:
>
> # create /mnt/file with 2 4K file extents
> # btrfs fi defrag /mnt/file
> # sync
> # filefrag /mnt/file
> /mnt/file: 2 extents found
>
> So it's not defragmented.
>
> Signed-off-by: Li Zefan <lizf@cn.fujitsu.com>
Reviewed-by: David Sterba <dsterba@suse.cz>
> ---
> fs/btrfs/ioctl.c | 2 +-
> 1 files changed, 1 insertions(+), 1 deletions(-)
>
> diff --git a/fs/btrfs/ioctl.c b/fs/btrfs/ioctl.c
> index 6f2b257..57aa5b7 100644
> --- a/fs/btrfs/ioctl.c
> +++ b/fs/btrfs/ioctl.c
> @@ -1046,7 +1046,7 @@ int btrfs_defrag_file(struct inode *inode, struct file *file,
> i = range->start >> PAGE_CACHE_SHIFT;
> }
> if (!max_to_defrag)
> - max_to_defrag = last_index - 1;
> + max_to_defrag = last_index;
>
> while (i <= last_index && defrag_count < max_to_defrag) {
> /*
> --
> 1.7.3.1
> --
> To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at http://vger.kernel.org/majordomo-info.html
^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [PATCH 2/4] Btrfs: use i_size_read() in btrfs_defrag_file()
2011-09-02 7:56 ` [PATCH 2/4] Btrfs: use i_size_read() in btrfs_defrag_file() Li Zefan
@ 2011-09-22 10:58 ` David Sterba
0 siblings, 0 replies; 9+ messages in thread
From: David Sterba @ 2011-09-22 10:58 UTC (permalink / raw)
To: Li Zefan; +Cc: linux-btrfs
On Fri, Sep 02, 2011 at 03:56:39PM +0800, Li Zefan wrote:
> Don't use inode->i_size directly, since we're not holding i_mutex.
>
> This also fixes another bug, that i_size can change after it's checked
> against 0 and then (i_size - 1) can be negative.
>
> Signed-off-by: Li Zefan <lizf@cn.fujitsu.com>
Reviewed-by: David Sterba <dsterba@suse.cz>
> ---
> fs/btrfs/ioctl.c | 7 ++++---
> 1 files changed, 4 insertions(+), 3 deletions(-)
>
> diff --git a/fs/btrfs/ioctl.c b/fs/btrfs/ioctl.c
> index 31fe6d4..6f2b257 100644
> --- a/fs/btrfs/ioctl.c
> +++ b/fs/btrfs/ioctl.c
> @@ -972,6 +972,7 @@ int btrfs_defrag_file(struct inode *inode, struct file *file,
> struct btrfs_super_block *disk_super;
> struct file_ra_state *ra = NULL;
> unsigned long last_index;
> + u64 isize = i_size_read(inode);
> u64 features;
> u64 last_len = 0;
> u64 skip = 0;
> @@ -997,7 +998,7 @@ int btrfs_defrag_file(struct inode *inode, struct file *file,
> compress_type = range->compress_type;
> }
>
> - if (inode->i_size == 0)
> + if (isize == 0)
> return 0;
>
> /*
> @@ -1022,10 +1023,10 @@ int btrfs_defrag_file(struct inode *inode, struct file *file,
>
> /* find the last page to defrag */
> if (range->start + range->len > range->start) {
> - last_index = min_t(u64, inode->i_size - 1,
> + last_index = min_t(u64, isize - 1,
> range->start + range->len - 1) >> PAGE_CACHE_SHIFT;
> } else {
> - last_index = (inode->i_size - 1) >> PAGE_CACHE_SHIFT;
> + last_index = (isize - 1) >> PAGE_CACHE_SHIFT;
> }
>
> if (newer_than) {
> --
> 1.7.3.1
> --
> To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at http://vger.kernel.org/majordomo-info.html
^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [PATCH 1/4] Btrfs: fix defragmentation regression
2011-09-02 8:42 ` [PATCH 1/4] Btrfs: fix defragmentation regression Christoph Hellwig
@ 2011-10-18 8:48 ` Dan Merillat
2011-10-18 8:52 ` Li Zefan
0 siblings, 1 reply; 9+ messages in thread
From: Dan Merillat @ 2011-10-18 8:48 UTC (permalink / raw)
To: Christoph Hellwig; +Cc: Li Zefan, linux-btrfs
On Fri, Sep 2, 2011 at 4:42 AM, Christoph Hellwig <hch@infradead.org> w=
rote:
> On Fri, Sep 02, 2011 at 03:56:25PM +0800, Li Zefan wrote:
>> There's an off-by-one bug:
>>
>> =A0 # create a file with lots of 4K file extents
>> =A0 # btrfs fi defrag /mnt/file
>> =A0 # sync
>> =A0 # filefrag -v /mnt/file
>> =A0 Filesystem type is: 9123683e
>> =A0 File size of /mnt/file is 1228800 (300 blocks, blocksize 4096)
>> =A0 =A0ext logical physical expected length flags
>> =A0 =A0 =A00 =A0 =A0 =A0 0 =A0 =A0 3372 =A0 =A0 =A0 =A0 =A0 =A0 =A06=
4
>> =A0 =A0 =A01 =A0 =A0 =A064 =A0 =A0 3136 =A0 =A0 3435 =A0 =A0 =A01
>> =A0 =A0 =A02 =A0 =A0 =A065 =A0 =A0 3436 =A0 =A0 3136 =A0 =A0 64
>> =A0 =A0 =A03 =A0 =A0 129 =A0 =A0 3201 =A0 =A0 3499 =A0 =A0 =A01
>> =A0 =A0 =A04 =A0 =A0 130 =A0 =A0 3500 =A0 =A0 3201 =A0 =A0 64
>> =A0 =A0 =A05 =A0 =A0 194 =A0 =A0 3266 =A0 =A0 3563 =A0 =A0 =A01
>> =A0 =A0 =A06 =A0 =A0 195 =A0 =A0 3564 =A0 =A0 3266 =A0 =A0 64
>> =A0 =A0 =A07 =A0 =A0 259 =A0 =A0 3331 =A0 =A0 3627 =A0 =A0 =A01
>> =A0 =A0 =A08 =A0 =A0 260 =A0 =A0 3628 =A0 =A0 3331 =A0 =A0 40 eof
>>
>> After this patch:
>
> Can you please create an xfstests testcase for this?
Did this fix get lost? I don't see it in git, and defragmenting a
file still results in 10x as many fragments as it started with.
(3.1-rc9)
--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs" =
in
the body of a message to majordomo@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [PATCH 1/4] Btrfs: fix defragmentation regression
2011-10-18 8:48 ` Dan Merillat
@ 2011-10-18 8:52 ` Li Zefan
0 siblings, 0 replies; 9+ messages in thread
From: Li Zefan @ 2011-10-18 8:52 UTC (permalink / raw)
To: Dan Merillat; +Cc: Christoph Hellwig, linux-btrfs
16:48, Dan Merillat wrote:
> On Fri, Sep 2, 2011 at 4:42 AM, Christoph Hellwig <hch@infradead.org> wrote:
>> On Fri, Sep 02, 2011 at 03:56:25PM +0800, Li Zefan wrote:
>>> There's an off-by-one bug:
>>>
>>> # create a file with lots of 4K file extents
>>> # btrfs fi defrag /mnt/file
>>> # sync
>>> # filefrag -v /mnt/file
>>> Filesystem type is: 9123683e
>>> File size of /mnt/file is 1228800 (300 blocks, blocksize 4096)
>>> ext logical physical expected length flags
>>> 0 0 3372 64
>>> 1 64 3136 3435 1
>>> 2 65 3436 3136 64
>>> 3 129 3201 3499 1
>>> 4 130 3500 3201 64
>>> 5 194 3266 3563 1
>>> 6 195 3564 3266 64
>>> 7 259 3331 3627 1
>>> 8 260 3628 3331 40 eof
>>>
>>> After this patch:
>>
>> Can you please create an xfstests testcase for this?
>
> Did this fix get lost? I don't see it in git, and defragmenting a
> file still results in 10x as many fragments as it started with.
> (3.1-rc9)
>
No, it's queued for 3.2, but I think it's a good candidate for 3.1.x.
^ permalink raw reply [flat|nested] 9+ messages in thread
end of thread, other threads:[~2011-10-18 8:52 UTC | newest]
Thread overview: 9+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2011-09-02 7:56 [PATCH 1/4] Btrfs: fix defragmentation regression Li Zefan
2011-09-02 7:56 ` [PATCH 2/4] Btrfs: use i_size_read() in btrfs_defrag_file() Li Zefan
2011-09-22 10:58 ` David Sterba
2011-09-02 7:56 ` [PATCH 3/4] Btrfs: fix wrong max_to_defrag " Li Zefan
2011-09-22 10:42 ` David Sterba
2011-09-02 7:57 ` [PATCH 4/4] Btrfs: honor extent thresh during defragmentation Li Zefan
2011-09-02 8:42 ` [PATCH 1/4] Btrfs: fix defragmentation regression Christoph Hellwig
2011-10-18 8:48 ` Dan Merillat
2011-10-18 8:52 ` Li Zefan
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.