All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH RFC] btrfs-progs: check: continue to check space cache if sb cache_generation is 0
@ 2021-02-15  9:20 Su Yue
  2021-02-17 22:18 ` Boris Burkov
  0 siblings, 1 reply; 3+ messages in thread
From: Su Yue @ 2021-02-15  9:20 UTC (permalink / raw)
  To: linux-btrfs; +Cc: l, nborisov, boris, dsterba

User reported that test fsck-tests/037-freespacetree-repair fails:
 # TEST=037\* ./fsck-tests.sh
    [TEST/fsck]   037-freespacetree-repair
btrfs check should have detected corruption
test failed for case 037-freespacetree-repair

The test tries to corrupt FST, call btrfs check readonly then repair FST
using btrfs check. Above case failed at the second readonly check steup.
Test log said "cache and super generation don't match, space cache will
be invalidated" which is printed by validate_free_space_cache().
If cache_generation of the superblock is not -1ULL,
validate_free_space_cache() requires that cache_generation must equal
to the superblock's generation. Otherwise, it skips the check of space
cache(v1, v2) like the above case where the sb cache_generation is 0.

Since kernel commit 948462294577 ("btrfs: keep sb cache_generation
consistent with space_cache"), sb cache_generation will be set to be 0
once space cache v1 is disabled(nospace_cache/space_cache=v2).

Fix it by adding the condition if sb cache_generation is 0 in
validate_free_space_cache() as it's valid now since the kernel commit
mentioned above.

Link: https://github.com/kdave/btrfs-progs/issues/338
Signed-off-by: Su Yue <l@damenly.su>

---
Hi, while reading free space cache v1 related code, I am (still)
confused about the value meanings of cache_generation in sb.

For outdated space cache v1, cache_gen < sb gen.
For valid space cache v1, cache_gen == sb gen.
AS for values 0 and (u64)-1, many places use (u64)-1 to represent
cleared v1 caches. The only three places setting cache_gen to 0 are
1)  above kernel commit 948462294577.
2)  kernel btrfs_parse_options() while mounting zoned device.
3)  progs image/main.c:1147 while restoring image.

I'm wondering whether loosing check condition in this patch is correct
or not. So make it RFC. Thanks.
---
 check/main.c | 5 +++++
 1 file changed, 5 insertions(+)

diff --git a/check/main.c b/check/main.c
index c7c5408bea19..a652e445de90 100644
--- a/check/main.c
+++ b/check/main.c
@@ -9951,7 +9951,12 @@ static int validate_free_space_cache(struct btrfs_root *root)
 {
 	int ret;
 
+	/*
+	 * If cache generation is between 0 and -1ULL, sb generation must equal to
+	 *   sb cache generation or the v1 space caches are outdated.
+	 */
 	if (btrfs_super_cache_generation(gfs_info->super_copy) != -1ULL &&
+	    btrfs_super_cache_generation(gfs_info->super_copy) != 0 &&
 	    btrfs_super_generation(gfs_info->super_copy) !=
 	    btrfs_super_cache_generation(gfs_info->super_copy)) {
 		printf(
-- 
2.30.0


^ permalink raw reply related	[flat|nested] 3+ messages in thread

* Re: [PATCH RFC] btrfs-progs: check: continue to check space cache if sb cache_generation is 0
  2021-02-15  9:20 [PATCH RFC] btrfs-progs: check: continue to check space cache if sb cache_generation is 0 Su Yue
@ 2021-02-17 22:18 ` Boris Burkov
  2021-02-18  0:55   ` Su Yue
  0 siblings, 1 reply; 3+ messages in thread
From: Boris Burkov @ 2021-02-17 22:18 UTC (permalink / raw)
  To: Su Yue; +Cc: linux-btrfs, nborisov, dsterba

On Mon, Feb 15, 2021 at 05:20:11PM +0800, Su Yue wrote:
> User reported that test fsck-tests/037-freespacetree-repair fails:
>  # TEST=037\* ./fsck-tests.sh
>     [TEST/fsck]   037-freespacetree-repair
> btrfs check should have detected corruption
> test failed for case 037-freespacetree-repair
> 
> The test tries to corrupt FST, call btrfs check readonly then repair FST
> using btrfs check. Above case failed at the second readonly check steup.
> Test log said "cache and super generation don't match, space cache will
> be invalidated" which is printed by validate_free_space_cache().
> If cache_generation of the superblock is not -1ULL,
> validate_free_space_cache() requires that cache_generation must equal
> to the superblock's generation. Otherwise, it skips the check of space
> cache(v1, v2) like the above case where the sb cache_generation is 0.
> 
> Since kernel commit 948462294577 ("btrfs: keep sb cache_generation
> consistent with space_cache"), sb cache_generation will be set to be 0
> once space cache v1 is disabled(nospace_cache/space_cache=v2).
> 
> Fix it by adding the condition if sb cache_generation is 0 in
> validate_free_space_cache() as it's valid now since the kernel commit
> mentioned above.

Sorry that I didn't notice the fsck issue.
The fix looks good to me.

> 
> Link: https://github.com/kdave/btrfs-progs/issues/338
> Signed-off-by: Su Yue <l@damenly.su>
Reviewed-by: Boris Burkov <boris@bur.io>
> 
> ---
> Hi, while reading free space cache v1 related code, I am (still)
> confused about the value meanings of cache_generation in sb.
> 
> For outdated space cache v1, cache_gen < sb gen.
> For valid space cache v1, cache_gen == sb gen.
> AS for values 0 and (u64)-1, many places use (u64)-1 to represent
> cleared v1 caches. The only three places setting cache_gen to 0 are
> 1)  above kernel commit 948462294577.
> 2)  kernel btrfs_parse_options() while mounting zoned device.
> 3)  progs image/main.c:1147 while restoring image.
> 
> I'm wondering whether loosing check condition in this patch is correct
> or not. So make it RFC. Thanks.

I did some archaeology on this, because when I wrote the patch that
broke this, I was sure that 0 meant "no space cache" and couldn't
actually answer your question without digging in.

I believe that the situation is as follows:

In the kernel code, super_cache_gen == 0 <-> no space cache. This has
been true since the introduction of the field in
'0af3d00bad38 Btrfs: create special free space cache inode'.
Another interesting kernel patch to note is:
'73bc187680f9 Btrfs: introduce mount option no_space_cache'
which explicitly mentions the check against super_cache_gen==0.

However, in btrfs-progs, as you have correctly pointed out, (u64)-1 is
the sentinel value of interest. The reason is that in the original
free-space-cache mkfs patch:
'e2a6859d9325 Btrfs-progs: update super fields for space cache'
-1 is used intentionally as a non-zero value to make space cache the
default setting. Quoting the patch:
"It also makes us set it to -1 on mkfs so any new filesystem will get
the space cache stuff turned on"
This was incompatible with the original check code which would fail on
freshly mkfs'd fses, which was fixed in:
b4f4473e8a888 'Btrfs-progs: don't output baffling message when checking
a fresh fs'

So tl;dr:
in the kernel: 0 means no cache, only became consistent with
nospace_cache recently, which broke fsck
in progs: 0 still means no cache, -1 is a sentinel in mkfs to force the
first transaction to set it to the real generation value. -1 was added
to the fsck for fresh fses, but the 0 case never came up until now.

Thanks for looking into this and sending a fix,
Boris
> ---
>  check/main.c | 5 +++++
>  1 file changed, 5 insertions(+)
> 
> diff --git a/check/main.c b/check/main.c
> index c7c5408bea19..a652e445de90 100644
> --- a/check/main.c
> +++ b/check/main.c
> @@ -9951,7 +9951,12 @@ static int validate_free_space_cache(struct btrfs_root *root)
>  {
>  	int ret;
>  
> +	/*
> +	 * If cache generation is between 0 and -1ULL, sb generation must equal to
> +	 *   sb cache generation or the v1 space caches are outdated.
> +	 */
>  	if (btrfs_super_cache_generation(gfs_info->super_copy) != -1ULL &&
> +	    btrfs_super_cache_generation(gfs_info->super_copy) != 0 &&
>  	    btrfs_super_generation(gfs_info->super_copy) !=
>  	    btrfs_super_cache_generation(gfs_info->super_copy)) {
>  		printf(
> -- 
> 2.30.0
> 

^ permalink raw reply	[flat|nested] 3+ messages in thread

* Re: [PATCH RFC] btrfs-progs: check: continue to check space cache if sb cache_generation is 0
  2021-02-17 22:18 ` Boris Burkov
@ 2021-02-18  0:55   ` Su Yue
  0 siblings, 0 replies; 3+ messages in thread
From: Su Yue @ 2021-02-18  0:55 UTC (permalink / raw)
  To: Boris Burkov; +Cc: linux-btrfs, nborisov, dsterba


On Thu 18 Feb 2021 at 06:18, Boris Burkov <boris@bur.io> wrote:

> On Mon, Feb 15, 2021 at 05:20:11PM +0800, Su Yue wrote:
>> User reported that test fsck-tests/037-freespacetree-repair 
>> fails:
>>  # TEST=037\* ./fsck-tests.sh
>>     [TEST/fsck]   037-freespacetree-repair
>> btrfs check should have detected corruption
>> test failed for case 037-freespacetree-repair
>>
>> The test tries to corrupt FST, call btrfs check readonly then 
>> repair FST
>> using btrfs check. Above case failed at the second readonly 
>> check steup.
>> Test log said "cache and super generation don't match, space 
>> cache will
>> be invalidated" which is printed by 
>> validate_free_space_cache().
>> If cache_generation of the superblock is not -1ULL,
>> validate_free_space_cache() requires that cache_generation must 
>> equal
>> to the superblock's generation. Otherwise, it skips the check 
>> of space
>> cache(v1, v2) like the above case where the sb cache_generation 
>> is 0.
>>
>> Since kernel commit 948462294577 ("btrfs: keep sb 
>> cache_generation
>> consistent with space_cache"), sb cache_generation will be set 
>> to be 0
>> once space cache v1 is disabled(nospace_cache/space_cache=v2).
>>
>> Fix it by adding the condition if sb cache_generation is 0 in
>> validate_free_space_cache() as it's valid now since the kernel 
>> commit
>> mentioned above.
>
> Sorry that I didn't notice the fsck issue.
> The fix looks good to me.
>
>>
>> Link: https://github.com/kdave/btrfs-progs/issues/338
>> Signed-off-by: Su Yue <l@damenly.su>
> Reviewed-by: Boris Burkov <boris@bur.io>
>>
>> ---
>> Hi, while reading free space cache v1 related code, I am 
>> (still)
>> confused about the value meanings of cache_generation in sb.
>>
>> For outdated space cache v1, cache_gen < sb gen.
>> For valid space cache v1, cache_gen == sb gen.
>> AS for values 0 and (u64)-1, many places use (u64)-1 to 
>> represent
>> cleared v1 caches. The only three places setting cache_gen to 0 
>> are
>> 1)  above kernel commit 948462294577.
>> 2)  kernel btrfs_parse_options() while mounting zoned device.
>> 3)  progs image/main.c:1147 while restoring image.
>>
>> I'm wondering whether loosing check condition in this patch is 
>> correct
>> or not. So make it RFC. Thanks.
>
> I did some archaeology on this, because when I wrote the patch 
> that
> broke this, I was sure that 0 meant "no space cache" and 
> couldn't
> actually answer your question without digging in.
>
> I believe that the situation is as follows:
>
> In the kernel code, super_cache_gen == 0 <-> no space cache. 
> This has
> been true since the introduction of the field in
> '0af3d00bad38 Btrfs: create special free space cache inode'.
> Another interesting kernel patch to note is:
> '73bc187680f9 Btrfs: introduce mount option no_space_cache'
> which explicitly mentions the check against super_cache_gen==0.
>
> However, in btrfs-progs, as you have correctly pointed out, 
> (u64)-1 is
> the sentinel value of interest. The reason is that in the 
> original
> free-space-cache mkfs patch:
> 'e2a6859d9325 Btrfs-progs: update super fields for space cache'
> -1 is used intentionally as a non-zero value to make space cache 
> the
> default setting. Quoting the patch:
> "It also makes us set it to -1 on mkfs so any new filesystem 
> will get
> the space cache stuff turned on"
> This was incompatible with the original check code which would 
> fail on
> freshly mkfs'd fses, which was fixed in:
> b4f4473e8a888 'Btrfs-progs: don't output baffling message when 
> checking
> a fresh fs'
>

Thanks for your detailed explanation! It's historical indeed.

> So tl;dr:
> in the kernel: 0 means no cache, only became consistent with
> nospace_cache recently, which broke fsck
> in progs: 0 still means no cache, -1 is a sentinel in mkfs to 
> force the
> first transaction to set it to the real generation value. -1 was 
> added
> to the fsck for fresh fses, but the 0 case never came up until 
> now.
>
That makes sense.

Thanks,
Su
> Thanks for looking into this and sending a fix,
> Boris
>> ---
>>  check/main.c | 5 +++++
>>  1 file changed, 5 insertions(+)
>>
>> diff --git a/check/main.c b/check/main.c
>> index c7c5408bea19..a652e445de90 100644
>> --- a/check/main.c
>> +++ b/check/main.c
>> @@ -9951,7 +9951,12 @@ static int 
>> validate_free_space_cache(struct btrfs_root *root)
>>  {
>>  	int ret;
>>
>> +	/*
>> +	 * If cache generation is between 0 and -1ULL, sb 
>> generation must equal to
>> +	 *   sb cache generation or the v1 space caches are 
>> outdated.
>> +	 */
>>  	if (btrfs_super_cache_generation(gfs_info->super_copy) != 
>>  -1ULL &&
>> +	    btrfs_super_cache_generation(gfs_info->super_copy) != 
>> 0 &&
>>  	    btrfs_super_generation(gfs_info->super_copy) !=
>>  	    btrfs_super_cache_generation(gfs_info->super_copy)) {
>>  		printf(
>> --
>> 2.30.0
>>


^ permalink raw reply	[flat|nested] 3+ messages in thread

end of thread, other threads:[~2021-02-18  0:56 UTC | newest]

Thread overview: 3+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2021-02-15  9:20 [PATCH RFC] btrfs-progs: check: continue to check space cache if sb cache_generation is 0 Su Yue
2021-02-17 22:18 ` Boris Burkov
2021-02-18  0:55   ` Su Yue

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.