* Leaf corruption due to csum range
@ 2021-05-10 20:50 Philipp Fent
2021-05-11 8:18 ` Wang Yugui
2021-05-11 8:56 ` Filipe Manana
0 siblings, 2 replies; 8+ messages in thread
From: Philipp Fent @ 2021-05-10 20:50 UTC
To: linux-btrfs
I encountered a btrfs error on my system. I run Microsoft SQL Server in
a Docker container on a btrfs filesystem on an SSD. When bulk-loading
some benchmark data, my system reproducibly enters the following
failure state:
[ 366.665714] BTRFS critical (device sda): corrupt leaf: root=18446744073709551610 block=507544305664 slot=0, csum end range (308900515840) goes beyond the start range (308900384768) of the next csum item
[ 366.665723] BTRFS info (device sda): leaf 507544305664 gen 18292 total ptrs 4 free space 3 owner 18446744073709551610
[ 366.665725] item 0 key (18446744073709551606 128 308891275264) itemoff 7259 itemsize 9024
[ 366.665727] item 1 key (18446744073709551606 128 308900384768) itemoff 7067 itemsize 192
[ 366.665728] item 2 key (18446744073709551606 128 309036716032) itemoff 2587 itemsize 4480
[ 366.665730] item 3 key (18446744073709551606 128 309041303552) itemoff 103 itemsize 2484
[ 366.665731] BTRFS error (device sda): block=507544305664 write time tree block corruption detected
[ 366.665821] BTRFS: error (device sda) in btrfs_sync_log:3136: errno=-5 IO failure
[ 366.665824] BTRFS info (device sda): forced readonly
Please note the erroring ranges:
  csum end:   308900515840
  Start next: 308900384768
which is a difference of (1 << 17) == 0b100000000000000000 == 128 KiB.
To me, this looks suspiciously like an off-by-one error, but I'm not too
versed in debugging btrfs.
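
As a sanity check on the numbers above, here is a minimal Python sketch that recomputes the end of each csum item from the leaf dump and looks for overlaps with the next item. It assumes 4-byte crc32c checksums and a 4096-byte sector size (the btrfs defaults; neither is stated in the log), and the helper names `csum_end` and `find_overlaps` are made up for illustration. It mirrors what the tree-checker message describes, not the kernel code itself.

```python
# Assumptions (not stated in the log): 4-byte crc32c checksums and a
# 4096-byte sector size, which are the btrfs defaults.
CSUM_SIZE = 4
SECTOR_SIZE = 4096

# (key offset, item size) for the four csum items in the reported leaf.
ITEMS = [
    (308891275264, 9024),
    (308900384768, 192),
    (309036716032, 4480),
    (309041303552, 2484),
]


def csum_end(key_offset, item_size):
    """End of the data range covered by one csum item."""
    return key_offset + (item_size // CSUM_SIZE) * SECTOR_SIZE


def find_overlaps(items):
    """Return (slot, end, next_start) for items that run into the next one."""
    overlaps = []
    for slot in range(len(items) - 1):
        end = csum_end(*items[slot])
        next_start = items[slot + 1][0]
        if end > next_start:
            overlaps.append((slot, end, next_start))
    return overlaps


for slot, end, next_start in find_overlaps(ITEMS):
    print(f"slot={slot} csum end {end} > next start {next_start} "
          f"(overlap {end - next_start} bytes)")
```

Under these assumptions, item 0 covers 9024 / 4 = 2256 sectors and ends at exactly the 308900515840 reported by the kernel, so it overlaps item 1 by 131072 bytes (32 sectors). Items 1 and 2 do not run into their successors.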
I reproduced this several times on my machine using the attached
scripts. The only obvious similarity between the crashes is this 128 KiB
gap between csum end and start next. Sometimes I get one corrupt leaf,
sometimes many. I tried to reproduce it on another machine with an HDD,
but didn't encounter this error there.
Can you help me debug this further?
# uname -a
Linux desk 5.12.2-arch1-1 #1 SMP PREEMPT Fri, 07 May 2021 15:36:06 +0000 x86_64 GNU/Linux
# btrfs --version
btrfs-progs v5.11.1
# btrfs fi show
Label: none uuid: 6733acf5-be40-4fe2-9d6f-819d39e49720
Total devices 1 FS bytes used 187.11GiB
devid 1 size 931.51GiB used 208.03GiB path /dev/sda
# btrfs fi df /ssdSpace
Data, single: total=207.00GiB, used=186.67GiB
System, single: total=32.00MiB, used=48.00KiB
Metadata, single: total=1.00GiB, used=450.08MiB
GlobalReserve, single: total=215.41MiB, used=0.00B
[-- Attachment #2: btrfsbug.tar.gz --]
[-- Type: application/gzip, Size: 1911 bytes --]
* Re: Leaf corruption due to csum range
2021-05-10 20:50 Leaf corruption due to csum range Philipp Fent
@ 2021-05-11 8:18 ` Wang Yugui
2021-05-11 8:44 ` Qu Wenruo
2021-05-11 8:56 ` Filipe Manana
1 sibling, 1 reply; 8+ messages in thread
From: Wang Yugui @ 2021-05-11 8:18 UTC
To: Philipp Fent; +Cc: linux-btrfs
Hi,
The last report of 'write time tree block corruption detected' on this
list was attributed to a memory (ECC) error:
From: chil L1n <devchill1n@gmail.com>
To: linux-btrfs@vger.kernel.org
Date: Sat, 6 Mar 2021 10:10:11 +0100
Subject: btrfs error: write time tree block corruption detected
Is this a server with ECC memory?
Best Regards
Wang Yugui (wangyugui@e16-tech.com)
2021/05/11
* Re: Leaf corruption due to csum range
2021-05-11 8:18 ` Wang Yugui
@ 2021-05-11 8:44 ` Qu Wenruo
0 siblings, 0 replies; 8+ messages in thread
From: Qu Wenruo @ 2021-05-11 8:44 UTC
To: Wang Yugui, Philipp Fent; +Cc: linux-btrfs
On 2021/5/11 4:18 PM, Wang Yugui wrote:
> hi,
>
> the last 'write time tree block corruption detected' is marked as
> memory ECC error.
So ECC can fail to recover from a bitflip?
I can't even rely on ECC memory nowadays?
(At least the tree-checker rocks again)
Thanks,
QU
* Re: Leaf corruption due to csum range
2021-05-10 20:50 Leaf corruption due to csum range Philipp Fent
2021-05-11 8:18 ` Wang Yugui
@ 2021-05-11 8:56 ` Filipe Manana
[not found] ` <ad414944-2418-3728-ac1a-5d4d37e37ac1@in.tum.de>
1 sibling, 1 reply; 8+ messages in thread
From: Filipe Manana @ 2021-05-11 8:56 UTC
To: Philipp Fent; +Cc: linux-btrfs
On Mon, May 10, 2021 at 10:01 PM Philipp Fent <fent@in.tum.de> wrote:
>
> I encountered a btrfs error on my system. I run Microsoft SQL Server in
> a Docker container on a btrfs filesystem on an SSD. When bulk-loading
> some benchmark data, my system reproducibly enters the following
> failure state:
>
> [ 366.665714] BTRFS critical (device sda): corrupt leaf: root=18446744073709551610 block=507544305664 slot=0, csum end range (308900515840) goes beyond the start range (308900384768) of the next csum item
> [ 366.665723] BTRFS info (device sda): leaf 507544305664 gen 18292 total ptrs 4 free space 3 owner 18446744073709551610
> [ 366.665725] item 0 key (18446744073709551606 128 308891275264) itemoff 7259 itemsize 9024
> [ 366.665727] item 1 key (18446744073709551606 128 308900384768) itemoff 7067 itemsize 192
> [ 366.665728] item 2 key (18446744073709551606 128 309036716032) itemoff 2587 itemsize 4480
> [ 366.665730] item 3 key (18446744073709551606 128 309041303552) itemoff 103 itemsize 2484
> [ 366.665731] BTRFS error (device sda): block=507544305664 write time tree block corruption detected
> [ 366.665821] BTRFS: error (device sda) in btrfs_sync_log:3136: errno=-5 IO failure
> [ 366.665824] BTRFS info (device sda): forced readonly
>
> Please note the erroring ranges:
> csum end: 308900515840
> Start next: 308900384768
> which is a difference of (1 << 17) == 0b100000000000000000 == 128 KiB.
> To me, this looks suspiciously like an off-by-one error, but I'm not too
> versed in debugging btrfs.
Most likely it's a race when adding checksums, in this case to the
log tree (fsync).
This has happened in the past, and the most recent fix was:
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=e289f03ea79bbc6574b78ac25682555423a91cbb
There were also cases that affected the csum tree rather than the log
tree, but those are many years old now.
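
For readers wondering how the log tree shows up in the dmesg output: the large root and objectid values are small negative numbers printed as unsigned 64-bit integers. The Python sketch below decodes them. The constant names and values are the ones I believe the kernel's btrfs_tree.h defines (tree log objectid -6, extent csum objectid -10, extent csum key type 128); treat them as assumptions and verify against your kernel source. The helper names are made up for illustration.

```python
U64 = 1 << 64

# Reserved objectids and key types, believed to match the kernel's
# btrfs_tree.h. These values are an assumption; verify before relying on them.
RESERVED_OBJECTIDS = {
    -6: "BTRFS_TREE_LOG_OBJECTID",
    -10: "BTRFS_EXTENT_CSUM_OBJECTID",
}
KEY_TYPES = {128: "BTRFS_EXTENT_CSUM_KEY"}


def as_signed(value):
    """Reinterpret an unsigned 64-bit value as a signed one."""
    return value - U64 if value >= U64 // 2 else value


def describe_objectid(value):
    """Name a reserved objectid, or fall back to its signed value."""
    signed = as_signed(value)
    return RESERVED_OBJECTIDS.get(signed, f"objectid {signed}")


print(describe_objectid(18446744073709551610))  # root= / owner in the log
print(describe_objectid(18446744073709551606))  # objectid of each item key
print(KEY_TYPES[128])                           # key type of each item
```

If those constants are right, root=18446744073709551610 is the tree log root and every item in the leaf is an extent csum item, which matches the reading that the corrupt leaf belongs to the log tree.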
>
> I reproduced this several times on my machine using the attached
> scripts. The only obvious similarity between the crashes is this 128 KiB
> gap between csum end and start next. Sometimes I get one corrupt leaf,
> sometimes many. I tried to reproduce it on another machine with an HDD,
> but didn't encounter this error there.
> Can you help me debug this further?
Check whether SQL Server (or maybe Docker) performs any reflink
operations (clone and dedupe). If it doesn't, that rules out shared
extents as the cause of the problem.
I'll have to look at the code and think about what could go wrong to
cause this, so I can't give you exact debugging steps yet.
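
One possible way to run the reflink check is to capture a syscall trace of the workload (for example with strace, following child processes and writing to a file) and search it for the clone and dedupe ioctls. The Python sketch below scans such a trace. It assumes strace decodes the requests by name (`FICLONE`, `FICLONERANGE`, `FIDEDUPERANGE`) and also flags `copy_file_range`, which may or may not result in a reflink. The function name and the sample lines are hypothetical, not real output from this system.

```python
import re

# Names as strace is assumed to print them; copy_file_range is included
# only as a candidate, since it does not always reflink.
REFLINK_PATTERN = re.compile(
    r"\b(FICLONERANGE|FICLONE|FIDEDUPERANGE|copy_file_range)\b"
)


def find_reflink_calls(strace_lines):
    """Return (name, line) for every line that mentions a reflink call."""
    hits = []
    for line in strace_lines:
        match = REFLINK_PATTERN.search(line)
        if match:
            hits.append((match.group(1), line.rstrip()))
    return hits


# Hypothetical trace lines, for illustration only.
sample = [
    'openat(AT_FDCWD, "/data/file", O_RDWR) = 3',
    "ioctl(4, FICLONE, 3) = 0",
    "fsync(3) = 0",
]

for name, line in find_reflink_calls(sample):
    print(f"{name}: {line}")
```

An empty result over a trace that covers the whole bulk load would suggest that no reflink operations are involved, though it cannot rule out operations issued by processes that were not traced.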
Thanks.
--
Filipe David Manana,
“Whether you think you can, or you think you can't — you're right.”
Thread overview: 8+ messages
2021-05-10 20:50 Leaf corruption due to csum range Philipp Fent
2021-05-11 8:18 ` Wang Yugui
2021-05-11 8:44 ` Qu Wenruo
2021-05-11 8:56 ` Filipe Manana
[not found] ` <ad414944-2418-3728-ac1a-5d4d37e37ac1@in.tum.de>
2021-05-11 12:35 ` Filipe Manana
[not found] ` <ef9ea56e-fb47-f719-137b-ffb545a09db7@in.tum.de>
2021-05-13 9:57 ` Filipe Manana
2021-05-13 10:50 ` Filipe Manana
2021-05-13 11:11 ` Philipp Fent