From: Qu Wenruo <quwenruo.btrfs@gmx.com>
To: Wang Yugui <wangyugui@e16-tech.com>, Philipp Fent <fent@in.tum.de>
Cc: linux-btrfs@vger.kernel.org
Subject: Re: Leaf corruption due to csum range
Date: Tue, 11 May 2021 16:44:28 +0800
Message-ID: <36f2cb11-e87e-2eb6-56e2-19c87b061b49@gmx.com>
In-Reply-To: <20210511161806.B601.409509F4@e16-tech.com>



On 2021/5/11 4:18 PM, Wang Yugui wrote:
> hi,
>
> the last 'write time tree block corruption detected' report was marked
> as a memory ECC error.

So ECC can fail to recover from a bitflip?

So I can't even rely on ECC memory nowadays?
(At least tree-check rocks again)

Thanks,
QU
>
>   From:    chil L1n <devchill1n@gmail.com>
>   To:      linux-btrfs@vger.kernel.org
>   Date:    Sat, 6 Mar 2021 10:10:11 +0100
>   Subject: btrfs error: write time tree block corruption detected
>
> Is this a server with ECC memory?
>
> Best Regards
> Wang Yugui (wangyugui@e16-tech.com)
> 2021/05/11
>
>> I encountered a btrfs error on my system. I run Microsoft SQL Server in
>> a docker container on a btrfs filesystem on an SSD. When bulk-loading
>> some benchmark data, my system reproducibly enters the following
>> failing state:
>>
>> [  366.665714] BTRFS critical (device sda): corrupt leaf:
>> root=18446744073709551610 block=507544305664 slot=0, csum end range
>> (308900515840) goes beyond the start range (308900384768) of the next
>> csum item
>> [  366.665723] BTRFS info (device sda): leaf 507544305664 gen 18292
>> total ptrs 4 free space 3 owner 18446744073709551610
>> [  366.665725]  item 0 key (18446744073709551606 128 308891275264)
>> itemoff 7259 itemsize 9024
>> [  366.665727]  item 1 key (18446744073709551606 128 308900384768)
>> itemoff 7067 itemsize 192
>> [  366.665728]  item 2 key (18446744073709551606 128 309036716032)
>> itemoff 2587 itemsize 4480
>> [  366.665730]  item 3 key (18446744073709551606 128 309041303552)
>> itemoff 103 itemsize 2484
>> [  366.665731] BTRFS error (device sda): block=507544305664 write time
>> tree block corruption detected
>> [  366.665821] BTRFS: error (device sda) in btrfs_sync_log:3136:
>> errno=-5 IO failure
>> [  366.665824] BTRFS info (device sda): forced readonly
>>
>> Please note the erroring ranges:
>> csum end:   308900515840
>> Start next: 308900384768
>> which is a difference of (1 << 17) == 0b100000000000000000 == 128KB
>> To me, this looks suspiciously like an off-by-one error, but I'm not too
>> versed in debugging btrfs.
>>
>> I reproduced this several times on my machine using the attached
>> scripts. The only obvious similarity between the crashes is this 128KB
>> csum end / start next. Sometimes I get one corrupt leaf, sometimes many.
>> I tried to reproduce it on another machine with an HDD, but didn't
>> encounter this error there.
>> Can you help me to debug this further?
>>
>> # uname -a
>> Linux desk 5.12.2-arch1-1 #1 SMP PREEMPT Fri, 07 May 2021 15:36:06 +0000
>> x86_64 GNU/Linux
>> # btrfs --version
>> btrfs-progs v5.11.1
>> # btrfs fi show
>> Label: none  uuid: 6733acf5-be40-4fe2-9d6f-819d39e49720
>>          Total devices 1 FS bytes used 187.11GiB
>>          devid    1 size 931.51GiB used 208.03GiB path /dev/sda
>> # btrfs fi df /ssdSpace
>> Data, single: total=207.00GiB, used=186.67GiB
>> System, single: total=32.00MiB, used=48.00KiB
>> Metadata, single: total=1.00GiB, used=450.08MiB
>> GlobalReserve, single: total=215.41MiB, used=0.00B
>
>
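
For reference, the numbers in the quoted report can be cross-checked with a
short script. This is only a sketch, not the kernel's tree-checker code: it
assumes 4-byte crc32c checksums and a 4096-byte sector size (neither is stated
in the report), and the helper names csum_end, find_overlaps and as_signed64
are made up for illustration. Under those assumptions, each csum item covers
(itemsize / 4) sectors starting at its key offset, and an item must not reach
past the start of the next one.

```python
SECTOR_SIZE = 4096  # assumption: default sector size, not stated in the report
CSUM_SIZE = 4       # assumption: crc32c (4 bytes per checksum), not stated in the report

# (key offset, itemsize) pairs copied from the leaf dump in the report
ITEMS = [
    (308891275264, 9024),
    (308900384768, 192),
    (309036716032, 4480),
    (309041303552, 2484),
]


def csum_end(start, itemsize):
    """End of the byte range covered by a csum item (exclusive)."""
    return start + (itemsize // CSUM_SIZE) * SECTOR_SIZE


def find_overlaps(items):
    """Return (slot, end, next_start, overlap) for items that reach past the next item."""
    overlaps = []
    for slot in range(len(items) - 1):
        end = csum_end(*items[slot])
        next_start = items[slot + 1][0]
        if end > next_start:
            overlaps.append((slot, end, next_start, end - next_start))
    return overlaps


def as_signed64(value):
    """Reinterpret an unsigned 64-bit id as signed, e.g. for negative object ids."""
    return value - (1 << 64) if value >= (1 << 63) else value


if __name__ == "__main__":
    for slot, end, next_start, overlap in find_overlaps(ITEMS):
        print(f"slot={slot} csum end={end} next start={next_start} overlap={overlap}")
    print("root/owner id as signed:", as_signed64(18446744073709551610))
    print("key objectid as signed:", as_signed64(18446744073709551606))
```

With these assumptions the computed end of item 0 matches the "csum end range"
in the log, only slot 0 is flagged, and the overlap is 131072 bytes
(1 << 17), the same 128KB difference noted in the report. The large ids in
the log are -6 and -10 when read as signed 64-bit values.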

Thread overview: 8+ messages
2021-05-10 20:50 Leaf corruption due to csum range Philipp Fent
2021-05-11  8:18 ` Wang Yugui
2021-05-11  8:44   ` Qu Wenruo [this message]
2021-05-11  8:56 ` Filipe Manana
     [not found]   ` <ad414944-2418-3728-ac1a-5d4d37e37ac1@in.tum.de>
2021-05-11 12:35     ` Filipe Manana
     [not found]       ` <ef9ea56e-fb47-f719-137b-ffb545a09db7@in.tum.de>
2021-05-13  9:57         ` Filipe Manana
2021-05-13 10:50           ` Filipe Manana
2021-05-13 11:11             ` Philipp Fent
