* My first attempt to use btrfs failed miserably
@ 2020-02-02 12:45 Skibbi
  2020-02-02 12:56 ` Qu Wenruo
                   ` (5 more replies)
  0 siblings, 6 replies; 24+ messages in thread
From: Skibbi @ 2020-02-02 12:45 UTC (permalink / raw)
  To: linux-btrfs

Hello,
So I decided to try btrfs on my new portable WD Password Drive
attached to a Raspberry Pi 4. I created a GPT partition, created a LUKS2
volume and formatted it with btrfs. Then I created 3 subvolumes and
started copying data from other disks to one of the subvolumes. After
writing around 40GB of data my filesystem crashed. That was super fast
and totally discouraged me from further attempts to use btrfs :(
But I would like to help with development, so before I reformat my
drive I can help you identify potential issues with this filesystem
by providing some debugging info.

Here are some details:

root@rpi4b:~# uname -a
Linux rpi4b 4.19.93-v7l+ #1290 SMP Fri Jan 10 16:45:11 GMT 2020 armv7l GNU/Linux

root@rpi4b:~# btrfs --version
btrfs-progs v4.20.1

root@rpi4b:~# btrfs fi show
Label: 'NAS'  uuid: b16b5b3f-ce5e-42e6-bccd-b48cc641bf96
        Total devices 1 FS bytes used 42.48GiB
        devid    1 size 4.55TiB used 45.02GiB path /dev/mapper/NAS

root@rpi4b:~# dmesg |grep btrfs
[223167.290255] BTRFS: error (device dm-0) in
btrfs_run_delayed_refs:2935: errno=-5 IO failure
[223167.389690] BTRFS: error (device dm-0) in
btrfs_run_delayed_refs:2935: errno=-5 IO failure
root@rpi4b:~# dmesg |grep BTRFS
[201688.941552] BTRFS: device label NAS devid 1 transid 5 /dev/sda1
[201729.894774] BTRFS info (device sda1): disk space caching is enabled
[201729.894789] BTRFS info (device sda1): has skinny extents
[201729.894801] BTRFS info (device sda1): flagging fs with big metadata feature
[201729.902120] BTRFS info (device sda1): checking UUID tree
[202297.695253] BTRFS info (device sda1): disk space caching is enabled
[202297.695271] BTRFS info (device sda1): has skinny extents
[202439.515956] BTRFS info (device sda1): disk space caching is enabled
[202439.515976] BTRFS info (device sda1): has skinny extents
[202928.275644] BTRFS error (device sda1): open_ctree failed
[202934.389346] BTRFS info (device sda1): disk space caching is enabled
[202934.389361] BTRFS info (device sda1): has skinny extents
[203040.718845] BTRFS info (device sda1): disk space caching is enabled
[203040.718863] BTRFS info (device sda1): has skinny extents
[203285.351377] BTRFS error (device sda1): bad tree block start, want
31457280 have 0
[203285.368602] BTRFS error (device sda1): bad tree block start, want
31457280 have 0
[203285.369340] BTRFS error (device sda1): bad tree block start, want
31440896 have 0
[203285.380616] BTRFS error (device sda1): bad tree block start, want
31440896 have 0
[203285.381100] BTRFS error (device sda1): bad tree block start, want
31440896 have 0
[203285.381540] BTRFS error (device sda1): bad tree block start, want
31440896 have 0
[203285.382061] BTRFS error (device sda1): bad tree block start, want
31506432 have 0
[203285.382409] BTRFS error (device sda1): bad tree block start, want
31506432 have 0
[203285.382836] BTRFS error (device sda1): bad tree block start, want
31506432 have 0
[203285.383180] BTRFS error (device sda1): bad tree block start, want
31506432 have 0
[203285.466743] BTRFS info (device sda1): read error corrected: ino 0
off 32735232 (dev /dev/sda1 sector 80320)
[203285.466982] BTRFS info (device sda1): read error corrected: ino 0
off 32739328 (dev /dev/sda1 sector 80328)
[203285.467215] BTRFS info (device sda1): read error corrected: ino 0
off 32743424 (dev /dev/sda1 sector 80336)
[203285.467713] BTRFS info (device sda1): read error corrected: ino 0
off 32747520 (dev /dev/sda1 sector 80344)
[203285.468820] BTRFS info (device sda1): read error corrected: ino 0
off 32751616 (dev /dev/sda1 sector 80352)
[203285.469053] BTRFS info (device sda1): read error corrected: ino 0
off 32755712 (dev /dev/sda1 sector 80360)
[203285.469285] BTRFS info (device sda1): read error corrected: ino 0
off 32759808 (dev /dev/sda1 sector 80368)
[203285.469515] BTRFS info (device sda1): read error corrected: ino 0
off 32763904 (dev /dev/sda1 sector 80376)
[204448.566295] BTRFS: device label NAS devid 1 transid 5 /dev/dm-0
[204464.083776] BTRFS info (device dm-0): disk space caching is enabled
[204464.083792] BTRFS info (device dm-0): has skinny extents
[204464.083804] BTRFS info (device dm-0): flagging fs with big metadata feature
[204464.099978] BTRFS info (device dm-0): checking UUID tree
[218811.383208] BTRFS error (device dm-0): bad tree block start, want
50659328 have 7653333615399691647
[218811.458203] BTRFS error (device dm-0): bad tree block start, want
50659328 have 11439613481626299565
[222717.551578] BTRFS error (device dm-0): bad tree block start, want
69222400 have 13548117933796719565
[222717.563137] BTRFS error (device dm-0): bad tree block start, want
69222400 have 7380016245193299115
[223167.098981] BTRFS error (device dm-0): bad tree block start, want
73252864 have 13360254792515176285
[223167.162808] BTRFS error (device dm-0): bad tree block start, want
73252864 have 11805635508241231341
[223167.269483] BTRFS error (device dm-0): bad tree block start, want
73252864 have 13360254792515176285
[223167.290178] BTRFS error (device dm-0): bad tree block start, want
73252864 have 11805635508241231341
[223167.290255] BTRFS: error (device dm-0) in
btrfs_run_delayed_refs:2935: errno=-5 IO failure
[223167.299414] BTRFS info (device dm-0): forced readonly
[223167.322053] BTRFS error (device dm-0): bad tree block start, want
73252864 have 13360254792515176285
[223167.389598] BTRFS error (device dm-0): bad tree block start, want
73252864 have 11805635508241231341
[223167.389690] BTRFS: error (device dm-0) in
btrfs_run_delayed_refs:2935: errno=-5 IO failure
[223167.399958] BTRFS error (device dm-0): bad tree block start, want
73433088 have 3620493785417914802
[223167.413347] BTRFS error (device dm-0): bad tree block start, want
73433088 have 13303833022607090580
[223167.487687] BTRFS error (device dm-0): bad tree block start, want
73433088 have 3620493785417914802
[223167.499337] BTRFS error (device dm-0): bad tree block start, want
73433088 have 13303833022607090580
[260285.601565] BTRFS error (device dm-0): bad tree block start, want
73433088 have 3620493785417914802
[260285.602742] BTRFS error (device dm-0): bad tree block start, want
73433088 have 13303833022607090580
[260285.604070] BTRFS error (device dm-0): bad tree block start, want
73433088 have 3620493785417914802
[260285.605224] BTRFS error (device dm-0): bad tree block start, want
73433088 have 13303833022607090580
[260288.795773] BTRFS error (device dm-0): bad tree block start, want
73433088 have 3620493785417914802
[260288.797000] BTRFS error (device dm-0): bad tree block start, want
73433088 have 13303833022607090580
[260288.798206] BTRFS error (device dm-0): bad tree block start, want
73433088 have 3620493785417914802
[260288.799380] BTRFS error (device dm-0): bad tree block start, want
73433088 have 13303833022607090580
[260301.047239] BTRFS error (device dm-0): bad tree block start, want
73433088 have 3620493785417914802
[260301.048437] BTRFS error (device dm-0): bad tree block start, want
73433088 have 13303833022607090580
[260301.049638] BTRFS error (device dm-0): bad tree block start, want
73433088 have 3620493785417914802
[260301.050800] BTRFS error (device dm-0): bad tree block start, want
73433088 have 13303833022607090580
[260309.107260] BTRFS error (device dm-0): bad tree block start, want
73433088 have 3620493785417914802
[260309.108396] BTRFS error (device dm-0): bad tree block start, want
73433088 have 13303833022607090580
[260309.109563] BTRFS error (device dm-0): bad tree block start, want
73433088 have 3620493785417914802
[260309.110674] BTRFS error (device dm-0): bad tree block start, want
73433088 have 13303833022607090580
[260309.371483] BTRFS error (device dm-0): bad tree block start, want
73433088 have 3620493785417914802
[260309.372615] BTRFS error (device dm-0): bad tree block start, want
73433088 have 13303833022607090580
[260309.373923] BTRFS error (device dm-0): bad tree block start, want
73433088 have 3620493785417914802
[260309.391169] BTRFS error (device dm-0): bad tree block start, want
73433088 have 13303833022607090580
[260389.358616] BTRFS info (device dm-0): disk space caching is enabled
[260389.358631] BTRFS info (device dm-0): has skinny extents
[260389.430962] BTRFS error (device dm-0): bad tree block start, want
73252864 have 13360254792515176285
[260389.432087] BTRFS error (device dm-0): bad tree block start, want
73252864 have 11805635508241231341
[260389.432146] BTRFS error (device dm-0): failed to read block groups: -5
[260389.474656] BTRFS error (device dm-0): open_ctree failed
[276102.707458] BTRFS warning (device dm-0): 'recovery' is deprecated,
use 'usebackuproot' instead
[276102.707475] BTRFS info (device dm-0): trying to use backup root at
mount time
[276102.707493] BTRFS info (device dm-0): disabling disk space caching
[276102.707506] BTRFS info (device dm-0): force clearing of disk cache
[276102.707518] BTRFS info (device dm-0): has skinny extents
[276102.731022] BTRFS error (device dm-0): bad tree block start, want
73252864 have 13360254792515176285
[276102.732407] BTRFS error (device dm-0): bad tree block start, want
73252864 have 11805635508241231341
[276102.732472] BTRFS error (device dm-0): failed to read block groups: -5
[276102.781625] BTRFS error (device dm-0): open_ctree failed

--
Best regards

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: My first attempt to use btrfs failed miserably
  2020-02-02 12:45 My first attempt to use btrfs failed miserably Skibbi
@ 2020-02-02 12:56 ` Qu Wenruo
  2020-02-02 13:22   ` Stephan von Krawczynski
  2020-02-02 13:29   ` Martin Raiber
  2020-02-02 14:14 ` Roman Mamedov
                   ` (4 subsequent siblings)
  5 siblings, 2 replies; 24+ messages in thread
From: Qu Wenruo @ 2020-02-02 12:56 UTC (permalink / raw)
  To: Skibbi, linux-btrfs





On 2020/2/2 8:45 PM, Skibbi wrote:
> Hello,
> So I decided to try btrfs on my new portable WD Password Drive
> attached to Raspberry Pi 4. I created GPT partition, created luks2
> volume and formatted it with btrfs. Then I created 3 subvolumes and
> started copying data from other disks to one of the subvolumes. After
> writing around 40GB of data my filesystem crashed. That was super fast
> and totally discouraged me from next attempts to use btrfs :(
> But I would like to help with development so before I reformat my
> drive I can help you identifying potential issues with this filesystem
> by providing some debugging info.
> 
> Here are some details:
> 
> root@rpi4b:~# uname -a
> Linux rpi4b 4.19.93-v7l+ #1290 SMP Fri Jan 10 16:45:11 GMT 2020 armv7l GNU/Linux

Pretty old kernel, and without recent enough backports.

And since you're already using an rpi4, there is no reason to use an armv7 kernel.
You can go aarch64; Archlinux ARM has the latest kernel for it.

> 
> root@rpi4b:~# btrfs --version
> btrfs-progs v4.20.1

Old progs too.

> 
> root@rpi4b:~# btrfs fi show
> Label: 'NAS'  uuid: b16b5b3f-ce5e-42e6-bccd-b48cc641bf96
>         Total devices 1 FS bytes used 42.48GiB
>         devid    1 size 4.55TiB used 45.02GiB path /dev/mapper/NAS
> 
> root@rpi4b:~# dmesg |grep btrfs
> [223167.290255] BTRFS: error (device dm-0) in
> btrfs_run_delayed_refs:2935: errno=-5 IO failure
> [223167.389690] BTRFS: error (device dm-0) in
> btrfs_run_delayed_refs:2935: errno=-5 IO failure
> root@rpi4b:~# dmesg |grep BTRFS
> [201688.941552] BTRFS: device label NAS devid 1 transid 5 /dev/sda1
> [201729.894774] BTRFS info (device sda1): disk space caching is enabled
> [201729.894789] BTRFS info (device sda1): has skinny extents
> [201729.894801] BTRFS info (device sda1): flagging fs with big metadata feature
> [201729.902120] BTRFS info (device sda1): checking UUID tree
> [202297.695253] BTRFS info (device sda1): disk space caching is enabled
> [202297.695271] BTRFS info (device sda1): has skinny extents
> [202439.515956] BTRFS info (device sda1): disk space caching is enabled
> [202439.515976] BTRFS info (device sda1): has skinny extents
> [202928.275644] BTRFS error (device sda1): open_ctree failed
> [202934.389346] BTRFS info (device sda1): disk space caching is enabled
> [202934.389361] BTRFS info (device sda1): has skinny extents
> [203040.718845] BTRFS info (device sda1): disk space caching is enabled
> [203040.718863] BTRFS info (device sda1): has skinny extents
> [203285.351377] BTRFS error (device sda1): bad tree block start, want
> 31457280 have 0

This means some tree read failed miserably.
It looks like btrfs is trying to read something from a trimmed range.

> [203285.383180] BTRFS error (device sda1): bad tree block start, want
> 31506432 have 0
> [203285.466743] BTRFS info (device sda1): read error corrected: ino 0
> off 32735232 (dev /dev/sda1 sector 80320)

This means btrfs can still get one good copy.

Something is not working properly, either in btrfs or in the lower stack.

Have you tried to do the same thing without LUKS? Just btrfs over the raw
partition.

And it's recommended to use a newer kernel anyway.

Thanks,
Qu
> 
> --
> Best regards
> 




* Re: My first attempt to use btrfs failed miserably
  2020-02-02 12:56 ` Qu Wenruo
@ 2020-02-02 13:22   ` Stephan von Krawczynski
  2020-02-02 20:04     ` Chris Murphy
  2020-02-02 13:29   ` Martin Raiber
  1 sibling, 1 reply; 24+ messages in thread
From: Stephan von Krawczynski @ 2020-02-02 13:22 UTC (permalink / raw)
  To: linux-btrfs

On Sun, 2 Feb 2020 20:56:20 +0800
Qu Wenruo <quwenruo.btrfs@gmx.com> wrote:

> On 2020/2/2 8:45 PM, Skibbi wrote:
> > Hello,
> > So I decided to try btrfs on my new portable WD Password Drive
> > attached to Raspberry Pi 4. I created GPT partition, created luks2
> > volume and formatted it with btrfs. Then I created 3 subvolumes and
> > started copying data from other disks to one of the subvolumes. After
> > writing around 40GB of data my filesystem crashed. That was super fast
> > and totally discouraged me from next attempts to use btrfs :(
> > But I would like to help with development so before I reformat my
> > drive I can help you identifying potential issues with this filesystem
> > by providing some debugging info.
> > 
> > Here are some details:
> > 
> > root@rpi4b:~# uname -a
> > Linux rpi4b 4.19.93-v7l+ #1290 SMP Fri Jan 10 16:45:11 GMT 2020 armv7l
> > GNU/Linux  
> 
> Pretty old kernel, nor recently enough backports.

Exactly this kind of answer made me leave btrfs and never come back again.
4.19.93 is not very far away from the _latest_ longterm kernel released (which
is 4.19.101).
What you are saying here is that there is no stable working btrfs in longterm
kernels at all.
Hear, hear.
My advice to the OP: use ZFS. Great performance, absolutely stable, no crash
in years.

-- 
Regards,
Stephan


* Re: My first attempt to use btrfs failed miserably
  2020-02-02 12:56 ` Qu Wenruo
  2020-02-02 13:22   ` Stephan von Krawczynski
@ 2020-02-02 13:29   ` Martin Raiber
  2020-02-02 13:36     ` Qu Wenruo
  1 sibling, 1 reply; 24+ messages in thread
From: Martin Raiber @ 2020-02-02 13:29 UTC (permalink / raw)
  To: Qu Wenruo, Skibbi, linux-btrfs

On 02.02.2020 13:56 Qu Wenruo wrote:
>
> On 2020/2/2 8:45 PM, Skibbi wrote:
>> Hello,
>> So I decided to try btrfs on my new portable WD Password Drive
>> attached to Raspberry Pi 4. I created GPT partition, created luks2
>> volume and formatted it with btrfs. Then I created 3 subvolumes and
>> started copying data from other disks to one of the subvolumes. After
>> writing around 40GB of data my filesystem crashed. That was super fast
>> and totally discouraged me from next attempts to use btrfs :(
>> But I would like to help with development so before I reformat my
>> drive I can help you identifying potential issues with this filesystem
>> by providing some debugging info.
>>
>> Here are some details:
>>
>> root@rpi4b:~# uname -a
>> Linux rpi4b 4.19.93-v7l+ #1290 SMP Fri Jan 10 16:45:11 GMT 2020 armv7l GNU/Linux
> Pretty old kernel, nor recently enough backports.
>
> And since you're already using rpi4, no reason to use armv7 kernel.
> You can go aarch64, Archlinux ARM has latest kernel for it.
>
>> root@rpi4b:~# btrfs --version
>> btrfs-progs v4.20.1
> Old progs too.
>
>> root@rpi4b:~# btrfs fi show
>> Label: 'NAS'  uuid: b16b5b3f-ce5e-42e6-bccd-b48cc641bf96
>>         Total devices 1 FS bytes used 42.48GiB
>>         devid    1 size 4.55TiB used 45.02GiB path /dev/mapper/NAS
>>
>> root@rpi4b:~# dmesg |grep btrfs
>> [223167.290255] BTRFS: error (device dm-0) in
>> btrfs_run_delayed_refs:2935: errno=-5 IO failure
>> [223167.389690] BTRFS: error (device dm-0) in
>> btrfs_run_delayed_refs:2935: errno=-5 IO failure
>> root@rpi4b:~# dmesg |grep BTRFS
>> [201688.941552] BTRFS: device label NAS devid 1 transid 5 /dev/sda1
>> [201729.894774] BTRFS info (device sda1): disk space caching is enabled
>> [201729.894789] BTRFS info (device sda1): has skinny extents
>> [201729.894801] BTRFS info (device sda1): flagging fs with big metadata feature
>> [201729.902120] BTRFS info (device sda1): checking UUID tree
>> [202297.695253] BTRFS info (device sda1): disk space caching is enabled
>> [202297.695271] BTRFS info (device sda1): has skinny extents
>> [202439.515956] BTRFS info (device sda1): disk space caching is enabled
>> [202439.515976] BTRFS info (device sda1): has skinny extents
>> [202928.275644] BTRFS error (device sda1): open_ctree failed
>> [202934.389346] BTRFS info (device sda1): disk space caching is enabled
>> [202934.389361] BTRFS info (device sda1): has skinny extents
>> [203040.718845] BTRFS info (device sda1): disk space caching is enabled
>> [203040.718863] BTRFS info (device sda1): has skinny extents
>> [203285.351377] BTRFS error (device sda1): bad tree block start, want
>> 31457280 have 0
> This means some tree read failed miserably.
> It looks like btrfs is trying to read something from trimmed range.
>
>> [203285.383180] BTRFS error (device sda1): bad tree block start, want
>> 31506432 have 0
>> [203285.466743] BTRFS info (device sda1): read error corrected: ino 0
>> off 32735232 (dev /dev/sda1 sector 80320)
> This means btrfs still can get one good copy.
>
> Something is not working properly, either from btrfs or the lower stack.
>
> Have you tried to do the same thing without LUKS? Just btrfs over raw
> partition.
>
> And it's recommended to use newer kernel anyway.

I disagree. 4.19.y is an okay kernel to use w.r.t. btrfs, especially
since all the newer stable versions currently have the "statfs() is zero"
bug. The btrfs-tools version doesn't matter much, unless one has to use
"btrfs check", which is (hopefully) not usually necessary. As you can
see, the kernel is also only ~20 days old and 4.19.y is an LTS kernel, so it
still gets (btrfs) updates/bugfixes.

I would suspect a hardware issue with the WD disk (run badblocks for a
while to check). The USB can also cause problems (the USB 3.0 DMA was a
hack in RPI4 that wasn't merged upstream last I looked), but you didn't
list the whole dmesg...

>
> Thanks,
> Qu
>> --
>> Best regards
>>



* Re: My first attempt to use btrfs failed miserably
  2020-02-02 13:29   ` Martin Raiber
@ 2020-02-02 13:36     ` Qu Wenruo
  0 siblings, 0 replies; 24+ messages in thread
From: Qu Wenruo @ 2020-02-02 13:36 UTC (permalink / raw)
  To: Martin Raiber, Skibbi, linux-btrfs





On 2020/2/2 9:29 PM, Martin Raiber wrote:
> On 02.02.2020 13:56 Qu Wenruo wrote:
>>
>> On 2020/2/2 8:45 PM, Skibbi wrote:
>>> Hello,
>>> So I decided to try btrfs on my new portable WD Password Drive
>>> attached to Raspberry Pi 4. I created GPT partition, created luks2
>>> volume and formatted it with btrfs. Then I created 3 subvolumes and
>>> started copying data from other disks to one of the subvolumes. After
>>> writing around 40GB of data my filesystem crashed. That was super fast
>>> and totally discouraged me from next attempts to use btrfs :(
>>> But I would like to help with development so before I reformat my
>>> drive I can help you identifying potential issues with this filesystem
>>> by providing some debugging info.
>>>
>>> Here are some details:
>>>
>>> root@rpi4b:~# uname -a
>>> Linux rpi4b 4.19.93-v7l+ #1290 SMP Fri Jan 10 16:45:11 GMT 2020 armv7l GNU/Linux
>> Pretty old kernel, nor recently enough backports.
>>
>> And since you're already using rpi4, no reason to use armv7 kernel.
>> You can go aarch64, Archlinux ARM has latest kernel for it.
>>
>>> root@rpi4b:~# btrfs --version
>>> btrfs-progs v4.20.1
>> Old progs too.
>>
>>> root@rpi4b:~# btrfs fi show
>>> Label: 'NAS'  uuid: b16b5b3f-ce5e-42e6-bccd-b48cc641bf96
>>>         Total devices 1 FS bytes used 42.48GiB
>>>         devid    1 size 4.55TiB used 45.02GiB path /dev/mapper/NAS
>>>
>>> root@rpi4b:~# dmesg |grep btrfs
>>> [223167.290255] BTRFS: error (device dm-0) in
>>> btrfs_run_delayed_refs:2935: errno=-5 IO failure
>>> [223167.389690] BTRFS: error (device dm-0) in
>>> btrfs_run_delayed_refs:2935: errno=-5 IO failure
>>> root@rpi4b:~# dmesg |grep BTRFS
>>> [201688.941552] BTRFS: device label NAS devid 1 transid 5 /dev/sda1
>>> [201729.894774] BTRFS info (device sda1): disk space caching is enabled
>>> [201729.894789] BTRFS info (device sda1): has skinny extents
>>> [201729.894801] BTRFS info (device sda1): flagging fs with big metadata feature
>>> [201729.902120] BTRFS info (device sda1): checking UUID tree
>>> [202297.695253] BTRFS info (device sda1): disk space caching is enabled
>>> [202297.695271] BTRFS info (device sda1): has skinny extents
>>> [202439.515956] BTRFS info (device sda1): disk space caching is enabled
>>> [202439.515976] BTRFS info (device sda1): has skinny extents
>>> [202928.275644] BTRFS error (device sda1): open_ctree failed
>>> [202934.389346] BTRFS info (device sda1): disk space caching is enabled
>>> [202934.389361] BTRFS info (device sda1): has skinny extents
>>> [203040.718845] BTRFS info (device sda1): disk space caching is enabled
>>> [203040.718863] BTRFS info (device sda1): has skinny extents
>>> [203285.351377] BTRFS error (device sda1): bad tree block start, want
>>> 31457280 have 0
>> This means some tree read failed miserably.
>> It looks like btrfs is trying to read something from trimmed range.
>>
>>> [203285.383180] BTRFS error (device sda1): bad tree block start, want
>>> 31506432 have 0
>>> [203285.466743] BTRFS info (device sda1): read error corrected: ino 0
>>> off 32735232 (dev /dev/sda1 sector 80320)
>> This means btrfs still can get one good copy.
>>
>> Something is not working properly, either from btrfs or the lower stack.
>>
>> Have you tried to do the same thing without LUKS? Just btrfs over raw
>> partition.
>>
>> And it's recommended to use newer kernel anyway.
> 
> I disagree. 4.19.y is an okay kernel to use w.r.t. btrfs, especially
> since all the newer stable versions currently have the statfs() is zero
> bug. The btrfs-tools version doesn't matter much, unless one has to use
> "btrfs check", which is (hopefully) not usually necessary. As you can
> see the kernel is also ~20 days old and 4.19.y is a LTS kernel, so it
> still gets (btrfs) updates/bugfixes.
> 
> I would suspect a hardware issue with the WD disk (run badblocks for a
> while to check). The USB can also cause problems (the USB 3.0 DMA was a
> hack in RPI4 that wasn't merged upstream last I looked), but you didn't
> list the whole dmesg...

You see, the support for RPI4 is not in the LTS kernel branch either...

Thus I recommend going with the latest kernel, or even the latest rc, just
like archlinuxarm is providing.

Thanks,
Qu

> 
>>
>> Thanks,
>> Qu
>>> --
>>> Best regards
>>>
> 




* Re: My first attempt to use btrfs failed miserably
  2020-02-02 12:45 My first attempt to use btrfs failed miserably Skibbi
  2020-02-02 12:56 ` Qu Wenruo
@ 2020-02-02 14:14 ` Roman Mamedov
  2020-02-02 14:45 ` Swâmi Petaramesh
                   ` (3 subsequent siblings)
  5 siblings, 0 replies; 24+ messages in thread
From: Roman Mamedov @ 2020-02-02 14:14 UTC (permalink / raw)
  To: Skibbi; +Cc: linux-btrfs

On Sun, 2 Feb 2020 13:45:58 +0100
Skibbi <skibbi@gmail.com> wrote:

> root@rpi4b:~# dmesg |grep btrfs
> [223167.290255] BTRFS: error (device dm-0) in
> btrfs_run_delayed_refs:2935: errno=-5 IO failure
> [223167.389690] BTRFS: error (device dm-0) in
> btrfs_run_delayed_refs:2935: errno=-5 IO failure
> root@rpi4b:~# dmesg |grep BTRFS

Try without that grep, and see if anything else happened to cause these errors.

...
> [203285.469285] BTRFS info (device sda1): read error corrected: ino 0
> off 32759808 (dev /dev/sda1 sector 80368)
> [203285.469515] BTRFS info (device sda1): read error corrected: ino 0
> off 32763904 (dev /dev/sda1 sector 80376)
> [204448.566295] BTRFS: device label NAS devid 1 transid 5 /dev/dm-0

Such as here: doesn't this look like the device may have disconnected and
reappeared (making btrfs show the "device" message)?
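
A rough sketch of how one could check that, assuming the complete,
unfiltered dmesg output was saved to a file first. The helper name
rescan_context and the default of 20 lines of context are made up for
illustration; only the "BTRFS: device label" text comes from the log above.

```shell
# Hypothetical helper: print every "BTRFS: device label" line together
# with the unfiltered kernel messages that immediately precede it, so a
# USB reset or disconnect just before the rescan becomes visible.
rescan_context() {
    # $1: file holding the complete dmesg output
    # $2: number of preceding lines to show (defaults to 20)
    grep -B "${2:-20}" 'BTRFS: device label' "$1"
}
```

Possible usage: `dmesg > /tmp/dmesg.full` and then
`rescan_context /tmp/dmesg.full 30`. Whatever shows up right before each
"device label" line is what deserves a closer look.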

-- 
With respect,
Roman


* Re: My first attempt to use btrfs failed miserably
  2020-02-02 12:45 My first attempt to use btrfs failed miserably Skibbi
  2020-02-02 12:56 ` Qu Wenruo
  2020-02-02 14:14 ` Roman Mamedov
@ 2020-02-02 14:45 ` Swâmi Petaramesh
  2020-02-02 23:34   ` Zygo Blaxell
  2020-02-02 19:56 ` Chris Murphy
                   ` (2 subsequent siblings)
  5 siblings, 1 reply; 24+ messages in thread
From: Swâmi Petaramesh @ 2020-02-02 14:45 UTC (permalink / raw)
  To: Skibbi, linux-btrfs

On 02/02/2020 at 13:45, Skibbi wrote:
> So I decided to try btrfs on my new portable WD Password Drive
> attached to Raspberry Pi 4. I created GPT partition, created luks2
> volume and formatted it with btrfs. Then I created 3 subvolumes and
> started copying data from other disks to one of the subvolumes. After
> writing around 40GB of data my filesystem crashed. That was super fast
> and totally discouraged me from next attempts to use btrfs :(

For what it's worth, I've been using BTRFS for 5+ *years* on removable,
encrypted hard disks, and use them daily on Raspberry Pis with 4.19
kernels and *never* hit a single problem.

The only time I lost a filesystem was when I got hit by the infamous
5.2 bug, and it was on a classical laptop, not on a pi...

Kind regards.



* Re: My first attempt to use btrfs failed miserably
  2020-02-02 12:45 My first attempt to use btrfs failed miserably Skibbi
                   ` (2 preceding siblings ...)
  2020-02-02 14:45 ` Swâmi Petaramesh
@ 2020-02-02 19:56 ` Chris Murphy
  2020-02-03  6:38   ` Skibbi
  2020-02-02 19:57 ` Chris Murphy
  2020-02-03 19:14 ` Achim Gratz
  5 siblings, 1 reply; 24+ messages in thread
From: Chris Murphy @ 2020-02-02 19:56 UTC (permalink / raw)
  To: Skibbi; +Cc: Btrfs BTRFS

On Sun, Feb 2, 2020 at 5:45 AM Skibbi <skibbi@gmail.com> wrote:

> root@rpi4b:~# dmesg |grep btrfs
> [223167.290255] BTRFS: error (device dm-0) in
> btrfs_run_delayed_refs:2935: errno=-5 IO failure
> [223167.389690] BTRFS: error (device dm-0) in
> btrfs_run_delayed_refs:2935: errno=-5 IO failure
> root@rpi4b:~# dmesg |grep BTRFS

The entire unfiltered dmesg is needed. This older kernel doesn't have
new enough Btrfs tree checker code to help determine what the problem
is.

> [203285.351377] BTRFS error (device sda1): bad tree block start, want
> 31457280 have 0

> [203285.466743] BTRFS info (device sda1): read error corrected: ino 0
> off 32735232 (dev /dev/sda1 sector 80320)

> [218811.383208] BTRFS error (device dm-0): bad tree block start, want
> 50659328 have 7653333615399691647

These happening together suggest lower storage stack failure. Since
kernel messages are filtered it only shows that Btrfs is working as
designed, complaining about known bad file system metadata. But
because it's filtered, it's not clear why the metadata has gone bad.

> [223167.290255] BTRFS: error (device dm-0) in
> btrfs_run_delayed_refs:2935: errno=-5 IO failure

More suggestion of IO failure, whether physical device or logical
layer in between Btrfs and physical device. Btrfs trusts the storage
stack *less* than other file systems, by design. It's a kind of canary
in the coal mine. Other file systems assume the storage stack is
working, so they're less likely to complain. Only recent versions of
e2fsprogs will format ext4 with metadata checksumming enabled. The
kind of problems you're reporting look so bad and happen so fast I'd
expect a good chance you'd reproduce the same problem with any
metadata checksumming file system, if you have new enough progs to
enable it.
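
To get a quick overview of what the failed metadata reads returned, a
small tally like the following might help. It is only a sketch: the
function name summarize_bad_blocks is made up, and it expects raw dmesg
lines on stdin where "want ... have ..." sits on one line (not the
mail-wrapped form quoted above).

```shell
# Hypothetical helper: count "bad tree block start" messages, splitting
# reads that came back as all zeroes ("have 0") from reads that came
# back with some other value.
summarize_bad_blocks() {
    awk '/bad tree block start/ {
        for (i = 1; i < NF; i++)
            if ($i == "have") {
                # Compare as a string so huge values are not rounded.
                if ($(i + 1) == "0") zero++; else other++
            }
    }
    END { printf "have=0: %d\nhave=other: %d\n", zero, other }'
}
```

Possible usage: `dmesg | summarize_bad_blocks`. The counts only describe
the symptoms; they do not say which layer below Btrfs caused them.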

-- 
Chris Murphy


* Re: My first attempt to use btrfs failed miserably
  2020-02-02 12:45 My first attempt to use btrfs failed miserably Skibbi
                   ` (3 preceding siblings ...)
  2020-02-02 19:56 ` Chris Murphy
@ 2020-02-02 19:57 ` Chris Murphy
  2020-02-03 19:14 ` Achim Gratz
  5 siblings, 0 replies; 24+ messages in thread
From: Chris Murphy @ 2020-02-02 19:57 UTC (permalink / raw)
  To: Skibbi; +Cc: Btrfs BTRFS

Also, what are the mount options for this file system?
`mount | grep btrfs`
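
If the mount output is noisy, the same information can be read from
/proc/mounts. A minimal sketch; the function name btrfs_mount_opts is
made up, and the optional file argument exists only so it can be tried
on a saved copy:

```shell
# Hypothetical helper: print "mountpoint: options" for every btrfs mount.
# /proc/mounts fields are: device mountpoint fstype options dump pass
btrfs_mount_opts() {
    awk '$3 == "btrfs" { print $2 ": " $4 }' "${1:-/proc/mounts}"
}
```

Running `btrfs_mount_opts` with no argument reads the live mount table.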


--
Chris Murphy


* Re: My first attempt to use btrfs failed miserably
  2020-02-02 13:22   ` Stephan von Krawczynski
@ 2020-02-02 20:04     ` Chris Murphy
  0 siblings, 0 replies; 24+ messages in thread
From: Chris Murphy @ 2020-02-02 20:04 UTC (permalink / raw)
  To: Stephan von Krawczynski; +Cc: Btrfs BTRFS

On Sun, Feb 2, 2020 at 6:29 AM Stephan von Krawczynski
<skraw.ml@ithnet.com> wrote:
>
> On Sun, 2 Feb 2020 20:56:20 +0800
> Qu Wenruo <quwenruo.btrfs@gmx.com> wrote:
>
> > On 2020/2/2 8:45 PM, Skibbi wrote:
> > > Hello,
> > > So I decided to try btrfs on my new portable WD Password Drive
> > > attached to Raspberry Pi 4. I created GPT partition, created luks2
> > > volume and formatted it with btrfs. Then I created 3 subvolumes and
> > > started copying data from other disks to one of the subvolumes. After
> > > writing around 40GB of data my filesystem crashed. That was super fast
> > > and totally discouraged me from next attempts to use btrfs :(
> > > But I would like to help with development so before I reformat my
> > > drive I can help you identifying potential issues with this filesystem
> > > by providing some debugging info.
> > >
> > > Here are some details:
> > >
> > > root@rpi4b:~# uname -a
> > > Linux rpi4b 4.19.93-v7l+ #1290 SMP Fri Jan 10 16:45:11 GMT 2020 armv7l
> > > GNU/Linux
> >
> > Pretty old kernel, nor recently enough backports.
>
> Exactly this kind of answer made me leave btrfs and never come back again.
> 4.19.93 is not very far away from the _latest_ longterm kernel released (which
> is 4.19.101).
> What you are saying here is that there is no stable working btrfs in longterm
> kernels at all.

No, Qu means there's not enough tree checking information to do
anything other than speculate what the problem is. There's a lot more
information provided by recent kernels about fs corruption causes, and
that's just not going to be backported, it's too much work.

> Hear, hear.
> My advice to the OP: use ZFS. Great performance, absolutely stable, no crash
> in years.

Based on the available information, it would probably spectacularly
fail too because of underlying storage betrayal. And if those kernel
messages were likewise filtered, it'd suggest ZFS confusion.
Unsurprising. ZFS doesn't let you magically use a failing storage
stack.

-- 
Chris Murphy

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: My first attempt to use btrfs failed miserably
  2020-02-02 14:45 ` Swâmi Petaramesh
@ 2020-02-02 23:34   ` Zygo Blaxell
  2020-02-03  6:28     ` Skibbi
  2020-02-03  7:00     ` Swâmi Petaramesh
  0 siblings, 2 replies; 24+ messages in thread
From: Zygo Blaxell @ 2020-02-02 23:34 UTC (permalink / raw)
  To: Swâmi Petaramesh; +Cc: Skibbi, linux-btrfs

[-- Attachment #1: Type: text/plain, Size: 4108 bytes --]

On Sun, Feb 02, 2020 at 03:45:00PM +0100, Swâmi Petaramesh wrote:
> On 02/02/2020 at 13:45, Skibbi wrote:
> > So I decided to try btrfs on my new portable WD Password Drive
> > attached to Raspberry Pi 4. I created GPT partition, created luks2
> > volume and formatted it with btrfs. Then I created 3 subvolumes and
> > started copying data from other disks to one of the subvolumes. After
> > writing around 40GB of data my filesystem crashed. That was super fast
> > and totally discouraged me from next attempts to use btrfs :(
> 
> For what it's worth, I've been using BTRFS for 5+ *years* on removable,
> encrypted hard disks, and use them daily on Raspberry Pis with 4.19
> kernels and *never* hit a single problem.

Same here, except I have seen problems as well as successes.  Some hints:

The log is incomplete but there is some evidence of USB disconnects.
These are bad.  Fix those before you try to use this hardware to store
data.

Disable write caching (hdparm -W0).  The worst case is a USB disconnect
while there are uncompleted writes still in the drive memory.  Filesystems
get severely damaged when that happens.  Most filesystems silently
corrupt your data when that happens.  If write cache is disabled (and
the USB-SATA bridge firmware isn't garbage) then a disconnect doesn't
do as much damage and most filesystems can recover from it.  btrfs is
very good at batching up writes so write caching does not contribute
significantly to performance.
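
For anyone wanting to try this, a minimal sketch follows.  It assumes the
disk appears as /dev/sdX (a placeholder, not a real device name) and that
the USB-SATA bridge passes ATA commands through, which not every bridge
does.  The `run` helper is made up for this example; it only prints the
commands unless RUN=1 is set.

```shell
#!/bin/sh
# DEVICE is a placeholder: point it at the whole disk, not a partition.
DEVICE=${DEVICE:-/dev/sdX}

# Hypothetical helper: print the command, and execute it only if RUN=1,
# so pasting this does not touch a real disk by accident.
run() {
    echo "+ $*"
    if [ "${RUN:-0}" = "1" ]; then
        "$@"
    fi
}

run hdparm -W "$DEVICE"    # report the current write-caching setting
run hdparm -W0 "$DEVICE"   # disable the drive's write cache
```

Run it as root with RUN=1 and DEVICE set to the real disk.  The setting
may not survive a power cycle or reconnect, in which case it has to be
reapplied each time the drive is attached.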

Cables can be a near-bottomless source of problems, because a bad
cable will trigger USB disconnects.  I find that a USB data cable will
work for a certain number of connections and disconnections, and once
that number is exceeded the cable is garbage and should be recycled.
For cheaper cables that number can be as low as 5.  Some even fail on
the first connection.

Some USB->SATA bridge firmwares are broken; just swap the bridge out for a
different model and it'll be fine (though it may be difficult to do this
with a WD Passport drive without taking the drive apart and placing the
drive in a generic USB drive enclosure).  It is not possible to tell
what board revision or chip/firmware revision is used from the outside,
you have to open the drive and look at the USB-SATA bridge electronics.
Sometimes you can buy two of the same model USB-SATA bridges from
the same shop on the same day and the boards (and bugs) are completely
different inside.  You may find one drive mysteriously works and another
"identical" drive does not.

If the drive disconnects, umount it before reconnecting.  Disable any
configuration settings that might try to hide a USB device disconnection
from the upper storage layers.  btrfs normally detects this and sets
itself read-only, but if somehow that doesn't happen, the filesystem will
be destroyed because part of the commit history will be missing on disk.
On RAID1 arrays of USB devices it's more complicated, you need to run
replace or scrub on the disk that disconnected to reconstruct the
missing data from drives that didn't disconnect.

Once you've purged your setup of broken firmware and cables, it can run
for years without incident.

4.19 doesn't have metadata-corrupting bugs that I know of.

I would be wary of 32-bit ARM.  btrfs is most tested on amd64, and
other architectures sometimes have problems that amd64 simply does not,
especially on large (8T+) filesystems where uint32 isn't enough for a
device address.  That said, I have a dozen Raspberry Pis on 5.0.21 and
haven't encountered issues other than the usual SD card failure every
few years--but the largest filesystem on these is 128GB.

Also watch out for weak power supplies on Raspberry Pi boards.  The CPU
and memory run at a significantly lower voltage than the USB interface,
and one symptom of a power supply that is too small or too old is that
all the USB devices stop working reliably.

> The only time I lost a filesystem was when I got hit by the infamous
> 5.2 bug, and it was on a classical laptop, not on a pi...
>
> Kind regards.
> 


^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: My first attempt to use btrfs failed miserably
  2020-02-02 23:34   ` Zygo Blaxell
@ 2020-02-03  6:28     ` Skibbi
  2020-02-03 16:12       ` Chris Murphy
  2020-02-03  7:00     ` Swâmi Petaramesh
  1 sibling, 1 reply; 24+ messages in thread
From: Skibbi @ 2020-02-03  6:28 UTC (permalink / raw)
  To: Zygo Blaxell; +Cc: Swâmi Petaramesh, linux-btrfs

On Mon, 3 Feb 2020 at 00:34, Zygo Blaxell <ce3g8jdj@umail.furryterror.org>
wrote:
>
> Same here, except I have seen problems as well as successes.  Some hints:
>
> The log is incomplete but there is some evidence of USB disconnects.
> These are bad.  Fix those before you try to use this hardware to store
> data.

Yeah, I found some errors in dmesg suggesting this:
[  370.569700] usb 2-1: reset SuperSpeed Gen 1 USB device number 2
using xhci_hcd
[  428.820969] usb 2-1: reset SuperSpeed Gen 1 USB device number 2
using xhci_hcd
[  473.621875] usb 2-1: reset SuperSpeed Gen 1 USB device number 2
using xhci_hcd
[  618.254211] usb 2-1: reset SuperSpeed Gen 1 USB device number 2
using xhci_hcd
[  664.334958] usb 2-1: reset SuperSpeed Gen 1 USB device number 2
using xhci_hcd
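
A rough way to count such resets in the full, unfiltered kernel log is
sketched below.  The function name is made up for this example, and the
pattern is only a guess based on the lines above; other kernels may word
the message differently.

```shell
#!/bin/sh
# Count USB reset events in kernel log text read from stdin.
# count_usb_resets is a made-up name for this example.
count_usb_resets() {
    grep -cE 'usb [0-9.-]+: reset .* USB device'
}

# Typical use against the full log (may need root):
#   dmesg | count_usb_resets
```

Anything above zero while the drive is in use is worth chasing before
trusting the setup with data.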

> Disable write caching (hdparm -W0).  The worst case is a USB disconnect
> while there are uncompleted writes still in the drive memory.  Filesystems
> get severely damaged when that happens.  Most filesystems silently
> corrupt your data when that happens.  If write cache is disabled (and
> the USB-SATA bridge firmware isn't garbage) then a disconnect doesn't
> do as much damage and most filesystems can recover from it.  btrfs is
> very good at batching up writes so write caching does not contribute
> significantly to performance.

Thanks for the tip - will try this.

> Cables can be a near-bottomless source of problems, because a bad
> cable will trigger USB disconnects.  I find that a USB data cable will
> work for a certain number of connections and disconnections, and once
> that number is exceeded the cable is garbage and should be recycled.
> For cheaper cables that number can be as low as 5.  Some even fail on
> the first connection.

The disk is brand new so I don't expect the cable to be broken. I tested
the drive under Windows and it was working OK.

> Some USB->SATA bridge firmwares are broken, just swap it out with a
> different model and it'll be fine (though it may be difficult to do this
> with a WD Passport drive without taking the drive apart and placing the
> drive in a generic USB drive enclosure).  It is not possible to tell
> what board revision or chip/firmware revision is used from the outside,
> you have to open the drive and look at the USB-SATA bridge electronics.
> Sometimes you can buy two of the same model USB-SATA bridges from
> the same shop on the same day and the boards (and bugs) are completely
> different inside.  You may find one drive mysteriously works and another
> "identical" drive does not.

Yeah, WD Passport Drives are using USB-SATA. I will experiment a bit
more with that.

> If the drive disconnects, umount it before reconnecting.  Disable any
> configuration settings that might try to hide a USB device disconnection
> from the upper storage layers.  btrfs normally detects this and sets
> itself read-only, but if somehow that doesn't happen, the filesystem will
> be destroyed because part of the commit history will be missing on disk.
> On RAID1 arrays of USB devices it's more complicated, you need to run
> replace or scrub on the disk that disconnected to reconstruct the
> missing data from drives that didn't disconnect.
>
> Once you've purged your setup of broken firmware and cables, it can run
> for years without incident.
>
> 4.19 doesn't have metadata-corrupting bugs that I know of.
>
> I would be wary of 32-bit ARM.  btrfs is most tested on amd64, and
> other architectures sometimes have problems that amd64 simply does not,
> especially on large (8T+) filesystems where uint32 isn't enough for a
> device address.  That said, I have a dozen Raspberry Pis on 5.0.21 and
> haven't encountered issues other than the usual SD card failure every
> few years--but the largest filesystem on these is 128GB.
>
> Also watch out for weak power supplies on Raspberry Pi boards.  The CPU
> and memory run at a significantly lower voltage than the USB interface,
> and one symptom of a power supply that is too small or too old is that
> all the USB devices stop working reliably.

Yeah, I need to check whether my Pi has power issues under heavy
load (saving data to an encrypted partition).

> > The only time I lost a filesystem was when I got hit by the infamous
> > 5.2 bug, and it was on a classical laptop, not on a pi...
> >
> > Kind regards.
> >
Best regards

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: My first attempt to use btrfs failed miserably
  2020-02-02 19:56 ` Chris Murphy
@ 2020-02-03  6:38   ` Skibbi
  2020-02-03  6:51     ` Qu Wenruo
  2020-02-03 16:17     ` Chris Murphy
  0 siblings, 2 replies; 24+ messages in thread
From: Skibbi @ 2020-02-03  6:38 UTC (permalink / raw)
  To: Chris Murphy; +Cc: Btrfs BTRFS

On Sun, 2 Feb 2020 at 20:56, Chris Murphy <lists@colorremedies.com> wrote:
>
> On Sun, Feb 2, 2020 at 5:45 AM Skibbi <skibbi@gmail.com> wrote:
>
> > root@rpi4b:~# dmesg |grep btrfs
> > [223167.290255] BTRFS: error (device dm-0) in
> > btrfs_run_delayed_refs:2935: errno=-5 IO failure
> > [223167.389690] BTRFS: error (device dm-0) in
> > btrfs_run_delayed_refs:2935: errno=-5 IO failure
> > root@rpi4b:~# dmesg |grep BTRFS
>
> The entire unfiltered dmesg is needed. This older kernel doesn't have
> new enough Btrfs tree checker code to help determine what the problem
> is.

OK, I need to reformat my drive and reproduce the issue again.

> > [203285.351377] BTRFS error (device sda1): bad tree block start, want
> > 31457280 have 0
>
> > [203285.466743] BTRFS info (device sda1): read error corrected: ino 0
> > off 32735232 (dev /dev/sda1 sector 80320)
>
> > [218811.383208] BTRFS error (device dm-0): bad tree block start, want
> > 50659328 have 7653333615399691647
>
> These happening together suggest lower storage stack failure. Since
> kernel messages are filtered it only shows that Btrfs is working as
> designed, complaining about known bad file system metadata. But
> because it's filtered, it's not clear why the metadata has gone bad.
>
> > [223167.290255] BTRFS: error (device dm-0) in
> > btrfs_run_delayed_refs:2935: errno=-5 IO failure
>
> More suggestion of IO failure, whether physical device or logical
> layer in between Btrfs and physical device. Btrfs trusts the storage
> stack *less* than other file systems, by design. It's a kind of canary
> in the coal mine. Other file systems assume the storage stack is
> working, so they're less likely to complain. Only recent versions of
> e2fsprogs will format ext4 using metadata checksumming enabled. The
> kind of problems you're reporting look so bad and happen so fast I'd
> expect a good chance you'd reproduce the same problem with any
> metadata checksumming file system, if you have new enough progs to
> enable them.

I removed luks encryption and had the same btrfs errors after several
GB of writes. Then I reformatted the drive to ext4 and was able to save
60GB without hiccups. Of course, you may be right that ext4 silently
damages my data, but at least I was able to see it on the drive after
remount/reboot.
I'm beginning to think that my Pi draws more power when used with an
external drive (I have used only pendrives so far), so I need to check
for power issues.
And I also need to figure out how to get a newer kernel. Raspbian is not
the freshest distro...

-- 
Best regards

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: My first attempt to use btrfs failed miserably
  2020-02-03  6:38   ` Skibbi
@ 2020-02-03  6:51     ` Qu Wenruo
  2020-02-03  8:42       ` Skibbi
  2020-02-03 16:17     ` Chris Murphy
  1 sibling, 1 reply; 24+ messages in thread
From: Qu Wenruo @ 2020-02-03  6:51 UTC (permalink / raw)
  To: Skibbi, Chris Murphy; +Cc: Btrfs BTRFS


[-- Attachment #1.1: Type: text/plain, Size: 3467 bytes --]



On 2020/2/3 2:38 PM, Skibbi wrote:
> On Sun, 2 Feb 2020 at 20:56, Chris Murphy <lists@colorremedies.com> wrote:
>>
>> On Sun, Feb 2, 2020 at 5:45 AM Skibbi <skibbi@gmail.com> wrote:
>>
>>> root@rpi4b:~# dmesg |grep btrfs
>>> [223167.290255] BTRFS: error (device dm-0) in
>>> btrfs_run_delayed_refs:2935: errno=-5 IO failure
>>> [223167.389690] BTRFS: error (device dm-0) in
>>> btrfs_run_delayed_refs:2935: errno=-5 IO failure
>>> root@rpi4b:~# dmesg |grep BTRFS
>>
>> The entire unfiltered dmesg is needed. This older kernel doesn't have
>> new enough Btrfs tree checker code to help determine what the problem
>> is.
> 
> OK, I need to reformat my drive and reproduce the issue again.
> 
>>> [203285.351377] BTRFS error (device sda1): bad tree block start, want
>>> 31457280 have 0
>>
>>> [203285.466743] BTRFS info (device sda1): read error corrected: ino 0
>>> off 32735232 (dev /dev/sda1 sector 80320)
>>
>>> [218811.383208] BTRFS error (device dm-0): bad tree block start, want
>>> 50659328 have 7653333615399691647
>>
>> These happening together suggest lower storage stack failure. Since
>> kernel messages are filtered it only shows that Btrfs is working as
>> designed, complaining about known bad file system metadata. But
>> because it's filtered, it's not clear why the metadata has gone bad.
>>
>>> [223167.290255] BTRFS: error (device dm-0) in
>>> btrfs_run_delayed_refs:2935: errno=-5 IO failure
>>
>> More suggestion of IO failure, whether physical device or logical
>> layer in between Btrfs and physical device. Btrfs trusts the storage
>> stack *less* than other file systems, by design. It's a kind of canary
>> in the coal mine. Other file systems assume the storage stack is
>> working, so they're less likely to complain. Only recent versions of
>> e2fsprogs will format ext4 using metadata checksumming enabled. The
>> kind of problems you're reporting look so bad and happen so fast I'd
>> expect a good chance you'd reproduce the same problem with any
>> metadata checksumming file system, if you have new enough progs to
>> enable them.
> 
> I removed luks encryption and had the same btrfs errors after several
> GB of writes. Then I reformatted drive to ext4 and was able to save
> 60GB without hiccups. Of course, you may be right that ext4 silently
> damages my data, but at least I was able to see it on the drive after
> remount/reboot.

BTW, is it still the same USB-related error when the write to btrfs fails?
Or just plain btrfs errors without any other USB/block-layer
error messages?

> I'm beginning to think that my Pi draws more power when used with
> external drive (I used only pendrives so far) so I need to investigate
> for power issues.

That also looks promising.
But since it's a USB HDD, what about trying it with a regular PC?

IIRC, if you can prove that the same disk works fine with a PC (even with
the same kernel version), then it's obvious who is to blame.

> And also I need to figure out how to get newer kernel. Raspbian is not
> the freshest distro...
> 

Just as mentioned, Arch Linux ARM has the latest kernel.

But it's a pain in the ass to set up, especially since the RPi 4 is not
officially supported; you have to go through something even harder than
the regular Arch Linux installation procedure.

Another recommendation is Manjaro ARM, which has just slightly older
kernels, but is much easier to use I guess.

Thanks,
Qu



^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: My first attempt to use btrfs failed miserably
  2020-02-02 23:34   ` Zygo Blaxell
  2020-02-03  6:28     ` Skibbi
@ 2020-02-03  7:00     ` Swâmi Petaramesh
  1 sibling, 0 replies; 24+ messages in thread
From: Swâmi Petaramesh @ 2020-02-03  7:00 UTC (permalink / raw)
  To: Zygo Blaxell; +Cc: Skibbi, linux-btrfs

On 2020-02-03 00:34, Zygo Blaxell wrote:
>
> Some USB->SATA bridge firmwares are broken, just swap it out with a
> different model and it'll be fine (though it may be difficult to do this
> with a WD Passport drive without taking the drive apart and placing the
> drive in a generic USB drive enclosure).

There are even drives with the USB bridge built-in and no SATA connector 
at all...

> Also watch out for weak power supplies on Raspberry Pi boards.  The CPU
> and memory run at a significantly lower voltage than the USB interface,
> and one symptom of a power supply that is too small or too old is that
> all the USB devices stop working reliably.

This is probably the most important advice. On Raspberry Pis + HDs, weak 
power supplies are the main source of problems.

I happen to have a different device, an HP Pavilion Detachable X2 
(tablet PC-like) on which one of my USB HDs, one I made myself by 
putting a 2.5" SATA HD in a USB enclosure, will disconnect every time I 
try to use it without having the laptop powered by its AC power supply...

It uses BTRFS and gets USB-disconnected at the first attempt to write or 
remount the FS. But this never damaged the FS.

Kind regards.

-- 

ॐ

Swâmi Petaramesh <swami@petaramesh.org> PGP 9076E32E


^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: My first attempt to use btrfs failed miserably
  2020-02-03  6:51     ` Qu Wenruo
@ 2020-02-03  8:42       ` Skibbi
  2020-02-03 10:10         ` Qu Wenruo
  0 siblings, 1 reply; 24+ messages in thread
From: Skibbi @ 2020-02-03  8:42 UTC (permalink / raw)
  To: Qu Wenruo; +Cc: Chris Murphy, Btrfs BTRFS

On Mon, 3 Feb 2020 at 07:51, Qu Wenruo <quwenruo.btrfs@gmx.com> wrote:
>
> > I'm beginning to think that my Pi draws more power when used with
> > external drive (I used only pendrives so far) so I need to investigate
> > for power issues.
>
> That also looks promising.
> But since it's a USB hdd, what about try it with regular PC?

I tried on Windows and the disk worked fine. I replaced the Pi power
supply and, surprise surprise, my disk is working fine! Btrfs + luks
encryption. So it seems power was the culprit.
However I'm a bit concerned about the stability of the filesystem. I would
expect some data loss when the drive is disconnected, but why is the whole
filesystem broken?
I can't ensure that power failures will not happen in the future, so
I'm still not sure if I should go with btrfs.

-- 
Best regards

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: My first attempt to use btrfs failed miserably
  2020-02-03  8:42       ` Skibbi
@ 2020-02-03 10:10         ` Qu Wenruo
  2020-02-03 10:17           ` Qu Wenruo
  2020-02-03 10:56           ` Skibbi
  0 siblings, 2 replies; 24+ messages in thread
From: Qu Wenruo @ 2020-02-03 10:10 UTC (permalink / raw)
  To: Skibbi; +Cc: Chris Murphy, Btrfs BTRFS


[-- Attachment #1.1: Type: text/plain, Size: 1733 bytes --]



On 2020/2/3 4:42 PM, Skibbi wrote:
> On Mon, 3 Feb 2020 at 07:51, Qu Wenruo <quwenruo.btrfs@gmx.com> wrote:
>>
>>> I'm beginning to think that my Pi draws more power when used with
>>> external drive (I used only pendrives so far) so I need to investigate
>>> for power issues.
>>
>> That also looks promising.
>> But since it's a USB hdd, what about try it with regular PC?
> 
> I tried on Windows and disk worked fine. I replaced Pi power supply and
> surprise-surprise my disk is working fine! Btrfs + luks encryption. So
> it seems power was the culprit.

That's great to know!

> However I'm a bit concerned about stability of the filesystem. I would
> expect some data loss when drive is disconnected, but why the whole
> filesystem is broken?

It depends on the timing.

In fact, as your initial report showed, btrfs even succeeded in reading a
tree copy from the disk when we lost the device for a while.
And it finally goes RO if btrfs fails to write any tree blocks.

In all cases, btrfs shouldn't fail if all disks follow FUA/FLUSH
behavior (i.e., once FUA/FLUSH returns, all related writes have reached disk).

> I can't ensure that power failures will not happen in the future, so
> I'm still not sure if I should go with btrfs?
> 
IIRC, other fses either just ignore any write error (and cause more
serious problems later, not only data corruption but also metadata
corruption) or just fail like btrfs when disks suddenly disappear.

If the disk is not reliable, then it depends on what you really want:

a kinda paranoid fs which refuses to further screw up the fs, or a fs
ignoring most errors until the whole thing experiences a rapid
unscheduled disassembly.

Thanks,
Qu



^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: My first attempt to use btrfs failed miserably
  2020-02-03 10:10         ` Qu Wenruo
@ 2020-02-03 10:17           ` Qu Wenruo
  2020-02-03 10:56           ` Skibbi
  1 sibling, 0 replies; 24+ messages in thread
From: Qu Wenruo @ 2020-02-03 10:17 UTC (permalink / raw)
  To: Skibbi; +Cc: Chris Murphy, Btrfs BTRFS


[-- Attachment #1.1: Type: text/plain, Size: 2588 bytes --]



On 2020/2/3 6:10 PM, Qu Wenruo wrote:
> 
> 
> On 2020/2/3 4:42 PM, Skibbi wrote:
>> On Mon, 3 Feb 2020 at 07:51, Qu Wenruo <quwenruo.btrfs@gmx.com> wrote:
>>>
>>>> I'm beginning to think that my Pi draws more power when used with
>>>> external drive (I used only pendrives so far) so I need to investigate
>>>> for power issues.
>>>
>>> That also looks promising.
>>> But since it's a USB hdd, what about try it with regular PC?
>>
>> I tried on Windows and disk worked fine. I replaced Pi power supply and
>> surprise-surprise my disk is working fine! Btrfs + luks encryption. So
>> it seems power was the culprit.
> 
> That's great to know!
> 
>> However I'm a bit concerned about stability of the filesystem. I would
>> expect some data loss when drive is disconnected, but why the whole
>> filesystem is broken?
> 
> It depends on the timing.
> 
> In fact, as your initial report said, btrfs even succeeded to read some
> tree copy from the disk when we lost the device for a while.
> And finally goes RO if btrfs fails to write any tree blocks.
> 
> In all cases, btrfs shouldn't fail if all disks follows FUA/FLUSH
> behavior (aka, if FUA/FLUSH returns, all related write should reach disk).

BTW, by "shouldn't fail" here I mean that btrfs shouldn't lose COW data or
metadata at all.

Either the disk fails before the superblock write, in which case the fs
should be what it used to be,
or the disk fails after the superblock write, in which case the fs should
be in the new state (including both metadata and CoWed data).

This is unlike traditional fses, which by default only keep metadata sane
and can lose data on power loss. (They can do journal protection
for data, but with a much bigger performance penalty than btrfs COW.)

For a re-appearing disk, I really don't have a good idea how to address
it, nor would other fses.
What we can really do is just make sure the fs can still be mounted
again after the disk disappearance.

Thanks,
Qu
> 
>> I can't ensure that power failures will not happen in the future, so
>> I'm still not sure if I should go with btrfs?
>>
> IIRC, either other fses just ignore any write error (and cause more
> serious problem later, not only data corruption but also metadata
> corruption) or just fail like btrfs when disks suddenly disappear.
> 
> If the disk is not reliable, then it depends on what really you want.
> 
> A kinda paranoid fs which refuses to further screw up the fs, or a fs
> ignoring most errors until the whole thing experience a rapid
> unscheduled disassembly.
> 
> Thanks,
> Qu
> 



^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: My first attempt to use btrfs failed miserably
  2020-02-03 10:10         ` Qu Wenruo
  2020-02-03 10:17           ` Qu Wenruo
@ 2020-02-03 10:56           ` Skibbi
  2020-02-03 11:09             ` Qu Wenruo
  1 sibling, 1 reply; 24+ messages in thread
From: Skibbi @ 2020-02-03 10:56 UTC (permalink / raw)
  To: Qu Wenruo; +Cc: Chris Murphy, Btrfs BTRFS

On Mon, 3 Feb 2020 at 11:11, Qu Wenruo <quwenruo.btrfs@gmx.com> wrote:

> It depends on the timing.
>
> In fact, as your initial report said, btrfs even succeeded to read some
> tree copy from the disk when we lost the device for a while.
> And finally goes RO if btrfs fails to write any tree blocks.

Yeah, it went RO but when I tried to remount I got bad superblock bla
bla. And I was unable to fix this by using btrfs repair for example.
I'm not sure if it is possible to recover from the error I got. That's
why I'm concerned about power issues in the future. I've been using
ext4 for decades and I don't remember such a fatal filesystem crash.
Yeah, I lost some data due to bad sectors or power loss but I was
always able to mount the filesystem.

-- 
Best regards

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: My first attempt to use btrfs failed miserably
  2020-02-03 10:56           ` Skibbi
@ 2020-02-03 11:09             ` Qu Wenruo
  0 siblings, 0 replies; 24+ messages in thread
From: Qu Wenruo @ 2020-02-03 11:09 UTC (permalink / raw)
  To: Skibbi; +Cc: Chris Murphy, Btrfs BTRFS


[-- Attachment #1.1: Type: text/plain, Size: 1690 bytes --]



On 2020/2/3 6:56 PM, Skibbi wrote:
> On Mon, 3 Feb 2020 at 11:11, Qu Wenruo <quwenruo.btrfs@gmx.com> wrote:
> 
>> It depends on the timing.
>>
>> In fact, as your initial report said, btrfs even succeeded to read some
>> tree copy from the disk when we lost the device for a while.
>> And finally goes RO if btrfs fails to write any tree blocks.
> 
> Yeah, it went RO but when I tried to remount I got bad superblock bla
> bla.

Then that's the most important part.

> And I was unable to fix this by using btrfs repair for example.

btrfs check --repair is really the hardest part.
In theory we shouldn't even need it, but you know that's not the reality.

> I'm not sure if it possible to recover from the error I got. That's
> why I'm concerned about power issues in the future. I've been using
> ext4 for decades and I don't remember that fatal filesystem crash.

All fses should survive power loss, obviously including btrfs.

> Yeah I lost some data due to bad sectors or power loss but I was
> always able to mount the filesystem.
> 
The current conclusion is, as long as the disk follows FUA/FLUSH
correctly, btrfs should provide even data consistency across power loss
(except for certain known bugs in the past, which should all have been fixed).

But the problem is, there seem to be some disks not following the
spec, especially among consumer-grade HDDs, thus sometimes it's recommended
to disable the write cache (i.e., a write only returns once it has reached
the disk).

If you want to be extra safe, then you can go with that solution.
The performance impact shouldn't be noticeable, as the Linux page cache
handles things really well.

Thanks,
Qu



^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: My first attempt to use btrfs failed miserably
  2020-02-03  6:28     ` Skibbi
@ 2020-02-03 16:12       ` Chris Murphy
  2020-02-03 19:01         ` Marc Joliet
  0 siblings, 1 reply; 24+ messages in thread
From: Chris Murphy @ 2020-02-03 16:12 UTC (permalink / raw)
  To: Skibbi; +Cc: Zygo Blaxell, Swâmi Petaramesh, Btrfs BTRFS

On Sun, Feb 2, 2020 at 11:28 PM Skibbi <skibbi@gmail.com> wrote:
>
> On Mon, 3 Feb 2020 at 00:34, Zygo Blaxell <ce3g8jdj@umail.furryterror.org>
> wrote:
> >
> > Same here, except I have seen problems as well as successes.  Some hints:
> >
> > The log is incomplete but there is some evidence of USB disconnects.
> > These are bad.  Fix those before you try to use this hardware to store
> > data.
>
> Yeah, I found out some errors in dmesg suggesting this:
> [  370.569700] usb 2-1: reset SuperSpeed Gen 1 USB device number 2
> using xhci_hcd
> [  428.820969] usb 2-1: reset SuperSpeed Gen 1 USB device number 2
> using xhci_hcd
> [  473.621875] usb 2-1: reset SuperSpeed Gen 1 USB device number 2
> using xhci_hcd
> [  618.254211] usb 2-1: reset SuperSpeed Gen 1 USB device number 2
> using xhci_hcd
> [  664.334958] usb 2-1: reset SuperSpeed Gen 1 USB device number 2
> using xhci_hcd

I get these with a very common USB-SATA enclosure bridge chipset,
plugged directly into an Intel NUC. I also sometimes see dropped
writes. When I use a Dyconn USB hub (externally powered) it never
happens. I'm not a USB expert, but my understanding is a hub isn't a
simple thing, it's reading and rewriting the whole stream to and from
host and device. So any peculiarities between them tend to get cleaned
up.

> Yeah, WD Passport Drives are using USB-SATA. I will experiment a bit
> more with that.

It might be defaulting to using the Linux kernel's uas driver, there's
a way to blacklist that if it's causing problems. I have yet another
enclosure that gives me fits with uas driver, but again no problem if
connected through the hub.
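
One way to do that blacklisting is a usb-storage quirk with the "u" flag
(ignore uas) for the enclosure's vendor:product ID.  The sketch below
builds the parameter from lsusb output; the helper name is made up for
this example, and it assumes lsusb prints the usual "ID vvvv:pppp" field.

```shell
#!/bin/sh
# Turn lsusb output (stdin) into a usb-storage quirk that disables uas
# for each listed device. uas_quirk_from_lsusb is a made-up helper name.
uas_quirk_from_lsusb() {
    sed -nE 's/.*ID ([0-9a-f]{4}):([0-9a-f]{4}).*/usb-storage.quirks=\1:\2:u/p'
}

# Typical use, filtering lsusb down to the enclosure first:
#   lsusb | grep -i 'western digital' | uas_quirk_from_lsusb
```

The resulting string goes on the kernel command line and takes effect
after a reboot.  `lsusb -t` shows which driver (uas or usb-storage) is
bound to the device, so the change can be checked afterwards.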


> Yeah, I need to check if my Pi is not having power issues under heavy
> load (save data on encrypted partition).

A laptop drive will draw more than 1A on startup. And about 0.3A while
spinning and writing. That's quite a lot, hence also why I stick it on
a hub with an external power supply.
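
On a Raspberry Pi the firmware can report whether it has seen the supply
sag.  A small sketch, assuming the `vcgencmd` tool shipped with Raspbian
is available and prints something like "throttled=0x50005"; the helper
name is made up for this example.

```shell
#!/bin/sh
# Decode the value printed by `vcgencmd get_throttled` on a Raspberry Pi,
# e.g. "throttled=0x50005". undervoltage_seen is a made-up helper name.
#   bit 0  = under-voltage right now
#   bit 16 = under-voltage has occurred since boot
undervoltage_seen() {
    val=${1#throttled=}
    [ $(( ${val:-0} & 0x10001 )) -ne 0 ]
}

# Typical use on the Pi, ideally during a heavy write to the disk:
#   undervoltage_seen "$(vcgencmd get_throttled)" && echo "under-voltage seen"
```

If this reports under-voltage while the drive is busy, the power supply
or cable is the first thing to replace.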


-- 
Chris Murphy

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: My first attempt to use btrfs failed miserably
  2020-02-03  6:38   ` Skibbi
  2020-02-03  6:51     ` Qu Wenruo
@ 2020-02-03 16:17     ` Chris Murphy
  1 sibling, 0 replies; 24+ messages in thread
From: Chris Murphy @ 2020-02-03 16:17 UTC (permalink / raw)
  To: Skibbi; +Cc: Chris Murphy, Btrfs BTRFS

On Sun, Feb 2, 2020 at 11:39 PM Skibbi <skibbi@gmail.com> wrote:
>
> I removed luks encryption and had the same btrfs errors after several
> GB of writes. Then I reformatted drive to ext4 and was able to save
> 60GB without hiccups. Of course, you may be right that ext4 silently
> damages my data, but at least I was able to see it on the drive after
> remount/reboot.

It could be days or months later that it shows up as a problem, if
there's no checksumming. What version of e2fsprogs? metadata_csum
became default in e2fsprogs 1.44.0.
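
To check an existing ext4 filesystem rather than guess from the progs
version, the feature list can be inspected.  A sketch, assuming the
filesystem is on /dev/sdX1 (a placeholder); the helper name is made up
for this example.

```shell
#!/bin/sh
# Succeeds if `dumpe2fs -h` output on stdin lists the metadata_csum feature.
# has_metadata_csum is a made-up helper name for this example.
has_metadata_csum() {
    grep '^Filesystem features:' | grep -qw 'metadata_csum'
}

# Typical use (as root; /dev/sdX1 is a placeholder):
#   dumpe2fs -h /dev/sdX1 2>/dev/null | has_metadata_csum && echo "checksummed"
#   mke2fs -V    # prints the installed e2fsprogs version
```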


-- 
Chris Murphy

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: My first attempt to use btrfs failed miserably
  2020-02-03 16:12       ` Chris Murphy
@ 2020-02-03 19:01         ` Marc Joliet
  0 siblings, 0 replies; 24+ messages in thread
From: Marc Joliet @ 2020-02-03 19:01 UTC (permalink / raw)
  To: linux-btrfs

[-- Attachment #1: Type: text/plain, Size: 1367 bytes --]

On Monday, 3 February 2020, 17:12:17 CET, you wrote:
> > Yeah, I found out some errors in dmesg suggesting this:
> > [  370.569700] usb 2-1: reset SuperSpeed Gen 1 USB device number 2
> > using xhci_hcd
> > [  428.820969] usb 2-1: reset SuperSpeed Gen 1 USB device number 2
> > using xhci_hcd
> > [  473.621875] usb 2-1: reset SuperSpeed Gen 1 USB device number 2
> > using xhci_hcd
> > [  618.254211] usb 2-1: reset SuperSpeed Gen 1 USB device number 2
> > using xhci_hcd
> > [  664.334958] usb 2-1: reset SuperSpeed Gen 1 USB device number 2
> > using xhci_hcd
>
> I get these with a very common USB-SATA enclosure bridge chipset,
> plugged directly into an Intel NUC. I also sometimes see dropped
> writes. When I use a Dyconn USB hub (externally powered) it never
> happens. I'm not a USB expert, but my understanding is a hub isn't a
> simple thing, it's reading and rewriting the whole stream to and from
> host and device. So any peculiarities between them tend to get cleaned
> up.

FWIW, I used to see errors like this with my external HDD (3TB Toshiba), but
not anymore after I increased its device timeout, i.e., its SCSI command
timeout, to 3 minutes (following a recommendation on the Debian wiki).
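
For reference, a minimal sketch of that change, assuming the usual
sysfs location /sys/block/<dev>/device/timeout (value in seconds).
set_scsi_timeout is a hypothetical helper name, and its optional third
argument exists only so it can be tried against a scratch directory
instead of the real /sys. The value is not persistent: it has to be
set again after a reboot or replug, e.g. from a udev rule.

```shell
#!/bin/sh
# Sketch: raise the SCSI command timeout for one disk via sysfs.
# set_scsi_timeout is a hypothetical helper, not an existing tool.
# Arguments: device name (e.g. sda), timeout in seconds, and an
# optional sysfs root (defaults to /sys).
set_scsi_timeout() {
    dev=$1
    secs=$2
    root=${3:-/sys}
    file="$root/block/$dev/device/timeout"

    if [ ! -w "$file" ]; then
        echo "cannot write $file (wrong device name, or not root?)" >&2
        return 1
    fi
    echo "$secs" > "$file"
}

# Example (as root; sda is an assumed device name, 180 s = 3 minutes):
#   set_scsi_timeout sda 180
```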

--
Marc Joliet
--
"People who think they know everything really annoy those of us who know we
don't" - Bjarne Stroustrup



* Re: My first attempt to use btrfs failed miserably
  2020-02-02 12:45 My first attempt to use btrfs failed miserably Skibbi
                   ` (4 preceding siblings ...)
  2020-02-02 19:57 ` Chris Murphy
@ 2020-02-03 19:14 ` Achim Gratz
  5 siblings, 0 replies; 24+ messages in thread
From: Achim Gratz @ 2020-02-03 19:14 UTC (permalink / raw)
  To: linux-btrfs

Skibbi writes:
> So I decided to try btrfs on my new portable WD Password Drive
> attached to Raspberry Pi 4.

That's an actual hard disk that likely pulls a lot of peak current, just
under the limit allowed by the USB standard.  You are begging for
trouble connecting that to the rasPi without an extra PSU for the drive.
I don't have that exact model here, but if you can feed external power
into the drive, do it.  Second, the rasPi4 itself is a bit of a power
hog, so you will need a stable power supply there as well (250mV of
overvoltage doesn't hurt either), even when the drive doesn't draw
current from the USB port.  The annoying thing with drives on USB is
that they may well seem to be OK if you didn't specifically look for
disconnect/reconnect events, but each time one happens, data is most
likely lost anyway.  Btrfs just tells you sooner than some other fs.
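
A minimal sketch of how to look for such events in the kernel log.
It assumes the message wording seen elsewhere in this thread
("usb 2-1: reset SuperSpeed Gen 1 USB device number 2") plus the
kernel's "USB disconnect" message; count_usb_events is a hypothetical
helper name, and the pattern may need adjusting for other kernels.

```shell
#!/bin/sh
# Sketch: count USB reset and disconnect messages in a kernel log
# read from stdin. count_usb_events is a hypothetical helper.
# grep -c exits non-zero when the count is 0, so "|| true" keeps the
# function's exit status at 0 while still printing the count.
count_usb_events() {
    grep -cE 'usb [0-9]+-[0-9.]+: (reset .* USB device|USB disconnect)' || true
}

# Example:
#   dmesg | count_usb_events
```

Anything other than 0 during a long copy is worth chasing before
blaming the filesystem.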


Regards,
Achim.
-- 
+<[Q+ Matrix-12 WAVE#46+305 Neuron microQkb Andromeda XTk Blofeld]>+

Samples for the Waldorf Blofeld:
http://Synth.Stromeko.net/Downloads.html#BlofeldSamplesExtra




Thread overview: 24+ messages
2020-02-02 12:45 My first attempt to use btrfs failed miserably Skibbi
2020-02-02 12:56 ` Qu Wenruo
2020-02-02 13:22   ` Stephan von Krawczynski
2020-02-02 20:04     ` Chris Murphy
2020-02-02 13:29   ` Martin Raiber
2020-02-02 13:36     ` Qu Wenruo
2020-02-02 14:14 ` Roman Mamedov
2020-02-02 14:45 ` Swâmi Petaramesh
2020-02-02 23:34   ` Zygo Blaxell
2020-02-03  6:28     ` Skibbi
2020-02-03 16:12       ` Chris Murphy
2020-02-03 19:01         ` Marc Joliet
2020-02-03  7:00     ` Swâmi Petaramesh
2020-02-02 19:56 ` Chris Murphy
2020-02-03  6:38   ` Skibbi
2020-02-03  6:51     ` Qu Wenruo
2020-02-03  8:42       ` Skibbi
2020-02-03 10:10         ` Qu Wenruo
2020-02-03 10:17           ` Qu Wenruo
2020-02-03 10:56           ` Skibbi
2020-02-03 11:09             ` Qu Wenruo
2020-02-03 16:17     ` Chris Murphy
2020-02-02 19:57 ` Chris Murphy
2020-02-03 19:14 ` Achim Gratz
