linux-btrfs.vger.kernel.org archive mirror
* Doing anything with the external disk except mounting causes whole system lockup
From: Cerem Cem ASLAN @ 2022-12-19 17:32 UTC
  To: Btrfs BTRFS

I've been using my scripts for mounting partitions, unmounting, btrfs
send|receive etc. while I'm dealing with my external/backup hard
disks.

Recently I had trouble so I reformatted my external spinning disk and
transferred all snapshots to it (~800GB).

At the end of the transfer there was an error (I might have modified the
bash script while it was running). The `btrbk ...` command finished, then
my script gave an error (that's normal), so I restarted it. From this
moment on, I could mount my partitions, but I could never do a btrfs
send|receive, scrub, or unmount again because the system simply locked
up.

I ran `dmesg -w` in another terminal but couldn't save the output
(because the system was locked up), so I took a photo of it:
https://imgur.com/LJfgjbY
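
For anyone who would rather keep the kernel log as text than photograph it, here is a minimal sketch that appends each `dmesg -w` line to a file and syncs it to disk straight away. The function names and the log path are hypothetical, and this rests on assumptions: if the whole machine freezes, the last lines may still be lost, and the log should live on a disk other than the one causing trouble.

```python
import os
import subprocess


def save_lines(lines, path):
    """Append each line to `path` and force it to disk before reading the next."""
    count = 0
    with open(path, "a", encoding="utf-8") as log:
        for line in lines:
            log.write(line)
            log.flush()             # push Python's buffer to the OS
            os.fsync(log.fileno())  # ask the OS to write it to the device
            count += 1
    return count


def follow_dmesg(path):
    """Stream `dmesg -w` into `path` until interrupted (may require root)."""
    with subprocess.Popen(["dmesg", "-w"], stdout=subprocess.PIPE, text=True) as proc:
        return save_lines(proc.stdout, path)
```

Calling `follow_dmesg("/some/other/disk/dmesg.log")` (hypothetical path) before starting the send|receive would leave whatever was logged up to the lockup in that file.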

I haven't lost any data and I still have another backup disk, so no
worries. I'm just keeping the disk just in case you may require some
more information this week.

* Linux erik3 6.0.0-0.deb11.2-amd64 #1 SMP PREEMPT_DYNAMIC Debian
6.0.3-1~bpo11+1 (2022-10-29) x86_64 GNU/Linux
* btrfs-progs v5.10.1
* mount options I'm using: rw,noatime


* Re: Doing anything with the external disk except mounting causes whole system lockup
From: Qu Wenruo @ 2022-12-20 11:09 UTC
  To: Cerem Cem ASLAN, Btrfs BTRFS



On 2022/12/20 01:32, Cerem Cem ASLAN wrote:
> I've been using my scripts for mounting partitions, unmounting, btrfs
> send|receive etc. while I'm dealing with my external/backup hard
> disks.
> 
> Recently I had trouble so I reformatted my external spinning disk and
> transferred all snapshots to it (~800GB).
> 
> At the end of the transfer there was an error (I might have modified the
> bash script while it was running). The `btrbk ...` command finished, then
> my script gave an error (that's normal), so I restarted it. From this
> moment on, I could mount my partitions, but I could never do a btrfs
> send|receive, scrub, or unmount again because the system simply locked
> up.
>
> I ran `dmesg -w` in another terminal but couldn't save the output
> (because the system was locked up), so I took a photo of it:
> https://imgur.com/LJfgjbY

That's an RCU stall, so it could be anything. I doubt it's really btrfs
causing the problem.

In fact, I've hit similar problems before, very randomly: sometimes when
playing Steam games, sometimes just a random crash/lockup.

If you can set up netconsole, we can at least get a better view of the
whole situation.
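
As a rough illustration of the receiving side: netconsole sends kernel log messages as UDP datagrams to a second machine, so a few lines of Python on that machine are enough to capture them. This is only a sketch under assumptions. Port 6666 is netconsole's documented default target port, the function names and log file name are hypothetical, and configuring the sender is left to the kernel's netconsole documentation.

```python
import socket


def open_listener(host="0.0.0.0", port=6666):
    """Bind a UDP socket on which netconsole datagrams can be received."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sock.bind((host, port))
    return sock


def receive_message(sock, bufsize=65535):
    """Block until one datagram arrives and return (sender_ip, text)."""
    data, (sender_ip, _sender_port) = sock.recvfrom(bufsize)
    return sender_ip, data.decode("utf-8", errors="replace")


def log_forever(sock, path):
    """Append every received message to `path`, flushing after each one."""
    with open(path, "a", encoding="utf-8") as log:
        while True:
            sender_ip, text = receive_message(sock)
            log.write(f"[{sender_ip}] {text}")
            log.flush()
```

Running `log_forever(open_listener(), "netconsole.log")` on the second machine keeps the messages even if the sending machine locks up completely.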

In my case, netconsole sometimes pointed to RCU and sometimes to a
random general protection fault.

I tried replacing my RAM, which didn't help. Finally I bought a new mobo
and CPU (switched from 5900X + B450I to 13700K + B660I), and there have
been no crashes since.

So if you're hitting RCU stalls, I'd strongly recommend testing the
same fs on other systems.

Thanks,
Qu


* Re: Doing anything with the external disk except mounting causes whole system lockup
From: Filipe Manana @ 2022-12-20 11:56 UTC
  To: Qu Wenruo; +Cc: Cerem Cem ASLAN, Btrfs BTRFS

On Tue, Dec 20, 2022 at 11:52 AM Qu Wenruo <wqu@suse.com> wrote:
>
>
>
> On 2022/12/20 01:32, Cerem Cem ASLAN wrote:
> > I've been using my scripts for mounting partitions, unmounting, btrfs
> > send|receive etc. while I'm dealing with my external/backup hard
> > disks.
> >
> > Recently I had trouble so I reformatted my external spinning disk and
> > transferred all snapshots to it (~800GB).
> >
> > At the end of the transfer there was an error (I might have modified the
> > bash script while it was running). The `btrbk ...` command finished, then
> > my script gave an error (that's normal), so I restarted it. From this
> > moment on, I could mount my partitions, but I could never do a btrfs
> > send|receive, scrub, or unmount again because the system simply locked
> > up.
> >
> > I ran `dmesg -w` in another terminal but couldn't save the output
> > (because the system was locked up), so I took a photo of it:
> > https://imgur.com/LJfgjbY
>
> That's an RCU stall, so it could be anything. I doubt it's really btrfs
> causing the problem.

Nope, it's a btrfs problem.
This was caused by a bad backport to 6.0.3, which is what Cerem is
running according to the screenshot.

This has been reported before, for example at:

https://lore.kernel.org/linux-btrfs/2291416ef48d98059f9fdc5d865b0ff040148237.camel@scientia.org/
https://lore.kernel.org/linux-btrfs/1c531dd5de7477c8b6ec15d4aebb8e42ae460925.camel@scientia.org/

This was eventually fixed in 6.0.5:

https://cdn.kernel.org/pub/linux/kernel/v6.x/ChangeLog-6.0.5

by commit:

commit 217fd7557896d990c3dd8beea83a6feeb504f235
Author: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Date:   Wed Oct 26 12:24:13 2022 +0200

    Revert "btrfs: call __btrfs_remove_free_space_cache_locked on cache load failure"

    This reverts commit 3ea7c50339859394dd667184b5b16eee1ebb53bc which is
    commit 8a1ae2781dee9fc21ca82db682d37bea4bd074ad upstream.

    It causes many reported btrfs issues, so revert it for now.

    Cc: Josef Bacik <josef@toxicpanda.com>
    Cc: David Sterba <dsterba@suse.com>
    Cc: Sasha Levin <sashal@kernel.org>
    Reported-by: Tobias Powalowski <tobias.powalowski@googlemail.com>
    Link: https://lore.kernel.org/r/CAHfPjO8G1Tq2iJDhPry-dPj1vQZRh4NYuRmhHByHgu7_2rQkrQ@mail.gmail.com
    Reported-by: Ernst Herzberg <earny@net4u.de>
    Link: https://lore.kernel.org/r/8196dd88-4e11-78a7-8f96-20cf3e886e68@net4u.de
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

So just upgrade the kernel in your distro. The latest 6.0 stable
release is 6.0.14, so anything from 6.0.5 up to that should be fine.
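
To check a given machine against that range, a small sketch like the following may help. It assumes, based only on the versions named above, that 6.0.3 and 6.0.4 carry the bad backport and that 6.0.5 is the first fixed release; the function names are hypothetical. Note that on Debian the release field (6.0.0-0.deb11.2-amd64) is an ABI name, so the sketch reads the package version from the build string instead.

```python
import re

# Assumption taken from this thread: bad backport in 6.0.3, revert in 6.0.5.
FIRST_BAD = (6, 0, 3)
FIRST_FIXED = (6, 0, 5)


def upstream_version(uname_text):
    """Return the (major, minor, patch) kernel version found in `uname -a` text.

    Prefers the Debian package version (e.g. "Debian 6.0.3-1~bpo11+1") when
    present, otherwise falls back to the first x.y.z triple in the text.
    """
    match = re.search(r"Debian (\d+)\.(\d+)\.(\d+)", uname_text)
    if match is None:
        match = re.search(r"(\d+)\.(\d+)\.(\d+)", uname_text)
    if match is None:
        raise ValueError("no kernel version found")
    return tuple(int(part) for part in match.groups())


def is_affected(version):
    """True if the version falls in the assumed affected range."""
    return FIRST_BAD <= version < FIRST_FIXED
```

Feeding it the `uname -a` line from the original report yields `(6, 0, 3)`, which falls inside the assumed affected range.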




* Re: Doing anything with the external disk except mounting causes whole system lockup
From: Cerem Cem ASLAN @ 2022-12-25  0:40 UTC
  To: Filipe Manana; +Cc: Qu Wenruo, Btrfs BTRFS

Since I couldn't find a newer stable 6.0 kernel package in the Debian
repositories, I downgraded the kernel to 5.19.0-0.deb11.2-amd64. This
solved the issue.

Thank you.

