* 5.4.8: WARNING: errors detected during scrubbing, corrected
@ 2020-01-09 16:28 Marc MERLIN
2020-01-10 15:03 ` Josef Bacik
From: Marc MERLIN @ 2020-01-09 16:28 UTC (permalink / raw)
To: linux-btrfs
Howdy,
I have 6 btrfs pools on my laptop on 3 different SSDs.
After a few years, one of them is now very slow to scrub
and hangs my laptop while it runs.
This started under 5.3.8, but upgrading to 5.4.8 didn't fix it.
Also, it output 'errors during scrubbing', but I see nothing in the kernel log:
btrfs scrub start -Bd /mnt/btrfs_pool2
scrub device /dev/mapper/pool2 (id 1) done
scrub started at Thu Jan 9 01:46:45 2020 and finished after 01:29:49
total bytes scrubbed: 1.27TiB with 0 errors
WARNING: errors detected during scrubbing, corrected
real 89m49.190s
user 0m0.000s
sys 13m26.548s
89 minutes is also longer than normal.
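For a rough sense of scale, the figures in the scrub output above can be turned into an average throughput and a kernel-CPU share. This is only a back-of-the-envelope sketch in Python: it takes the reported 1.27TiB, 01:29:49 and `sys` time at face value, and the function names are made up for illustration.

```python
# Figures copied from the scrub output and `time` report above.
SCRUBBED_TIB = 1.27            # "total bytes scrubbed: 1.27TiB"
DURATION_HMS = "01:29:49"      # "finished after 01:29:49"
SYS_TIME_S = 13 * 60 + 26.548  # "sys 13m26.548s"


def hms_to_seconds(hms: str) -> int:
    """Convert an HH:MM:SS string to a number of seconds."""
    hours, minutes, seconds = (int(part) for part in hms.split(":"))
    return hours * 3600 + minutes * 60 + seconds


def throughput_mib_s(tib: float, seconds: float) -> float:
    """Average throughput in MiB/s for `tib` TiB processed in `seconds`."""
    return tib * 1024 * 1024 / seconds


elapsed = hms_to_seconds(DURATION_HMS)
rate = throughput_mib_s(SCRUBBED_TIB, elapsed)
sys_share = SYS_TIME_S / elapsed

print(f"{rate:.0f} MiB/s average, {sys_share:.0%} of wall time in sys")
```

That works out to roughly 247 MiB/s averaged over the whole run, with about 15% of wall-clock time spent in `sys`. An average like this says nothing about stalls within the run, so it is only a baseline to compare against a scrub of one of the healthy pools.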
balance works ok:
logger: Quick Metadata and Data Balance of /mnt/btrfs_pool2 (/dev/mapper/pool2)
Done, had to relocate 0 out of 837 chunks
Done, had to relocate 0 out of 837 chunks
Done, had to relocate 0 out of 837 chunks
I re-ran a bigger balance, and it ran fine too:
btrfs balance start -musage=60 /mnt/btrfs_pool2; btrfs balance start -dusage=60 /mnt/btrfs_pool2
Jan 9 01:46:45 saruman kernel: [14530.056667] BTRFS info (device dm-3): balance: start -musage=0 -susage=0
Jan 9 01:46:45 saruman kernel: [14530.059623] BTRFS info (device dm-3): balance: ended with status: 0
Jan 9 01:46:45 saruman kernel: [14530.134043] BTRFS info (device dm-3): balance: start -dusage=0
Jan 9 01:46:45 saruman kernel: [14530.135525] BTRFS info (device dm-3): balance: ended with status: 0
Jan 9 01:46:45 saruman kernel: [14530.193798] BTRFS info (device dm-3): balance: start -dusage=20
Jan 9 01:46:45 saruman kernel: [14530.195642] BTRFS info (device dm-3): balance: ended with status: 0
Jan 9 01:46:45 saruman kernel: [14530.240290] BTRFS info (device dm-3): scrub: started on devid 1
Jan 9 01:58:21 saruman kernel: [15226.254196] Tainted: G W OE 5.4.8-amd64-preempt-sysrq-20190816 #1
Jan 9 01:58:21 saruman kernel: [15226.254198] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Jan 9 01:58:21 saruman kernel: [15226.254201] btrfs-transacti D 0 12403 2 0x80004000
Jan 9 01:58:21 saruman kernel: [15226.254204] Call Trace:
Jan 9 01:58:21 saruman kernel: [15226.254211] ? __schedule+0x575/0x5d0
Jan 9 01:58:21 saruman kernel: [15226.254215] ? __list_add+0x12/0x2b
Jan 9 01:58:21 saruman kernel: [15226.254218] schedule+0x7b/0xac
Jan 9 01:58:21 saruman kernel: [15226.254222] btrfs_scrub_pause+0x99/0xd3
Jan 9 01:58:21 saruman kernel: [15226.254226] ? finish_wait+0x62/0x62
Jan 9 01:58:21 saruman kernel: [15226.254231] btrfs_commit_transaction+0x307/0x82b
Jan 9 01:58:21 saruman kernel: [15226.254235] ? start_transaction+0x37b/0x3ec
Jan 9 01:58:21 saruman kernel: [15226.254239] ? schedule_timeout+0xf/0xea
Jan 9 01:58:21 saruman kernel: [15226.254243] transaction_kthread+0xdd/0x151
Jan 9 01:58:21 saruman kernel: [15226.254247] ? btrfs_cleanup_transaction+0x417/0x417
Jan 9 01:58:21 saruman kernel: [15226.254250] kthread+0xf5/0xfa
Jan 9 01:58:21 saruman kernel: [15226.254253] ? kthread_create_worker_on_cpu+0x65/0x65
Jan 9 01:58:21 saruman kernel: [15226.254256] ret_from_fork+0x35/0x40
Jan 9 01:58:21 saruman kernel: [15226.254554] INFO: task cron:3869 blocked for more than 120 seconds.
from here, lots of hangs until eventually:
Jan 9 03:16:34 saruman kernel: [19919.454109] BTRFS info (device dm-3): scrub: finished on devid 1 with status: 0
I see no error about the scrub though.
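Since there were many hung-task reports between the start and the end of the scrub, a quick tally by task name can show what was stuck most often. A minimal sketch in Python, assuming the kernel log has been saved to a text file and that the reports use the `INFO: task NAME:PID blocked for more than N seconds.` wording seen above; the function name and the file path in the note below are hypothetical.

```python
import re
from collections import Counter
from typing import Iterable

# Matches the hung-task header line, e.g.
# "INFO: task cron:3869 blocked for more than 120 seconds."
HUNG_RE = re.compile(
    r"INFO: task (?P<comm>.+?):(?P<pid>\d+) "
    r"blocked for more than (?P<secs>\d+) seconds"
)


def tally_hung_tasks(lines: Iterable[str]) -> Counter:
    """Count hung-task reports per task name in an iterable of log lines."""
    counts: Counter = Counter()
    for line in lines:
        match = HUNG_RE.search(line)
        if match:
            counts[match.group("comm")] += 1
    return counts


# Two lines taken from the log excerpt above; only the first is a report.
sample = [
    "Jan 9 01:58:21 saruman kernel: [15226.254554] "
    "INFO: task cron:3869 blocked for more than 120 seconds.",
    "Jan 9 01:58:21 saruman kernel: [15226.254204] Call Trace:",
]
print(tally_hung_tasks(sample))
```

With a saved log, something like `tally_hung_tasks(open("kern.log"))` would run the same count over the whole file.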
saruman:/mnt/btrfs_pool2# btrfs fi show .
Label: 'btrfs_pool2' uuid: c3ac7621-79da-4d4f-bd59-d12fe7ba3578
Total devices 1 FS bytes used 785.58GiB
devid 1 size 1.12TiB used 831.21GiB path /dev/mapper/pool2
saruman:/mnt/btrfs_pool2# btrfs fi df .
Data, single: total=817.08GiB, used=779.88GiB
System, DUP: total=64.00MiB, used=128.00KiB
Metadata, DUP: total=7.00GiB, used=5.70GiB
GlobalReserve, single: total=512.00MiB, used=64.00KiB
saruman:/mnt/btrfs_pool2# btrfs fi usage .
Overall:
Device size: 1.12TiB
Device allocated: 831.21GiB
Device unallocated: 315.79GiB
Device missing: 0.00B
Used: 791.28GiB
Free (estimated): 352.99GiB (min: 195.10GiB)
Data ratio: 1.00
Metadata ratio: 2.00
Global reserve: 512.00MiB (used: 0.00B)
Data,single: Size:817.08GiB, Used:779.88GiB
/dev/mapper/pool2 817.08GiB
Metadata,DUP: Size:7.00GiB, Used:5.70GiB
/dev/mapper/pool2 14.00GiB
System,DUP: Size:64.00MiB, Used:128.00KiB
/dev/mapper/pool2 128.00MiB
Unallocated:
/dev/mapper/pool2 315.79GiB
I'm going to stop the scrub for now, but clearly that's not so good.
What should I try next?
Thanks,
Marc
--
"A mouse is a device used to point at the xterm you want to type in" - A.S.R.
Home page: http://marc.merlins.org/ | PGP 7F55D5F27AAF9D08
* Re: 5.4.8: WARNING: errors detected during scrubbing, corrected
From: Josef Bacik @ 2020-01-10 15:03 UTC (permalink / raw)
To: Marc MERLIN, linux-btrfs
On 1/9/20 11:28 AM, Marc MERLIN wrote:
> Howdy,
>
> I have 6 btrfs pools on my laptop on 3 different SSDs.
> After a few years, one of them is now very slow to scrub
> and hangs my laptop while it runs.
> This started under 5.3.8, but upgrading to 5.4.8 didn't fix it.
>
What the hell kind of laptop are you running that has 3 different SSDs? That
thing has got to weigh a ton.
> Also, it output 'errors during scrubbing', but I see nothing in the kernel log:
> btrfs scrub start -Bd /mnt/btrfs_pool2
> scrub device /dev/mapper/pool2 (id 1) done
> scrub started at Thu Jan 9 01:46:45 2020 and finished after 01:29:49
> total bytes scrubbed: 1.27TiB with 0 errors
> WARNING: errors detected during scrubbing, corrected
>
> real 89m49.190s
> user 0m0.000s
> sys 13m26.548s
>
>
> 89 minutes is also longer than normal.
Can you run the bcc tool offcputime
https://github.com/iovisor/bcc/blob/master/tools/offcputime.py
while scrub is running to get a few stack traces of where we're spending all of
our time? It'll help narrow down who is to blame. Thanks,
Josef
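One possible way to collect those traces, sketched in Python: the helper below only builds and prints a command line, it does not run anything. The flags used (`-K` for kernel stacks only, `-f` for folded output, and a trailing duration in seconds) come from the tool's usage text. The executable name `offcputime` and the 30-second duration are assumptions, since the tool is packaged under different names and paths depending on the distribution.

```python
import shlex
from typing import List


def offcputime_argv(duration_s: int, tool: str = "offcputime") -> List[str]:
    """Build an argv list for a system-wide, kernel-stack-only capture.

    `tool` is an assumed executable name; adjust it to wherever bcc's
    offcputime is installed on the machine in question.
    """
    if duration_s <= 0:
        raise ValueError("duration_s must be a positive number of seconds")
    # -K: show kernel stacks only, -f: folded output (one stack per line),
    # trailing positional argument: capture duration in seconds.
    return [tool, "-K", "-f", str(duration_s)]


# Print the command so it can be reviewed before being run as root.
print(shlex.join(offcputime_argv(30)))
```

The printed command would be run as root while the scrub is in progress, with its output redirected to a file that can be attached to a reply.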
* Re: 5.4.8: WARNING: errors detected during scrubbing, corrected
From: Marc MERLIN @ 2020-01-10 16:22 UTC (permalink / raw)
To: Josef Bacik; +Cc: linux-btrfs
On Fri, Jan 10, 2020 at 10:03:57AM -0500, Josef Bacik wrote:
> On 1/9/20 11:28 AM, Marc MERLIN wrote:
> > Howdy,
> >
> > I have 6 btrfs pools on my laptop on 3 different SSDs.
> > After a few years, one of them is now very slow to scrub
> > and hangs my laptop while it runs.
> > This started under 5.3.8, but upgrading to 5.4.8 didn't fix it.
>
> What the hell kind of laptop are you running that has 3 different SSDs?
> That thing has got to weigh a ton.
Eheh :)
ThinkPad P70: one M.2 drive (room for one more) and 2x 2.5" SSDs
> Can you run the bcc tool offcputime
>
> https://github.com/iovisor/bcc/blob/master/tools/offcputime.py
>
> while scrub is running to get a few stack traces of where we're spending all
> of our time? It'll help narrow down who is to blame. Thanks,
Will do and report back, thanks.
Marc
--
"A mouse is a device used to point at the xterm you want to type in" - A.S.R.
Home page: http://marc.merlins.org/ | PGP 7F55D5F27AAF9D08