linux-btrfs.vger.kernel.org archive mirror
* btrfs scrub's dmesg log is fairly incomplete (rate-limiting?)
@ 2019-12-01 21:52 Fedja Beader
From: Fedja Beader @ 2019-12-01 21:52 UTC (permalink / raw)
  To: linux-btrfs

Hello,

I had a broken hard-disk from which ddrescue recovered all but about 1600MB of data. As a result, the copy of it had roughly 50000 uncorrectable errors as reported after scrub.

I saved the dmesg log recorded during this scrub, parsed the logical numbers out of it and finally used "btrfs inspect-internal logical-resolve" to obtain a list of files.
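
A minimal sketch of the extraction step described above, in Python. It assumes the scrub messages contain the phrase "at logical <number>", which is an assumption about the log format on this kernel; adjust the pattern to what your dmesg actually shows. The function names and the /mnt mount point are hypothetical. The sketch only builds the "btrfs inspect-internal logical-resolve" command lines and does not run them.

```python
import re

# Assumed message shape: "... at logical <number> on dev ...".
# This pattern is an assumption, not taken from the thread.
LOGICAL_RE = re.compile(r"\bat logical (\d+)\b")


def extract_logicals(dmesg_text):
    """Return unique logical addresses in order of first appearance."""
    seen = set()
    ordered = []
    for match in LOGICAL_RE.finditer(dmesg_text):
        logical = int(match.group(1))
        if logical not in seen:
            seen.add(logical)
            ordered.append(logical)
    return ordered


def resolve_commands(logicals, mountpoint):
    """Build one logical-resolve argv list per logical address."""
    return [
        ["btrfs", "inspect-internal", "logical-resolve", str(logical), mountpoint]
        for logical in logicals
    ]


if __name__ == "__main__":
    import sys

    # Usage: dmesg | python3 this_script.py /mnt   (hypothetical mount point)
    for argv in resolve_commands(extract_logicals(sys.stdin.read()), sys.argv[1]):
        print(" ".join(argv))
```

Deduplicating before resolving keeps the command list short when the same block is reported more than once.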

However, after manually removing or restoring those files, the subsequent run of "btrfs scrub" still produced >45000 uncorrectable errors. Indeed, the files reported when I repeated the above method are damaged (input/output error on cat > /dev/null).

It was suggested that rate-limiting could be the cause of this. I then recompiled the kernel with the "if (__ratelimit..." conditional commented out (in 4.9.24 there is only one occurrence of it in btrfs_printk), rebooted, and disabled dmesg rate-limiting with sysctl kernel.printk_ratelimit=0. Then I ran scrub again.

The result of this scrub was 41000 uncorrectable errors. However, after manually repairing all the problems and re-running scrub, 39000 uncorrectable errors still remain.


Is there more rate-limiting going on? If so, how do I disable it?

It was also suggested to me to run btrfs check --check-data-csum, but it seems exceptionally slow (roughly 4 MB/s). Has this been addressed or am I doing something wrong?


kernel 4.9.24
btrfs-progs v4.6.1


With kind regards,
Fedja



* Re: btrfs scrub's dmesg log is fairly incomplete (rate-limiting?)
@ 2019-12-05 19:09 ` David Sterba
From: David Sterba @ 2019-12-05 19:09 UTC (permalink / raw)
  To: Fedja Beader; +Cc: linux-btrfs

On Sun, Dec 01, 2019 at 09:52:13PM +0000, Fedja Beader wrote:
> I had a broken hard-disk from which ddrescue recovered all but about
> 1600MB of data. As a result, the copy of it had roughly 50000
> uncorrectable errors as reported after scrub.
> 
> I saved the dmesg log recorded during this scrub, parsed the logical
> numbers out of it and finally used "btrfs inspect-internal
> logical-resolve" to obtain a list of files.
> 
> However, after manually removing or restoring those files, the
> subsequent run of "btrfs scrub" still produced >45000 uncorrectable
> errors. Indeed, the files reported when I repeated the above method
> are damaged (input/output error on cat > /dev/null).
> 
> It was suggested that rate-limiting could be the cause of this. I then
> recompiled the kernel with the "if (__ratelimit..." conditional
> commented out (in 4.9.24 there is only one occurrence of it in
> btrfs_printk), rebooted, and disabled dmesg rate-limiting with sysctl
> kernel.printk_ratelimit=0. Then I ran scrub again.
> 
> The result of this scrub was 41000 uncorrectable errors. However,
> after manually repairing all the problems and re-running scrub, 39000
> uncorrectable errors still remain.
> 
> Is there more rate-limiting going on? If so, how do I disable it?

That's indeed caused by ratelimiting. There are __ratelimit calls
specific to the scrub error messages (called in
scrub_handle_errored_block, scrub_print_warning). You can remove the
ratelimiting and get the flood of the messages for processing.

The dmesg messages are more or less meant to point out a handful of
problems, such as a few damaged blocks; 40k messages would be really a
lot. Ratelimiting can also happen internally, when printk decides to
throw away messages (though I know it tries not to).
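
One way to check whether __ratelimit was dropping scrub messages is to total the suppression notices in a saved dmesg log. The Python sketch below assumes the generic kernel notice of the form "<function>: <N> callbacks suppressed"; the function name count_suppressed is hypothetical. It only sees drops that __ratelimit itself reported, so messages lost inside printk for other reasons would not be counted, and a kernel with the conditional commented out would not emit these notices for that call site.

```python
import re
from collections import Counter

# Assumed notice shape: "<function>: <N> callbacks suppressed".
SUPPRESSED_RE = re.compile(r"(\w+): (\d+) callbacks suppressed")


def count_suppressed(dmesg_text):
    """Sum the suppressed-message counts per reporting function."""
    totals = Counter()
    for func, count in SUPPRESSED_RE.findall(dmesg_text):
        totals[func] += int(count)
    return totals


if __name__ == "__main__":
    import sys

    # Usage: dmesg | python3 this_script.py
    for func, total in count_suppressed(sys.stdin.read()).most_common():
        print(f"{func}: {total} messages suppressed")
```

A large total for scrub_handle_errored_block or scrub_print_warning would be consistent with the explanation above.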
