* Re: checksum error in metadata node - best way to move root fs to new drive?
From: Dave T @ 2016-08-11 20:23 UTC
  To: Duncan
  Cc: Nicholas D Steeves, Chris Murphy, Btrfs BTRFS, Austin S. Hemmelgarn

What I have gathered so far is the following:

1. my RAM is not faulty and I feel comfortable ruling out a memory
error as having anything to do with the reported problem.

2. my storage device does not seem to be faulty. I have not figured
out how to do more definitive testing, but smartctl reports it as
healthy.

3. this problem first happened on a normally running system in light
use. It had not recently crashed. But the root fs went read-only for
an unknown reason.

4. the aftermath of the initial problem may have been exacerbated by
hard resetting the system, but that's only a guess.

> The compression-related problem is this:  Btrfs is considerably less tolerant of checksum-related errors on btrfs-compressed data

I'm an unsophisticated user. The argument in support of this statement
sounds convincing to me. Therefore, I think I should discontinue using
compression. Anyone disagree?

Is there anything else I should change? (Do I need to provide
additional information?)

What can I do to find out more about what caused the initial problem?
I have heard memory errors mentioned, but that's apparently not the
case here. I have heard crash recovery mentioned, but that isn't how
my problem initially happened.

I also have a few general questions:

1. Can one discontinue using the compress mount option if it has been
used previously? What happens to existing data if the compress mount
option is (a) added when it wasn't used before, or (b) dropped when it
had been used?

2. I understand that the compress option generally improves btrfs
performance (according to a Phoronix article I read in the past; I can't
find the link). Since encryption has some characteristics in common with
compression, would one expect any decrease in performance from
dropping compression when using btrfs on dm-crypt? (For more context,
with an i7 6700K which has aes-ni, CPU performance should not be a
bottleneck on my computer.)

3. How do I find out if it is appropriate to use dup metadata on a
Samsung 950 Pro NVMe drive? I don't see deduplication mentioned in the
drive's datasheet:
http://www.samsung.com/semiconductor/minisite/ssd/downloads/document/Samsung_SSD_950_PRO_Data_Sheet_Rev_1_2.pdf

4. Given that my drive is not reporting problems, does it seem
reasonable to re-use this drive after the errors I reported? If so,
how should I do that? Can I simply make a new btrfs filesystem and
copy my data back? Should I start at a lower level and re-do the
dm-crypt layer?

5. Would most of you guys use btrfs + dm-crypt on a production file
server (with spinning disks in a JBOD configuration, i.e., no RAID)?
In this situation, the data is very important, of course. My past
experience indicates that RAID only improves uptime, which is not so
critical in our environment. Our main criterion is that we should never
ever have data loss. As far as I understand it, we do have to use
encryption.

Thanks for the discussion so far. It's very educational for me.

* checksum error in metadata node - best way to move root fs to new drive?
From: Dave T @ 2016-08-10  3:27 UTC
  To: linux-btrfs

btrfs scrub returned with uncorrectable errors. Searching in dmesg
returns the following information:

BTRFS warning (device dm-0): checksum error at logical NNNNN on
/dev/mapper/[crypto] sector: yyyyy metadata node (level 2) in tree 250

it also says:

unable to fixup (regular) error at logical NNNNNN on /dev/mapper/[crypto]
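
For triage, lines of the shape quoted above can be pulled out of a saved
dmesg dump with a short script. The following is a minimal Python sketch,
not an official tool: the regular expression is modeled only on the
sanitized line in this message (real kernel output differs between
versions), and the function name `parse_checksum_errors`, the device path
and the numbers in the sample line are made up for illustration.

```python
import re

# ASSUMPTION: this pattern mirrors the sanitized line quoted in the message.
# Actual kernel log formats vary, so adjust it to match your own dmesg output.
CHECKSUM_RE = re.compile(
    r"BTRFS warning \(device (?P<device>[^)]+)\): "
    r"checksum error at logical (?P<logical>\d+) "
    r"on (?P<path>\S+) sector: (?P<sector>\d+) "
    r"metadata (?:node|leaf) \(level (?P<level>\d+)\) "
    r"in tree (?P<tree>\d+)"
)


def parse_checksum_errors(log_text):
    """Return one dict per metadata checksum error found in log_text."""
    errors = []
    for line in log_text.splitlines():
        match = CHECKSUM_RE.search(line)
        if match is None:
            continue  # not a metadata checksum error line
        errors.append({
            "device": match.group("device"),
            "path": match.group("path"),
            "logical": int(match.group("logical")),
            "sector": int(match.group("sector")),
            "level": int(match.group("level")),
            "tree": int(match.group("tree")),
        })
    return errors


if __name__ == "__main__":
    # Hypothetical sample line; the numbers are placeholders, not real data.
    sample = (
        "BTRFS warning (device dm-0): checksum error at logical 1234 "
        "on /dev/mapper/example sector: 5678 metadata node (level 2) "
        "in tree 250"
    )
    for error in parse_checksum_errors(sample):
        print(error)
```

Grouping the results by `tree` and `logical` shows whether scrub keeps
hitting the same metadata block or several different ones.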


I assume I have a bad block device. Does that seem correct? The
important data is backed up.

However, it would save me a lot of time reinstalling the operating
system and setting up my work environment if I can copy this root
filesystem to another storage device.

Can I do that, considering the errors I have mentioned? With the
uncorrectable error being in a metadata node, what (if anything) does
that imply about restoring from this drive?

If I can copy this entire root filesystem, what is the best way to do
it? The btrfs restore tool? cp? rsync? Some cloning tool? Other
options?

If I use the btrfs restore tool, should I use the -x, -m and -S options? In
particular I wonder exactly what the -S option does. If I leave -S out,
are all symlinks ignored?
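
For reference, the btrfs-restore man page describes -x as restoring
extended attributes, -m as restoring owner, mode and times, and -S as
restoring symbolic links in addition to regular files; it also documents
-D as a dry run that only lists files. The following is a minimal Python
sketch of how such an invocation might be assembled. The helper name
`build_restore_command` and both paths are hypothetical, and the script
only prints the command rather than running it.

```python
import shlex


def build_restore_command(source_device, target_dir, dry_run=True):
    """Assemble a `btrfs restore` argument list with -x, -m and -S.

    With dry_run=True the -D flag is added so that btrfs restore only
    lists what it would recover instead of writing files.
    """
    cmd = ["btrfs", "restore", "-x", "-m", "-S"]
    if dry_run:
        cmd.append("-D")
    cmd += [source_device, target_dir]
    return cmd


if __name__ == "__main__":
    # Hypothetical paths: substitute the unmounted source device and an
    # empty directory on a different, healthy filesystem.
    command = build_restore_command("/dev/mapper/example", "/mnt/rescue")
    print(shlex.join(command))
```

Starting with the dry run makes it possible to review the list of
recoverable files before anything is written to the target directory.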

I'm trying to save time and clone this so that I get the operating
system and all my tweaks / configurations back. As I said, the really
important data is separately backed up.

I appreciate all suggestions.
