* ERROR: failed to repair root items: Input/output error
@ 2017-12-10 15:18 constantine
  2017-12-10 15:38 ` Tomasz Pala
  0 siblings, 1 reply; 3+ messages in thread
From: constantine @ 2017-12-10 15:18 UTC
  To: Btrfs BTRFS

I have a laptop root hard drive (Samsung SSD 850 EVO 1TB), which is
within warranty.
I can't mount it read-write ("no rw mounting  after error").
The data are not really critical (I will overcome the shock of losing
them within a couple of days).


Btrfs check --repair throws an error:

sudo btrfs check --repair /dev/sdb1
enabling repair mode
Checking filesystem on /dev/sdb1
UUID: a955bc5f-e5f0-42ce-bd5a-de5eb8d5d3aa
checking extents
ERROR: add_tree_backref failed (non-leaf block): File exists
parent transid verify failed on 103009185792 wanted 5026 found 345954
parent transid verify failed on 103009185792 wanted 5026 found 345954
Ignoring transid failure
leaf parent key incorrect 103009185792
bad block 103009185792
ERROR: errors found in extent allocation tree or chunk allocation
checksum verify failed on 103009173504 found 25334496 wanted 00003500
checksum verify failed on 103009173504 found 25334496 wanted 00003500
bytenr mismatch, want=103009173504, have=889192478
ERROR: failed to repair root items: Input/output error

What do these errors mean?
What should I do to fix the filesystem and be able to mount it read-write?


Thank you,

Konstantinos Tsardounis


PS:
I am now booted from an Ubuntu LiveCD, which reports:

uname -a
Linux ubuntu 4.13.0-16-generic #19-Ubuntu SMP Wed Oct 11 18:35:14 UTC
2017 x86_64 x86_64 x86_64 GNU/Linux
btrfs --version
btrfs-progs v4.12
btrfs fi show
Label: 'arch'  uuid: a955bc5f-e5f0-42ce-bd5a-de5eb8d5d3aa
    Total devices 1 FS bytes used 876.26GiB
    devid    1 size 931.01GiB used 931.01GiB path /dev/sdb1
btrfs fi df sdb1
Data, single: total=926.25GiB, used=872.65GiB
System, single: total=4.00MiB, used=128.00KiB
Metadata, single: total=4.76GiB, used=3.60GiB
GlobalReserve, single: total=512.00MiB, used=0.00B

and the dmesg.log is at:
https://gist.github.com/anonymous/16344244259bb2989701f3ec43e26f39


* Re: ERROR: failed to repair root items: Input/output error
  2017-12-10 15:18 ERROR: failed to repair root items: Input/output error constantine
@ 2017-12-10 15:38 ` Tomasz Pala
  2017-12-11  2:06   ` Duncan
  0 siblings, 1 reply; 3+ messages in thread
From: Tomasz Pala @ 2017-12-10 15:38 UTC
  To: constantine; +Cc: Btrfs BTRFS

On Sun, Dec 10, 2017 at 15:18:32 +0000, constantine wrote:

> I have a laptop root hard drive (Samsung SSD 850 EVO 1TB), which is
> within warranty.
> I can't mount it read-write ("no rw mounting  after error").

There is a data-corruption issue with this controller!
The same as 840 EVO - just google this.

In short: either use a recent kernel (AFAIR 4.0.5+ for the 840 EVO, and a
somewhat newer one that blacklists the entire Samsung 8* SSD family) or
disable NCQ.

Using queued TRIM on this drive leads to data loss! The firmware zeroes the
first 512 bytes of a block, sorry.
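
Whether NCQ is currently active on a drive can be read from sysfs: a queue
depth of 1 means commands are issued one at a time. A minimal sketch,
assuming the SSD is sdb as in the report above; the ncq_state helper is
made up for illustration:

```shell
# Hypothetical helper: interpret a libata queue depth value.
ncq_state() {
    # A depth of 1 means no command queuing, i.e. NCQ is effectively off.
    if [ "$1" -le 1 ]; then
        echo "NCQ disabled"
    else
        echo "NCQ enabled (depth $1)"
    fi
}

dev=sdb   # assumption: the affected SSD, as in the report above
depth_file=/sys/block/$dev/device/queue_depth

# Only query the device if it actually exists on this machine.
if [ -r "$depth_file" ]; then
    ncq_state "$(cat "$depth_file")"
fi

# To disable NCQ for all ATA devices from boot, add this to the kernel
# command line:
#   libata.force=noncq
```

Disabling NCQ costs some performance, so a kernel that already blacklists
queued TRIM for the drive is the better choice where available.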

If only you had a smaller drive, as 850s up to 512 GB have a different
controller...

> checksum verify failed on 103009173504 found 25334496 wanted 00003500
> bytenr mismatch, want=103009173504, have=889192478
> ERROR: failed to repair root items: Input/output error
> 
> What do these errors mean?
> What should I do to fix the filesystem and be able to mount it read-write?

You probably can't fix this - there is data missing on bare metal, so you should
recover from backups. If you don't have any, you will need to perform manual
data recovery (with a tool like photorec), with little chance of restoring
complete files due to the nature of the data loss (the beginning of blocks).

-- 
Tomasz Pala <gotar@pld-linux.org>


* Re: ERROR: failed to repair root items: Input/output error
  2017-12-10 15:38 ` Tomasz Pala
@ 2017-12-11  2:06   ` Duncan
  0 siblings, 0 replies; 3+ messages in thread
From: Duncan @ 2017-12-11  2:06 UTC
  To: linux-btrfs

Tomasz Pala posted on Sun, 10 Dec 2017 16:38:05 +0100 as excerpted:

> On Sun, Dec 10, 2017 at 15:18:32 +0000, constantine wrote:
> 
>> I have a laptop root hard drive (Samsung SSD 850 EVO 1TB), which is
>> within warranty.
>> I can't mount it read-write ("no rw mounting  after error").
> 
> There is a data-corruption issue with this controller!
> The same as 840 EVO - just google this.

That's a bit alarmist...

It's not a problem with recent kernels (I too have a Samsung 850
EVO 1 TB), as you mention below.  Given the focus of this list, btrfs,
and the fact that btrfs is still stabilizing, not yet fully stable and
mature, reasonably new kernels are strongly recommended (with the
second newest mainline-LTS kernel series, now 4.9 since 4.14 is LTS,
being the oldest recommended)...

Anyone already following btrfs-list kernel recommendations is already
/well/ beyond the kernel versions you mention below as adding the
blacklisting, so it shouldn't be a problem.

And he mentions Ubuntu with kernel 4.13, so he's at least well beyond
the problem kernels now, tho of course it's possible he was running
an older one previously, and that's what did the damage.

> In short: either use recent kernel (AFAIR 4.0.5+ for 840 EVO and some
> newer for entire 8* Samsung SSD family blacklisting) or disable NCQ.
> 
> Using queued TRIM on this drive leads to data loss! Firmware zeroes
> first 512 bytes of a block, sorry.
> 
> If you only had smaller drive, as 850s up to 512 GB have different
> controller...
> 
>> checksum verify failed on 103009173504 found 25334496 wanted 00003500
>> bytenr mismatch, want=103009173504, have=889192478
>> ERROR: failed to repair root items: Input/output error
>> 
>> What do these errors mean?
>> What should I do to fix the filesystem and be able to mount it
>> read-write?
> 
> You probably can't fix this - there is data missing on bare metal, so
> you should recover using backups. If you don't have one, you need to
> perform manual data recovery procedures (like photorec) with little
> chances to restore complete files due to the nature of data loss
> (beginning of blocks).

I'll agree here.  First sysadmin rule of backups.  The value of your
data is defined not by empty claims, but by the number of backups you
consider it worth having of that data.  No backups simply defines the
data as of only trivial value, worth less than the time, trouble and
resources necessary to make that backup.

So in the event of loss of the working copy, you can always rest easy.
Either there was a backup and recovery can proceed from it, or there
wasn't, in which case you can still be happy, since after all you still
saved what was self-evidently more important than that data, the time,
trouble and resources you'd have put into backing it up if it /were/
worth it.


Meanwhile, apparently the filesystem can still be mounted read-only,
just not read-write, so the first thing I'd do would be to mount it
read-only and see how much of the data I can successfully copy off
to some other filesystem.  With a bit of luck, only a few files are
damaged and will need to be restored from backup (assuming they were
valuable enough to have a backup, of course), while the rest are
fine.
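
A minimal sketch of that first step. The device name comes from the
report above; the mount point and destination directory are placeholders,
and the run wrapper is a hypothetical helper that only prints the
commands unless DRY_RUN=0 is set:

```shell
DEV=/dev/sdb1        # device from the report above
MNT=/mnt/damaged     # placeholder mount point
DEST=/mnt/rescue     # placeholder: a directory on some other, healthy filesystem

# Hypothetical wrapper: print the command by default, execute it only
# when DRY_RUN=0 is set in the environment.
run() {
    if [ "${DRY_RUN:-1}" = 1 ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

run mkdir -p "$MNT" "$DEST"
# Mount read-only so nothing more gets written to the damaged filesystem.
run mount -o ro "$DEV" "$MNT"
# cp -a preserves ownership, modes and timestamps; files that cannot be
# read are reported on stderr.
run cp -a "$MNT/." "$DEST/"
```

Files that fail with input/output errors during the copy are the ones to
restore from backup or write off.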

If that's /not/ the case, and the filesystem won't mount read-only
either, or there's too much damaged, then it's time to try
btrfs restore, to try to restore some of the files to some other
location.  Tho note that btrfs restore is an effort at recovery,
and as such, it doesn't verify checksums as normal btrfs file
operations will, so some of the otherwise successfully restored
files may still be bitrotted.  (Tho most filesystems don't checksum
in any case, so the danger of bitrot is no worse than using a normal
filesystem that doesn't do checksumming anyway.)
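
A minimal sketch of a btrfs restore run, again with the device from the
report above and a placeholder destination on another filesystem. The
run wrapper is a hypothetical helper that only prints the commands
unless DRY_RUN=0 is set:

```shell
DEV=/dev/sdb1      # device from the report above (must not be mounted)
DEST=/mnt/rescue   # placeholder: a directory on some other, healthy filesystem

# Hypothetical wrapper: print the command by default, execute it only
# when DRY_RUN=0 is set in the environment.
run() {
    if [ "${DRY_RUN:-1}" = 1 ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# First pass: -D is btrfs restore's own dry run, which lists what would
# be restored without writing anything.
run btrfs restore -D -v "$DEV" "$DEST"

# Second pass: actually copy files out; -i ignores errors and continues
# with the next file instead of stopping.
run btrfs restore -v -i "$DEV" "$DEST"
```

Since restore does not verify checksums, anything recovered this way is
worth spot-checking before it is trusted.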

-- 
Duncan - List replies preferred.   No HTML msgs.
"Every nonfree program has a lord, a master --
and if you use the program, he is your master."  Richard Stallman

