How to Elegantly Handle "error -22 while searching super root" with Multi-TiB USB-HDDs

linux-nilfs.vger.kernel.org archive mirror

* How to Elegantly Handle "error -22 while searching super root" with Multi-TiB USB-HDDs
From: Martin Vahi @ 2023-10-15 10:12 UTC (permalink / raw)
  To: linux-nilfs

The problem is that a NilFS2 partition on a multi-TiB USB-HDD, which
has only one huge primary partition (the NilFS2 partition),
fails to mount. The symptoms include:

     ----start--of--citation--of--dmesg--output--last--lines---
     [  382.418297] usb 2-2: new high-speed USB device number 5 using xhci_hcd
     [  382.611471] usb 2-2: New USB device found, idVendor=152d, idProduct=578e, bcdDevice=14.05
     [  382.611480] usb 2-2: New USB device strings: Mfr=1, Product=2, SerialNumber=3
     [  382.611485] usb 2-2: Product: External USB 3.0
     [  382.611489] usb 2-2: Manufacturer: Intenso
     [  382.611493] usb 2-2: SerialNumber: 20171113252B4
     [  382.616077] scsi host7: uas
     [  382.617099] scsi 7:0:0:0: Direct-Access     Intenso  External USB 3.0 1405 PQ: 0 ANSI: 6
     [  382.617889] sd 7:0:0:0: Attached scsi generic sg3 type 0
     [  382.618920] sd 7:0:0:0: [sdc] 1220942646 4096-byte logical blocks: (5.00 TB/4.55 TiB)
     [  382.619085] sd 7:0:0:0: [sdc] Write Protect is off
     [  382.619086] sd 7:0:0:0: [sdc] Mode Sense: 5f 00 00 08
     [  382.619391] sd 7:0:0:0: [sdc] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
     [  382.619619] sd 7:0:0:0: [sdc] Optimal transfer size 33550336 bytes
     [  382.673809]  sdc: sdc1
     [  382.675211] sd 7:0:0:0: [sdc] Attached SCSI disk
     [  486.063294] NILFS (sdc1): mounting unchecked fs
     [  486.099403] NILFS (sdc1): invalid segment: Magic number mismatch
     [  486.099413] NILFS (sdc1): trying rollback from an earlier position
     [  486.100080] NILFS (sdc1): invalid segment: Magic number mismatch
     [  486.100081] NILFS (sdc1): error -22 while searching super root
     [ 1034.270149] NILFS (sdc1): mounting unchecked fs
     [ 1034.313297] NILFS (sdc1): invalid segment: Magic number mismatch
     [ 1034.313308] NILFS (sdc1): trying rollback from an earlier position
     [ 1034.314722] NILFS (sdc1): invalid segment: Magic number mismatch
     [ 1034.314726] NILFS (sdc1): error -22 while searching super root
     ----end----of--citation--of--dmesg--output--last--lines---

From a 2012_07_23 mailing list post at

     https://www.mail-archive.com/linux-nilfs@vger.kernel.org/msg01243.html

it seems that a possible solution is to use
"ddrescue" to create an image of the whole device and then
mount that HDD-image. As of 2023_10_15 I do not know whether that
"ddrescue" based solution works for me, because I do not have
a big enough empty HDD at hand to try it. However, the referenced
mailing list post is over 10 years old, and it can be expected
that a USB-HDD that has its own power supply can lose
power at any moment or that its USB-cable can be detached
at any moment, so I suspect that there just has to be
some more elegant solution to this naturally occurring
problem than creating an HDD-image of a multi-TiB sized HDD.

My problem is that I fail to find that solution, despite
searching the mailing list archive and reading the various
NilFS related documentation. Could anybody please provide some
links to related documentation or messages in the mailing list archive,
or some other hints or ideas on how I can mount
my USB-HDD without waiting a whole week or two for it to
get copied to some bigger HDD and then back again? Thank You.

Yours sincerely,
Martin.Vahi@softf1.com


* Re: How to Elegantly Handle "error -22 while searching super root" with Multi-TiB USB-HDDs
From: Ryusuke Konishi @ 2023-10-15 15:31 UTC (permalink / raw)
  To: Martin Vahi; +Cc: linux-nilfs

The log messages indicate that the situation is quite severe.

>      [  486.099403] NILFS (sdc1): invalid segment: Magic number mismatch

This message indicates that the magic number in the header of the log
pointed to by a superblock is abnormal, that is, the log data is
corrupted on disk.

>      [  486.099413] NILFS (sdc1): trying rollback from an earlier position
>      [  486.100080] NILFS (sdc1): invalid segment: Magic number mismatch
>      [  486.100081] NILFS (sdc1): error -22 while searching super root

And these messages indicate that, although recovery was attempted using
the spare superblock, the older log pointed to by that superblock was
also corrupted.

Therefore, the data written immediately before the problem occurred
probably never reached the disk and cannot be salvaged.

NILFS2 is designed to write log data to the media and then update the
superblock pointer, and to be safe, the superblock is duplicated so
that you can go back and mount logs from a while ago. It is rare
for both of these safeguards to fail, and this usually doesn't
happen even with a sudden power cut.

Normally, even with a disk cache, this will not happen as long as the
minimum write ordering and flushing semantics are properly
guaranteed. There is probably a bug in the device driver or the
disk firmware, or the disk may not be reliable to begin with.
Or perhaps the data that had been written to the
media was unfortunately corrupted afterwards.

If there is no backup, the only way to rescue old data is to manually
rewrite the last segment number (and the associated
checkpoint and sequence numbers) in the superblocks and then mount
it with the read-only and norecovery mount options. Unfortunately, there
is no easy way to recover from this level of destruction.

Since it cannot be mounted, the lssu command cannot be used to
determine the state of segments.  Instead, try checking the status
with the nilfs-tune and dumpseg commands:

$ sudo nilfs-tune -l /dev/sdc1

will list information written in a superblock.  And,

$ i=0; while [ "$i" -lt 511 ]; do sudo dumpseg /dev/sdc1 "$i" | head -4; i=$((i + 1)); done

writes out information (such as sequence number and creation date and
time) for segments with valid logs.
Here, 511 is the number of segments, so use the value in the "Number
of segments:" field in the nilfs-tune output for your device.
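
As a rough sketch of how the two steps above could be combined, the
segment count can be read from the "Number of segments:" field instead
of being typed in by hand. The function names nilfs_nsegments and
scan_segments are made up for this illustration, and the exact layout
of the nilfs-tune output is an assumption, so please check it against
what your version prints.

```shell
#!/bin/sh
# Print the value of the "Number of segments:" field found in
# `nilfs-tune -l` output supplied on standard input.
# (Assumes the field is printed as "Number of segments:   <number>".)
nilfs_nsegments() {
    awk -F: '/Number of segments:/ { gsub(/[ \t]/, "", $2); print $2; exit }'
}

# Print the first four dumpseg lines of every segment of device $1.
scan_segments() {
    dev=$1
    nsegs=$(sudo nilfs-tune -l "$dev" | nilfs_nsegments)
    # Stop if the field could not be found in the output.
    [ -n "$nsegs" ] || return 1
    i=0
    while [ "$i" -lt "$nsegs" ]; do
        sudo dumpseg "$dev" "$i" | head -4
        i=$((i + 1))
    done
}

# Usage (hypothetical): scan_segments /dev/sdc1 > segments.txt
```

On a multi-TiB device the number of segments is large, so it may be
more practical to redirect the output to a file and search it later.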

This will give you an overview of the logs (segments) written to disk.

Try dumping the information of the segment that appears to be the
newest using dumpseg without the head command, and look for the
segment containing "ino = 3" (the inode number of the DAT metadata
file, which is written at the end of a checkpointed log).
For instance, if segment 100 appears to be the newest, try the following:

$ sudo dumpseg /dev/sdc1 100
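
If several segments are candidates, the search for "ino = 3" could be
scripted roughly as follows. This is only a sketch: the function names
are hypothetical, and it assumes that dumpseg prints the inode number
literally as "ino = 3" followed by a non-digit, as in the description
above.

```shell
#!/bin/sh
# Succeed if the dumpseg output on standard input mentions inode 3
# (the DAT file). "ino = 30" and the like do not count.
mentions_dat() {
    grep -Eq 'ino = 3([^0-9]|$)'
}

# For each segment number given after the device name, report whether
# its dumpseg output contains "ino = 3".
check_segments() {
    dev=$1
    shift
    for seg in "$@"; do
        if sudo dumpseg "$dev" "$seg" | mentions_dat; then
            echo "segment $seg: contains ino = 3"
        else
            echo "segment $seg: no ino = 3 found"
        fi
    done
}

# Usage (hypothetical): check_segments /dev/sdc1 98 99 100
```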

Since manually rewriting the segment pointer in a superblock is a
dangerous operation, I will omit it here.
I think it is important to first understand the current situation.

Regards,
Ryusuke Konishi


* Continuation of the topic "error Foo(now -5, once -22) while searching super root"
From: Martin Vahi @ 2023-12-29 15:51 UTC (permalink / raw)
  To: linux-nilfs

This letter is a continuation of the thread that I started on 2023_10_15:

https://marc.info/?l=linux-nilfs&m=169738371518323&w=2
archival copy: https://archive.is/Fbw5e

The purpose of my current letter is to write down
information that might contain hints about the flaw.

This time the computer was the same,

     (Two 2-threaded cores, 12GiB RAM minus the video memory. A line from /proc/cpuinfo:)
     Intel(R) Core(TM) i5-4200M CPU @ 2.50GHz

The Linux distribution on the computer was different:
a live-DVD with KNOPPIX version 9.1,
freshly written on an MDisc DVD. (As of 2023_12 these can still be bought
from the original manufacturer, ritek-europe dot com, which redirects to conrexx dot com,
in small quantities like a pack of 100 discs for about 200€, including all shipping and handling.
They order them from Taiwan to the Netherlands and then use FedEx to send them from
the Netherlands to the rest of the EU. No, I do NOT sell them myself, NOR do I earn from that business in any way.
I just had a hard time getting MDisc DVDs and that's where I got those,
but the delivery time was about 2 months.) The USB-HDD was a
~465GiB (nominally 500GB) magnetic disc, which was mounted
with the mount options "noatime,nodiratime". The error occurred
during a long "git commit -a". The repository resided
on the USB-HDD. There was plenty of free CPU-time, because
only the window manager with a "few" "standard" KNOPPIX
programs was running, and there was no shortage of RAM, because
multiple GiB were free. The "git commit -a" was given over
an SSH session, id est the USB cable stayed put: there was no movement
of the USB cable due to the use of the laptop keyboard.
The laptop had been booted from the MDisc DVD about 2 days before
the error occurred, and the rest of the programs on the laptop
seemed to work fine after the error without rebooting the laptop, id est
the kernel did not totally crash.

The laptop with the ~465GiB USB-HDD was not on the same table as the
keyboard that was in use, id est keyboard vibrations did not
reach the USB-HDD or the laptop in any significant amount.

     ----start--of--citation--of--dmesg--output--last--lines---
     [  150.848200] usb 3-4: new high-speed USB device number 5 using xhci_hcd
     [  150.989381] usb 3-4: New USB device found, idVendor=152d, idProduct=2329, bcdDevice= 1.00
     [  150.989390] usb 3-4: New USB device strings: Mfr=1, Product=2, SerialNumber=5
     [  150.989394] usb 3-4: Product: USB to ATA/ATAPI bridge
     [  150.989397] usb 3-4: Manufacturer: JMicron
     [  150.989401] usb 3-4: SerialNumber: 801130168383
     [  150.990928] usb-storage 3-4:1.0: USB Mass Storage device detected
     [  150.991138] usb-storage 3-4:1.0: Quirks match for vid 152d pid 2329: 8020
     [  150.991193] scsi host7: usb-storage 3-4:1.0
     [  154.102691] scsi 7:0:0:0: Direct-Access     WDC WD50 00LPLX-60ZNTT1   02.0 PQ: 0 ANSI: 2 CCS
     [  154.103140] sd 7:0:0:0: Attached scsi generic sg2 type 0
     [  154.103556] sd 7:0:0:0: [sdc] 976773168 512-byte logical blocks: (500 GB/466 GiB)
     [  154.103931] sd 7:0:0:0: [sdc] Write Protect is off
     [  154.103940] sd 7:0:0:0: [sdc] Mode Sense: 28 00 00 00
     [  154.104321] sd 7:0:0:0: [sdc] No Caching mode page found
     [  154.104328] sd 7:0:0:0: [sdc] Assuming drive cache: write through
     [  154.187547]  sdc: sdc1 sdc2 sdc3 sdc4 sdc5 sdc6 sdc7 sdc8
     [  154.188768] sd 7:0:0:0: [sdc] Attached SCSI disk
     [  222.853951] NILFS version 2 loaded
     [  222.855302] NILFS (sdc7): mounting unchecked fs
     [  226.181683] sd 7:0:0:0: [sdc] tag#0 FAILED Result: hostbyte=DID_OK driverbyte=DRIVER_SENSE cmd_age=2s
     [  226.181687] sd 7:0:0:0: [sdc] tag#0 Sense Key : Medium Error [current]
     [  226.181689] sd 7:0:0:0: [sdc] tag#0 Add. Sense: Unrecovered read error
     [  226.181692] sd 7:0:0:0: [sdc] tag#0 CDB: Read(10) 28 00 28 13 20 28 00 00 08 00
     [  226.181694] blk_update_request: critical medium error, dev sdc, sector 672342056 op 0x0:(READ) flags 0x0 phys_seg 1 prio class 0
     [  226.181726] NILFS (sdc7): I/O error reading segment
     [  226.181729] NILFS (sdc7): error -5 while searching super root
     root@Microknoppix:/home/knoppix/haakimiskataloogid# mount -t nilfs2 /dev/sdc7 ./h1
     mount.nilfs2: Error while mounting /dev/sdc7 on /home/knoppix/haakimiskataloogid/h1: Input/output error
     root@Microknoppix:/home/knoppix/haakimiskataloogid#
     ----end----of--citation--of--dmesg--output--last--lines---
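
A note on reading the log above: the numbers in "error -22" and
"error -5" are negated errno values, so -22 is -EINVAL (invalid
argument) and -5 is -EIO (I/O error). The Read(10) command bytes can
also be decoded by hand as a cross-check of the failing sector. The
following is only a sketch. The function names are made up for this
illustration, and the dd line assumes GNU dd and the 512-byte logical
block size reported in the log.

```shell
#!/bin/sh
# Print the logical block address encoded in a Read(10) CDB.
# Arguments: the ten CDB bytes in hex, for example
#   28 00 28 13 20 28 00 00 08 00
# Bytes 3..6 (counting from 1) hold the big-endian 32-bit LBA.
read10_lba() {
    echo $(( 0x$3$4$5$6 ))
}

# Print the transfer length in logical blocks (bytes 8..9, big-endian).
read10_len() {
    echo $(( 0x$8$9 ))
}

# Try to re-read $3 blocks of 512 bytes starting at block $2 of the
# whole-disk device $1, bypassing the page cache.
reread_blocks() {
    sudo dd if="$1" of=/dev/null bs=512 skip="$2" count="$3" iflag=direct
}

# Usage (hypothetical): reread_blocks /dev/sdc 672342056 8
```

For the log above, "read10_lba 28 00 28 13 20 28 00 00 08 00" gives
672342056, which matches the sector in the blk_update_request line,
and read10_len gives 8 blocks (4096 bytes). Since that line names
"dev sdc", the sector number counts from the start of the whole disk,
not from the start of sdc7. The "Medium Error / Unrecovered read error"
sense data is reported by the drive itself, so it may be worth
re-reading those blocks and checking the health of the drive before
drawing conclusions about the file system.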

I find it scary that a file system can become so unusable
during ordinary use while the hardware seems to be just fine,
and that there is no standard tool to recover even a fraction
of the files on the NilFS2 partition. As of 2023_12_29
I have most of my files on NilFS2 partitions in the hope
that it helps to preserve them, but it turns out that
while ext4 fails (loses files) at power failures, NilFS2 fails
in plain usage scenarios, where there is no power
failure or any other relevant event. As things stand now (2023_12_29),
my strategy is to mirror my files on different file systems:
NilFS2 to not lose files during power failures or resets and
ext4 to not lose files during plain, calm, low-intensity HDD usage.

Another line of thought is that RAIDs are useless if the
kernel of the computer that the RAID is connected to
gets corrupted in RAM. Therefore the HDDs with different file system types
should be connected to different computers, preferably running
different operating systems. As my main operating system is some
Linux distribution (varies over time and between machines),
FreeBSD, OpenBSD and Solaris derivatives (illumos and the like)
come to mind as hopefully sufficiently different options.

I mentioned MDisc DVDs because if an operating system
boots from a DVD, then there CAN NOT BE ANY BOOT BINARY
CORRUPTION CAUSED BY KERNEL FILE SYSTEM CORRUPTION,
and unlike plain DVDs, MDisc DVDs last longer than 10 years.
MDisc DVDs can be reliably written only with a special DVD writer that has
a slightly more powerful laser than other, "ordinary", DVD-writers,
but the prices of such USB-DVD-writers are roughly the same as
those of other USB-DVD-writers. Only the specs differ, and one
must make a slightly greater effort to find the USB-DVD-writers
that have "MDisc support". Supposedly other DVD-writers
also write MDisc DVDs, but with a high error rate. MDisc DVDs
are designed to be readable with plain DVD-writers/readers
and even old DVD-readers that lack DVD writing capability.
I mention that aspect because with Raspberry_Pi-like computers
the Flash memory card wear related errors also appear
in the Linux kernel binary and other installed binaries, and
that makes the Raspberry_Pi-like computers unstable over time.
As of 2023 the newest official Raspberry_Pi Linux, the "RaspberryOS",
has the option to use a "readonly-write-only-to-RAM filesystem", a lot like
the one that live Linux DVDs use, to reduce the wear of the memory card.
But the various scientific papers (You can search for them Yourself,
there are plenty on the net, easy to find, semanticscholar dot org),
depending on how one interprets the graphs and
temperature conditions of memory cards and memory sticks,
basically put the "sufficiently reliable" data retention time at
about one year, 2 years tops. Again, depending on interpretation.
My interpretation is that if I touch a memory card or
a USB memory stick, then I feel that it's hot, and therefore
the Flash memory die in the device must be even hotter.
My personal intuitive observation matches the
roughly 1 year retention time, after which the data should
be rewritten, including file system formatting information.

That is to say, for reliability, the Flash memory card should be taken out of
a Raspberry_Pi-like computer roughly once per year and
the Linux program dd should/might be used to rewrite the
original image to the memory card, even if the memory card
is used in "readonly-write-only-to-RAM mode".
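
A minimal sketch of such a yearly rewrite, assuming GNU dd and cmp,
could look like the following. The function name rewrite_and_verify
and the device name /dev/mmcblk0 are only examples, and writing to the
wrong device destroys its contents, so the target should be
double-checked first.

```shell
#!/bin/sh
# Write image file $1 to target $2 (a block device such as the
# hypothetical /dev/mmcblk0, or a regular file), then read back
# exactly as many bytes as the image holds and compare them.
rewrite_and_verify() {
    img=$1
    target=$2
    # Size of the image in bytes; the card is usually larger.
    size=$(( $(wc -c < "$img") )) || return 1
    # conv=fsync flushes the data to the target before dd exits;
    # conv=notrunc keeps a regular-file target from being truncated.
    dd if="$img" of="$target" bs=4M conv=fsync,notrunc status=none || return 1
    # Compare only the first $size bytes of the target with the image.
    cmp -n "$size" "$img" "$target"
}

# Usage (hypothetical, as root): rewrite_and_verify original.img /dev/mmcblk0
```

The comparison right after writing may be served from the page cache
rather than from the card itself, so re-inserting the card before
comparing is probably the more trustworthy check.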

The F-RAM used for storing program code in
car-industry microcontrollers (MCUs) does not
seem to be any better than Flash, despite the initial hype,
except that maybe MCUs are "stored"/used in cooler conditions.
I do not know. Car engines do get hot. That is to say,
some old-fashioned ROM can be a pretty nice thing to have,
and for laptops and desktops an MDisc DVD or BluRay
can be that "ROM", except that according to
some sources on the wild-wild-web the BluRays,
including the proper MDisc BluRays that have
the inorganic recording layer, not the
Verbatim (yes, that famous brand) fakes that only
use MDisc as a trademark, supposedly have
a higher error rate than MDisc DVDs. But even
plain DVDs can be a lot of help, for at least a 5 year period,
possibly for a 10 year period. Again, if a DVD
is like ROM, then no driver binary, including the NilFS2 driver
binary, gets corrupted by storage bitrot or
file system metadata corruption.

And the bad news is that the so hyped up "cloud storage",
where the storage providers advertise that they store
their clients' data at some really fancy and fast
solid state storage devices (essentially Flash memory)
has exactly the same bitrotting issue, which is why
at least one person that I know of (not me, yet)
keeps an off-cloud list of file hashes (MD5, SHA256, ...)
of the files that his clients' web application
on a server consists of. I mean, if banking information
or other critical information is stored at modern Flash-memory
based fast storage devices at the greatest and fanciest
servers and that information were to corrode due to
some file system driver issue that no RAID can compensate...
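
Such an off-cloud hash list could be kept, for example, with the GNU
coreutils sha256sum tool. The sketch below is only an illustration of
the idea, not the setup of the person mentioned above. The function
names are made up, and the manifest path has to be absolute because
the functions change directory.

```shell
#!/bin/sh
# Write a SHA-256 manifest of every regular file under directory $1
# to the file $2 (which should live outside that directory).
make_manifest() {
    ( cd "$1" && find . -type f -print0 | sort -z | xargs -0 -r sha256sum ) > "$2"
}

# Re-hash the files under $1 and compare them against manifest $2.
# Succeeds only if every listed file still has its recorded hash;
# files that failed the check are printed.
check_manifest() {
    ( cd "$1" && sha256sum --quiet -c "$2" )
}

# Usage (hypothetical):
#   make_manifest  /var/www/app /mnt/offline/app.sha256
#   check_manifest /var/www/app /mnt/offline/app.sha256
```

Note that this detects changed or missing files only; files added
after the manifest was written are not reported.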

Summary of my compromise-semi-workaround:

     x) Boot from a live-DVD and use a Bash script to
        customize the running instance, id est copy
        the /etc/passwd and /etc/shadow files and
        /etc/ssh folder. Plain DVDs will do, but
        MDisc DVDs are better and with some long-term planning
        it is still (as of 2023_12) possible to get them, not as "new-old-stock", but
        as brand new products that are still being produced in relatively small volumes.
        Once their patents expire, there might be even multiple MDisc DVD producers,
        if there is enough demand for MDisc DVDs. If people still
        buy plain DVDs, then there might be also some market for MDisc DVDs.

        (I like to think of DVDs like I think of paper: relatively low data capacity,
        we don't produce them at home, id est it takes a factory to
        produce them, yet we use the old-fashioned paper still for
        data storage in many situations, like labels on apple-jam jars,
        packaging of many products contain text and image information, etc.
        In that sense DVD format as such might last for a long time,
        especially if it overcomes storage reliability issues like
        the MDisc DVDs have overcome, and if there are
        multiple producers like there are multiple paper producing
        factories.)

     x) Mirror files on different HDDs/SSDs that have
        different file system types,
        one HDD/SSD per computer to counter a situation, where
        a kernel/file_system_driver running instance corrupts
        due to some typical C/C++ related memory corruption.

     x) With Raspberry_Pi-like computers, overwrite the
        memory card once per year (with "dd", id est
        including filesystem formatting information)
        and try to avoid wearing the memory card by switching off
        the swap ("swapoff --all").

     x) With various Linux file systems use the
        "noatime,nodiratime" mount options
        ("mount -o noatime,nodiratime /dev/foodevice /bar/folder")
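
The Bash script mentioned in the first item of the summary could be
sketched roughly as follows. The function name restore_identity and
the layout of the backup directory (passwd, shadow and an ssh
subdirectory) are assumptions made for this illustration. Restarting
the SSH daemon afterwards is left out.

```shell
#!/bin/sh
# Copy saved account files and SSH configuration from the backup
# directory $1 into the system root $2 (defaults to "/").
# Expected (hypothetical) backup layout: $1/passwd, $1/shadow, $1/ssh/
restore_identity() {
    src=$1
    root=${2:-/}
    # -p preserves mode, ownership and timestamps where permitted.
    cp -p "$src/passwd" "$src/shadow" "$root/etc/" || return 1
    mkdir -p "$root/etc/ssh" || return 1
    # Copy the contents of the saved ssh directory, including host keys.
    cp -pR "$src/ssh/." "$root/etc/ssh/"
}

# Usage (hypothetical, as root on the freshly booted live system):
#   restore_identity /media/usbstick/identity
```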

Thank You for reading my letter.
I hope that it helps to somehow get by
till the core of the NilFS2
corruption issue gets solved.

Yours sincerely,
Martin.Vahi@softf1.com
