All of lore.kernel.org
 help / color / mirror / Atom feed
* [Bug 187051] New: "orphan list check failed" error in ext4
@ 2016-11-05 20:27 bugzilla-daemon
  2016-11-06 23:48 ` [Bug 187051] " bugzilla-daemon
                   ` (22 more replies)
  0 siblings, 23 replies; 24+ messages in thread
From: bugzilla-daemon @ 2016-11-05 20:27 UTC (permalink / raw)
  To: linux-ext4

https://bugzilla.kernel.org/show_bug.cgi?id=187051

            Bug ID: 187051
           Summary: "orphan list check failed" error in ext4
           Product: File System
           Version: 2.5
    Kernel Version: 4.8.6
          Hardware: All
                OS: Linux
              Tree: Mainline
            Status: NEW
          Severity: normal
          Priority: P1
         Component: ext4
          Assignee: fs_ext4@kernel-bugs.osdl.org
          Reporter: me@hussam.eu.org
        Regression: No

I saw a ext4 "orphan list check failed" in my logs. It is happening on a new
1TB hard disk and mostly when idle.
I did a smart check and a a fsck check for bad sectors from the arch linux
rescue usb. neither saw anything.
I formatted the disk again and restored a backup. Then I thought I would just
get a replacement instead of wait for this to happen again.
Same issue happened 50 minutes ago.
"Filesystem features:      has_journal ext_attr resize_inode dir_index filetype
needs_recovery extent 64bit flex_bg sparse_super large_file huge_file dir_nlink
extra_isize metadata_csum".
and formatted under kernel 4.8 with latest e2fsprogs.
Checking the inode that caused the hiccup reveals it points to a folder in
/usr/share/ that was not touched since the 25th of last month and I have fscked
and rebooted plenty of times since then. I fsck on each reboot anyway.
I did a fsck -D -fv /dev/mapper/root from rescue usb after last reboot too.
Could the metadata_csum option be causing this? the kernel.org ext4 wiki
suggested it is a integrity measure.
My backups don't include system journals to save space on backup media so I
don't have the full kernel crash log.
I am sure there was no I/O error message in the crash log when I typed dmesg
and I was told on arch linux logs that this rules out any hardware errors.

I believe this is the second time this has happened.

-- 
You are receiving this mail because:
You are watching the assignee of the bug.

^ permalink raw reply	[flat|nested] 24+ messages in thread

* [Bug 187051] "orphan list check failed" error in ext4
  2016-11-05 20:27 [Bug 187051] New: "orphan list check failed" error in ext4 bugzilla-daemon
@ 2016-11-06 23:48 ` bugzilla-daemon
  2016-11-07 10:40 ` bugzilla-daemon
                   ` (21 subsequent siblings)
  22 siblings, 0 replies; 24+ messages in thread
From: bugzilla-daemon @ 2016-11-06 23:48 UTC (permalink / raw)
  To: linux-ext4

https://bugzilla.kernel.org/show_bug.cgi?id=187051

Theodore Tso <tytso@mit.edu> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |tytso@mit.edu

--- Comment #1 from Theodore Tso <tytso@mit.edu> ---
We really badly need the logs to be able to understand more of what's going on.

This error doesn't represent an on-disk corruption, but rather an inconsistency
in an in-memory data structure.  As such, reformatting and reloading from
backups wasn't necessary, and it's not surprising e2fsck didn't find anything.

What the error means is that at the time when the kernel tried to release an
inode from memory, it was apparently on the orphan linked list.   This is a
"should never happen" situation, and indicates either a kernel bug in ext4, a
hardware induced memory bit-flip, or a kernel bug somewhere else that involved
a wild pointer dereference that corrupted the data structure in question.  This
is why we really need the logs to see what might have happened.   The dump of
the data structure is critical here.

It doesn't make sense that the inode "points to a folder in /usr/share".  Do
you mean that the inode literally corresponds to a directory?    Directories
never are on the orphan list; only regular files, and only when they are being
deleted or truncated, or a few other specialized circumstances.   And once the
deletion or truncation is completed, they are removed from the orphan list, as
well as the linked list.   So something is really wrong, and the logs would be
very helpful to try to figure out what might be going on.

The good news is your data shouldn't be at risk, and if you can come up with a
solid reproduction case, that would be especially helpful, especially if we can
reproduce it on another machine.

-- 
You are receiving this mail because:
You are watching the assignee of the bug.

^ permalink raw reply	[flat|nested] 24+ messages in thread

* [Bug 187051] "orphan list check failed" error in ext4
  2016-11-05 20:27 [Bug 187051] New: "orphan list check failed" error in ext4 bugzilla-daemon
  2016-11-06 23:48 ` [Bug 187051] " bugzilla-daemon
@ 2016-11-07 10:40 ` bugzilla-daemon
  2016-11-07 17:02 ` bugzilla-daemon
                   ` (20 subsequent siblings)
  22 siblings, 0 replies; 24+ messages in thread
From: bugzilla-daemon @ 2016-11-07 10:40 UTC (permalink / raw)
  To: linux-ext4

https://bugzilla.kernel.org/show_bug.cgi?id=187051

--- Comment #2 from Hussam Al-Tayeb <me@hussam.eu.org> ---
I will post a backtrace of the crash if it happens again.
I had written down the output of the inode check on a paper.

debugfs -R 'ncheck 54264316' /dev/dm-0 
debugfs 1.43.3 (04-Sep-2016)
Inode Pathname
54264316 /usr/share/help/oc/totem/figures

I should be more or less correct since I wrote the whole thing on paper.
The folder itself exists.

The only thing I remember which may have touched this folder since it was last
modified on the 25th of october is that I rebooted to the rescue usb disk once
and did a e4defrag. but I do no even remember the date when I did so.

tune2fs -l /dev/mapper/root says:
Mount count:              1
Maximum mount count:      1
My hard disks are all fscked at reboots.

I will monitor if this happens again and post a full log.
Thank you for the instructions.

-- 
You are receiving this mail because:
You are watching the assignee of the bug.

^ permalink raw reply	[flat|nested] 24+ messages in thread

* [Bug 187051] "orphan list check failed" error in ext4
  2016-11-05 20:27 [Bug 187051] New: "orphan list check failed" error in ext4 bugzilla-daemon
  2016-11-06 23:48 ` [Bug 187051] " bugzilla-daemon
  2016-11-07 10:40 ` bugzilla-daemon
@ 2016-11-07 17:02 ` bugzilla-daemon
  2016-11-07 17:50 ` bugzilla-daemon
                   ` (19 subsequent siblings)
  22 siblings, 0 replies; 24+ messages in thread
From: bugzilla-daemon @ 2016-11-07 17:02 UTC (permalink / raw)
  To: linux-ext4

https://bugzilla.kernel.org/show_bug.cgi?id=187051

--- Comment #3 from Theodore Tso <tytso@mit.edu> ---
So /usr/share/help/oc/totem/figures is a *directory*?    A directory has no
business being on the orphan list, so something very strange is indeed going
on.

If this happens again, please using debugfs to get a stat of the inode in
question:

debugfs /dev/dm-0
debugfs: stat <54264316>

the timestamps in the inode might be helpful.

-- 
You are receiving this mail because:
You are watching the assignee of the bug.

^ permalink raw reply	[flat|nested] 24+ messages in thread

* [Bug 187051] "orphan list check failed" error in ext4
  2016-11-05 20:27 [Bug 187051] New: "orphan list check failed" error in ext4 bugzilla-daemon
                   ` (2 preceding siblings ...)
  2016-11-07 17:02 ` bugzilla-daemon
@ 2016-11-07 17:50 ` bugzilla-daemon
  2016-11-07 19:39 ` bugzilla-daemon
                   ` (18 subsequent siblings)
  22 siblings, 0 replies; 24+ messages in thread
From: bugzilla-daemon @ 2016-11-07 17:50 UTC (permalink / raw)
  To: linux-ext4

https://bugzilla.kernel.org/show_bug.cgi?id=187051

--- Comment #4 from Hussam Al-Tayeb <me@hussam.eu.org> ---
- Yes, a directory. This is partially what scared me.

- Ok, thank you. I will do so if it happens again.

-- 
You are receiving this mail because:
You are watching the assignee of the bug.

^ permalink raw reply	[flat|nested] 24+ messages in thread

* [Bug 187051] "orphan list check failed" error in ext4
  2016-11-05 20:27 [Bug 187051] New: "orphan list check failed" error in ext4 bugzilla-daemon
                   ` (3 preceding siblings ...)
  2016-11-07 17:50 ` bugzilla-daemon
@ 2016-11-07 19:39 ` bugzilla-daemon
  2016-11-07 20:31 ` bugzilla-daemon
                   ` (17 subsequent siblings)
  22 siblings, 0 replies; 24+ messages in thread
From: bugzilla-daemon @ 2016-11-07 19:39 UTC (permalink / raw)
  To: linux-ext4

https://bugzilla.kernel.org/show_bug.cgi?id=187051

--- Comment #5 from Theodore Tso <tytso@mit.edu> ---
If it happens again, my suggestion would be to capture as much information
again, and then just reboot the system.   You can fsck the file system if you
are feeling especially paranoid, but it's likely that any corruption in data
structures are in-memory only.   That being said, the in-memory corruptions
could cause other corruptions that could eventually corrupt data or metadata. 
So rebooting right away would be a good idea.

-- 
You are receiving this mail because:
You are watching the assignee of the bug.

^ permalink raw reply	[flat|nested] 24+ messages in thread

* [Bug 187051] "orphan list check failed" error in ext4
  2016-11-05 20:27 [Bug 187051] New: "orphan list check failed" error in ext4 bugzilla-daemon
                   ` (4 preceding siblings ...)
  2016-11-07 19:39 ` bugzilla-daemon
@ 2016-11-07 20:31 ` bugzilla-daemon
  2016-11-08 15:15 ` bugzilla-daemon
                   ` (16 subsequent siblings)
  22 siblings, 0 replies; 24+ messages in thread
From: bugzilla-daemon @ 2016-11-07 20:31 UTC (permalink / raw)
  To: linux-ext4

https://bugzilla.kernel.org/show_bug.cgi?id=187051

Andreas Dilger <adilger.kernelbugzilla@dilger.ca> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |adilger.kernelbugzilla@dilg
                   |                            |er.ca

--- Comment #6 from Andreas Dilger <adilger.kernelbugzilla@dilger.ca> ---
(In reply to Theodore Tso from comment #3)
> So /usr/share/help/oc/totem/figures is a *directory*?    A directory has no
> business being on the orphan list, so something very strange is indeed going
> on.

It *is* possible to have a directory on the orphan list, since it is possible
to have an open-unlinked directory if one cd's into the directory before it is
unlinked (note nlink = 0, dtime is an inode number, not time):

    $ mkdir foo
    $ cd foo
    $ rmdir ../foo
    $ ls -lid .
    289432 drwxrwxr-x 0 mythtv mythtv 0 Nov  7 13:23 .
    $ sudo debugfs -c -R "stat <289432>" /dev/vg_basemyth/lvroot.basemyth 
    debugfs 1.42.7.wc2 (07-Nov-2013)
    Inode: 289432   Type: directory    Mode:  0775   Flags: 0x0
    Generation: 2752543570    Version: 0x00000000:00000002
    User:   500   Group:   500   Size: 4096
    File ACL: 0    Directory ACL: 0
    Links: 0   Blockcount: 8
    Fragment:  Address: 0    Number: 0    Size: 0
     ctime: 0x5820e2b5:4aa59b94 -- Mon Nov  7 13:23:17 2016
     atime: 0x5820e2b0:0048b0c8 -- Mon Nov  7 13:23:12 2016
     mtime: 0x5820e2b0:0048b0c8 -- Mon Nov  7 13:23:12 2016
    crtime: 0x5820e2b0:0048b0c8 -- Mon Nov  7 13:23:12 2016
     dtime: 0x0003f42b -- Sat Jan  3 16:58:35 1970
    Size of extra inode fields: 28
    BLOCKS:
    (0):1258994

-- 
You are receiving this mail because:
You are watching the assignee of the bug.

^ permalink raw reply	[flat|nested] 24+ messages in thread

* [Bug 187051] "orphan list check failed" error in ext4
  2016-11-05 20:27 [Bug 187051] New: "orphan list check failed" error in ext4 bugzilla-daemon
                   ` (5 preceding siblings ...)
  2016-11-07 20:31 ` bugzilla-daemon
@ 2016-11-08 15:15 ` bugzilla-daemon
  2017-01-28 12:25 ` bugzilla-daemon
                   ` (15 subsequent siblings)
  22 siblings, 0 replies; 24+ messages in thread
From: bugzilla-daemon @ 2016-11-08 15:15 UTC (permalink / raw)
  To: linux-ext4

https://bugzilla.kernel.org/show_bug.cgi?id=187051

--- Comment #7 from Theodore Tso <tytso@mit.edu> ---
Well, yes, but in this case the directory hasn't been deleted, as I understand
the original bug report.  And you can't truncate a directory....

-- 
You are receiving this mail because:
You are watching the assignee of the bug.

^ permalink raw reply	[flat|nested] 24+ messages in thread

* [Bug 187051] "orphan list check failed" error in ext4
  2016-11-05 20:27 [Bug 187051] New: "orphan list check failed" error in ext4 bugzilla-daemon
                   ` (6 preceding siblings ...)
  2016-11-08 15:15 ` bugzilla-daemon
@ 2017-01-28 12:25 ` bugzilla-daemon
  2017-01-28 12:52 ` bugzilla-daemon
                   ` (14 subsequent siblings)
  22 siblings, 0 replies; 24+ messages in thread
From: bugzilla-daemon @ 2017-01-28 12:25 UTC (permalink / raw)
  To: linux-ext4

https://bugzilla.kernel.org/show_bug.cgi?id=187051

--- Comment #8 from Hussam Al-Tayeb <me@hussam.eu.org> ---
Hi. It just happened again.


[root@hades hussam]# debugfs /dev/dm-0
debugfs 1.43.3 (04-Sep-2016)
debugfs:  stat <58073259>
Inode: 58073259   Type: directory    Mode:  0755   Flags: 0x80000
Generation: 1123105798    Version: 0x00000000:0000003e
User:     0   Group:     0   Project:     0   Size: 4096
File ACL: 0    Directory ACL: 0
Links: 2   Blockcount: 8
Fragment:  Address: 0    Number: 0    Size: 0
 ctime: 0x58709c64:841efcb0 -- Sat Jan  7 09:44:36 2017
 atime: 0x588bc31e:c21e1c40 -- Sat Jan 28 00:01:02 2017
 mtime: 0x58709c4e:00000000 -- Sat Jan  7 09:44:14 2017
crtime: 0x58709c64:6127674c -- Sat Jan  7 09:44:36 2017
Size of extra inode fields: 32
Inode checksum: 0xfb5c96a4
EXTENTS:
(0):232268326
debugfs:  quit
[root@hades hussam]# debugfs -R 'ncheck 58073259' /dev/dm-0
debugfs 1.43.3 (04-Sep-2016)
Inode    Pathname
58073259    /usr/share/swig/3.0.11/scilab

-- 
You are receiving this mail because:
You are watching the assignee of the bug.

^ permalink raw reply	[flat|nested] 24+ messages in thread

* [Bug 187051] "orphan list check failed" error in ext4
  2016-11-05 20:27 [Bug 187051] New: "orphan list check failed" error in ext4 bugzilla-daemon
                   ` (7 preceding siblings ...)
  2017-01-28 12:25 ` bugzilla-daemon
@ 2017-01-28 12:52 ` bugzilla-daemon
  2017-01-29  5:09 ` bugzilla-daemon
                   ` (13 subsequent siblings)
  22 siblings, 0 replies; 24+ messages in thread
From: bugzilla-daemon @ 2017-01-28 12:52 UTC (permalink / raw)
  To: linux-ext4

https://bugzilla.kernel.org/show_bug.cgi?id=187051

--- Comment #9 from Hussam Al-Tayeb <me@hussam.eu.org> ---
Created attachment 253371
  --> https://bugzilla.kernel.org/attachment.cgi?id=253371&action=edit
logged trace

This is the crash data according to journalctl

-- 
You are receiving this mail because:
You are watching the assignee of the bug.

^ permalink raw reply	[flat|nested] 24+ messages in thread

* [Bug 187051] "orphan list check failed" error in ext4
  2016-11-05 20:27 [Bug 187051] New: "orphan list check failed" error in ext4 bugzilla-daemon
                   ` (8 preceding siblings ...)
  2017-01-28 12:52 ` bugzilla-daemon
@ 2017-01-29  5:09 ` bugzilla-daemon
  2017-01-29  9:02 ` bugzilla-daemon
                   ` (12 subsequent siblings)
  22 siblings, 0 replies; 24+ messages in thread
From: bugzilla-daemon @ 2017-01-29  5:09 UTC (permalink / raw)
  To: linux-ext4

https://bugzilla.kernel.org/show_bug.cgi?id=187051

--- Comment #10 from Theodore Tso <tytso@mit.edu> ---
... and this case the directory has nlinks > 0, so it definitely shouldn't have
been on the orphaned list at all.

Can you tell me anything about the workload of your system?  And given the mod
times, it looks like it hasn't been modified any time recently, either.   This
is really starting to smell like some kind in-memory corruption --- caused
perhaps by a hardware fault causing a bit flip, or some random bug in any part
of the kernel.

What has changed recently?  Were you using an older kernel?   What was the last
known kernel version that was problem-free for you?

Have you tried using Memtest86+ to make sure you don't have any memory issues?

-- 
You are receiving this mail because:
You are watching the assignee of the bug.

^ permalink raw reply	[flat|nested] 24+ messages in thread

* [Bug 187051] "orphan list check failed" error in ext4
  2016-11-05 20:27 [Bug 187051] New: "orphan list check failed" error in ext4 bugzilla-daemon
                   ` (9 preceding siblings ...)
  2017-01-29  5:09 ` bugzilla-daemon
@ 2017-01-29  9:02 ` bugzilla-daemon
  2017-01-29 10:00 ` bugzilla-daemon
                   ` (11 subsequent siblings)
  22 siblings, 0 replies; 24+ messages in thread
From: bugzilla-daemon @ 2017-01-29  9:02 UTC (permalink / raw)
  To: linux-ext4

https://bugzilla.kernel.org/show_bug.cgi?id=187051

--- Comment #11 from Hussam Al-Tayeb <me@hussam.eu.org> ---
At the time, the system was under very heavy load as I was compiling
libreoffice and it was heavily swapping.
I recently upgraded from linux 4.8.13 to 4.9.5 and then 4.9.6

I will do a Memtest86+ test to check if the issue is memory related.
Thank you.

-- 
You are receiving this mail because:
You are watching the assignee of the bug.

^ permalink raw reply	[flat|nested] 24+ messages in thread

* [Bug 187051] "orphan list check failed" error in ext4
  2016-11-05 20:27 [Bug 187051] New: "orphan list check failed" error in ext4 bugzilla-daemon
                   ` (10 preceding siblings ...)
  2017-01-29  9:02 ` bugzilla-daemon
@ 2017-01-29 10:00 ` bugzilla-daemon
  2017-01-29 11:53 ` bugzilla-daemon
                   ` (10 subsequent siblings)
  22 siblings, 0 replies; 24+ messages in thread
From: bugzilla-daemon @ 2017-01-29 10:00 UTC (permalink / raw)
  To: linux-ext4

https://bugzilla.kernel.org/show_bug.cgi?id=187051

--- Comment #12 from Hussam Al-Tayeb <me@hussam.eu.org> ---
One more note. I definitely never had this issue before linux 4.8.

-- 
You are receiving this mail because:
You are watching the assignee of the bug.

^ permalink raw reply	[flat|nested] 24+ messages in thread

* [Bug 187051] "orphan list check failed" error in ext4
  2016-11-05 20:27 [Bug 187051] New: "orphan list check failed" error in ext4 bugzilla-daemon
                   ` (11 preceding siblings ...)
  2017-01-29 10:00 ` bugzilla-daemon
@ 2017-01-29 11:53 ` bugzilla-daemon
  2017-01-29 11:56 ` bugzilla-daemon
                   ` (9 subsequent siblings)
  22 siblings, 0 replies; 24+ messages in thread
From: bugzilla-daemon @ 2017-01-29 11:53 UTC (permalink / raw)
  To: linux-ext4

https://bugzilla.kernel.org/show_bug.cgi?id=187051

--- Comment #13 from Hussam Al-Tayeb <me@hussam.eu.org> ---
Created attachment 253481
  --> https://bugzilla.kernel.org/attachment.cgi?id=253481&action=edit
memtest at 65%

-- 
You are receiving this mail because:
You are watching the assignee of the bug.

^ permalink raw reply	[flat|nested] 24+ messages in thread

* [Bug 187051] "orphan list check failed" error in ext4
  2016-11-05 20:27 [Bug 187051] New: "orphan list check failed" error in ext4 bugzilla-daemon
                   ` (12 preceding siblings ...)
  2017-01-29 11:53 ` bugzilla-daemon
@ 2017-01-29 11:56 ` bugzilla-daemon
  2017-01-29 11:56 ` bugzilla-daemon
                   ` (8 subsequent siblings)
  22 siblings, 0 replies; 24+ messages in thread
From: bugzilla-daemon @ 2017-01-29 11:56 UTC (permalink / raw)
  To: linux-ext4

https://bugzilla.kernel.org/show_bug.cgi?id=187051

--- Comment #14 from Hussam Al-Tayeb <me@hussam.eu.org> ---
Created attachment 253491
  --> https://bugzilla.kernel.org/attachment.cgi?id=253491&action=edit
memtest at 78%

-- 
You are receiving this mail because:
You are watching the assignee of the bug.

^ permalink raw reply	[flat|nested] 24+ messages in thread

* [Bug 187051] "orphan list check failed" error in ext4
  2016-11-05 20:27 [Bug 187051] New: "orphan list check failed" error in ext4 bugzilla-daemon
                   ` (13 preceding siblings ...)
  2017-01-29 11:56 ` bugzilla-daemon
@ 2017-01-29 11:56 ` bugzilla-daemon
  2017-01-29 11:59 ` bugzilla-daemon
                   ` (7 subsequent siblings)
  22 siblings, 0 replies; 24+ messages in thread
From: bugzilla-daemon @ 2017-01-29 11:56 UTC (permalink / raw)
  To: linux-ext4

https://bugzilla.kernel.org/show_bug.cgi?id=187051

--- Comment #15 from Hussam Al-Tayeb <me@hussam.eu.org> ---
Created attachment 253501
  --> https://bugzilla.kernel.org/attachment.cgi?id=253501&action=edit
memtest at 99%

-- 
You are receiving this mail because:
You are watching the assignee of the bug.

^ permalink raw reply	[flat|nested] 24+ messages in thread

* [Bug 187051] "orphan list check failed" error in ext4
  2016-11-05 20:27 [Bug 187051] New: "orphan list check failed" error in ext4 bugzilla-daemon
                   ` (14 preceding siblings ...)
  2017-01-29 11:56 ` bugzilla-daemon
@ 2017-01-29 11:59 ` bugzilla-daemon
  2017-01-30  2:21 ` bugzilla-daemon
                   ` (6 subsequent siblings)
  22 siblings, 0 replies; 24+ messages in thread
From: bugzilla-daemon @ 2017-01-29 11:59 UTC (permalink / raw)
  To: linux-ext4

https://bugzilla.kernel.org/show_bug.cgi?id=187051

--- Comment #16 from Hussam Al-Tayeb <me@hussam.eu.org> ---
Created attachment 253511
  --> https://bugzilla.kernel.org/attachment.cgi?id=253511&action=edit
memtest almost complete.

Hi again. I did a memtest though only one pass since I need to type some
documents.
I think it finished without errors.

-- 
You are receiving this mail because:
You are watching the assignee of the bug.

^ permalink raw reply	[flat|nested] 24+ messages in thread

* [Bug 187051] "orphan list check failed" error in ext4
  2016-11-05 20:27 [Bug 187051] New: "orphan list check failed" error in ext4 bugzilla-daemon
                   ` (15 preceding siblings ...)
  2017-01-29 11:59 ` bugzilla-daemon
@ 2017-01-30  2:21 ` bugzilla-daemon
  2017-01-30 15:24 ` bugzilla-daemon
                   ` (5 subsequent siblings)
  22 siblings, 0 replies; 24+ messages in thread
From: bugzilla-daemon @ 2017-01-30  2:21 UTC (permalink / raw)
  To: linux-ext4

https://bugzilla.kernel.org/show_bug.cgi?id=187051

--- Comment #17 from Theodore Tso <tytso@mit.edu> ---
So you're seeing it on both 4.9 and 4.8, correct?

How reliably can you reproduce the failure?

It's possible turning off metadata_csum might make a difference, although it
would seem surprising since the expected failure modes would involve checksum
failures, and that's not what you're seeing.

-- 
You are receiving this mail because:
You are watching the assignee of the bug.

^ permalink raw reply	[flat|nested] 24+ messages in thread

* [Bug 187051] "orphan list check failed" error in ext4
  2016-11-05 20:27 [Bug 187051] New: "orphan list check failed" error in ext4 bugzilla-daemon
                   ` (16 preceding siblings ...)
  2017-01-30  2:21 ` bugzilla-daemon
@ 2017-01-30 15:24 ` bugzilla-daemon
  2017-08-28 11:57 ` bugzilla-daemon
                   ` (4 subsequent siblings)
  22 siblings, 0 replies; 24+ messages in thread
From: bugzilla-daemon @ 2017-01-30 15:24 UTC (permalink / raw)
  To: linux-ext4

https://bugzilla.kernel.org/show_bug.cgi?id=187051

--- Comment #18 from Hussam Al-Tayeb <me@hussam.eu.org> ---
No, it was only happening once a month or so.


I upgraded today to a corei5 with 8GB ram. I just plugged in my old hard disks
and rebuilt the initramfs image.
Perhaps the old hardware was too little for the load I was putting on the
computer.
I'll wait and see. Hopefully the issues will be gone.

-- 
You are receiving this mail because:
You are watching the assignee of the bug.

^ permalink raw reply	[flat|nested] 24+ messages in thread

* [Bug 187051] "orphan list check failed" error in ext4
  2016-11-05 20:27 [Bug 187051] New: "orphan list check failed" error in ext4 bugzilla-daemon
                   ` (17 preceding siblings ...)
  2017-01-30 15:24 ` bugzilla-daemon
@ 2017-08-28 11:57 ` bugzilla-daemon
  2017-10-03 22:44 ` bugzilla-daemon
                   ` (3 subsequent siblings)
  22 siblings, 0 replies; 24+ messages in thread
From: bugzilla-daemon @ 2017-08-28 11:57 UTC (permalink / raw)
  To: linux-ext4

https://bugzilla.kernel.org/show_bug.cgi?id=187051

Hussam Al-Tayeb (me@hussam.eu.org) changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|NEW                         |RESOLVED
         Resolution|---                         |INVALID

--- Comment #19 from Hussam Al-Tayeb (me@hussam.eu.org) ---
I haven't seen this again on the new computer. I'm closing this as invalid.

-- 
You are receiving this mail because:
You are watching the assignee of the bug.

^ permalink raw reply	[flat|nested] 24+ messages in thread

* [Bug 187051] "orphan list check failed" error in ext4
  2016-11-05 20:27 [Bug 187051] New: "orphan list check failed" error in ext4 bugzilla-daemon
                   ` (18 preceding siblings ...)
  2017-08-28 11:57 ` bugzilla-daemon
@ 2017-10-03 22:44 ` bugzilla-daemon
  2017-10-04 12:35 ` bugzilla-daemon
                   ` (2 subsequent siblings)
  22 siblings, 0 replies; 24+ messages in thread
From: bugzilla-daemon @ 2017-10-03 22:44 UTC (permalink / raw)
  To: linux-ext4

https://bugzilla.kernel.org/show_bug.cgi?id=187051

Tomasz Goliński (tomaszg@math.uwb.edu.pl) changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |tomaszg@math.uwb.edu.pl

--- Comment #20 from Tomasz Goliński (tomaszg@math.uwb.edu.pl) ---
I just saw a similar error on 4.13.3 kernel. In my case inode in question was a
file. It's the first time I see this error and I haven't changed anything in
the machine for a while (aside from updating the kernel few weeks ago).

uname -a
Linux htpc 4.13.3-gentoo #2 SMP PREEMPT Tue Sep 26 14:10:32 CEST 2017 x86_64
AMD Athlon 5370 APU with Radeon R3 AuthenticAMD GNU/Linux

dmesg:

[280833.904264] EXT4-fs (sda1): Inode 154567855 (ffff880100005f80): orphan list
check failed!
[280833.904276] ffff880100005f80: 0001f30a 00000004 00000000 00000000 
................
[280833.904278] ffff880100005f90: 00000001 24d8e41a 00000000 00000000 
.......$........
[280833.904280] ffff880100005fa0: 00000000 00000000 00000000 00000000 
................
[280833.904282] ffff880100005fb0: 00000000 00000000 00000000 00000000 
................
[280833.904284] ffff880100005fc0: 00000000 00000000 000049b4 00000000 
.........I......
[280833.904286] ffff880100005fd0: 00080000 00000080 00000000 00000000 
................
[280833.904288] ffff880100005fe0: 00005fe0 ffff8801 00005fe0 ffff8801 
._......._......
[280833.904290] ffff880100005ff0: 00000000 00000000 00000000 00000000 
................
[280833.904292] ffff880100006000: 004f6d2c ffff8801 00006000 ffff8801 
,mO......`......
[280833.904294] ffff880100006010: 000009a8 00000000 00000000 00000000 
................
[280833.904296] ffff880100006020: 00006020 ffff8801 00006020 ffff8801   `......
`......
[280833.904297] ffff880100006030: 00000000 00000000 00000000 00000000 
................
[280833.904299] ffff880100006040: 00000000 00000000 00006048 ffff8801 
........H`......
[280833.904301] ffff880100006050: 00006048 ffff8801 00000000 00000000 
H`..............
[280833.904303] ffff880100006060: 00000000 00000000 000d81a4 000000fa 
................
[280833.904305] ffff880100006070: 000000fa 00001000 ffffffff ffffffff 
................
[280833.904306] ffff880100006080: ffffffff ffffffff 81617100 ffffffff 
.........qa.....
[280833.904308] ffff880100006090: 39c42000 ffff8801 000061c8 ffff8801  .
.9.....a......
[280833.904310] ffff8801000060a0: 093684af 00000000 00000001 00000000 
..6.............
[280833.904312] ffff8801000060b0: 000009a8 00000000 59d0f5d4 00000000 
...........Y....
[280833.904314] ffff8801000060c0: 1e1a23d3 00000000 59cf614a 00000000 
.#......Ja.Y....
[280833.904315] ffff8801000060d0: 00000000 00000000 59d0f5d4 00000000 
...........Y....
[280833.904317] ffff8801000060e0: 1e1a23d3 00000000 00000000 00000000 
.#..............
[280833.904353] ffff8801000060f0: 0000000c 00000000 00000008 00000000 
................
[280833.904355] ffff880100006100: 00000060 00000000 00000000 00000000 
`...............
[280833.904357] ffff880100006110: 00006110 ffff8801 00006110 ffff8801 
.a.......a......
[280833.904359] ffff880100006120: 00000000 00000000 00000000 00000000 
................
[280833.904360] ffff880100006130: 0569d91c 00000001 00000000 00000000 
..i.............
[280833.904363] ffff880100006140: 00000000 00000000 00000000 00000000 
................
[280833.904364] ffff880100006150: 00006150 ffff8801 00006150 ffff8801 
Pa......Pa......
[280833.904366] ffff880100006160: 00006160 ffff8801 00006160 ffff8801 
`a......`a......
[280833.904368] ffff880100006170: 00006170 ffff8801 00006170 ffff8801 
pa......pa......
[280833.904370] ffff880100006180: 00006180 ffff8801 00006180 ffff8801 
.a.......a......
[280833.904372] ffff880100006190: 00000000 00000000 811a0610 ffffffff 
................
[280833.904373] ffff8801000061a0: 00000001 00000000 00000000 00000000 
................
[280833.904375] ffff8801000061b0: 00000000 00000000 816171c0 ffffffff 
.........qa.....
[280833.904377] ffff8801000061c0: 00000000 00000000 00006068 ffff8801 
........h`......
[280833.904379] ffff8801000061d0: 01180020 00000000 00000000 00000000  
...............
[280833.904380] ffff8801000061e0: 00000000 00000000 00000000 00000000 
................
[280833.904382] ffff8801000061f0: 00000000 00000000 000061f8 ffff8801 
.........a......
[280833.904384] ffff880100006200: 000061f8 ffff8801 00000000 00000000 
.a..............
[280833.904385] ffff880100006210: 00000000 00000000 00000000 00000000 
................
[280833.904387] ffff880100006220: 00000000 00000000 00000001 00000000 
................
[280833.904389] ffff880100006230: 81617ac0 ffffffff 00000010 00000000 
.za.............
[280833.904391] ffff880100006240: 00000000 014200ca 00006248 ffff8801 
......B.Hb......
[280833.904392] ffff880100006250: 00006248 ffff8801 00000000 00000000 
Hb..............
[280833.904394] ffff880100006260: 00000000 00000000 00006268 ffff8801 
........hb......
[280833.904396] ffff880100006270: 00006268 ffff8801 00000000 00000000 
hb..............
[280833.904398] ffff880100006280: 36640dc5 00000000 00000000 00000000 
..d6............
[280833.904399] ffff880100006290: 00000000 00000000 00000000 00000000 
................
[280833.904401] ffff8801000062a0: 00000000 00000000 59d0f5d4 00000000 
...........Y....
[280833.904403] ffff8801000062b0: 1e1a23d3 00000000 000062b8 ffff8801 
.#.......b......
[280833.904405] ffff8801000062c0: 000062b8 ffff8801 00000000 00000000 
.b..............
[280833.904407] ffff8801000062d0: 00000000 00000000 00000000 00000000 
................
[280833.904409] ffff8801000062e0: 00000000 00000000 000062e8 ffff8801 
.........b......
[280833.904411] ffff8801000062f0: 000062e8 ffff8801 00000000 00000000 
.b..............
[280833.904412] ffff880100006300: 00000000 ffffffff 00000000 00000000 
................
[280833.904414] ffff880100006310: 00000000 00000020 00000000 00000000  ....
...........
[280833.904416] ffff880100006320: 00006320 ffff8801 00006320 ffff8801   c......
c......
[280833.904418] ffff880100006330: ffffffe0 0000000f 00006338 ffff8801 
........8c......
[280833.904419] ffff880100006340: 00006338 ffff8801 8119bbe0 ffffffff 
8c..............
[280833.904421] ffff880100006350: 00000000 00000000 00213ce0 00213ce0 
.........<!..<!.
[280833.904423] ffff880100006360: 00000000 00000000                    ........
[280833.904429] CPU: 1 PID: 211 Comm: khugepaged Not tainted 4.13.3-gentoo #2
[280833.904430] Hardware name: To Be Filled By O.E.M. To Be Filled By
O.E.M./AM1H-ITX, BIOS P1.50 04/29/2015
[280833.904432] Call Trace:
[280833.904445]  ? dump_stack+0x46/0x5e
[280833.904450]  ? ext4_destroy_inode+0x81/0x88
[280833.904454]  ? dispose_list+0x30/0x40
[280833.904456]  ? prune_icache_sb+0x48/0x58
[280833.904460]  ? super_cache_scan+0x133/0x180
[280833.904466]  ? shrink_slab.part.8.constprop.22+0x19e/0x250
[280833.904468]  ? shrink_node+0x58/0x190
[280833.904470]  ? try_to_free_pages+0x15a/0x320
[280833.904475]  ? __alloc_pages_nodemask+0x3db/0xaf0
[280833.904481]  ? collect_expired_timers+0xb0/0xb0
[280833.904485]  ? khugepaged+0x256/0x1590
[280833.904491]  ? do_wait_intr+0x80/0x80
[280833.904496]  ? kthread+0xf2/0x128
[280833.904498]  ? collapse_shmem.isra.4+0xa20/0xa20
[280833.904500]  ? kthread_create_on_node+0x40/0x40
[280833.904504]  ? call_usermodehelper_exec_async+0x134/0x138
[280833.904507]  ? ret_from_fork+0x22/0x30

debugfs:  stat <154567855>

Inode: 154567855   Type: regular    Mode:  0644   Flags: 0x80000
Generation: 912526789    Version: 0x00000000:00000001
User:   250   Group:   250   Project:     0   Size: 2472
File ACL: 0    Directory ACL: 0
Links: 1   Blockcount: 8
Fragment:  Address: 0    Number: 0    Size: 0
 ctime: 0x59d0f5d4:78688f4c -- Sun Oct  1 16:04:04 2017
 atime: 0x59d0f5d4:78688f4c -- Sun Oct  1 16:04:04 2017
 mtime: 0x59cf614a:00000000 -- Sat Sep 30 11:18:02 2017
crtime: 0x59d0f5d4:78688f4c -- Sun Oct  1 16:04:04 2017
Size of extra inode fields: 32
EXTENTS:
(0):618193946


debugfs -R 'ncheck 154567855' /dev/sda1
debugfs 1.43.3 (04-Sep-2016)
Inode   Pathname
154567855      
/usr/portage/metadata/md5-cache/dev-ruby/faraday_middleware-multi_json-0.0.6-r1

-- 
You are receiving this mail because:
You are watching the assignee of the bug.

^ permalink raw reply	[flat|nested] 24+ messages in thread

* [Bug 187051] "orphan list check failed" error in ext4
  2016-11-05 20:27 [Bug 187051] New: "orphan list check failed" error in ext4 bugzilla-daemon
                   ` (19 preceding siblings ...)
  2017-10-03 22:44 ` bugzilla-daemon
@ 2017-10-04 12:35 ` bugzilla-daemon
  2017-10-05  2:27 ` bugzilla-daemon
  2017-10-05 11:00 ` bugzilla-daemon
  22 siblings, 0 replies; 24+ messages in thread
From: bugzilla-daemon @ 2017-10-04 12:35 UTC (permalink / raw)
  To: linux-ext4

https://bugzilla.kernel.org/show_bug.cgi?id=187051

Hussam Al-Tayeb (me@hussam.eu.org) changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|RESOLVED                    |REOPENED
         Resolution|INVALID                     |---

-- 
You are receiving this mail because:
You are watching the assignee of the bug.

^ permalink raw reply	[flat|nested] 24+ messages in thread

* [Bug 187051] "orphan list check failed" error in ext4
  2016-11-05 20:27 [Bug 187051] New: "orphan list check failed" error in ext4 bugzilla-daemon
                   ` (20 preceding siblings ...)
  2017-10-04 12:35 ` bugzilla-daemon
@ 2017-10-05  2:27 ` bugzilla-daemon
  2017-10-05 11:00 ` bugzilla-daemon
  22 siblings, 0 replies; 24+ messages in thread
From: bugzilla-daemon @ 2017-10-05  2:27 UTC (permalink / raw)
  To: linux-ext4

https://bugzilla.kernel.org/show_bug.cgi?id=187051

--- Comment #21 from Theodore Tso (tytso@mit.edu) ---
Can you try to see if you can reliably reproduce the problem?   

And then can you see if you can reproduce it on some other hardware?

Note that one of the reasons why it's unfortunate when people reopen bugs,
especially when the original reporter has concluded it was probably caused by a
hardware issue in his case, is that it can sometimes be hard to disambiguate
data from the first report and successive reports.   (This is why I hate it
when Ubuntu users, in particular, glom on to reports by googling by symptom,
and assume that just because it has the same symptom, that it must be related
to every other problem with the same symptom, and all pile onto the same
Launchpad bug.  It's also why I believe Launchpad is mostly hopeless for a
developer....)

-- 
You are receiving this mail because:
You are watching the assignee of the bug.

^ permalink raw reply	[flat|nested] 24+ messages in thread

* [Bug 187051] "orphan list check failed" error in ext4
  2016-11-05 20:27 [Bug 187051] New: "orphan list check failed" error in ext4 bugzilla-daemon
                   ` (21 preceding siblings ...)
  2017-10-05  2:27 ` bugzilla-daemon
@ 2017-10-05 11:00 ` bugzilla-daemon
  22 siblings, 0 replies; 24+ messages in thread
From: bugzilla-daemon @ 2017-10-05 11:00 UTC (permalink / raw)
  To: linux-ext4

https://bugzilla.kernel.org/show_bug.cgi?id=187051

--- Comment #22 from Tomasz Goliński (tomaszg@math.uwb.edu.pl) ---
I can't reproduce it. So far it was only a single occurrence and I don't even
know what caused it. It was also during compilation but it didn't see any
problem, only the information in the log.

I didn't want to reopen the problem and I'm well aware that it is to transient
to be debugged in a systematic way. In light of this I didn't want to open a
new bug, it would be mostly pointless. However since it is something that
"should never happen" another example of it might (or might not ;) ) shed some
light on the issue.

-- 
You are receiving this mail because:
You are watching the assignee of the bug.

^ permalink raw reply	[flat|nested] 24+ messages in thread

end of thread, other threads:[~2017-10-05 11:01 UTC | newest]

Thread overview: 24+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2016-11-05 20:27 [Bug 187051] New: "orphan list check failed" error in ext4 bugzilla-daemon
2016-11-06 23:48 ` [Bug 187051] " bugzilla-daemon
2016-11-07 10:40 ` bugzilla-daemon
2016-11-07 17:02 ` bugzilla-daemon
2016-11-07 17:50 ` bugzilla-daemon
2016-11-07 19:39 ` bugzilla-daemon
2016-11-07 20:31 ` bugzilla-daemon
2016-11-08 15:15 ` bugzilla-daemon
2017-01-28 12:25 ` bugzilla-daemon
2017-01-28 12:52 ` bugzilla-daemon
2017-01-29  5:09 ` bugzilla-daemon
2017-01-29  9:02 ` bugzilla-daemon
2017-01-29 10:00 ` bugzilla-daemon
2017-01-29 11:53 ` bugzilla-daemon
2017-01-29 11:56 ` bugzilla-daemon
2017-01-29 11:56 ` bugzilla-daemon
2017-01-29 11:59 ` bugzilla-daemon
2017-01-30  2:21 ` bugzilla-daemon
2017-01-30 15:24 ` bugzilla-daemon
2017-08-28 11:57 ` bugzilla-daemon
2017-10-03 22:44 ` bugzilla-daemon
2017-10-04 12:35 ` bugzilla-daemon
2017-10-05  2:27 ` bugzilla-daemon
2017-10-05 11:00 ` bugzilla-daemon

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.