* BTRFS critical (device dm-0): invalid dir item name len: 45389
From: john terragon @ 2014-09-04 5:30 UTC
To: Btrfs BTRFS
Hi.
When I traverse one of my btrfs filesystems, for example with a simple
"find /", I get the following in kmsg:
BTRFS critical (device dm-0): invalid dir item name len: 45389
The message appears just one time (so I guess it involves just one
file/dir). dm-0 is the first dmcrypt device of a pair on which I have
btrfs in RAID0 (btrfs native raid). Though I can't be 100% sure, this
seems to be a very recent problem (I would have noticed something
"critical" in kmsg if it happened before). Everything else seems to
work fine.
So, should I be worried? Is there a way to fix this? (I assume that a
scrub would not do any good since it seems to be related to btrfs data
structures more than actual file data). Is there at least a way to
know which file/dir is involved? Maybe a verbose debug mode? Or maybe
I should just add some printk in the verify_dir_item function that
seems to generate the message.
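One way to pull the affected device and the reported length out of the kernel log is sketched below. It is only an illustration: the regular expression is modeled on the single message quoted above, the helper name `find_bad_dir_items` is made up, and the 255-byte limit is an assumption based on the kernel's BTRFS_NAME_LEN constant (xattr items may be checked against a different limit). It does not identify the directory; it only shows how far out of range the reported value is.

```python
import re

# Pattern modeled on the one message quoted in this thread; other kernel
# versions may format the line differently (assumption).
MSG_RE = re.compile(
    r"BTRFS critical \(device (?P<dev>[^)]+)\): "
    r"invalid dir item name len: (?P<len>\d+)"
)

# Assumed limit for regular directory entries (BTRFS_NAME_LEN in the kernel).
ASSUMED_NAME_LEN_MAX = 255


def find_bad_dir_items(log_lines):
    """Return (device, name_len, over_limit) for every matching log line."""
    hits = []
    for line in log_lines:
        m = MSG_RE.search(line)
        if m:
            name_len = int(m.group("len"))
            over_limit = name_len > ASSUMED_NAME_LEN_MAX
            hits.append((m.group("dev"), name_len, over_limit))
    return hits


if __name__ == "__main__":
    sample = ["BTRFS critical (device dm-0): invalid dir item name len: 45389"]
    for dev, name_len, over in find_bad_dir_items(sample):
        print(dev, name_len, "over limit" if over else "within limit")
```

In practice the lines would come from dmesg output or a saved kernel log rather than a hard-coded sample.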
Thanks
John
* Re: BTRFS critical (device dm-0): invalid dir item name len: 45389
From: john terragon @ 2014-09-04 13:03 UTC
To: Btrfs BTRFS
Some more details about this problem:
- The directory involved is /lib/modules/3.17.0-rc3-cu3/kernel/drivers/iio/gyro
- In that dir there should be a kernel object named hid-sensor-gyro-3d.ko,
  but there's no trace of it.
- That dir cannot be removed or overwritten. rm -rf fails, saying that the
  dir cannot be removed because it's not empty (even with -rf?), and trying
  to reinstall the .deb package for that kernel image (thus overwriting
  that dir) ends in a segfault.
The only workaround is to mv that dir (well, I simply mv'd the whole
3.17.0-rc3-cu3 dir, but it should also work for just the gyro subdir) and
reinstall the deb package.
So, it's pretty serious because there's actual loss of data (even
though I was lucky I just lost a ko I don't use).
John
* Re: BTRFS critical (device dm-0): invalid dir item name len: 45389
From: Duncan @ 2014-09-04 22:06 UTC
To: linux-btrfs
john terragon posted on Thu, 04 Sep 2014 07:30:37 +0200 as excerpted:
> dm-0 is the first dmcrypt device of a pair on which I have
> btrfs in RAID0 (btrfs native raid).
>
> I assume that a scrub would not do any good since it seems to be
> related to btrfs data structures more than actual file data[.]
On btrfs raid0 scrub can't fix anything anyway, tho it might give you a
bit more information about problems it finds.
All scrub does is verify block checksums. If a block fails and there's a
second copy (as there is in single-device dup mode for metadata, and in
multi-device raid1 and raid10 modes; raid5/6 scrub is still broken), it
verifies that copy's checksum and, assuming it verifies, rewrites the bad
copy from the good one. The same thing happens in normal operation if
btrfs happens to come across the problem, but scrub gives you a way to
systematically check the entire filesystem.
If there's no valid second copy, either because the block was in a single-
copy chunk (btrfs raid0 or single mode) or because the second copy is bad
or the device it was on is missing (raid1/10 with a missing device),
there's obviously no good copy to overwrite the bad one with, so the
problem cannot be fixed. However, if it's in a data chunk, scrub should
log the file it was part of. For metadata you simply get a number, and
have to use btrfs debugging tools to figure out what's affected.
Since you said you're using btrfs raid0 mode, there's no second copy, so
scrub might find bad checksums and give you a bit of additional
information about what's affected, but it can never fix them. Scrub on
btrfs raid0 (or single) mode is thus purely diagnostic, with no
possibility of a fix, at least from scrub itself. (Scrub for the raid56
modes remains broken.)
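The rule described above can be summarised in a few lines of Python. This is a sketch of the reasoning in this message only, not of how the kernel implements scrub; the function name and the profile strings are chosen for illustration, and raid5/6 is left out because scrub for those modes is described as broken.

```python
# Profiles that, per the explanation above, keep a second copy of each block.
PROFILES_WITH_SECOND_COPY = {"dup", "raid1", "raid10"}


def scrub_outcome(profile, second_copy_ok=True):
    """Describe what scrub can do about one block with a bad checksum.

    profile: chunk profile of the affected block, e.g. "raid0" or "raid1".
    second_copy_ok: whether the other copy exists and passes its checksum
        (ignored for profiles that have no second copy).
    """
    if profile not in PROFILES_WITH_SECOND_COPY:
        return "report only"  # raid0/single: nothing to repair from
    if not second_copy_ok:
        return "report only"  # second copy is bad or its device is missing
    return "repair"  # rewrite the bad copy from the good one


if __name__ == "__main__":
    for profile in ("raid0", "single", "dup", "raid1", "raid10"):
        print(profile, "->", scrub_outcome(profile))
```

Note that data and metadata chunks can use different profiles, so the answer depends on which kind of block is damaged.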
--
Duncan - List replies preferred. No HTML msgs.
"Every nonfree program has a lord, a master --
and if you use the program, he is your master." Richard Stallman
* Re: BTRFS critical (device dm-0): invalid dir item name len: 45389
From: Duncan @ 2014-09-04 22:26 UTC
To: linux-btrfs
john terragon posted on Thu, 04 Sep 2014 15:03:04 +0200 as excerpted:
> Some more details about this problem:
>
> - the directory involved is
>   /lib/modules/3.17.0-rc3-cu3/kernel/drivers/iio/gyro
> - in that dir there should be kernel object named hid-sensor-gyro-3d.ko
>   but there's no trace of it
> - that dir cannot be removed or overwritten. rm -rf fails saying that the
>   dir cannot be removed because it's not empty (?, even with -rf ?) and
>   trying to reinstall the .deb package for that kernel image (thus
>   overwriting that dir) ends up in a segfault
>
> The only workaround is to mv that dir (well, I simply mv the whole
> 3.17.0-rc3-cu3 dir but it should work also for the gyro subdir) and
> reinstall the deb package.
>
> So, it's pretty serious because there's actual loss of data (even though
> I was lucky I just lost a ko I don't use).
I'd do a btrfs check (read-only, without --repair or any other switch) to
see what it comes up with, then if the output looks reasonable, make sure
my backups are fresh and do a btrfs check --repair.
If it fails or makes the problem worse, you can then mkfs and restore
from the backups.
Meanwhile, nobody sane would keep valuable data on a raid0 (any raid0,
btrfs or not) without backups anyway, as it's simply too risky, so by
definition the problem /cannot/ be "pretty serious". By definition,
what's stored on a raid0 is more or less throw-away data that can either
be recomputed or refetched from backup (which might be the net), or is
simply tmp/cache in the first place. So losing it can never be defined
as "pretty serious", unless of course you're using raid0 for valuable
data it was never intended for, and don't keep current backups, which as
I said is insanity for any responsible sysadmin working with raid0 (and
with that I include every home user responsible for their own system;
it's a responsibility taken too lightly by many).
So if you're characterizing any potential loss of data on any sort of
raid0 (btrfs or otherwise) as "pretty serious", I *STRONGLY* recommend
that you reconsider using raid0 in the first place, because loss of
ANYTHING, up to and including EVERYTHING on a raid0, by definition cannot
be "pretty serious" or you're simply using it wrong. And if you're doing
a mkfs and restore from backup, that's the perfect opportunity to
reconsider and choose something more appropriate. =:^)
* Re: BTRFS critical (device dm-0): invalid dir item name len: 45389
From: john terragon @ 2014-09-05 0:20 UTC
To: Duncan; +Cc: Btrfs BTRFS
Everyone knows what raid0 entails. Moreover, with btrfs being an
experimental fs, not having backups would obviously be pure idiocy.
I wrote that it was "pretty serious" because the situation came out of
nowhere on a low-traffic fs on which the most exciting thing that can
happen is an occasional snapshot once in a while when I do a heavy
update with apt-get (a snapshot that always gets removed right after the
update invariably goes well and my paranoia fades).
The problem seems to have happened right after a hard lock, probably due
to 3.17.0-rc3 (and before you explain to me what that rc3 stands for,
let me tell you that I'm not complaining, I knew what I was doing). I
had to power off "brutally" and right after that the problem occurred.
I'm pretty sure about that because for obvious reasons I rsync the
hell out of that filesystem every chance I get. Rsync obviously does
a traversal of the fs and so the "critical" (btrfs's word, not mine)
problem would have shown up in kmsg (another place that I watch like a
hawk, because of the raid0+experimental fs thing).
I don't know if you are a btrfs developer but that "pretty serious"
was not meant to offend them nor to complain. Actually I've been a
pretty happy customer up until now (and I still am) because I have
never been bitten by any big bug even with such a complex fs. I just
have this zombie directory that can't be rm'd, but I mv'd it out of the
way and everything is fine. It'll get sorted when I do the next
wipe-and-restore iteration (again, being experimental, I don't let the
fs become too "old").
So, the "pretty serious" was more due to the surprise than anything else.
John
* Re: BTRFS critical (device dm-0): invalid dir item name len: 45389
From: Chris Murphy @ 2014-09-05 1:41 UTC
To: Btrfs BTRFS
On Sep 4, 2014, at 6:20 PM, john terragon <jterragon@gmail.com> wrote:
> Everyone knows what raid0 entails. Moreover, with btrfs being an
> experimental fs, not having backups would obviously be pure idiocy.
That is a bit of hyperbole. There is such a thing as innocently ignorant, as well as the misinformed.
> I don't know if you are a btrfs developer but that "pretty serious"
> was not meant to offend them nor to complain.
I think the "pretty serious" statement is legit. No one wants filesystems themselves to be responsible for losing data; if they did this with any regularity we wouldn't trust them, and we need to trust them. Since there are myriad sources of data loss or corruption, a good autopsy is needed to understand the problem, and then to decide whether the fix is one of prevention or can also be done during a normal read, scrub, or btrfsck.
If the conditions can be repeated and reproduce the problem, that's a happy day.
> So, the "pretty serious" was more due to the surprise than anything else.
Right. Off chance scrub might fix the problem and then the evidence of why a regular read can't deal with it is gone. So I'd either take a btrfs-image first, or do a read-only non-background scrub and see what you get in dmesg:
btrfs scrub start -BRr <mp>
And run btrfs check on the unmounted filesystem, without any options.
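A cautious way to script those steps is to build the command lines first and look at them before running anything. The sketch below assumes btrfs and btrfs-image from btrfs-progs are installed; the mount point, device and image path are placeholders, and the helper name is invented. It only prints the commands. Whether to take the image, run the scrub, or both is left to the reader, and the filesystem has to be unmounted between the scrub and the check.

```python
import shlex


def diagnostic_commands(mount_point, device, image_path):
    """Build the non-modifying diagnostics suggested above, in order."""
    return [
        # Metadata image first, so the evidence survives later steps.
        ["btrfs-image", device, image_path],
        # Read-only (-r), foreground (-B) scrub with raw statistics (-R);
        # this one needs the filesystem mounted.
        ["btrfs", "scrub", "start", "-BRr", mount_point],
        # Check with no options; this one needs the filesystem unmounted.
        ["btrfs", "check", device],
    ]


if __name__ == "__main__":
    # Placeholder paths, not taken from the thread.
    for cmd in diagnostic_commands("/mnt/pool", "/dev/mapper/example", "/tmp/fs.img"):
        print(shlex.join(cmd))
```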
Chris Murphy
* Re: BTRFS critical (device dm-0): invalid dir item name len: 45389
From: Duncan @ 2014-09-05 2:05 UTC
To: linux-btrfs
john terragon posted on Fri, 05 Sep 2014 02:20:26 +0200 as excerpted:
> Everyone knows what raid0 entails. Moreover, with btrfs being an
> experimental fs, not having backups would obviously be pure idiocy.
>
> I wrote that it was "pretty serious" because the situation came out of
> nowhere on a low-traffic fs[.]
> I'm pretty sure about that because for obvious reasons I rsync the hell
> out of that filesystem every chance I get.
Mea culpa, then.
* Re: BTRFS critical (device dm-0): invalid dir item name len: 45389
From: Duncan @ 2014-09-05 2:07 UTC
To: linux-btrfs
Chris Murphy posted on Thu, 04 Sep 2014 19:41:39 -0600 as excerpted:
> Off chance scrub might fix the problem
How? He said it's btrfs raid0. There's no second copy to fix from.
(Of course if only the data is raid0, metadata being raid1, then if it's
metadata yes a scrub could fix it, but that's not what he said...)
* Re: BTRFS critical (device dm-0): invalid dir item name len: 45389
From: Chris Murphy @ 2014-09-05 3:20 UTC
To: Btrfs BTRFS
On Sep 4, 2014, at 8:07 PM, Duncan <1i5t5.duncan@cox.net> wrote:
> Chris Murphy posted on Thu, 04 Sep 2014 19:41:39 -0600 as excerpted:
>
>> Off chance scrub might fix the problem
>
> How? He said it's btrfs raid0. There's no second copy to fix from.
>
> (Of course if only the data is raid0, metadata being raid1, then if it's
> metadata yes a scrub could fix it, but that's not what he said…)
True, I'm assuming only -draid0 was specified, and therefore the multiple-device metadata default of raid1 applies. In any case, the conservative option is to ensure the fs isn't changed.
Chris Murphy