* BTRFS critical (device dm-0): invalid dir item name len: 45389
@ 2014-09-04  5:30 john terragon
  2014-09-04 13:03 ` john terragon
  2014-09-04 22:06 ` Duncan
  0 siblings, 2 replies; 9+ messages in thread
From: john terragon @ 2014-09-04  5:30 UTC
  To: Btrfs BTRFS

Hi.

When I traverse one of my btrfs filesystems, for example with a simple
"find /", I get the following in kmsg:

BTRFS critical (device dm-0): invalid dir item name len: 45389

The message appears just one time (so I guess it involves just one
file/dir). dm-0 is the first dmcrypt device of a pair on which I have
btrfs in RAID0 (btrfs native raid). Though I can't be 100% sure, this
seems to be a very recent problem (I would have noticed something
"critical" in kmsg if it happened before). Everything else seems to
work fine.
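
For scale: the 45389 in that message is far above any legal file name
length, since Linux limits a single name to 255 bytes. A minimal sketch of
the kind of bounds check that would reject such a value follows; the
constant and function names are illustrative assumptions, not the kernel's
actual code.

```python
# Assumed limit: the usual Linux maximum length of one file name, in bytes.
NAME_MAX = 255


def name_len_is_valid(name_len: int, limit: int = NAME_MAX) -> bool:
    """Return True if a stored name length is within the allowed limit."""
    return name_len <= limit


if __name__ == "__main__":
    # The value from the kernel message fails the check.
    print(name_len_is_valid(45389))
    # An ordinary file name passes.
    print(name_len_is_valid(len("example.ko")))
```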

So, should I be worried? Is there a way to fix this? (I assume that a
scrub would not do any good since it seems to be related to btrfs data
structures more than actual file data). Is there at least a way to
know which file/dir is involved? Maybe a verbose debug mode? Or maybe
I should just add some printk in the verify_dir_item function that
seems to generate the message.
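
Short of patching the kernel, one rough way to narrow down the directory
might be to list directories one at a time and check the kernel log after
each. The sketch below assumes that listing the damaged directory is what
triggers the message and that `dmesg` is readable by the caller; the
function names are hypothetical. If the kernel only logs the message once
per mount, it would need a fresh mount to be useful.

```python
import os
import subprocess

NEEDLE = "invalid dir item name len"


def count_kernel_hits(needle=NEEDLE):
    """Count kernel log lines containing needle (assumes dmesg is readable)."""
    result = subprocess.run(["dmesg"], capture_output=True, text=True, check=False)
    return sum(needle in line for line in result.stdout.splitlines())


def find_suspect_dirs(root, count_hits=count_kernel_hits):
    """List each directory under root; return those whose listing fails
    or coincides with new matching kernel log lines."""
    suspects = []
    seen = count_hits()
    stack = [root]
    while stack:
        path = stack.pop()
        try:
            with os.scandir(path) as it:
                entries = list(it)
        except OSError:
            # The listing itself failed, which is also worth reporting.
            suspects.append(path)
            continue
        now = count_hits()
        if now > seen:
            suspects.append(path)
            seen = now
        for entry in entries:
            try:
                if entry.is_dir(follow_symlinks=False):
                    stack.append(entry.path)
            except OSError:
                suspects.append(entry.path)
    return suspects


if __name__ == "__main__":
    for suspect in find_suspect_dirs("/"):
        print(suspect)
```

The log reader is passed in as a parameter so the walk can be tried out
against a fake log source before pointing it at a real filesystem.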

Thanks
John


* Re: BTRFS critical (device dm-0): invalid dir item name len: 45389
  2014-09-04  5:30 BTRFS critical (device dm-0): invalid dir item name len: 45389 john terragon
@ 2014-09-04 13:03 ` john terragon
  2014-09-04 22:26   ` Duncan
  2014-09-04 22:06 ` Duncan
  1 sibling, 1 reply; 9+ messages in thread
From: john terragon @ 2014-09-04 13:03 UTC
  To: Btrfs BTRFS

Some more details about this problem:

- the directory involved is /lib/modules/3.17.0-rc3-cu3/kernel/drivers/iio/gyro
- in that dir there should be a kernel object named hid-sensor-gyro-3d.ko,
  but there's no trace of it
- that dir cannot be removed or overwritten. rm -rf fails saying that the
  dir cannot be removed because it's not empty (?, even with -rf ?), and
  trying to reinstall the .deb package for that kernel image (thus
  overwriting that dir) ends up in a segfault

The only workaround is to mv that dir out of the way (I simply moved the
whole 3.17.0-rc3-cu3 dir, but it should also work for just the gyro
subdir) and reinstall the deb package.

So, it's pretty serious because there's actual loss of data (even
though I was lucky and only lost a .ko I don't use).

John


* Re: BTRFS critical (device dm-0): invalid dir item name len: 45389
  2014-09-04  5:30 BTRFS critical (device dm-0): invalid dir item name len: 45389 john terragon
  2014-09-04 13:03 ` john terragon
@ 2014-09-04 22:06 ` Duncan
  1 sibling, 0 replies; 9+ messages in thread
From: Duncan @ 2014-09-04 22:06 UTC
  To: linux-btrfs

john terragon posted on Thu, 04 Sep 2014 07:30:37 +0200 as excerpted:

> dm-0 is the first dmcrypt device of a pair on which I have
> btrfs in RAID0 (btrfs native raid).
> 
> I assume that a scrub would not do any good since it seems to be
> related to btrfs data structures more than actual file data[.]

On btrfs raid0 scrub can't fix anything anyway, tho it might give you a 
bit more information about problems it finds.

The only thing scrub does is verify block checksums.  If a block fails 
and there's a second copy (as there is in single-device dup mode for 
metadata, and in multi-device raid1 and raid10 modes; raid5/6 scrub is 
still broken), scrub verifies that copy's checksum and, assuming it 
verifies, rewrites the bad copy from the good one.  The same thing happens 
in normal operation if btrfs happens to come across the problem, but scrub 
gives you a way to systematically check the entire filesystem.

If there's no valid second copy, either because the block was in a single-
copy chunk (btrfs raid0 or single mode) or because the second copy is bad 
or the device it was on is missing (raid1/10 with a missing device), 
there's obviously no good copy to overwrite the bad one with, so the 
problem cannot be fixed.  However, if it's in a data chunk, scrub should 
log the file it was part of.  For metadata you simply get a number, and 
have to use btrfs debugging tools to figure out what's affected.

Since you said you're using btrfs raid0 mode, there's no second copy, so 
scrub might find bad checksums and give you a bit of additional 
information about what's affected, but it can never fix them, because 
there's no second copy to fix them from.  Scrub on btrfs raid0 or single 
mode is thus purely diagnostic, with no possibility of a fix, at least 
from scrub.  (Scrub for raid56 modes remains broken.)
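
To make the repair rule concrete, here is a toy model of the decision 
described above.  It is only a sketch: zlib.crc32 stands in for the 
filesystem's real checksum, and scrub_block and the list-of-copies 
representation are invented for illustration, not how btrfs stores data.

```python
import zlib


def checksum(block: bytes) -> int:
    # Stand-in checksum for illustration; not the one btrfs uses on disk.
    return zlib.crc32(block)


def scrub_block(copies, expected):
    """Check every stored copy of one block against its expected checksum.

    copies is a mutable list of bytes objects, one per stored copy
    (one entry models raid0/single, two entries model dup/raid1).
    Returns "ok", "repaired", or "unrecoverable".
    """
    good = [c for c in copies if checksum(c) == expected]
    if len(good) == len(copies):
        return "ok"
    if not good:
        # No valid copy anywhere: scrub can report the damage but not fix it.
        return "unrecoverable"
    for i, copy in enumerate(copies):
        if checksum(copy) != expected:
            # Overwrite the bad copy with a copy that verified.
            copies[i] = good[0]
    return "repaired"


if __name__ == "__main__":
    data = b"dir item"
    csum = checksum(data)
    print(scrub_block([b"dir iteM", data], csum))  # two copies, one bad
    print(scrub_block([b"dir iteM"], csum))        # single copy, bad
```

With two copies the bad one is rewritten from the good one; with a single 
copy the same damage can only be reported.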

-- 
Duncan - List replies preferred.   No HTML msgs.
"Every nonfree program has a lord, a master --
and if you use the program, he is your master."  Richard Stallman



* Re: BTRFS critical (device dm-0): invalid dir item name len: 45389
  2014-09-04 13:03 ` john terragon
@ 2014-09-04 22:26   ` Duncan
  2014-09-05  0:20     ` john terragon
  0 siblings, 1 reply; 9+ messages in thread
From: Duncan @ 2014-09-04 22:26 UTC
  To: linux-btrfs

john terragon posted on Thu, 04 Sep 2014 15:03:04 +0200 as excerpted:

> Some more details about this problem:
> 
> - the directory involved is
>   /lib/modules/3.17.0-rc3-cu3/kernel/drivers/iio/gyro
> - in that dir there should be kernel object named hid-sensor-gyro-3d.ko
>   but there's no trace of it
> - that dir cannot be removed or overwritten. rm -rf fails saying that the
>   dir cannot be removed because it's not empty (?, even with -rf ?) and
>   trying to reinstall the .deb package for that kernel image (thus
>   overwriting that dir) ends up in a segfault
> 
> The only workaround is to mv that dir (well, I simply mv the whole
> 3.17.0-rc3-cu3 dir but it should work also for the gyro subdir) and
> reinstall the deb package.
> 
> So, it's pretty serious because there's actual loss of data (even though
> I was lucky I just lost a ko I don't use).

I'd do a btrfs check (read-only without --repair or other switch) to see 
what it came up with, then if it looked reasonable, ensure my backups 
were fresh and do a btrfs check --repair.

If it fails or makes the problem worse, you can then mkfs and restore 
from the backups.

Meanwhile, nobody sane would keep valuable data on a raid0 (any raid0, 
btrfs or not) without backups anyway, as it's simply too risky, so by 
definition the problem /cannot/ be "pretty serious".  By definition, 
what's stored on a raid0 is more or less throw-away data that can either 
be recomputed or refetched from backup (which might be the net), or is 
simply tmp/cache in the first place.  So losing it can never be defined 
as "pretty serious", unless of course you're using raid0 for valuable 
data it was never intended for and don't keep current backups, which, as 
I said, is insanity for any responsible sysadmin working with raid0 (and 
with that I include every home user responsible for their own system; 
it's a responsibility taken too lightly by many).

So if you're characterizing any potential loss of data on any sort of 
raid0 (btrfs or otherwise) as "pretty serious", I *STRONGLY* recommend 
that you reconsider using raid0 in the first place, because loss of 
ANYTHING, up to and including EVERYTHING on a raid0, by definition cannot 
be "pretty serious" or you're simply using it wrong.  And if you're doing 
a mkfs and restore from backup, that's the perfect opportunity to 
reconsider and choose something more appropriate. =:^)

-- 
Duncan - List replies preferred.   No HTML msgs.
"Every nonfree program has a lord, a master --
and if you use the program, he is your master."  Richard Stallman



* Re: BTRFS critical (device dm-0): invalid dir item name len: 45389
  2014-09-04 22:26   ` Duncan
@ 2014-09-05  0:20     ` john terragon
  2014-09-05  1:41       ` Chris Murphy
  2014-09-05  2:05       ` Duncan
  0 siblings, 2 replies; 9+ messages in thread
From: john terragon @ 2014-09-05  0:20 UTC
  To: Duncan; +Cc: Btrfs BTRFS

Everyone knows what raid0 entails. Moreover, with btrfs being an
experimental fs, not having backups would obviously be pure idiocy.

I wrote that it was "pretty serious" because the situation came out of
nowhere on a low-traffic fs on which the most exciting thing that can
happen is an occasional snapshot once in a while when I do a heavy
update with apt-get (a snapshot that always gets removed right after the
update, which invariably goes well, and my paranoia fades).
The problem seems to have happened right after a hard lock, probably due
to 3.17.0-rc3 (and before you explain to me what that rc3 stands for,
let me tell you that I'm not complaining, I knew what I was doing). I
had to power off "brutally" and right after that the problem occurred.
I'm pretty sure about that because for obvious reasons I rsync the
hell out of that filesystem every chance I get. Rsync obviously does
a traversal of the fs and so the "critical" (btrfs words, not mine)
problem would have shown up in kmsg (another place that I watch like a
hawk, because of the raid0+experimental fs thing).

I don't know if you are a btrfs developer but that "pretty serious"
was not meant to offend them nor to complain. Actually I've been a
pretty happy customer up until now (and I still am) because I have
never been bitten by any big bug even with such a complex fs. I just
have this zombie directory that can't be rm'd, but I mv'd it out of the
way and everything is fine. It'll get sorted when I do the next
wipe-and-restore iteration (again, being experimental, I don't let the
fs become too "old").
So, the "pretty serious" was more due to the surprise than anything else.

John


* Re: BTRFS critical (device dm-0): invalid dir item name len: 45389
  2014-09-05  0:20     ` john terragon
@ 2014-09-05  1:41       ` Chris Murphy
  2014-09-05  2:07         ` Duncan
  2014-09-05  2:05       ` Duncan
  1 sibling, 1 reply; 9+ messages in thread
From: Chris Murphy @ 2014-09-05  1:41 UTC
  To: Btrfs BTRFS


On Sep 4, 2014, at 6:20 PM, john terragon <jterragon@gmail.com> wrote:

> Everyone knows what raid0 entails. Moreover, with btrfs being an
> experimental fs, not having backups would obviously be pure idiocy.

That is a bit of hyperbole. There is such a thing as innocently ignorant, as well as the misinformed.

> I don't know if you are a btrfs developer but that "pretty serious"
> was not meant to offend them nor to complain.

I think the "pretty serious" statement is legit. No one wants filesystems themselves to be responsible for losing data; if they did this with any regularity we wouldn't trust them, and we need to trust them. Since there are myriad sources of data loss or corruption, a good autopsy is needed to understand the problem, and then whether the fix is one of prevention or can also be done during normal read, scrub, or btrfsck.

If the conditions can be repeated and reproduce the problem, that's a happy day.

> So, the "pretty serious" was more due to the surprise than anything else.

Right. There's an off chance scrub might fix the problem, and then the evidence of why a regular read can't deal with it is gone. So I'd either take a btrfs-image first, or do a read-only non-background scrub and see what you get in dmesg:

btrfs scrub start -BRr <mp>

And run btrfs check on the unmounted filesystem, without any options.
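
Put together, the conservative sequence could be scripted roughly as below. This is a dry-run sketch that only builds and prints the command lines; the device, mount point, and image path are hypothetical placeholders, and diagnostic_commands is an invented helper name.

```python
import shlex


def diagnostic_commands(device, mountpoint, image_path):
    """Build the read-only diagnostic steps, in order, without running them."""
    return [
        # 1. Capture a metadata image first, so the evidence is preserved.
        ["btrfs-image", device, image_path],
        # 2. With the fs mounted: foreground (-B), raw stats (-R), read-only (-r).
        ["btrfs", "scrub", "start", "-B", "-R", "-r", mountpoint],
        # 3. With the fs unmounted: check with no options, which does not repair.
        ["btrfs", "check", device],
    ]


if __name__ == "__main__":
    # Placeholder paths, chosen only for illustration.
    for cmd in diagnostic_commands("/dev/mapper/crypt0", "/mnt", "/tmp/fs.img"):
        print(shlex.join(cmd))
```

Printing rather than executing keeps the mount and unmount steps between the commands under manual control.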


Chris Murphy


* Re: BTRFS critical (device dm-0): invalid dir item name len: 45389
  2014-09-05  0:20     ` john terragon
  2014-09-05  1:41       ` Chris Murphy
@ 2014-09-05  2:05       ` Duncan
  1 sibling, 0 replies; 9+ messages in thread
From: Duncan @ 2014-09-05  2:05 UTC
  To: linux-btrfs

john terragon posted on Fri, 05 Sep 2014 02:20:26 +0200 as excerpted:

> Everyone knows what raid0 entails. Moreover, with btrfs being an
> experimental fs, not having backups would obviously be pure idiocy.
> 
> I wrote that it was "pretty serious" because the situation came out of
> nowhere on a low-traffic fs[.]

> I'm pretty sure about that because for obvious reasons I rsync the hell
> out of that filesystem  every chance I get.

Mea culpa, then.

-- 
Duncan - List replies preferred.   No HTML msgs.
"Every nonfree program has a lord, a master --
and if you use the program, he is your master."  Richard Stallman



* Re: BTRFS critical (device dm-0): invalid dir item name len: 45389
  2014-09-05  1:41       ` Chris Murphy
@ 2014-09-05  2:07         ` Duncan
  2014-09-05  3:20           ` Chris Murphy
  0 siblings, 1 reply; 9+ messages in thread
From: Duncan @ 2014-09-05  2:07 UTC
  To: linux-btrfs

Chris Murphy posted on Thu, 04 Sep 2014 19:41:39 -0600 as excerpted:

>  Off chance scrub might fix the problem

How?  He said it's btrfs raid0.  There's no second copy to fix from.

(Of course if only the data is raid0, metadata being raid1, then if it's 
metadata yes a scrub could fix it, but that's not what he said...)

-- 
Duncan - List replies preferred.   No HTML msgs.
"Every nonfree program has a lord, a master --
and if you use the program, he is your master."  Richard Stallman



* Re: BTRFS critical (device dm-0): invalid dir item name len: 45389
  2014-09-05  2:07         ` Duncan
@ 2014-09-05  3:20           ` Chris Murphy
  0 siblings, 0 replies; 9+ messages in thread
From: Chris Murphy @ 2014-09-05  3:20 UTC
  To: Btrfs BTRFS


On Sep 4, 2014, at 8:07 PM, Duncan <1i5t5.duncan@cox.net> wrote:

> Chris Murphy posted on Thu, 04 Sep 2014 19:41:39 -0600 as excerpted:
> 
>> Off chance scrub might fix the problem
> 
> How?  He said it's btrfs raid0.  There's no second copy to fix from.
> 
> (Of course if only the data is raid0, metadata being raid1, then if it's 
> metadata yes a scrub could fix it, but that's not what he said…)

True, I'm assuming only -draid0 was specified, and therefore the multiple-device metadata default of raid1 applies. In any case the conservative option is to ensure the fs isn't changed.

Chris Murphy
