* 4 vol raid5 segfault on device delete
@ 2013-08-16 16:50 Craig Johnson
  2013-08-17 12:05 ` Chris Mason
  0 siblings, 1 reply; 4+ messages in thread
From: Craig Johnson @ 2013-08-16 16:50 UTC (permalink / raw)
  To: linux-btrfs

I have a 4 device volume with raid5 - trying to remove one of the
devices (plenty of free space) and I get an almost immediate segfault.
 Scrub shows no errors, repair shows space cache invalid but nothing
else (I remounted with clear cache to be safe).  Lots of corrupt on
bdev (for 3 out of 4 drives), but I have no file access issues that I
know of.  Thanks!

Output below:

http://pastebin.com/AnrmZrt2

- Craig


* Re: 4 vol raid5 segfault on device delete
  2013-08-16 16:50 4 vol raid5 segfault on device delete Craig Johnson
@ 2013-08-17 12:05 ` Chris Mason
  2013-08-17 12:24   ` Craig Johnson
  0 siblings, 1 reply; 4+ messages in thread
From: Chris Mason @ 2013-08-17 12:05 UTC (permalink / raw)
  To: Craig Johnson, linux-btrfs

Quoting Craig Johnson (2013-08-16 12:50:59)
> I have a 4 device volume with raid5 - trying to remove one of the
> devices (plenty of free space) and I get an almost immediate segfault.
>  Scrub shows no errors, repair shows space cache invalid but nothing
> else (I remounted with clear cache to be safe).  Lots of corrupt on
> bdev (for 3 out of 4 drives), but I have no file access issues that I
> know of.  Thanks!
> 
> Output below:
> 
> http://pastebin.com/AnrmZrt2

Hi Craig,

Just double checking: how did you set up the device removal?  This is a
3.11 arch kernel?  Which rc was it based on?

-chris



* Re: 4 vol raid5 segfault on device delete
  2013-08-17 12:05 ` Chris Mason
@ 2013-08-17 12:24   ` Craig Johnson
  0 siblings, 0 replies; 4+ messages in thread
From: Craig Johnson @ 2013-08-17 12:24 UTC (permalink / raw)
  To: Chris Mason; +Cc: linux-btrfs

I did this on 3.11-rc5 kernel I compiled a few days ago.
3.11.0-1-ARCH-00013-g584d88b-dirty.

Thanks!

On Sat, Aug 17, 2013 at 7:05 AM, Chris Mason <chris.mason@fusionio.com> wrote:
> Quoting Craig Johnson (2013-08-16 12:50:59)
>> I have a 4 device volume with raid5 - trying to remove one of the
>> devices (plenty of free space) and I get an almost immediate segfault.
>>  Scrub shows no errors, repair shows space cache invalid but nothing
>> else (I remounted with clear cache to be safe).  Lots of corrupt on
>> bdev (for 3 out of 4 drives), but I have no file access issues that I
>> know of.  Thanks!
>>
>> Output below:
>>
>> http://pastebin.com/AnrmZrt2
>
> Hi Craig,
>
> Just double checking: how did you set up the device removal?  This is a
> 3.11 arch kernel?  Which rc was it based on?
>
> -chris
>


* Re: 4 vol raid5 segfault on device delete
       [not found] <CAPidFuhbjxYUqjLV_8tCCVX6aSNwZ1qJGCZvnSPx8cShGCrQvA@mail.gmail.com>
@ 2013-08-17  4:36 ` Duncan
  0 siblings, 0 replies; 4+ messages in thread
From: Duncan @ 2013-08-17  4:36 UTC (permalink / raw)
  To: linux-btrfs

Craig Johnson posted on Fri, 16 Aug 2013 11:50:59 -0500 as excerpted:

> I have a 4 device volume with raid5 - trying to remove one of the
> devices (plenty of free space) and I get an almost immediate segfault.
>  Scrub shows no errors, repair shows space cache invalid but nothing
> else (I remounted with clear cache to be safe).  Lots of corrupt on bdev
> (for 3 out of 4 drives), but I have no file access issues that I know
> of.  Thanks!

Last I knew (kernel 3.10, where it was introduced, but I haven't seen any
suggestion that 3.11 fixes all the problems yet), btrfs raid5/6 wasn't
ready for anything like real use -- the all-OK code was there,
but it couldn't yet cope with devices disappearing -- recreating the
missing content from the parity didn't yet work.

So "an almost immediate segfault" might be expected if you actually
remove a device from a btrfs raid5/6, because only the all-OK code is
actually there: it's writing the parity, but it isn't prepared to
actually use it yet.
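
To make "recreating the missing content" concrete: in a generic raid5
layout, the redundant block in each stripe is the XOR of the data blocks,
so any one missing block can be rebuilt by XORing the survivors with that
redundant block. The following is only a minimal sketch of that generic
idea, assuming equal-length blocks; it is not btrfs's actual code, and the
function names (xor_blocks, make_parity, rebuild_missing) are hypothetical.

```python
from functools import reduce


def xor_blocks(blocks):
    """XOR equal-length byte blocks together, byte position by byte position."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


def make_parity(data_blocks):
    """Parity for one stripe: the XOR of all of its data blocks."""
    return xor_blocks(data_blocks)


def rebuild_missing(surviving_blocks, parity):
    """Rebuild the single missing data block from the survivors plus parity."""
    return xor_blocks(list(surviving_blocks) + [parity])


# Hypothetical stripe: three data blocks plus one parity block (4 devices).
stripe = [b"AAAA", b"BBBB", b"CCCC"]
parity = make_parity(stripe)

# Pretend the device holding stripe[1] has gone away.
rebuilt = rebuild_missing([stripe[0], stripe[2]], parity)
assert rebuilt == stripe[1]
```

Writing the parity is the easy half (make_parity); the rebuild step is the
part that has to work before a device can safely go missing.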

Btrfs raid0/1/10, however, should be usable and /reasonably/ stable (for
a filesystem still under development, with bugs actively being fixed in
each kernel release, that is).  Note, tho, that btrfs raid1 actually means
a two-way mirror, no matter the number of devices.

FWIW, I'm using btrfs raid1 here, but I have backups both to a second
btrfs raid1 and to reiserfs (my previous filesystem and what I still use
on "spinning rust", but it's not suitable for ssds, so I use btrfs on
them), because btrfs IS still experimental.

-- 
Duncan - List replies preferred.   No HTML msgs.
"Every nonfree program has a lord, a master --
and if you use the program, he is your master."  Richard Stallman



