* Subject: Raid0 Reshape . Preface
From: raz ben yehuda @ 2009-06-16 21:51 UTC (permalink / raw)
  To: linux raid, Neil Brown

Hello Neil,
This email is followed by the set of raid0 reshape patches for your inspection.
Hopefully I haven't messed anything up in this set.

#1 Tests.
Most tests were conducted on 2.6.30-rc4 (I then ported to 2.6.30-rc7
				 and made some sanity checks).

. Multiple-zoned raid0 to a new multiple-zoned raid0
		(triggered by adding a disk).
. Online reshape
  . I copied files while reshaping and ran cksum over all files while reshaping.
  . Once a reshape is complete, reboot, assemble the raid and re-check the files.

. Resume reshape
	. Stop reshape gracefully, reboot and continue from the same position.
	. Stop reshape gracefully, stop raid, re-assemble. Reshape resumes automatically.
	. Power off the machine abruptly by unplugging the power cable.
			Check resume and checksum of content known to be ok (on ext3 though).
	. Graceful stop, assemble and resume, unplug cable, boot and re-assemble. Checksum files.

. Backward compatibility.
	. Once a reshape is complete, boot to kernel 2.6.18-8.el5, mount, and checksum files.
. Big volume reshape. 1/2TB reshape (10 hours reshape).
. Big chunks: 200M.
. Non-power-of-2 chunks: 1K, 101K, 7001K.
. Power-of-2 chunks: 64K and 1024K.
. USB disks.
. SATA disks.
. Metadata: 0.9, 1, 1.0, 1.1, 1.2.
. Add disk, reshape, add new disk, reshape again without rebooting.
. Two raid0 arrays reshaped (reshape is queued).
. Checked with lockdep.

#2 patch set description
	I apply these patches on top of 2.6.30-rc7 :
	
	. for md.c
	
	commit fd894e80a4e368e49c393463986a5d9f5ab16bd9
	Author: NeilBrown <neilb@suse.de>
	Date:   Thu Jun 4 15:23:11 2009 +1000

	. for raid0.c

	commit c9aaceea6015213e571761c0f0899614863c4868
	Author: NeilBrown <neilb@suse.de>
	Date:   Thu Jun 4 15:11:14 2009 +1000
	
	. for raid0.h

	commit c3b56cacfab1ababe07e401ac7b515463a7d2ae6
	Author: NeilBrown <neilb@suse.de>
	Date:   Thu Jun 4 13:49:55 2009 +1000
		

	patch 1: md assumes all personalities' members are of the same size.
		Fix this just for raid0.

	patch 2: have find_zone return NULL instead of reporting a bug.
		 It is up to the caller to decide what to do. Reshape uses
		 find_zone to check for the end of reshape.
	
	patch 3: raid0_size to support reshape.
	
	patch 4: rename dump_zones to print_conf and accept the device name
		 as a parameter. Reshape prints the new configuration before
		 starting.

	patch 5: beautify create_strip and split raid0_run. Have these procedures accept
		 conf (rather than mddev) and a list of device members.
	
	patch 6: remove redundant argument from is_in_chunk_boundary.

	patch 7: have map_sector take a raid0_conf struct instead of mddev.
	
	patch 8: split raid0_make_request into a make_request core and a raid0_make_request wrapper.
			Reshape uses the core make_request.

	patch 9: add hot_add_disk and hot_remove_disk.

	patch 10: reshape core code.

	patch 11: reshape structures. header file.
	
	patch 12: wrap the reshape code in ifdefs.

	patch 13: add a kconfig option (experimental): adding drives for raid0.
	
	mdadm patch 1: mdadm support for add and reshape for raid0.
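
	As an illustration of the idea behind patch 2 (not the patch itself), here is
	a minimal userspace sketch of a zone lookup that returns NULL when the sector
	lies past the last zone, leaving the decision to the caller. The struct
	layouts and the name find_zone_or_null are hypothetical simplifications, not
	the kernel's definitions.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified stand-ins for the raid0 zone structures. */
struct zone {
	uint64_t zone_end;	/* first array sector past the end of this zone */
	int nb_dev;		/* number of devices striped in this zone */
};

struct conf {
	const struct zone *zones;
	int nr_zones;
};

/*
 * Look up the zone containing *sectorp. On success, *sectorp is rewritten
 * to be relative to the start of that zone. If the sector is beyond the
 * last zone, return NULL and leave *sectorp untouched, so the caller can
 * decide whether that is an error or simply "end of array".
 */
const struct zone *find_zone_or_null(const struct conf *conf, uint64_t *sectorp)
{
	uint64_t sector = *sectorp;
	int i;

	for (i = 0; i < conf->nr_zones; i++) {
		if (sector < conf->zones[i].zone_end) {
			if (i > 0)
				*sectorp = sector - conf->zones[i - 1].zone_end;
			return &conf->zones[i];
		}
	}
	return NULL;
}
```

	A reshape loop could then treat a NULL return as "nothing left to move",
	while a normal I/O path could still treat it as a bug.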

#3 Errata:
	1. raid0d remains after a reshape is done. 
		raid0d exists only to stop/start raid0_sync.
	2. md is still aware of level 0.
	3. When a reshape starts, md tries to remove the new busy disk. Currently I refuse
		to do it.
	4. I cannot add a raid5 as a member of a raid0 (or a raid5 to a raid5).
		mdadm reports that the device is busy. Hmm... though common to all other personalities,
		it is quite a problem.
		
#4 My Todo List
	. Fix addition of raid5 as member of raid0.
	. Replace the reverse mapping with an algorithm.
	. Support reshape driven by a change in the size of its members,
		and not just by a new member. This is most important
		for raid50.
	. Support asymmetric chunk sizes (each component has a different chunk size).
		If I assume raid5 may grow by adding a disk, I will have two different
		chunk sizes.
	. Parallel reshape: have reshapes over two or more raids run concurrently.
 
#5  Management
	. Other than me, who else is going to test this code ?





* Re: Subject: Raid0 Reshape . Preface
From: Neil Brown @ 2009-06-17 11:55 UTC (permalink / raw)
  To: raz ben yehuda; +Cc: linux raid

On Wednesday June 17, raziebe@gmail.com wrote:
> Hello Neil,
> This email is followed by the set of raid0 reshape patches for your inspection.
> Hopefully I haven't messed anything up in this set.

Hi Raz,

 I've had a bit of a look and while it would need a few refinements it
 seems to generally make sense - though there are a lot of details I
 haven't looked at thoroughly yet.

 However... I'm not at all sure I want to continue with this
 direction.

 The more I think about it, the more I feel I would prefer to use the
 raid5 module for all restriping.
 So to reshape a raid0, we would convert it to degraded-raid4, reshape
 that, then convert back to raid0.
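
 To illustrate why such a conversion is plausible: for a single zone of
 equal-sized devices, a raid4 whose parity device is missing places its
 data chunks exactly where raid0 does. The following is a minimal
 userspace sketch, assuming the parity slot is the last device and
 ignoring multiple zones. The names raid0_map and raid4_map_data are
 hypothetical and are not kernel functions.

```c
#include <stdint.h>

struct location {
	int disk;		/* index of the member device */
	uint64_t dev_sector;	/* sector offset within that device */
};

/* Single-zone raid0 over ndisks equal-sized devices. */
struct location raid0_map(uint64_t sector, uint32_t chunk_sectors, int ndisks)
{
	uint64_t chunk = sector / chunk_sectors;
	uint32_t offset = (uint32_t)(sector % chunk_sectors);
	struct location loc;

	loc.disk = (int)(chunk % (uint64_t)ndisks);
	loc.dev_sector = (chunk / (uint64_t)ndisks) * chunk_sectors + offset;
	return loc;
}

/*
 * raid4 with data_disks data devices plus one parity device, assumed here
 * to occupy slot data_disks (the last slot). Only the data placement is
 * computed; the parity slot is reported through *parity_disk.
 */
struct location raid4_map_data(uint64_t sector, uint32_t chunk_sectors,
			       int data_disks, int *parity_disk)
{
	uint64_t chunk = sector / chunk_sectors;
	uint32_t offset = (uint32_t)(sector % chunk_sectors);
	uint64_t stripe = chunk / (uint64_t)data_disks;
	struct location loc;

	*parity_disk = data_disks;
	loc.disk = (int)(chunk % (uint64_t)data_disks);
	loc.dev_sector = stripe * chunk_sectors + offset;
	return loc;
}
```

 Under these assumptions both functions return the same location for
 every sector, so marking the parity slot as missing leaves the on-disk
 data untouched.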

 I would really like to keep the raid0 module very simple and clean.

 For the above to work, the most significant changes that would be
 needed to raid5 are:

  1/ add multi-zone support for raid4.
     Much of the complexity of this would be in init_stripe
     (to set sh->disks and the various sh->dev[].)
     and raid5_compute_sector (to choose the correct stripe).

  2/ add non-power-of-2 chunks support.
     We already use sector_div quite a lot, so this probably isn't
     a big problem.

  3/ Record the size of each device in the metadata.
     This isn't needed for raid0 as each device records its own size,
     but for raid4, one device might be missing, and we need to know
     how big it is.
     For v0.90 there is room in the superblock to store this
     information.
     For v1.x there isn't - I only allowed 2 bytes per device, and
     we really need another 8.   This is not an insurmountable problem
     as we can add a feature flag to change the size of the per-device
     information from 2 to 16 bytes.
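
 As a rough illustration of point 2/: with power-of-2 chunks a sector can
 be split into chunk number and offset using a shift and a mask, while
 arbitrary chunk sizes need a real division. The sketch below is
 userspace code; sector_div_sketch is a hypothetical stand-in modelled on
 the kernel's sector_div() (divide in place, return the remainder), and
 split_pow2 and split_any are names made up for this example.

```c
#include <stdint.h>

/* Divide *n by base in place and return the remainder. */
uint32_t sector_div_sketch(uint64_t *n, uint32_t base)
{
	uint32_t rem = (uint32_t)(*n % base);

	*n /= base;
	return rem;
}

/* Only valid when chunk_sectors is a power of two. */
void split_pow2(uint64_t sector, uint32_t chunk_sectors,
		uint64_t *chunk, uint32_t *offset)
{
	unsigned int shift = 0;

	/* Find log2(chunk_sectors). */
	while ((1u << shift) < chunk_sectors)
		shift++;
	*offset = (uint32_t)(sector & (chunk_sectors - 1));
	*chunk = sector >> shift;
}

/* Works for any non-zero chunk size, e.g. 101K = 202 sectors of 512 bytes. */
void split_any(uint64_t sector, uint32_t chunk_sectors,
	       uint64_t *chunk, uint32_t *offset)
{
	uint64_t n = sector;

	*offset = sector_div_sketch(&n, chunk_sectors);
	*chunk = n;
}
```

 For a 128-sector chunk both helpers agree. For a 202-sector chunk only
 the division-based one is correct.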

 I would certainly like to see how this approach pans out before
 committing one way or the other.

NeilBrown


* Re: Subject: Raid0 Reshape . Preface
From: John Robinson @ 2009-06-17 12:17 UTC (permalink / raw)
  To: Neil Brown; +Cc: linux raid

On Wed, June 17, 2009 12:55 pm, Neil Brown wrote:
>  The more I think about it, the more I feel I would prefer to use the
>  raid5 module for all restriping.

That doesn't make sense to me, for various reasons including those for
having separate RAID personality modules in the first place. On the other
hand, if you're keeping the raid0 module simple, perhaps the raid5 module
could also be simplified and all objectives could be served by shipping
out all restriping to a new, separate restriping module? Or even to
userspace?

Cheers,

John.



* Re: Subject: Raid0 Reshape . Preface
From: Neil Brown @ 2009-06-17 22:31 UTC (permalink / raw)
  To: John Robinson; +Cc: linux raid

On Wednesday June 17, john.robinson@anonymous.org.uk wrote:
> On Wed, June 17, 2009 12:55 pm, Neil Brown wrote:
> >  The more I think about it, the more I feel I would prefer to use the
> >  raid5 module for all restriping.
> 
> That doesn't make sense to me, for various reasons including those for
> having separate RAID personality modules in the first place. On the other
> hand, if you're keeping the raid0 module simple, perhaps the raid5 module
> could also be simplified and all objectives could be served by shipping
> out all restriping to a new, separate restriping module? Or even to
> userspace?

The enhancements needed to raid5 to make it able to handle reshaping a
raid0 are either minor, or are ones that we want eventually anyway.
Given that, there seems little point implementing the same thing in
two different ways.

I have occasionally thought that it would be nice if all the "Reshape"
code could be separated out of raid5 as it is not used very often.
However I suspect that you would find that it isn't very much code as
it shares a lot with resync and normal raid5 processing.

Stripping it out into user-space is also tempting.  The tricky bit
would be deciding on the interface - exactly how much to leave in the
kernel and how much to take out.  It might be an interesting exercise.
It's hard to know if it would be productive or not.

NeilBrown


* Re: Subject: Raid0 Reshape . Preface
From: Bill Davidsen @ 2009-06-21 18:19 UTC (permalink / raw)
  To: Neil Brown; +Cc: John Robinson, linux raid

Neil Brown wrote:
> On Wednesday June 17, john.robinson@anonymous.org.uk wrote:
>> On Wed, June 17, 2009 12:55 pm, Neil Brown wrote:
>>>  The more I think about it, the more I feel I would prefer to use the
>>>  raid5 module for all restriping.
>>
>> That doesn't make sense to me, for various reasons including those for
>> having separate RAID personality modules in the first place. On the other
>> hand, if you're keeping the raid0 module simple, perhaps the raid5 module
>> could also be simplified and all objectives could be served by shipping
>> out all restriping to a new, separate restriping module? Or even to
>> userspace?
>
> The enhancements needed to raid5 to make it able to handle reshaping a
> raid0 are either minor, or are ones that we want eventually anyway.
> Given that, there seems little point implementing the same thing in
> two different ways.
>
> I have occasionally thought that it would be nice if all the "Reshape"
> code could be separated out of raid5 as it is not used very often.
> However I suspect that you would find that it isn't very much code as
> it shares a lot with resync and normal raid5 processing.
>
> Stripping it out into user-space is also tempting.  The tricky bit
> would be deciding on the interface - exactly how much to leave in the
> kernel and how much to take out.  It might be an interesting exercise.
> It's hard to know if it would be productive or not.

Render unto Caesar what is Caesar's, etc. I don't think anything 
critical like that should be in user space, it invites people to try to 
"improve it" and really mess up their data, then come looking for help. 
At least by raising the bar to require being able to build a kernel you 
eliminate some people who probably shouldn't be doing that.

On the other hand, putting reshape into a module which could be loaded 
if needed or enhanced and inserted for testing might be a good idea. It 
just seems easier to have kernel code than to try to make user code do 
the right thing in terms of keeping things resident, running with 
appropriate priorities, etc.

-- 
Bill Davidsen <davidsen@tmr.com>
  Obscure bug of 2004: BASH BUFFER OVERFLOW - if bash is being run by a
normal user and is setuid root, with the "vi" line edit mode selected,
and the character set is "big5," an off-by-one error occurs during
wildcard (glob) expansion.

