* mdadm reshape stop, resume with alternate/moved backup file?
From: Michael Evans @ 2009-11-03  0:44 UTC (permalink / raw)
  To: linux-raid

In the past I'd only worked with software raid5; it used the temporary
file for a brief period at the beginning and then it was all
disk-bound.

I recently started a raid-6 takeover of one of my larger raid-5
arrays, and it is running at around 1/10th to 1/20th of the speed I expect:

      2909829120 blocks super 1.1 level 6, 128k chunk, algorithm 18 [8/7] [UUUUUUUU]
      [=>...................]  reshape =  7.2% (35359488/484971520) finish=11057.6min speed=677K/sec

I suspect this is because another array sharing the same devices is
where I put the temporary file, and further that it might be waiting
for complete hardware syncs before proceeding.  If that's the case I
expect that using a small, currently unused, area on unrelated block
devices would speed the operation up by at least 5x.
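
For reference, the finish estimate can be reproduced from the figures in the progress line. This is only a rough sketch: it assumes the block counts in "(35359488/484971520)" and the "K" in "speed=677K/sec" are both in 1 KiB units, and `reshape_eta_minutes` is a hypothetical helper name, not part of mdadm. The optional fourth argument applies an assumed speedup factor, such as the 5x hoped for above.

```shell
#!/bin/sh
# Estimate the remaining reshape time, in minutes, from an mdstat progress line.
# Usage: reshape_eta_minutes DONE TOTAL SPEED_K_PER_SEC [SPEEDUP]
# DONE and TOTAL are the two numbers inside "(done/total)"; they are assumed
# to be in the same 1 KiB units as the reported speed.
reshape_eta_minutes() {
    awk -v d="$1" -v t="$2" -v s="$3" -v f="${4:-1}" \
        'BEGIN { printf "%.1f\n", (t - d) / (s * f) / 60 }'
}

reshape_eta_minutes 35359488 484971520 677      # at the current rate
reshape_eta_minutes 35359488 484971520 677 5    # with an assumed 5x speedup
```

At the current rate this gives roughly 11069 minutes (over a week), in line with the finish=11057.6min reported by the kernel; the small difference is expected since the reported speed is rounded. A 5x speedup would bring it down to about a day and a half.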

Can I safely pause the current restripe process ( 1060 pts/...    SL
 21:28 mdadm -G /dev/md52 -l6 --backup-file=/md52 ) with something
like kill 1060 and then re-invoke it with the backup file in another
location?  Or would it be this incredibly slow anyway?
--
To unsubscribe from this list: send the line "unsubscribe linux-raid" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


* Re: mdadm reshape stop, resume with alternate/moved backup file?
From: NeilBrown @ 2009-11-03  0:59 UTC (permalink / raw)
  To: Michael Evans; +Cc: linux-raid

On Tue, November 3, 2009 11:44 am, Michael Evans wrote:
> In the past I'd only worked with software raid5, it used the temporary
> file for a brief period at the beginning and then it was all
> disk-bound.
>
> I recently started a raid-6 takeover of one of my larger raid-5
> arrays, it is running around 1/10th to 1/20th the speed I expect:
>
>       2909829120 blocks super 1.1 level 6, 128k chunk, algorithm 18 [8/7] [UUUUUUUU]
>       [=>...................]  reshape =  7.2% (35359488/484971520) finish=11057.6min speed=677K/sec
>
> I suspect this is because another array sharing the same devices is
> where I put the temporary file, and further that it might be waiting
> for complete hardware syncs before proceeding.  If that's the case I
> expect that using a small, currently unused, area on unrelated block
> devices would speed the operation up by at least 5x.
>
> Can I safely pause the current restripe process ( 1060 pts/...    SL
>  21:28 mdadm -G /dev/md52 -l6 --backup-file=/md52 ) with something
> like kill 1060 and then re-invoke it with the backup file in another
> location?  Or would it be this incredibly slow anyway?

1/ It is safe to stop the array, move the backup file, then reassemble
  the array giving it the moved backup file.
2/ This will actually be significantly faster even if the backup file is
  on the same device, as there is a bug which causes the amount of data
  backed up at each step to be very small (and so very slow) when the
  reshape is first started.  When the reshape is restarted this bug does
  not apply and you get the backup performed in larger chunks.
3/ There is another bug whereby if one of the devices in the array dies
  during the reshape, the backup process stops working correctly with the
  result that the reshape goes much faster but the backup is completely
  useless.  If you crash during the reshape after a failed device,
  you will probably lose data.  If you try to stop and restart the
  array after one device has failed, the restart will fail.  However
  this is still the safest thing to do.  I will try to put out some
  updates to mdadm so that you can reassemble the array safely in this
  case (and of course, fix the problem so that the backup is maintained
  throughout the entire run).
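
The stop, move, reassemble sequence from point 1 might look like the sketch below. It is an illustration under stated assumptions rather than a tested procedure: the new backup path `/mnt/other/md52.bak` and the `/dev/sdX1` member names are hypothetical placeholders (the thread does not list the member devices), the array is assumed to be unmounted before it is stopped, and `run` and `move_backup_and_restart` are helper names invented here. Commands are only printed unless `DRY_RUN=0` is set.

```shell
#!/bin/sh
# Sketch: relocate the reshape backup file and restart the reshape.
# With DRY_RUN=1 (the default) each command is printed instead of executed.
run() {
    if [ "${DRY_RUN:-1}" = 1 ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Usage: move_backup_and_restart ARRAY OLD_BACKUP NEW_BACKUP MEMBER...
move_backup_and_restart() {
    array=$1 old=$2 new=$3
    shift 3
    # Stop the array, move the backup file, then reassemble the array,
    # giving mdadm the new location of the backup file.
    run mdadm --stop "$array" || return 1
    run mv "$old" "$new" || return 1
    run mdadm --assemble "$array" --backup-file="$new" "$@"
}

# Hypothetical paths and member devices; substitute the real ones.
move_backup_and_restart /dev/md52 /md52 /mnt/other/md52.bak /dev/sda1 /dev/sdb1
```

Each step returns early if the previous one fails, so the backup file is never moved while the array is still running. Reviewing the printed commands before setting `DRY_RUN=0` is the point of the dry-run default.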

So yes, go ahead and move the file.  But if you get a device failure,
stop the reshape and ask me to help - I'll get something to you within
24 hours - probably less but I have to allow for time zones...
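
One way to notice a device failure during the reshape is to scan /proc/mdstat. The following is a heuristic sketch only: it assumes a failed member shows up either as "(F)" after the device name or as "_" in the "[UUUU]" status map, and `degraded_arrays` is a hypothetical helper name, not an mdadm feature.

```shell
#!/bin/sh
# Print the name of each md array whose mdstat entry shows a failed or
# missing member: an "(F)" flag on a device, or an "_" in the [UU_U] map.
# Reads mdstat-format text on standard input.
degraded_arrays() {
    awk '
        /^md[^ ]* :/ { name = $1; flagged = 0 }
        name != "" && !flagged && (/\(F\)/ || /\[[U_]*_[U_]*\]/) {
            print name
            flagged = 1
        }
    '
}

# Check the live status if it is available on this machine.
if [ -r /proc/mdstat ]; then
    degraded_arrays < /proc/mdstat
fi
```

If this prints the array being reshaped, that is the cue to stop and ask for help as described above, rather than letting the reshape continue with a backup that may no longer be valid.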

NeilBrown


* Re: mdadm reshape stop, resume with alternate/moved backup file?
From: Michael Evans @ 2009-11-03  4:59 UTC (permalink / raw)
  To: NeilBrown, linux-raid

Thank you for the help, and for supporting Linux software raid for all these years.

I look forward to having double-redundancy so that it's far less
likely I'll run into a fatal failure during rebuild after a disk
finally fails.
