* Question about raid5 disk recovery logic
From: Alexander Lyakas @ 2012-07-01  7:08 UTC
  To: linux-raid

Hi everybody,
I am trying to understand what happens when raid5 is recovering a
disk and a write comes to a stripe that has not been recovered yet.
Does md first reconstruct the missing chunk and then apply the
write, or is the write first applied as if the array were still
degraded (and not recovering), with the missing chunk reconstructed
only later (when the md_do_sync() loop gets to this area)?
I am looking at the stripe handling logic (kernel 2.6.38); can anybody
please point me at the path that handle_stripe5() takes in that case?

Thanks,
Alex.

* Re: Question about raid5 disk recovery logic
From: NeilBrown @ 2012-07-01  8:00 UTC
  To: Alexander Lyakas; +Cc: linux-raid

Hi Alex,

 The stripe is still degraded, so md/raid5 treats it like a write to a
 degraded array.
 Exactly what happens depends on which block is being written.
 If the block being written would be stored on the recovering device, then
 md will perform a reconstruct-write.  It will read the other data blocks,
 calculate the parity, and write out the parity and the changed data.
 Similarly, if the parity block is on the recovering device, a
 reconstruct-write will be needed.
 If some other block is being written, md will do a read-modify-write to
 calculate the new parity and then write out the parity and data.  In this
 case the block on the recovering device will not be written.

 I hope that clarifies the situation.

NeilBrown
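
A minimal sketch of the two write paths described above, assuming a toy
stripe of three data blocks plus one XOR parity block. This is an
illustration in Python, not md code: xor_blocks(), reconstruct_write() and
read_modify_write() are hypothetical names, and the kernel works on stripe
buffers rather than byte strings.

```python
from functools import reduce


def xor_blocks(*blocks):
    """XOR equal-length blocks byte by byte."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


def reconstruct_write(data_blocks):
    """Compute parity from scratch from every data block in the stripe."""
    return xor_blocks(*data_blocks)


def read_modify_write(old_parity, old_data, new_data):
    """Update parity from the old parity and the old/new contents of one block."""
    return xor_blocks(old_parity, old_data, new_data)


# Toy stripe: d1 lives on the recovering device (assumption for this example).
d0, d1, d2 = b"\x11" * 4, b"\x22" * 4, b"\x44" * 4
parity = xor_blocks(d0, d1, d2)

# Case 1: the write targets the block on the recovering device.
# The other data blocks are read and parity is recomputed from all of them.
new_d1 = b"\xaa" * 4
parity_rcw = reconstruct_write([d0, new_d1, d2])

# Case 2: the write targets a block on a healthy device.
# Only the old data and old parity are needed; d1 is not written.
new_d0 = b"\x55" * 4
parity_rmw = read_modify_write(parity, d0, new_d0)

# d1 was never touched in case 2, but it is still recoverable from the rest.
recovered_d1 = xor_blocks(new_d0, d2, parity_rmw)
```

In the second case the block on the recovering device is never written, yet
it stays recoverable from the other blocks and the updated parity, so a
later recovery pass can still rebuild it.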

* Re: Question about raid5 disk recovery logic
From: Alexander Lyakas @ 2012-07-01 13:36 UTC
  To: NeilBrown; +Cc: linux-raid

Thanks, Neil!
That clarifies it.

Does this also mean that when md_do_sync() gets to such an
already-reconstructed stripe, it might reconstruct it once again,
unless the stripe stays in the stripe cache?

Thanks for helping,
Alex.


* Re: Question about raid5 disk recovery logic
From: NeilBrown @ 2012-07-01 21:44 UTC
  To: Alexander Lyakas; +Cc: linux-raid

Yes, it will reconstruct it, and that might be "again" if the reconstructed
block has already been written.  If the stripe is still in the cache, I think
it will still write that block out again, but won't need to reconstruct it.

NeilBrown
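
A rough sketch of why the second reconstruction is harmless, assuming a toy
model in which the recovery pass rebuilds every stripe's missing block from
the surviving blocks and writes it out unconditionally. The names
recovery_pass(), surviving and recovering_dev are hypothetical; this is not
how md_do_sync() is implemented, and it ignores the stripe cache entirely.

```python
from functools import reduce


def xor_blocks(*blocks):
    """XOR equal-length blocks byte by byte."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


def recovery_pass(surviving, recovering_dev):
    """Rebuild the missing block of every stripe and write it out.

    surviving: one tuple per stripe holding the blocks on the healthy
        devices (data and parity).
    recovering_dev: dict mapping stripe number to the block already stored
        on the recovering device.
    Returns the stripe numbers whose block was already correct before the
    pass. The comparison exists only to report this; the write happens
    either way.
    """
    redundant = []
    for stripe_no, blocks in enumerate(surviving):
        rebuilt = xor_blocks(*blocks)
        if recovering_dev.get(stripe_no) == rebuilt:
            redundant.append(stripe_no)
        recovering_dev[stripe_no] = rebuilt
    return redundant


surviving = [
    (b"\x11" * 4, b"\x33" * 4),  # stripe 0: missing block is 0x22
    (b"\x0f" * 4, b"\xff" * 4),  # stripe 1: missing block is 0xf0
]
# Stripe 0 was already filled in by an earlier reconstruct-write.
recovering_dev = {0: b"\x22" * 4}

redundant = recovery_pass(surviving, recovering_dev)
```

Rebuilding stripe 0 a second time produces the same block that was already
on the device, so the repeated write costs I/O but does not change the data.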


* Re: Question about raid5 disk recovery logic
From: Alexander Lyakas @ 2012-07-02  8:32 UTC
  To: NeilBrown; +Cc: linux-raid

Thanks, Neil, for the clear explanation.

Alex.

