* RE: raid6 - data integrity issue - data mis-compare on rebuilding RAID 6 - with 100 Mb resync speed.
@ 2014-04-11 12:11 Manibalan P
  2014-04-23  7:07 ` NeilBrown
  2014-04-23  9:19 ` Pasi Kärkkäinen
  0 siblings, 2 replies; 18+ messages in thread
From: Manibalan P @ 2014-04-11 12:11 UTC (permalink / raw)
  To: linux-raid; +Cc: neilb

Hi Neil,

Also, I found the data corruption issue on RHEL 6.5.

For your kind attention, I up-ported the md code [raid5.c + raid5.h]
from FC11 kernel to CentOS 6.4, and there is no mis-compare with the
up-ported code.

Thanks,
Manibalan.

-----Original Message-----
From: Manibalan P 
Sent: Monday, March 24, 2014 6:46 PM
To: 'linux-raid@vger.kernel.org'
Cc: neilb@suse.de
Subject: RE: raid6 - data integrity issue - data mis-compare on
rebuilding RAID 6 - with 100 Mb resync speed.

Hi,

I have performed the following tests to narrow down the integrity issue.

1. RAID 6, single drive failure - NO ISSUE
	a. run IO
	b. mdadm: set a drive faulty and remove it
	c. mdadm: add the drive back
There is no mis-compare in this path.

2. RAID 6, two-drive failure - write while degraded and verify after
rebuild
	a. remove two drives to make the RAID array degraded.
	b. now run the write IO cycle and wait till it completes.
	c. insert the drives back one by one, and wait till the rebuild
completes and the RAID array becomes optimal.
	d. now perform the verification cycle.
There is no mis-compare in this path either.

During all my tests, sync_speed_max and sync_speed_min were set to 100M
(100000 KB/s).
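
For reference, the degraded-write test above boils down to roughly the
following commands (a sketch only; /dev/md0 and the member device names are
examples, and the pattern write/verify steps stand in for the IO tool):

  mdadm /dev/md0 --fail /dev/sdb --remove /dev/sdb
  mdadm /dev/md0 --fail /dev/sdc --remove /dev/sdc
  # write a known pattern to the now doubly-degraded array
  echo 100000 > /sys/block/md0/md/sync_speed_min
  echo 100000 > /sys/block/md0/md/sync_speed_max
  mdadm /dev/md0 --add /dev/sdb
  mdadm --wait /dev/md0     # wait for the first rebuild to finish
  mdadm /dev/md0 --add /dev/sdc
  mdadm --wait /dev/md0     # wait for the second rebuild to finish
  # read the pattern back and verify it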

So, as you suggested in your previous mail, the corruption might be
happening only when resync and IO run in parallel.

Also, I tested with the upstream 2.6.32 kernel from git:
"http://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/ -
tags/v2.6.32"
	And I am facing the mis-compare issue in this kernel as well, on
RAID 6 with a two-drive failure and high sync_speed.

Thanks,
Manibalan.

-----Original Message-----
From: NeilBrown [mailto:neilb@suse.de]
Sent: Thursday, March 13, 2014 11:49 AM
To: Manibalan P
Cc: linux-raid@vger.kernel.org
Subject: Re: raid6 - data integrity issue - data mis-compare on
rebuilding RAID 6 - with 100 Mb resync speed.

On Wed, 12 Mar 2014 13:09:28 +0530 "Manibalan P"
<pmanibalan@amiindia.co.in>
wrote:

> >
> >Was the array fully synced before you started the test?
> 
> Yes, IO is started only after the re-sync is completed.
>  And to add more info,
>              I am facing this mis-compare only with high resync speed
> (30M to 100M). I ran the same test with resync speed min = 10M and max
> = 30M, without any issue. So the issue has a relationship with
> sync_speed_max / min.

So presumably it is an interaction between recovery and IO.  Maybe if we
write to a stripe that is being recovered, or recover a stripe that is
being written to, then something gets confused.

I'll have a look to see what I can find.

Thanks,
NeilBrown

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: raid6 - data integrity issue - data mis-compare on rebuilding RAID 6 - with 100 Mb resync speed.
  2014-04-11 12:11 raid6 - data integrity issue - data mis-compare on rebuilding RAID 6 - with 100 Mb resync speed Manibalan P
@ 2014-04-23  7:07 ` NeilBrown
  2014-04-23 17:02   ` Dan Williams
  2014-04-23  9:19 ` Pasi Kärkkäinen
  1 sibling, 1 reply; 18+ messages in thread
From: NeilBrown @ 2014-04-23  7:07 UTC (permalink / raw)
  To: Manibalan P; +Cc: linux-raid, Dan Williams

[-- Attachment #1: Type: text/plain, Size: 780 bytes --]

On Fri, 11 Apr 2014 17:41:12 +0530 "Manibalan P" <pmanibalan@amiindia.co.in>
wrote:

> Hi Neil,
> 
> Also, I found the data corruption issue on RHEL 6.5.
> 
> For your kind attention, I up-ported the md code [raid5.c + raid5.h]
> from FC11 kernel to CentOS 6.4, and there is no mis-compare with the
> up-ported code.

This narrows it down to between 2.6.29 and 2.6.32 - is that correct?

So it is probably the change to RAID6 to support async parity calculations.

Looking at the code always makes my head spin.

Dan : have you any ideas?

It seems that writing to a double-degraded RAID6 while it is recovering to
a spare can trigger data corruption.

2.6.29 works
2.6.32 doesn't
3.8.0 still doesn't.

I suspect async parity calculations.

NeilBrown

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 828 bytes --]

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: raid6 - data integrity issue - data mis-compare on rebuilding RAID 6 - with 100 Mb resync speed.
  2014-04-11 12:11 raid6 - data integrity issue - data mis-compare on rebuilding RAID 6 - with 100 Mb resync speed Manibalan P
  2014-04-23  7:07 ` NeilBrown
@ 2014-04-23  9:19 ` Pasi Kärkkäinen
  2014-04-23  9:25   ` Manibalan P
  1 sibling, 1 reply; 18+ messages in thread
From: Pasi Kärkkäinen @ 2014-04-23  9:19 UTC (permalink / raw)
  To: Manibalan P; +Cc: linux-raid, neilb

On Fri, Apr 11, 2014 at 05:41:12PM +0530, Manibalan P wrote:
> Hi Neil,
> 
> Also, I found the data corruption issue on RHEL 6.5.
> 

Did you file a bug about the corruption in the Red Hat Bugzilla?

-- Pasi

> For your kind attention, I up-ported the md code [raid5.c + raid5.h]
> from FC11 kernel to CentOS 6.4, and there is no mis-compare with the
> up-ported code.
> 
> Thanks,
> Manibalan.
> 
> -----Original Message-----
> From: Manibalan P 
> Sent: Monday, March 24, 2014 6:46 PM
> To: 'linux-raid@vger.kernel.org'
> Cc: neilb@suse.de
> Subject: RE: raid6 - data integrity issue - data mis-compare on
> rebuilding RAID 6 - with 100 Mb resync speed.
> 
> Hi,
> 
> I have performed the following tests to narrow down the integrity issue.
> 
> 1. RAID 6, single drive failure - NO ISSUE
> 	a. Running IO
> 	b. mdadm set faulty and remove a drive
> 	c. mdadm add the drive back
>  There is no mis-compare happen in this path.
> 
> 2. RAID 6, two drive failure - write during Degrade and verify after
> rebuild 
> 	a. remove two drives, to make the RAID array degraded.
> 	b. now run write IO write cycle, wait till the write cycle
> completes
> 	c. insert the drives back one by one, and wait till the re-build
> completes and a RAID array become optimal.
> 	d. now perform the verification cycle.
> There is no mis-compare happened in this path also.
> 
> During All my test, the sync_Speed_max and min is set to 100Mb
> 
> So, as you referred in your previous mail, the corruption might be
> happening only during resync and IO happens in parallel.
> 
> Also, I tested with upstream 2.6.32 kernel from git:
> "http://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/ -
> tags/v2.6.32"
> 	And I am facing mis-compare issue in this kernel as well.  on
> RAID 6, two drive failure with high sync_speed.
> 
> Thanks,
> Manibalan.
> 
> -----Original Message-----
> From: NeilBrown [mailto:neilb@suse.de]
> Sent: Thursday, March 13, 2014 11:49 AM
> To: Manibalan P
> Cc: linux-raid@vger.kernel.org
> Subject: Re: raid6 - data integrity issue - data mis-compare on
> rebuilding RAID 6 - with 100 Mb resync speed.
> 
> On Wed, 12 Mar 2014 13:09:28 +0530 "Manibalan P"
> <pmanibalan@amiindia.co.in>
> wrote:
> 
> > >
> > >Was the array fully synced before you started the test?
> > 
> > Yes , IO is started, only after the re-sync is completed.
> >  And to add more info,
> >              I am facing this mis-compare only with high resync speed 
> > (30M to 100M), I ran the same test with resync speed min -10M and max
> > - 30M, without any issue. So the  issue has relationship with 
> > sync_speed_max / min.
> 
> So presumably it is an interaction between recovery and IO.  Maybe if we
> write to a stripe that is being recoverred, or recover a stripe that is
> being written to, then something gets confused.
> 
> I'll have a look to see what I can find.
> 
> Thanks,
> NeilBrown
> --
> To unsubscribe from this list: send the line "unsubscribe linux-raid" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 18+ messages in thread

* RE: raid6 - data integrity issue - data mis-compare on rebuilding RAID 6 - with 100 Mb resync speed.
  2014-04-23  9:19 ` Pasi Kärkkäinen
@ 2014-04-23  9:25   ` Manibalan P
  2014-04-23  9:30     ` Pasi Kärkkäinen
  0 siblings, 1 reply; 18+ messages in thread
From: Manibalan P @ 2014-04-23  9:25 UTC (permalink / raw)
  To: Pasi Kärkkäinen; +Cc: linux-raid, neilb

>On Fri, Apr 11, 2014 at 05:41:12PM +0530, Manibalan P wrote:
>> Hi Neil,
>> 
>> Also, I found the data corruption issue on RHEL 6.5.
>> 

>Did you file a bug about the corruption to redhat bugzilla?

Yes, today I raised a support ticket with Red Hat regarding this issue.

Manibalan

>-- Pasi

> For your kind attention, I up-ported the md code [raid5.c + raid5.h] 
> from FC11 kernel to CentOS 6.4, and there is no mis-compare with the 
> up-ported code.
> 
> Thanks,
> Manibalan.
> 
> -----Original Message-----
> From: Manibalan P
> Sent: Monday, March 24, 2014 6:46 PM
> To: 'linux-raid@vger.kernel.org'
> Cc: neilb@suse.de
> Subject: RE: raid6 - data integrity issue - data mis-compare on 
> rebuilding RAID 6 - with 100 Mb resync speed.
> 
> Hi,
> 
> I have performed the following tests to narrow down the integrity issue.
> 
> 1. RAID 6, single drive failure - NO ISSUE
> 	a. Running IO
> 	b. mdadm set faulty and remove a drive
> 	c. mdadm add the drive back
>  There is no mis-compare happen in this path.
> 
> 2. RAID 6, two drive failure - write during Degrade and verify after 
> rebuild
> 	a. remove two drives, to make the RAID array degraded.
> 	b. now run write IO write cycle, wait till the write cycle completes
> 	c. insert the drives back one by one, and wait till the re-build 
> completes and a RAID array become optimal.
> 	d. now perform the verification cycle.
> There is no mis-compare happened in this path also.
> 
> During All my test, the sync_Speed_max and min is set to 100Mb
> 
> So, as you referred in your previous mail, the corruption might be 
> happening only during resync and IO happens in parallel.
> 
> Also, I tested with upstream 2.6.32 kernel from git:
> "http://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/ - 
> tags/v2.6.32"
> 	And I am facing mis-compare issue in this kernel as well.  on RAID 6, 
> two drive failure with high sync_speed.
> 
> Thanks,
> Manibalan.
> 
> -----Original Message-----
> From: NeilBrown [mailto:neilb@suse.de]
> Sent: Thursday, March 13, 2014 11:49 AM
> To: Manibalan P
> Cc: linux-raid@vger.kernel.org
> Subject: Re: raid6 - data integrity issue - data mis-compare on 
> rebuilding RAID 6 - with 100 Mb resync speed.
> 
> On Wed, 12 Mar 2014 13:09:28 +0530 "Manibalan P"
> <pmanibalan@amiindia.co.in>
> wrote:
> 
> > >
> > >Was the array fully synced before you started the test?
> > 
> > Yes , IO is started, only after the re-sync is completed.
> >  And to add more info,
> >              I am facing this mis-compare only with high resync 
> > speed (30M to 100M), I ran the same test with resync speed min -10M 
> > and max
> > - 30M, without any issue. So the  issue has relationship with 
> > sync_speed_max / min.
> 
> So presumably it is an interaction between recovery and IO.  Maybe if 
> we write to a stripe that is being recoverred, or recover a stripe 
> that is being written to, then something gets confused.
> 
> I'll have a look to see what I can find.
> 
> Thanks,
> NeilBrown
> --
> To unsubscribe from this list: send the line "unsubscribe linux-raid" 
> in the body of a message to majordomo@vger.kernel.org More majordomo 
> info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: raid6 - data integrity issue - data mis-compare on rebuilding RAID 6 - with 100 Mb resync speed.
  2014-04-23  9:25   ` Manibalan P
@ 2014-04-23  9:30     ` Pasi Kärkkäinen
  2014-04-23  9:33       ` Manibalan P
  0 siblings, 1 reply; 18+ messages in thread
From: Pasi Kärkkäinen @ 2014-04-23  9:30 UTC (permalink / raw)
  To: Manibalan P; +Cc: linux-raid, neilb

On Wed, Apr 23, 2014 at 02:55:15PM +0530, Manibalan P wrote:
> >On Fri, Apr 11, 2014 at 05:41:12PM +0530, Manibalan P wrote:
> >> Hi Neil,
> >> 
> >> Also, I found the data corruption issue on RHEL 6.5.
> >> 
> 
> >Did you file a bug about the corruption to redhat bugzilla?
> 
> Yes, today I raised a support ticket with Redhat regarding this issue.
> 

Ok, good. Can you paste the bz# ? 

-- Pasi

> Manibalan
> 
> >-- Pasi
> 
> > For your kind attention, I up-ported the md code [raid5.c + raid5.h] 
> > from FC11 kernel to CentOS 6.4, and there is no mis-compare with the 
> > up-ported code.
> > 
> > Thanks,
> > Manibalan.
> > 
> > -----Original Message-----
> > From: Manibalan P
> > Sent: Monday, March 24, 2014 6:46 PM
> > To: 'linux-raid@vger.kernel.org'
> > Cc: neilb@suse.de
> > Subject: RE: raid6 - data integrity issue - data mis-compare on 
> > rebuilding RAID 6 - with 100 Mb resync speed.
> > 
> > Hi,
> > 
> > I have performed the following tests to narrow down the integrity issue.
> > 
> > 1. RAID 6, single drive failure - NO ISSUE
> > 	a. Running IO
> > 	b. mdadm set faulty and remove a drive
> > 	c. mdadm add the drive back
> >  There is no mis-compare happen in this path.
> > 
> > 2. RAID 6, two drive failure - write during Degrade and verify after 
> > rebuild
> > 	a. remove two drives, to make the RAID array degraded.
> > 	b. now run write IO write cycle, wait till the write cycle completes
> > 	c. insert the drives back one by one, and wait till the re-build 
> > completes and a RAID array become optimal.
> > 	d. now perform the verification cycle.
> > There is no mis-compare happened in this path also.
> > 
> > During All my test, the sync_Speed_max and min is set to 100Mb
> > 
> > So, as you referred in your previous mail, the corruption might be 
> > happening only during resync and IO happens in parallel.
> > 
> > Also, I tested with upstream 2.6.32 kernel from git:
> > "http://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/ - 
> > tags/v2.6.32"
> > 	And I am facing mis-compare issue in this kernel as well.  on RAID 6, 
> > two drive failure with high sync_speed.
> > 
> > Thanks,
> > Manibalan.
> > 
> > -----Original Message-----
> > From: NeilBrown [mailto:neilb@suse.de]
> > Sent: Thursday, March 13, 2014 11:49 AM
> > To: Manibalan P
> > Cc: linux-raid@vger.kernel.org
> > Subject: Re: raid6 - data integrity issue - data mis-compare on 
> > rebuilding RAID 6 - with 100 Mb resync speed.
> > 
> > On Wed, 12 Mar 2014 13:09:28 +0530 "Manibalan P"
> > <pmanibalan@amiindia.co.in>
> > wrote:
> > 
> > > >
> > > >Was the array fully synced before you started the test?
> > > 
> > > Yes , IO is started, only after the re-sync is completed.
> > >  And to add more info,
> > >              I am facing this mis-compare only with high resync 
> > > speed (30M to 100M), I ran the same test with resync speed min -10M 
> > > and max
> > > - 30M, without any issue. So the  issue has relationship with 
> > > sync_speed_max / min.
> > 
> > So presumably it is an interaction between recovery and IO.  Maybe if 
> > we write to a stripe that is being recoverred, or recover a stripe 
> > that is being written to, then something gets confused.
> > 
> > I'll have a look to see what I can find.
> > 
> > Thanks,
> > NeilBrown
> > --
> > To unsubscribe from this list: send the line "unsubscribe linux-raid" 
> > in the body of a message to majordomo@vger.kernel.org More majordomo 
> > info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 18+ messages in thread

* RE: raid6 - data integrity issue - data mis-compare on rebuilding RAID 6 - with 100 Mb resync speed.
  2014-04-23  9:30     ` Pasi Kärkkäinen
@ 2014-04-23  9:33       ` Manibalan P
  2014-04-23  9:45         ` Pasi Kärkkäinen
  0 siblings, 1 reply; 18+ messages in thread
From: Manibalan P @ 2014-04-23  9:33 UTC (permalink / raw)
  To: Pasi Kärkkäinen; +Cc: linux-raid, neilb

>On Wed, Apr 23, 2014 at 02:55:15PM +0530, Manibalan P wrote:
>> >On Fri, Apr 11, 2014 at 05:41:12PM +0530, Manibalan P wrote:
>> >> Hi Neil,
>> >> 
>> >> Also, I found the data corruption issue on RHEL 6.5.
>> >> 
>> 
>> >Did you file a bug about the corruption to redhat bugzilla?
>> 
>> Yes, today I raised a support ticket with Redhat regarding this issue.
>> 

>Ok, good. Can you paste the bz# ?

https://access.redhat.com/support/cases/01080080/

manibalan
 
>
>-- Pasi

> Manibalan
> 
> >-- Pasi
> 
> > For your kind attention, I up-ported the md code [raid5.c + raid5.h] 
> > from FC11 kernel to CentOS 6.4, and there is no mis-compare with the 
> > up-ported code.
> > 
> > Thanks,
> > Manibalan.
> > 
> > -----Original Message-----
> > From: Manibalan P
> > Sent: Monday, March 24, 2014 6:46 PM
> > To: 'linux-raid@vger.kernel.org'
> > Cc: neilb@suse.de
> > Subject: RE: raid6 - data integrity issue - data mis-compare on 
> > rebuilding RAID 6 - with 100 Mb resync speed.
> > 
> > Hi,
> > 
> > I have performed the following tests to narrow down the integrity issue.
> > 
> > 1. RAID 6, single drive failure - NO ISSUE
> > 	a. Running IO
> > 	b. mdadm set faulty and remove a drive
> > 	c. mdadm add the drive back
> >  There is no mis-compare happen in this path.
> > 
> > 2. RAID 6, two drive failure - write during Degrade and verify after 
> > rebuild
> > 	a. remove two drives, to make the RAID array degraded.
> > 	b. now run write IO write cycle, wait till the write cycle completes
> > 	c. insert the drives back one by one, and wait till the re-build 
> > completes and a RAID array become optimal.
> > 	d. now perform the verification cycle.
> > There is no mis-compare happened in this path also.
> > 
> > During All my test, the sync_Speed_max and min is set to 100Mb
> > 
> > So, as you referred in your previous mail, the corruption might be 
> > happening only during resync and IO happens in parallel.
> > 
> > Also, I tested with upstream 2.6.32 kernel from git:
> > "http://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/ - 
> > tags/v2.6.32"
> > 	And I am facing mis-compare issue in this kernel as well.  on RAID 
> > 6, two drive failure with high sync_speed.
> > 
> > Thanks,
> > Manibalan.
> > 
> > -----Original Message-----
> > From: NeilBrown [mailto:neilb@suse.de]
> > Sent: Thursday, March 13, 2014 11:49 AM
> > To: Manibalan P
> > Cc: linux-raid@vger.kernel.org
> > Subject: Re: raid6 - data integrity issue - data mis-compare on 
> > rebuilding RAID 6 - with 100 Mb resync speed.
> > 
> > On Wed, 12 Mar 2014 13:09:28 +0530 "Manibalan P"
> > <pmanibalan@amiindia.co.in>
> > wrote:
> > 
> > > >
> > > >Was the array fully synced before you started the test?
> > > 
> > > Yes , IO is started, only after the re-sync is completed.
> > >  And to add more info,
> > >              I am facing this mis-compare only with high resync 
> > > speed (30M to 100M), I ran the same test with resync speed min 
> > > -10M and max
> > > - 30M, without any issue. So the  issue has relationship with 
> > > sync_speed_max / min.
> > 
> > So presumably it is an interaction between recovery and IO.  Maybe 
> > if we write to a stripe that is being recoverred, or recover a 
> > stripe that is being written to, then something gets confused.
> > 
> > I'll have a look to see what I can find.
> > 
> > Thanks,
> > NeilBrown
> > --
> > To unsubscribe from this list: send the line "unsubscribe linux-raid" 
> > in the body of a message to majordomo@vger.kernel.org More majordomo 
> > info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: raid6 - data integrity issue - data mis-compare on rebuilding RAID 6 - with 100 Mb resync speed.
  2014-04-23  9:33       ` Manibalan P
@ 2014-04-23  9:45         ` Pasi Kärkkäinen
  2014-04-23  9:59           ` Manibalan P
  0 siblings, 1 reply; 18+ messages in thread
From: Pasi Kärkkäinen @ 2014-04-23  9:45 UTC (permalink / raw)
  To: Manibalan P; +Cc: linux-raid, neilb

On Wed, Apr 23, 2014 at 03:03:21PM +0530, Manibalan P wrote:
> >On Wed, Apr 23, 2014 at 02:55:15PM +0530, Manibalan P wrote:
> >> >On Fri, Apr 11, 2014 at 05:41:12PM +0530, Manibalan P wrote:
> >> >> Hi Neil,
> >> >> 
> >> >> Also, I found the data corruption issue on RHEL 6.5.
> >> >> 
> >> 
> >> >Did you file a bug about the corruption to redhat bugzilla?
> >> 
> >> Yes, today I raised a support ticket with Redhat regarding this issue.
> >> 
> 
> >Ok, good. Can you paste the bz# ?
> 
> https://access.redhat.com/support/cases/01080080/
> 

Hmm, I can't access that. Do you have a URL for bugzilla.redhat.com (which is the public bug tracker)?

-- Pasi

> manibalan
>  
> >
> >-- Pasi
> 
> > Manibalan
> > 
> > >-- Pasi
> > 
> > > For your kind attention, I up-ported the md code [raid5.c + raid5.h] 
> > > from FC11 kernel to CentOS 6.4, and there is no mis-compare with the 
> > > up-ported code.
> > > 
> > > Thanks,
> > > Manibalan.
> > > 
> > > -----Original Message-----
> > > From: Manibalan P
> > > Sent: Monday, March 24, 2014 6:46 PM
> > > To: 'linux-raid@vger.kernel.org'
> > > Cc: neilb@suse.de
> > > Subject: RE: raid6 - data integrity issue - data mis-compare on 
> > > rebuilding RAID 6 - with 100 Mb resync speed.
> > > 
> > > Hi,
> > > 
> > > I have performed the following tests to narrow down the integrity issue.
> > > 
> > > 1. RAID 6, single drive failure - NO ISSUE
> > > 	a. Running IO
> > > 	b. mdadm set faulty and remove a drive
> > > 	c. mdadm add the drive back
> > >  There is no mis-compare happen in this path.
> > > 
> > > 2. RAID 6, two drive failure - write during Degrade and verify after 
> > > rebuild
> > > 	a. remove two drives, to make the RAID array degraded.
> > > 	b. now run write IO write cycle, wait till the write cycle completes
> > > 	c. insert the drives back one by one, and wait till the re-build 
> > > completes and a RAID array become optimal.
> > > 	d. now perform the verification cycle.
> > > There is no mis-compare happened in this path also.
> > > 
> > > During All my test, the sync_Speed_max and min is set to 100Mb
> > > 
> > > So, as you referred in your previous mail, the corruption might be 
> > > happening only during resync and IO happens in parallel.
> > > 
> > > Also, I tested with upstream 2.6.32 kernel from git:
> > > "http://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/ - 
> > > tags/v2.6.32"
> > > 	And I am facing mis-compare issue in this kernel as well.  on RAID 
> > > 6, two drive failure with high sync_speed.
> > > 
> > > Thanks,
> > > Manibalan.
> > > 
> > > -----Original Message-----
> > > From: NeilBrown [mailto:neilb@suse.de]
> > > Sent: Thursday, March 13, 2014 11:49 AM
> > > To: Manibalan P
> > > Cc: linux-raid@vger.kernel.org
> > > Subject: Re: raid6 - data integrity issue - data mis-compare on 
> > > rebuilding RAID 6 - with 100 Mb resync speed.
> > > 
> > > On Wed, 12 Mar 2014 13:09:28 +0530 "Manibalan P"
> > > <pmanibalan@amiindia.co.in>
> > > wrote:
> > > 
> > > > >
> > > > >Was the array fully synced before you started the test?
> > > > 
> > > > Yes , IO is started, only after the re-sync is completed.
> > > >  And to add more info,
> > > >              I am facing this mis-compare only with high resync 
> > > > speed (30M to 100M), I ran the same test with resync speed min 
> > > > -10M and max
> > > > - 30M, without any issue. So the  issue has relationship with 
> > > > sync_speed_max / min.
> > > 
> > > So presumably it is an interaction between recovery and IO.  Maybe 
> > > if we write to a stripe that is being recoverred, or recover a 
> > > stripe that is being written to, then something gets confused.
> > > 
> > > I'll have a look to see what I can find.
> > > 
> > > Thanks,
> > > NeilBrown
> > > --
> > > To unsubscribe from this list: send the line "unsubscribe linux-raid" 
> > > in the body of a message to majordomo@vger.kernel.org More majordomo 
> > > info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 18+ messages in thread

* RE: raid6 - data integrity issue - data mis-compare on rebuilding RAID 6 - with 100 Mb resync speed.
  2014-04-23  9:45         ` Pasi Kärkkäinen
@ 2014-04-23  9:59           ` Manibalan P
  2014-04-23 10:03             ` Pasi Kärkkäinen
  0 siblings, 1 reply; 18+ messages in thread
From: Manibalan P @ 2014-04-23  9:59 UTC (permalink / raw)
  To: Pasi Kärkkäinen; +Cc: linux-raid, neilb

>On Wed, Apr 23, 2014 at 03:03:21PM +0530, Manibalan P wrote:
>> >On Wed, Apr 23, 2014 at 02:55:15PM +0530, Manibalan P wrote:
>> >> >On Fri, Apr 11, 2014 at 05:41:12PM +0530, Manibalan P wrote:
>> >> >> Hi Neil,
>> >> >> 
>> >> >> Also, I found the data corruption issue on RHEL 6.5.
>> >> >> 
>> >> 
>> >> >Did you file a bug about the corruption to redhat bugzilla?
>> >> 
>>>> Yes, today I raised a support ticket with Redhat regarding this issue.
>>>> 
>> 
>> >Ok, good. Can you paste the bz# ?
>> 
>> https://access.redhat.com/support/cases/01080080/
>> 

>Hmm, I can't access that, do you have an url for bugzilla.redhat.com (which is the public bug tracker) ? 

https://bugzilla.redhat.com/show_bug.cgi?id=1090423

Please look at the above link; I created it just now.

Manibalan.
>-- Pasi

> manibalan
>  
> >
> >-- Pasi
> 
> > Manibalan
> > 
> > >-- Pasi
> > 
> > > For your kind attention, I up-ported the md code [raid5.c + 
> > > raid5.h] from FC11 kernel to CentOS 6.4, and there is no 
> > > mis-compare with the up-ported code.
> > > 
> > > Thanks,
> > > Manibalan.
> > > 
> > > -----Original Message-----
> > > From: Manibalan P
> > > Sent: Monday, March 24, 2014 6:46 PM
> > > To: 'linux-raid@vger.kernel.org'
> > > Cc: neilb@suse.de
> > > Subject: RE: raid6 - data integrity issue - data mis-compare on 
> > > rebuilding RAID 6 - with 100 Mb resync speed.
> > > 
> > > Hi,
> > > 
> > > I have performed the following tests to narrow down the integrity issue.
> > > 
> > > 1. RAID 6, single drive failure - NO ISSUE
> > > 	a. Running IO
> > > 	b. mdadm set faulty and remove a drive
> > > 	c. mdadm add the drive back
> > >  There is no mis-compare happen in this path.
> > > 
> > > 2. RAID 6, two drive failure - write during Degrade and verify 
> > > after rebuild
> > > 	a. remove two drives, to make the RAID array degraded.
> > > 	b. now run write IO write cycle, wait till the write cycle completes
> > > 	c. insert the drives back one by one, and wait till the re-build 
> > > completes and a RAID array become optimal.
> > > 	d. now perform the verification cycle.
> > > There is no mis-compare happened in this path also.
> > > 
> > > During All my test, the sync_Speed_max and min is set to 100Mb
> > > 
> > > So, as you referred in your previous mail, the corruption might be 
> > > happening only during resync and IO happens in parallel.
> > > 
> > > Also, I tested with upstream 2.6.32 kernel from git:
> > > "http://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/ - 
> > > tags/v2.6.32"
> > > 	And I am facing mis-compare issue in this kernel as well.  on 
> > > RAID 6, two drive failure with high sync_speed.
> > > 
> > > Thanks,
> > > Manibalan.
> > > 
> > > -----Original Message-----
> > > From: NeilBrown [mailto:neilb@suse.de]
> > > Sent: Thursday, March 13, 2014 11:49 AM
> > > To: Manibalan P
> > > Cc: linux-raid@vger.kernel.org
> > > Subject: Re: raid6 - data integrity issue - data mis-compare on 
> > > rebuilding RAID 6 - with 100 Mb resync speed.
> > > 
> > > On Wed, 12 Mar 2014 13:09:28 +0530 "Manibalan P"
> > > <pmanibalan@amiindia.co.in>
> > > wrote:
> > > 
> > > > >
> > > > >Was the array fully synced before you started the test?
> > > > 
> > > > Yes , IO is started, only after the re-sync is completed.
> > > >  And to add more info,
> > > >              I am facing this mis-compare only with high resync 
> > > > speed (30M to 100M), I ran the same test with resync speed min 
> > > > -10M and max
> > > > - 30M, without any issue. So the  issue has relationship with 
> > > > sync_speed_max / min.
> > > 
> > > So presumably it is an interaction between recovery and IO.  Maybe 
> > > if we write to a stripe that is being recoverred, or recover a 
> > > stripe that is being written to, then something gets confused.
> > > 
> > > I'll have a look to see what I can find.
> > > 
> > > Thanks,
> > > NeilBrown
> > > --
> > > To unsubscribe from this list: send the line "unsubscribe linux-raid" 
> > > in the body of a message to majordomo@vger.kernel.org More 
> > > majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: raid6 - data integrity issue - data mis-compare on rebuilding RAID 6 - with 100 Mb resync speed.
  2014-04-23  9:59           ` Manibalan P
@ 2014-04-23 10:03             ` Pasi Kärkkäinen
  0 siblings, 0 replies; 18+ messages in thread
From: Pasi Kärkkäinen @ 2014-04-23 10:03 UTC (permalink / raw)
  To: Manibalan P; +Cc: linux-raid, neilb

On Wed, Apr 23, 2014 at 03:29:34PM +0530, Manibalan P wrote:
> >On Wed, Apr 23, 2014 at 03:03:21PM +0530, Manibalan P wrote:
> >> >On Wed, Apr 23, 2014 at 02:55:15PM +0530, Manibalan P wrote:
> >> >> >On Fri, Apr 11, 2014 at 05:41:12PM +0530, Manibalan P wrote:
> >> >> >> Hi Neil,
> >> >> >> 
> >> >> >> Also, I found the data corruption issue on RHEL 6.5.
> >> >> >> 
> >> >> 
> >> >> >Did you file a bug about the corruption to redhat bugzilla?
> >> >> 
> >>>> Yes, today I raised a support ticket with Redhat regarding this issue.
> >>>> 
> >> 
> >> >Ok, good. Can you paste the bz# ?
> >> 
> >> https://access.redhat.com/support/cases/01080080/
> >> 
> 
> >Hmm, I can't access that, do you have an url for bugzilla.redhat.com (which is the public bug tracker) ? 
> 
> https://bugzilla.redhat.com/show_bug.cgi?id=1090423
> 
> please look at the above link, I created one now.
> 

Great, thanks!

> Manibalan.


-- Pasi

> >-- Pasi
> 
> > manibalan
> >  
> > >
> > >-- Pasi
> > 
> > > Manibalan
> > > 
> > > >-- Pasi
> > > 
> > > > For your kind attention, I up-ported the md code [raid5.c + 
> > > > raid5.h] from FC11 kernel to CentOS 6.4, and there is no 
> > > > mis-compare with the up-ported code.
> > > > 
> > > > Thanks,
> > > > Manibalan.
> > > > 
> > > > -----Original Message-----
> > > > From: Manibalan P
> > > > Sent: Monday, March 24, 2014 6:46 PM
> > > > To: 'linux-raid@vger.kernel.org'
> > > > Cc: neilb@suse.de
> > > > Subject: RE: raid6 - data integrity issue - data mis-compare on 
> > > > rebuilding RAID 6 - with 100 Mb resync speed.
> > > > 
> > > > Hi,
> > > > 
> > > > I have performed the following tests to narrow down the integrity issue.
> > > > 
> > > > 1. RAID 6, single drive failure - NO ISSUE
> > > > 	a. Running IO
> > > > 	b. mdadm set faulty and remove a drive
> > > > 	c. mdadm add the drive back
> > > >  There is no mis-compare happen in this path.
> > > > 
> > > > 2. RAID 6, two drive failure - write during Degrade and verify 
> > > > after rebuild
> > > > 	a. remove two drives, to make the RAID array degraded.
> > > > 	b. now run write IO write cycle, wait till the write cycle completes
> > > > 	c. insert the drives back one by one, and wait till the re-build 
> > > > completes and a RAID array become optimal.
> > > > 	d. now perform the verification cycle.
> > > > There is no mis-compare happened in this path also.
> > > > 
> > > > During All my test, the sync_Speed_max and min is set to 100Mb
> > > > 
> > > > So, as you referred in your previous mail, the corruption might be 
> > > > happening only during resync and IO happens in parallel.
> > > > 
> > > > Also, I tested with upstream 2.6.32 kernel from git:
> > > > "http://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/ - 
> > > > tags/v2.6.32"
> > > > 	And I am facing mis-compare issue in this kernel as well.  on 
> > > > RAID 6, two drive failure with high sync_speed.
> > > > 
> > > > Thanks,
> > > > Manibalan.
> > > > 
> > > > -----Original Message-----
> > > > From: NeilBrown [mailto:neilb@suse.de]
> > > > Sent: Thursday, March 13, 2014 11:49 AM
> > > > To: Manibalan P
> > > > Cc: linux-raid@vger.kernel.org
> > > > Subject: Re: raid6 - data integrity issue - data mis-compare on 
> > > > rebuilding RAID 6 - with 100 Mb resync speed.
> > > > 
> > > > On Wed, 12 Mar 2014 13:09:28 +0530 "Manibalan P"
> > > > <pmanibalan@amiindia.co.in>
> > > > wrote:
> > > > 
> > > > > >
> > > > > >Was the array fully synced before you started the test?
> > > > > 
> > > > > Yes , IO is started, only after the re-sync is completed.
> > > > >  And to add more info,
> > > > >              I am facing this mis-compare only with high resync 
> > > > > speed (30M to 100M), I ran the same test with resync speed min 
> > > > > -10M and max
> > > > > - 30M, without any issue. So the  issue has relationship with 
> > > > > sync_speed_max / min.
> > > > 
> > > > So presumably it is an interaction between recovery and IO.  Maybe 
> > > > if we write to a stripe that is being recoverred, or recover a 
> > > > stripe that is being written to, then something gets confused.
> > > > 
> > > > I'll have a look to see what I can find.
> > > > 
> > > > Thanks,
> > > > NeilBrown
> > > > --
> > > > To unsubscribe from this list: send the line "unsubscribe linux-raid" 
> > > > in the body of a message to majordomo@vger.kernel.org More 
> > > > majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: raid6 - data integrity issue - data mis-compare on rebuilding RAID 6 - with 100 Mb resync speed.
  2014-04-23  7:07 ` NeilBrown
@ 2014-04-23 17:02   ` Dan Williams
  2014-05-05  7:21     ` NeilBrown
  0 siblings, 1 reply; 18+ messages in thread
From: Dan Williams @ 2014-04-23 17:02 UTC (permalink / raw)
  To: NeilBrown; +Cc: Manibalan P, linux-raid

On Wed, Apr 23, 2014 at 12:07 AM, NeilBrown <neilb@suse.de> wrote:
> On Fri, 11 Apr 2014 17:41:12 +0530 "Manibalan P" <pmanibalan@amiindia.co.in>
> wrote:
>
>> Hi Neil,
>>
>> Also, I found the data corruption issue on RHEL 6.5.
>>
>> For your kind attention, I up-ported the md code [raid5.c + raid5.h]
>> from FC11 kernel to CentOS 6.4, and there is no mis-compare with the
>> up-ported code.
>
> This narrows it down to between 2.6.29 and 2.6.32 - is that correct?
>
> So it is probably the change to RAID6 to support async parity calculations.
>
> Looking at the code always makes my head spin.
>
> Dan : have you any ideas?
>
> It seems that writing to a double-degraded RAID6 while it is recovering to
> a space can trigger data corruption.
>
> 2.6.29 works
> 2.6.32 doesn't
> 3.8.0 still doesn't.
>
> I suspect async parity calculations.

I'll take a look.  I've had cleanups of that code on my backlog for "a
while now (TM)".

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: raid6 - data integrity issue - data mis-compare on rebuilding RAID 6 - with 100 Mb resync speed.
  2014-04-23 17:02   ` Dan Williams
@ 2014-05-05  7:21     ` NeilBrown
  2014-05-16 18:11       ` Dan Williams
  0 siblings, 1 reply; 18+ messages in thread
From: NeilBrown @ 2014-05-05  7:21 UTC (permalink / raw)
  To: Dan Williams; +Cc: Manibalan P, linux-raid

[-- Attachment #1: Type: text/plain, Size: 1885 bytes --]

On Wed, 23 Apr 2014 10:02:00 -0700 Dan Williams <dan.j.williams@intel.com>
wrote:

> On Wed, Apr 23, 2014 at 12:07 AM, NeilBrown <neilb@suse.de> wrote:
> > On Fri, 11 Apr 2014 17:41:12 +0530 "Manibalan P" <pmanibalan@amiindia.co.in>
> > wrote:
> >
> >> Hi Neil,
> >>
> >> Also, I found the data corruption issue on RHEL 6.5.
> >>
> >> For your kind attention, I up-ported the md code [raid5.c + raid5.h]
> >> from FC11 kernel to CentOS 6.4, and there is no mis-compare with the
> >> up-ported code.
> >
> > This narrows it down to between 2.6.29 and 2.6.32 - is that correct?
> >
> > So it is probably the change to RAID6 to support async parity calculations.
> >
> > Looking at the code always makes my head spin.
> >
> > Dan : have you any ideas?
> >
> > It seems that writing to a double-degraded RAID6 while it is recovering to
> > a space can trigger data corruption.
> >
> > 2.6.29 works
> > 2.6.32 doesn't
> > 3.8.0 still doesn't.
> >
> > I suspect async parity calculations.
> 
> I'll take a look.  I've had cleanups of that code on my backlog for "a
> while now (TM)".


Hi Dan,
 did you have a chance to have a look?

I've been consistently failing to find anything.

I have a question though.
If we set up a chain of async dma handling via:
   ops_run_compute6_2 then ops_bio_drain then ops_run_reconstruct

is it possible for the ops_complete_compute callback set up by
ops_run_compute6_2 to be called before ops_run_reconstruct has been scheduled
or run?

If so, there seems to be some room for confusion over the setting for
R5_UPTODATE on blocks that are being computed and then drained to.  Both will
try to set the flag, so it could get set before reconstruction has run.

I can't see that this would cause a problem, but then I'm not entirely sure
why we clear R5_UPTODATE when we set R5_Wantdrain.

NeilBrown

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 828 bytes --]

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: raid6 - data integrity issue - data mis-compare on rebuilding RAID 6 - with 100 Mb resync speed.
  2014-05-05  7:21     ` NeilBrown
@ 2014-05-16 18:11       ` Dan Williams
  2014-05-20  0:22         ` Dan Williams
  0 siblings, 1 reply; 18+ messages in thread
From: Dan Williams @ 2014-05-16 18:11 UTC (permalink / raw)
  To: NeilBrown; +Cc: Manibalan P, linux-raid

On Mon, May 5, 2014 at 12:21 AM, NeilBrown <neilb@suse.de> wrote:
> On Wed, 23 Apr 2014 10:02:00 -0700 Dan Williams <dan.j.williams@intel.com>
> wrote:
>
>> On Wed, Apr 23, 2014 at 12:07 AM, NeilBrown <neilb@suse.de> wrote:
>> > On Fri, 11 Apr 2014 17:41:12 +0530 "Manibalan P" <pmanibalan@amiindia.co.in>
>> > wrote:
>> >
>> >> Hi Neil,
>> >>
>> >> Also, I found the data corruption issue on RHEL 6.5.
>> >>
>> >> For your kind attention, I up-ported the md code [raid5.c + raid5.h]
>> >> from FC11 kernel to CentOS 6.4, and there is no mis-compare with the
>> >> up-ported code.
>> >
>> > This narrows it down to between 2.6.29 and 2.6.32 - is that correct?
>> >
>> > So it is probably the change to RAID6 to support async parity calculations.
>> >
>> > Looking at the code always makes my head spin.
>> >
>> > Dan : have you any ideas?
>> >
>> > It seems that writing to a double-degraded RAID6 while it is recovering to
>> > a space can trigger data corruption.
>> >
>> > 2.6.29 works
>> > 2.6.32 doesn't
>> > 3.8.0 still doesn't.
>> >
>> > I suspect async parity calculations.
>>
>> I'll take a look.  I've had cleanups of that code on my backlog for "a
>> while now (TM)".
>
>
> Hi Dan,
>  did you have a chance to have a look?
>
> I've been consistently failing to find anything.
>
> I have a question though.
> If we set up a chain of async dma handling via:
>    ops_run_compute6_2 then ops_bio_drain then ops_run_reconstruct
>
> is it possible for the ops_complete_compute callback set up by
> ops_run_compute6_2 to be called before ops_run_reconstruct has been scheduled
> or run?

In the absence of a dma engine we never run asynchronously, so we will
*always* call ops_complete_compute() before ops_run_reconstruct() in
the synchronous case.  This looks confused.  We're certainly leaking
an uptodate state prior to the completion of the write.

> If so, there seems to be some room for confusion over the setting for
> R5_UPTODATE on blocks that are being computed and then drained to.  Both will
> try to set the flag, so it could get set before reconstruction has run.
>
> I can't see that this would cause a problem, but then I'm not entirely sure
> why we clear R5_UPTODATE when we set R5_Wantdrain.

Let me see what problems this could be causing.  I'm thinking we
should be protected by the global ->reconstruct_state, but something
is telling me we do depend on R5_UPTODATE being consistent with the
ongoing stripe operation.

--
Dan

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: raid6 - data integrity issue - data mis-compare on rebuilding RAID 6 - with 100 Mb resync speed.
  2014-05-16 18:11       ` Dan Williams
@ 2014-05-20  0:22         ` Dan Williams
  2014-05-22 11:47           ` Manibalan P
  2014-05-22 11:52           ` Manibalan P
  0 siblings, 2 replies; 18+ messages in thread
From: Dan Williams @ 2014-05-20  0:22 UTC (permalink / raw)
  To: NeilBrown; +Cc: Manibalan P, linux-raid

On Fri, May 16, 2014 at 11:11 AM, Dan Williams <dan.j.williams@intel.com> wrote:
> On Mon, May 5, 2014 at 12:21 AM, NeilBrown <neilb@suse.de> wrote:
>> On Wed, 23 Apr 2014 10:02:00 -0700 Dan Williams <dan.j.williams@intel.com>
>> wrote:
>>
>>> On Wed, Apr 23, 2014 at 12:07 AM, NeilBrown <neilb@suse.de> wrote:
>>> > On Fri, 11 Apr 2014 17:41:12 +0530 "Manibalan P" <pmanibalan@amiindia.co.in>
>>> > wrote:
>>> >
>>> >> Hi Neil,
>>> >>
>>> >> Also, I found the data corruption issue on RHEL 6.5.
>>> >>
>>> >> For your kind attention, I up-ported the md code [raid5.c + raid5.h]
>>> >> from FC11 kernel to CentOS 6.4, and there is no mis-compare with the
>>> >> up-ported code.
>>> >
>>> > This narrows it down to between 2.6.29 and 2.6.32 - is that correct?
>>> >
>>> > So it is probably the change to RAID6 to support async parity calculations.
>>> >
>>> > Looking at the code always makes my head spin.
>>> >
>>> > Dan : have you any ideas?
>>> >
>>> > It seems that writing to a double-degraded RAID6 while it is recovering to
>>> > a space can trigger data corruption.
>>> >
>>> > 2.6.29 works
>>> > 2.6.32 doesn't
>>> > 3.8.0 still doesn't.
>>> >
>>> > I suspect async parity calculations.
>>>
>>> I'll take a look.  I've had cleanups of that code on my backlog for "a
>>> while now (TM)".
>>
>>
>> Hi Dan,
>>  did you have a chance to have a look?
>>
>> I've been consistently failing to find anything.
>>
>> I have a question though.
>> If we set up a chain of async dma handling via:
>>    ops_run_compute6_2 then ops_bio_drain then ops_run_reconstruct
>>
>> is it possible for the ops_complete_compute callback set up by
>> ops_run_compute6_2 to be called before ops_run_reconstruct has been scheduled
>> or run?
>
> In the absence of a dma engine we never run asynchronously, so we will
> *always* call ops_complete_compute() before ops_run_reconstruct() in
> the synchronous acse.  This looks confused.  We're certainly leaking
> an uptodate state prior to the completion of the write.
>
>> If so, there seems to be some room for confusion over the setting for
>> R5_UPTODATE on blocks that are being computed and then drained to.  Both will
>> try to set the flag, so it could get set before reconstruction has run.
>>
>> I can't see that this would cause a problem, but then I'm not entirely sure
>> why we clear R5_UPTODATE when we set R5_Wantdrain.
>
> Let me see what problems this could be causing.  I'm thinking we
> should be protected by the global ->reconstruct_state, but something
> is telling me we do depend on R5_UPTODATE being consistent with the
> ongoing stripe operation.
>

Can you share your exact test scripts?  I'm having a hard time
reproducing this with something like:

echo 100000 > /proc/sys/dev/raid/speed_limit_min
mdadm --add /dev/md0 /dev/sd[bc]; dd if=urandom.dump of=/dev/md0
bs=1024M oflag=sync

This is a 7-drive raid6 array.
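
For reference, a throwaway array of that shape can be created along these
lines (the device names are examples):

  mdadm --create /dev/md0 --level=6 --raid-devices=7 /dev/sd[b-h]
  mdadm --wait /dev/md0    # let the initial resync finish before testing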

^ permalink raw reply	[flat|nested] 18+ messages in thread

* RE: raid6 - data integrity issue - data mis-compare on rebuilding RAID 6 - with 100 Mb resync speed.
  2014-05-20  0:22         ` Dan Williams
@ 2014-05-22 11:47           ` Manibalan P
  2014-05-22 11:52           ` Manibalan P
  1 sibling, 0 replies; 18+ messages in thread
From: Manibalan P @ 2014-05-22 11:47 UTC (permalink / raw)
  To: linux-raid, Dan Williams, NeilBrown

[-- Attachment #1: Type: text/plain, Size: 3773 bytes --]

>Can you share your exact test scripts?  I'm having a hard time reproducing this with something like:

>echo 100000 > /proc/sys/dev/raid/speed_limit_min
>mdadm --add /dev/md0 /dev/sd[bc]; dd if=urandom.dump of=/dev/md0 bs=1024M oflag=sync

I have attached the script, which simulates a two-drive failure on RAID 6.
For running IO we used Dit32, but you can use any data verification tool
that writes a known pattern and then reads the data back to verify it.
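
A minimal stand-in for such a tool (a sketch only; the size, the scratch
files and /dev/md0 are example choices) could be:

  dd if=/dev/urandom of=/tmp/pattern.bin bs=1M count=1024
  dd if=/tmp/pattern.bin of=/dev/md0 bs=1M oflag=direct
  # ... run the two-drive failure / rebuild sequence here ...
  dd if=/dev/md0 of=/tmp/readback.bin bs=1M count=1024 iflag=direct
  cmp /tmp/pattern.bin /tmp/readback.bin && echo "verify OK" || echo "MISCOMPARE"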


>-----Original Message-----
>From: dan.j.williams@gmail.com [mailto:dan.j.williams@gmail.com] On Behalf Of Dan Williams
>Sent: Tuesday, May 20, 2014 5:52 AM
>To: NeilBrown
>Cc: Manibalan P; linux-raid
>Subject: Re: raid6 - data integrity issue - data mis-compare on rebuilding RAID 6 - with 100 Mb resync speed.

>On Fri, May 16, 2014 at 11:11 AM, Dan Williams <dan.j.williams@intel.com> wrote:
> On Mon, May 5, 2014 at 12:21 AM, NeilBrown <neilb@suse.de> wrote:
>> On Wed, 23 Apr 2014 10:02:00 -0700 Dan Williams 
>> <dan.j.williams@intel.com>
>> wrote:
>>
>>> On Wed, Apr 23, 2014 at 12:07 AM, NeilBrown <neilb@suse.de> wrote:
>>> > On Fri, 11 Apr 2014 17:41:12 +0530 "Manibalan P" 
>>> > <pmanibalan@amiindia.co.in>
>>> > wrote:
>>> >
>>> >> Hi Neil,
>>> >>
>>> >> Also, I found the data corruption issue on RHEL 6.5.
>>> >>
>>> >> For your kind attention, I up-ported the md code [raid5.c + 
>>> >> raid5.h] from FC11 kernel to CentOS 6.4, and there is no 
>>> >> mis-compare with the up-ported code.
>>> >
>>> > This narrows it down to between 2.6.29 and 2.6.32 - is that correct?
>>> >
>>> > So it is probably the change to RAID6 to support async parity calculations.
>>> >
>>> > Looking at the code always makes my head spin.
>>> >
>>> > Dan : have you any ideas?
>>> >
>>> > It seems that writing to a double-degraded RAID6 while it is 
>>> > recovering to a space can trigger data corruption.
>>> >
>>> > 2.6.29 works
>>> > 2.6.32 doesn't
>>> > 3.8.0 still doesn't.
>>> >
>>> > I suspect async parity calculations.
>>>
>>> I'll take a look.  I've had cleanups of that code on my backlog for 
>>> "a while now (TM)".
>>
>>
>> Hi Dan,
>>  did you have a chance to have a look?
>>
>> I've been consistently failing to find anything.
>>
>> I have a question though.
>> If we set up a chain of async dma handling via:
>>    ops_run_compute6_2 then ops_bio_drain then ops_run_reconstruct
>>
>> is it possible for the ops_complete_compute callback set up by
>> ops_run_compute6_2 to be called before ops_run_reconstruct has been 
>> scheduled or run?
>
> In the absence of a dma engine we never run asynchronously, so we will
> *always* call ops_complete_compute() before ops_run_reconstruct() in 
> the synchronous acse.  This looks confused.  We're certainly leaking 
> an uptodate state prior to the completion of the write.
>
>> If so, there seems to be some room for confusion over the setting for 
>> R5_UPTODATE on blocks that are being computed and then drained to.  
>> Both will try to set the flag, so it could get set before reconstruction has run.
>>
>> I can't see that this would cause a problem, but then I'm not 
>> entirely sure why we clear R5_UPTODATE when we set R5_Wantdrain.
>
> Let me see what problems this could be causing.  I'm thinking we 
> should be protected by the global ->reconstruct_state, but something 
> is telling me we do depend on R5_UPTODATE being consistent with the 
> ongoing stripe operation.
>

Can you share your exact test scripts?  I'm having a hard time reproducing this with something like:

echo 100000 > /proc/sys/dev/raid/speed_limit_min
mdadm --add /dev/md0 /dev/sd[bc]; dd if=urandom.dump of=/dev/md0 bs=1024M oflag=sync

This is a 7-drive raid6 array.

[-- Attachment #2: RollingHotSpareTwoDriveFailure.sh --]
[-- Type: application/octet-stream, Size: 10095 bytes --]

#!/bin/bash
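# Rolling hot-spare / two-drive-failure test for a RAID 6 md array (/dev/md0).
# In a loop: pick two random member disks, fail and remove their data
# partition (partition 6), add them back one at a time, and let the array
# rebuild at a raised sync speed after each add.
# Array state is logged under /volumes/RAIDCONF/, progress to Iterations.txt.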

if [ -e /volumes/RAIDCONF ]
then
	echo "Debug directory is present."
else
	echo "Debug directory is not present. Created one"
	mkdir -p /volumes/RAIDCONF
fi

ld_name="/dev/md0"
# Check if LD exists
if [ -e $ld_name ]
then
	echo "LD $ld_name exists"
else
	echo "LD $ld_name does not exist"
	exit
fi
one=1
two=2
three=3
four=4
six=6

echo "`date` : Initial State" >> /volumes/RAIDCONF/raid_conf_info.txt
echo "`date` : `mdadm -D $ld_name`" >> /volumes/RAIDCONF/raid_conf_info.txt

md_name=`basename $ld_name`

# Check if raid is initializing
md_init=`cat /proc/mdstat | grep $md_name -A 3 | grep -o resync`

if [ -z $md_init ]
then
	echo "$md_name is online"
else
	echo "RAID is Initializing. Speeding up the process"
	echo 100000 > /sys/devices/virtual/block/$md_name/md/sync_speed_max
	echo 100000 > /sys/devices/virtual/block/$md_name/md/sync_speed_min

	# Wait till Initialization completes
	while [ 1 ]
	do 
		init_still=`cat /proc/mdstat | grep $md_name -A 3 | grep -o [0-9]*.[0-9]*%`
		if [ -z $init_still ]
		then
			echo "Initialzing of $md_name completed"
			break
		else
			echo "Initializing completed so far: $init_still"
			sleep 5
		fi
	done
fi

# Reset back sync speed
echo 1000 > /sys/devices/virtual/block/$md_name/md/sync_speed_max
echo 1000 > /sys/devices/virtual/block/$md_name/md/sync_speed_min
COUNT=10
i=1;
# Loop forever
while [ 1 ]
do
echo "###########################################################"
echo "Rolling Hot Spare Test two drive removal Running for Iteration : $i" >> Iterations.txt
echo "Rolling Hot Spare Test two drive removal Running for Iteration : $i"
echo "###########################################################"
	#calculating length of the array
	arrlen=`mdadm -D $ld_name | grep "Active Devices" | awk '{print $4}'`
	echo "Original Lenght of array: $arrlen";
	let arrlen=$arrlen-1;
	echo "Length of Array: $arrlen";

	#PD list in an array
	pds=`cat /proc/mdstat | grep $md_name | grep -o 'sd[a-z]' | awk '{print "/dev/"$1}'`
	arr=($pds)


	#Random Number Generation
	ran1=`grep -m1 -ao '[0-'$arrlen']' /dev/urandom | head -1`
	echo "Randon Number1: $ran1";
	ran2=`grep -m1 -ao '[0-'$arrlen']' /dev/urandom | head -1`
	#echo "Temp Randon Number2: $ran2";
	while [ $ran1 -eq $ran2 ]
	do
	ran2=`grep -m1 -ao '[0-'$arrlen']' /dev/urandom | head -1`
	done
	echo "Randon Number2: $ran2";

##########  REMOVING TWO RANDOM DRIVES  ###############################################################################################
echo "Random Drive1 to be removed: ${arr[$ran1]}"
echo "Random Drive1 to be removed: ${arr[$ran1]}" >> Iterations.txt

        # Removing drive1 Randomly
		echo "`date` : Iteration : $COUNT" >> /volumes/RAIDCONF/raid_conf_info.txt
		# Find the scsi address of the current disk
		scsi_address1=`lsscsi | grep ${arr[$ran1]} | grep -o [0-9]*:[0-9]*:[0-9]*:[0-9]*`
		disk1=`basename ${arr[$ran1]}`
		echo "Disk Name: $disk1";
		td_name=`echo "$disk1" | cut -c 3`
		
		#Removing data partition
		echo "Removing data partition.."
		mdadm --manage $ld_name --set-faulty  ${arr[$ran1]}$six
		sleep 10

		faulty=`mdadm -D $ld_name | grep "faulty spare" | wc -l` 
		echo " faulty = $faulty"

		if [ $faulty -ne 1 ]; then
		echo "Number of failed disk is more the expected"
		exit 1;
		fi

		faulty=0
		isremoved=1
		while [[ $isremoved -ne 0 ]]
		do
		echo "in while"		
		faulty=`mdadm -D $ld_name | grep "faulty spare" | grep ${arr[$ran1]}$six | wc -l`
		echo " in - faulty = $faulty"
		if [ $faulty -eq 1 ]; then
		mdadm --manage $ld_name --remove ${arr[$ran1]}$six
		sleep 3
		fi
		isremoved=$faulty
		echo "isremoved = $isremoved"
		done
		
		sleep 2;
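
		# NOTE: $slot1 and $slot2 are never assigned in this script, so the
		# Slot fields written to the log stay empty; the scsi_address and
		# device name identify the removed disk.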

		echo "`date` : After Removing the disk : Slot [ $slot1 ] Name[ ${arr[$ran1]} ] scsi_address[ $scsi_address1 ]" >> /volumes/RAIDCONF/raid_conf_info.txt
		echo "`date` : `mdadm -D $ld_name`" >> /volumes/RAIDCONF/raid_conf_info.txt

		
echo "Random Drive2 to be removed: ${arr[$ran2]}"		
echo "Random Drive2 to be removed: ${arr[$ran2]}" >> Iterations.txt		
		 # Removing drive2 Randomly
		echo "`date` : Iteration : $COUNT" >> /volumes/RAIDCONF/raid_conf_info.txt
		# Find the scsi address of the current disk
		scsi_address2=`lsscsi | grep ${arr[$ran2]} | grep -o [0-9]*:[0-9]*:[0-9]*:[0-9]*`
		disk2=`basename ${arr[$ran2]}`
		echo "Disk Name: $disk2";
		
		td_name=`echo "$disk2" | cut -c 3`
		
		#Removing data partition
		echo "Removing data partition.."
		mdadm --manage $ld_name --set-faulty  ${arr[$ran2]}$six 
		sleep 10
		
		faulty=`mdadm -D $ld_name | grep "faulty spare" | wc -l` 
		echo " faulty = $faulty"

		if [ $faulty -ne 1 ]; then
		echo "Number of failed disk is more the expected"
		exit 1;
		fi

		faulty=0
		isremoved=1
		while [[ $isremoved -ne 0 ]]
		do
		echo "in while"		
		faulty=`mdadm -D $ld_name | grep "faulty spare" | grep ${arr[$ran2]}$six | wc -l`
		echo " in - faulty = $faulty"
		if [ $faulty -eq 1 ]; then
		mdadm --manage $ld_name --remove ${arr[$ran2]}$six
		sleep 3
		fi
		isremoved=$faulty
		echo "isremoved = $isremoved"
		done

		echo "`date` : After Removing the disk : Slot [ $slot2 ] Name[ ${arr[$ran2]} ] scsi_address[ $scsi_address2 ]" >> /volumes/RAIDCONF/raid_conf_info.txt
		echo "`date` : `mdadm -D $ld_name`" >> /volumes/RAIDCONF/raid_conf_info.txt

#######################################################################################################################################		

####  Adding First removed drive  #############################################################################################

		# Add back the device removed first after rebuild completes
		sleep 5;
		mdadm  $ld_name -a  ${arr[$ran1]}$six

		# Wait for some time to get it added as spare
		sleep 5;
		echo "`date` :  After Disk Added at Slot [ $slot1 ] scsi_address [$scsi_address1]">> /volumes/RAIDCONF/raid_conf_info.txt
        echo "`date` : `mdadm -D $ld_name`" >> /volumes/RAIDCONF/raid_conf_info.txt
		
		# Check if md starts rebuilding
		while [ 1 ]
		do
			md_recovery=`cat /proc/mdstat | grep $md_name -A 3 | grep -o recovery`
			if [ -z $md_recovery ]
			then
				echo "$md_name did not start rebuilding. sleeping and checking again"
				sleep 5
			else
				break
			fi
		done

		sleep 5
		echo "`date` :	After Rebuild Started">> /volumes/RAIDCONF/raid_conf_info.txt
                echo "`date` : `mdadm -D $ld_name`" >> /volumes/RAIDCONF/raid_conf_info.txt
		
		echo "RAID is Rebuilding. Speeding up the process"
		echo 100000 > /sys/devices/virtual/block/$md_name/md/sync_speed_max
		echo 100000 > /sys/devices/virtual/block/$md_name/md/sync_speed_min

		# Wait till rebuilding
		while [ 1 ]
		do 
			rb_still=`cat /proc/mdstat | grep $md_name -A 3 | grep -o [0-9]*.[0-9]*%`
			if [ -z $rb_still ]
			then
				echo "`date +%H:%M:%S`:Rebuild of $md_name completed"
				break
			else
				echo "Rebuild completed so far: $rb_still"
				sleep 5
			fi
		done
	
		sleep 5
		echo "`date` :	After Rebuild complete">> /volumes/RAIDCONF/raid_conf_info.txt
                echo "`date` : `mdadm -D $ld_name`" >> /volumes/RAIDCONF/raid_conf_info.txt

		
##################################################################################################################################		
	# Reset back sync speed
	echo 1000 > /sys/devices/virtual/block/$md_name/md/sync_speed_max
	echo 1000 > /sys/devices/virtual/block/$md_name/md/sync_speed_min

	echo "Wait after rebuilding disk 1"
	sleep 5	

		
########  Adding back the Second Removed Drive ###################################################################################
 
		mdadm  $ld_name -a  ${arr[$ran2]}$six

		# Wait for some time to get it added as spare
		sleep 5;
		echo "`date` :  After Disk Added at Slot [ $slot2 ] scsi_address [$scsi_address2]">> /volumes/RAIDCONF/raid_conf_info.txt
        echo "`date` : `mdadm -D $ld_name`" >> /volumes/RAIDCONF/raid_conf_info.txt		
		
		# Check if md starts rebuilding
		while [ 1 ]
		do
			md_recovery=`cat /proc/mdstat | grep $md_name -A 3 | grep -o recovery`
			if [ -z $md_recovery ]
			then
				echo "$md_name did not start rebuilding. sleeping and checking again"
				sleep 5
			else
				break
			fi
		done

		sleep 5
		echo "`date` :	After Rebuild Started">> /volumes/RAIDCONF/raid_conf_info.txt
                echo "`date` : `mdadm -D $ld_name`" >> /volumes/RAIDCONF/raid_conf_info.txt
		
		echo "RAID is Rebuilding. Speeding up the process"
		echo 100000 > /sys/devices/virtual/block/$md_name/md/sync_speed_max
		echo 100000 > /sys/devices/virtual/block/$md_name/md/sync_speed_min

		# Wait till rebuilding
		while [ 1 ]
		do 
			rb_still=`cat /proc/mdstat | grep $md_name -A 3 | grep -o [0-9]*.[0-9]*%`
			if [ -z $rb_still ]
			then
				echo "`date +%H:%M:%S`:Rebuild of $md_name completed"
				break
			else
				echo "Rebuild completed so far: $rb_still"
				sleep 5
			fi
		done
	
		sleep 5
		echo "`date` :	After Rebuild complete">> /volumes/RAIDCONF/raid_conf_info.txt
                echo "`date` : `mdadm -D $ld_name`" >> /volumes/RAIDCONF/raid_conf_info.txt

		
##################################################################################################################################		
	# Reset back sync speed
	echo 1000 > /sys/devices/virtual/block/$md_name/md/sync_speed_max
	echo 1000 > /sys/devices/virtual/block/$md_name/md/sync_speed_min

	echo "=============================================================================================" >>/volumes/RAIDCONF/raid_conf_info.txt
	((COUNT=$COUNT+1))
	
echo "Rolling Hot Spare Test two drive removal iteration $i complete....">> Iterations.txt
echo "Rolling Hot Spare Test two drive removal iteration $i complete...."
let i=$i+1;
echo "Sleeping for 300 seconds.."
#####################################################################################################################################	
sleep 20;
done #while

^ permalink raw reply	[flat|nested] 18+ messages in thread

* RE: raid6 - data integrity issue - data mis-compare on rebuilding RAID 6 - with 100 Mb resync speed.
  2014-05-20  0:22         ` Dan Williams
  2014-05-22 11:47           ` Manibalan P
@ 2014-05-22 11:52           ` Manibalan P
  1 sibling, 0 replies; 18+ messages in thread
From: Manibalan P @ 2014-05-22 11:52 UTC (permalink / raw)
  To: linux-raid; +Cc: Dan Williams, NeilBrown

>Can you share your exact test scripts?  I'm having a hard time reproducing this with something like:

>echo 100000 > /proc/sys/dev/raid/speed_limit_min
>mdadm --add /dev/md0 /dev/sd[bc]; dd if=urandom.dump of=/dev/md0 
>bs=1024M oflag=sync

Below is the script that simulates a two-drive failure in RAID 6.
For running IO we used Dit32, but any data-verification tool that
writes a known pattern and then reads the data back and verifies it will do.
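
As a rough illustration only (not the tool we used -- Dit32 does the
equivalent per file; the mount point, file count and sizes below are
assumptions), such a write/verify pass could look like this:

#-------------------- Illustrative write/verify pass (sketch only) -------------------
#!/bin/bash
# In the actual test, the write cycle runs while the array is degraded and the
# verify cycle runs after the rebuild completes.

MNT=/mnt/md0test          # assumed mount point of the filesystem on top of /dev/md0
PATTERN=/tmp/pattern.bin  # scratch file holding the known pattern

# Build a 512 KB known pattern once
dd if=/dev/urandom of=$PATTERN bs=64k count=8 2>/dev/null

# Write cycle: copy the pattern with synchronous IO
for n in `seq 1 100`
do
	dd if=$PATTERN of=$MNT/file$n bs=64k oflag=sync 2>/dev/null
done

# Verify cycle: read each copy back and compare against the pattern
for n in `seq 1 100`
do
	if ! cmp -s $PATTERN $MNT/file$n
	then
		echo "`date` : MISCOMPARE on $MNT/file$n"
	fi
done
#--------------------------------------------------------------------------------------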

#-------------------- Script which simulates a 2-drive failure in a RAID 6 array - the md array name is set in the script below [ e.g. /dev/md0 ] --------------------------------------------------------------------------------------------------
#!/bin/bash

if [ -e /volumes/RAIDCONF ]
then
	echo "Debug directory is present."
else
	echo "Debug directory is not present. Created one"
	mkdir -p /volumes/RAIDCONF
fi

ld_name="/dev/md0"
# Check if LD exists
if [ -e $ld_name ]
then
	echo "LD $ld_name exists"
else
	echo "LD $ld_name does not exist"
	exit
fi
one=1
two=2
three=3
four=4
six=6

echo "`date` : Initial State" >> /volumes/RAIDCONF/raid_conf_info.txt
echo "`date` : `mdadm -D $ld_name`" >> /volumes/RAIDCONF/raid_conf_info.txt

md_name=`basename $ld_name`

# Check if raid is initializing
md_init=`cat /proc/mdstat | grep $md_name -A 3 | grep -o resync`

if [ -z "$md_init" ]
then
	echo "$md_name is online"
else
	echo "RAID is Initializing. Speeding up the process"
	echo 100000 > /sys/devices/virtual/block/$md_name/md/sync_speed_max
	echo 100000 > /sys/devices/virtual/block/$md_name/md/sync_speed_min

	# Wait till Initialization completes
	while [ 1 ]
	do 
		init_still=`cat /proc/mdstat | grep $md_name -A 3 | grep -o '[0-9]*\.[0-9]*%'`
		if [ -z "$init_still" ]
		then
			echo "Initializing of $md_name completed"
			break
		else
			echo "Initializing completed so far: $init_still"
			sleep 5
		fi
	done
fi

# Reset back sync speed
echo 1000 > /sys/devices/virtual/block/$md_name/md/sync_speed_max
echo 1000 > /sys/devices/virtual/block/$md_name/md/sync_speed_min
COUNT=10
i=1;
# Loop forever
while [ 1 ]
do
echo "###########################################################"
echo "Rolling Hot Spare Test two drive removal Running for Iteration : $i" >> Iterations.txt
echo "Rolling Hot Spare Test two drive removal Running for Iteration : $i"
echo "###########################################################"
	#calculating length of the array
	arrlen=`mdadm -D $ld_name | grep "Active Devices" | awk '{print $4}'`
	echo "Original Lenght of array: $arrlen";
	let arrlen=$arrlen-1;
	echo "Length of Array: $arrlen";

	#PD list in an array
	pds=`cat /proc/mdstat | grep $md_name | grep -o 'sd[a-z]' | awk '{print "/dev/"$1}'`
	arr=($pds)


	#Random Number Generation
	ran1=`grep -m1 -ao '[0-'$arrlen']' /dev/urandom | head -1`
	echo "Random Number 1: $ran1";
	ran2=`grep -m1 -ao '[0-'$arrlen']' /dev/urandom | head -1`
	#echo "Temp Random Number 2: $ran2";
	while [ $ran1 -eq $ran2 ]
	do
	ran2=`grep -m1 -ao '[0-'$arrlen']' /dev/urandom | head -1`
	done
	echo "Random Number 2: $ran2";

##########  REMOVING TWO RANDOM DRIVES  ###############################################################################################
echo "Random Drive1 to be removed: ${arr[$ran1]}"
echo "Random Drive1 to be removed: ${arr[$ran1]}" >> Iterations.txt

        # Removing drive1 Randomly
		echo "`date` : Iteration : $COUNT" >> /volumes/RAIDCONF/raid_conf_info.txt
		# Find the scsi address of the current disk
		scsi_address1=`lsscsi | grep ${arr[$ran1]} | grep -o '[0-9]*:[0-9]*:[0-9]*:[0-9]*'`
		disk1=`basename ${arr[$ran1]}`
		echo "Disk Name: $disk1";
		td_name=`echo "$disk1" | cut -c 3`
		
		#Removing data partition
		echo "Removing data partition.."
		mdadm --manage $ld_name --set-faulty  ${arr[$ran1]}$six
		sleep 10

		faulty=`mdadm -D $ld_name | grep "faulty spare" | wc -l` 
		echo " faulty = $faulty"

		if [ $faulty -ne 1 ]; then
		echo "Number of failed disk is more the expected"
		exit 1;
		fi

		faulty=0
		isremoved=1
		while [[ $isremoved -ne 0 ]]
		do
		echo "in while"		
		faulty=`mdadm -D $ld_name | grep "faulty spare" | grep ${arr[$ran1]}$six | wc -l`
		echo " in - faulty = $faulty"
		if [ $faulty -eq 1 ]; then
		mdadm --manage $ld_name --remove ${arr[$ran1]}$six
		sleep 3
		fi
		isremoved=$faulty
		echo "isremoved = $isremoved"
		done
		
		sleep 2;

		echo "`date` : After Removing the disk : Slot [ $slot1 ] Name[ ${arr[$ran1]} ] scsi_address[ $scsi_address1 ]" >> /volumes/RAIDCONF/raid_conf_info.txt
		echo "`date` : `mdadm -D $ld_name`" >> /volumes/RAIDCONF/raid_conf_info.txt

		
echo "Random Drive2 to be removed: ${arr[$ran2]}"		
echo "Random Drive2 to be removed: ${arr[$ran2]}" >> Iterations.txt		
		 # Removing drive2 Randomly
		echo "`date` : Iteration : $COUNT" >> /volumes/RAIDCONF/raid_conf_info.txt
		# Find the scsi address of the current disk
		scsi_address2=`lsscsi | grep ${arr[$ran2]} | grep -o '[0-9]*:[0-9]*:[0-9]*:[0-9]*'`
		disk2=`basename ${arr[$ran2]}`
		echo "Disk Name: $disk2";
		
		td_name=`echo "$disk2" | cut -c 3`
		
		#Removing data partition
		echo "Removing data partition.."
		mdadm --manage $ld_name --set-faulty  ${arr[$ran2]}$six 
		sleep 10
		
		faulty=`mdadm -D $ld_name | grep "faulty spare" | wc -l` 
		echo " faulty = $faulty"

		if [ $faulty -ne 1 ]; then
		echo "Number of failed disk is more the expected"
		exit 1;
		fi

		faulty=0
		isremoved=1
		while [[ $isremoved -ne 0 ]]
		do
		echo "in while"		
		faulty=`mdadm -D $ld_name | grep "faulty spare" | grep ${arr[$ran2]}$six | wc -l`
		echo " in - faulty = $faulty"
		if [ $faulty -eq 1 ]; then
		mdadm --manage $ld_name --remove ${arr[$ran2]}$six
		sleep 3
		fi
		isremoved=$faulty
		echo "isremoved = $isremoved"
		done

		echo "`date` : After Removing the disk : Slot [ $slot2 ] Name[ ${arr[$ran2]} ] scsi_address[ $scsi_address2 ]" >> /volumes/RAIDCONF/raid_conf_info.txt
		echo "`date` : `mdadm -D $ld_name`" >> /volumes/RAIDCONF/raid_conf_info.txt

#######################################################################################################################################		

####  Adding First removed drive  #############################################################################################

		# Add back the device removed first after rebuild completes
		sleep 5;
		mdadm  $ld_name -a  ${arr[$ran1]}$six

		# Wait for some time to get it added as spare
		sleep 5;
		echo "`date` :  After Disk Added at Slot [ $slot1 ] scsi_address [$scsi_address1]">> /volumes/RAIDCONF/raid_conf_info.txt
        echo "`date` : `mdadm -D $ld_name`" >> /volumes/RAIDCONF/raid_conf_info.txt
		
		# Check if md starts rebuilding
		while [ 1 ]
		do
			md_recovery=`cat /proc/mdstat | grep $md_name -A 3 | grep -o recovery`
			if [ -z "$md_recovery" ]
			then
				echo "$md_name did not start rebuilding. sleeping and checking again"
				sleep 5
			else
				break
			fi
		done

		sleep 5
		echo "`date` :	After Rebuild Started">> /volumes/RAIDCONF/raid_conf_info.txt
                echo "`date` : `mdadm -D $ld_name`" >> /volumes/RAIDCONF/raid_conf_info.txt
		
		echo "RAID is Rebuilding. Speeding up the process"
		echo 100000 > /sys/devices/virtual/block/$md_name/md/sync_speed_max
		echo 100000 > /sys/devices/virtual/block/$md_name/md/sync_speed_min

		# Wait till rebuilding
		while [ 1 ]
		do 
			rb_still=`cat /proc/mdstat | grep $md_name -A 3 | grep -o '[0-9]*\.[0-9]*%'`
			if [ -z "$rb_still" ]
			then
				echo "`date +%H:%M:%S`:Rebuild of $md_name completed"
				break
			else
				echo "Rebuild completed so far: $rb_still"
				sleep 5
			fi
		done
	
		sleep 5
		echo "`date` :	After Rebuild complete">> /volumes/RAIDCONF/raid_conf_info.txt
                echo "`date` : `mdadm -D $ld_name`" >> /volumes/RAIDCONF/raid_conf_info.txt

		
##################################################################################################################################		
	# Reset back sync speed
	echo 1000 > /sys/devices/virtual/block/$md_name/md/sync_speed_max
	echo 1000 > /sys/devices/virtual/block/$md_name/md/sync_speed_min

	echo "Wait after rebuilding disk 1"
	sleep 5	

		
########  Adding back the Second Removed Drive ###################################################################################
 
		mdadm  $ld_name -a  ${arr[$ran2]}$six

		# Wait for some time to get it added as spare
		sleep 5;
		echo "`date` :  After Disk Added at Slot [ $slot2 ] scsi_address [$scsi_address2]">> /volumes/RAIDCONF/raid_conf_info.txt
        echo "`date` : `mdadm -D $ld_name`" >> /volumes/RAIDCONF/raid_conf_info.txt		
		
		# Check if md starts rebuilding
		while [ 1 ]
		do
			md_recovery=`cat /proc/mdstat | grep $md_name -A 3 | grep -o recovery`
			if [ -z "$md_recovery" ]
			then
				echo "$md_name did not start rebuilding. sleeping and checking again"
				sleep 5
			else
				break
			fi
		done

		sleep 5
		echo "`date` :	After Rebuild Started">> /volumes/RAIDCONF/raid_conf_info.txt
                echo "`date` : `mdadm -D $ld_name`" >> /volumes/RAIDCONF/raid_conf_info.txt
		
		echo "RAID is Rebuilding. Speeding up the process"
		echo 100000 > /sys/devices/virtual/block/$md_name/md/sync_speed_max
		echo 100000 > /sys/devices/virtual/block/$md_name/md/sync_speed_min

		# Wait till rebuilding
		while [ 1 ]
		do 
			rb_still=`cat /proc/mdstat | grep $md_name -A 3 | grep -o '[0-9]*\.[0-9]*%'`
			if [ -z "$rb_still" ]
			then
				echo "`date +%H:%M:%S`:Rebuild of $md_name completed"
				break
			else
				echo "Rebuild completed so far: $rb_still"
				sleep 5
			fi
		done
	
		sleep 5
		echo "`date` :	After Rebuild complete">> /volumes/RAIDCONF/raid_conf_info.txt
                echo "`date` : `mdadm -D $ld_name`" >> /volumes/RAIDCONF/raid_conf_info.txt

		
##################################################################################################################################		
	# Reset back sync speed
	echo 1000 > /sys/devices/virtual/block/$md_name/md/sync_speed_max
	echo 1000 > /sys/devices/virtual/block/$md_name/md/sync_speed_min

	echo "=============================================================================================" >>/volumes/RAIDCONF/raid_conf_info.txt
	((COUNT=$COUNT+1))
	
echo "Rolling Hot Spare Test two drive removal iteration $i complete....">> Iterations.txt
echo "Rolling Hot Spare Test two drive removal iteration $i complete...."
let i=$i+1;
echo "Sleeping for 300 seconds.."
#####################################################################################################################################	
sleep 20;
done #while  
#--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------

Manibalan.


-----Original Message-----
From: dan.j.williams@gmail.com [mailto:dan.j.williams@gmail.com] On Behalf Of Dan Williams
Sent: Tuesday, May 20, 2014 5:52 AM
To: NeilBrown
Cc: Manibalan P; linux-raid
Subject: Re: raid6 - data integrity issue - data mis-compare on rebuilding RAID 6 - with 100 Mb resync speed.

On Fri, May 16, 2014 at 11:11 AM, Dan Williams <dan.j.williams@intel.com> wrote:
> On Mon, May 5, 2014 at 12:21 AM, NeilBrown <neilb@suse.de> wrote:
>> On Wed, 23 Apr 2014 10:02:00 -0700 Dan Williams 
>> <dan.j.williams@intel.com>
>> wrote:
>>
>>> On Wed, Apr 23, 2014 at 12:07 AM, NeilBrown <neilb@suse.de> wrote:
>>> > On Fri, 11 Apr 2014 17:41:12 +0530 "Manibalan P" 
>>> > <pmanibalan@amiindia.co.in>
>>> > wrote:
>>> >
>>> >> Hi Neil,
>>> >>
>>> >> Also, I found the data corruption issue on RHEL 6.5.
>>> >>
>>> >> For your kind attention, I up-ported the md code [raid5.c + 
>>> >> raid5.h] from FC11 kernel to CentOS 6.4, and there is no 
>>> >> mis-compare with the up-ported code.
>>> >
>>> > This narrows it down to between 2.6.29 and 2.6.32 - is that correct?
>>> >
>>> > So it is probably the change to RAID6 to support async parity calculations.
>>> >
>>> > Looking at the code always makes my head spin.
>>> >
>>> > Dan : have you any ideas?
>>> >
>>> > It seems that writing to a double-degraded RAID6 while it is 
>>> > recovering to a spare can trigger data corruption.
>>> >
>>> > 2.6.29 works
>>> > 2.6.32 doesn't
>>> > 3.8.0 still doesn't.
>>> >
>>> > I suspect async parity calculations.
>>>
>>> I'll take a look.  I've had cleanups of that code on my backlog for 
>>> "a while now (TM)".
>>
>>
>> Hi Dan,
>>  did you have a chance to have a look?
>>
>> I've been consistently failing to find anything.
>>
>> I have a question though.
>> If we set up a chain of async dma handling via:
>>    ops_run_compute6_2 then ops_bio_drain then ops_run_reconstruct
>>
>> is it possible for the ops_complete_compute callback set up by
>> ops_run_compute6_2 to be called before ops_run_reconstruct has been 
>> scheduled or run?
>
> In the absence of a dma engine we never run asynchronously, so we will
> *always* call ops_complete_compute() before ops_run_reconstruct() in 
> the synchronous case.  This looks confused.  We're certainly leaking 
> an uptodate state prior to the completion of the write.
>
>> If so, there seems to be some room for confusion over the setting for 
>> R5_UPTODATE on blocks that are being computed and then drained to.  
>> Both will try to set the flag, so it could get set before reconstruction has run.
>>
>> I can't see that this would cause a problem, but then I'm not 
>> entirely sure why we clear R5_UPTODATE when we set R5_Wantdrain.
>
> Let me see what problems this could be causing.  I'm thinking we 
> should be protected by the global ->reconstruct_state, but something 
> is telling me we do depend on R5_UPTODATE being consistent with the 
> ongoing stripe operation.
>

Can you share your exact test scripts?  I'm having a hard time reproducing this with something like:

echo 100000 > /proc/sys/dev/raid/speed_limit_min
mdadm --add /dev/md0 /dev/sd[bc]; dd if=urandom.dump of=/dev/md0 bs=1024M oflag=sync

This is a 7-drive raid6 array.

^ permalink raw reply	[flat|nested] 18+ messages in thread

* RE: raid6 - data integrity issue - data mis-compare on rebuilding RAID 6 - with 100 Mb resync speed.
@ 2014-03-24 13:15 Manibalan P
  0 siblings, 0 replies; 18+ messages in thread
From: Manibalan P @ 2014-03-24 13:15 UTC (permalink / raw)
  To: linux-raid; +Cc: neilb

Hi,

I have performed the following tests to narrow down the integrity issue.

1. RAID 6, single drive failure - NO ISSUE
	a. Running IO
	b. mdadm set faulty and remove a drive
	c. mdadm add the drive back
 There is no mis-compare happen in this path.

2. RAID 6, two drive failure - write during Degrade and verify after
rebuild 
	a. remove two drives, to make the RAID array degraded.
	b. now run write IO write cycle, wait till the write cycle
completes
	c. insert the drives back one by one, and wait till the re-build
completes and a RAID array become optimal.
	d. now perform the verification cycle.
There is no mis-compare happened in this path also.

During All my test, the sync_Speed_max and min is set to 100Mb

So, as you referred in your previous mail, the corruption might be
happening only during resync and IO happens in parallel.

Also, I tested with upstream 2.6.32 kernel from git:
"http://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/ -
tags/v2.6.32"
	And I am facing mis-compare issue in this kernel as well.  on
RAID 6, two drive failure with high sync_speed.

Thanks,
Manibalan.

-----Original Message-----
From: NeilBrown [mailto:neilb@suse.de]
Sent: Thursday, March 13, 2014 11:49 AM
To: Manibalan P
Cc: linux-raid@vger.kernel.org
Subject: Re: raid6 - data integrity issue - data mis-compare on
rebuilding RAID 6 - with 100 Mb resync speed.

On Wed, 12 Mar 2014 13:09:28 +0530 "Manibalan P"
<pmanibalan@amiindia.co.in>
wrote:

> >
> >Was the array fully synced before you started the test?
> 
> Yes , IO is started, only after the re-sync is completed.
>  And to add more info,
>              I am facing this mis-compare only with high resync speed 
> (30M to 100M), I ran the same test with resync speed min -10M and max
> - 30M, without any issue. So the  issue has relationship with 
> sync_speed_max / min.

So presumably it is an interaction between recovery and IO.  Maybe if we
write to a stripe that is being recoverred, or recover a stripe that is
being written to, then something gets confused.

I'll have a look to see what I can find.

Thanks,
NeilBrown

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: raid6 - data integrity issue - data mis-compare on rebuilding RAID 6 - with 100 Mb resync speed.
  2014-03-12  7:39 Manibalan P
@ 2014-03-13  6:19 ` NeilBrown
  0 siblings, 0 replies; 18+ messages in thread
From: NeilBrown @ 2014-03-13  6:19 UTC (permalink / raw)
  To: Manibalan P; +Cc: linux-raid

[-- Attachment #1: Type: text/plain, Size: 744 bytes --]

On Wed, 12 Mar 2014 13:09:28 +0530 "Manibalan P" <pmanibalan@amiindia.co.in>
wrote:

> >
> >Was the array fully synced before you started the test?
> 
> Yes , IO is started, only after the re-sync is completed.
>  And to add more info,
>              I am facing this mis-compare only with high resync speed
> (30M to 100M), I ran the same test with resync speed min -10M and max -
> 30M, without any issue. So the  issue has relationship with
> sync_speed_max / min.

So presumably it is an interaction between recovery and IO.  Maybe if we
write to a stripe that is being recoverred, or recover a stripe that is being
written to, then something gets confused.

I'll have a look to see what I can find.

Thanks,
NeilBrown

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 828 bytes --]

^ permalink raw reply	[flat|nested] 18+ messages in thread

* RE: raid6 - data integrity issue - data mis-compare on rebuilding RAID 6 - with 100 Mb resync speed.
@ 2014-03-12  7:39 Manibalan P
  2014-03-13  6:19 ` NeilBrown
  0 siblings, 1 reply; 18+ messages in thread
From: Manibalan P @ 2014-03-12  7:39 UTC (permalink / raw)
  To: linux-raid; +Cc: NeilBrown

Hi,

>I don't know what kernel "CentOS 6.4" runs.  Please report the actual
>kernel version as well as distro details.

The kernel version is 2.6.32; the CentOS distribution kernel is
2.6.32-358.23.2.el6.x86_64 #1 SMP x86_64 GNU/Linux.

>I know nothing about "dit32" and so cannot easily interpret the output.
>Is it saying that just a few bytes were wrong?

It is not just a few bytes of corruption; it looks like a number of
sectors are corrupted (for example, 40 sectors).  dit32 writes a pattern
of IO and, after each write cycle, reads the data back and verifies it.
The data written at the reported LBA is itself corrupted; in other words,
this looks like write corruption.

>
>Was the array fully synced before you started the test?

Yes , IO is started, only after the re-sync is completed.
 And to add more info,
             I am facing this mis-compare only with high resync speed
(30M to 100M), I ran the same test with resync speed min -10M and max -
30M, without any issue. So the  issue has relationship with
sync_speed_max / min.

>
>I can't think of anything else that might cause an inconsistency.  I test
>the RAID6 recovery code from time to time and it always works flawlessly
>for me.

Do you suggest any IO tool or test to ensure data integrity?

One more thing I would like to bring to your attention: I ran the same IO
test on an Ubuntu 13 system (Linux ubuntu 3.8.0-19-generic #29-Ubuntu SMP
Wed Apr 17 18:16:28 UTC 2013 x86_64 x86_64 x86_64 GNU/Linux) as well, and
I saw the same type of data corruption.

Thanks,
Manibalan.

More Information:

[root@Cento6 ~]# mdadm --version
mdadm - v3.2.5 - 18th May 2012
------------------------------------------------------------------------
-----------------------------
[root@Cento6 ~]# cat /proc/mdstat
Personalities : [raid6] [raid5] [raid4]
md0 : active raid6 sdd6[13] sdg6[11] sdf6[12] sde6[9] sdh6[8] sdc6[10]
sdb6[7]
      26214400 blocks super 1.2 level 6, 64k chunk, algorithm 2 [7/6]
[UUUUUU_]
      [===============>.....]  recovery = 75.2% (3943692/5242880)
finish=0.3min speed=60112K/sec

unused devices: <none>
------------------------------------------------------------------------
-----------------------------
[root@Cento6 ~]# mdadm -Evvvs
/dev/md0:
   MBR Magic : aa55
Partition[0] :     52422656 sectors at         2048 (type 0c)
mdadm: No md superblock detected on /dev/dm-2.
mdadm: No md superblock detected on /dev/dm-1.
mdadm: No md superblock detected on /dev/dm-0.
mdadm: No md superblock detected on /dev/sda2.
mdadm: No md superblock detected on /dev/sda1.
/dev/sda:
   MBR Magic : aa55
Partition[0] :      1024000 sectors at         2048 (type 83)
Partition[1] :    285722624 sectors at      1026048 (type 8e)
/dev/sdd6:
          Magic : a92b4efc
        Version : 1.2
    Feature Map : 0x2
     Array UUID : 6e5e1ed7:5b4bbe23:ae3ce08e:8502c4d5
           Name : initiator:0
  Creation Time : Fri Mar  7 20:33:24 2014
     Raid Level : raid6
   Raid Devices : 7

 Avail Dev Size : 3891293457 (1855.51 GiB 1992.34 GB)
     Array Size : 26214400 (25.00 GiB 26.84 GB)
  Used Dev Size : 10485760 (5.00 GiB 5.37 GB)
    Data Offset : 8192 sectors
   Super Offset : 8 sectors
Recovery Offset : 9830520 sectors
          State : clean
    Device UUID : 0df3501e:7cdae253:4a6628ba:e0aed1c2

    Update Time : Sat Mar  8 10:00:15 2014
       Checksum : 6b146a09 - correct
         Events : 14853

         Layout : left-symmetric
     Chunk Size : 64K

   Device Role : Active device 6
   Array State : AAAAAAA ('A' == active, '.' == missing)
mdadm: No md superblock detected on /dev/sdd5.
mdadm: No md superblock detected on /dev/sdd4.
mdadm: No md superblock detected on /dev/sdd3.
mdadm: No md superblock detected on /dev/sdd2.
mdadm: No md superblock detected on /dev/sdd1.
/dev/sdd:
   MBR Magic : aa55
Partition[0] :   3907029167 sectors at            1 (type ee)
/dev/sdc6:
          Magic : a92b4efc
        Version : 1.2
    Feature Map : 0x0
     Array UUID : 6e5e1ed7:5b4bbe23:ae3ce08e:8502c4d5
           Name : initiator:0
  Creation Time : Fri Mar  7 20:33:24 2014
     Raid Level : raid6
   Raid Devices : 7

 Avail Dev Size : 3891293457 (1855.51 GiB 1992.34 GB)
     Array Size : 26214400 (25.00 GiB 26.84 GB)
  Used Dev Size : 10485760 (5.00 GiB 5.37 GB)
    Data Offset : 8192 sectors
   Super Offset : 8 sectors
          State : clean
    Device UUID : 5304a667:f7ff5099:4d438d70:6d4d7aed

    Update Time : Sat Mar  8 10:00:15 2014
       Checksum : da4f1bdd - correct
         Events : 14853

         Layout : left-symmetric
     Chunk Size : 64K

   Device Role : Active device 3
   Array State : AAAAAAA ('A' == active, '.' == missing)
mdadm: No md superblock detected on /dev/sdc5.
mdadm: No md superblock detected on /dev/sdc4.
mdadm: No md superblock detected on /dev/sdc3.
mdadm: No md superblock detected on /dev/sdc2.
mdadm: No md superblock detected on /dev/sdc1.
/dev/sdc:
   MBR Magic : aa55
Partition[0] :   3907029167 sectors at            1 (type ee)
/dev/sdb6:
          Magic : a92b4efc
        Version : 1.2
    Feature Map : 0x0
     Array UUID : 6e5e1ed7:5b4bbe23:ae3ce08e:8502c4d5
           Name : initiator:0
  Creation Time : Fri Mar  7 20:33:24 2014
     Raid Level : raid6
   Raid Devices : 7

 Avail Dev Size : 3891293457 (1855.51 GiB 1992.34 GB)
     Array Size : 26214400 (25.00 GiB 26.84 GB)
  Used Dev Size : 10485760 (5.00 GiB 5.37 GB)
    Data Offset : 8192 sectors
   Super Offset : 8 sectors
          State : clean
    Device UUID : 0042c71b:f2642cec:4455ac44:e941ab66

    Update Time : Sat Mar  8 10:00:15 2014
       Checksum : 2e9bc4f5 - correct
         Events : 14853

         Layout : left-symmetric
     Chunk Size : 64K

   Device Role : Active device 0
   Array State : AAAAAAA ('A' == active, '.' == missing)
mdadm: No md superblock detected on /dev/sdb5.
mdadm: No md superblock detected on /dev/sdb4.
mdadm: No md superblock detected on /dev/sdb3.
mdadm: No md superblock detected on /dev/sdb2.
mdadm: No md superblock detected on /dev/sdb1.
/dev/sdb:
   MBR Magic : aa55
Partition[0] :   3907029167 sectors at            1 (type ee)
/dev/sdg6:
          Magic : a92b4efc
        Version : 1.2
    Feature Map : 0x0
     Array UUID : 6e5e1ed7:5b4bbe23:ae3ce08e:8502c4d5
           Name : initiator:0
  Creation Time : Fri Mar  7 20:33:24 2014
     Raid Level : raid6
   Raid Devices : 7

 Avail Dev Size : 3891293457 (1855.51 GiB 1992.34 GB)
     Array Size : 26214400 (25.00 GiB 26.84 GB)
  Used Dev Size : 10485760 (5.00 GiB 5.37 GB)
    Data Offset : 8192 sectors
   Super Offset : 8 sectors
          State : clean
    Device UUID : b05ea97b:fd15cd87:4a71f688:e5140be8

    Update Time : Sat Mar  8 10:00:15 2014
       Checksum : efc881b6 - correct
         Events : 14853

         Layout : left-symmetric
     Chunk Size : 64K

   Device Role : Active device 4
   Array State : AAAAAAA ('A' == active, '.' == missing)
mdadm: No md superblock detected on /dev/sdg5.
mdadm: No md superblock detected on /dev/sdg4.
mdadm: No md superblock detected on /dev/sdg3.
mdadm: No md superblock detected on /dev/sdg2.
mdadm: No md superblock detected on /dev/sdg1.
/dev/sdg:
   MBR Magic : aa55
Partition[0] :   3907029167 sectors at            1 (type ee)
/dev/sdh6:
          Magic : a92b4efc
        Version : 1.2
    Feature Map : 0x0
     Array UUID : 6e5e1ed7:5b4bbe23:ae3ce08e:8502c4d5
           Name : initiator:0
  Creation Time : Fri Mar  7 20:33:24 2014
     Raid Level : raid6
   Raid Devices : 7

 Avail Dev Size : 3891293457 (1855.51 GiB 1992.34 GB)
     Array Size : 26214400 (25.00 GiB 26.84 GB)
  Used Dev Size : 10485760 (5.00 GiB 5.37 GB)
    Data Offset : 8192 sectors
   Super Offset : 8 sectors
          State : clean
    Device UUID : 7002db82:8feb4355:9c7d788c:b89a2823

    Update Time : Sat Mar  8 10:00:15 2014
       Checksum : 3108d2a - correct
         Events : 14853

         Layout : left-symmetric
     Chunk Size : 64K

   Device Role : Active device 1
   Array State : AAAAAAA ('A' == active, '.' == missing)
mdadm: No md superblock detected on /dev/sdh5.
mdadm: No md superblock detected on /dev/sdh4.
mdadm: No md superblock detected on /dev/sdh3.
mdadm: No md superblock detected on /dev/sdh2.
mdadm: No md superblock detected on /dev/sdh1.
/dev/sdh:
   MBR Magic : aa55
Partition[0] :   3907029167 sectors at            1 (type ee)
/dev/sde6:
          Magic : a92b4efc
        Version : 1.2
    Feature Map : 0x0
     Array UUID : 6e5e1ed7:5b4bbe23:ae3ce08e:8502c4d5
           Name : initiator:0
  Creation Time : Fri Mar  7 20:33:24 2014
     Raid Level : raid6
   Raid Devices : 7

 Avail Dev Size : 3891293457 (1855.51 GiB 1992.34 GB)
     Array Size : 26214400 (25.00 GiB 26.84 GB)
  Used Dev Size : 10485760 (5.00 GiB 5.37 GB)
    Data Offset : 8192 sectors
   Super Offset : 8 sectors
          State : clean
    Device UUID : afc8f016:23c110f2:4a209140:d9c0cef8

    Update Time : Sat Mar  8 10:00:15 2014
       Checksum : bdb1f1cd - correct
         Events : 14853

         Layout : left-symmetric
     Chunk Size : 64K

   Device Role : Active device 2
   Array State : AAAAAAA ('A' == active, '.' == missing)
mdadm: No md superblock detected on /dev/sde5.
mdadm: No md superblock detected on /dev/sde4.
mdadm: No md superblock detected on /dev/sde3.
mdadm: No md superblock detected on /dev/sde2.
mdadm: No md superblock detected on /dev/sde1.
/dev/sde:
   MBR Magic : aa55
Partition[0] :   3907029167 sectors at            1 (type ee)
/dev/sdf6:
          Magic : a92b4efc
        Version : 1.2
    Feature Map : 0x0
     Array UUID : 6e5e1ed7:5b4bbe23:ae3ce08e:8502c4d5
           Name : initiator:0
  Creation Time : Fri Mar  7 20:33:24 2014
     Raid Level : raid6
   Raid Devices : 7

 Avail Dev Size : 3891293457 (1855.51 GiB 1992.34 GB)
     Array Size : 26214400 (25.00 GiB 26.84 GB)
  Used Dev Size : 10485760 (5.00 GiB 5.37 GB)
    Data Offset : 8192 sectors
   Super Offset : 8 sectors
          State : clean
    Device UUID : 62ff3273:a8e1260b:4c0e8ba0:48093e3f

    Update Time : Sat Mar  8 10:00:15 2014
       Checksum : d9737f78 - correct
         Events : 14853

         Layout : left-symmetric
     Chunk Size : 64K

   Device Role : Active device 5
   Array State : AAAAAAA ('A' == active, '.' == missing)
mdadm: No md superblock detected on /dev/sdf5.
mdadm: No md superblock detected on /dev/sdf4.
mdadm: No md superblock detected on /dev/sdf3.
mdadm: No md superblock detected on /dev/sdf2.
mdadm: No md superblock detected on /dev/sdf1.
/dev/sdf:
   MBR Magic : aa55
Partition[0] :   3907029167 sectors at            1 (type ee)
------------------------------------------------------------------------
-----------------------------
[root@Cento6 ~]# mdadm -D /dev/md0
/dev/md0:
        Version : 1.2
  Creation Time : Fri Mar  7 20:33:24 2014
     Raid Level : raid6
     Array Size : 26214400 (25.00 GiB 26.84 GB)
  Used Dev Size : 5242880 (5.00 GiB 5.37 GB)
   Raid Devices : 7
  Total Devices : 7
    Persistence : Superblock is persistent

    Update Time : Sat Mar  8 10:00:32 2014
          State : clean
 Active Devices : 7
Working Devices : 7
 Failed Devices : 0
  Spare Devices : 0

         Layout : left-symmetric
     Chunk Size : 64K

           Name : initiator:0
           UUID : 6e5e1ed7:5b4bbe23:ae3ce08e:8502c4d5
         Events : 14855

    Number   Major   Minor   RaidDevice State
       7       8       22        0      active sync   /dev/sdb6
       8       8      118        1      active sync   /dev/sdh6
       9       8       70        2      active sync   /dev/sde6
      10       8       38        3      active sync   /dev/sdc6
      11       8      102        4      active sync   /dev/sdg6
      12       8       86        5      active sync   /dev/sdf6
      13       8       54        6      active sync   /dev/sdd6
------------------------------------------------------------------------
-----------------------------
[root@Cento6 ~]# tgtadm --mode target --op show
Target 1: iqn.2011-07.world.server:target0
    System information:
        Driver: iscsi
        State: ready
    I_T nexus information:
    LUN information:
        LUN: 0
            Type: controller
            SCSI ID: IET     00010000
            SCSI SN: beaf10
            Size: 0 MB, Block size: 1
            Online: Yes
            Removable media: No
            Prevent removal: No
            Readonly: No
            Backing store type: null
            Backing store path: None
            Backing store flags:
        LUN: 1
            Type: disk
            SCSI ID: IET     00010001
            SCSI SN: beaf11
            Size: 26844 MB, Block size: 512
            Online: Yes
            Removable media: No
            Prevent removal: No
            Readonly: No
            Backing store type: rdwr
            Backing store path: /dev/md0
            Backing store flags:
    Account information:
    ACL information:
        ALL
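
For reference, a target like the one above could be created with commands
along these lines (a sketch only, assuming scsi-target-utils/tgtadm; the
IQN and the backing store are taken from the output above):

tgtadm --lld iscsi --op new --mode target --tid 1 \
       --targetname iqn.2011-07.world.server:target0
tgtadm --lld iscsi --op new --mode logicalunit --tid 1 --lun 1 \
       --backing-store /dev/md0
tgtadm --lld iscsi --op bind --mode target --tid 1 --initiator-address ALL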
------------------------------------------------------------------------
-----------------------------
[root@Cento6 ~]# cat /sys/block/md0/md/sync_speed_max
100000 (local)
[root@Cento6 ~]# cat /sys/block/md0/md/sync_speed_min
100000 (local)


-----Original Message-----
From: NeilBrown [mailto:neilb@suse.de] 
Sent: Tuesday, March 11, 2014 8:34 AM
To: Manibalan P
Cc: linux-raid@vger.kernel.org
Subject: Re: raid6 - data intefrity issue - data mis-compare on
rebuilding RAID 6 - with 100 Mb resync speed.

On Fri, 7 Mar 2014 14:18:59 +0530 "Manibalan P"
<pmanibalan@amiindia.co.in>
wrote:

> Hi,

Hi,
 when posting to vger.kernel.org lists, please don't send HTML mail,
just plain text.
 Because you did, the original email didn't get to the list.

> 
>  
> 
> We are facing a data integrity issue on RAID 6, on the CentOS 6.4 kernel.

I don't know what kernel "CentOS 6.4" runs.  Please report the actual
kernel version as well as distro details.

> 
>  
> 
> Details of the setup:
> 
>  
> 
> 1. 7-drive RAID 6 md device (md0) - capacity 25 GB
> 
> 2. Resync speed max and min set to 100000 (100 MB)
> 
> 3. A script is running to simulate drive failure; this script does the
>    following:
> 
>    a. mdadm sets two random drives on the md faulty, then mdadm removes
>       those drives.
> 
>    b. mdadm adds one drive back and waits for the rebuild to complete,
>       then inserts the next one.
> 
>    c. Wait till the md becomes optimal, then continue the disk-removal
>       cycle again.
> 
> 4. The iSCSI target is configured on "/dev/md0".
> 
> 5. From a Windows server, the md0 target is connected using the
>    Microsoft iSCSI initiator and formatted with NTFS.
> 
> 6. The Dit32 IO tool is running on the formatted volume.
> 
>  
> 
> Issue:
> 
> The Dit32 tool runs IO in multiple threads; in each thread, IO is
> written and then verified.
> 
> On the verification cycle, we are getting a mis-compare. Below is the
> log from the dit32 tool.
> 
>                 
> 
> Thu Mar 06 23:19:31 2014 INFO:  DITNT application started
> 
> Thu Mar 06 23:20:19 2014 INFO:  Test started on Drive D:
> 
>      Dir Sets=8, Dirs per Set=70, Files per Dir=75
> 
>      File Size=512KB
> 
>      Read Only=N, Debug Stamp=Y, Verify During Copy=Y
> 
>      Build I/O Size range=1 to 128 sectors
> 
>      Copy Read I/O Size range=1 to 128 sectors
> 
>      Copy Write I/O Size range=1 to 128 sectors
> 
>      Verify I/O Size range=1 to 128 sectors
> 
> Fri Mar 07 01:28:09 2014 ERROR: Miscompare Found: File 
> "D:\dit\s6\d51\s6d51f37", offset=00048008
> 
>      Expected Data: 06 33 25 01 0240 (dirSet, dirNo, fileNo, elementNo, sectorOffset)
> 
>          Read Data: 05 08 2d 01 0240 (dirSet, dirNo, fileNo, elementNo, sectorOffset)
> 
>      Read Request: offset=00043000, size=00008600
> 
>  
> 
> The following files are attached to this mail for your reference:
> 
> 1. Raid5.c and .h files - the code we are using.
> 
> 2. RollingHotSpareTwoDriveFailure.sh - the script that simulates the
>    two-disk failure.
> 
> 3. dit32log.sav - log file from the dit32 tool.
> 
> 4. s6d31f37 - the file where the corruption happened (hex format).
> 
> 5. CentOS-system-info - md and system info.
> 
>  

I didn't find any "CentOS-system-info" attached.

I know nothing about "dit32" and so cannot easily interpret the output.
Is it saying that just a few bytes were wrong?

Was the array fully synced before you started the test?

I can't think of anything else that might cause an inconsistency.  I test
the RAID6 recovery code from time to time and it always works flawlessly
for me.

NeilBrown



> 
>                 
> 
> Thanks,
> 
> Manibalan.
> 
>  
> 


^ permalink raw reply	[flat|nested] 18+ messages in thread

end of thread, other threads:[~2014-05-22 11:52 UTC | newest]

Thread overview: 18+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2014-04-11 12:11 raid6 - data integrity issue - data mis-compare on rebuilding RAID 6 - with 100 Mb resync speed Manibalan P
2014-04-23  7:07 ` NeilBrown
2014-04-23 17:02   ` Dan Williams
2014-05-05  7:21     ` NeilBrown
2014-05-16 18:11       ` Dan Williams
2014-05-20  0:22         ` Dan Williams
2014-05-22 11:47           ` Manibalan P
2014-05-22 11:52           ` Manibalan P
2014-04-23  9:19 ` Pasi Kärkkäinen
2014-04-23  9:25   ` Manibalan P
2014-04-23  9:30     ` Pasi Kärkkäinen
2014-04-23  9:33       ` Manibalan P
2014-04-23  9:45         ` Pasi Kärkkäinen
2014-04-23  9:59           ` Manibalan P
2014-04-23 10:03             ` Pasi Kärkkäinen
  -- strict thread matches above, loose matches on Subject: below --
2014-03-24 13:15 Manibalan P
2014-03-12  7:39 Manibalan P
2014-03-13  6:19 ` NeilBrown
