All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH] super1: fix sb->max_dev when adding a new disk in linear array
@ 2017-05-12  1:51 Lidong Zhong
  2017-05-12  7:53 ` Coly Li
  2017-05-16  4:28 ` Lidong Zhong
  0 siblings, 2 replies; 8+ messages in thread
From: Lidong Zhong @ 2017-05-12  1:51 UTC (permalink / raw)
  To: linux-raid; +Cc: colyli, Jes.Sorensen

The value of sb->max_dev will always be increased by 1 when adding
a new disk in linear array. It causes an inconsistence between each
disk in the array and the "Array State" value of "mdadm --examine DISK"
is wrong. For example, when adding the first new disk into linear array
it will be:

Array State : RAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
('A' == active, '.' == missing, 'R' == replacing)

Adding the second disk into linear array it will be

Array State : .AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
('A' == active, '.' == missing, 'R' == replacing)

Signed-off-by: Lidong Zhong <lzhong@suse.com>
---
 super1.c | 8 +++++++-
 1 file changed, 7 insertions(+), 1 deletion(-)

diff --git a/super1.c b/super1.c
index 87a74cb..4fb655f 100644
--- a/super1.c
+++ b/super1.c
@@ -1184,8 +1184,10 @@ static int update_super1(struct supertype *st, struct mdinfo *info,
 				break;
 		sb->dev_number = __cpu_to_le32(i);
 		info->disk.number = i;
-		if (max >= __le32_to_cpu(sb->max_dev))
+		if (i >= __le32_to_cpu(sb->max_dev)) {
 			sb->max_dev = __cpu_to_le32(max+1);
+			sb->dev_roles[sb->max_dev] = __cpu_to_le16(MD_DISK_ROLE_FAULTY);
+		}
 
 		random_uuid(sb->device_uuid);
 
@@ -1214,6 +1216,10 @@ static int update_super1(struct supertype *st, struct mdinfo *info,
 		sb->raid_disks = __cpu_to_le32(info->array.raid_disks);
 		sb->dev_roles[info->disk.number] =
 			__cpu_to_le16(info->disk.raid_disk);
+		if (sb->raid_disks+1 >= __le32_to_cpu(sb->max_dev)) {
+			sb->max_dev = __cpu_to_le32(sb->raid_disks+1);
+			sb->dev_roles[sb->max_dev] = __cpu_to_le16(MD_DISK_ROLE_FAULTY);
+		}
 	} else if (strcmp(update, "resync") == 0) {
 		/* make sure resync happens */
 		sb->resync_offset = 0ULL;
-- 
2.12.0


^ permalink raw reply related	[flat|nested] 8+ messages in thread

* Re: [PATCH] super1: fix sb->max_dev when adding a new disk in linear array
  2017-05-12  1:51 [PATCH] super1: fix sb->max_dev when adding a new disk in linear array Lidong Zhong
@ 2017-05-12  7:53 ` Coly Li
  2017-05-15 11:33   ` Lidong Zhong
  2017-05-16  4:28 ` Lidong Zhong
  1 sibling, 1 reply; 8+ messages in thread
From: Coly Li @ 2017-05-12  7:53 UTC (permalink / raw)
  To: Lidong Zhong, linux-raid; +Cc: Jes.Sorensen

On 2017/5/12 上午9:51, Lidong Zhong wrote:
> The value of sb->max_dev will always be increased by 1 when adding
> a new disk in linear array. It causes an inconsistence between each
> disk in the array and the "Array State" value of "mdadm --examine DISK"
> is wrong. For example, when adding the first new disk into linear array
> it will be:
> 
> Array State : RAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
> ('A' == active, '.' == missing, 'R' == replacing)
> 
> Adding the second disk into linear array it will be
> 
> Array State : .AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
> ('A' == active, '.' == missing, 'R' == replacing)
> 
> Signed-off-by: Lidong Zhong <lzhong@suse.com>

The above description does not very well explain why this fix is necessary.

For linear device, there was a race condition and got fixed already.
There are users encountered such bug in old kernel, when they trying to
add a hard disk into a linear device. After the kernel panic caused by
the race issue in old md linear code, they reboot from the fixed kernel,
and find 'R' or '?' marks from Array State output.

After analyze the metadata, we found value of
sb->dev_roles[sb->max_dev-1] is 0, but which should be
MD_DISK_ROLE_FAULTY. The reason for the error is, when adding a disk
into linear device, mdadm first increases sb->max_dev value, but not set
its role value. So the corresponding role value keeps as 0, which is
same as No.0 disk's role value in the linear device.

So mdadm -E will find there are 2 same role value for No.0 disk, and the
first state character of Array State is marked as 'R'. If such a panic
happened multiple times, there are more bogus zero value at tail of
dev_roles[], so the character of Array State is marked as '?'.

Now the kernel part is fixed, but mdadm part is not fixed yet. This
patch is an effort to avoid bogus dev_roles[] value if kernel panic
happens in follow ioctl system call (although the possibility is quite
low and who can tel ...).


> ---
>  super1.c | 8 +++++++-
>  1 file changed, 7 insertions(+), 1 deletion(-)
> 
> diff --git a/super1.c b/super1.c
> index 87a74cb..4fb655f 100644
> --- a/super1.c
> +++ b/super1.c
> @@ -1184,8 +1184,10 @@ static int update_super1(struct supertype *st, struct mdinfo *info,
>  				break;
>  		sb->dev_number = __cpu_to_le32(i);
>  		info->disk.number = i;
> -		if (max >= __le32_to_cpu(sb->max_dev))
> +		if (i >= __le32_to_cpu(sb->max_dev)) {
>  			sb->max_dev = __cpu_to_le32(max+1);
> +			sb->dev_roles[sb->max_dev] = __cpu_to_le16(MD_DISK_ROLE_FAULTY);
> +		}
>  
>  		random_uuid(sb->device_uuid);

The above part makes sense to me.



>  
> @@ -1214,6 +1216,10 @@ static int update_super1(struct supertype *st, struct mdinfo *info,
>  		sb->raid_disks = __cpu_to_le32(info->array.raid_disks);
>  		sb->dev_roles[info->disk.number] =
>  			__cpu_to_le16(info->disk.raid_disk);
> +		if (sb->raid_disks+1 >= __le32_to_cpu(sb->max_dev)) {
> +			sb->max_dev = __cpu_to_le32(sb->raid_disks+1);
> +			sb->dev_roles[sb->max_dev] = __cpu_to_le16(MD_DISK_ROLE_FAULTY);
> +		}
>  	} else if (strcmp(update, "resync") == 0) {
>  		/* make sure resync happens */
>  		sb->resync_offset = 0ULL;
> 

Just a lazy question, could you please explain a little bit why you also
modify here ? Thanks.

Coly




^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH] super1: fix sb->max_dev when adding a new disk in linear array
  2017-05-12  7:53 ` Coly Li
@ 2017-05-15 11:33   ` Lidong Zhong
  0 siblings, 0 replies; 8+ messages in thread
From: Lidong Zhong @ 2017-05-15 11:33 UTC (permalink / raw)
  To: Coly Li; +Cc: linux-raid, Jes.Sorensen



On 05/12/2017 03:53 PM, Coly Li wrote:
> On 2017/5/12 上午9:51, Lidong Zhong wrote:
>> The value of sb->max_dev will always be increased by 1 when adding
>> a new disk in linear array. It causes an inconsistence between each
>> disk in the array and the "Array State" value of "mdadm --examine DISK"
>> is wrong. For example, when adding the first new disk into linear array
>> it will be:
>>
>> Array State : RAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
>> ('A' == active, '.' == missing, 'R' == replacing)
>>
>> Adding the second disk into linear array it will be
>>
>> Array State : .AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
>> ('A' == active, '.' == missing, 'R' == replacing)
>>
>> Signed-off-by: Lidong Zhong<lzhong@suse.com>

Hi Coly,

Thanks for your reply.

> The above description does not very well explain why this fix is necessary.
>
> For linear device, there was a race condition and got fixed already.
> There are users encountered such bug in old kernel, when they trying to
> add a hard disk into a linear device. After the kernel panic caused by
> the race issue in old md linear code, they reboot from the fixed kernel,
> and find 'R' or '?' marks from Array State output.
>
> After analyze the metadata, we found value of
> sb->dev_roles[sb->max_dev-1] is 0, but which should be
> MD_DISK_ROLE_FAULTY. The reason for the error is, when adding a disk
> into linear device, mdadm first increases sb->max_dev value, but not set
> its role value. So the corresponding role value keeps as 0, which is
> same as No.0 disk's role value in the linear device.
>
> So mdadm -E will find there are 2 same role value for No.0 disk, and the
> first state character of Array State is marked as 'R'. If such a panic
> happened multiple times, there are more bogus zero value at tail of
> dev_roles[], so the character of Array State is marked as '?'.
>
> Now the kernel part is fixed, but mdadm part is not fixed yet. This
> patch is an effort to avoid bogus dev_roles[] value if kernel panic
> happens in follow ioctl system call (although the possibility is quite
> low and who can tel ...).

Not exactly the kernel panic we met in another bug triggers the error value
of sb->max_dev. You can reproduce this problem easily by adding a new
disk into a linear array. Here is the original logic:

The super block written  to the new added device is read from the last disk
of the linear array. And it does some changes including max_dev and 
dev_roles,

1259     } else if (strcmp(update, "linear-grow-new") == 0) {
1260         unsigned int i;
1261         int fd;
1262         unsigned int max = __le32_to_cpu(sb->max_dev);
1263
1264         for (i = 0; i < max; i++)
1265             if (__le16_to_cpu(sb->dev_roles[i]) >=
1266 MD_DISK_ROLE_FAULTY)
1267 break;
1268         sb->dev_number = __cpu_to_le32(i);
1269         info->disk.number = i;
1270         if (max >= __le32_to_cpu(sb->max_dev))
1271             sb->max_dev = __cpu_to_le32(max+1);

But because max always equals to __le32_to_cpu(sb->max_dev)) in line 
1270, the
value of sb->max_dev is always increased by 1 for the new added device 
and the
sb->dev_roles[ sb->max_dev] is not initialized to 0xfffe. While the 
sb->max_dev in
the original disks are not changed

1295     } else if (strcmp(update, "linear-grow-update") == 0) {
1296         sb->raid_disks = __cpu_to_le32(info->array.raid_disks);
1297         sb->dev_roles[info->disk.number] =
1298             __cpu_to_le16(info->disk.raid_disk);
>> ---
>>   super1.c | 8 +++++++-
>>   1 file changed, 7 insertions(+), 1 deletion(-)
>>
>> diff --git a/super1.c b/super1.c
>> index 87a74cb..4fb655f 100644
>> --- a/super1.c
>> +++ b/super1.c
>> @@ -1184,8 +1184,10 @@ static int update_super1(struct supertype *st, struct mdinfo *info,
>>   				break;
>>   		sb->dev_number = __cpu_to_le32(i);
>>   		info->disk.number = i;
>> -		if (max >= __le32_to_cpu(sb->max_dev))
>> +		if (i >= __le32_to_cpu(sb->max_dev)) {
>>   			sb->max_dev = __cpu_to_le32(max+1);
>> +			sb->dev_roles[sb->max_dev] = __cpu_to_le16(MD_DISK_ROLE_FAULTY);
>> +		}
>>
>>   		random_uuid(sb->device_uuid);
> The above part makes sense to me.
>
>
>
>>
>> @@ -1214,6 +1216,10 @@ static int update_super1(struct supertype *st, struct mdinfo *info,
>>   		sb->raid_disks = __cpu_to_le32(info->array.raid_disks);
>>   		sb->dev_roles[info->disk.number] =
>>   			__cpu_to_le16(info->disk.raid_disk);
>> +		if (sb->raid_disks+1 >= __le32_to_cpu(sb->max_dev)) {
>> +			sb->max_dev = __cpu_to_le32(sb->raid_disks+1);
>> +			sb->dev_roles[sb->max_dev] = __cpu_to_le16(MD_DISK_ROLE_FAULTY);
>> +		}
>>   	} else if (strcmp(update, "resync") == 0) {
>>   		/* make sure resync happens */
>>   		sb->resync_offset = 0ULL;
>>
> Just a lazy question, could you please explain a little bit why you also
> modify here ? Thanks.

I didn't see the sb->max_dev and dev_roles value are correctly set when 
the disks in array
are really more than sb->max_dev, both in kernel and userspace.

Thanks,
Lidong

> Coly
>
>
>
> --
> To unsubscribe from this list: send the line "unsubscribe linux-raid" in
> the body of a message tomajordomo@vger.kernel.org
> More majordomo info athttp://vger.kernel.org/majordomo-info.html
>



^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH] super1: fix sb->max_dev when adding a new disk in linear array
  2017-05-12  1:51 [PATCH] super1: fix sb->max_dev when adding a new disk in linear array Lidong Zhong
  2017-05-12  7:53 ` Coly Li
@ 2017-05-16  4:28 ` Lidong Zhong
  1 sibling, 0 replies; 8+ messages in thread
From: Lidong Zhong @ 2017-05-16  4:28 UTC (permalink / raw)
  To: linux-raid; +Cc: colyli, Jes.Sorensen



On 05/12/2017 09:51 AM, Lidong Zhong wrote:
> The value of sb->max_dev will always be increased by 1 when adding
> a new disk in linear array. It causes an inconsistence between each
> disk in the array and the "Array State" value of "mdadm --examine DISK"
> is wrong. For example, when adding the first new disk into linear array
> it will be:
>
> Array State : RAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
> ('A' == active, '.' == missing, 'R' == replacing)
>
> Adding the second disk into linear array it will be
>
> Array State : .AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
> ('A' == active, '.' == missing, 'R' == replacing)
>
> Signed-off-by: Lidong Zhong <lzhong@suse.com>
> ---
>  super1.c | 8 +++++++-
>  1 file changed, 7 insertions(+), 1 deletion(-)
>
> diff --git a/super1.c b/super1.c
> index 87a74cb..4fb655f 100644
> --- a/super1.c
> +++ b/super1.c
> @@ -1184,8 +1184,10 @@ static int update_super1(struct supertype *st, struct mdinfo *info,
>  				break;
>  		sb->dev_number = __cpu_to_le32(i);
>  		info->disk.number = i;
> -		if (max >= __le32_to_cpu(sb->max_dev))
> +		if (i >= __le32_to_cpu(sb->max_dev)) {
>  			sb->max_dev = __cpu_to_le32(max+1);
> +			sb->dev_roles[sb->max_dev] = __cpu_to_le16(MD_DISK_ROLE_FAULTY);
> +		}
>
Hi,

I realized that the initial value for dev_roles should be 0xffff, that 
is MD_DISK_ROLE_SPARE. Sorry for the typo, I will send another patch
later.

Thanks,
Lidong
>  		random_uuid(sb->device_uuid);
>
> @@ -1214,6 +1216,10 @@ static int update_super1(struct supertype *st, struct mdinfo *info,
>  		sb->raid_disks = __cpu_to_le32(info->array.raid_disks);
>  		sb->dev_roles[info->disk.number] =
>  			__cpu_to_le16(info->disk.raid_disk);
> +		if (sb->raid_disks+1 >= __le32_to_cpu(sb->max_dev)) {
> +			sb->max_dev = __cpu_to_le32(sb->raid_disks+1);
> +			sb->dev_roles[sb->max_dev] = __cpu_to_le16(MD_DISK_ROLE_FAULTY);
> +		}
>  	} else if (strcmp(update, "resync") == 0) {
>  		/* make sure resync happens */
>  		sb->resync_offset = 0ULL;
>

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH] super1: fix sb->max_dev when adding a new disk in linear array
  2017-05-19  5:31   ` Lidong Zhong
@ 2017-05-19  5:37     ` NeilBrown
  0 siblings, 0 replies; 8+ messages in thread
From: NeilBrown @ 2017-05-19  5:37 UTC (permalink / raw)
  To: Lidong Zhong, linux-raid; +Cc: colyli, Jes.Sorensen

[-- Attachment #1: Type: text/plain, Size: 4018 bytes --]

On Fri, May 19 2017, Lidong Zhong wrote:

> On 05/19/2017 12:36 PM, NeilBrown wrote:
>> On Tue, May 16 2017, Lidong Zhong wrote:
>>
>>> The value of sb->max_dev will always be increased by 1 when adding
>>> a new disk in linear array. It causes an inconsistence between each
>>> disk in the array and the "Array State" value of "mdadm --examine DISK"
>>> is wrong. For example, when adding the first new disk into linear array
>>> it will be:
>>>
>>> Array State : RAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
>>> ('A' == active, '.' == missing, 'R' == replacing)
>>>
>>> Adding the second disk into linear array it will be
>>>
>>> Array State : .AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
>>> ('A' == active, '.' == missing, 'R' == replacing)
>>>
>>> Signed-off-by: Lidong Zhong <lzhong@suse.com>
>>> ---
>>>  super1.c | 8 +++++++-
>>>  1 file changed, 7 insertions(+), 1 deletion(-)
>>>
>>> diff --git a/super1.c b/super1.c
>>> index 87a74cb..3d49bee 100644
>>> --- a/super1.c
>>> +++ b/super1.c
>>> @@ -1184,8 +1184,10 @@ static int update_super1(struct supertype *st, struct mdinfo *info,
>>>  				break;
>>>  		sb->dev_number = __cpu_to_le32(i);
>>>  		info->disk.number = i;
>>> -		if (max >= __le32_to_cpu(sb->max_dev))
>>> +		if (i >= __le32_to_cpu(sb->max_dev)) {
>>
>> This change is correct - thanks.  Though
>>     if (i >= max) {
>>
>> might be clearer and simpler.
>>
>>
>>>  			sb->max_dev = __cpu_to_le32(max+1);
>>> +			sb->dev_roles[sb->max_dev] = __cpu_to_le16(MD_DISK_ROLE_SPARE);
>>
>> This change is wrong.
>> At the very least, the dev_roles[] array needs to be indexed by a
>> host-order number, not a little-endian number.
>> But the change is not needed because dev_roles[max_dev] is never used.
>> See role_from_sb().
>> dev_rols[max_dev - 1] does need to be set, but the line
>>
>> 		sb->dev_roles[i] = __cpu_to_le16(info->disk.raid_disk);
>>
>> almost certainly does that.
> Hi Neil,
>
> The reason I set all the dev_roles[0~max_dev-1] is because
> the following code
>
>   552     printf("   Array State : ");
>   553     for (d = 0; d < __le32_to_cpu(sb->raid_disks) + delta_extra; 
> d++) {
>   554         int cnt = 0;
>   555         unsigned int i;
>   556         for (i = 0; i < __le32_to_cpu(sb->max_dev); i++) {
>   557             unsigned int role = __le16_to_cpu(sb->dev_roles[i]); 
>
>   558             if (role == d) 
>
>   559                 cnt++; 
>
>   560         }

This code does not access dev_roles[max_dev], only up to
dev_roles[max_dev-1].
You changed dev_roles[max_dev], which will never be accessed.

>
>
>> It might be better to do
>>   if (i >= max) {
>>      while (max <= i) {
>>         sb->dev_roles[max] = __cpu_to_le16(MD_DISK_ROLE_SPARE);
>>         max += 1;
>>      }
>>      sb->max_dev = __cpu_to_le32(max);
>>   }
>>
>
>
> Thanks for the advice.
>
>>> +		}
>>>
>>>  		random_uuid(sb->device_uuid);
>>>
>>> @@ -1214,6 +1216,10 @@ static int update_super1(struct supertype *st, struct mdinfo *info,
>>>  		sb->raid_disks = __cpu_to_le32(info->array.raid_disks);
>>>  		sb->dev_roles[info->disk.number] =
>>>  			__cpu_to_le16(info->disk.raid_disk);
>>> +		if (sb->raid_disks+1 >= __le32_to_cpu(sb->max_dev)) {
>>> +			sb->max_dev = __cpu_to_le32(sb->raid_disks+1);
>>> +			sb->dev_roles[sb->max_dev] = __cpu_to_le16(MD_DISK_ROLE_SPARE);
>>
>> Again, max_dev is little-endian, so cannot be used as an index.
>> And I think you are updating the wrong element in the dev_roles array.
>
> Yes, I didn't realized the valude is conversed to little-endian and the
> index is wrong too. Thank you for pointing this out. I will submit
> another version patch.

I look forward to it,
Thanks,
NeilBrown

>
> Thanks,
> Lidong
>>
>> Thanks,
>> NeilBrown
>>
>>
>>> +		}
>>>  	} else if (strcmp(update, "resync") == 0) {
>>>  		/* make sure resync happens */
>>>  		sb->resync_offset = 0ULL;
>>> --
>>> 2.12.0

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 832 bytes --]

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH] super1: fix sb->max_dev when adding a new disk in linear array
  2017-05-19  4:36 ` NeilBrown
@ 2017-05-19  5:31   ` Lidong Zhong
  2017-05-19  5:37     ` NeilBrown
  0 siblings, 1 reply; 8+ messages in thread
From: Lidong Zhong @ 2017-05-19  5:31 UTC (permalink / raw)
  To: NeilBrown, linux-raid; +Cc: colyli, Jes.Sorensen



On 05/19/2017 12:36 PM, NeilBrown wrote:
> On Tue, May 16 2017, Lidong Zhong wrote:
>
>> The value of sb->max_dev will always be increased by 1 when adding
>> a new disk in linear array. It causes an inconsistence between each
>> disk in the array and the "Array State" value of "mdadm --examine DISK"
>> is wrong. For example, when adding the first new disk into linear array
>> it will be:
>>
>> Array State : RAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
>> ('A' == active, '.' == missing, 'R' == replacing)
>>
>> Adding the second disk into linear array it will be
>>
>> Array State : .AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
>> ('A' == active, '.' == missing, 'R' == replacing)
>>
>> Signed-off-by: Lidong Zhong <lzhong@suse.com>
>> ---
>>  super1.c | 8 +++++++-
>>  1 file changed, 7 insertions(+), 1 deletion(-)
>>
>> diff --git a/super1.c b/super1.c
>> index 87a74cb..3d49bee 100644
>> --- a/super1.c
>> +++ b/super1.c
>> @@ -1184,8 +1184,10 @@ static int update_super1(struct supertype *st, struct mdinfo *info,
>>  				break;
>>  		sb->dev_number = __cpu_to_le32(i);
>>  		info->disk.number = i;
>> -		if (max >= __le32_to_cpu(sb->max_dev))
>> +		if (i >= __le32_to_cpu(sb->max_dev)) {
>
> This change is correct - thanks.  Though
>     if (i >= max) {
>
> might be clearer and simpler.
>
>
>>  			sb->max_dev = __cpu_to_le32(max+1);
>> +			sb->dev_roles[sb->max_dev] = __cpu_to_le16(MD_DISK_ROLE_SPARE);
>
> This change is wrong.
> At the very least, the dev_roles[] array needs to be indexed by a
> host-order number, not a little-endian number.
> But the change is not needed because dev_roles[max_dev] is never used.
> See role_from_sb().
> dev_rols[max_dev - 1] does need to be set, but the line
>
> 		sb->dev_roles[i] = __cpu_to_le16(info->disk.raid_disk);
>
> almost certainly does that.
Hi Neil,

The reason I set all the dev_roles[0~max_dev-1] is because
the following code

  552     printf("   Array State : ");
  553     for (d = 0; d < __le32_to_cpu(sb->raid_disks) + delta_extra; 
d++) {
  554         int cnt = 0;
  555         unsigned int i;
  556         for (i = 0; i < __le32_to_cpu(sb->max_dev); i++) {
  557             unsigned int role = __le16_to_cpu(sb->dev_roles[i]); 

  558             if (role == d) 

  559                 cnt++; 

  560         }


> It might be better to do
>   if (i >= max) {
>      while (max <= i) {
>         sb->dev_roles[max] = __cpu_to_le16(MD_DISK_ROLE_SPARE);
>         max += 1;
>      }
>      sb->max_dev = __cpu_to_le32(max);
>   }
>


Thanks for the advice.

>> +		}
>>
>>  		random_uuid(sb->device_uuid);
>>
>> @@ -1214,6 +1216,10 @@ static int update_super1(struct supertype *st, struct mdinfo *info,
>>  		sb->raid_disks = __cpu_to_le32(info->array.raid_disks);
>>  		sb->dev_roles[info->disk.number] =
>>  			__cpu_to_le16(info->disk.raid_disk);
>> +		if (sb->raid_disks+1 >= __le32_to_cpu(sb->max_dev)) {
>> +			sb->max_dev = __cpu_to_le32(sb->raid_disks+1);
>> +			sb->dev_roles[sb->max_dev] = __cpu_to_le16(MD_DISK_ROLE_SPARE);
>
> Again, max_dev is little-endian, so cannot be used as an index.
> And I think you are updating the wrong element in the dev_roles array.

Yes, I didn't realized the valude is conversed to little-endian and the
index is wrong too. Thank you for pointing this out. I will submit
another version patch.

Thanks,
Lidong
>
> Thanks,
> NeilBrown
>
>
>> +		}
>>  	} else if (strcmp(update, "resync") == 0) {
>>  		/* make sure resync happens */
>>  		sb->resync_offset = 0ULL;
>> --
>> 2.12.0

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH] super1: fix sb->max_dev when adding a new disk in linear array
  2017-05-16  4:51 Lidong Zhong
@ 2017-05-19  4:36 ` NeilBrown
  2017-05-19  5:31   ` Lidong Zhong
  0 siblings, 1 reply; 8+ messages in thread
From: NeilBrown @ 2017-05-19  4:36 UTC (permalink / raw)
  To: Lidong Zhong, linux-raid; +Cc: colyli, Jes.Sorensen

[-- Attachment #1: Type: text/plain, Size: 2774 bytes --]

On Tue, May 16 2017, Lidong Zhong wrote:

> The value of sb->max_dev will always be increased by 1 when adding
> a new disk in linear array. It causes an inconsistence between each
> disk in the array and the "Array State" value of "mdadm --examine DISK"
> is wrong. For example, when adding the first new disk into linear array
> it will be:
>
> Array State : RAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
> ('A' == active, '.' == missing, 'R' == replacing)
>
> Adding the second disk into linear array it will be
>
> Array State : .AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
> ('A' == active, '.' == missing, 'R' == replacing)
>
> Signed-off-by: Lidong Zhong <lzhong@suse.com>
> ---
>  super1.c | 8 +++++++-
>  1 file changed, 7 insertions(+), 1 deletion(-)
>
> diff --git a/super1.c b/super1.c
> index 87a74cb..3d49bee 100644
> --- a/super1.c
> +++ b/super1.c
> @@ -1184,8 +1184,10 @@ static int update_super1(struct supertype *st, struct mdinfo *info,
>  				break;
>  		sb->dev_number = __cpu_to_le32(i);
>  		info->disk.number = i;
> -		if (max >= __le32_to_cpu(sb->max_dev))
> +		if (i >= __le32_to_cpu(sb->max_dev)) {

This change is correct - thanks.  Though
    if (i >= max) {

might be clearer and simpler.


>  			sb->max_dev = __cpu_to_le32(max+1);
> +			sb->dev_roles[sb->max_dev] = __cpu_to_le16(MD_DISK_ROLE_SPARE);

This change is wrong.
At the very least, the dev_roles[] array needs to be indexed by a
host-order number, not a little-endian number.
But the change is not needed because dev_roles[max_dev] is never used.
See role_from_sb().
dev_rols[max_dev - 1] does need to be set, but the line

		sb->dev_roles[i] = __cpu_to_le16(info->disk.raid_disk);

almost certainly does that.
It might be better to do
  if (i >= max) {
     while (max <= i) {
        sb->dev_roles[max] = __cpu_to_le16(MD_DISK_ROLE_SPARE);
        max += 1;
     }
     sb->max_dev = __cpu_to_le32(max);
  }

> +		}
>  
>  		random_uuid(sb->device_uuid);
>  
> @@ -1214,6 +1216,10 @@ static int update_super1(struct supertype *st, struct mdinfo *info,
>  		sb->raid_disks = __cpu_to_le32(info->array.raid_disks);
>  		sb->dev_roles[info->disk.number] =
>  			__cpu_to_le16(info->disk.raid_disk);
> +		if (sb->raid_disks+1 >= __le32_to_cpu(sb->max_dev)) {
> +			sb->max_dev = __cpu_to_le32(sb->raid_disks+1);
> +			sb->dev_roles[sb->max_dev] = __cpu_to_le16(MD_DISK_ROLE_SPARE);

Again, max_dev is little-endian, so cannot be used as an index.
And I think you are updating the wrong element in the dev_roles array.

Thanks,
NeilBrown


> +		}
>  	} else if (strcmp(update, "resync") == 0) {
>  		/* make sure resync happens */
>  		sb->resync_offset = 0ULL;
> -- 
> 2.12.0

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 832 bytes --]

^ permalink raw reply	[flat|nested] 8+ messages in thread

* [PATCH] super1: fix sb->max_dev when adding a new disk in linear array
@ 2017-05-16  4:51 Lidong Zhong
  2017-05-19  4:36 ` NeilBrown
  0 siblings, 1 reply; 8+ messages in thread
From: Lidong Zhong @ 2017-05-16  4:51 UTC (permalink / raw)
  To: linux-raid; +Cc: colyli, Jes.Sorensen, neilb

The value of sb->max_dev will always be increased by 1 when adding
a new disk in linear array. It causes an inconsistence between each
disk in the array and the "Array State" value of "mdadm --examine DISK"
is wrong. For example, when adding the first new disk into linear array
it will be:

Array State : RAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
('A' == active, '.' == missing, 'R' == replacing)

Adding the second disk into linear array it will be

Array State : .AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
('A' == active, '.' == missing, 'R' == replacing)

Signed-off-by: Lidong Zhong <lzhong@suse.com>
---
 super1.c | 8 +++++++-
 1 file changed, 7 insertions(+), 1 deletion(-)

diff --git a/super1.c b/super1.c
index 87a74cb..3d49bee 100644
--- a/super1.c
+++ b/super1.c
@@ -1184,8 +1184,10 @@ static int update_super1(struct supertype *st, struct mdinfo *info,
 				break;
 		sb->dev_number = __cpu_to_le32(i);
 		info->disk.number = i;
-		if (max >= __le32_to_cpu(sb->max_dev))
+		if (i >= __le32_to_cpu(sb->max_dev)) {
 			sb->max_dev = __cpu_to_le32(max+1);
+			sb->dev_roles[sb->max_dev] = __cpu_to_le16(MD_DISK_ROLE_SPARE);
+		}
 
 		random_uuid(sb->device_uuid);
 
@@ -1214,6 +1216,10 @@ static int update_super1(struct supertype *st, struct mdinfo *info,
 		sb->raid_disks = __cpu_to_le32(info->array.raid_disks);
 		sb->dev_roles[info->disk.number] =
 			__cpu_to_le16(info->disk.raid_disk);
+		if (sb->raid_disks+1 >= __le32_to_cpu(sb->max_dev)) {
+			sb->max_dev = __cpu_to_le32(sb->raid_disks+1);
+			sb->dev_roles[sb->max_dev] = __cpu_to_le16(MD_DISK_ROLE_SPARE);
+		}
 	} else if (strcmp(update, "resync") == 0) {
 		/* make sure resync happens */
 		sb->resync_offset = 0ULL;
-- 
2.12.0


^ permalink raw reply related	[flat|nested] 8+ messages in thread

end of thread, other threads:[~2017-05-19  5:37 UTC | newest]

Thread overview: 8+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2017-05-12  1:51 [PATCH] super1: fix sb->max_dev when adding a new disk in linear array Lidong Zhong
2017-05-12  7:53 ` Coly Li
2017-05-15 11:33   ` Lidong Zhong
2017-05-16  4:28 ` Lidong Zhong
2017-05-16  4:51 Lidong Zhong
2017-05-19  4:36 ` NeilBrown
2017-05-19  5:31   ` Lidong Zhong
2017-05-19  5:37     ` NeilBrown

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.