From: NeilBrown <neilb@suse.com>
To: ian_bruce@mail.ru
Cc: linux-raid@vger.kernel.org
Subject: Re: [BUG] non-metadata arrays cannot use more than 27 component devices
Date: Wed, 01 Mar 2017 07:29:28 +1100
Message-ID: <87mvd6oy8n.fsf@notabene.neil.brown.name>
In-Reply-To: <20170228022513.2cf5445a.ian_bruce@mail.ru>

On Tue, Feb 28 2017, ian_bruce@mail.ru wrote:

> On Mon, 27 Feb 2017 16:55:56 +1100
> NeilBrown <neilb@suse.com> wrote:
>
>>> When assembling non-metadata arrays ("mdadm --build"), the in-kernel
>>> superblock apparently defaults to the MD-RAID v0.90 type. This
>>> imposes a maximum of 27 component block devices, presumably as well
>>> as limits on device size.
>>>
>>> mdadm does not allow you to override this default, by specifying the
>>> v1.2 superblock. It is not clear whether mdadm tells the kernel to
>>> use the v0.90 superblock, or the kernel assumes this by itself. One
>>> or other of them should be fixed; there does not appear to be any
>>> reason why the v1.2 superblock should not be the default in this
>>> case.
>> 
>> Can you see if this change improves the behavior for you?
>
> Unfortunately, I'm not set up for kernel compilation at the moment. But
> here is my test case; it shouldn't be any harder to reproduce than this,
> on extremely ordinary hardware (= no actual disk RAID array):
>
>
> # truncate -s 64M img64m.{00..31}   # requires no space on ext4,
> #                                   # because sparse files are created
> # 
> # ls img64m.*
> img64m.00  img64m.04  img64m.08  img64m.12  img64m.16  img64m.20  img64m.24  img64m.28
> img64m.01  img64m.05  img64m.09  img64m.13  img64m.17  img64m.21  img64m.25  img64m.29
> img64m.02  img64m.06  img64m.10  img64m.14  img64m.18  img64m.22  img64m.26  img64m.30
> img64m.03  img64m.07  img64m.11  img64m.15  img64m.19  img64m.23  img64m.27  img64m.31
> # 
> # RAID=$(for x in img64m.* ; do losetup --show -f $x ; done)
> # 
> # echo $RAID
> /dev/loop0 /dev/loop1 /dev/loop2 /dev/loop3 /dev/loop4 /dev/loop5 /dev/loop6 /dev/loop7
> /dev/loop8 /dev/loop9 /dev/loop10 /dev/loop11 /dev/loop12 /dev/loop13 /dev/loop14 /dev/loop15
> /dev/loop16 /dev/loop17 /dev/loop18 /dev/loop19 /dev/loop20 /dev/loop21 /dev/loop22 /dev/loop23
> /dev/loop24 /dev/loop25 /dev/loop26 /dev/loop27 /dev/loop28 /dev/loop29 /dev/loop30 /dev/loop31
> # 
> # mdadm --build /dev/md/md-test --level=linear --raid-devices=32 $RAID
> mdadm: ADD_NEW_DISK failed for /dev/loop27: Device or resource busy
> # 

Thanks.  That makes it easy.
Your test case works with my patch applied.
....

>
>> diff --git a/drivers/md/md.c b/drivers/md/md.c
>> index ba485dcf1064..e0ac7f5a8e68 100644
>> --- a/drivers/md/md.c
>> +++ b/drivers/md/md.c
>> @@ -6464,9 +6464,8 @@ static int set_array_info(struct mddev *mddev, mdu_array_info_t *info)
>>  	mddev->layout        = info->layout;
>>  	mddev->chunk_sectors = info->chunk_size >> 9;
>>  
>> -	mddev->max_disks     = MD_SB_DISKS;
>> -
>>  	if (mddev->persistent) {
>> +		mddev->max_disks     = MD_SB_DISKS;
>>  		mddev->flags         = 0;
>>  		mddev->sb_flags         = 0;
>>  	}
>
> What value does mddev->max_disks get in the opposite case,
> (!mddev->persistent) ?

The default value is zero, which means that no limit is imposed.
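
To make the zero-means-unlimited convention concrete, here is a small
userspace model in C.  It is only a sketch under stated assumptions:
struct model_mddev, model_set_array_info(), model_add_disk() and
model_build() are hypothetical names invented for illustration and are
not the kernel's functions.  The only things taken from this thread are
the max_disks and persistent fields, the MD_SB_DISKS limit of 27, and
the "Device or resource busy" (EBUSY) failure.

```c
#include <errno.h>
#include <stdbool.h>

#define MD_SB_DISKS 27	/* device limit of the v0.90 superblock format */

/* Hypothetical, cut-down stand-in for the kernel's struct mddev. */
struct model_mddev {
	bool persistent;	/* true if the array has on-disk metadata */
	int  max_disks;		/* 0 means "no limit" */
	int  nr_disks;		/* devices accepted so far */
};

/*
 * Models set_array_info().  With patched == true, only arrays with
 * persistent metadata inherit the v0.90 limit; with patched == false,
 * every array gets it (the behaviour being reported as a bug).
 */
void model_set_array_info(struct model_mddev *mddev,
			  bool persistent, bool patched)
{
	mddev->persistent = persistent;
	mddev->max_disks = 0;
	mddev->nr_disks = 0;
	if (!patched || persistent)
		mddev->max_disks = MD_SB_DISKS;
}

/* Models adding one device: refuse once a non-zero limit is reached. */
int model_add_disk(struct model_mddev *mddev)
{
	if (mddev->max_disks && mddev->nr_disks >= mddev->max_disks)
		return -EBUSY;
	mddev->nr_disks++;
	return 0;
}

/* Try to add `wanted` devices; return how many were accepted. */
int model_build(struct model_mddev *mddev, int wanted)
{
	int i;

	for (i = 0; i < wanted; i++)
		if (model_add_disk(mddev) != 0)
			break;
	return i;
}
```

In this model, an unpatched non-persistent array asked for 32 devices
accepts 27 and refuses the 28th (index 27, matching /dev/loop27 in the
report), while the patched version accepts all 32.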

>
> I note this comment from the top of the function:
>
>     * set_array_info is used two different ways
>     * The original usage is when creating a new array.
>     * In this usage, raid_disks is > 0 and it together with
>     *  level, size, not_persistent,layout,chunksize determine the
>     *  shape of the array.
>     *  This will always create an array with a type-0.90.0 superblock.

Unfortunately you cannot always trust comments.  They are more like hints.

>
> http://lxr.free-electrons.com/source/drivers/md/md.c#L6410
>
> Surely there is an equivalent function which creates arrays with a
> type-1 superblock?

Not really.  Type-1 superblocks are created from userspace by mdadm.
mdadm then tells the kernel "here are some devices that form an array".
md reads the devices, finds the type-1 metadata, and proceeds.
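
As a rough illustration of the "finds the type-1 metadata" step, the
sketch below checks whether a buffer holding the start of a component
device looks like it carries a v1.2 superblock.  It assumes the md
v1.x on-disk layout as generally documented (little-endian magic
0xa92b4efc followed by a major version of 1, with the v1.2 superblock
placed 4 KiB from the start of the device).  The helper names are
hypothetical, and the kernel's real validation does considerably more
(checksums, sizes, feature flags) than this.

```c
#include <stddef.h>
#include <stdint.h>

#define MD_SB_MAGIC	0xa92b4efcU	/* md superblock magic number */
#define SB_V1_2_OFFSET	4096		/* v1.2 superblock: 4 KiB into the device */

/* Read a little-endian 32-bit value regardless of host byte order. */
uint32_t read_le32(const unsigned char *p)
{
	return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
	       ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/*
 * Hypothetical helper: return 1 if `dev` (the first `len` bytes of a
 * component device) has the magic and major version of a v1.2
 * superblock at the expected offset, 0 otherwise.
 */
int looks_like_v1_2_member(const unsigned char *dev, size_t len)
{
	const unsigned char *sb;

	if (len < SB_V1_2_OFFSET + 8)
		return 0;
	sb = dev + SB_V1_2_OFFSET;
	return read_le32(sb) == MD_SB_MAGIC && read_le32(sb + 4) == 1;
}
```

A device used with "mdadm --build" has no such metadata at all, which
is why the kernel has nothing to read and relies on what userspace
tells it.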

Thanks,
NeilBrown
