From: Neil Brown <neilb@cse.unsw.edu.au>
To: Scott Long <scott_long@adaptec.com>
Cc: linux-kernel@vger.kernel.org, linux-raid@vger.kernel.org
Subject: Re: Proposed enhancements to MD
Date: Thu, 15 Jan 2004 10:07:34 +1100
Message-ID: <16389.52150.148792.875315@notabene.cse.unsw.edu.au>
In-Reply-To: message from Scott Long on Monday January 12

On Monday January 12, scott_long@adaptec.com wrote:
> All,
> 
> Adaptec has been looking at the MD driver for a foundation for their
> Open-Source software RAID stack.  This will help us provide full
> and open support for current and future Adaptec RAID products (as
> opposed to the limited support through closed drivers that we have
> now).

Sounds like a great idea.

> 
> While MD is fairly functional and clean, there are a number of 
> enhancements to it that we have been working on for a while and would
> like to push out to the community for review and integration.  These
> include:

It would help if you said up-front whether you were thinking of 2.4 or
2.6 or 2.7 or all of them.  I gather from subsequent emails in the
thread that you are thinking of 2.6 and hoping for 2.4.
It is definitely too late for any of this to go into kernel.org 2.4,
but some of it could live in an external patch set that people or
vendors can choose to apply or not.

> 
> - partition support for md devices:  MD does not support the concept of
>    fdisk partitions; the only way to approximate this right now is by
>    creating multiple arrays on the same media.  Fixing this is required
>    for not only feature-completeness, but to allow our BIOS to recognise
>    the partitions on an array and properly boot them as it would boot a
>    normal disk.

Your attached patch is completely unacceptable as it breaks backwards
compatibility.  /dev/md1 (blockdev 9,1) changes from being the second
md array to being the first partition of the first md array.
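
To make the clash concrete, here is a rough sketch of how the same
minor number decodes under the existing layout and under a partitioned
layout.  PART_BITS and both function names are assumptions made for
illustration; they are not taken from the patch.

```python
# Hypothetical sketch: decoding a block-device minor number under the
# existing md layout and under a partitioned layout on the same major.
MD_MAJOR = 9
PART_BITS = 6  # assumption: 64 minors reserved per array


def decode_flat(minor):
    """Existing scheme: every minor is a whole array (minor 1 is md1)."""
    return {"array": minor, "partition": None}


def decode_partitioned(minor, part_bits=PART_BITS):
    """Partitioned scheme: low bits pick the partition, high bits the array."""
    array = minor >> part_bits
    part = minor & ((1 << part_bits) - 1)
    # Partition 0 stands for the whole array, so report it as None.
    return {"array": array, "partition": part if part else None}
```

With these assumptions decode_flat(1) names the second array, while
decode_partitioned(1) names partition 1 of array 0, so an existing
/dev/md1 node would silently point at something else.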

I too would like to support partitions of md devices but there is no
really elegant way to do it.
I'm beginning to think the best approach is to use a new major number
(which will be dynamically allocated because Linus has forbidden new
static allocations).  This should be fairly easy to do.
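
One consequence of a dynamic major is that userspace cannot hard-code
it and has to look it up, for example in /proc/devices.  A minimal
sketch of that lookup follows; the driver name "mdp" and the major 254
in the sample text are placeholders, not something that has been
decided.

```python
def find_block_major(proc_devices_text, driver_name):
    """Return the major listed for driver_name in the 'Block devices:'
    section of /proc/devices-style text, or None if it is absent."""
    in_block = False
    for line in proc_devices_text.splitlines():
        stripped = line.strip()
        if stripped == "Block devices:":
            in_block = True
            continue
        if stripped == "Character devices:":
            in_block = False
            continue
        if in_block and stripped:
            parts = stripped.split(None, 1)
            if len(parts) == 2 and parts[0].isdigit() and parts[1] == driver_name:
                return int(parts[0])
    return None


# Made-up sample in the usual /proc/devices layout.
SAMPLE = (
    "Character devices:\n"
    "  1 mem\n"
    "  4 tty\n"
    "\n"
    "Block devices:\n"
    "  8 sd\n"
    "  9 md\n"
    "254 mdp\n"
)
```

On a real system the text would come from reading /proc/devices, and
the result would be used when creating the device nodes.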

A reasonable alternative is to use DM.  As I understand it, DM can work
with any sort of metadata (as metadata is handled by user-space) so
this should work just fine.

Note that kernel-based autodetection is seriously a thing of the past.
As has been said already, it should be just as easy and much more
manageable to do autodetection in early user-space.  If it isn't, then
we need to improve the early user-space tools.

> 
> - generic device arrival notification mechanism:  This is needed to
>    support device hot-plug, and allow arrays to be automatically
>    configured regardless of when the md module is loaded or initialized.
>    RedHat EL3 has a scaled down version of this already, but it is
>    specific to MD and only works if MD is statically compiled into the
>    kernel.  A general mechanism will benefit MD as well as any other
>    storage system that wants hot-arrival notices.

This has largely been covered, but just to add or clarify slightly:

 This is not an md issue.  This is either a bus controller or
 userspace issue.
 2.6 has a "hotplug" infrastructure and each bus should report
 hotplug events to userspace.
 If they don't, they should be enhanced so they do.
 If they do, then userspace needs to be told what to do with these
 events, and when to assemble devices into arrays.

 
> 
> - RAID-0 fixes:  The MD RAID-0 personality is unable to perform I/O
>    that spans a chunk boundary.  Modifications are needed so that it can
>    take a request and break it up into 1 or more per-disk requests.

In 2.4 it cannot, but arguably doesn't need to.  However I have a
fairly straightforward patch which supports raid0 request splitting.
In 2.6, this should work properly already.
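
For readers unfamiliar with the problem, splitting amounts to roughly
the following.  This is only a sketch under simplifying assumptions
(all member disks the same size, plain round-robin striping, sizes in
sectors); it is not the 2.4 patch or the 2.6 code, and the function
name is made up.

```python
def split_raid0_request(start, length, chunk_sectors, n_disks):
    """Split the array request [start, start + length) into pieces that
    never cross a chunk boundary.

    Returns a list of (disk, disk_sector, n_sectors) tuples.
    """
    pieces = []
    pos = start
    end = start + length
    while pos < end:
        chunk = pos // chunk_sectors       # chunk index within the array
        offset = pos % chunk_sectors       # offset inside that chunk
        n = min(chunk_sectors - offset, end - pos)
        disk = chunk % n_disks             # chunks rotate across the disks
        stripe = chunk // n_disks          # chunk index on that disk
        pieces.append((disk, stripe * chunk_sectors + offset, n))
        pos += n
    return pieces
```

For example, with 128-sector chunks on two disks, a 60-sector request
starting at sector 100 becomes 28 sectors on disk 0 followed by 32
sectors on disk 1.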

> 
> - Metadata abstraction:  We intend to support multiple on-disk metadata
>    formats, along with the 'native MD' format.  To do this, specific
>    knowledge of MD on-disk structures must be abstracted out of the core
>    and personalities modules.

In 2.4, this would be a massive amount of work and I don't recommend
it.
In 2.6, most of this is already done - the knowledge about superblock
format is very localised.  I would like to extend this so that a
loadable module can add a new format.  Patches welcome.

Note that the kernel does need to know about the format of the
superblock.
DM can manage without knowing as its superblock is read-mostly and
the very few updates (for reconfiguration) are managed by userspace.
For raid1 and raid5 (which DM doesn't support), we need to update the
superblock on errors and I think that is best done in the kernel.


> 
> - DDF Metadata support: Future products will use the 'DDF' on-disk
>    metadata scheme.  These products will be bootable by the BIOS, but
>    must have DDF support in the OS.  This will plug into the abstraction
>    mentioned above.

I'm looking forward to seeing the specs for DDF (but isn't it pretty
dumb to develop a standard in a closed forum?).  If DDF turns out to
have real value I would be happy to have support for it in linux/md.

NeilBrown
