* [PATCH 2 of 9] MD:  should_read_superblock
@ 2011-05-24  3:06 Jonathan Brassow
  2011-05-25  4:01 ` NeilBrown
  0 siblings, 1 reply; 5+ messages in thread
From: Jonathan Brassow @ 2011-05-24  3:06 UTC (permalink / raw)
  To: linux-raid

Patch name: md-should_read_superblock.patch

Add new function to determine whether MD superblocks should be read.

It used to be sufficient to check if mddev->raid_disks was set to determine
whether to read the superblock or not.  However, device-mapper (dm-raid.c)
sets this value before calling md_run().  Thus, we need additional mechanisms
for determining whether to read the superblock.  This patch adds the condition
that if rdev->meta_bdev is set, the superblock should be read - something that
only device-mapper does (and only when there are superblocks to be read/used).

Signed-off-by: Jonathan Brassow <jbrassow@redhat.com>

Index: linux-2.6/drivers/md/md.c
===================================================================
--- linux-2.6.orig/drivers/md/md.c
+++ linux-2.6/drivers/md/md.c
@@ -4421,6 +4421,20 @@ static void md_safemode_timeout(unsigned
 	md_wakeup_thread(mddev->thread);
 }
 
+static int should_read_super(mddev_t *mddev)
+{
+	mdk_rdev_t *rdev, *tmp;
+
+	if (!mddev->raid_disks)
+		return 1;
+
+	rdev_for_each(rdev, tmp, mddev)
+		if (rdev->meta_bdev)
+			return 1;
+
+	return 0;
+}
+
 static int start_dirty_degraded;
 
 int md_run(mddev_t *mddev)
@@ -4442,7 +4456,7 @@ int md_run(mddev_t *mddev)
 	/*
 	 * Analyze all RAID superblock(s)
 	 */
-	if (!mddev->raid_disks) {
+	if (should_read_super(mddev)) {
 		if (!mddev->persistent)
 			return -EINVAL;
 		analyze_sbs(mddev);
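
The decision made by should_read_super() above can be modelled in user space.
The following is only a sketch under stated assumptions: the sketch_* types
and names are hypothetical stand-ins for mddev_t and mdk_rdev_t, and a plain
array replaces the kernel's rdev list.

```c
#include <stddef.h>

/* Hypothetical stand-in for mdk_rdev_t: only the field the check looks at. */
struct sketch_rdev {
	int has_meta_bdev;	/* non-zero when a metadata device is attached */
};

/* Hypothetical stand-in for mddev_t. */
struct sketch_mddev {
	int raid_disks;			/* 0 until the array geometry is known */
	const struct sketch_rdev *rdevs;	/* member devices */
	size_t nr_rdevs;
};

/* Mirrors the patch: read superblocks when the geometry is still unknown,
 * or when any member device carries a metadata device. */
int sketch_should_read_super(const struct sketch_mddev *mddev)
{
	size_t i;

	if (!mddev->raid_disks)
		return 1;

	for (i = 0; i < mddev->nr_rdevs; i++)
		if (mddev->rdevs[i].has_meta_bdev)
			return 1;

	return 0;
}
```

With raid_disks preset and no metadata devices the model returns 0, which is
the case where dm-raid supplies all array details itself.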


* Re: [PATCH 2 of 9] MD:  should_read_superblock
  2011-05-24  3:06 [PATCH 2 of 9] MD: should_read_superblock Jonathan Brassow
@ 2011-05-25  4:01 ` NeilBrown
  2011-05-25 14:00   ` Jonathan Brassow
  0 siblings, 1 reply; 5+ messages in thread
From: NeilBrown @ 2011-05-25  4:01 UTC (permalink / raw)
  To: Jonathan Brassow; +Cc: linux-raid

On Mon, 23 May 2011 22:06:09 -0500 Jonathan Brassow <jbrassow@f14.redhat.com>
wrote:

> Patch name: md-should_read_superblock.patch
> 
> Add new function to determine whether MD superblocks should be read.
> 
> It used to be sufficient to check if mddev->raid_disks was set to determine
> whether to read the superblock or not.  However, device-mapper (dm-raid.c)
> sets this value before calling md_run().  Thus, we need additional mechanisms
> for determining whether to read the superblock.  This patch adds the condition
> that if rdev->meta_bdev is set, the superblock should be read - something that
> only device-mapper does (and only when there are superblocks to be read/used).
> 
> Signed-off-by: Jonathan Brassow <jbrassow@redhat.com>

I've been feeling uncomfortable about this and have spent a while trying to
see if my discomfort is at all justified.  It seems that maybe it is.

The discomfort is really at analyze_sbs being used for dm arrays.  It is
really for arrays where md completely controls the metadata.  dm arrays are in
a strange intermediate situation where some metadata is controlled by
user-space (so md is told about some details of the array) and other metadata
is managed by the kernel - so md finds those bits out by itself.

It isn't yet entirely clear to me how to handle the half-way state best.

But the particular problem is that analyze_sbs can call kick_rdev_from_array.
This will call export_rdev, which will call kobject_put(&rdev->kobj), which is
bad because dm-based rdevs do not get their kobj initialised.

So I think analyze_sbs should not be used for dm arrays.
Rather, the code in dm-raid.c which parses the metadata_device info from the
constructor line should call load_super.  Then, before md_run is called, it
should do the 'validate_super' step and record any failures.

So the only super_types method that md code would call on a dm-raid array
would be sync_super.

Does that work for you?

Thanks,
NeilBrown
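
The flow proposed in the message above (dm-raid loads each superblock while
parsing the constructor line, then validates it and records failures before
md_run is called) might look roughly like the sketch below.  It is a
user-space model only: every dmr_* name is hypothetical, and the
sb_readable/sb_consistent flags simulate I/O and consistency results rather
than reflecting any real dm-raid structure.

```c
#include <stddef.h>

/* Hypothetical per-device state; not a real dm-raid structure. */
struct dmr_dev {
	int has_meta_dev;	/* metadata device given on the constructor line */
	int sb_readable;	/* simulated: reading the superblock succeeds */
	int sb_consistent;	/* simulated: superblock agrees with the array */
	int sb_loaded;		/* set once the load step succeeded */
	int faulty;		/* recorded failure; device is not "kicked" */
};

/* Load step: runs while the constructor line is parsed. */
int dmr_load_super(struct dmr_dev *dev)
{
	if (!dev->has_meta_dev)
		return 0;	/* nothing to load */
	if (!dev->sb_readable)
		return -1;
	dev->sb_loaded = 1;
	return 0;
}

/* Validate step: runs before the array is started. */
int dmr_validate_super(const struct dmr_dev *dev)
{
	if (!dev->sb_loaded)
		return 0;	/* nothing to validate */
	return dev->sb_consistent ? 0 : -1;
}

/* Runs both steps and returns the number of devices marked faulty.
 * Failures are only recorded; no kobject reference is ever dropped. */
size_t dmr_prepare(struct dmr_dev *devs, size_t n)
{
	size_t i, failures = 0;

	for (i = 0; i < n; i++) {
		if (dmr_load_super(&devs[i]) < 0) {
			devs[i].faulty = 1;
			failures++;
		}
	}
	for (i = 0; i < n; i++) {
		if (!devs[i].faulty && dmr_validate_super(&devs[i]) < 0) {
			devs[i].faulty = 1;
			failures++;
		}
	}
	return failures;
}
```

Because failures are recorded rather than acted on by kicking the device,
this model never reaches a path like kick_rdev_from_array, which is the point
of keeping analyze_sbs away from dm arrays.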





* Re: [PATCH 2 of 9] MD:  should_read_superblock
  2011-05-25  4:01 ` NeilBrown
@ 2011-05-25 14:00   ` Jonathan Brassow
  2011-05-26  0:32     ` NeilBrown
  0 siblings, 1 reply; 5+ messages in thread
From: Jonathan Brassow @ 2011-05-25 14:00 UTC (permalink / raw)
  To: NeilBrown; +Cc: linux-raid


On May 24, 2011, at 11:01 PM, NeilBrown wrote:

> I've been feeling uncomfortable about this and have spent a while trying to
> see if my discomfort is at all justified.  It seems that maybe it is.
> 
> The discomfort is really at analyze_sbs being used for dm arrays.  It is
> really for arrays where md completely controls the metadata.  dm arrays are in
> a strange intermediate situation where some metadata is controlled by
> user-space (so md is told about some details of the array) and other metadata
> is managed by the kernel - so md finds those bits out by itself.
> 
> It isn't yet entirely clear to me how to handle the half-way state best.
> 
> But the particular problem is that analyze_sbs can call kick_rdev_from_array.
> This will call export_rdev, which will call kobject_put(&rdev->kobj), which is
> bad because dm-based rdevs do not get their kobj initialised.
> 
> So I think analyze_sbs should not be used for dm arrays.
> Rather, the code in dm-raid.c which parses the metadata_device info from the
> constructor line should call load_super.  Then, before md_run is called, it
> should do the 'validate_super' step and record any failures.
> 
> So the only super_types method that md code would call on a dm-raid array
> would be sync_super.
> 
> Does that work for you?

That seems sensible.  It changes things up a bit though...

1) The load_super and validate_super functions would go into dm-raid.c, but
   stubs (returning EINVAL) would remain in md.c in order to fill out the
   super_types pointers.
2) The device-mapper superblock would have to move to a common place because
   it would need to be shared by the super functions in dm-raid.c and
   sync_super in md.c.  I'd rather not put the new superblock in md_p.h...
   perhaps a new file, dm-raid.h?  (You could hide the superblock entirely
   in dm-raid.c, but you'd have to export a function from dm-raid.c that
   would be called by sync_super in md.c, necessitating a dm-raid.h again.
   Is this a better solution?)

 brassow


* Re: [PATCH 2 of 9] MD:  should_read_superblock
  2011-05-25 14:00   ` Jonathan Brassow
@ 2011-05-26  0:32     ` NeilBrown
  2011-05-26 14:50       ` Jonathan Brassow
  0 siblings, 1 reply; 5+ messages in thread
From: NeilBrown @ 2011-05-26  0:32 UTC (permalink / raw)
  To: Jonathan Brassow; +Cc: linux-raid

On Wed, 25 May 2011 09:00:19 -0500 Jonathan Brassow <jbrassow@redhat.com>
wrote:

> That seems sensible.  It changes things up a bit though...
> 
> 1) the load_super and validate_super functions would go into dm-raid.c, but
>    stubs (returning EINVAL) would remain in md.c in order to fill-out the
>    super_types pointers.
> 2) the device-mapper superblock would have to move to a common place because
>    it would need to be shared by the super functions in dm-raid.c and
>    sync_super in md.c.  I'd rather not put the new superblock in md_p.h...
>    perhaps a new file, dm-raid.h?  (You could hide the superblock entirely
>    in dm-raid.c, but you'd have to export a function from dm-raid.c that
>    would be called by sync_super in md.c - necessitating a dm-raid.h again.
>    Is this a better solution?)
> 
>  brassow

How about we put a 'sync_super' or possibly a 'struct super_type' pointer in
mddev_t, and use it instead of mddev->major_version for finding operations.
Then all knowledge of the dm metadata can live in dm-raid.c?

NeilBrown
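
One way to picture the 'struct super_type' pointer variant suggested above is
the user-space sketch below.  All ops_* names are hypothetical, and the
counters exist only so the dispatch can be observed; the real super_type has
more methods than the single one modelled here.

```c
#include <stddef.h>

struct ops_mddev;

/* Hypothetical, reduced model of a superblock operations table. */
struct ops_super_type {
	const char *name;
	void (*sync_super)(struct ops_mddev *mddev);
};

/* Hypothetical mddev model: operations are found through a pointer stored
 * in the array itself rather than through an index such as major_version. */
struct ops_mddev {
	const struct ops_super_type *super;
	int md_syncs;	/* times the native md handler ran */
	int dm_syncs;	/* times the dm-raid handler ran */
};

void ops_sync_md_v1(struct ops_mddev *mddev) { mddev->md_syncs++; }
void ops_sync_dm(struct ops_mddev *mddev) { mddev->dm_syncs++; }

/* The native table would live in md.c; the dm one could stay private to
 * dm-raid.c, so md.c needs no knowledge of the dm metadata layout. */
const struct ops_super_type ops_md_v1 = { "md-1.x", ops_sync_md_v1 };
const struct ops_super_type ops_dm_raid = { "dm-raid", ops_sync_dm };

/* Generic caller: works for either kind of array. */
int ops_update_super(struct ops_mddev *mddev)
{
	if (!mddev->super || !mddev->super->sync_super)
		return -1;
	mddev->super->sync_super(mddev);
	return 0;
}
```

The generic caller never inspects a version number, so adding another
metadata owner would only require pointing the array at a different table.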


* Re: [PATCH 2 of 9] MD:  should_read_superblock
  2011-05-26  0:32     ` NeilBrown
@ 2011-05-26 14:50       ` Jonathan Brassow
  0 siblings, 0 replies; 5+ messages in thread
From: Jonathan Brassow @ 2011-05-26 14:50 UTC (permalink / raw)
  To: NeilBrown; +Cc: linux-raid


On May 25, 2011, at 7:32 PM, NeilBrown wrote:

> How about we put a 'sync_super' or possibly a 'struct super_type' pointer in
> mddev_t, and use it instead of mddev->major_version for finding operations.
> Then all knowledge of the dm metadata can live in dm-raid.c?

I was just thinking that - yes, that sounds good.

I haven't thought about it too deeply yet, so I'm not sure which I like better:
1) just sync_super ptr in mddev_t
2) super_types in mddev_t

My first impression is just sync_super.  After all, the load and validate
steps can be done within device-mapper and never need to be called by MD
outside analyze_sbs and the routines that add devices, right?  Perhaps we
would just remove sync_super from super_types, or check for
mddev->sync_super before calling super_types[x].sync_super?  I'll think
more about it.

thanks,
 brassow
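
The second option mentioned above (check for mddev->sync_super before calling
super_types[x].sync_super) can be sketched as a simple fallback.  This is a
user-space model with hypothetical fb_* names; the two-entry table merely
stands in for md's version-indexed super_types array.

```c
#include <stddef.h>

struct fb_mddev;
typedef void (*fb_sync_fn)(struct fb_mddev *mddev);

/* Hypothetical mddev model with an optional per-array override. */
struct fb_mddev {
	int major_version;	/* index into the version table */
	fb_sync_fn sync_super;	/* NULL for native md arrays */
	int table_syncs;	/* times a table handler ran */
	int override_syncs;	/* times the override ran */
};

void fb_sync_v0(struct fb_mddev *mddev) { mddev->table_syncs++; }
void fb_sync_v1(struct fb_mddev *mddev) { mddev->table_syncs++; }
void fb_sync_dm(struct fb_mddev *mddev) { mddev->override_syncs++; }

/* Stand-in for the version-indexed table of sync handlers. */
static const fb_sync_fn fb_super_types[] = { fb_sync_v0, fb_sync_v1 };
#define FB_NR_TYPES (sizeof(fb_super_types) / sizeof(fb_super_types[0]))

/* Prefer the per-array pointer; otherwise fall back to the table. */
int fb_do_sync(struct fb_mddev *mddev)
{
	if (mddev->sync_super) {
		mddev->sync_super(mddev);
		return 0;
	}
	if (mddev->major_version < 0 ||
	    (size_t)mddev->major_version >= FB_NR_TYPES)
		return -1;
	fb_super_types[mddev->major_version](mddev);
	return 0;
}
```

Native arrays leave the pointer NULL and behave as before, so in this model
only the dm-raid side would need to set anything.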

