All of lore.kernel.org
 help / color / mirror / Atom feed
From: Doug Ledford <dledford@redhat.com>
To: Neil Brown <neilb@suse.de>
Cc: Jon Nelson <jnelson-linux-raid@jamponi.net>,
	LinuxRaid <linux-raid@vger.kernel.org>
Subject: Re: [Patch] mdadm ignoring homehost?
Date: Mon, 6 Apr 2009 10:47:00 -0400	[thread overview]
Message-ID: <51C39605-BBE7-48E8-AB35-D55D0B36B3A6@redhat.com> (raw)
In-Reply-To: <18899.61151.445765.360191@notabene.brown>


[-- Attachment #1.1: Type: text/plain, Size: 5478 bytes --]

On Apr 1, 2009, at 6:46 PM, Neil Brown wrote:

> On Wednesday April 1, jnelson-linux-raid@jamponi.net wrote:
>> ping?
>
> Oh yeah, that's right, I was going to reply to that - thanks for the
> reminder.
>
>>
>> On Tue, Mar 24, 2009 at 11:57 AM, Jon Nelson
>> <jnelson-linux-raid@jamponi.net> wrote:
>>>
>>> I have a raid1 comprised of a local physical device (/dev/sda) and a
>>> network block device (/dev/nbd0).
>>> When the machine hosting the network block device comes up, however,
>>> it creates /dev/md127.
>>> Why?
>
> Because you cannot please all the people, all the time.

Very true.

>
> People seem to want their arrays to auto-assemble - you know, just
> appear and do the right thing, read their mind probably, because
> creating config files is too hard.
> So I've endeavoured to make that happen.
>
> The biggest problem with auto-assembly is what to do if two arrays
> claim to have the same name. (e.g. /dev/md0) - which one wins.
> The 'homehost' is (currently) used to resolve that.  An array only
> gets to use the name it claims to have if it can show that it belongs
> to "this" host.  If it doesn't it still get assembled, but with some
> other more generic name.

FWIW, I happen to disagree with this method.  And I'm currently  
testing out a new algorithm for this in Fedora 11 beta.

The logic behind this in mdadm-3.0devel3 is basically "if the array  
exists in mdadm.conf or if it has this homehost, assemble using normal  
name, else use a random name".  However, in the world of movable  
arrays (think one of those 5 disk SATA raid towers that just has a  
single eSATA port and a port replicator, which can easily be moved  
from machine to machine), this doesn't work so well.  The problem is  
that when you assemble an array with a random number, you confuse  
users.  They might find the array eventually, but it's certainly not  
as easy as if the array used the name they expected.  In an attempt to  
get mdadm to not possibly conflict with local array names, the  
homehost method of selecting which array name to use causes confusion  
all the time, instead of only confusing users when a conflict actually  
occurs.  This doesn't make sense to me, so I redid the tests in mdadm  
to change this (this is exacerbated by the fact that if your array  
does not define a homehost, it gets treated as though it has a  
different homehost, so common version 0.90 arrays will always get  
assembled as a random number if they aren't in the mdadm.conf file  
whether they are meant for this host or not).

So, my logic goes like this:

Does the array match an array mdadm.conf via uuid?  If yes, use name  
from mdadm.conf.  If no, does the array match an entry in mdadm.conf  
via the standard super-minor/name mapping?  If yes, and that array  
line contains a uuid that doesn't match this entry, then use a random  
name because this is likely a conflict.  If yes and that line does not  
contain a uuid entry, then this is likely a match, but a poor one.   
Use the name, but don't like it.  If no, then this array didn't match  
the mdadm.conf file at all and is likely a foreign array.  However, if  
there is no mdadm.conf file, or if there is a mdadm.conf file and  
nothing in it used our name, then foreign or not, it likely won't  
conflict on name, so go ahead and use the standard name for this device.

I had to modify the match loop to store both uuid and name matches  
separately in order to support this logic.  There's some other changes  
that were necessary in order to make it work properly, and I had to  
change mdopen.c to automatically go from what we thought was a good  
name to a random name if a conflict on an array happens in order to  
avoid failed autoassembles.  However, I'm personally much happier with  
the results.  For example, I can define md0 in the mdadm.conf file,  
create two different md0 arrays, then attempt to autoassemble the one  
that isn't in mdadm.conf and it will automatically get a random name  
and when the one that is in mdadm.conf shows up it gets the right  
name.  I can also define to md0 arrays with neither of them in the  
mdadm.conf file and it will assemble the first as md0 and the second  
as name md0_0 with a random minor (I think, it's been a week or so  
since I did that testing).  Anyway, it works well, and it basically  
negates the need for homehost in my opinion.  And the fact that it  
only assembles an array with a random number when it truly needs to is  
something that will help to greatly reduce confusion of users, which  
is always a plus in my book.  I'll attach the patch for your review.   
I could have shortened the logic in the match tests to just what's  
needed to set things right, but I left the long version so people can  
see all the possible options and why a specific setting is chosen on  
any given option.  Oh, and the patch also loosens up the name matching  
somewhat so that if someone names their device /dev/md0, that matches  
super-minor 0, as does md0 and just plain 0.  The original match  
setup, at least for devices not in the mdadm.conf file with a name in  
the array line, would only match the array name if it was numeric only  
(aka, homehost:0 or just 0).  I found that to be overly restrictive  
and contrary to what a lot of people would expect should be entered in  
the name field of the superblock.

Since I'm sending this anyway, I'll send a couple other changes I made  
to our mdadm in separate mails.


[-- Attachment #1.2: mdadm-3.0-foreign.patch --]
[-- Type: application/octet-stream, Size: 9968 bytes --]

--- mdadm-3.0-devel3/Incremental.c.foreign	2009-03-20 17:49:20.000000000 -0400
+++ mdadm-3.0-devel3/Incremental.c	2009-03-20 21:19:50.000000000 -0400
@@ -29,12 +29,14 @@
  */
 
 #include	"mdadm.h"
+#include	<ctype.h>
 
 static int count_active(struct supertype *st, int mdfd, char **availp,
 			struct mdinfo *info);
 static void find_reject(int mdfd, struct supertype *st, struct mdinfo *sra,
 			int number, __u64 events, int verbose,
 			char *array_name);
+static int compare_array_name(char *conf_name, char *sb_name);
 
 int Incremental(char *devname, int verbose, int runstop,
 		struct supertype *st, char *homehost, int autof)
@@ -48,8 +50,10 @@ int Incremental(char *devname, int verbo
 	 * 2/ Find metadata, reject if none appropriate (check
 	 *       version/name from args)
 	 * 3/ Check if there is a match in mdadm.conf
-	 * 3a/ if not, check for homehost match.  If no match, assemble as
-	 *    a 'foreign' array.
+	 * 3a/ Evalutate the quality of match and whether or not we have a
+	 * 	 conf file at all, and make a decision about whether or not
+	 * 	 to allow this array to keep its preferred name based upon 
+	 * 	 that
 	 * 4/ Determine device number.
 	 * - If in mdadm.conf with std name, use that
 	 * - UUID in /dev/md/mdadm.map  use that
@@ -78,7 +82,7 @@ int Incremental(char *devname, int verbo
 	 */
 	struct stat stb;
 	struct mdinfo info;
-	struct mddev_ident_s *array_list, *match;
+	struct mddev_ident_s *array_list, *match, *match_uuid, *match_name;
 	char chosen_name[1024];
 	int rv;
 	struct map_ent *mp, *map = NULL;
@@ -148,26 +152,42 @@ int Incremental(char *devname, int verbo
 	st->ss->getinfo_super(st, &info);
 	/* 3/ Check if there is a match in mdadm.conf */
 
+	name_to_use = strchr(info.name, ':');
+	if (name_to_use)
+		name_to_use++;
+	else
+		name_to_use = info.name;
 	array_list = conf_get_ident(NULL);
 	match = NULL;
+	match_uuid = NULL;
+	match_name = NULL;
 	for (; array_list; array_list = array_list->next) {
+		/* Check for matching uuid, then drop through to check and see
+		 * if we also have a matching name, and to catch cases of
+		 * matching names without a corresponding uuid match */
 		if (array_list->uuid_set &&
 		    same_uuid(array_list->uuid, info.uuid, st->ss->swapuuid)
-		    == 0) {
-			if (verbose >= 2 && array_list->devname)
+		    != 0)
+			match_uuid = array_list;
+		else if (array_list->uuid_set && verbose >= 2 &&
+			 array_list->devname)
 				fprintf(stderr, Name
 					": UUID differs from %s.\n",
 					array_list->devname);
-			continue;
-		}
+		/* If we match name, save it off separately so we can tell if
+		 * we matched uuid, name, or both, and if both, if they were
+		 * the same entry */
 		if (array_list->name[0] &&
-		    strcasecmp(array_list->name, info.name) != 0) {
-			if (verbose >= 2 && array_list->devname)
+		    compare_array_name(array_list->name, info.name))
+			match_name = array_list;
+		else if (array_list->name[0] && verbose >= 2 &&
+			 array_list->devname)
 				fprintf(stderr, Name
 					": Name differs from %s.\n",
 					array_list->devname);
+		if ((!match_uuid || match == match_uuid) &&
+		    (!match_name || match == match_name))
 			continue;
-		}
 		if (array_list->devices &&
 		    !match_oneof(array_list->devices, devname)) {
 			if (verbose >= 2 && array_list->devname)
@@ -197,7 +217,13 @@ int Incremental(char *devname, int verbo
 		/* FIXME, should I check raid_disks and level too?? */
 
 		if (match) {
-			if (verbose >= 0) {
+			if (match_uuid != match_name) {
+				if (match_uuid->devname)
+					fprintf(stderr, Name ": more than one "
+						"match for %s, using the UUID "
+						"match\n", match_uuid->devname);
+				match = match_uuid;
+			} else if (verbose >= 0) {
 				if (match->devname && array_list->devname)
 					fprintf(stderr, Name
 		   ": we match both %s and %s - cannot decide which to use.\n",
@@ -205,23 +231,52 @@ int Incremental(char *devname, int verbo
 				else
 					fprintf(stderr, Name
 						": multiple lines in mdadm.conf match\n");
+				return 2;
 			}
-			return 2;
 		}
 		match = array_list;
 	}
 
-	/* 3a/ if not, check for homehost match.  If no match, continue
-	 * but don't trust the 'name' in the array. Thus a 'random' minor
-	 * number will be assigned, and the device name will be based
-	 * on that. */
-	if (match)
+	/* 3a/ Decide if we got a good match, two matches, no matches, or a
+	 * likely foreign match.  I dropped the homehost test entirely because
+	 * it didn't seem to add any value whatsoever above and beyond what
+	 * these tests can do. */
+	if (match && match_uuid == match_name) {
+		/* found in conf, both name and uuid match */
 		trustworthy = LOCAL;
-	else if (homehost == NULL ||
-		 st->ss->match_home(st, homehost) != 1)
-		trustworthy = FOREIGN;
-	else
+	} else if (match_uuid && match_name) {
+		/* found both a name and a uuid match, but not on the same
+		 * entry, so prefer the uuid match (done above) */
 		trustworthy = LOCAL;
+	} else if (!match_uuid && match_name) {
+		/* no uuid match, but name match */
+		if (match_name->uuid_set) {
+			/* oops, name that matched had a uuid, it just wasn't
+			 * right, assume there is a local device with both
+			 * a matching name and uuid, so this needs a random
+			 * name */
+			trustworthy = FOREIGN;
+			match = NULL;
+		} else
+			/* matched name, and the matching entry in conf file
+			 * didn't include a uuid, and this uuid never showed
+			 * up anywhere else in the conf file, so consider it
+			 * a soft match and allow it...although users should
+			 * *REALLY* include the uuid on array lines in the
+			 * conf file */
+			trustworthy = LOCAL;
+	} else { /* no match at all */
+		if (!conf_exists())
+			/* If we don't even have a conf file, this is foreign,
+			 * but also not likely to conflict with anything
+			 * local, so let it keep its preferred name */
+			trustworthy = LOCAL;
+		else
+			/* We have a conf file, this didn't match any uuids
+			 * or names, so also not likely to conflict, let it
+			 * keep its own name */
+			trustworthy = LOCAL;
+	}
 
 	/* There are three possible sources for 'autof':  command line,
 	 * ARRAY line in mdadm.conf, or CREATE line in mdadm.conf.
@@ -240,11 +295,6 @@ int Incremental(char *devname, int verbo
 		return Incremental_container(st, devname, verbose, runstop,
 					     autof, trustworthy);
 	}
-	name_to_use = strchr(info.name, ':');
-	if (name_to_use)
-		name_to_use++;
-	else
-		name_to_use = info.name;
 
 	if ((!name_to_use || name_to_use[0] == 0) &&
 	    info.array.level == LEVEL_CONTAINER &&
@@ -797,3 +847,45 @@ int Incremental_container(struct superty
 	map_unlock(&map);
 	return 0;
 }
+
+static int compare_array_name(char *conf_name, char *sb_name)
+{
+	char *cptr, *sptr;
+	int conf_num = -1;
+
+	/* usage of the name variable in the superblock comes in several
+	 * flavors:
+	 * A) full md pathname (/dev/md0)
+	 * B) just the md name (md0)
+	 * C) just the md number (0)
+	 * D) all of the above, but with hostname: prefixed to it
+	 *
+	 * Depending on which of those variants we have, we need to alter
+	 * how we attempt to match the array name in the mdadm.conf file
+	 * which is always a full pathname.  We don't match on hostname:
+	 * though, so eliminate it from the equation.
+	 */
+
+	if ((sptr = strchr(sb_name, ':')) == NULL)
+		sptr = sb_name;
+	else
+		sptr++;
+
+	/* Do we have a full pathname in the superblock name field? */
+	if (strchr(sptr, '/'))
+		return !strcasecmp(conf_name, sptr);
+	/* If not, is it just a number or an md device name? */
+	else if (isdigit(sptr[0])) {
+		cptr = conf_name + strlen(conf_name);
+		while (cptr > conf_name && isdigit(cptr[-1]))
+			cptr--;
+		if (cptr[0])
+			conf_num = strtoul(cptr, NULL, 10);
+		return conf_num == strtoul(sptr, NULL, 10);
+	} /* fall through else, it's a device name but not a full path */
+
+	cptr = strcasestr(conf_name, sptr);
+	if (cptr)
+		return !strcasecmp(cptr, sptr);
+	return 0;
+}
--- mdadm-3.0-devel3/mdadm.h.foreign	2009-03-10 01:39:41.000000000 -0400
+++ mdadm-3.0-devel3/mdadm.h	2009-03-20 17:49:20.000000000 -0400
@@ -785,6 +785,7 @@ extern mddev_dev_t conf_get_devs(void);
 extern int conf_test_dev(char *devname);
 extern struct createinfo *conf_get_create_info(void);
 extern void set_conffile(char *file);
+extern int conf_exists(void);
 extern char *conf_get_mailaddr(void);
 extern char *conf_get_mailfrom(void);
 extern char *conf_get_program(void);
--- mdadm-3.0-devel3/mdopen.c.foreign	2009-03-20 19:02:38.000000000 -0400
+++ mdadm-3.0-devel3/mdopen.c	2009-03-20 19:02:43.000000000 -0400
@@ -159,7 +159,6 @@ int create_mddev(char *dev, char *name, 
 	strcpy(chosen, "/dev/md/");
 	cname = chosen + strlen(chosen);
 
-
 	if (dev) {
 		
 		if (strncmp(dev, "/dev/md/", 8) == 0) {
@@ -240,12 +239,14 @@ int create_mddev(char *dev, char *name, 
 	if (num < 0 && trustworthy == LOCAL && name) {
 		/* if name is numeric, use that for num
 		 * if it is not already in use */
-		char *ep;
-		num = strtoul(name, &ep, 10);
-		if (ep == name || *ep)
-			num = -1;
-		else if (mddev_busy(use_mdp ? (-1-num) : num))
-			num = -1;
+		char *e = name + strlen(name);
+		while (e > name && isdigit(e[-1]))
+			e--;
+		if (e[0]) {
+			num = strtoul(e, NULL, 10);
+			if (mddev_busy(use_mdp ? (-1-num) : num))
+				num = -1;
+		}
 	}
 
 	if (num < 0) {
--- mdadm-3.0-devel3/config.c.foreign	2009-03-10 01:39:41.000000000 -0400
+++ mdadm-3.0-devel3/config.c	2009-03-20 17:49:20.000000000 -0400
@@ -637,7 +637,7 @@ void homehostline(char *line)
 	}
 }
 
-
+int exists = 0;
 int loaded = 0;
 
 static char *conffile = NULL;
@@ -683,6 +683,7 @@ void load_conffile(void)
 	if (f == NULL)
 		return;
 
+	exists = 1;
 	loaded = 1;
 	while ((line=conf_line(f))) {
 		switch(match_keyword(line)) {
@@ -718,6 +719,13 @@ void load_conffile(void)
 /*    printf("got file\n"); */
 }
 
+int conf_exists(void)
+{
+	if (!loaded)
+		load_conffile();
+	return exists;
+}
+
 char *conf_get_mailaddr(void)
 {
 	load_conffile();

[-- Attachment #1.3: Type: text/plain, Size: 171 bytes --]



--

Doug Ledford <dledford@redhat.com>

GPG KeyID: CFBFF194
http://people.redhat.com/dledford

InfiniBand Specific RPMS
http://people.redhat.com/dledford/Infiniband





[-- Attachment #2: This is a digitally signed message part --]
[-- Type: application/pgp-signature, Size: 203 bytes --]

  reply	other threads:[~2009-04-06 14:47 UTC|newest]

Thread overview: 59+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2009-03-24 16:57 mdadm ignoring homehost? Jon Nelson
2009-04-01 15:15 ` Jon Nelson
2009-04-01 22:46   ` Neil Brown
2009-04-06 14:47     ` Doug Ledford [this message]
2009-04-06 19:33       ` [Patch] " Luca Berra
2009-04-17  3:49       ` Neil Brown
2009-04-17  7:08         ` Gabor Gombas
2009-04-20  5:23           ` Neil Brown
2009-04-21  6:34             ` Gabor Gombas
2009-04-21  7:06               ` Luca Berra
2009-04-17 18:17         ` Doug Ledford
2009-04-17 18:40           ` Piergiorgio Sartor
2009-04-18  7:54             ` Luca Berra
2009-04-18  8:36               ` Piergiorgio Sartor
2009-04-18 10:19                 ` Luca Berra
2009-04-18 13:06                   ` Piergiorgio Sartor
2009-04-20  5:58                     ` Neil Brown
2009-04-20 12:29                       ` Doug Ledford
2009-04-20 18:17                       ` Piergiorgio Sartor
2009-04-20 19:49                         ` Leslie Rhorer
2009-04-20 20:04                           ` Piergiorgio Sartor
2009-04-20 21:18                           ` Luca Berra
2009-04-20 21:13                         ` Luca Berra
2009-04-20 21:24                           ` Piergiorgio Sartor
2009-04-20 23:47                             ` Doug Ledford
2009-04-21  0:00                               ` Doug Ledford
2009-04-21  8:57                                 ` Michal Soltys
2009-04-21  6:29                               ` Luca Berra
2009-04-21 18:15                           ` Piergiorgio Sartor
2009-04-22 16:06                             ` Andrew Burgess
2009-04-23  1:20                               ` Doug Ledford
2009-04-23  5:51                                 ` Luca Berra
2009-04-23  6:09                                   ` Luca Berra
2009-04-23 11:05                                   ` Doug Ledford
2009-04-23 21:31                                     ` Luca Berra
2009-04-24 16:46                                       ` Doug Ledford
2009-04-24 19:15                                 ` Piergiorgio Sartor
2009-04-26 11:52                                   ` Doug Ledford
2009-04-26 12:14                                     ` Piergiorgio Sartor
2009-04-26 12:58                                       ` Piergiorgio Sartor
2009-04-26 18:06                                         ` Doug Ledford
2009-04-26 19:08                                           ` Piergiorgio Sartor
2009-04-26 21:37                                       ` Michal Soltys
2009-04-18 14:34             ` Andrew Burgess
2009-04-18  8:12           ` Luca Berra
2009-04-18  8:44             ` Piergiorgio Sartor
2009-04-18 13:35             ` Doug Ledford
2009-04-18 13:52               ` Piergiorgio Sartor
2009-04-18 14:50                 ` Doug Ledford
2009-04-18 14:48               ` Jon Nelson
2009-04-20  6:08               ` Neil Brown
2009-04-20 12:26                 ` Luca Berra
2009-04-20 12:36                 ` Doug Ledford
2009-04-18 13:58           ` Bill Davidsen
2009-04-20  7:23           ` Neil Brown
2009-04-20 13:15             ` Doug Ledford
2009-04-21  6:54               ` Neil Brown
2009-05-11  6:47               ` Neil Brown
2009-04-01 22:47 ` Michal Soltys

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=51C39605-BBE7-48E8-AB35-D55D0B36B3A6@redhat.com \
    --to=dledford@redhat.com \
    --cc=jnelson-linux-raid@jamponi.net \
    --cc=linux-raid@vger.kernel.org \
    --cc=neilb@suse.de \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.