All of lore.kernel.org
 help / color / mirror / Atom feed
* Re: [dm-devel] [PATCH 1/3] libmultipath: merge_mptable(): sort table by wwid
       [not found] ` <20220818210630.19395-2-mwilck@suse.com>
@ 2022-08-23 20:48   ` Benjamin Marzinski
  2022-08-24  7:07     ` Martin Wilck
  0 siblings, 1 reply; 9+ messages in thread
From: Benjamin Marzinski @ 2022-08-23 20:48 UTC (permalink / raw)
  To: mwilck; +Cc: dm-devel

On Thu, Aug 18, 2022 at 11:06:28PM +0200, mwilck@suse.com wrote:
> From: Martin Wilck <mwilck@suse.com>
> 
> If the mptable is very large (for example, in a configuration with
> lots of maps assigned individual aliases), merge_mptable may get
> very slow because it needs to make O(n^2) string comparisons (with
> n being the number of mptable entries). WWID strings often differ
> only in the last few bytes, causing a lot of CPU time spent in
> strcmp().
> 
> merge_mptable is executed whenever multipath or multipathd is started, that
> is, also for "multipath -u" and "multipath -U" invocations from udev rules.
> Optimize it by sorting the mptable before merging. This way we don't need
> to iterate towards the end of the vector searching for duplicates.
> 
> Signed-off-by: Martin Wilck <mwilck@suse.com>
> ---
>  libmultipath/config.c | 15 +++++++++++++--
>  libmultipath/vector.c |  8 ++++++++
>  libmultipath/vector.h |  1 +
>  3 files changed, 22 insertions(+), 2 deletions(-)
> 
> diff --git a/libmultipath/config.c b/libmultipath/config.c
> index ab8b26e..a2c79a4 100644
> --- a/libmultipath/config.c
> +++ b/libmultipath/config.c
> @@ -520,6 +520,14 @@ merge_mpe(struct mpentry *dst, struct mpentry *src)
>  	merge_num(mode);
>  }
>  
> +static int wwid_compar(const void *p1, const void *p2)
> +{
> +	const char *wwid1 = (*(struct mpentry * const *)p1)->wwid;
> +	const char *wwid2 = (*(struct mpentry * const *)p2)->wwid;
> +
> +	return strcmp(wwid1, wwid2);
> +}
> +
>  void merge_mptable(vector mptable)
>  {
>  	struct mpentry *mp1, *mp2;
> @@ -533,10 +541,13 @@ void merge_mptable(vector mptable)
>  			free_mpe(mp1);
>  			continue;
>  		}
> +	}
> +	vector_sort(mptable, wwid_compar);
> +	vector_foreach_slot(mptable, mp1, i) {
>  		j = i + 1;
>  		vector_foreach_slot_after(mptable, mp2, j) {
> -			if (!mp2->wwid || strcmp(mp1->wwid, mp2->wwid))
> -				continue;
> +			if (strcmp(mp1->wwid, mp2->wwid))
> +				break;
>  			condlog(1, "%s: duplicate multipath config section for %s",
>  				__func__, mp1->wwid);
>  			merge_mpe(mp2, mp1);

The way merge_mpe() works, values are only copied from mp1's variables
if the corresponding variable is unset in mp2. This requires the
original order of mp1 and mp2 to be unchanged by the sorting algorithm,
but according to the qsort() man page "If two members compare as equal,
their order in the sorted array is undefined."

One quick and dirty way we could fix this is to add a variable to struct
mptable called config_idx, which is its index in the config file.  If
the wwids are equal, you compare that.

Something like (only compile tested):
------------
diff --git a/libmultipath/config.c b/libmultipath/config.c
index a2c79a48..a8e2620b 100644
--- a/libmultipath/config.c
+++ b/libmultipath/config.c
@@ -522,10 +522,15 @@ merge_mpe(struct mpentry *dst, struct mpentry *src)
 
 static int wwid_compar(const void *p1, const void *p2)
 {
+	int r;
 	const char *wwid1 = (*(struct mpentry * const *)p1)->wwid;
 	const char *wwid2 = (*(struct mpentry * const *)p2)->wwid;
+	int idx1 = (*(struct mpentry * const *)p1)->config_idx;
+	int idx2 = (*(struct mpentry * const *)p2)->config_idx;
 
-	return strcmp(wwid1, wwid2);
+	if ((r = strcmp(wwid1, wwid2)) != 0)
+		return r;
+	return idx1 - idx2;
 }
 
 void merge_mptable(vector mptable)
@@ -541,6 +546,7 @@ void merge_mptable(vector mptable)
 			free_mpe(mp1);
 			continue;
 		}
+		mp1->config_idx = i;
 	}
 	vector_sort(mptable, wwid_compar);
 	vector_foreach_slot(mptable, mp1, i) {
diff --git a/libmultipath/config.h b/libmultipath/config.h
index fdcdff0a..fc44914c 100644
--- a/libmultipath/config.h
+++ b/libmultipath/config.h
@@ -133,6 +133,7 @@ struct mpentry {
 	uid_t uid;
 	gid_t gid;
 	mode_t mode;
+	int config_idx;
 };
 
 struct config {
------------

-Ben

> diff --git a/libmultipath/vector.c b/libmultipath/vector.c
> index e2d1ec9..0befe71 100644
> --- a/libmultipath/vector.c
> +++ b/libmultipath/vector.c
> @@ -208,3 +208,11 @@ int vector_find_or_add_slot(vector v, void *value)
>  	vector_set_slot(v, value);
>  	return VECTOR_SIZE(v) - 1;
>  }
> +
> +void vector_sort(vector v, int (*compar)(const void *, const void *))
> +{
> +	if (!v || !v->slot || !v->allocated)
> +		return;
> +	return qsort((void *)v->slot, v->allocated, sizeof(void *), compar);
> +
> +}
> diff --git a/libmultipath/vector.h b/libmultipath/vector.h
> index 2862dc2..c0b09cb 100644
> --- a/libmultipath/vector.h
> +++ b/libmultipath/vector.h
> @@ -89,4 +89,5 @@ extern void vector_repack(vector v);
>  extern void vector_dump(vector v);
>  extern void dump_strvec(vector strvec);
>  extern int vector_move_up(vector v, int src, int dest);
> +void vector_sort(vector v, int (*compar)(const void *, const void *));
>  #endif
> -- 
> 2.37.1
--
dm-devel mailing list
dm-devel@redhat.com
https://listman.redhat.com/mailman/listinfo/dm-devel


^ permalink raw reply related	[flat|nested] 9+ messages in thread

* Re: [dm-devel] [PATCH 2/3] libmultipath: check_alias_settings(): pre-sort mptable by alias
       [not found] ` <20220818210630.19395-3-mwilck@suse.com>
@ 2022-08-23 21:27   ` Benjamin Marzinski
  2022-08-24  7:32     ` Martin Wilck
  0 siblings, 1 reply; 9+ messages in thread
From: Benjamin Marzinski @ 2022-08-23 21:27 UTC (permalink / raw)
  To: mwilck; +Cc: dm-devel

On Thu, Aug 18, 2022 at 11:06:29PM +0200, mwilck@suse.com wrote:
> From: Martin Wilck <mwilck@suse.com>
> 
> add_binding() contains an optimization; it assumes that the list of
> bindings is alphabetically sorted by alias, and tries to maintain
> this order.
> 
> But conf->mptable is sorted by wwid. Therefore check_alias_settings() may take
> a long time if the mptable is large.
> 
> Create a copy of the mptable, sorted by alias, and use it for add_bindings().
> This speeds up check_alias_settings by roughly a factor of 10 (0.1s vs. 1s)
> for my test setup with 10000 entries in the mptable.
> 
> Signed-off-by: Martin Wilck <mwilck@suse.com>
> ---
>  libmultipath/alias.c | 29 ++++++++++++++++++++++++++++-
>  1 file changed, 28 insertions(+), 1 deletion(-)
> 
> diff --git a/libmultipath/alias.c b/libmultipath/alias.c
> index 548a118..60428fe 100644
> --- a/libmultipath/alias.c
> +++ b/libmultipath/alias.c
> @@ -672,6 +672,26 @@ static void cleanup_fclose(void *p)
>  	fclose(p);
>  }
>  
> +static int alias_compar(const void *p1, const void *p2)
> +{
> +	const char *alias1 = (*(struct mpentry * const *)p1)->alias;
> +	const char *alias2 = (*(struct mpentry * const *)p2)->alias;
> +
> +	if (!alias1 && !alias2)
> +		return 0;
> +	if (!alias1)
> +		return 1;
> +	if (!alias2)
> +		return -1;
> +	return strcmp(alias1, alias2);
> +}
> +
> +static void cleanup_vector_free(void *arg)
> +{
> +	if  (arg)
> +		vector_free((vector)arg);
> +}
> +
>  /*
>   * check_alias_settings(): test for inconsistent alias configuration
>   *
> @@ -693,10 +713,16 @@ int check_alias_settings(const struct config *conf)
>  	int can_write;
>  	int rc = 0, i, fd;
>  	Bindings bindings = {.allocated = 0, };
> +	vector mptable = NULL;
>  	struct mpentry *mpe;
>  
>  	pthread_cleanup_push_cast(free_bindings, &bindings);
> -	vector_foreach_slot(conf->mptable, mpe, i) {
> +	pthread_cleanup_push(cleanup_vector_free, mptable);
> +	mptable = vector_convert(NULL, conf->mptable, struct mpentry *, identity);
> +	if (!mptable)
> +		return -1;

Are there other places in the code where we return without popping the
cleanup handler? According to the man page "POSIX.1 says that the effect
of using return, break, continue, or  goto to  prematurely  leave  a
block  bracketed pthread_cleanup_push() and pthread_cleanup_pop() is
undefined.  Portable applications should avoid doing this." It also says
that linux implements these as macros that expand to create code blocks.
So, I'm pretty sure this is safe in linux, but if we aren't already
doing it, we probably shouldn't start, especially since vector_convert()
doesn't have any pthread cancellation points, so we can just move the
push until after we are sure we successfully copied the vector.

> +	vector_sort(mptable, alias_compar);
> +	vector_foreach_slot(mptable, mpe, i) {
>  		if (!mpe->wwid || !mpe->alias)

Unless I'm missing something, merge_mptable() should have already
guaranteed that mpe->wwid must be non-NULL. Also, since mptable has all
of the entries with a NULL alias sorted to the end, as soon as we hit
one, we can stop checking.

-Ben

>  			continue;
>  		if (add_binding(&bindings, mpe->alias, mpe->wwid) ==
> @@ -710,6 +736,7 @@ int check_alias_settings(const struct config *conf)
>  	}
>  	/* This clears the bindings */
>  	pthread_cleanup_pop(1);
> +	pthread_cleanup_pop(1);
>  
>  	pthread_cleanup_push_cast(free_bindings, &bindings);
>  	fd = open_file(conf->bindings_file, &can_write, BINDINGS_FILE_HEADER);
> -- 
> 2.37.1
--
dm-devel mailing list
dm-devel@redhat.com
https://listman.redhat.com/mailman/listinfo/dm-devel


^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [dm-devel] [PATCH 3/3] multipath: optimize program startup for frequent invocations
       [not found] ` <20220818210630.19395-4-mwilck@suse.com>
@ 2022-08-23 21:53   ` Benjamin Marzinski
  0 siblings, 0 replies; 9+ messages in thread
From: Benjamin Marzinski @ 2022-08-23 21:53 UTC (permalink / raw)
  To: mwilck; +Cc: dm-devel

On Thu, Aug 18, 2022 at 11:06:30PM +0200, mwilck@suse.com wrote:
> From: Martin Wilck <mwilck@suse.com>
> 
> Neither "multipath -u" nor "multipath -U" need initialization of the
> prioritizers, checkers, and foreign libraries. Also, these commands
> need not fail if the bindings file is inconsistent. Move these
> possibly slow initialization steps after these special command
> invocations.
> 
> Signed-off-by: Martin Wilck <mwilck@suse.com>
Reviewed-by: Benjamin Marzinski <bmarzins@redhat.com>
> ---
>  multipath/main.c | 33 +++++++++++++++++----------------
>  1 file changed, 17 insertions(+), 16 deletions(-)
> 
> diff --git a/multipath/main.c b/multipath/main.c
> index 034dd2f..8e5154a 100644
> --- a/multipath/main.c
> +++ b/multipath/main.c
> @@ -957,11 +957,6 @@ main (int argc, char *argv[])
>  		exit(RTVL_FAIL);
>  	}
>  
> -	if (check_alias_settings(conf)) {
> -		fprintf(stderr, "fatal configuration error, aborting");
> -		exit(RTVL_FAIL);
> -	}
> -
>  	if (optind < argc) {
>  		dev = calloc(1, FILE_NAME_SIZE);
>  
> @@ -988,20 +983,9 @@ main (int argc, char *argv[])
>  
>  	libmp_udev_set_sync_support(1);
>  
> -	if (init_checkers()) {
> -		condlog(0, "failed to initialize checkers");
> -		goto out;
> -	}
> -	if (init_prio()) {
> -		condlog(0, "failed to initialize prioritizers");
> -		goto out;
> -	}
> -
>  	if ((cmd == CMD_LIST_SHORT || cmd == CMD_LIST_LONG) && enable_foreign)
>  		conf->enable_foreign = strdup("");
>  
> -	/* Failing here is non-fatal */
> -	init_foreign(conf->enable_foreign);
>  	if (cmd == CMD_USABLE_PATHS) {
>  		r = check_usable_paths(conf, dev, dev_type) ?
>  			RTVL_FAIL : RTVL_OK;
> @@ -1036,6 +1020,23 @@ main (int argc, char *argv[])
>  		break;
>  	}
>  
> +	if (check_alias_settings(conf)) {
> +		fprintf(stderr, "fatal configuration error, aborting");
> +		exit(RTVL_FAIL);
> +	}
> +
> +	if (init_checkers()) {
> +		condlog(0, "failed to initialize checkers");
> +		goto out;
> +	}
> +	if (init_prio()) {
> +		condlog(0, "failed to initialize prioritizers");
> +		goto out;
> +	}
> +
> +	/* Failing here is non-fatal */
> +	init_foreign(conf->enable_foreign);
> +
>  	if (cmd == CMD_RESET_WWIDS) {
>  		struct multipath * mpp;
>  		int i;
> -- 
> 2.37.1
--
dm-devel mailing list
dm-devel@redhat.com
https://listman.redhat.com/mailman/listinfo/dm-devel


^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [dm-devel] [PATCH 1/3] libmultipath: merge_mptable(): sort table by wwid
  2022-08-23 20:48   ` [dm-devel] [PATCH 1/3] libmultipath: merge_mptable(): sort table by wwid Benjamin Marzinski
@ 2022-08-24  7:07     ` Martin Wilck
  2022-08-24 16:16       ` Benjamin Marzinski
  0 siblings, 1 reply; 9+ messages in thread
From: Martin Wilck @ 2022-08-24  7:07 UTC (permalink / raw)
  To: Benjamin Marzinski; +Cc: dm-devel

On Tue, 2022-08-23 at 15:48 -0500, Benjamin Marzinski wrote:
> On Thu, Aug 18, 2022 at 11:06:28PM +0200, mwilck@suse.com wrote:
> > From: Martin Wilck <mwilck@suse.com>
> > 
> > If the mptable is very large (for example, in a configuration with
> > lots of maps assigned individual aliases), merge_mptable may get
> > very slow because it needs to make O(n^2) string comparisons (with
> > n being the number of mptable entries). WWID strings often differ
> > only in the last few bytes, causing a lot of CPU time spent in
> > strcmp().
> > 
> > merge_mptable is executed whenever multipath or multipathd is
> > started, that
> > is, also for "multipath -u" and "multipath -U" invocations from
> > udev rules.
> > Optimize it by sorting the mptable before merging. This way we
> > don't need
> > to iterate towards the end of the vector searching for duplicates.
> > 
> > Signed-off-by: Martin Wilck <mwilck@suse.com>
> > ---
> >  libmultipath/config.c | 15 +++++++++++++--
> >  libmultipath/vector.c |  8 ++++++++
> >  libmultipath/vector.h |  1 +
> >  3 files changed, 22 insertions(+), 2 deletions(-)
> > 
> > diff --git a/libmultipath/config.c b/libmultipath/config.c
> > index ab8b26e..a2c79a4 100644
> > --- a/libmultipath/config.c
> > +++ b/libmultipath/config.c
> > @@ -520,6 +520,14 @@ merge_mpe(struct mpentry *dst, struct mpentry
> > *src)
> >         merge_num(mode);
> >  }
> >  
> > +static int wwid_compar(const void *p1, const void *p2)
> > +{
> > +       const char *wwid1 = (*(struct mpentry * const *)p1)->wwid;
> > +       const char *wwid2 = (*(struct mpentry * const *)p2)->wwid;
> > +
> > +       return strcmp(wwid1, wwid2);
> > +}
> > +
> >  void merge_mptable(vector mptable)
> >  {
> >         struct mpentry *mp1, *mp2;
> > @@ -533,10 +541,13 @@ void merge_mptable(vector mptable)
> >                         free_mpe(mp1);
> >                         continue;
> >                 }
> > +       }
> > +       vector_sort(mptable, wwid_compar);
> > +       vector_foreach_slot(mptable, mp1, i) {
> >                 j = i + 1;
> >                 vector_foreach_slot_after(mptable, mp2, j) {
> > -                       if (!mp2->wwid || strcmp(mp1->wwid, mp2-
> > >wwid))
> > -                               continue;
> > +                       if (strcmp(mp1->wwid, mp2->wwid))
> > +                               break;
> >                         condlog(1, "%s: duplicate multipath config
> > section for %s",
> >                                 __func__, mp1->wwid);
> >                         merge_mpe(mp2, mp1);
> 
> The way merge_mpe() works, values are only copied from mp1's
> variables
> if the corresponding variable is unset in mp2. This requires the
> original order of mp1 and mp2 to be unchanged by the sorting
> algorithm,
> but according to the qsort() man page "If two members compare as
> equal,
> their order in the sorted array is undefined."
> 
> One quick and dirty way we could fix this is to add a variable to
> struct
> mptable called config_idx, which is its index in the config file.  If
> the wwids are equal, you compare that.

Thanks for pointing this out. I believe it's easier than that: as we're
passed the offsets into the array (aka struct mpentry **), we can
simply compare p1 and p2 if the strings are equal.

Agree?

Martin

--
dm-devel mailing list
dm-devel@redhat.com
https://listman.redhat.com/mailman/listinfo/dm-devel


^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [dm-devel] [PATCH 2/3] libmultipath: check_alias_settings(): pre-sort mptable by alias
  2022-08-23 21:27   ` [dm-devel] [PATCH 2/3] libmultipath: check_alias_settings(): pre-sort mptable by alias Benjamin Marzinski
@ 2022-08-24  7:32     ` Martin Wilck
  0 siblings, 0 replies; 9+ messages in thread
From: Martin Wilck @ 2022-08-24  7:32 UTC (permalink / raw)
  To: Benjamin Marzinski; +Cc: dm-devel

On Tue, 2022-08-23 at 16:27 -0500, Benjamin Marzinski wrote:
> On Thu, Aug 18, 2022 at 11:06:29PM +0200, mwilck@suse.com wrote:
> > From: Martin Wilck <mwilck@suse.com>
> > 
> > add_binding() contains an optimization; it assumes that the list of
> > bindings is alphabetically sorted by alias, and tries to maintain
> > this order.
> > 
> > But conf->mptable is sorted by wwid. Therefore
> > check_alias_settings() may take
> > a long time if the mptable is large.
> > 
> > Create a copy of the mptable, sorted by alias, and use it for
> > add_bindings().
> > This speeds up check_alias_settings by roughly a factor of 10 (0.1s
> > vs. 1s)
> > for my test setup with 10000 entries in the mptable.
> > 
> > Signed-off-by: Martin Wilck <mwilck@suse.com>
> > ---
> >  libmultipath/alias.c | 29 ++++++++++++++++++++++++++++-
> >  1 file changed, 28 insertions(+), 1 deletion(-)
> > 
> > diff --git a/libmultipath/alias.c b/libmultipath/alias.c
> > index 548a118..60428fe 100644
> > --- a/libmultipath/alias.c
> > +++ b/libmultipath/alias.c
> > @@ -672,6 +672,26 @@ static void cleanup_fclose(void *p)
> >         fclose(p);
> >  }
> >  
> > +static int alias_compar(const void *p1, const void *p2)
> > +{
> > +       const char *alias1 = (*(struct mpentry * const *)p1)-
> > >alias;
> > +       const char *alias2 = (*(struct mpentry * const *)p2)-
> > >alias;
> > +
> > +       if (!alias1 && !alias2)
> > +               return 0;
> > +       if (!alias1)
> > +               return 1;
> > +       if (!alias2)
> > +               return -1;
> > +       return strcmp(alias1, alias2);
> > +}
> > +
> > +static void cleanup_vector_free(void *arg)
> > +{
> > +       if  (arg)
> > +               vector_free((vector)arg);
> > +}
> > +
> >  /*
> >   * check_alias_settings(): test for inconsistent alias
> > configuration
> >   *
> > @@ -693,10 +713,16 @@ int check_alias_settings(const struct config
> > *conf)
> >         int can_write;
> >         int rc = 0, i, fd;
> >         Bindings bindings = {.allocated = 0, };
> > +       vector mptable = NULL;
> >         struct mpentry *mpe;
> >  
> >         pthread_cleanup_push_cast(free_bindings, &bindings);
> > -       vector_foreach_slot(conf->mptable, mpe, i) {
> > +       pthread_cleanup_push(cleanup_vector_free, mptable);
> > +       mptable = vector_convert(NULL, conf->mptable, struct
> > mpentry *, identity);
> > +       if (!mptable)
> > +               return -1;
> 
> Are there other places in the code where we return without popping
> the
> cleanup handler?

Of course not. Stupid mistake.

>  According to the man page "POSIX.1 says that the effect
> of using return, break, continue, or  goto to  prematurely  leave  a
> block  bracketed pthread_cleanup_push() and pthread_cleanup_pop() is
> undefined.  Portable applications should avoid doing this." It also
> says
> that linux implements these as macros that expand to create code
> blocks.
> So, I'm pretty sure this is safe in linux, but if we aren't already
> doing it, we probably shouldn't start, especially since
> vector_convert()
> doesn't have any pthread cancellation points, so we can just move the
> push until after we are sure we successfully copied the vector.
> 
> > +       vector_sort(mptable, alias_compar);
> > +       vector_foreach_slot(mptable, mpe, i) {
> >                 if (!mpe->wwid || !mpe->alias)
> 
> Unless I'm missing something, merge_mptable() should have already
> guaranteed that mpe->wwid must be non-NULL. Also, since mptable has
> all
> of the entries with a NULL alias sorted to the end, as soon as we hit
> one, we can stop checking.

Right, thanks. 

Martin

--
dm-devel mailing list
dm-devel@redhat.com
https://listman.redhat.com/mailman/listinfo/dm-devel


^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [dm-devel] [PATCH 1/3] libmultipath: merge_mptable(): sort table by wwid
  2022-08-24  7:07     ` Martin Wilck
@ 2022-08-24 16:16       ` Benjamin Marzinski
  2022-08-24 17:02         ` Martin Wilck
  0 siblings, 1 reply; 9+ messages in thread
From: Benjamin Marzinski @ 2022-08-24 16:16 UTC (permalink / raw)
  To: Martin Wilck; +Cc: dm-devel

On Wed, Aug 24, 2022 at 09:07:56AM +0200, Martin Wilck wrote:
> On Tue, 2022-08-23 at 15:48 -0500, Benjamin Marzinski wrote:
> > On Thu, Aug 18, 2022 at 11:06:28PM +0200, mwilck@suse.com wrote:
> > > From: Martin Wilck <mwilck@suse.com>
> > > 
> > > If the mptable is very large (for example, in a configuration with
> > > lots of maps assigned individual aliases), merge_mptable may get
> > > very slow because it needs to make O(n^2) string comparisons (with
> > > n being the number of mptable entries). WWID strings often differ
> > > only in the last few bytes, causing a lot of CPU time spent in
> > > strcmp().
> > > 
> > > merge_mptable is executed whenever multipath or multipathd is
> > > started, that
> > > is, also for "multipath -u" and "multipath -U" invocations from
> > > udev rules.
> > > Optimize it by sorting the mptable before merging. This way we
> > > don't need
> > > to iterate towards the end of the vector searching for duplicates.
> > > 
> > > Signed-off-by: Martin Wilck <mwilck@suse.com>
> > > ---
> > >  libmultipath/config.c | 15 +++++++++++++--
> > >  libmultipath/vector.c |  8 ++++++++
> > >  libmultipath/vector.h |  1 +
> > >  3 files changed, 22 insertions(+), 2 deletions(-)
> > > 
> > > diff --git a/libmultipath/config.c b/libmultipath/config.c
> > > index ab8b26e..a2c79a4 100644
> > > --- a/libmultipath/config.c
> > > +++ b/libmultipath/config.c
> > > @@ -520,6 +520,14 @@ merge_mpe(struct mpentry *dst, struct mpentry
> > > *src)
> > >         merge_num(mode);
> > >  }
> > >  
> > > +static int wwid_compar(const void *p1, const void *p2)
> > > +{
> > > +       const char *wwid1 = (*(struct mpentry * const *)p1)->wwid;
> > > +       const char *wwid2 = (*(struct mpentry * const *)p2)->wwid;
> > > +
> > > +       return strcmp(wwid1, wwid2);
> > > +}
> > > +
> > >  void merge_mptable(vector mptable)
> > >  {
> > >         struct mpentry *mp1, *mp2;
> > > @@ -533,10 +541,13 @@ void merge_mptable(vector mptable)
> > >                         free_mpe(mp1);
> > >                         continue;
> > >                 }
> > > +       }
> > > +       vector_sort(mptable, wwid_compar);
> > > +       vector_foreach_slot(mptable, mp1, i) {
> > >                 j = i + 1;
> > >                 vector_foreach_slot_after(mptable, mp2, j) {
> > > -                       if (!mp2->wwid || strcmp(mp1->wwid, mp2-
> > > >wwid))
> > > -                               continue;
> > > +                       if (strcmp(mp1->wwid, mp2->wwid))
> > > +                               break;
> > >                         condlog(1, "%s: duplicate multipath config
> > > section for %s",
> > >                                 __func__, mp1->wwid);
> > >                         merge_mpe(mp2, mp1);
> > 
> > The way merge_mpe() works, values are only copied from mp1's
> > variables
> > if the corresponding variable is unset in mp2. This requires the
> > original order of mp1 and mp2 to be unchanged by the sorting
> > algorithm,
> > but according to the qsort() man page "If two members compare as
> > equal,
> > their order in the sorted array is undefined."
> > 
> > One quick and dirty way we could fix this is to add a variable to
> > struct
> > mptable called config_idx, which is its index in the config file.  If
> > the wwids are equal, you compare that.
> 
> Thanks for pointing this out. I believe it's easier than that: as we're
> passed the offsets into the array (aka struct mpentry **), we can
> simply compare p1 and p2 if the strings are equal.
> 
> Agree?

I don't think so. Comparing the array offsets assumes that two mpentries
are still ordered correctly when they are compared against each other.
But the way quick sort works, array elements can get swapped around
multiple times, based on whether they are larger or smaller than some
pivot point. It's possible that the two mpentries already switched order
before they were compared.

Essentially, all comparing the offset does is force qsort to not switch
the place of two otherwise equal entries. But for speed reasons alone,
there is no reason why qsort would bother to swap the location of two
equal entries.  

Here's an example of what could go wrong:

Say you start with the array (I'm also tracking the mpentry's original
config index)

array offset:	0	1	2	3	4
config_idx:	0	1	2	3	4
wwid:		D	B	C	D	A	

Say qsort picks the element at array offset 2 (wwid "C") as the pivot.
Usually quicksort works by walking towards the center of the array
segment from both ends, swapping the positions of elements bigger than
the pivot with the positions of ones smaller than or equal to the pivot.
So after the first round you would swap the element at array offset 4
(wwid "A") with the element at array offset 0 (wwid "D"). This would
give you.

array offset:	0	1	2	3	4
config_idx	4	1	2	3	0
wwid:		A	B	C	D	D

After this no further swaps will happen using the original
wwid_compar(). Adding a comparison to the array offset won't change
anything. But the wwid "D" mpentry that was orinally earlier in the
config (config_idx = 0) is now after the wwid "D" mpentry that was
originally later (config_idx = 3).

Comparing the config_idx will cause the elements at array offset 3 and 4
to switch places, restoring their original ordering.

-Ben

> 
> Martin
--
dm-devel mailing list
dm-devel@redhat.com
https://listman.redhat.com/mailman/listinfo/dm-devel


^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [dm-devel] [PATCH 1/3] libmultipath: merge_mptable(): sort table by wwid
  2022-08-24 16:16       ` Benjamin Marzinski
@ 2022-08-24 17:02         ` Martin Wilck
  2022-08-25  7:44           ` Martin Wilck
  0 siblings, 1 reply; 9+ messages in thread
From: Martin Wilck @ 2022-08-24 17:02 UTC (permalink / raw)
  To: Benjamin Marzinski; +Cc: dm-devel

On Wed, 2022-08-24 at 11:16 -0500, Benjamin Marzinski wrote:
> On Wed, Aug 24, 2022 at 09:07:56AM +0200, Martin Wilck wrote:
> > On Tue, 2022-08-23 at 15:48 -0500, Benjamin Marzinski wrote:
> > > On Thu, Aug 18, 2022 at 11:06:28PM +0200, mwilck@suse.com wrote:
> > > > From: Martin Wilck <mwilck@suse.com>
> > > > 
> > > > If the mptable is very large (for example, in a configuration
> > > > with
> > > > lots of maps assigned individual aliases), merge_mptable may
> > > > get
> > > > very slow because it needs to make O(n^2) string comparisons
> > > > (with
> > > > n being the number of mptable entries). WWID strings often
> > > > differ
> > > > only in the last few bytes, causing a lot of CPU time spent in
> > > > strcmp().
> > > > 
> > > > merge_mptable is executed whenever multipath or multipathd is
> > > > started, that
> > > > is, also for "multipath -u" and "multipath -U" invocations from
> > > > udev rules.
> > > > Optimize it by sorting the mptable before merging. This way we
> > > > don't need
> > > > to iterate towards the end of the vector searching for
> > > > duplicates.
> > > > 
> > > > Signed-off-by: Martin Wilck <mwilck@suse.com>
> > > > ---
> > > >  libmultipath/config.c | 15 +++++++++++++--
> > > >  libmultipath/vector.c |  8 ++++++++
> > > >  libmultipath/vector.h |  1 +
> > > >  3 files changed, 22 insertions(+), 2 deletions(-)
> > > > 
> > > > diff --git a/libmultipath/config.c b/libmultipath/config.c
> > > > index ab8b26e..a2c79a4 100644
> > > > --- a/libmultipath/config.c
> > > > +++ b/libmultipath/config.c
> > > > @@ -520,6 +520,14 @@ merge_mpe(struct mpentry *dst, struct
> > > > mpentry
> > > > *src)
> > > >         merge_num(mode);
> > > >  }
> > > >  
> > > > +static int wwid_compar(const void *p1, const void *p2)
> > > > +{
> > > > +       const char *wwid1 = (*(struct mpentry * const *)p1)-
> > > > >wwid;
> > > > +       const char *wwid2 = (*(struct mpentry * const *)p2)-
> > > > >wwid;
> > > > +
> > > > +       return strcmp(wwid1, wwid2);
> > > > +}
> > > > +
> > > >  void merge_mptable(vector mptable)
> > > >  {
> > > >         struct mpentry *mp1, *mp2;
> > > > @@ -533,10 +541,13 @@ void merge_mptable(vector mptable)
> > > >                         free_mpe(mp1);
> > > >                         continue;
> > > >                 }
> > > > +       }
> > > > +       vector_sort(mptable, wwid_compar);
> > > > +       vector_foreach_slot(mptable, mp1, i) {
> > > >                 j = i + 1;
> > > >                 vector_foreach_slot_after(mptable, mp2, j) {
> > > > -                       if (!mp2->wwid || strcmp(mp1->wwid,
> > > > mp2-
> > > > > wwid))
> > > > -                               continue;
> > > > +                       if (strcmp(mp1->wwid, mp2->wwid))
> > > > +                               break;
> > > >                         condlog(1, "%s: duplicate multipath
> > > > config
> > > > section for %s",
> > > >                                 __func__, mp1->wwid);
> > > >                         merge_mpe(mp2, mp1);
> > > 
> > > The way merge_mpe() works, values are only copied from mp1's
> > > variables
> > > if the corresponding variable is unset in mp2. This requires the
> > > original order of mp1 and mp2 to be unchanged by the sorting
> > > algorithm,
> > > but according to the qsort() man page "If two members compare as
> > > equal,
> > > their order in the sorted array is undefined."
> > > 
> > > One quick and dirty way we could fix this is to add a variable to
> > > struct
> > > mptable called config_idx, which is its index in the config
> > > file.  If
> > > the wwids are equal, you compare that.
> > 
> > Thanks for pointing this out. I believe it's easier than that: as
> > we're
> > passed the offsets into the array (aka struct mpentry **), we can
> > simply compare p1 and p2 if the strings are equal.
> > 
> > Agree?
> 
> I don't think so. Comparing the array offsets assumes that two
> mpentries
> are still ordered correctly when they are compared against each
> other.
> But the way quick sort works, array elements can get swapped around
> multiple times, based on whether they are larger or smaller than some
> pivot point. It's possible that the two mpentries already switched
> order
> before they were compared.
> 
> Essentially, all comparing the offset does is force qsort to not
> switch
> the place of two otherwise equal entries. But for speed reasons
> alone,
> there is no reason why qsort would bother to swap the location of two
> equal entries.  
> 
> Here's an example of what could go wrong:
> 
> Say you start with the array (I'm also tracking the mpentry's
> original
> config index)
> 
> array offset:   0       1       2       3       4
> config_idx:     0       1       2       3       4
> wwid:           D       B       C       D       A       
> 
> Say qsort picks the element at array offset 2 (wwid "C") as the
> pivot.
> Usually quicksort works by walking towards the center of the array
> segment from both ends, swapping the positions of elements bigger
> than
> the pivot with the positions of ones smaller than or equal to the
> pivot.
> So after the first round you would swap the element at array offset 4
> (wwid "A") with the element at array offset 0 (wwid "D"). This would
> give you.
> 
> array offset:   0       1       2       3       4
> config_idx      4       1       2       3       0
> wwid:           A       B       C       D       D
> 
> After this no further swaps will happen using the original
> wwid_compar(). Adding a comparison to the array offset won't change
> anything. But the wwid "D" mpentry that was orinally earlier in the
> config (config_idx = 0) is now after the wwid "D" mpentry that was
> originally later (config_idx = 3).
> 
> Comparing the config_idx will cause the elements at array offset 3
> and 4
> to switch places, restoring their original ordering.

Hm, too bad. 

I don't like the idea of adding another field to the array just to
stabilize the sort. But a fast, stable sort algorithm in for strings in
C seems to be hard to find. So yes, let's add the sort index for now,
perhaps we'll find a more elegant solution later.


Martin

--
dm-devel mailing list
dm-devel@redhat.com
https://listman.redhat.com/mailman/listinfo/dm-devel


^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [dm-devel] [PATCH 1/3] libmultipath: merge_mptable(): sort table by wwid
  2022-08-24 17:02         ` Martin Wilck
@ 2022-08-25  7:44           ` Martin Wilck
  2022-08-25 22:55             ` Benjamin Marzinski
  0 siblings, 1 reply; 9+ messages in thread
From: Martin Wilck @ 2022-08-25  7:44 UTC (permalink / raw)
  To: Benjamin Marzinski; +Cc: dm-devel

On Wed, 2022-08-24 at 19:02 +0200, Martin Wilck wrote:
> On Wed, 2022-08-24 at 11:16 -0500, Benjamin Marzinski wrote:
> > On Wed, Aug 24, 2022 at 09:07:56AM +0200, Martin Wilck wrote:
> > > On Tue, 2022-08-23 at 15:48 -0500, Benjamin Marzinski wrote:
> > > > On Thu, Aug 18, 2022 at 11:06:28PM +0200,
> > > > mwilck@suse.com wrote:
> > > > > From: Martin Wilck <mwilck@suse.com>
> > > > > 
> > > > > If the mptable is very large (for example, in a configuration
> > > > > with
> > > > > lots of maps assigned individual aliases), merge_mptable may
> > > > > get
> > > > > very slow because it needs to make O(n^2) string comparisons
> > > > > (with
> > > > > n being the number of mptable entries). WWID strings often
> > > > > differ
> > > > > only in the last few bytes, causing a lot of CPU time spent
> > > > > in
> > > > > strcmp().
> > > > > 
> > > > > merge_mptable is executed whenever multipath or multipathd is
> > > > > started, that
> > > > > is, also for "multipath -u" and "multipath -U" invocations
> > > > > from
> > > > > udev rules.
> > > > > Optimize it by sorting the mptable before merging. This way
> > > > > we
> > > > > don't need
> > > > > to iterate towards the end of the vector searching for
> > > > > duplicates.
> > > > > 
> > > > > Signed-off-by: Martin Wilck <mwilck@suse.com>
> > > > > ---
> > > > >  libmultipath/config.c | 15 +++++++++++++--
> > > > >  libmultipath/vector.c |  8 ++++++++
> > > > >  libmultipath/vector.h |  1 +
> > > > >  3 files changed, 22 insertions(+), 2 deletions(-)
> > > > > 
> > > > > diff --git a/libmultipath/config.c b/libmultipath/config.c
> > > > > index ab8b26e..a2c79a4 100644
> > > > > --- a/libmultipath/config.c
> > > > > +++ b/libmultipath/config.c
> > > > > @@ -520,6 +520,14 @@ merge_mpe(struct mpentry *dst, struct
> > > > > mpentry
> > > > > *src)
> > > > >         merge_num(mode);
> > > > >  }
> > > > >  
> > > > > +static int wwid_compar(const void *p1, const void *p2)
> > > > > +{
> > > > > +       const char *wwid1 = (*(struct mpentry * const *)p1)-
> > > > > > wwid;
> > > > > +       const char *wwid2 = (*(struct mpentry * const *)p2)-
> > > > > > wwid;
> > > > > +
> > > > > +       return strcmp(wwid1, wwid2);
> > > > > +}
> > > > > +
> > > > >  void merge_mptable(vector mptable)
> > > > >  {
> > > > >         struct mpentry *mp1, *mp2;
> > > > > @@ -533,10 +541,13 @@ void merge_mptable(vector mptable)
> > > > >                         free_mpe(mp1);
> > > > >                         continue;
> > > > >                 }
> > > > > +       }
> > > > > +       vector_sort(mptable, wwid_compar);
> > > > > +       vector_foreach_slot(mptable, mp1, i) {
> > > > >                 j = i + 1;
> > > > >                 vector_foreach_slot_after(mptable, mp2, j) {
> > > > > -                       if (!mp2->wwid || strcmp(mp1->wwid,
> > > > > mp2-
> > > > > > wwid))
> > > > > -                               continue;
> > > > > +                       if (strcmp(mp1->wwid, mp2->wwid))
> > > > > +                               break;
> > > > >                         condlog(1, "%s: duplicate multipath
> > > > > config
> > > > > section for %s",
> > > > >                                 __func__, mp1->wwid);
> > > > >                         merge_mpe(mp2, mp1);
> > > > 
> > > > The way merge_mpe() works, values are only copied from mp1's
> > > > variables
> > > > if the corresponding variable is unset in mp2. This requires
> > > > the
> > > > original order of mp1 and mp2 to be unchanged by the sorting
> > > > algorithm,
> > > > but according to the qsort() man page "If two members compare
> > > > as
> > > > equal,
> > > > their order in the sorted array is undefined."
> > > > 
> > > > One quick and dirty way we could fix this is to add a variable
> > > > to
> > > > struct
> > > > mptable called config_idx, which is its index in the config
> > > > file.  If
> > > > the wwids are equal, you compare that.
> > > 
> > > Thanks for pointing this out. I believe it's easier than that: as
> > > we're
> > > passed the offsets into the array (aka struct mpentry **), we can
> > > simply compare p1 and p2 if the strings are equal.
> > > 
> > > Agree?
> > 
> > I don't think so. Comparing the array offsets assumes that two
> > mpentries
> > are still ordered correctly when they are compared against each
> > other.
> > But the way quick sort works, array elements can get swapped around
> > multiple times, based on whether they are larger or smaller than
> > some
> > pivot point. It's possible that the two mpentries already switched
> > order
> > before they were compared.
> > 
> > Essentially, all comparing the offset does is force qsort to not
> > switch
> > the place of two otherwise equal entries. But for speed reasons
> > alone,
> > there is no reason why qsort would bother to swap the location of
> > two
> > equal entries.  
> > 
> > Here's an example of what could go wrong:
> > 
> > Say you start with the array (I'm also tracking the mpentry's
> > original
> > config index)
> > 
> > array offset:   0       1       2       3       4
> > config_idx:     0       1       2       3       4
> > wwid:           D       B       C       D       A       
> > 
> > Say qsort picks the element at array offset 2 (wwid "C") as the
> > pivot.
> > Usually quicksort works by walking towards the center of the array
> > segment from both ends, swapping the positions of elements bigger
> > than
> > the pivot with the positions of ones smaller than or equal to the
> > pivot.
> > So after the first round you would swap the element at array offset
> > 4
> > (wwid "A") with the element at array offset 0 (wwid "D"). This
> > would
> > give you.
> > 
> > array offset:   0       1       2       3       4
> > config_idx      4       1       2       3       0
> > wwid:           A       B       C       D       D
> > 
> > After this no further swaps will happen using the original
> > wwid_compar(). Adding a comparison to the array offset won't change
> > anything. But the wwid "D" mpentry that was orinally earlier in the
> > config (config_idx = 0) is now after the wwid "D" mpentry that was
> > originally later (config_idx = 3).
> > 
> > Comparing the config_idx will cause the elements at array offset 3
> > and 4
> > to switch places, restoring their original ordering.
> 
> Hm, too bad. 
> 
> I don't like the idea of adding another field to the array just to
> stabilize the sort. But a fast, stable sort algorithm in for strings
> in
> C seems to be hard to find. So yes, let's add the sort index for now,
> perhaps we'll find a more elegant solution later.

Digging some more, I found that glibc's qsort() is actually merge sort,
which is a stable sort algorithm and doesn't suffer from this issue.
The glibc documentation is misleading in this respect. Unfortunately
we'can't just support glibc. But we could simply copy in glibc's
msort.c code for other libraries.

Martin

--
dm-devel mailing list
dm-devel@redhat.com
https://listman.redhat.com/mailman/listinfo/dm-devel


^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [dm-devel] [PATCH 1/3] libmultipath: merge_mptable(): sort table by wwid
  2022-08-25  7:44           ` Martin Wilck
@ 2022-08-25 22:55             ` Benjamin Marzinski
  0 siblings, 0 replies; 9+ messages in thread
From: Benjamin Marzinski @ 2022-08-25 22:55 UTC (permalink / raw)
  To: Martin Wilck; +Cc: dm-devel

On Thu, Aug 25, 2022 at 09:44:22AM +0200, Martin Wilck wrote:
> On Wed, 2022-08-24 at 19:02 +0200, Martin Wilck wrote:
> > On Wed, 2022-08-24 at 11:16 -0500, Benjamin Marzinski wrote:
> > > On Wed, Aug 24, 2022 at 09:07:56AM +0200, Martin Wilck wrote:
> > > > On Tue, 2022-08-23 at 15:48 -0500, Benjamin Marzinski wrote:
> > > > > On Thu, Aug 18, 2022 at 11:06:28PM +0200,
> > > > > mwilck@suse.com wrote:
> > > > > > From: Martin Wilck <mwilck@suse.com>
> > > > > > 
> > > > > > If the mptable is very large (for example, in a configuration
> > > > > > with
> > > > > > lots of maps assigned individual aliases), merge_mptable may
> > > > > > get
> > > > > > very slow because it needs to make O(n^2) string comparisons
> > > > > > (with
> > > > > > n being the number of mptable entries). WWID strings often
> > > > > > differ
> > > > > > only in the last few bytes, causing a lot of CPU time spent
> > > > > > in
> > > > > > strcmp().
> > > > > > 
> > > > > > merge_mptable is executed whenever multipath or multipathd is
> > > > > > started, that
> > > > > > is, also for "multipath -u" and "multipath -U" invocations
> > > > > > from
> > > > > > udev rules.
> > > > > > Optimize it by sorting the mptable before merging. This way
> > > > > > we
> > > > > > don't need
> > > > > > to iterate towards the end of the vector searching for
> > > > > > duplicates.
> > > > > > 
> > > > > > Signed-off-by: Martin Wilck <mwilck@suse.com>
> > > > > > ---
> > > > > >  libmultipath/config.c | 15 +++++++++++++--
> > > > > >  libmultipath/vector.c |  8 ++++++++
> > > > > >  libmultipath/vector.h |  1 +
> > > > > >  3 files changed, 22 insertions(+), 2 deletions(-)
> > > > > > 
> > > > > > diff --git a/libmultipath/config.c b/libmultipath/config.c
> > > > > > index ab8b26e..a2c79a4 100644
> > > > > > --- a/libmultipath/config.c
> > > > > > +++ b/libmultipath/config.c
> > > > > > @@ -520,6 +520,14 @@ merge_mpe(struct mpentry *dst, struct
> > > > > > mpentry
> > > > > > *src)
> > > > > >         merge_num(mode);
> > > > > >  }
> > > > > >  
> > > > > > +static int wwid_compar(const void *p1, const void *p2)
> > > > > > +{
> > > > > > +       const char *wwid1 = (*(struct mpentry * const *)p1)-
> > > > > > > wwid;
> > > > > > +       const char *wwid2 = (*(struct mpentry * const *)p2)-
> > > > > > > wwid;
> > > > > > +
> > > > > > +       return strcmp(wwid1, wwid2);
> > > > > > +}
> > > > > > +
> > > > > >  void merge_mptable(vector mptable)
> > > > > >  {
> > > > > >         struct mpentry *mp1, *mp2;
> > > > > > @@ -533,10 +541,13 @@ void merge_mptable(vector mptable)
> > > > > >                         free_mpe(mp1);
> > > > > >                         continue;
> > > > > >                 }
> > > > > > +       }
> > > > > > +       vector_sort(mptable, wwid_compar);
> > > > > > +       vector_foreach_slot(mptable, mp1, i) {
> > > > > >                 j = i + 1;
> > > > > >                 vector_foreach_slot_after(mptable, mp2, j) {
> > > > > > -                       if (!mp2->wwid || strcmp(mp1->wwid,
> > > > > > mp2-
> > > > > > > wwid))
> > > > > > -                               continue;
> > > > > > +                       if (strcmp(mp1->wwid, mp2->wwid))
> > > > > > +                               break;
> > > > > >                         condlog(1, "%s: duplicate multipath
> > > > > > config
> > > > > > section for %s",
> > > > > >                                 __func__, mp1->wwid);
> > > > > >                         merge_mpe(mp2, mp1);
> > > > > 
> > > > > The way merge_mpe() works, values are only copied from mp1's
> > > > > variables
> > > > > if the corresponding variable is unset in mp2. This requires
> > > > > the
> > > > > original order of mp1 and mp2 to be unchanged by the sorting
> > > > > algorithm,
> > > > > but according to the qsort() man page "If two members compare
> > > > > as
> > > > > equal,
> > > > > their order in the sorted array is undefined."
> > > > > 
> > > > > One quick and dirty way we could fix this is to add a variable
> > > > > to
> > > > > struct
> > > > > mptable called config_idx, which is its index in the config
> > > > > file.  If
> > > > > the wwids are equal, you compare that.
> > > > 
> > > > Thanks for pointing this out. I believe it's easier than that: as
> > > > we're
> > > > passed the offsets into the array (aka struct mpentry **), we can
> > > > simply compare p1 and p2 if the strings are equal.
> > > > 
> > > > Agree?
> > > 
> > > I don't think so. Comparing the array offsets assumes that two
> > > mpentries
> > > are still ordered correctly when they are compared against each
> > > other.
> > > But the way quick sort works, array elements can get swapped around
> > > multiple times, based on whether they are larger or smaller than
> > > some
> > > pivot point. It's possible that the two mpentries already switched
> > > order
> > > before they were compared.
> > > 
> > > Essentially, all comparing the offset does is force qsort to not
> > > switch
> > > the place of two otherwise equal entries. But for speed reasons
> > > alone,
> > > there is no reason why qsort would bother to swap the location of
> > > two
> > > equal entries.  
> > > 
> > > Here's an example of what could go wrong:
> > > 
> > > Say you start with the array (I'm also tracking the mpentry's
> > > original
> > > config index)
> > > 
> > > array offset:   0       1       2       3       4
> > > config_idx:     0       1       2       3       4
> > > wwid:           D       B       C       D       A       
> > > 
> > > Say qsort picks the element at array offset 2 (wwid "C") as the
> > > pivot.
> > > Usually quicksort works by walking towards the center of the array
> > > segment from both ends, swapping the positions of elements bigger
> > > than
> > > the pivot with the positions of ones smaller than or equal to the
> > > pivot.
> > > So after the first round you would swap the element at array offset
> > > 4
> > > (wwid "A") with the element at array offset 0 (wwid "D"). This
> > > would
> > > give you.
> > > 
> > > array offset:   0       1       2       3       4
> > > config_idx      4       1       2       3       0
> > > wwid:           A       B       C       D       D
> > > 
> > > After this no further swaps will happen using the original
> > > wwid_compar(). Adding a comparison to the array offset won't change
> > > anything. But the wwid "D" mpentry that was orinally earlier in the
> > > config (config_idx = 0) is now after the wwid "D" mpentry that was
> > > originally later (config_idx = 3).
> > > 
> > > Comparing the config_idx will cause the elements at array offset 3
> > > and 4
> > > to switch places, restoring their original ordering.
> > 
> > Hm, too bad. 
> > 
> > I don't like the idea of adding another field to the array just to
> > stabilize the sort. But a fast, stable sort algorithm in for strings
> > in
> > C seems to be hard to find. So yes, let's add the sort index for now,
> > perhaps we'll find a more elegant solution later.
> 
> Digging some more, I found that glibc's qsort() is actually merge sort,
> which is a stable sort algorithm and doesn't suffer from this issue.

Huh. Good to know.

> The glibc documentation is misleading in this respect. Unfortunately
> we'can't just support glibc. But we could simply copy in glibc's
> msort.c code for other libraries.

Sounds reasonable.

-Ben

> Martin
--
dm-devel mailing list
dm-devel@redhat.com
https://listman.redhat.com/mailman/listinfo/dm-devel


^ permalink raw reply	[flat|nested] 9+ messages in thread

end of thread, other threads:[~2022-08-25 22:55 UTC | newest]

Thread overview: 9+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
     [not found] <20220818210630.19395-1-mwilck@suse.com>
     [not found] ` <20220818210630.19395-2-mwilck@suse.com>
2022-08-23 20:48   ` [dm-devel] [PATCH 1/3] libmultipath: merge_mptable(): sort table by wwid Benjamin Marzinski
2022-08-24  7:07     ` Martin Wilck
2022-08-24 16:16       ` Benjamin Marzinski
2022-08-24 17:02         ` Martin Wilck
2022-08-25  7:44           ` Martin Wilck
2022-08-25 22:55             ` Benjamin Marzinski
     [not found] ` <20220818210630.19395-3-mwilck@suse.com>
2022-08-23 21:27   ` [dm-devel] [PATCH 2/3] libmultipath: check_alias_settings(): pre-sort mptable by alias Benjamin Marzinski
2022-08-24  7:32     ` Martin Wilck
     [not found] ` <20220818210630.19395-4-mwilck@suse.com>
2022-08-23 21:53   ` [dm-devel] [PATCH 3/3] multipath: optimize program startup for frequent invocations Benjamin Marzinski

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.