All of lore.kernel.org
 help / color / mirror / Atom feed
* [dm-devel] [PATCH 0/5] Memory issues found by coverity
@ 2021-05-11 23:21 Benjamin Marzinski
  2021-05-11 23:22 ` [dm-devel] [PATCH 1/5] multipathd: don't fail to remove path once the map is removed Benjamin Marzinski
                   ` (4 more replies)
  0 siblings, 5 replies; 17+ messages in thread
From: Benjamin Marzinski @ 2021-05-11 23:21 UTC (permalink / raw)
  To: Christophe Varoqui; +Cc: device-mapper development, Martin Wilck

This is collection of issues found by coverity. The first three patches
deal with ev_remove_path() removing the path, but returning failure,
causing a use-after-free error. The last two patches fix memory leaks.

Benjamin Marzinski (5):
  multipathd: don't fail to remove path once the map is removed
  multipathd: remove duplicate orphan_paths in flush_map
  multipathd: make ev_remove_path return success on path removal
  multipath: free vectors in configure
  kpartx: Don't leak memory when getblock returns NULL

 kpartx/kpartx.c            |  2 ++
 libmultipath/structs_vec.c |  3 +--
 multipath/main.c           |  7 ++++++-
 multipathd/main.c          | 20 ++++++++++++--------
 4 files changed, 21 insertions(+), 11 deletions(-)

-- 
2.17.2

--
dm-devel mailing list
dm-devel@redhat.com
https://listman.redhat.com/mailman/listinfo/dm-devel


^ permalink raw reply	[flat|nested] 17+ messages in thread

* [dm-devel] [PATCH 1/5] multipathd: don't fail to remove path once the map is removed
  2021-05-11 23:21 [dm-devel] [PATCH 0/5] Memory issues found by coverity Benjamin Marzinski
@ 2021-05-11 23:22 ` Benjamin Marzinski
  2021-05-12  9:11   ` Martin Wilck
  2021-05-11 23:22 ` [dm-devel] [PATCH 2/5] multipathd: remove duplicate orphan_paths in flush_map Benjamin Marzinski
                   ` (3 subsequent siblings)
  4 siblings, 1 reply; 17+ messages in thread
From: Benjamin Marzinski @ 2021-05-11 23:22 UTC (permalink / raw)
  To: Christophe Varoqui; +Cc: device-mapper development, Martin Wilck

In ev_remove_path(), if update_mpp_paths() fails, we delete the entire
map. However, since update_mpp_paths() happens before we call
set_path_removed(), pp->initialized isn't set to INIT_REMOVED, so
remove_map_and_stop_waiter() doesn't remove the path when in removes the
map.  But with the map removed, there's nothing to keep us from removing
the path.

Call set_path_removed() before update_mpp_paths() to avoid the odd case
of ev_remove_path() removing the map but not the path.

Signed-off-by: Benjamin Marzinski <bmarzins@redhat.com>
---
 libmultipath/structs_vec.c |  3 +--
 multipathd/main.c          | 13 ++++++++-----
 2 files changed, 9 insertions(+), 7 deletions(-)

diff --git a/libmultipath/structs_vec.c b/libmultipath/structs_vec.c
index d242c06b..432c0c63 100644
--- a/libmultipath/structs_vec.c
+++ b/libmultipath/structs_vec.c
@@ -45,8 +45,7 @@ int update_mpp_paths(struct multipath *mpp, vector pathvec)
 
 				/*
 				 * Avoid adding removed paths to the map again
-				 * when we reload it. Such paths may exist if
-				 * domap fails in ev_remove_path().
+				 * when we reload it.
 				 */
 				pp1 = find_path_by_devt(pathvec, pp->dev_t);
 				if (pp1 && pp->initialized != INIT_REMOVED &&
diff --git a/multipathd/main.c b/multipathd/main.c
index 102946bf..449ce384 100644
--- a/multipathd/main.c
+++ b/multipathd/main.c
@@ -1199,6 +1199,13 @@ ev_remove_path (struct path *pp, struct vectors * vecs, int need_do_map)
 	 * avoid referring to the map of an orphaned path
 	 */
 	if ((mpp = pp->mpp)) {
+		/*
+		 * Mark the path as removed. In case of success, we
+		 * will delete it for good. Otherwise, it will be deleted
+		 * later, unless all attempts to reload this map fail.
+		 */
+		set_path_removed(pp);
+
 		/*
 		 * transform the mp->pg vector of vectors of paths
 		 * into a mp->params string to feed the device-mapper
@@ -1210,13 +1217,9 @@ ev_remove_path (struct path *pp, struct vectors * vecs, int need_do_map)
 		}
 
 		/*
-		 * Mark the path as removed. In case of success, we
-		 * will delete it for good. Otherwise, it will be deleted
-		 * later, unless all attempts to reload this map fail.
-		 * Note: we have to explicitly remove pp from mpp->paths,
+		 * we have to explicitly remove pp from mpp->paths,
 		 * update_mpp_paths() doesn't do that.
 		 */
-		set_path_removed(pp);
 		i = find_slot(mpp->paths, pp);
 		if (i != -1)
 			vector_del_slot(mpp->paths, i);
-- 
2.17.2

--
dm-devel mailing list
dm-devel@redhat.com
https://listman.redhat.com/mailman/listinfo/dm-devel


^ permalink raw reply related	[flat|nested] 17+ messages in thread

* [dm-devel] [PATCH 2/5] multipathd: remove duplicate orphan_paths in flush_map
  2021-05-11 23:21 [dm-devel] [PATCH 0/5] Memory issues found by coverity Benjamin Marzinski
  2021-05-11 23:22 ` [dm-devel] [PATCH 1/5] multipathd: don't fail to remove path once the map is removed Benjamin Marzinski
@ 2021-05-11 23:22 ` Benjamin Marzinski
  2021-05-12  9:16   ` Martin Wilck
  2021-05-11 23:22 ` [dm-devel] [PATCH 3/5] multipathd: make ev_remove_path return success on path removal Benjamin Marzinski
                   ` (2 subsequent siblings)
  4 siblings, 1 reply; 17+ messages in thread
From: Benjamin Marzinski @ 2021-05-11 23:22 UTC (permalink / raw)
  To: Christophe Varoqui; +Cc: device-mapper development, Martin Wilck

remove_map_and_stop_waiter() already calls orphan_paths() so flush_map()
doesn't need to call orphan_paths() before calling
remove_map_and_stop_waiter().

Signed-off-by: Benjamin Marzinski <bmarzins@redhat.com>
---
 multipathd/main.c | 1 -
 1 file changed, 1 deletion(-)

diff --git a/multipathd/main.c b/multipathd/main.c
index 449ce384..6090434c 100644
--- a/multipathd/main.c
+++ b/multipathd/main.c
@@ -660,7 +660,6 @@ flush_map(struct multipath * mpp, struct vectors * vecs, int nopaths)
 	else
 		condlog(2, "%s: map flushed", mpp->alias);
 
-	orphan_paths(vecs->pathvec, mpp, "map flushed");
 	remove_map_and_stop_waiter(mpp, vecs);
 
 	return 0;
-- 
2.17.2

--
dm-devel mailing list
dm-devel@redhat.com
https://listman.redhat.com/mailman/listinfo/dm-devel


^ permalink raw reply related	[flat|nested] 17+ messages in thread

* [dm-devel] [PATCH 3/5] multipathd: make ev_remove_path return success on path removal
  2021-05-11 23:21 [dm-devel] [PATCH 0/5] Memory issues found by coverity Benjamin Marzinski
  2021-05-11 23:22 ` [dm-devel] [PATCH 1/5] multipathd: don't fail to remove path once the map is removed Benjamin Marzinski
  2021-05-11 23:22 ` [dm-devel] [PATCH 2/5] multipathd: remove duplicate orphan_paths in flush_map Benjamin Marzinski
@ 2021-05-11 23:22 ` Benjamin Marzinski
  2021-05-12 11:38   ` Martin Wilck
  2021-05-11 23:22 ` [dm-devel] [PATCH 4/5] multipath: free vectors in configure Benjamin Marzinski
  2021-05-11 23:22 ` [dm-devel] [PATCH 5/5] kpartx: Don't leak memory when getblock returns NULL Benjamin Marzinski
  4 siblings, 1 reply; 17+ messages in thread
From: Benjamin Marzinski @ 2021-05-11 23:22 UTC (permalink / raw)
  To: Christophe Varoqui; +Cc: device-mapper development, Martin Wilck

When ev_remove_path() returns success, callers assume that the path (and
possibly the map) has been removed.  When ev_remove_path() returns
failure, callers assume that the path has not been removed. However, the
path could be removed on both success or failure. This could cause
callers to dereference the path after it was removed. Change
ev_remove_path() to return success whenever the path is removed, even if
the map was removed due to a failure when trying to reload it. Found by
coverity.

Signed-off-by: Benjamin Marzinski <bmarzins@redhat.com>
---
 multipathd/main.c | 6 ++++--
 1 file changed, 4 insertions(+), 2 deletions(-)

diff --git a/multipathd/main.c b/multipathd/main.c
index 6090434c..4bdf14bd 100644
--- a/multipathd/main.c
+++ b/multipathd/main.c
@@ -1284,7 +1284,7 @@ ev_remove_path (struct path *pp, struct vectors * vecs, int need_do_map)
 
 			strlcpy(devt, pp->dev_t, sizeof(devt));
 			if (setup_multipath(vecs, mpp))
-				return 1;
+				return 0;
 			/*
 			 * Successful map reload without this path:
 			 * sync_map_state() will free it.
@@ -1304,8 +1304,10 @@ out:
 	return retval;
 
 fail:
+	condlog(0, "%s: error removing path. removing map %s", pp->dev,
+		mpp->alias);
 	remove_map_and_stop_waiter(mpp, vecs);
-	return 1;
+	return 0;
 }
 
 static int
-- 
2.17.2

--
dm-devel mailing list
dm-devel@redhat.com
https://listman.redhat.com/mailman/listinfo/dm-devel


^ permalink raw reply related	[flat|nested] 17+ messages in thread

* [dm-devel] [PATCH 4/5] multipath: free vectors in configure
  2021-05-11 23:21 [dm-devel] [PATCH 0/5] Memory issues found by coverity Benjamin Marzinski
                   ` (2 preceding siblings ...)
  2021-05-11 23:22 ` [dm-devel] [PATCH 3/5] multipathd: make ev_remove_path return success on path removal Benjamin Marzinski
@ 2021-05-11 23:22 ` Benjamin Marzinski
  2021-05-12 12:36   ` Martin Wilck
  2021-05-11 23:22 ` [dm-devel] [PATCH 5/5] kpartx: Don't leak memory when getblock returns NULL Benjamin Marzinski
  4 siblings, 1 reply; 17+ messages in thread
From: Benjamin Marzinski @ 2021-05-11 23:22 UTC (permalink / raw)
  To: Christophe Varoqui; +Cc: device-mapper development, Martin Wilck

configure() can retry multiple times, each time reallocing a maps and
paths vector, and leaking the previous ones. Fix this by always freeing
the vectors before configure() exits. Found by coverity.

Signed-off-by: Benjamin Marzinski <bmarzins@redhat.com>
---
 multipath/main.c | 7 ++++++-
 1 file changed, 6 insertions(+), 1 deletion(-)

diff --git a/multipath/main.c b/multipath/main.c
index ef89c7cf..25c5dbfd 100644
--- a/multipath/main.c
+++ b/multipath/main.c
@@ -466,7 +466,6 @@ configure (struct config *conf, enum mpath_cmds cmd,
 	 */
 	curmp = vector_alloc();
 	pathvec = vector_alloc();
-	atexit(cleanup_vecs);
 
 	if (!curmp || !pathvec) {
 		condlog(0, "can not allocate memory");
@@ -578,6 +577,11 @@ out:
 	if (refwwid)
 		FREE(refwwid);
 
+	free_multipathvec(curmp, KEEP_PATHS);
+	vecs.mpvec = NULL;
+	free_pathvec(pathvec, FREE_PATHS);
+	vecs.pathvec = NULL;
+
 	return r;
 }
 
@@ -1053,6 +1057,7 @@ main (int argc, char *argv[])
 		r = dm_flush_maps(1, retries) ? RTVL_FAIL : RTVL_OK;
 		goto out;
 	}
+	atexit(cleanup_vecs);
 	while ((r = configure(conf, cmd, dev_type, dev)) == RTVL_RETRY)
 		condlog(3, "restart multipath configuration process");
 
-- 
2.17.2

--
dm-devel mailing list
dm-devel@redhat.com
https://listman.redhat.com/mailman/listinfo/dm-devel


^ permalink raw reply related	[flat|nested] 17+ messages in thread

* [dm-devel] [PATCH 5/5] kpartx: Don't leak memory when getblock returns NULL
  2021-05-11 23:21 [dm-devel] [PATCH 0/5] Memory issues found by coverity Benjamin Marzinski
                   ` (3 preceding siblings ...)
  2021-05-11 23:22 ` [dm-devel] [PATCH 4/5] multipath: free vectors in configure Benjamin Marzinski
@ 2021-05-11 23:22 ` Benjamin Marzinski
  2021-05-12 12:39   ` Martin Wilck
  4 siblings, 1 reply; 17+ messages in thread
From: Benjamin Marzinski @ 2021-05-11 23:22 UTC (permalink / raw)
  To: Christophe Varoqui; +Cc: device-mapper development, Martin Wilck

If a new block was allocated, but couldn't be filled, getblock will
discard it. When it does so, it needs to free the block to avoid leaking
memory. Found by coverity.

Signed-off-by: Benjamin Marzinski <bmarzins@redhat.com>
---
 kpartx/kpartx.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/kpartx/kpartx.c b/kpartx/kpartx.c
index 8ff116b8..7bc64543 100644
--- a/kpartx/kpartx.c
+++ b/kpartx/kpartx.c
@@ -766,6 +766,8 @@ getblock (int fd, unsigned int blknr) {
 	if (read(fd, bp->block, secsz) != secsz) {
 		fprintf(stderr, "read error, sector %d\n", secnr);
 		blockhead = bp->next;
+		free(bp->block);
+		free(bp);
 		return NULL;
 	}
 
-- 
2.17.2

--
dm-devel mailing list
dm-devel@redhat.com
https://listman.redhat.com/mailman/listinfo/dm-devel


^ permalink raw reply related	[flat|nested] 17+ messages in thread

* Re: [dm-devel] [PATCH 1/5] multipathd: don't fail to remove path once the map is removed
  2021-05-11 23:22 ` [dm-devel] [PATCH 1/5] multipathd: don't fail to remove path once the map is removed Benjamin Marzinski
@ 2021-05-12  9:11   ` Martin Wilck
  2021-05-12 14:17     ` Benjamin Marzinski
  0 siblings, 1 reply; 17+ messages in thread
From: Martin Wilck @ 2021-05-12  9:11 UTC (permalink / raw)
  To: bmarzins, christophe.varoqui; +Cc: dm-devel

On Tue, 2021-05-11 at 18:22 -0500, Benjamin Marzinski wrote:
> In ev_remove_path(), if update_mpp_paths() fails, we delete the
> entire
> map. However, since update_mpp_paths() happens before we call
> set_path_removed(), pp->initialized isn't set to INIT_REMOVED, so
> remove_map_and_stop_waiter() doesn't remove the path when in removes
> the
> map.  But with the map removed, there's nothing to keep us from
> removing
> the path.
> 
> Call set_path_removed() before update_mpp_paths() to avoid the odd
> case
> of ev_remove_path() removing the map but not the path.
> 
> Signed-off-by: Benjamin Marzinski <bmarzins@redhat.com>
> ---
>  libmultipath/structs_vec.c |  3 +--
>  multipathd/main.c          | 13 ++++++++-----
>  2 files changed, 9 insertions(+), 7 deletions(-)
> 
> diff --git a/libmultipath/structs_vec.c b/libmultipath/structs_vec.c
> index d242c06b..432c0c63 100644
> --- a/libmultipath/structs_vec.c
> +++ b/libmultipath/structs_vec.c
> @@ -45,8 +45,7 @@ int update_mpp_paths(struct multipath *mpp, vector
> pathvec)
>  
>                                 /*
>                                  * Avoid adding removed paths to the
> map again
> -                                * when we reload it. Such paths may
> exist if
> -                                * domap fails in ev_remove_path().
> +                                * when we reload it.

I'd like to keep the remark about domap(). It's meant as a reminder for
us and future developers how this situation is most likely to come to
pass.

Other than that, ACK.

Regards,
Martin


--
dm-devel mailing list
dm-devel@redhat.com
https://listman.redhat.com/mailman/listinfo/dm-devel


^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [dm-devel] [PATCH 2/5] multipathd: remove duplicate orphan_paths in flush_map
  2021-05-11 23:22 ` [dm-devel] [PATCH 2/5] multipathd: remove duplicate orphan_paths in flush_map Benjamin Marzinski
@ 2021-05-12  9:16   ` Martin Wilck
  0 siblings, 0 replies; 17+ messages in thread
From: Martin Wilck @ 2021-05-12  9:16 UTC (permalink / raw)
  To: bmarzins, christophe.varoqui; +Cc: dm-devel

On Tue, 2021-05-11 at 18:22 -0500, Benjamin Marzinski wrote:
> remove_map_and_stop_waiter() already calls orphan_paths() so
> flush_map()
> doesn't need to call orphan_paths() before calling
> remove_map_and_stop_waiter().
> 
> Signed-off-by: Benjamin Marzinski <bmarzins@redhat.com>
Reviewed-by: Martin Wilck <mwilck@suse.com>

> ---
>  multipathd/main.c | 1 -
>  1 file changed, 1 deletion(-)
> 
> diff --git a/multipathd/main.c b/multipathd/main.c
> index 449ce384..6090434c 100644
> --- a/multipathd/main.c
> +++ b/multipathd/main.c
> @@ -660,7 +660,6 @@ flush_map(struct multipath * mpp, struct vectors *
> vecs, int nopaths)
>         else
>                 condlog(2, "%s: map flushed", mpp->alias);
>  
> -       orphan_paths(vecs->pathvec, mpp, "map flushed");
>         remove_map_and_stop_waiter(mpp, vecs);
>  
>         return 0;



--
dm-devel mailing list
dm-devel@redhat.com
https://listman.redhat.com/mailman/listinfo/dm-devel


^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [dm-devel] [PATCH 3/5] multipathd: make ev_remove_path return success on path removal
  2021-05-11 23:22 ` [dm-devel] [PATCH 3/5] multipathd: make ev_remove_path return success on path removal Benjamin Marzinski
@ 2021-05-12 11:38   ` Martin Wilck
  2021-05-12 19:53     ` Benjamin Marzinski
  0 siblings, 1 reply; 17+ messages in thread
From: Martin Wilck @ 2021-05-12 11:38 UTC (permalink / raw)
  To: bmarzins, christophe.varoqui; +Cc: dm-devel

On Tue, 2021-05-11 at 18:22 -0500, Benjamin Marzinski wrote:
> When ev_remove_path() returns success, callers assume that the path
> (and
> possibly the map) has been removed.  When ev_remove_path() returns
> failure, callers assume that the path has not been removed. However,
> the
> path could be removed on both success or failure. This could cause
> callers to dereference the path after it was removed. Change
> ev_remove_path() to return success whenever the path is removed, even
> if
> the map was removed due to a failure when trying to reload it. Found by
> coverity.
> 
> Signed-off-by: Benjamin Marzinski <bmarzins@redhat.com>

This looks ok, but I'd like to discuss it in some more depth.

We need to clarify the meaning of the return code of ev_remove_path().
We guarantee that, when ev_remove_path() returns, either the path is
removed from internal data structures and kernel maps, or INIT_REMOVED
is set. We can't guarantee whether the path

 a) is not referenced any more by any kernel map,
 b) was actually removed from pathvec and other data structures in
multipathd.

We have to agree on whether it means a) or b) if we can't make these
two cases equivalent. Assuming multipathd has correct information about
the kernel mappings, the only difference between a) and b) is the
unlikely failure in setup_multipath(), where a) is true and b) is not
(because sync_map_state() won't be called). Your patch assumes the
semantics of a), which is correct AFAICS, but should be more clearly
documented.

Actually, I think we can fix the discrepancy between a) and b) - if
domap() was successful, we should be able to orphan the path, even if
update_multipath_strings() failed for some reason.
 
IMO we should consequently change the retval for the cases where
ev_remove_path() returns without deleting the path for a non-"failure"
case (wait_for_udev and !need_do_map).

Thoughts? Whatever we decide wrt the semantics of the return code, we
should document it clearly in comments at the function header.

Here's a quick review of callers:

 - cli_add_path(): AFAICS this needs b) semantics. We shouldn't set up
a new path unless it had been successfully removed internally.
 - cli_del_path(): needs a) semantics.
 - handle_path_wwid_change(): needs a).
 - uev_add_path(): needs a).
 - uev_remove_path(): the return code of ev_remove_path doesn't matter
much here. This is the only caller that sets need_do_map = false.
 - uev_update_path(): we currently don't look at the return code.
uev_add_path() will make another attempt at removing the path if
necessary.

Regards
Martin

P.S.: The remaining failure cases in ev_remove_path() are the failures
in update_mpp_paths() and setup_map(). The former can only fail with
ENOMEM, afaics. The latter, likewise, or if the map is fundamentally
misconfigured (which to me means that a previous call to setup_map()
would have failed as well). I'm wondering why we remove the entire map
in these failure cases. This goes back all the way to 45eb316 
("[multipathd] DM configuration final cut") from 2005. It's true that
both failures are pretty much fatal, but why is removing the map the
answer here?

However, this goes beyond the purpose of your patch. *If* we remove the
map, returning 0 is correct for either a) or b).

P.S. 2: I wonder if the logic in uev_update_path() is correct. Rather
than calling uev_add_path() after rescan_path() directly, I think we
should rather wait for another uevent (and possibly trigger another
"add" event, I don't think "rescan" automatically generates one).



> ---
>  multipathd/main.c | 6 ++++--
>  1 file changed, 4 insertions(+), 2 deletions(-)
> 
> diff --git a/multipathd/main.c b/multipathd/main.c
> index 6090434c..4bdf14bd 100644
> --- a/multipathd/main.c
> +++ b/multipathd/main.c
> @@ -1284,7 +1284,7 @@ ev_remove_path (struct path *pp, struct vectors *
> vecs, int need_do_map)
>  
>                         strlcpy(devt, pp->dev_t, sizeof(devt));
>                         if (setup_multipath(vecs, mpp))
> -                               return 1;
> +                               return 0;
>                         /*
>                          * Successful map reload without this path:
>                          * sync_map_state() will free it.
> @@ -1304,8 +1304,10 @@ out:
>         return retval;
>  
>  fail:
> +       condlog(0, "%s: error removing path. removing map %s", pp->dev,
> +               mpp->alias);
>         remove_map_and_stop_waiter(mpp, vecs);
> -       return 1;
> +       return 0;
>  }
>  
>  static int


--
dm-devel mailing list
dm-devel@redhat.com
https://listman.redhat.com/mailman/listinfo/dm-devel


^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [dm-devel] [PATCH 4/5] multipath: free vectors in configure
  2021-05-11 23:22 ` [dm-devel] [PATCH 4/5] multipath: free vectors in configure Benjamin Marzinski
@ 2021-05-12 12:36   ` Martin Wilck
  2021-05-12 19:53     ` Benjamin Marzinski
  0 siblings, 1 reply; 17+ messages in thread
From: Martin Wilck @ 2021-05-12 12:36 UTC (permalink / raw)
  To: bmarzins, christophe.varoqui; +Cc: dm-devel

On Tue, 2021-05-11 at 18:22 -0500, Benjamin Marzinski wrote:
> configure() can retry multiple times, each time reallocing a maps and
> paths vector, and leaking the previous ones. Fix this by always
> freeing
> the vectors before configure() exits. Found by coverity.
> 
> Signed-off-by: Benjamin Marzinski <bmarzins@redhat.com>
> ---
>  multipath/main.c | 7 ++++++-
>  1 file changed, 6 insertions(+), 1 deletion(-)
> 
> diff --git a/multipath/main.c b/multipath/main.c
> index ef89c7cf..25c5dbfd 100644
> --- a/multipath/main.c
> +++ b/multipath/main.c
> @@ -466,7 +466,6 @@ configure (struct config *conf, enum mpath_cmds
> cmd,
>          */
>         curmp = vector_alloc();
>         pathvec = vector_alloc();
> -       atexit(cleanup_vecs);
>  
>         if (!curmp || !pathvec) {
>                 condlog(0, "can not allocate memory");
> @@ -578,6 +577,11 @@ out:
>         if (refwwid)
>                 FREE(refwwid);
>  
> +       free_multipathvec(curmp, KEEP_PATHS);
> +       vecs.mpvec = NULL;
> +       free_pathvec(pathvec, FREE_PATHS);
> +       vecs.pathvec = NULL;
> +
>         return r;
>  }
>  
> @@ -1053,6 +1057,7 @@ main (int argc, char *argv[])
>                 r = dm_flush_maps(1, retries) ? RTVL_FAIL : RTVL_OK;
>                 goto out;
>         }
> +       atexit(cleanup_vecs);
>         while ((r = configure(conf, cmd, dev_type, dev)) ==
> RTVL_RETRY)
>                 condlog(3, "restart multipath configuration
> process");
>  


Nit: I'd rather move this atexit() call towards the beginning of
main(), after the other atexit() calls.

Apart from that, ACK.

Martin


--
dm-devel mailing list
dm-devel@redhat.com
https://listman.redhat.com/mailman/listinfo/dm-devel


^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [dm-devel] [PATCH 5/5] kpartx: Don't leak memory when getblock returns NULL
  2021-05-11 23:22 ` [dm-devel] [PATCH 5/5] kpartx: Don't leak memory when getblock returns NULL Benjamin Marzinski
@ 2021-05-12 12:39   ` Martin Wilck
  0 siblings, 0 replies; 17+ messages in thread
From: Martin Wilck @ 2021-05-12 12:39 UTC (permalink / raw)
  To: bmarzins, christophe.varoqui; +Cc: dm-devel

On Tue, 2021-05-11 at 18:22 -0500, Benjamin Marzinski wrote:
> If a new block was allocated, but couldn't be filled, getblock will
> discard it. When it does so, it needs to free the block to avoid
> leaking
> memory. Found by coverity.
> 
> Signed-off-by: Benjamin Marzinski <bmarzins@redhat.com>

Reviewed-by: Martin Wilck <mwilck@suse.com>


--
dm-devel mailing list
dm-devel@redhat.com
https://listman.redhat.com/mailman/listinfo/dm-devel


^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [dm-devel] [PATCH 1/5] multipathd: don't fail to remove path once the map is removed
  2021-05-12  9:11   ` Martin Wilck
@ 2021-05-12 14:17     ` Benjamin Marzinski
  0 siblings, 0 replies; 17+ messages in thread
From: Benjamin Marzinski @ 2021-05-12 14:17 UTC (permalink / raw)
  To: Martin Wilck; +Cc: dm-devel

On Wed, May 12, 2021 at 09:11:01AM +0000, Martin Wilck wrote:
> On Tue, 2021-05-11 at 18:22 -0500, Benjamin Marzinski wrote:
> > In ev_remove_path(), if update_mpp_paths() fails, we delete the
> > entire
> > map. However, since update_mpp_paths() happens before we call
> > set_path_removed(), pp->initialized isn't set to INIT_REMOVED, so
> > remove_map_and_stop_waiter() doesn't remove the path when in removes
> > the
> > map.  But with the map removed, there's nothing to keep us from
> > removing
> > the path.
> > 
> > Call set_path_removed() before update_mpp_paths() to avoid the odd
> > case
> > of ev_remove_path() removing the map but not the path.
> > 
> > Signed-off-by: Benjamin Marzinski <bmarzins@redhat.com>
> > ---
> >  libmultipath/structs_vec.c |  3 +--
> >  multipathd/main.c          | 13 ++++++++-----
> >  2 files changed, 9 insertions(+), 7 deletions(-)
> > 
> > diff --git a/libmultipath/structs_vec.c b/libmultipath/structs_vec.c
> > index d242c06b..432c0c63 100644
> > --- a/libmultipath/structs_vec.c
> > +++ b/libmultipath/structs_vec.c
> > @@ -45,8 +45,7 @@ int update_mpp_paths(struct multipath *mpp, vector
> > pathvec)
> >  
> >                                 /*
> >                                  * Avoid adding removed paths to the
> > map again
> > -                                * when we reload it. Such paths may
> > exist if
> > -                                * domap fails in ev_remove_path().
> > +                                * when we reload it.
> 
> I'd like to keep the remark about domap(). It's meant as a reminder for
> us and future developers how this situation is most likely to come to
> pass.

Sure. I just removed it, since we now call update_mpp_paths immediately
after calling set_path_removed(), so it seemed more obvious that we will
run into this situation than it did before, when it only happened if we
first failed in ev_remove_path(). I'm fine with putting it back.

> 
> Other than that, ACK.
> 
> Regards,
> Martin

--
dm-devel mailing list
dm-devel@redhat.com
https://listman.redhat.com/mailman/listinfo/dm-devel


^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [dm-devel] [PATCH 3/5] multipathd: make ev_remove_path return success on path removal
  2021-05-12 11:38   ` Martin Wilck
@ 2021-05-12 19:53     ` Benjamin Marzinski
  2021-05-12 20:36       ` Martin Wilck
  0 siblings, 1 reply; 17+ messages in thread
From: Benjamin Marzinski @ 2021-05-12 19:53 UTC (permalink / raw)
  To: Martin Wilck; +Cc: dm-devel

On Wed, May 12, 2021 at 11:38:08AM +0000, Martin Wilck wrote:
> On Tue, 2021-05-11 at 18:22 -0500, Benjamin Marzinski wrote:
> > When ev_remove_path() returns success, callers assume that the path
> > (and
> > possibly the map) has been removed.  When ev_remove_path() returns
> > failure, callers assume that the path has not been removed. However,
> > the
> > path could be removed on both success or failure. This could cause
> > callers to dereference the path after it was removed. Change
> > ev_remove_path() to return success whenever the path is removed, even
> > if
> > the map was removed due to a failure when trying to reload it. Found by
> > coverity.
> > 
> > Signed-off-by: Benjamin Marzinski <bmarzins@redhat.com>
> 
> This looks ok, but I'd like to discuss it in some more depth.
> 
> We need to clarify the meaning of the return code of ev_remove_path().
> We guarantee that, when ev_remove_path() returns, either the path is
> removed from internal data structures and kernel maps, or INIT_REMOVED
> is set. We can't guarantee whether the path
> 
>  a) is not referenced any more by any kernel map,
>  b) was actually removed from pathvec and other data structures in
> multipathd.
> 
> We have to agree on whether it means a) or b) if we can't make these
> two cases equivalent. Assuming multipathd has correct information about
> the kernel mappings, the only difference between a) and b) is the
> unlikely failure in setup_multipath(), where a) is true and b) is not
> (because sync_map_state() won't be called). Your patch assumes the
> semantics of a), which is correct AFAICS, but should be more clearly
> documented.

Well, actually because of wait_for_udev and !need_do_map, a successful
return can still leave the kernel maps and internal structures
unchanged. It's just that callers have to assume that b is the case.
 
> Actually, I think we can fix the discrepancy between a) and b) - if
> domap() was successful, we should be able to orphan the path, even if
> update_multipath_strings() failed for some reason.

I'm pretty sure that this is already the case.  This comment

 /*
  * Successful map reload without this path:
  * sync_map_state() will free it.
  */

is a lie. If setup_multipath() succeeds, the path will get removed by
check_removed_paths() via:

__setup_multipath -> update_path_strings -> sync_paths -> check_removed_paths

If setup_multipath() fails, the path will get removed by
remove_map_and_stop_waiter() via:

__setup_multipath -> remove_map_and_stop_waiter -> remove_map -> orphan_paths

So AFAICS, the only way for a path not to get removed is if you succeed
with wait_for_udev or !need_do_map, or if you fail in domap.

> IMO we should consequently change the retval for the cases where
> ev_remove_path() returns without deleting the path for a non-"failure"
> case (wait_for_udev and !need_do_map).

So you think these should return failure? For need_do_map, I think we
would want to distinguish between cases where everything worked
correctly and we're just waiting to update the map, and cases where
something went wrong. Since wait_for_udev can happen in more situations,
it's a lot harder to say what the right answer is. For cli_add_path and
uev_add_path, it seems like we want to know if the path was really
removed. So returning failure there makes sense.  For cli_del_path and
uev_remove_path, it seems like we want to avoid spurious error messages
when everything went alright and we're just waiting to update the map.
So returning success makes sense there.

Perhaps the answer is to return symbolic values, to describe what
actually happened, rather than success or failure. They could either be
bitmask values or we could have helper functions to help checking
for multiple valid return values.

> Thoughts? Whatever we decide wrt the semantics of the return code, we
> should document it clearly in comments at the function header.
> 
> Here's a quick review of callers:
> 
>  - cli_add_path(): AFAICS this needs b) semantics. We shouldn't set up
> a new path unless it had been successfully removed internally.
>  - cli_del_path(): needs a) semantics.
>  - handle_path_wwid_change(): needs a).
>  - uev_add_path(): needs a).
>  - uev_remove_path(): the return code of ev_remove_path doesn't matter
> much here. This is the only caller that sets need_do_map = false.
>  - uev_update_path(): we currently don't look at the return code.
> uev_add_path() will make another attempt at removing the path if
> necessary.
> 
> Regards
> Martin
> 
> P.S.: The remaining failure cases in ev_remove_path() are the failures
> in update_mpp_paths() and setup_map(). The former can only fail with
> ENOMEM, afaics. The latter, likewise, or if the map is fundamentally
> misconfigured (which to me means that a previous call to setup_map()
> would have failed as well). I'm wondering why we remove the entire map
> in these failure cases. This goes back all the way to 45eb316 
> ("[multipathd] DM configuration final cut") from 2005. It's true that
> both failures are pretty much fatal, but why is removing the map the
> answer here?

I don't think it has to be the answer. There are other cases where
setup_map() fails and we don't automatically wipe the map.  I did
consider changing it when I was looking through ev_remove_path(), but
I've never known this code to cause any issues, and as you mention,
it isn't wrong to do so, just not really necessary AFAICS.

> However, this goes beyond the purpose of your patch. *If* we remove the
> map, returning 0 is correct for either a) or b).
> 
> P.S. 2: I wonder if the logic in uev_update_path() is correct. Rather
> than calling uev_add_path() after rescan_path() directly, I think we
> should rather wait for another uevent (and possibly trigger another
> "add" event, I don't think "rescan" automatically generates one).
> 

Yep. You're correct. I'll fix that.

-Ben

> 
> > ---
> >  multipathd/main.c | 6 ++++--
> >  1 file changed, 4 insertions(+), 2 deletions(-)
> > 
> > diff --git a/multipathd/main.c b/multipathd/main.c
> > index 6090434c..4bdf14bd 100644
> > --- a/multipathd/main.c
> > +++ b/multipathd/main.c
> > @@ -1284,7 +1284,7 @@ ev_remove_path (struct path *pp, struct vectors *
> > vecs, int need_do_map)
> >  
> >                         strlcpy(devt, pp->dev_t, sizeof(devt));
> >                         if (setup_multipath(vecs, mpp))
> > -                               return 1;
> > +                               return 0;
> >                         /*
> >                          * Successful map reload without this path:
> >                          * sync_map_state() will free it.
> > @@ -1304,8 +1304,10 @@ out:
> >         return retval;
> >  
> >  fail:
> > +       condlog(0, "%s: error removing path. removing map %s", pp->dev,
> > +               mpp->alias);
> >         remove_map_and_stop_waiter(mpp, vecs);
> > -       return 1;
> > +       return 0;
> >  }
> >  
> >  static int

--
dm-devel mailing list
dm-devel@redhat.com
https://listman.redhat.com/mailman/listinfo/dm-devel


^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [dm-devel] [PATCH 4/5] multipath: free vectors in configure
  2021-05-12 12:36   ` Martin Wilck
@ 2021-05-12 19:53     ` Benjamin Marzinski
  0 siblings, 0 replies; 17+ messages in thread
From: Benjamin Marzinski @ 2021-05-12 19:53 UTC (permalink / raw)
  To: Martin Wilck; +Cc: dm-devel

On Wed, May 12, 2021 at 12:36:42PM +0000, Martin Wilck wrote:
> On Tue, 2021-05-11 at 18:22 -0500, Benjamin Marzinski wrote:
> > configure() can retry multiple times, each time reallocing a maps and
> > paths vector, and leaking the previous ones. Fix this by always
> > freeing
> > the vectors before configure() exits. Found by coverity.
> > 
> > Signed-off-by: Benjamin Marzinski <bmarzins@redhat.com>
> > ---
> >  multipath/main.c | 7 ++++++-
> >  1 file changed, 6 insertions(+), 1 deletion(-)
> > 
> > diff --git a/multipath/main.c b/multipath/main.c
> > index ef89c7cf..25c5dbfd 100644
> > --- a/multipath/main.c
> > +++ b/multipath/main.c
> > @@ -466,7 +466,6 @@ configure (struct config *conf, enum mpath_cmds
> > cmd,
> >          */
> >         curmp = vector_alloc();
> >         pathvec = vector_alloc();
> > -       atexit(cleanup_vecs);
> >  
> >         if (!curmp || !pathvec) {
> >                 condlog(0, "can not allocate memory");
> > @@ -578,6 +577,11 @@ out:
> >         if (refwwid)
> >                 FREE(refwwid);
> >  
> > +       free_multipathvec(curmp, KEEP_PATHS);
> > +       vecs.mpvec = NULL;
> > +       free_pathvec(pathvec, FREE_PATHS);
> > +       vecs.pathvec = NULL;
> > +
> >         return r;
> >  }
> >  
> > @@ -1053,6 +1057,7 @@ main (int argc, char *argv[])
> >                 r = dm_flush_maps(1, retries) ? RTVL_FAIL : RTVL_OK;
> >                 goto out;
> >         }
> > +       atexit(cleanup_vecs);
> >         while ((r = configure(conf, cmd, dev_type, dev)) ==
> > RTVL_RETRY)
> >                 condlog(3, "restart multipath configuration
> > process");
> >  
> 
> 
> Nit: I'd rather move this atexit() call towards the beginning of
> main(), after the other atexit() calls.

Sure.

-Ben

> 
> Apart from that, ACK.
> 
> Martin

--
dm-devel mailing list
dm-devel@redhat.com
https://listman.redhat.com/mailman/listinfo/dm-devel


^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [dm-devel] [PATCH 3/5] multipathd: make ev_remove_path return success on path removal
  2021-05-12 19:53     ` Benjamin Marzinski
@ 2021-05-12 20:36       ` Martin Wilck
  2021-05-12 21:52         ` Benjamin Marzinski
  0 siblings, 1 reply; 17+ messages in thread
From: Martin Wilck @ 2021-05-12 20:36 UTC (permalink / raw)
  To: bmarzins; +Cc: dm-devel

On Wed, 2021-05-12 at 14:53 -0500, Benjamin Marzinski wrote:
> On Wed, May 12, 2021 at 11:38:08AM +0000, Martin Wilck wrote:
> > On Tue, 2021-05-11 at 18:22 -0500, Benjamin Marzinski wrote:
> > > When ev_remove_path() returns success, callers assume that the
> > > path
> > > (and
> > > possibly the map) has been removed.  When ev_remove_path()
> > > returns
> > > failure, callers assume that the path has not been removed.
> > > However,
> > > the
> > > path could be removed on both success or failure. This could
> > > cause
> > > callers to dereference the path after it was removed. Change
> > > ev_remove_path() to return success whenever the path is removed,
> > > even
> > > if
> > > the map was removed due to a failure when trying to reload it.
> > > Found by
> > > coverity.
> > > 
> > > Signed-off-by: Benjamin Marzinski <bmarzins@redhat.com>
> > 
> > This looks ok, but I'd like to discuss it in some more depth.
> > 
> > We need to clarify the meaning of the return code of
> > ev_remove_path().
> > We guarantee that, when ev_remove_path() returns, either the path
> > is
> > removed from internal data structures and kernel maps, or
> > INIT_REMOVED
> > is set. We can't guarantee whether the path
> > 
> >  a) is not referenced any more by any kernel map,
> >  b) was actually removed from pathvec and other data structures in
> > multipathd.
> > 
> > We have to agree on whether it means a) or b) if we can't make
> > these
> > two cases equivalent. Assuming multipathd has correct information
> > about
> > the kernel mappings, the only difference between a) and b) is the
> > unlikely failure in setup_multipath(), where a) is true and b) is
> > not
> > (because sync_map_state() won't be called). Your patch assumes the
> > semantics of a), which is correct AFAICS, but should be more
> > clearly
> > documented.
> 
> Well, actually because of wait_for_udev and !need_do_map, a
> successful
> return can still leave the kernel maps and internal structures
> unchanged. It's just that callers have to assume that b is the case.
>  
> > Actually, I think we can fix the discrepancy between a) and b) - if
> > domap() was successful, we should be able to orphan the path, even
> > if
> > update_multipath_strings() failed for some reason.
> 
> I'm pretty sure that this is already the case.  This comment
> 
>  /*
>   * Successful map reload without this path:
>   * sync_map_state() will free it.
>   */
> 
> is a lie.

Indeed, you are right. I wasn't deliberately lying though, just failing
to understand my own code :-(  We should fix these comments.

>  If setup_multipath() succeeds, the path will get removed by
> check_removed_paths() via:
> 
> __setup_multipath -> update_path_strings -> sync_paths ->
> check_removed_paths
> 
> If setup_multipath() fails, the path will get removed by
> remove_map_and_stop_waiter() via:
> 
> __setup_multipath -> remove_map_and_stop_waiter -> remove_map ->
> orphan_paths
> 
> So AFAICS, the only way for a path not to get removed is if you
> succeed
> with wait_for_udev or !need_do_map, or if you fail in domap.

Agreed. Let's fix these comments.

> > IMO we should consequently change the retval for the cases where
> > ev_remove_path() returns without deleting the path for a non-
> > "failure"
> > case (wait_for_udev and !need_do_map).
> 
> So you think these should return failure? 

What I meant is that the return code of the function doesn't need to be
interpreted in terms of "success" or "failure". Rather as "path is now
gone" or "path is still referenced somewhere", which doesn't map 1:1 to
"success" and "failure", IMO.

> For need_do_map, I think we
> would want to distinguish between cases where everything worked
> correctly and we're just waiting to update the map, and cases where
> something went wrong.

This one is actually trivial, because it's only set to false by
uev_remove_path() if it's merging uevents.

>  Since wait_for_udev can happen in more situations,
> it's a lot harder to say what the right answer is. For cli_add_path
> and
> uev_add_path, it seems like we want to know if the path was really
> removed. So returning failure there makes sense.  For cli_del_path
> and
> uev_remove_path, it seems like we want to avoid spurious error
> messages
> when everything went alright and we're just waiting to update the
> map.
> So returning success makes sense there.
> 
> Perhaps the answer is to return symbolic values, to describe what
> actually happened, rather than success or failure.

This is what I meant. I didn't express myself clearly enough; I just
thought that 0 doesn't have to mean "success".


>  They could either be
> bitmask values or we could have helper functions to help checking
> for multiple valid return values.

I think the callers just need to know if the path is still referenced
somewhere. Acting appropriately is then up to the caller. You just
proved that my cases a) and b) are actually equivalent, which is nice.
Perhaps we need to introduce another return code indicating that the
entire map had been removed (e.g. failure in setup_multipath()).

> > Thoughts? Whatever we decide wrt the semantics of the return code,
> > we
> > should document it clearly in comments at the function header.
> > 
> > Here's a quick review of callers:
> > 
> >  - cli_add_path(): AFAICS this needs b) semantics. We shouldn't set
> > up
> > a new path unless it had been successfully removed internally.
> >  - cli_del_path(): needs a) semantics.
> >  - handle_path_wwid_change(): needs a).
> >  - uev_add_path(): needs a).
> >  - uev_remove_path(): the return code of ev_remove_path doesn't
> > matter
> > much here. This is the only caller that sets need_do_map = false.
> >  - uev_update_path(): we currently don't look at the return code.
> > uev_add_path() will make another attempt at removing the path if
> > necessary.
> > 
> > Regards
> > Martin
> > 
> > P.S.: The remaining failure cases in ev_remove_path() are the
> > failures
> > in update_mpp_paths() and setup_map(). The former can only fail
> > with
> > ENOMEM, afaics. The latter, likewise, or if the map is
> > fundamentally
> > misconfigured (which to me means that a previous call to
> > setup_map()
> > would have failed as well). I'm wondering why we remove the entire
> > map
> > in these failure cases. This goes back all the way to 45eb316 
> > ("[multipathd] DM configuration final cut") from 2005. It's true
> > that
> > both failures are pretty much fatal, but why is removing the map
> > the
> > answer here?
> 
> I don't think it has to be the answer. There are other cases where
> setup_map() fails and we don't automatically wipe the map.  I did
> consider changing it when I was looking through ev_remove_path(), but
> I've never known this code to cause any issues, and as you mention,
> it isn't wrong to do so, just not really necessary AFAICS.

Let's take care of this another time.

Regards,
Martin

> 
> > However, this goes beyond the purpose of your patch. *If* we remove
> > the
> > map, returning 0 is correct for either a) or b).
> > 
> > P.S. 2: I wonder if the logic in uev_update_path() is correct.
> > Rather
> > than calling uev_add_path() after rescan_path() directly, I think
> > we
> > should rather wait for another uevent (and possibly trigger another
> > "add" event, I don't think "rescan" automatically generates one).
> > 
> 
> Yep. You're correct. I'll fix that.
> 
> -Ben
> 
> > 
> > > ---
> > >  multipathd/main.c | 6 ++++--
> > >  1 file changed, 4 insertions(+), 2 deletions(-)
> > > 
> > > diff --git a/multipathd/main.c b/multipathd/main.c
> > > index 6090434c..4bdf14bd 100644
> > > --- a/multipathd/main.c
> > > +++ b/multipathd/main.c
> > > @@ -1284,7 +1284,7 @@ ev_remove_path (struct path *pp, struct
> > > vectors *
> > > vecs, int need_do_map)
> > >  
> > >                         strlcpy(devt, pp->dev_t, sizeof(devt));
> > >                         if (setup_multipath(vecs, mpp))
> > > -                               return 1;
> > > +                               return 0;
> > >                         /*
> > >                          * Successful map reload without this
> > > path:
> > >                          * sync_map_state() will free it.
> > > @@ -1304,8 +1304,10 @@ out:
> > >         return retval;
> > >  
> > >  fail:
> > > +       condlog(0, "%s: error removing path. removing map %s",
> > > pp->dev,
> > > +               mpp->alias);
> > >         remove_map_and_stop_waiter(mpp, vecs);
> > > -       return 1;
> > > +       return 0;
> > >  }
> > >  
> > >  static int
> 
> --
> dm-devel mailing list
> dm-devel@redhat.com
> https://listman.redhat.com/mailman/listinfo/dm-devel
> 


--
dm-devel mailing list
dm-devel@redhat.com
https://listman.redhat.com/mailman/listinfo/dm-devel


^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [dm-devel] [PATCH 3/5] multipathd: make ev_remove_path return success on path removal
  2021-05-12 20:36       ` Martin Wilck
@ 2021-05-12 21:52         ` Benjamin Marzinski
  2021-05-13 19:36           ` Martin Wilck
  0 siblings, 1 reply; 17+ messages in thread
From: Benjamin Marzinski @ 2021-05-12 21:52 UTC (permalink / raw)
  To: Martin Wilck; +Cc: dm-devel

On Wed, May 12, 2021 at 08:36:49PM +0000, Martin Wilck wrote:
> On Wed, 2021-05-12 at 14:53 -0500, Benjamin Marzinski wrote:
> > On Wed, May 12, 2021 at 11:38:08AM +0000, Martin Wilck wrote:
> > > On Tue, 2021-05-11 at 18:22 -0500, Benjamin Marzinski wrote:
> > So AFAICS, the only way for a path not to get removed is if you
> > succeed
> > with wait_for_udev or !need_do_map, or if you fail in domap.
> 
> Agreed. Let's fix these comments.

Yep.
 
> >  Since wait_for_udev can happen in more situations,
> > it's a lot harder to say what the right answer is. For cli_add_path
> > and
> > uev_add_path, it seems like we want to know if the path was really
> > removed. So returning failure there makes sense.  For cli_del_path
> > and
> > uev_remove_path, it seems like we want to avoid spurious error
> > messages
> > when everything went alright and we're just waiting to update the
> > map.
> > So returning success makes sense there.
> > 
> > Perhaps the answer is to return symbolic values, to describe what
> > actually happened, rather than success or failure.
> 
> This is what I meant. I didn't express myself clearly enough; I just
> thought that 0 doesn't have to mean "success".
> 

Sure. I'll add symbolic returns.

> 
> I think the callers just need to know if the path is still referenced
> somewhere. Acting appropriately is then up to the caller. You just
> proved that my cases a) and b) are actually equivalent, which is nice.
> Perhaps we need to introduce another return code indicating that the
> entire map had been removed (e.g. failure in setup_multipath()).

The more important return to me seems to be an indication of whether the
remove has been delayed.  For uev_remove_path(), you don't want to
return failure just because the remove has been delayed. Otherwise there
will be spurious error messages in the logs. cli_del_path is a little
trickier.  My biggest question with that is whether it would mess with
people's scripts to add a reply message saying what happened. It seems
like it should only fail if domap failed. But it would be nice to tell
the user that the remove has been delayed, or that the map couldn't be
reloaded and was removed as well. 

> > > However, this goes beyond the purpose of your patch. *If* we remove
> > > the
> > > map, returning 0 is correct for either a) or b).
> > > 
> > > P.S. 2: I wonder if the logic in uev_update_path() is correct.
> > > Rather
> > > than calling uev_add_path() after rescan_path() directly, I think
> > > we
> > > should rather wait for another uevent (and possibly trigger another
> > > "add" event, I don't think "rescan" automatically generates one).
> > > 
> > 
> > Yep. You're correct. I'll fix that.

Actually, I take it back. The code seems to work o.k. as is. The
uev_update_path() code checks if get_uid() now returns a different
value, instead of using get_vpd_sgio() like the recheck_wwid code does.
This means that the uid_attribute must have already gotten updated when
rescan_path() is called. So my real question is "is there any real
benefit to calling rescan_path() at all here". This code seemed to be
working correctly before we added it, except in the case where
uid_attribute wasn't getting updated (which recheck_wwid now will
hopefully catch).

If there is a benefit, then we have to be careful to only call it once.
Otherwise, we could get stuck in an endless loop where we trigger an add
uevent, which in turn triggers another add uevent, and so on.

-Ben
 
> > -Ben
> > 
> > > 
> > > > ---
> > > >  multipathd/main.c | 6 ++++--
> > > >  1 file changed, 4 insertions(+), 2 deletions(-)
> > > > 
> > > > diff --git a/multipathd/main.c b/multipathd/main.c
> > > > index 6090434c..4bdf14bd 100644
> > > > --- a/multipathd/main.c
> > > > +++ b/multipathd/main.c
> > > > @@ -1284,7 +1284,7 @@ ev_remove_path (struct path *pp, struct
> > > > vectors *
> > > > vecs, int need_do_map)
> > > >  
> > > >                         strlcpy(devt, pp->dev_t, sizeof(devt));
> > > >                         if (setup_multipath(vecs, mpp))
> > > > -                               return 1;
> > > > +                               return 0;
> > > >                         /*
> > > >                          * Successful map reload without this
> > > > path:
> > > >                          * sync_map_state() will free it.
> > > > @@ -1304,8 +1304,10 @@ out:
> > > >         return retval;
> > > >  
> > > >  fail:
> > > > +       condlog(0, "%s: error removing path. removing map %s",
> > > > pp->dev,
> > > > +               mpp->alias);
> > > >         remove_map_and_stop_waiter(mpp, vecs);
> > > > -       return 1;
> > > > +       return 0;
> > > >  }
> > > >  
> > > >  static int
> > 
> > --
> > dm-devel mailing list
> > dm-devel@redhat.com
> > https://listman.redhat.com/mailman/listinfo/dm-devel
> > 

--
dm-devel mailing list
dm-devel@redhat.com
https://listman.redhat.com/mailman/listinfo/dm-devel


^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [dm-devel] [PATCH 3/5] multipathd: make ev_remove_path return success on path removal
  2021-05-12 21:52         ` Benjamin Marzinski
@ 2021-05-13 19:36           ` Martin Wilck
  0 siblings, 0 replies; 17+ messages in thread
From: Martin Wilck @ 2021-05-13 19:36 UTC (permalink / raw)
  To: bmarzins; +Cc: dm-devel

On Wed, 2021-05-12 at 16:52 -0500, Benjamin Marzinski wrote:
> On Wed, May 12, 2021 at 08:36:49PM +0000, Martin Wilck wrote:
> > On Wed, 2021-05-12 at 14:53 -0500, Benjamin Marzinski wrote:
> > > On Wed, May 12, 2021 at 11:38:08AM +0000, Martin Wilck wrote:
> > > > On Tue, 2021-05-11 at 18:22 -0500, Benjamin Marzinski wrote:
> > > So AFAICS, the only way for a path not to get removed is if you
> > > succeed
> > > with wait_for_udev or !need_do_map, or if you fail in domap.
> > 
> > Agreed. Let's fix these comments.
> 
> Yep.
>  
> > >  Since wait_for_udev can happen in more situations,
> > > it's a lot harder to say what the right answer is. For
> > > cli_add_path
> > > and
> > > uev_add_path, it seems like we want to know if the path was
> > > really
> > > removed. So returning failure there makes sense.  For
> > > cli_del_path
> > > and
> > > uev_remove_path, it seems like we want to avoid spurious error
> > > messages
> > > when everything went alright and we're just waiting to update the
> > > map.
> > > So returning success makes sense there.
> > > 
> > > Perhaps the answer is to return symbolic values, to describe what
> > > actually happened, rather than success or failure.
> > 
> > This is what I meant. I didn't express myself clearly enough; I
> > just
> > thought that 0 doesn't have to mean "success".
> > 
> 
> Sure. I'll add symbolic returns.
> 
> > 
> > I think the callers just need to know if the path is still
> > referenced
> > somewhere. Acting appropriately is then up to the caller. You just
> > proved that my cases a) and b) are actually equivalent, which is
> > nice.
> > Perhaps we need to introduce another return code indicating that
> > the
> > entire map had been removed (e.g. failure in setup_multipath()).
> 
> The more important return to me seems to be an indication of whether
> the
> remove has been delayed. 

To make sure that we talk about the same thing: when you say "the
remove has been delayed", you mean the case where we just set
INIT_REMOVED, without actually deleting the path from pathvec etc.,
right? This is what I meant with "path is still referenced somewhere"
in my previous post. Ack, this is of course the most important thing
for the callers to know.

>  For uev_remove_path(), you don't want to
> return failure just because the remove has been delayed. Otherwise
> there
> will be spurious error messages in the logs.

With the introduction of INIT_REMOVED, I think we could do away with
these error messages altogether. uev_remove_path() could actually be a
void function. We *know* that at least INIT_REMOVED will be set, which
means that that path will be treated by multipathd as if it didn't
exist. The error message you're talking about would be the highly
unhelpful "uevent trigger error" message - we might was well just ditch
that message. We print much more meaningful messages in
ev_remove_path().

>  cli_del_path is a little
> trickier.  My biggest question with that is whether it would mess
> with
> people's scripts to add a reply message saying what happened. It
> seems
> like it should only fail if domap failed. But it would be nice to
> tell
> the user that the remove has been delayed, or that the map couldn't
> be
> reloaded and was removed as well. 

Same argument here. As far as multipathd is concerned, that path will
be gone. We print "fail" if the domap() call failed, and we should
continue to do so. We could add documentation saying that this means a
"deferred removal".

> 
> > > > However, this goes beyond the purpose of your patch. *If* we
> > > > remove
> > > > the
> > > > map, returning 0 is correct for either a) or b).
> > > > 
> > > > P.S. 2: I wonder if the logic in uev_update_path() is correct.
> > > > Rather
> > > > than calling uev_add_path() after rescan_path() directly, I
> > > > think
> > > > we
> > > > should rather wait for another uevent (and possibly trigger
> > > > another
> > > > "add" event, I don't think "rescan" automatically generates
> > > > one).
> > > > 
> > > 
> > > Yep. You're correct. I'll fix that.
> 
> Actually, I take it back. The code seems to work o.k. as is. The
> uev_update_path() code checks if get_uid() now returns a different
> value, instead of using get_vpd_sgio() like the recheck_wwid code
> does.
> This means that the uid_attribute must have already gotten updated
> when
> rescan_path() is called. So my real question is "is there any real
> benefit to calling rescan_path() at all here". This code seemed to be
> working correctly before we added it, except in the case where
> uid_attribute wasn't getting updated (which recheck_wwid now will
> hopefully catch).

My point was that calling uev_add_path() right after rescan_path() is
wrong, and I still think so - *if* we rescan, we shouldn't look at udev
properties before we can be reasonably sure that the rescan has
completed and has been processed by udev. I agree that calling
rescan_path() in this code path is probably not helpful. 

Let's remove it.

> If there is a benefit, then we have to be careful to only call it
> once.
> Otherwise, we could get stuck in an endless loop where we trigger an
> add
> uevent, which in turn triggers another add uevent, and so on.

I don't see that risk, because uev_update_path() is only called for
"change" uevents, not "add".

Regards,
Martin

> 
> -Ben
>  
> > > -Ben
> > > 
> > > > 
> > > > > ---
> > > > >  multipathd/main.c | 6 ++++--
> > > > >  1 file changed, 4 insertions(+), 2 deletions(-)
> > > > > 
> > > > > diff --git a/multipathd/main.c b/multipathd/main.c
> > > > > index 6090434c..4bdf14bd 100644
> > > > > --- a/multipathd/main.c
> > > > > +++ b/multipathd/main.c
> > > > > @@ -1284,7 +1284,7 @@ ev_remove_path (struct path *pp, struct
> > > > > vectors *
> > > > > vecs, int need_do_map)
> > > > >  
> > > > >                         strlcpy(devt, pp->dev_t,
> > > > > sizeof(devt));
> > > > >                         if (setup_multipath(vecs, mpp))
> > > > > -                               return 1;
> > > > > +                               return 0;
> > > > >                         /*
> > > > >                          * Successful map reload without this
> > > > > path:
> > > > >                          * sync_map_state() will free it.
> > > > > @@ -1304,8 +1304,10 @@ out:
> > > > >         return retval;
> > > > >  
> > > > >  fail:
> > > > > +       condlog(0, "%s: error removing path. removing map
> > > > > %s",
> > > > > pp->dev,
> > > > > +               mpp->alias);
> > > > >         remove_map_and_stop_waiter(mpp, vecs);
> > > > > -       return 1;
> > > > > +       return 0;
> > > > >  }
> > > > >  
> > > > >  static int
> > > 
> > > --
> > > dm-devel mailing list
> > > dm-devel@redhat.com
> > > https://listman.redhat.com/mailman/listinfo/dm-devel
> > > 
> 


--
dm-devel mailing list
dm-devel@redhat.com
https://listman.redhat.com/mailman/listinfo/dm-devel


^ permalink raw reply	[flat|nested] 17+ messages in thread

end of thread, other threads:[~2021-05-13 19:37 UTC | newest]

Thread overview: 17+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2021-05-11 23:21 [dm-devel] [PATCH 0/5] Memory issues found by coverity Benjamin Marzinski
2021-05-11 23:22 ` [dm-devel] [PATCH 1/5] multipathd: don't fail to remove path once the map is removed Benjamin Marzinski
2021-05-12  9:11   ` Martin Wilck
2021-05-12 14:17     ` Benjamin Marzinski
2021-05-11 23:22 ` [dm-devel] [PATCH 2/5] multipathd: remove duplicate orphan_paths in flush_map Benjamin Marzinski
2021-05-12  9:16   ` Martin Wilck
2021-05-11 23:22 ` [dm-devel] [PATCH 3/5] multipathd: make ev_remove_path return success on path removal Benjamin Marzinski
2021-05-12 11:38   ` Martin Wilck
2021-05-12 19:53     ` Benjamin Marzinski
2021-05-12 20:36       ` Martin Wilck
2021-05-12 21:52         ` Benjamin Marzinski
2021-05-13 19:36           ` Martin Wilck
2021-05-11 23:22 ` [dm-devel] [PATCH 4/5] multipath: free vectors in configure Benjamin Marzinski
2021-05-12 12:36   ` Martin Wilck
2021-05-12 19:53     ` Benjamin Marzinski
2021-05-11 23:22 ` [dm-devel] [PATCH 5/5] kpartx: Don't leak memory when getblock returns NULL Benjamin Marzinski
2021-05-12 12:39   ` Martin Wilck

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.