All of lore.kernel.org
 help / color / mirror / Atom feed
* Re: [PATCH]libmultipath/dmparser: add missing path with good status when sync state with dm kernel
@ 2020-07-08  1:47 Chongyun Wu
  2020-07-09 11:11 ` Martin Wilck
  0 siblings, 1 reply; 4+ messages in thread
From: Chongyun Wu @ 2020-07-08  1:47 UTC (permalink / raw)
  To: Martin Wilck, Benjamin Marzinski, dm-devel
  Cc: Chengchiwen, Guozhonghua, Changlimin, Wangyong, Zhangguanghui

Hello Martin,
Thanks for your reply. 
Waiting for your patches I'd like to test it.  And also, I'd like to tell more detail about this issue.

> Is /dev/sdbk indeed a healthy path in this situation, or not?
> Please run "multipathd show devices", too.
Yes, /dev/sdbk can read and write, the condition is well, it fully recovered. 
Just because it not in pathvec and check_path have no chance to found it up and reinstate it in
dm, and dm map have this path but queue_if_no_path still have, dm state is down, dm io blocked.

>I wonder if your logs provide some more evidence how this situation
>came to pass. I suspect that either a) uevents got lost, or that b)
>multipathd failed to (re-)add the path while handling an uevent. It'd
>be interesting to find out what it actually was.
Firstly I also suspect like you, but actually there is not any uevent to remove the path in my debugging version with more log.
But I found that after reconfig processed issue might appears, detect more, found that when storage network down block
device not accessed reconfig will make paths disappear form multipathd because its status is bad. 
I think this might be a reason. Add more log for pathvec removing path, but not found other glue.


>I believe that your problem would have been solved simply by removing
>the "!is_daemon" condition above. I'd like to know if that's actually
>the case, because your add_missing_path() function does basically the
>same thing (plus calling pathinfo()).
I have tried this but not work. Only need pathinfo and store pathinfo into pathvec can work, because check_path need
full path info to finish check old code path only have pp->dev and pp->dev_t, check_path cannot use this to check path status.
Dm io blocked issue not happen again, because this patch can repair this blocked very fast.
Also dm io blocked issue actually is a low probability event.


>However, the "!is_daemon" test is there for a reason (see b96dead 
> ("[multipathd] remove the retry login in uev_remove_path()"), and
>that's why your patch isn't correct as-is.

-----original-----
发件人: Martin Wilck [mailto:mwilck@suse.com] 
发送时间: 2020年7月7日 17:43
收件人: wuchongyun (Cloud) <wu.chongyun@h3c.com>; Benjamin Marzinski <bmarzins@redhat.com>; dm-devel@redhat.com
抄送: guozhonghua (Cloud) <guozhonghua@h3c.com>; wangyong (Cloud) <wang.yongD@h3c.com>; changlimin (Cloud) <changlimin@h3c.com>; zhangguanghui (Cloud) <zhang.guanghui@h3c.com>; chengchiwen (Cloud) <chengchiwen@h3c.com>
主题: Re: [dm-devel] [PATCH]libmultipath/dmparser: add missing path with good status when sync state with dm kernel

Hello Chongyun,

On Tue, 2020-07-07 at 03:08 +0000, Chongyun Wu wrote:
> Hi Martin and Ben,
> 
> Cloud you help to view below patch, thanks.
> 
> > From b2786c81a78bf3868f300fd3177e852e718e7790 Mon Sep 17 00:00:00
> > 2001
> From: Chongyun Wu <wu.chongyun@h3c.com>
> Date: Mon, 6 Jul 2020 11:22:21 +0800
> Subject: [PATCH] libmultipath/dmparser: add missing path with good 
> status  when sync state with dm kernel
> 

Nack, sorry. I know we have an issue in this area, but I would like to
handle it differently.

First of all, I want to get rid of disassemble_map() making
modifications to the pathvec. The fact that disassemble_map() currently
does this is an ugly layer violation IMO, and I don't like the idea of
adding more of this on top. I'm currently preparing a patch set that
addresses this (among other things). It will also address the issue of
missing paths in pathvec, and I'd be very glad if you could give it a
try in your test environment.


> Add path back into deamon vecs->pathvecs when found path missing in
> multipathd which can fix dm io blocked issue.
> 
> Test environment:
> several hosts;
> each host have 100+ multipath, each multipath have 1 to 4 paths.
> run up/down storage network loop script for days.
> 
> Issue:
> After several hours stop script, found some hosts have dm io blocked
> issue:
> slave block device access well, but its dm device blocked.
> 36c0bfc0100a8d4a228214da50000003c dm-20 HUAWEI,XSG1
> size=350G features='1 queue_if_no_path' hwhandler='0' wp=rw
> `-+- policy='round-robin 0' prio=1 status=enabled
>   |- 1:0:0:20  sdbk 67:224  failed ready running
> multipathd show paths cannot found sdbk!

Is /dev/sdbk indeed a healthy path in this situation, or not?
Please run "multipathd show devices", too.

I wonder if your logs provide some more evidence how this situation
came to pass. I suspect that either a) uevents got lost, or that b)
multipathd failed to (re-)add the path while handling an uevent. It'd
be interesting to find out what it actually was.

More notes below.

> Test result:
> With this patch, run script several days, io blocked issue
> not found again.
> 
> This patch can fix this issue: when found path only missing in
> multipathd but still in dm, check the missing path status if ok
> try to add it back, and checker have chance to reinstate this
> path and the dm io blocked issue will disappear.
> 
> Signed-off-by: Chongyun Wu <wu.chongyun@h3c.com>
> ---
>  libmultipath/dmparser.c | 31 +++++++++++++++++++++++++++++++
>  libmultipath/dmparser.h |  1 +
>  2 files changed, 32 insertions(+)
> 
> diff --git a/libmultipath/dmparser.c b/libmultipath/dmparser.c
> index b856a07f..2f90b17c 100644
> --- a/libmultipath/dmparser.c
> +++ b/libmultipath/dmparser.c
> @@ -317,6 +317,12 @@ int disassemble_map(vector pathvec, char
> *params, struct multipath *mpp,
>  				/* Only call this in multipath client
> mode */
>  				if (!is_daemon && store_path(pathvec,
> pp))
>  					goto out1;

I believe that your problem would have been solved simply by removing
the "!is_daemon" condition above. I'd like to know if that's actually
the case, because your add_missing_path() function does basically the
same thing (plus calling pathinfo()).

However, the "!is_daemon" test is there for a reason (see b96dead 
("[multipathd] remove the retry login in uev_remove_path()"), and
that's why your patch isn't correct as-is. 

Regards,
Martin

> +
> +				/* Try to add good status path back to
> avoid
> +				 * dm io blocked issue in special
> condition.
> +				 */
> +				if(add_missing_path(pathvec, devname))
> +					condlog(2, "Try to add missing
> path %s failed", devname);
>  			} else {
>  				if (!strlen(pp->wwid) &&
>  				    strlen(mpp->wwid))
> @@ -569,3 +575,28 @@ int disassemble_status(char *params, struct
> multipath *mpp)
>  	}
>  	return 0;
>  }
> +
> +/* Add missing good status path back to multipathd */
> +int add_missing_path(vector pathvec, char *missing_dev)
> +{
> +	struct udev_device *udevice;
> +	struct config *conf;
> +	int ret = 0;
> +				
> +	condlog(2, "Cant't found path %s try to add it back if its
> state is up.", 
> +			missing_dev);
> +
> +	udevice = udev_device_new_from_subsystem_sysname(udev, "block",
> missing_dev);
> +	if (!udevice) {
> +		condlog(0, "%s: can't find path form udev",
> missing_dev);
> +		return 1;
> +	}
> +	conf = get_multipath_config();
> +	pthread_cleanup_push(put_multipath_config, conf);
> +	ret = store_pathinfo(pathvec, conf,
> +						 udevice, DI_ALL |
> DI_BLACKLIST, NULL);
> +	pthread_cleanup_pop(1);
> +	udev_device_unref(udevice);
> +
> +	return ret;
> +}
> diff --git a/libmultipath/dmparser.h b/libmultipath/dmparser.h
> index e1badb0b..515ca900 100644
> --- a/libmultipath/dmparser.h
> +++ b/libmultipath/dmparser.h
> @@ -1,3 +1,4 @@
>  int assemble_map (struct multipath *, char *, int);
>  int disassemble_map (vector, char *, struct multipath *, int);
>  int disassemble_status (char *, struct multipath *);
> +int add_missing_path(vector pathvec, char *missing_dev);



--
dm-devel mailing list
dm-devel@redhat.com
https://www.redhat.com/mailman/listinfo/dm-devel

^ permalink raw reply	[flat|nested] 4+ messages in thread

* Re: [PATCH]libmultipath/dmparser: add missing path with good status when sync state with dm kernel
  2020-07-08  1:47 [PATCH]libmultipath/dmparser: add missing path with good status when sync state with dm kernel Chongyun Wu
@ 2020-07-09 11:11 ` Martin Wilck
  0 siblings, 0 replies; 4+ messages in thread
From: Martin Wilck @ 2020-07-09 11:11 UTC (permalink / raw)
  To: Chongyun Wu, Benjamin Marzinski, dm-devel
  Cc: Guozhonghua, Chengchiwen, Wangyong, Changlimin, Zhangguanghui

Hi Chongyun,

On Wed, 2020-07-08 at 01:47 +0000, Chongyun Wu wrote:
> Hello Martin,
> Thanks for your reply. 
> Waiting for your patches I'd like to test it.  And also, I'd like to
> tell more detail about this issue.

Please check out the series I just posted. Part V is what matters for
your use case. I planned to CC you on that part but forgot, sorry.

Thanks,
Martin

^ permalink raw reply	[flat|nested] 4+ messages in thread

* Re: [PATCH]libmultipath/dmparser: add missing path with good status when sync state with dm kernel
  2020-07-07  3:08 Chongyun Wu
@ 2020-07-07  9:43 ` Martin Wilck
  0 siblings, 0 replies; 4+ messages in thread
From: Martin Wilck @ 2020-07-07  9:43 UTC (permalink / raw)
  To: Chongyun Wu, Benjamin Marzinski, dm-devel
  Cc: Chengchiwen, Guozhonghua, Changlimin, Wangyong, Zhangguanghui

Hello Chongyun,

On Tue, 2020-07-07 at 03:08 +0000, Chongyun Wu wrote:
> Hi Martin and Ben,
> 
> Cloud you help to view below patch, thanks.
> 
> > From b2786c81a78bf3868f300fd3177e852e718e7790 Mon Sep 17 00:00:00
> > 2001
> From: Chongyun Wu <wu.chongyun@h3c.com>
> Date: Mon, 6 Jul 2020 11:22:21 +0800
> Subject: [PATCH] libmultipath/dmparser: add missing path with good
> status
>  when sync state with dm kernel
> 

Nack, sorry. I know we have an issue in this area, but I would like to
handle it differently.

First of all, I want to get rid of disassemble_map() making
modifications to the pathvec. The fact that disassemble_map() currently
does this is an ugly layer violation IMO, and I don't like the idea of
adding more of this on top. I'm currently preparing a patch set that
addresses this (among other things). It will also address the issue of
missing paths in pathvec, and I'd be very glad if you could give it a
try in your test environment.

> Add path back into deamon vecs->pathvecs when found path missing in
> multipathd which can fix dm io blocked issue.
> 
> Test environment:
> several hosts;
> each host have 100+ multipath, each multipath have 1 to 4 paths.
> run up/down storage network loop script for days.
> 
> Issue:
> After several hours stop script, found some hosts have dm io blocked
> issue:
> slave block device access well, but its dm device blocked.
> 36c0bfc0100a8d4a228214da50000003c dm-20 HUAWEI,XSG1
> size=350G features='1 queue_if_no_path' hwhandler='0' wp=rw
> `-+- policy='round-robin 0' prio=1 status=enabled
>   |- 1:0:0:20  sdbk 67:224  failed ready running
> multipathd show paths cannot found sdbk!

Is /dev/sdbk indeed a healthy path in this situation, or not?
Please run "multipathd show devices", too.

I wonder if your logs provide some more evidence how this situation
came to pass. I suspect that either a) uevents got lost, or that b)
multipathd failed to (re-)add the path while handling an uevent. It'd
be interesting to find out what it actually was.

More notes below.

> Test result:
> With this patch, run script several days, io blocked issue
> not found again.
> 
> This patch can fix this issue: when found path only missing in
> multipathd but still in dm, check the missing path status if ok
> try to add it back, and checker have chance to reinstate this
> path and the dm io blocked issue will disappear.
> 
> Signed-off-by: Chongyun Wu <wu.chongyun@h3c.com>
> ---
>  libmultipath/dmparser.c | 31 +++++++++++++++++++++++++++++++
>  libmultipath/dmparser.h |  1 +
>  2 files changed, 32 insertions(+)
> 
> diff --git a/libmultipath/dmparser.c b/libmultipath/dmparser.c
> index b856a07f..2f90b17c 100644
> --- a/libmultipath/dmparser.c
> +++ b/libmultipath/dmparser.c
> @@ -317,6 +317,12 @@ int disassemble_map(vector pathvec, char
> *params, struct multipath *mpp,
>  				/* Only call this in multipath client
> mode */
>  				if (!is_daemon && store_path(pathvec,
> pp))
>  					goto out1;

I believe that your problem would have been solved simply by removing
the "!is_daemon" condition above. I'd like to know if that's actually
the case, because your add_missing_path() function does basically the
same thing (plus calling pathinfo()).

However, the "!is_daemon" test is there for a reason (see b96dead 
("[multipathd] remove the retry login in uev_remove_path()"), and
that's why your patch isn't correct as-is. 

Regards,
Martin

> +
> +				/* Try to add good status path back to
> avoid
> +				 * dm io blocked issue in special
> condition.
> +				 */
> +				if(add_missing_path(pathvec, devname))
> +					condlog(2, "Try to add missing
> path %s failed", devname);
>  			} else {
>  				if (!strlen(pp->wwid) &&
>  				    strlen(mpp->wwid))
> @@ -569,3 +575,28 @@ int disassemble_status(char *params, struct
> multipath *mpp)
>  	}
>  	return 0;
>  }
> +
> +/* Add missing good status path back to multipathd */
> +int add_missing_path(vector pathvec, char *missing_dev)
> +{
> +	struct udev_device *udevice;
> +	struct config *conf;
> +	int ret = 0;
> +				
> +	condlog(2, "Cant't found path %s try to add it back if its
> state is up.", 
> +			missing_dev);
> +
> +	udevice = udev_device_new_from_subsystem_sysname(udev, "block",
> missing_dev);
> +	if (!udevice) {
> +		condlog(0, "%s: can't find path form udev",
> missing_dev);
> +		return 1;
> +	}
> +	conf = get_multipath_config();
> +	pthread_cleanup_push(put_multipath_config, conf);
> +	ret = store_pathinfo(pathvec, conf,
> +						 udevice, DI_ALL |
> DI_BLACKLIST, NULL);
> +	pthread_cleanup_pop(1);
> +	udev_device_unref(udevice);
> +
> +	return ret;
> +}
> diff --git a/libmultipath/dmparser.h b/libmultipath/dmparser.h
> index e1badb0b..515ca900 100644
> --- a/libmultipath/dmparser.h
> +++ b/libmultipath/dmparser.h
> @@ -1,3 +1,4 @@
>  int assemble_map (struct multipath *, char *, int);
>  int disassemble_map (vector, char *, struct multipath *, int);
>  int disassemble_status (char *, struct multipath *);
> +int add_missing_path(vector pathvec, char *missing_dev);

^ permalink raw reply	[flat|nested] 4+ messages in thread

* [PATCH]libmultipath/dmparser: add missing path with good status when sync state with dm kernel
@ 2020-07-07  3:08 Chongyun Wu
  2020-07-07  9:43 ` Martin Wilck
  0 siblings, 1 reply; 4+ messages in thread
From: Chongyun Wu @ 2020-07-07  3:08 UTC (permalink / raw)
  To: Martin Wilck (mwilck@suse.com), Benjamin Marzinski, dm-devel
  Cc: Guozhonghua, Wangyong, Changlimin, Zhangguanghui, Chengchiwen

Hi Martin and Ben,

Cloud you help to view below patch, thanks.

>From b2786c81a78bf3868f300fd3177e852e718e7790 Mon Sep 17 00:00:00 2001
From: Chongyun Wu <wu.chongyun@h3c.com>
Date: Mon, 6 Jul 2020 11:22:21 +0800
Subject: [PATCH] libmultipath/dmparser: add missing path with good status
 when sync state with dm kernel

Add path back into deamon vecs->pathvecs when found path missing in
multipathd which can fix dm io blocked issue.

Test environment:
several hosts;
each host have 100+ multipath, each multipath have 1 to 4 paths.
run up/down storage network loop script for days.

Issue:
After several hours stop script, found some hosts have dm io blocked issue:
slave block device access well, but its dm device blocked.
36c0bfc0100a8d4a228214da50000003c dm-20 HUAWEI,XSG1
size=350G features='1 queue_if_no_path' hwhandler='0' wp=rw
`-+- policy='round-robin 0' prio=1 status=enabled
  |- 1:0:0:20  sdbk 67:224  failed ready running
multipathd show paths cannot found sdbk!

Test result:
With this patch, run script several days, io blocked issue
not found again.

This patch can fix this issue: when found path only missing in
multipathd but still in dm, check the missing path status if ok
try to add it back, and checker have chance to reinstate this
path and the dm io blocked issue will disappear.

Signed-off-by: Chongyun Wu <wu.chongyun@h3c.com>
---
 libmultipath/dmparser.c | 31 +++++++++++++++++++++++++++++++
 libmultipath/dmparser.h |  1 +
 2 files changed, 32 insertions(+)

diff --git a/libmultipath/dmparser.c b/libmultipath/dmparser.c
index b856a07f..2f90b17c 100644
--- a/libmultipath/dmparser.c
+++ b/libmultipath/dmparser.c
@@ -317,6 +317,12 @@ int disassemble_map(vector pathvec, char *params, struct multipath *mpp,
 				/* Only call this in multipath client mode */
 				if (!is_daemon && store_path(pathvec, pp))
 					goto out1;
+
+				/* Try to add good status path back to avoid
+				 * dm io blocked issue in special condition.
+				 */
+				if(add_missing_path(pathvec, devname))
+					condlog(2, "Try to add missing path %s failed", devname);
 			} else {
 				if (!strlen(pp->wwid) &&
 				    strlen(mpp->wwid))
@@ -569,3 +575,28 @@ int disassemble_status(char *params, struct multipath *mpp)
 	}
 	return 0;
 }
+
+/* Add missing good status path back to multipathd */
+int add_missing_path(vector pathvec, char *missing_dev)
+{
+	struct udev_device *udevice;
+	struct config *conf;
+	int ret = 0;
+				
+	condlog(2, "Cant't found path %s try to add it back if its state is up.", 
+			missing_dev);
+
+	udevice = udev_device_new_from_subsystem_sysname(udev, "block", missing_dev);
+	if (!udevice) {
+		condlog(0, "%s: can't find path form udev", missing_dev);
+		return 1;
+	}
+	conf = get_multipath_config();
+	pthread_cleanup_push(put_multipath_config, conf);
+	ret = store_pathinfo(pathvec, conf,
+						 udevice, DI_ALL | DI_BLACKLIST, NULL);
+	pthread_cleanup_pop(1);
+	udev_device_unref(udevice);
+
+	return ret;
+}
diff --git a/libmultipath/dmparser.h b/libmultipath/dmparser.h
index e1badb0b..515ca900 100644
--- a/libmultipath/dmparser.h
+++ b/libmultipath/dmparser.h
@@ -1,3 +1,4 @@
 int assemble_map (struct multipath *, char *, int);
 int disassemble_map (vector, char *, struct multipath *, int);
 int disassemble_status (char *, struct multipath *);
+int add_missing_path(vector pathvec, char *missing_dev);
-- 
2.12.1.windows.1


Best Regards,
Chongyun Wu


--
dm-devel mailing list
dm-devel@redhat.com
https://www.redhat.com/mailman/listinfo/dm-devel

^ permalink raw reply related	[flat|nested] 4+ messages in thread

end of thread, other threads:[~2020-07-09 11:11 UTC | newest]

Thread overview: 4+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2020-07-08  1:47 [PATCH]libmultipath/dmparser: add missing path with good status when sync state with dm kernel Chongyun Wu
2020-07-09 11:11 ` Martin Wilck
  -- strict thread matches above, loose matches on Subject: below --
2020-07-07  3:08 Chongyun Wu
2020-07-07  9:43 ` Martin Wilck

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.