All of lore.kernel.org
 help / color / mirror / Atom feed
From: Chongyun Wu <wu.chongyun@h3c.com>
To: Martin Wilck <mwilck@suse.com>,
	Benjamin Marzinski <bmarzins@redhat.com>,
	"dm-devel@redhat.com" <dm-devel@redhat.com>
Cc: Chengchiwen <chengchiwen@h3c.com>,
	Guozhonghua <guozhonghua@h3c.com>,
	Changlimin <changlimin@h3c.com>, Wangyong <wang.yongD@h3c.com>,
	Zhangguanghui <zhang.guanghui@h3c.com>
Subject: Re: [PATCH]libmultipath/dmparser: add missing path with good status when sync state with dm kernel
Date: Wed, 8 Jul 2020 01:47:18 +0000	[thread overview]
Message-ID: <27245ecfc8be4229b52f736062585230@h3c.com> (raw)

Hello Martin,
Thanks for your reply. 
Waiting for your patches I'd like to test it.  And also, I'd like to tell more detail about this issue.

> Is /dev/sdbk indeed a healthy path in this situation, or not?
> Please run "multipathd show devices", too.
Yes, /dev/sdbk can read and write, the condition is well, it fully recovered. 
Just because it not in pathvec and check_path have no chance to found it up and reinstate it in
dm, and dm map have this path but queue_if_no_path still have, dm state is down, dm io blocked.

>I wonder if your logs provide some more evidence how this situation
>came to pass. I suspect that either a) uevents got lost, or that b)
>multipathd failed to (re-)add the path while handling an uevent. It'd
>be interesting to find out what it actually was.
Firstly I also suspect like you, but actually there is not any uevent to remove the path in my debugging version with more log.
But I found that after reconfig processed issue might appears, detect more, found that when storage network down block
device not accessed reconfig will make paths disappear form multipathd because its status is bad. 
I think this might be a reason. Add more log for pathvec removing path, but not found other glue.


>I believe that your problem would have been solved simply by removing
>the "!is_daemon" condition above. I'd like to know if that's actually
>the case, because your add_missing_path() function does basically the
>same thing (plus calling pathinfo()).
I have tried this but not work. Only need pathinfo and store pathinfo into pathvec can work, because check_path need
full path info to finish check old code path only have pp->dev and pp->dev_t, check_path cannot use this to check path status.
Dm io blocked issue not happen again, because this patch can repair this blocked very fast.
Also dm io blocked issue actually is a low probability event.


>However, the "!is_daemon" test is there for a reason (see b96dead 
> ("[multipathd] remove the retry login in uev_remove_path()"), and
>that's why your patch isn't correct as-is.

-----original-----
发件人: Martin Wilck [mailto:mwilck@suse.com] 
发送时间: 2020年7月7日 17:43
收件人: wuchongyun (Cloud) <wu.chongyun@h3c.com>; Benjamin Marzinski <bmarzins@redhat.com>; dm-devel@redhat.com
抄送: guozhonghua (Cloud) <guozhonghua@h3c.com>; wangyong (Cloud) <wang.yongD@h3c.com>; changlimin (Cloud) <changlimin@h3c.com>; zhangguanghui (Cloud) <zhang.guanghui@h3c.com>; chengchiwen (Cloud) <chengchiwen@h3c.com>
主题: Re: [dm-devel] [PATCH]libmultipath/dmparser: add missing path with good status when sync state with dm kernel

Hello Chongyun,

On Tue, 2020-07-07 at 03:08 +0000, Chongyun Wu wrote:
> Hi Martin and Ben,
> 
> Cloud you help to view below patch, thanks.
> 
> > From b2786c81a78bf3868f300fd3177e852e718e7790 Mon Sep 17 00:00:00
> > 2001
> From: Chongyun Wu <wu.chongyun@h3c.com>
> Date: Mon, 6 Jul 2020 11:22:21 +0800
> Subject: [PATCH] libmultipath/dmparser: add missing path with good 
> status  when sync state with dm kernel
> 

Nack, sorry. I know we have an issue in this area, but I would like to
handle it differently.

First of all, I want to get rid of disassemble_map() making
modifications to the pathvec. The fact that disassemble_map() currently
does this is an ugly layer violation IMO, and I don't like the idea of
adding more of this on top. I'm currently preparing a patch set that
addresses this (among other things). It will also address the issue of
missing paths in pathvec, and I'd be very glad if you could give it a
try in your test environment.


> Add path back into deamon vecs->pathvecs when found path missing in
> multipathd which can fix dm io blocked issue.
> 
> Test environment:
> several hosts;
> each host have 100+ multipath, each multipath have 1 to 4 paths.
> run up/down storage network loop script for days.
> 
> Issue:
> After several hours stop script, found some hosts have dm io blocked
> issue:
> slave block device access well, but its dm device blocked.
> 36c0bfc0100a8d4a228214da50000003c dm-20 HUAWEI,XSG1
> size=350G features='1 queue_if_no_path' hwhandler='0' wp=rw
> `-+- policy='round-robin 0' prio=1 status=enabled
>   |- 1:0:0:20  sdbk 67:224  failed ready running
> multipathd show paths cannot found sdbk!

Is /dev/sdbk indeed a healthy path in this situation, or not?
Please run "multipathd show devices", too.

I wonder if your logs provide some more evidence how this situation
came to pass. I suspect that either a) uevents got lost, or that b)
multipathd failed to (re-)add the path while handling an uevent. It'd
be interesting to find out what it actually was.

More notes below.

> Test result:
> With this patch, run script several days, io blocked issue
> not found again.
> 
> This patch can fix this issue: when found path only missing in
> multipathd but still in dm, check the missing path status if ok
> try to add it back, and checker have chance to reinstate this
> path and the dm io blocked issue will disappear.
> 
> Signed-off-by: Chongyun Wu <wu.chongyun@h3c.com>
> ---
>  libmultipath/dmparser.c | 31 +++++++++++++++++++++++++++++++
>  libmultipath/dmparser.h |  1 +
>  2 files changed, 32 insertions(+)
> 
> diff --git a/libmultipath/dmparser.c b/libmultipath/dmparser.c
> index b856a07f..2f90b17c 100644
> --- a/libmultipath/dmparser.c
> +++ b/libmultipath/dmparser.c
> @@ -317,6 +317,12 @@ int disassemble_map(vector pathvec, char
> *params, struct multipath *mpp,
>  				/* Only call this in multipath client
> mode */
>  				if (!is_daemon && store_path(pathvec,
> pp))
>  					goto out1;

I believe that your problem would have been solved simply by removing
the "!is_daemon" condition above. I'd like to know if that's actually
the case, because your add_missing_path() function does basically the
same thing (plus calling pathinfo()).

However, the "!is_daemon" test is there for a reason (see b96dead 
("[multipathd] remove the retry login in uev_remove_path()"), and
that's why your patch isn't correct as-is. 

Regards,
Martin

> +
> +				/* Try to add good status path back to
> avoid
> +				 * dm io blocked issue in special
> condition.
> +				 */
> +				if(add_missing_path(pathvec, devname))
> +					condlog(2, "Try to add missing
> path %s failed", devname);
>  			} else {
>  				if (!strlen(pp->wwid) &&
>  				    strlen(mpp->wwid))
> @@ -569,3 +575,28 @@ int disassemble_status(char *params, struct
> multipath *mpp)
>  	}
>  	return 0;
>  }
> +
> +/* Add missing good status path back to multipathd */
> +int add_missing_path(vector pathvec, char *missing_dev)
> +{
> +	struct udev_device *udevice;
> +	struct config *conf;
> +	int ret = 0;
> +				
> +	condlog(2, "Cant't found path %s try to add it back if its
> state is up.", 
> +			missing_dev);
> +
> +	udevice = udev_device_new_from_subsystem_sysname(udev, "block",
> missing_dev);
> +	if (!udevice) {
> +		condlog(0, "%s: can't find path form udev",
> missing_dev);
> +		return 1;
> +	}
> +	conf = get_multipath_config();
> +	pthread_cleanup_push(put_multipath_config, conf);
> +	ret = store_pathinfo(pathvec, conf,
> +						 udevice, DI_ALL |
> DI_BLACKLIST, NULL);
> +	pthread_cleanup_pop(1);
> +	udev_device_unref(udevice);
> +
> +	return ret;
> +}
> diff --git a/libmultipath/dmparser.h b/libmultipath/dmparser.h
> index e1badb0b..515ca900 100644
> --- a/libmultipath/dmparser.h
> +++ b/libmultipath/dmparser.h
> @@ -1,3 +1,4 @@
>  int assemble_map (struct multipath *, char *, int);
>  int disassemble_map (vector, char *, struct multipath *, int);
>  int disassemble_status (char *, struct multipath *);
> +int add_missing_path(vector pathvec, char *missing_dev);



--
dm-devel mailing list
dm-devel@redhat.com
https://www.redhat.com/mailman/listinfo/dm-devel

             reply	other threads:[~2020-07-08  1:47 UTC|newest]

Thread overview: 4+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2020-07-08  1:47 Chongyun Wu [this message]
2020-07-09 11:11 ` [PATCH]libmultipath/dmparser: add missing path with good status when sync state with dm kernel Martin Wilck
  -- strict thread matches above, loose matches on Subject: below --
2020-07-07  3:08 Chongyun Wu
2020-07-07  9:43 ` Martin Wilck

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=27245ecfc8be4229b52f736062585230@h3c.com \
    --to=wu.chongyun@h3c.com \
    --cc=bmarzins@redhat.com \
    --cc=changlimin@h3c.com \
    --cc=chengchiwen@h3c.com \
    --cc=dm-devel@redhat.com \
    --cc=guozhonghua@h3c.com \
    --cc=mwilck@suse.com \
    --cc=wang.yongD@h3c.com \
    --cc=zhang.guanghui@h3c.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.