All of lore.kernel.org
 help / color / mirror / Atom feed
From: Benjamin Marzinski <bmarzins@redhat.com>
To: device-mapper development <dm-devel@redhat.com>
Subject: Re: [PATCH 04/30] multipath: Increase dev_loss_tmo prior to fast_io_fail
Date: Fri, 19 Jul 2013 16:17:19 -0500	[thread overview]
Message-ID: <20130719211719.GA2113@dhcp80-209.msp.redhat.com> (raw)
In-Reply-To: <1373958801-103613-5-git-send-email-hare@suse.de>

On Tue, Jul 16, 2013 at 09:12:55AM +0200, Hannes Reinecke wrote:
> There are several constraints when setting fast_io_fail and
> dev_loss_tmo.
> dev_loss_tmo will be capped automatically when fast_io_fail is
> not set. And fast_io_fail can not be raised beyond dev_loss_tmo.
> 
> So to increase dev_loss_tmo and fast_io_fail we first need
> to increase dev_loss_tmo to the given fast_io_fail
> setting, then set fast_io_fail, and then set dev_loss_tmo
> to the final setting.

This patch seems kind of convoluted to me, and even with the fix to
temporarily set dev_loss_tmo to one greater than fast_io_fail_tmo,
there is still a broken case

If you currently have:
fast_io_fail_tmo off
dev_loss_tmo <something_less_than_600>

and you want

fast_io_fail_tmo 600
dev_loss_tmo <something_greater_than_600>

This will fail, since it will try to set dev_loss_tmo to 601 with
fast_io_fail_tmo set to off (Granted, I doubt that 600 is a
popular fast_io_fail_tmo value).

But in the general case, If you aren't turning off fast_io_fail_tmo and
the current dev_loss_tmo is greater than the target fast_io_fail_tmo
(this seems like it is the most common case), you unecessarily go through
the work of first setting dev_loss_tmo to its current value.

        if (mpp->fast_io_fail != MP_FAST_IO_FAIL_UNSET &&
            mpp->fast_io_fail != MP_FAST_IO_FAIL_ZERO &&
            mpp->fast_io_fail != MP_FAST_IO_FAIL_OFF) {
                /* Check if we need to temporarily increase dev_loss_tmo
 * */
                ret = sysfs_attr_get_value(rport_dev, "dev_loss_tmo",
                                           value, 16);

<SNIP>

        }
        if (strlen(value)) {
                ret = sysfs_attr_set_value(rport_dev, "dev_loss_tmo",
                                           value, strlen(value));

I have a patch that solves this issue differently.  It still checks
dev_loss_tmo, but it always sets fast_io_fail_tmo first.  If the target
fast_io_fail_tmo is greater than or equal to the current dev_loss_tmo,
it sets fast_io_fail_tmo to one less than the current dev_loss_tmo.
This always works, since the lowest value possible for dev_loss_tmo is 1
and the lowest value possible for fast_io_fail_tmo is 0. Then it sets
dev_loss_tmo. Finally, if necessary, it sets fast_io_fail_tmo to its
final value.

I'll push that shortly.

-Ben
> 
> Signed-off-by: Hannes Reinecke <hare@suse.de>
> ---
>  libmultipath/discovery.c | 66 +++++++++++++++++++++++++++++++++++++-----------
>  libmultipath/sysfs.c     | 64 ++++++++++++++++++++++++++++++++++++++++++++--
>  libmultipath/sysfs.h     |  2 ++
>  3 files changed, 115 insertions(+), 17 deletions(-)
> 
> diff --git a/libmultipath/discovery.c b/libmultipath/discovery.c
> index 26983ca..18f72ec 100644
> --- a/libmultipath/discovery.c
> +++ b/libmultipath/discovery.c
> @@ -317,6 +317,7 @@ sysfs_set_rport_tmo(struct multipath *mpp, struct path *pp)
>  	struct udev_device *rport_dev = NULL;
>  	char value[11];
>  	char rport_id[32];
> +	unsigned long long tmo;
>  
>  	sprintf(rport_id, "rport-%d:%d-%d",
>  		pp->sg_id.host_no, pp->sg_id.channel, pp->sg_id.transport_id);
> @@ -330,23 +331,51 @@ sysfs_set_rport_tmo(struct multipath *mpp, struct path *pp)
>  	condlog(4, "target%d:%d:%d -> %s", pp->sg_id.host_no,
>  		pp->sg_id.channel, pp->sg_id.scsi_id, rport_id);
>  
> -	snprintf(value, 11, "%u", mpp->dev_loss);
> -	if (mpp->dev_loss &&
> -	    sysfs_attr_set_value(rport_dev, "dev_loss_tmo", value, 11) <= 0) {
> -		if ((mpp->fast_io_fail == MP_FAST_IO_FAIL_UNSET ||
> -		     mpp->fast_io_fail == MP_FAST_IO_FAIL_OFF)
> -		    && mpp->dev_loss > 600) {
> -			condlog(3, "%s: limiting dev_loss_tmo to 600, since "
> -				"fast_io_fail is not set", mpp->alias);
> -			snprintf(value, 11, "%u", 600);
> -			if (sysfs_attr_set_value(rport_dev, "dev_loss_tmo",
> -						 value, 11) <= 0)
> -				condlog(0, "%s failed to set dev_loss_tmo",
> -					mpp->alias);
> +	/*
> +	 * This is tricky.
> +	 * dev_loss_tmo will be limited to 600 if fast_io_fail
> +	 * is _not_ set.
> +	 * fast_io_fail will be limited by the current dev_loss_tmo
> +	 * setting.
> +	 * So to get everything right we first need to increase
> +	 * dev_loss_tmo to the fast_io_fail setting (if present),
> +	 * then set fast_io_fail, and _then_ set dev_loss_tmo
> +	 * to the correct value.
> +	 */
> +	value[0] = '\0';
> +	if (mpp->fast_io_fail != MP_FAST_IO_FAIL_UNSET &&
> +	    mpp->fast_io_fail != MP_FAST_IO_FAIL_OFF) {
> +		/* Check if we need to temporarily increase dev_loss_tmo */
> +		if (sysfs_attr_get_value(rport_dev, "dev_loss_tmo",
> +					 value, 16) <= 0) {
> +			condlog(0, "%s: failed to read dev_loss_tmo value, "
> +				"error %d", pp->dev, errno);
> +			goto out;
> +		}
> +		if (sscanf(value, "%llu\n", &tmo) != 1) {
> +			condlog(0, "%s: Cannot parse dev_loss_tmo "
> +				"attribute '%s'",pp->dev, value);
>  			goto out;
>  		}
> +		if (mpp->fast_io_fail >= tmo) {
> +			snprintf(value, 11, "%u", mpp->fast_io_fail);
> +		} else {
> +			tmo = 0;
> +		}
> +	} else if (mpp->dev_loss > 600) {
> +		condlog(3, "%s: limiting dev_loss_tmo to 600, since "
> +			"fast_io_fail is not set", pp->dev);
> +		snprintf(value, 11, "%u", 600);
> +	} else {
> +		snprintf(value, 11, "%u", mpp->dev_loss);
>  	}
> -	if (mpp->fast_io_fail != MP_FAST_IO_FAIL_UNSET){
> +	if (strlen(value) &&
> +	    sysfs_attr_set_value(rport_dev, "dev_loss_tmo", value, 11) <= 0) {
> +		condlog(0, "%s failed to set dev_loss_tmo",
> +			mpp->alias);
> +		goto out;
> +	}
> +	if (mpp->fast_io_fail != MP_FAST_IO_FAIL_UNSET) {
>  		if (mpp->fast_io_fail == MP_FAST_IO_FAIL_OFF)
>  			sprintf(value, "off");
>  		else if (mpp->fast_io_fail == MP_FAST_IO_FAIL_ZERO)
> @@ -359,6 +388,13 @@ sysfs_set_rport_tmo(struct multipath *mpp, struct path *pp)
>  				mpp->alias);
>  		}
>  	}
> +	if (tmo > 0) {
> +		snprintf(value, 11, "%u", mpp->dev_loss);
> +		if (sysfs_attr_set_value(rport_dev, "dev_loss_tmo",
> +					 value, 11) <= 0)
> +			condlog(0, "%s failed to set dev_loss_tmo",
> +				mpp->alias);
> +	}
>  out:
>  	udev_device_unref(rport_dev);
>  }
> @@ -394,7 +430,7 @@ sysfs_set_session_tmo(struct multipath *mpp, struct path *pp)
>  		} else {
>  			snprintf(value, 11, "%u", mpp->fast_io_fail);
>  			if (sysfs_attr_set_value(session_dev, "recovery_tmo",
> -						 value, 11)) {
> +						 value, 11) <= 0) {
>  				condlog(3, "%s: Failed to set recovery_tmo, "
>  					" error %d", pp->dev, errno);
>  			}
> diff --git a/libmultipath/sysfs.c b/libmultipath/sysfs.c
> index d33747f..22b73b1 100644
> --- a/libmultipath/sysfs.c
> +++ b/libmultipath/sysfs.c
> @@ -38,6 +38,62 @@
>  #include "debug.h"
>  #include "devmapper.h"
>  
> +/*
> + * When we modify an attribute value we cannot rely on libudev for now,
> + * as libudev lacks the capability to update an attribute value.
> + * So for modified attributes we need to implement our own function.
> + */
> +ssize_t sysfs_attr_get_value(struct udev_device *dev, const char *attr_name,
> +			     char * value, size_t value_len)
> +{
> +	char devpath[PATH_SIZE];
> +	struct stat statbuf;
> +	int fd;
> +	ssize_t size = -1;
> +
> +	if (!dev || !attr_name || !value)
> +		return 0;
> +
> +	snprintf(devpath, PATH_SIZE, "%s/%s", udev_device_get_syspath(dev),
> +		 attr_name);
> +	condlog(4, "open '%s'", devpath);
> +	if (stat(devpath, &statbuf) != 0) {
> +		condlog(4, "stat '%s' failed: %s", devpath, strerror(errno));
> +		return 0;
> +	}
> +
> +	/* skip directories */
> +	if (S_ISDIR(statbuf.st_mode)) {
> +		condlog(4, "%s is a directory", devpath);
> +		return 0;
> +	}
> +
> +	/* skip non-writeable files */
> +	if ((statbuf.st_mode & S_IRUSR) == 0) {
> +		condlog(4, "%s is not readable", devpath);
> +		return 0;
> +	}
> +
> +	/* read attribute value */
> +	fd = open(devpath, O_RDONLY);
> +	if (fd < 0) {
> +		condlog(4, "attribute '%s' can not be opened: %s",
> +			devpath, strerror(errno));
> +		return 0;
> +	}
> +	size = read(fd, value, value_len);
> +	if (size < 0) {
> +		condlog(4, "read from %s failed: %s", devpath, strerror(errno));
> +		size = 0;
> +	} else if (size == value_len) {
> +		condlog(4, "overflow while reading from %s", devpath);
> +		size = 0;
> +	}
> +
> +	close(fd);
> +	return size;
> +}
> +
>  ssize_t sysfs_attr_set_value(struct udev_device *dev, const char *attr_name,
>  			     char * value, size_t value_len)
>  {
> @@ -58,12 +114,16 @@ ssize_t sysfs_attr_set_value(struct udev_device *dev, const char *attr_name,
>  	}
>  
>  	/* skip directories */
> -	if (S_ISDIR(statbuf.st_mode))
> +	if (S_ISDIR(statbuf.st_mode)) {
> +		condlog(4, "%s is a directory", devpath);
>  		return 0;
> +	}
>  
>  	/* skip non-writeable files */
> -	if ((statbuf.st_mode & S_IWUSR) == 0)
> +	if ((statbuf.st_mode & S_IWUSR) == 0) {
> +		condlog(4, "%s is not writeable", devpath);
>  		return 0;
> +	}
>  
>  	/* write attribute value */
>  	fd = open(devpath, O_WRONLY);
> diff --git a/libmultipath/sysfs.h b/libmultipath/sysfs.h
> index 13d7545..34f3e00 100644
> --- a/libmultipath/sysfs.h
> +++ b/libmultipath/sysfs.h
> @@ -7,6 +7,8 @@
>  
>  ssize_t sysfs_attr_set_value(struct udev_device *dev, const char *attr_name,
>  			     char * value, size_t value_len);
> +ssize_t sysfs_attr_get_value(struct udev_device *dev, const char *attr_name,
> +			     char * value, size_t value_len);
>  int sysfs_get_size (struct path *pp, unsigned long long * size);
>  int sysfs_check_holders(char * check_devt, char * new_devt);
>  #endif
> -- 
> 1.7.12.4
> 
> --
> dm-devel mailing list
> dm-devel@redhat.com
> https://www.redhat.com/mailman/listinfo/dm-devel

  reply	other threads:[~2013-07-19 21:17 UTC|newest]

Thread overview: 39+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2013-07-16  7:12 [PATCH 00/30] SLES resync, second try Hannes Reinecke
2013-07-16  7:12 ` [PATCH 01/30] multipath: bind lifetime of udev context to main thread Hannes Reinecke
2013-07-16  7:12 ` [PATCH 02/30] Document 'infinity' as possible value for dev_loss_tmo Hannes Reinecke
2013-07-16  7:12 ` [PATCH 03/30] alua: Do not add preferred path priority for active/optimized Hannes Reinecke
2013-07-16  7:12 ` [PATCH 04/30] multipath: Increase dev_loss_tmo prior to fast_io_fail Hannes Reinecke
2013-07-19 21:17   ` Benjamin Marzinski [this message]
2013-07-22  7:19     ` Hannes Reinecke
2013-07-16  7:12 ` [PATCH 05/30] libmultipath: return PATH_DOWN for quiesced paths Hannes Reinecke
2013-07-16  7:12 ` [PATCH 06/30] libmultipath: Implement PATH_TIMEOUT Hannes Reinecke
2013-07-16  7:12 ` [PATCH 07/30] Deprecate pg_timeout Hannes Reinecke
2013-07-16  7:12 ` [PATCH 08/30] kpartx: create correct symlinks for PATH_FAILED events Hannes Reinecke
2013-07-16  7:13 ` [PATCH 09/30] multipath: Deprecate 'getuid' configuration variable Hannes Reinecke
2013-07-16  7:13 ` [PATCH 10/30] kpartx: support disk with non-512B sectors Hannes Reinecke
2013-07-16  7:13 ` [PATCH 11/30] multipath: Add 'Datacore Virtual Disk' to internal hardware table Hannes Reinecke
2013-07-16  7:13 ` [PATCH 12/30] Minor fixes for priority handling Hannes Reinecke
2013-07-16  7:13 ` [PATCH 13/30] Check return value from pathinfo() Hannes Reinecke
2013-07-16  7:13 ` [PATCH 14/30] Read directly from sysfs when checking the device size Hannes Reinecke
2013-07-16  7:13 ` [PATCH 15/30] multipath.conf.annotated: Document rr_min_io_rq Hannes Reinecke
2013-07-16  7:13 ` [PATCH 16/30] Correctly print out 'max' for max_fds Hannes Reinecke
2013-07-16  7:13 ` [PATCH 17/30] Correctly set max_fds in case of failure Hannes Reinecke
2013-07-16  7:13 ` [PATCH 18/30] Update multipath.conf.defaults Hannes Reinecke
2013-07-16  7:13 ` [PATCH 19/30] Correctly set pgfailback Hannes Reinecke
2013-07-16  7:13 ` [PATCH 20/30] multipath.conf.5: clarify 'no_path_retry' default setting Hannes Reinecke
2013-07-16  7:13 ` [PATCH 21/30] multipath.conf.annotated: remove 'udev_dir' Hannes Reinecke
2013-07-16  7:13 ` [PATCH 22/30] multipath: Implement 'property' blacklist Hannes Reinecke
2014-08-21 16:37   ` Bart Van Assche
2014-08-21 19:31     ` Hannes Reinecke
2014-11-07  9:04       ` Bart Van Assche
2014-11-07  9:09         ` Hannes Reinecke
2014-11-19  5:33         ` Benjamin Marzinski
2013-07-16  7:13 ` [PATCH 23/30] Do not print error when rport is blocked Hannes Reinecke
2013-07-16  7:13 ` [PATCH 24/30] multipath: reference the udev context when starting event queue Hannes Reinecke
2013-07-16  7:13 ` [PATCH 25/30] multipathd: valgrind fixes Hannes Reinecke
2013-07-16  7:13 ` [PATCH 26/30] multipathd: increase stacksize for uevent listener Hannes Reinecke
2013-07-16  7:13 ` [PATCH 27/30] Specify checker_timeout in seconds Hannes Reinecke
2013-07-16  7:13 ` [PATCH 28/30] multipath: fix setting of fast_io_fail_tmo Hannes Reinecke
2013-07-16  7:13 ` [PATCH 29/30] multipath: reset queue_if_no_path if flush failed Hannes Reinecke
2013-07-16  7:13 ` [PATCH 30/30] libmultipath: read path state directly from sysfs Hannes Reinecke
2013-07-16 20:21 ` [PATCH 00/30] SLES resync, second try Christophe Varoqui

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20130719211719.GA2113@dhcp80-209.msp.redhat.com \
    --to=bmarzins@redhat.com \
    --cc=dm-devel@redhat.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.