* [PATCH V4 0/2] multipath-tools: intermittent IO error accounting to improve reliability
@ 2017-09-17  3:40 Guan Junxiong
  2017-09-17  3:40 ` [PATCH V4 1/2] " Guan Junxiong
  2017-09-17  3:40 ` [PATCH V4 2/2] multipath-tools: discard san_path_err_XXX feature Guan Junxiong
  0 siblings, 2 replies; 17+ messages in thread
From: Guan Junxiong @ 2017-09-17  3:40 UTC
  To: dm-devel, christophe.varoqui, mwilck
  Cc: guanjunxiong, chengjike.cheng, mmandala, niuhaoxin, shenhong09

Hi ALL,

This patchset adds a new method of path state checking based on accounting
for IO errors. It is useful in many scenarios, such as intermittent IO errors
on a path caused by network congestion or a shaky link.

PATCH 1/2 implements the algorithm, which sends a series of continuous IOs
at a fixed rate of 10 Hz.
PATCH 2/2 discards the original algorithm (the san_path_err_XXX feature)
because the sample interval of its path checker is so coarse that it cannot
see what happens in the middle of the interval. PATCH 1/2 provides a better
method.
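
To make the 10 Hz rate concrete, here is a minimal sketch of the timing
arithmetic only. It is not taken from the patch: the names IO_RATE_HZ,
next_send_time and ios_per_window are hypothetical, and the window length
passed to ios_per_window is an illustrative parameter, not a value from the
patch.

```c
#include <assert.h>
#include <time.h>

#define IO_RATE_HZ   10           /* fixed send rate named in the cover letter */
#define NSEC_PER_SEC 1000000000L

/* Return t advanced by one send interval (1 / IO_RATE_HZ seconds, 100 ms). */
static struct timespec next_send_time(struct timespec t)
{
	/* Caller must pass a normalized timespec. */
	assert(t.tv_nsec >= 0 && t.tv_nsec < NSEC_PER_SEC);

	t.tv_nsec += NSEC_PER_SEC / IO_RATE_HZ;
	if (t.tv_nsec >= NSEC_PER_SEC) {
		/* Carry the overflow into the seconds field. */
		t.tv_nsec -= NSEC_PER_SEC;
		t.tv_sec += 1;
	}
	return t;
}

/* Number of IOs issued over a sample window of window_secs seconds. */
static unsigned long ios_per_window(unsigned long window_secs)
{
	return window_secs * IO_RATE_HZ;
}
```

At this rate a path is probed every 100 ms, so an error burst shorter than
a coarse checker interval still lands on several probes.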


Changes from V3:
* discard the 
* fail the path in the kernel before enqueueing the path for checking
  rather than after knowing the checking result to make it more
  reliable. (Martin)
* use posix_memalign instead of manual alignment for direct IO buffer. (Martin) 
* use PATH_MAX to avoid certain compiler warning when opening file
  rather than FILE_NAME_SIZE. (Martin)
* discard unnecessary sanity check when getting block size (Martin)
* do not return 0 in send_each_async_io if io_starttime of a path is
  not set (Martin)
* wait 10 ms instead of 60 seconds if every path is down. (Martin)
* rename handle_async_io_timeout to poll_async_io_timeout and use a polling
  method, because io_getevents does not return 0 if there are both
  timed-out IOs and normal IOs.
* rename hit_io_err_recover_time to hit_io_err_recheck_time
* modify multipath.conf.5 and the commit comments to keep them in sync with
  the above changes
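
As an illustration of the posix_memalign item in the list above, here is a
minimal sketch of allocating a direct IO buffer. It is not the code from the
patch: the helper name alloc_dio_buf is hypothetical, and it assumes the
block size is a power of two and a multiple of sizeof(void *), which
posix_memalign requires of its alignment argument.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200112L   /* expose posix_memalign() */
#endif
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/*
 * Allocate a zeroed buffer of one block, aligned to the block size, as
 * direct IO generally expects. Returns NULL on failure; the caller
 * releases the buffer with free().
 */
static void *alloc_dio_buf(size_t blksize)
{
	void *buf = NULL;

	/* posix_memalign() returns 0 on success and an errno value otherwise. */
	if (posix_memalign(&buf, blksize, blksize) != 0)
		return NULL;
	memset(buf, 0, blksize);
	return buf;
}
```

Compared with over-allocating and rounding the pointer up by hand, this
keeps a single pointer that can be passed straight to free().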


Changes from V2:
* fix unconditional rescheduling forever
* use scripts/checkpatch.pl from Linux to clean up informal coding style
* fix "continous" and "internel" typos


Changes from V1:
* send continuous IO instead of a single IO in a sample interval (Martin)
* when the recover time expires, reschedule the checking process (Hannes)
* use the error rate threshold as a permillage instead of an IO count (Martin)
* Use a common io_context for libaio for all paths (Martin)
* Other small fixes (Martin)
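
The permillage threshold mentioned in the list above can be sketched as
follows. This is an assumption about how such a check might look, not the
comparison used in the patch: the function name err_rate_exceeds and its
parameter names are hypothetical, and whether the real check is strict
(">") or inclusive (">=") is not stated in this cover letter.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Return 1 if the observed error rate is strictly above the threshold,
 * where the threshold is expressed in permillage (errors per 1000 IOs).
 * Cross-multiplying avoids the truncation of an integer division.
 * Counts are assumed small enough that the products fit in 64 bits.
 */
static int err_rate_exceeds(uint64_t io_err_nr, uint64_t io_nr_total,
			    unsigned int threshold_permil)
{
	assert(io_err_nr <= io_nr_total);

	if (io_nr_total == 0)
		return 0;	/* no samples, so no verdict */
	return io_err_nr * 1000 > io_nr_total * threshold_permil;
}
```

Expressing the threshold as a rate rather than an absolute IO count keeps
its meaning the same if the number of IOs in a sample changes.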


Junxiong Guan (2):
  multipath-tools: intermittent IO error accounting to improve
    reliability
  multipath-tools: discard san_path_err_XXX feature

 libmultipath/Makefile      |   5 +-
 libmultipath/config.c      |   3 -
 libmultipath/config.h      |  18 +-
 libmultipath/configure.c   |   6 +-
 libmultipath/dict.c        |  74 ++---
 libmultipath/io_err_stat.c | 743 +++++++++++++++++++++++++++++++++++++++++++++
 libmultipath/io_err_stat.h |  15 +
 libmultipath/propsel.c     |  54 ++--
 libmultipath/propsel.h     |   6 +-
 libmultipath/structs.h     |  14 +-
 libmultipath/uevent.c      |  32 ++
 libmultipath/uevent.h      |   2 +
 multipath/multipath.conf.5 |  62 ++--
 multipathd/main.c          | 130 ++++----
 14 files changed, 971 insertions(+), 193 deletions(-)
 create mode 100644 libmultipath/io_err_stat.c
 create mode 100644 libmultipath/io_err_stat.h

-- 
2.11.1

Thread overview: 17+ messages
2017-09-17  3:40 [PATCH V4 0/2] multipath-tools: intermittent IO error accounting to improve reliability Guan Junxiong
2017-09-17  3:40 ` [PATCH V4 1/2] " Guan Junxiong
2017-09-18 12:53   ` Muneendra Kumar M
2017-09-18 14:36     ` Guan Junxiong
2017-09-18 19:51       ` Martin Wilck
2017-09-19  1:32         ` Guan Junxiong
2017-09-19 10:59           ` Muneendra Kumar M
2017-09-19 12:53             ` Guan Junxiong
2017-09-20 12:58               ` Muneendra Kumar M
2017-09-21 10:04                 ` Guan Junxiong
2017-09-21 10:10                   ` Muneendra Kumar M
     [not found]                   ` <615cdd5a955944e49986dca01bf406a5@BRMWP-EXMB12.corp.brocade.com>
2017-10-09  0:42                     ` Guan Junxiong
2017-10-09 11:39                       ` Muneendra Kumar M
2017-10-12  6:35                       ` Muneendra Kumar M
2017-10-12  6:46                         ` Guan Junxiong
2017-10-12  6:59                           ` Muneendra Kumar M
2017-09-17  3:40 ` [PATCH V4 2/2] multipath-tools: discard san_path_err_XXX feature Guan Junxiong