From: Eiichi Tsukata <eiichi.tsukata.xh@hitachi.com>
To: James Bottomley <James.Bottomley@HansenPartnership.com>
Cc: linux-kernel@vger.kernel.org, linux-scsi@vger.kernel.org
Subject: Re: [RFC PATCH] scsi: Add failfast mode to avoid infinite retry loop
Date: Tue, 20 Aug 2013 16:13:50 +0900
Message-ID: <5213172E.1060905@hitachi.com>
In-Reply-To: <1376922616.2069.9.camel@dabdike.int.hansenpartnership.com>

(2013/08/19 23:30), James Bottomley wrote:
> On Mon, 2013-08-19 at 18:39 +0900, Eiichi Tsukata wrote:
>> Hello,
>>
>> This patch adds a SCSI device failfast mode to avoid an infinite retry loop.
>>
>> Currently, SCSI error handling in scsi_decide_disposition() and
>> scsi_io_completion() unconditionally retries on some errors. This is because
>> retryable errors are assumed to be temporary, so the SCSI device is expected
>> to recover from them soon. Normally, such a retry policy is appropriate.
>> But there is no guarantee that the device can recover from the error state
>> immediately. Some hardware errors may prevent the device from recovering,
>> so a hardware error can result in an infinite command retry loop. In fact,
>> a CHECK_CONDITION error with sense key UNIT_ATTENTION caused an infinite
>> retry loop in our environment. As the comments in the kernel source code say,
>> UNIT_ATTENTION means there must have been a power glitch, and the device is
>> expected to recover from that state immediately. But it seems that a hardware
>> error caused a permanent UNIT_ATTENTION condition.
>>
>> To solve the above problem, this patch introduces a SCSI device "failfast mode".
>> If failfast mode is enabled, the retry count of every SCSI command is limited to
>> scsi->allowed (== SD_MAX_RETRIES == 5). No command may retry
>> infinitely; a command fails immediately once its retry count exceeds the upper limit.
>> Failfast mode is useful on mission-critical systems which are required
>> to keep running flawlessly, because they need to fail over to the secondary
>> system once they detect failures.
>> By default, failfast mode is disabled because a failfast policy is not suitable
>> for most use cases, which can accept I/O latency due to device hardware errors.
>>
>> To enable failfast mode (disabled by default):
>>           # echo 1 > /sys/bus/scsi/devices/X:X:X:X/failfast
>> To disable:
>>           # echo 0 > /sys/bus/scsi/devices/X:X:X:X/failfast
>>
>> Furthermore, I'm planning to make the upper limit configurable.
>> Currently, I have two plans to implement it:
>> (1) set the same upper limit for all errors.
>> (2) set an upper limit for each error.
>> The first is simple and easy to implement but not flexible.
>> Some users will want to set a different upper limit for each error, depending
>> on the SCSI device they use. The second satisfies that requirement
>> but can be too fine-grained and tedious to configure because there are so
>> many SCSI error codes. The default of 5 retries may be too many for some
>> errors but too few for others.
>>
>> Which would be the appropriate implementation?
>> Any comments or suggestions are welcome as usual.
>
> I'm afraid you'll need to propose another solution.  We have a large
> selection of commands which, by design, retry until the command exceeds
> its timeout.  UA is one of those (as are most of the others you're
> limiting).  How do you kick this device out of its UA return (because
> that's the recovery that needs to happen)?
>
> James
>
>

Thanks for reviewing, James.

Originally, my plan was that once the retry count exceeds its limit,
a monitoring tool would stop the server, using the SCSI printk error
message as a trigger.
In the current failfast mode implementation, the command fails when
its retry count exceeds the limit. However, I noticed that just printing
an error message when the retry count exceeds the limit, without changing
the retry logic, would be enough to stop the server and fail over.
On the other hand, there is no guarantee that a userspace application
can work properly while the disk is failing.
So I now think that simply calling panic() when the retry limit is
exceeded is better.

For that reason, I propose adding a "panic_on_error" option as a sysfs
parameter. If panic_on_error mode is enabled, the server panics immediately
once it detects that the retry limit has been exceeded. Of course, it is
disabled by default.
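
As a rough illustration only (this is not the patch, and not kernel code),
the check behind panic_on_error could be modelled in userspace C as below.
All names here (struct sdev_policy, decide_on_retry, ACTION_RETRY,
ACTION_PANIC) are hypothetical and chosen for this sketch; the limit of 5
is taken from SD_MAX_RETRIES mentioned earlier in the thread. The existing
retry logic is assumed to stay unchanged when the option is off.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical userspace model of the proposed policy; not kernel code. */
enum retry_action {
    ACTION_RETRY,   /* keep retrying, as the current logic does */
    ACTION_PANIC    /* retry limit exceeded and panic_on_error is set */
};

struct sdev_policy {
    bool panic_on_error;  /* proposed sysfs option, off by default */
    int  max_retries;     /* upper limit on retries, e.g. 5 */
};

/*
 * Decide what to do with a command that has already been retried
 * 'retries' times. The retry logic itself is left untouched: the only
 * new behaviour is the panic decision once the limit is exceeded.
 */
static enum retry_action decide_on_retry(const struct sdev_policy *p,
                                         int retries)
{
    assert(p != NULL && p->max_retries >= 0);

    if (p->panic_on_error && retries > p->max_retries)
        return ACTION_PANIC;

    return ACTION_RETRY;
}
```

In this model a device with panic_on_error disabled always gets
ACTION_RETRY, so the default behaviour is the same as today. In the kernel,
ACTION_PANIC would correspond to the panic() call discussed above.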

I would appreciate it if you could give me some comments.

Eiichi

