linux-ide.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Tejun Heo <htejun@gmail.com>
To: Bruce Stenning <b.stenning@indigovision.com>
Cc: Mark Lord <kernel@teksavvy.com>,
	"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>,
	"linux-ide@vger.kernel.org" <linux-ide@vger.kernel.org>,
	Jeff Garzik <jgarzik@pobox.com>
Subject: Re: sata_mv port lockup on hotplug (kernel 2.6.38.2)
Date: Tue, 6 Sep 2011 12:45:44 +0900	[thread overview]
Message-ID: <20110906034544.GB18425@mtj.dyndns.org> (raw)
In-Reply-To: <C1438B59050E1B4C9482FF3266AD6BA32D8D9854F7@gretna.indigovision.com>

Hello,

On Fri, Sep 02, 2011 at 05:22:38PM +0100, Bruce Stenning wrote:
> Unfortunately it has so far been quite difficult to reproduce when specifically
> attempting to.  In normal use cases I reproduced it twice by unplugging a drive
> from a RAID array with redundancy intact.  This was out of around a dozen
> cycles of waiting until redundancy was restored while the unit was under load,
> popping the disk, reinserting, and triggering a RAID rebuild.

Hmm... that's unfortunate.

> I have only twice managed to trigger a lockup deliberately.  In both cases the
> tracing showed a scheduled EH which was subsequently not enacted.
> 
> How long could it take for the EH to be enacted?  In the lockups that I
> have reproduced it did not seem to have recovered minutes later, but perhaps
> if I had waited longer...?  I have noticed that error recovery sometimes backs
> off for 8s and even 33s, but it always warns when that sort of delay is coming
> up.

It should happen pretty quickly.  In such cases, fastdrain is
activated and all pending commands are aborted if they complete in 3
secs and then EH should kick in.  The backoff is from reset path only
to give breathing time for devices which take long time to spin up and
doesn't apply in this case.

> I shall continue to try to track down why the scheduled EH does not happen.

Can you please add some debug printk's to scsi_schedule_eh() and see
whether scsi_eh_wakeup() is invoked from there?  It seems likely that
the problem is caused by race conditions around
SHOST_[CANCEL_]RECOVERY flags.

Thanks.

-- 
tejun

  reply	other threads:[~2011-09-06  3:45 UTC|newest]

Thread overview: 60+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
     [not found] <C1438B59050E1B4C9482FF3266AD6BA32C7493557A@gretna.indigovision.com>
     [not found] ` <C1438B59050E1B4C9482FF3266AD6BA32C7493559F@gretna.indigovision.com>
2011-04-06 21:00   ` sata_mv port lockup on hotplug (kernel 2.6.38.2) Alejandro Riveira Fernández
     [not found]   ` <4D9CD275.9000002@teksavvy.com>
2011-04-08 12:43     ` Bruce Stenning
2011-04-09  0:48       ` Mark Lord
2011-04-12 10:30         ` Bruce Stenning
2011-04-12 14:07           ` Mark Lord
2011-04-12 14:55             ` Mark Lord
2011-04-23  0:56               ` Tejun Heo
2011-04-25 11:07                 ` Bruce Stenning
2011-04-25 11:30                   ` Tejun Heo
2011-04-26 18:27                     ` Mark Lord
2011-04-25 14:54                 ` Bruce Stenning
2011-04-25 16:22                   ` Tejun Heo
2011-04-26 13:13                     ` Bruce Stenning
2011-04-26 13:50                       ` Tejun Heo
2011-04-26 15:41                         ` Bruce Stenning
2011-04-26 15:52                           ` Tejun Heo
2011-04-26 16:05                             ` Bruce Stenning
2011-04-30 14:01                               ` Tejun Heo
2011-05-17 15:30                                 ` Bruce Stenning
2011-05-25  9:41                                   ` Tejun Heo
2011-05-25 13:36                                     ` Bruce Stenning
2011-05-25 22:27                                       ` Mark Lord
2011-05-30 13:07                                         ` Bruce Stenning
2011-05-31  2:04                                           ` Mark Lord
2011-06-10 12:28                                     ` Mark Lord
2011-06-10 14:01                                       ` Tejun Heo
2011-06-10 17:32                                         ` Mark Lord
2011-06-10 17:39                                           ` Jeff Garzik
2011-06-10 20:49                                             ` Mark Lord
2011-06-10 21:20                                               ` Jeff Garzik
2011-06-11 12:19                                           ` Tejun Heo
2011-06-12 17:10                                             ` Mark Lord
2011-06-13 10:48                                               ` Tejun Heo
2011-08-29 15:49                                                 ` Bruce Stenning
2011-09-02  1:13                                                   ` Tejun Heo
2011-09-02 16:22                                                     ` Bruce Stenning
2011-09-06  3:45                                                       ` Tejun Heo [this message]
2011-09-06 12:19                                                         ` Bruce Stenning
2011-09-06 16:33                                                           ` Tejun Heo
2011-09-06 16:42                                                             ` Tejun Heo
2011-09-07 12:09                                                               ` Bruce Stenning
2011-09-08  1:16                                                                 ` Tejun Heo
2011-09-06 12:26                                                         ` Bruce Stenning
2011-09-06 14:40                                                         ` Bruce Stenning
2011-09-06 15:22                                                         ` Bruce Stenning
2011-04-26 18:37                           ` Mark Lord
2011-04-13 15:10           ` Mark Lord
2011-04-13 16:13             ` Bruce Stenning
2011-04-14 16:50             ` Bruce Stenning
2011-04-15  0:28               ` Mark Lord
2011-04-15 13:21                 ` Bruce Stenning
2011-04-15 15:36                   ` Mark Lord
2011-04-18 10:30                     ` Bruce Stenning
2011-04-23  0:56                     ` Tejun Heo
2011-04-15  3:15       ` Mikael Abrahamsson
2011-04-15  3:32         ` Mark Lord
2011-04-15  7:43           ` Mikael Abrahamsson
2011-04-15 12:35             ` Mark Lord
2011-04-15 14:04               ` Mikael Abrahamsson
2011-04-15 15:30                 ` Mark Lord

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20110906034544.GB18425@mtj.dyndns.org \
    --to=htejun@gmail.com \
    --cc=b.stenning@indigovision.com \
    --cc=jgarzik@pobox.com \
    --cc=kernel@teksavvy.com \
    --cc=linux-ide@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).