From: "Justin T. Gibbs" <gibbs@scsiguy.com>
To: James Bottomley <James.Bottomley@SteelEye.com>,
	Xose Vazquez Perez <xose@wanadoo.es>
Cc: Linux Kernel <linux-kernel@vger.kernel.org>,
	Tosatti <marcelo.tosatti@cyclades.com>,
	linux-scsi <linux-scsi@vger.kernel.org>
Subject: Re: AIC7xxx kernel problem with 2.4.2[234] kernels
Date: Mon, 19 Jan 2004 11:38:17 -0700
Message-ID: <3747775408.1074537497@aslan.btc.adaptec.com>
In-Reply-To: <1074532919.1895.32.camel@mulgrave>

> On Mon, 2004-01-19 at 08:32, Xose Vazquez Perez wrote:
>> It looks like the _kernel_ driver is going to be without a maintainer
>> unless somebody works on it, porting ADAPTEC fixes/features to the kernel driver.
> 
> As I told you in private email, this is *not* the way I see it.  At the
> moment, Adaptec is the maintainer of that driver unless they choose
> formally to relinquish it.

Can you provide your definition of "maintainer"?  I know that I am the
maintainer of the drivers distributed from my website, but I don't feel I
have ever been the maintainer of the drivers in the 2.4.X, 2.5.X, or 2.6.X
trees.

> There is a glimmering of a resolution of the problem in an early
> notification API for command timeouts.

I'm open to ideas, but from this one-line summary, this sounds like a
workaround and not a real solution.  Can you say more about your proposal?

In my mind, an easy resolution would be to:

1) Let me fix the SCSI layer so that the error recovery handler override
   already there will actually work - cleanly.

2) Let my drivers use that mechanism.

While working on 1, I would appreciate being able to "maintain" these
drivers with their current error recovery workaround in place.

> Although throwing away successful completions when error recovery is in
> progress isn't a bug (scsi commands are either idempotent or non
> retryable), it's certainly not ideal.

Most SCSI commands are only idempotent if replayed in the same order
as originally issued (consider FSes that rely on write ordering to
keep their meta-data coherent).  Some commands are retriable but only if
they have actually failed.  The mid-layer currently has no concept of these
issues, yet it acts on behalf of the peripheral drivers that can better
understand how the device they control behaves and act accordingly.
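
To make the ordering point concrete, here is a toy model, not driver or
mid-layer code.  The block number, the payload strings, and the
apply_writes() helper are all made up for illustration.  It shows that
replaying a pair of writes in their original order leaves the medium
unchanged, while replaying them reordered does not:

```python
def apply_writes(image, writes):
    """Apply (block, data) writes in order; return the resulting image."""
    result = dict(image)
    for block, data in writes:
        result[block] = data
    return result


# Two writes to the same (hypothetical) metadata block, in issue order.
issued = [(10, "inode v1"), (10, "inode v2")]

once = apply_writes({}, issued)

# Replay in the original order: the final state is unchanged.
replayed_in_order = apply_writes(once, issued)

# Replay with the order swapped: the stale version ends up on the medium.
replayed_reordered = apply_writes(once, list(reversed(issued)))
```

In this sketch replayed_in_order equals once, while replayed_reordered
ends with the older "inode v1" in block 10.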

Bugs are defects that render non-ideal behavior.  The only question is
what types of non-ideal behaviors you are willing to tolerate.

> I'm thinking about a better
> framework where we would quiesce the device but pull back from
> activating the eh thread if all commands return.  This would also fix
> the tag starvation issue that many drivers tackle independently too.

That wouldn't help things.  For example, let's say that there is one command
active on the bus holding up the completion of 32 others.  "Waiting for a bit"
will never release the other 32 commands.  You must abort the bus hog.  Once
you abort the problem command, you get flooded with the completions of the
32 others.  The bus is recovered.  You can now safely go about your business.
An HBA watchdog handler can properly deal with this situation since it has
state that the mid-layer does not.
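
The scenario can be sketched as a small simulation.  This is only a toy
model under stated assumptions: the Bus class, its wait() and abort()
methods, and the tag numbers are hypothetical and do not correspond to
any kernel or driver interface.

```python
class Bus:
    """Toy bus: one 'hog' command blocks completion of all the others."""

    def __init__(self, hog_tag, blocked_tags):
        self.hog = hog_tag
        self.pending = set(blocked_tags) | {hog_tag}

    def wait(self):
        """Let some time pass and return the tags that completed."""
        if self.hog in self.pending:
            return []  # nothing can complete while the hog holds the bus
        done = sorted(self.pending)
        self.pending.clear()
        return done

    def abort(self, tag):
        """Abort one command, removing it from the bus."""
        self.pending.discard(tag)


bus = Bus(hog_tag=0, blocked_tags=range(1, 33))

# "Waiting for a bit" any number of times releases nothing.
waited = [bus.wait() for _ in range(5)]

# Aborting the hog lets the other 32 commands complete.
bus.abort(0)
completions = bus.wait()
```

Every entry in waited is empty; only after abort(0) does wait() return
the 32 blocked tags.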

As for tag starvation, just inserting a periodic ordered tag on devices
that show signs of starvation is a much better approach than shutting
down the flow of commands to the whole controller at the first sign of
trouble.  Luckily, most vendors stopped making drives with tag starvation
issues in the mid-90's.  For this reason, the tag starvation code in
my drivers is off by default, but can be enabled via a module or kernel
command line option.
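
As a rough illustration of the periodic ordered tag idea (not the code
in my drivers; the tag_types() helper, the starving flag, and the
interval of 8 are assumptions made up for this sketch): an ordered tag
makes the device finish every earlier command before it, so inserting
one every N commands bounds how long any earlier command can be passed
over.

```python
def tag_types(num_commands, starving, ordered_interval=8):
    """Pick a queue tag type for each command sent to one device.

    While the device shows signs of starvation, every
    ordered_interval-th command is sent with an ORDERED tag;
    all other commands keep the normal SIMPLE tag.
    """
    tags = []
    for i in range(1, num_commands + 1):
        if starving and i % ordered_interval == 0:
            tags.append("ORDERED")
        else:
            tags.append("SIMPLE")
    return tags


healthy = tag_types(16, starving=False)
starved = tag_types(16, starving=True)
```

Only the starving device pays the cost of the ordered tags; commands to
every other device on the controller keep flowing.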

--
Justin


