All of lore.kernel.org
* Re: smp dead lock of io_request_lock/queue_lock patch
@ 2004-01-12 15:07 Martin Peschke3
  2004-01-12 15:12 ` Arjan van de Ven
  0 siblings, 1 reply; 23+ messages in thread
From: Martin Peschke3 @ 2004-01-12 15:07 UTC (permalink / raw)
  To: Jens Axboe
  Cc: Doug Ledford, Arjan Van de Ven, Peter Yao, linux-kernel,
	linux-scsi mailing list, ihno


Hi,

I understand that people tend to be careful with optimizations
or enhancements. In that regard, this particular patch is a
debatable example.

But my request was meant in a broader sense.
This iorl-patch is just one out of roughly a dozen on
both RH's side and SuSE's side. I have been asking
for people to get together on the list and try to match and merge
these piles of patches, which are possibly not all as debatable
as the patch discussed in this thread, i.e. pure (partly
vintage) bugfixes.

If, in the course of that, people also agree on a clean, common
iorl-patch, that would be another step forward, in my opinion.


Mit freundlichen Grüßen / with kind regards

Martin Peschke

IBM Deutschland Entwicklung GmbH
Linux for eServer Development
Phone: +49-(0)7031-16-2349


Jens Axboe <axboe@suse.de>@vger.kernel.org on 12/01/2004 15:13:30

Sent by:    linux-scsi-owner@vger.kernel.org


To:    Martin Peschke3/Germany/IBM@IBMDE
cc:    Doug Ledford <dledford@redhat.com>, Arjan Van de Ven
       <arjanv@redhat.com>, Peter Yao <peter@exavio.com.cn>,
       linux-kernel@vger.kernel.org, linux-scsi mailing list
       <linux-scsi@vger.kernel.org>
Subject:    Re: smp dead lock of io_request_lock/queue_lock patch


On Mon, Jan 12 2004, Martin Peschke3 wrote:
> Hi,
>
> is there a way to merge all (or at least the common denominator) of
> Redhat's and SuSE's changes into the vanilla 2.4 SCSI stack?
> The SCSI part of Marcelo's kernel seems to be rather backlevel,
> considering all those fixes and enhancements added by the mentioned
> distributors and their SCSI experts. As this discussion underlines,
> there are a lot of common problems and sometimes even common
> approaches.  I am convinced that a number of patches ought to be
> incorporated into the mainline kernel. Though, I must admit
> that I am at a loss with how this could be achieved given the
> unresolved question of 2.4 SCSI maintenance
> (which has certainly played a part in building up those piles
> of SCSI patches).

I have mixed feelings about that. On one side, I'd love to merge the
scalability patches into mainline. We've had a significant number of bugs
in this area in the past, and it seems a shame that we all have to fix
them independently because we deviate from mainline. On the other hand,
2.4 is pretty much closed. There won't be many new distro 2.4
kernels.

Had you asked me 6 months ago I probably would have advocated merging
them, but right now I think it's a waste of time.

--
Jens Axboe

-
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html





^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: smp dead lock of io_request_lock/queue_lock patch
  2004-01-12 15:07 smp dead lock of io_request_lock/queue_lock patch Martin Peschke3
@ 2004-01-12 15:12 ` Arjan van de Ven
  2004-01-12 19:48   ` Christoph Hellwig
  0 siblings, 1 reply; 23+ messages in thread
From: Arjan van de Ven @ 2004-01-12 15:12 UTC (permalink / raw)
  To: Martin Peschke3
  Cc: Jens Axboe, Doug Ledford, Peter Yao, linux-kernel,
	linux-scsi mailing list, ihno

[-- Attachment #1: Type: text/plain, Size: 1004 bytes --]

On Mon, Jan 12, 2004 at 04:07:40PM +0100, Martin Peschke3 wrote:
> as the patch discussed in this thread, i.e. pure (partially
> vintage) bugfixes.

Both SuSE and Red Hat submit bugfixes they put in the respective trees to
marcelo already. There will not be many "pure bugfixes" that you can find in
vendor trees but not in marcelo's tree.

> If people agree in that course also about a clean, common
> iorl-patch, that would be another step forward, in my opinion.

None of the vendors in question is still doing development based on 2.4 (in
fact I suspect no Linux vendor/distro still is), so if you want vendors to
move to a joint patch for such a specific enhancement, which WILL involve
development, I really don't see the point. Neither SuSE nor Red Hat will be
jumping for joy to do a lot of work to replace a patch that works in their
respective existing products with something else with no clear gain at all,
while incurring significant work and stability risks.

Greetings,
    Arjan van de Ven

[-- Attachment #2: Type: application/pgp-signature, Size: 189 bytes --]

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: smp dead lock of io_request_lock/queue_lock patch
  2004-01-12 15:12 ` Arjan van de Ven
@ 2004-01-12 19:48   ` Christoph Hellwig
  2004-01-12 19:51     ` Doug Ledford
  0 siblings, 1 reply; 23+ messages in thread
From: Christoph Hellwig @ 2004-01-12 19:48 UTC (permalink / raw)
  To: Arjan van de Ven
  Cc: Martin Peschke3, Jens Axboe, Doug Ledford, Peter Yao,
	linux-kernel, linux-scsi mailing list, ihno

On Mon, Jan 12, 2004 at 04:12:30PM +0100, Arjan van de Ven wrote:
> > as the patch discussed in this thread, i.e. pure (partially
> > vintage) bugfixes.
> 
> Both SuSE and Red Hat submit bugfixes they put in the respective trees to
> marcelo already. There will not be many "pure bugfixes" that you can find in
> vendor trees but not in marcelo's tree.

I haven't seen SCSI patch submissions for 2.4 from the vendors on linux-scsi
for ages.  In fact I asked Jens & Doug two times whether they could sort out
the distro patches for the 2.4 stack and post them, but it seems they're busy
enough with real work, so this never happened.


^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: smp dead lock of io_request_lock/queue_lock patch
  2004-01-12 19:48   ` Christoph Hellwig
@ 2004-01-12 19:51     ` Doug Ledford
  2004-01-12 20:03       ` Christoph Hellwig
                         ` (2 more replies)
  0 siblings, 3 replies; 23+ messages in thread
From: Doug Ledford @ 2004-01-12 19:51 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: Arjan Van de Ven, Martin Peschke3, Jens Axboe, Peter Yao,
	linux-kernel, linux-scsi mailing list, ihno

On Mon, 2004-01-12 at 14:48, Christoph Hellwig wrote:
> On Mon, Jan 12, 2004 at 04:12:30PM +0100, Arjan van de Ven wrote:
> > > as the patch discussed in this thread, i.e. pure (partially
> > > vintage) bugfixes.
> > 
> > Both SuSE and Red Hat submit bugfixes they put in the respective trees to
> > marcelo already. There will not be many "pure bugfixes" that you can find in
> > vendor trees but not in marcelo's tree.
> 
> I haven't seen SCSI patches sumission for 2.4 from the vendors on linux-scsi
> for ages.  In fact I asked Jens & Doug two times whether they could sort out
> the distro patches for the 2.4 stack and post them, but it seems they're busy
> enough with real work so this never happened.

More or less.  But part of it also is that a lot of the patches I've
written are on top of other patches that people don't want (aka, the
iorl patch).  I've got a mlqueue patch that actually *really* should go
into mainline because it solves a slew of various problems in one go,
but the current version of the patch depends on some semantic changes
made by the iorl patch.  So, sorting things out can sometimes be
difficult.  But, I've been told to go ahead and do what I can as far as
getting the stuff out, so I'm taking some time to try and get a bk tree
out there so people can see what I'm talking about.  Once I've got the
full tree out there, then it might be possible to start picking and
choosing which things to port against mainline so that they don't depend
on patches like the iorl patch.

-- 
  Doug Ledford <dledford@redhat.com>     919-754-3700 x44233
         Red Hat, Inc.
         1801 Varsity Dr.
         Raleigh, NC 27606



^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: smp dead lock of io_request_lock/queue_lock patch
  2004-01-12 19:51     ` Doug Ledford
@ 2004-01-12 20:03       ` Christoph Hellwig
  2004-01-12 21:12         ` Jens Axboe
  2004-01-13 20:55       ` Marcelo Tosatti
  2004-01-15 17:17       ` Bill Davidsen
  2 siblings, 1 reply; 23+ messages in thread
From: Christoph Hellwig @ 2004-01-12 20:03 UTC (permalink / raw)
  To: Doug Ledford
  Cc: Arjan Van de Ven, Martin Peschke3, Jens Axboe, Peter Yao,
	linux-kernel, linux-scsi mailing list, ihno

On Mon, Jan 12, 2004 at 02:51:42PM -0500, Doug Ledford wrote:
> More or less.  But part of it also is that a lot of the patches I've
> written are on top of other patches that people don't want (aka, the
> iorl patch).

I'm wondering whether we want it now that 2.4 is basically frozen, but
I don't think there was a strong case against it, say, 4 or 5 months ago.
OTOH, given the success (or lack thereof) I've had pushing core changes
through Marcelo, its chances wouldn't have been too high even if the
scsi folks had ACKed it.

> I've got a mlqueue patch that actually *really* should go
> into mainline because it solves a slew of various problems in one go,
> but the current version of the patch depends on some semantic changes
> made by the iorl patch.  So, sorting things out can sometimes be
> difficult.  But, I've been told to go ahead and do what I can as far as
> getting the stuff out, so I'm taking some time to try and get a bk tree
> out there so people can see what I'm talking about.  Once I've got the
> full tree out there, then it might be possible to start picking and
> choosing which things to port against mainline so that they don't depend
> on patches like the iorl patch.

I personally just don't care enough about 2.4 anymore, so I don't think
I'll invest major amounts of time in it.  Even though the scsi changes
you've done are fairly huge, I'm wondering whether we should just throw
it all in anyway - given that you said you'll have to care for the 2.4
scsi stack for a longer time for RH, and no one else seems to be
interested in doing maintenance.

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: smp dead lock of io_request_lock/queue_lock patch
  2004-01-12 20:03       ` Christoph Hellwig
@ 2004-01-12 21:12         ` Jens Axboe
  0 siblings, 0 replies; 23+ messages in thread
From: Jens Axboe @ 2004-01-12 21:12 UTC (permalink / raw)
  To: Christoph Hellwig, Doug Ledford, Arjan Van de Ven,
	Martin Peschke3, Peter Yao, linux-kernel,
	linux-scsi mailing list, ihno

On Mon, Jan 12 2004, Christoph Hellwig wrote:
> On Mon, Jan 12, 2004 at 02:51:42PM -0500, Doug Ledford wrote:
> > More or less.  But part of it also is that a lot of the patches I've
> > written are on top of other patches that people don't want (aka, the
> > iorl patch).
> 
> I'm wondering whether we want it now that 2.4 is basically frozen, but
> I don't think there was a strong case against it say 4 or 5 month ago.
> OTOH given that success (or lack thereof) I had pushing core changes
> through Marcelo the chances it had even if scsi folks ACKed wouldn't
> have been too high.

That's the key point: is it appropriate to merge now...

But I can completely back Doug on the point he made wrt pushing stuff
back to mainline - it was hard, because we deviated too much. And that
is also what I stressed would be the most important argument for merging
the iorl + scsi core changes.

> > I've got a mlqueue patch that actually *really* should go
> > into mainline because it solves a slew of various problems in one go,
> > but the current version of the patch depends on some semantic changes
> > made by the iorl patch.  So, sorting things out can sometimes be
> > difficult.  But, I've been told to go ahead and do what I can as far as
> > getting the stuff out, so I'm taking some time to try and get a bk tree
> > out there so people can see what I'm talking about.  Once I've got the
> > full tree out there, then it might be possible to start picking and
> > choosing which things to port against mainline so that they don't depend
> > on patches like the iorl patch.
> 
> I personally just don't care enough about 2.4 anymore, so I don't think
> I'll invest major amounts of time into it.  Even though the scsi changes
> you've done are fairly huge I'm wondering whether we should just throw
> it all in anyway - given that you said you'll have to care for the 2.4
> scsi stack for a longer time for RH and no one else seems to be interested
> doing maintaince.

Ditto.

-- 
Jens Axboe


^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: smp dead lock of io_request_lock/queue_lock patch
  2004-01-12 19:51     ` Doug Ledford
  2004-01-12 20:03       ` Christoph Hellwig
@ 2004-01-13 20:55       ` Marcelo Tosatti
  2004-01-17 13:10         ` Doug Ledford
  2004-01-15 17:17       ` Bill Davidsen
  2 siblings, 1 reply; 23+ messages in thread
From: Marcelo Tosatti @ 2004-01-13 20:55 UTC (permalink / raw)
  To: Doug Ledford
  Cc: Christoph Hellwig, Arjan Van de Ven, Martin Peschke3, Jens Axboe,
	Peter Yao, linux-kernel, linux-scsi mailing list, ihno



On Mon, 12 Jan 2004, Doug Ledford wrote:

> On Mon, 2004-01-12 at 14:48, Christoph Hellwig wrote:
> > On Mon, Jan 12, 2004 at 04:12:30PM +0100, Arjan van de Ven wrote:
> > > > as the patch discussed in this thread, i.e. pure (partially
> > > > vintage) bugfixes.
> > >
> > > Both SuSE and Red Hat submit bugfixes they put in the respective trees to
> > > marcelo already. There will not be many "pure bugfixes" that you can find in
> > > vendor trees but not in marcelo's tree.
> >
> > I haven't seen SCSI patches sumission for 2.4 from the vendors on linux-scsi
> > for ages.  In fact I asked Jens & Doug two times whether they could sort out
> > the distro patches for the 2.4 stack and post them, but it seems they're busy
> > enough with real work so this never happened.
>
> More or less.  But part of it also is that a lot of the patches I've
> written are on top of other patches that people don't want (aka, the
> iorl patch).  I've got a mlqueue patch that actually *really* should go
> into mainline because it solves a slew of various problems in one go,
> but the current version of the patch depends on some semantic changes
> made by the iorl patch.  So, sorting things out can sometimes be
> difficult.  But, I've been told to go ahead and do what I can as far as
> getting the stuff out, so I'm taking some time to try and get a bk tree
> out there so people can see what I'm talking about.  Once I've got the
> full tree out there, then it might be possible to start picking and
> choosing which things to port against mainline so that they don't depend
> on patches like the iorl patch.

Hi,

Merging "straight" _bugfixes_ (I think we all agree that merging
performance enhancements at this point in 2.4 is not interesting) which,
as you said, get reviewed by Jens/James/others is OK.

What is this mlqueue patchset fixing? To tell you the truth, I'm not
aware of the SCSI stack's problems.

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: smp dead lock of io_request_lock/queue_lock patch
  2004-01-12 19:51     ` Doug Ledford
  2004-01-12 20:03       ` Christoph Hellwig
  2004-01-13 20:55       ` Marcelo Tosatti
@ 2004-01-15 17:17       ` Bill Davidsen
  2004-01-17 13:12         ` Doug Ledford
  2 siblings, 1 reply; 23+ messages in thread
From: Bill Davidsen @ 2004-01-15 17:17 UTC (permalink / raw)
  To: Doug Ledford
  Cc: Christoph Hellwig, Arjan Van de Ven, Martin Peschke3, Jens Axboe,
	Peter Yao, linux-kernel, linux-scsi mailing list, ihno

Doug Ledford wrote:

> More or less.  But part of it also is that a lot of the patches I've
> written are on top of other patches that people don't want (aka, the
> iorl patch).  I've got a mlqueue patch that actually *really* should go
> into mainline because it solves a slew of various problems in one go,
> but the current version of the patch depends on some semantic changes
> made by the iorl patch.  So, sorting things out can sometimes be
> difficult.  But, I've been told to go ahead and do what I can as far as
> getting the stuff out, so I'm taking some time to try and get a bk tree
> out there so people can see what I'm talking about.  Once I've got the
> full tree out there, then it might be possible to start picking and
> choosing which things to port against mainline so that they don't depend
> on patches like the iorl patch.

If it leads to a more stable kernel, I don't see why iorl can't go in 
(from a user perspective), since RH is going to maintain it rather than 
trying to find a developer who is competent and willing to do the boring 
task of backfitting bugfixes to sub-optimal code.

The only problem I see would be getting testing before calling it 
stable. Putting out a "giant SCSI patch" for test, then into a -test 
kernel should solve that. The fact that RH is stuck supporting this for 
at least five years is certainly both motivation and field test for any 
changes ;-)

Clearly Marcelo has the decision, but it looks from here as if 
stability would be improved by something like this. Assuming that no 
other vendor objects, of course.

-- 
bill davidsen <davidsen@tmr.com>
   CTO TMR Associates, Inc
   Doing interesting things with small computers since 1979

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: smp dead lock of io_request_lock/queue_lock patch
  2004-01-13 20:55       ` Marcelo Tosatti
@ 2004-01-17 13:10         ` Doug Ledford
  2004-01-17 16:58           ` Christoph Hellwig
                             ` (2 more replies)
  0 siblings, 3 replies; 23+ messages in thread
From: Doug Ledford @ 2004-01-17 13:10 UTC (permalink / raw)
  To: Marcelo Tosatti
  Cc: Christoph Hellwig, Arjan Van de Ven, Martin Peschke3, Jens Axboe,
	Peter Yao, linux-kernel, linux-scsi mailing list, ihno

On Tue, 2004-01-13 at 15:55, Marcelo Tosatti wrote:
> On Mon, 12 Jan 2004, Doug Ledford wrote:
> 
> > On Mon, 2004-01-12 at 14:48, Christoph Hellwig wrote:
> > > On Mon, Jan 12, 2004 at 04:12:30PM +0100, Arjan van de Ven wrote:
> > > > > as the patch discussed in this thread, i.e. pure (partially
> > > > > vintage) bugfixes.
> > > >
> > > > Both SuSE and Red Hat submit bugfixes they put in the respective trees to
> > > > marcelo already. There will not be many "pure bugfixes" that you can find in
> > > > vendor trees but not in marcelo's tree.
> > >
> > > I haven't seen SCSI patches sumission for 2.4 from the vendors on linux-scsi
> > > for ages.  In fact I asked Jens & Doug two times whether they could sort out
> > > the distro patches for the 2.4 stack and post them, but it seems they're busy
> > > enough with real work so this never happened.
> >
> > More or less.  But part of it also is that a lot of the patches I've
> > written are on top of other patches that people don't want (aka, the
> > iorl patch).  I've got a mlqueue patch that actually *really* should go
> > into mainline because it solves a slew of various problems in one go,
> > but the current version of the patch depends on some semantic changes
> > made by the iorl patch.  So, sorting things out can sometimes be
> > difficult.  But, I've been told to go ahead and do what I can as far as
> > getting the stuff out, so I'm taking some time to try and get a bk tree
> > out there so people can see what I'm talking about.  Once I've got the
> > full tree out there, then it might be possible to start picking and
> > choosing which things to port against mainline so that they don't depend
> > on patches like the iorl patch.
> 
> Hi,
> 
> Merging "straight" _bugfixes_ 

OK, I've got some new bk trees established on linux-scsi.bkbits.net. 
There is a 2.4-for-marcelo, which is where I will stick stuff I think is
ready to go to you.  There is a 2.4-drivers.  Then there is a
2.4-everything.  The everything tree will be a place to put any and all
patches that aren't just experimental patches.  So, just about any Red
Hat SCSI patch is fair game.  Same for the mods SuSE has made.  Pushing
changes from here to 2.4-for-marcelo on a case by case basis is more or
less my plan.  Right now, these trees are all just perfect clones of
Marcelo's tree.  As I make changes and commit things, I'll update the
lists with the new stuff.

> (I think all we agree that merging
> performance enhancements at this point in 2.4 is not interesting)

Well, I can actually make a fairly impressive argument for the iorl
patch if you ask me.  It's your decision if you want it, but here's my
take on the issue:

1)  Red Hat has shipped the iorl patch in our 2.4.9 Advanced Server 2.1
release, our 2.4.18 based Advanced Server 2.1 for IPF release, and our
current 2.4.21 based RHEL 3 release.  So, it's getting *lots* of testing
outside the community.  Including things like all the specweb
benchmarks, Oracle TPC benchmarks, and all of our customers are using
this patch.  That's a *lot* of real world, beat it to death testing.

2)  As I recall, SuSE has shipped either their own iorl-type patch or
one of our versions of the patch; I don't really know which.
But I know they've got the same basic functionality.  Between 1 and 2,
that greatly increases the field testing of the iorl patch and greatly
reduces the breadth of testing of the regular kernel scsi stack.

3)  OK, you can call it a performance patch if you want, but that
doesn't really do it justice.  So, here's a little history.  I used the
lockmeter patch when I was writing the iorl patch to get a clue just how
much difference it made.  Obviously, on single disk systems it makes
very little difference (but it does make some difference because the
host controller and scsi_request_fn use different locks even on a single
device and that does help some).  But, on a test machine with 4 CPUs and
something like 14 disks, without the patch a disk benchmark was using
100% of CPU and hitting something like 98% contention on the
io_request_lock.  On the same system with the patch, benchmark CPU usage
dropped to something like 12% and because we weren't spinning any more
scsi_request_fn and make_request dropped off the profile radar entirely
and contention was basically down to low single digits.  Consider my
home server has 7 disks in a raid5 array, it certainly makes a
difference to me ;-)

4)  The last issue.  2.6 already has individual host locks for drivers. 
The iorl patch for 2.4 adds the same thing.  So, adding the iorl patch
to 2.4 makes it easier to have drivers be the same between 2.4 and 2.6. 
Right now it takes some fairly convoluted #ifdef statements to get the
locking right in a driver that supports both 2.4 and 2.6.  Adding the
iorl patch allows driver authors to basically state that they don't
support anything prior to whatever version of 2.4 it goes into and
remove a bunch of #ifdef crap.

>  which,
> as you said, get reviewed by Jens/James/others is OK.

That's the reason for the bk trees.  Anyone can check them out that way.

> What is this mlqueue patchset fixing ? To tell you the truth, I'm not
> aware of the SCSI stack problems.

OK.  This is a different problem entirely.  The current 2.4 scsi stack
has a problem handling QUEUE_FULL and BUSY status commands.  Amongst
other things, it retries the command immediately and will basically hang
the SCSI bus if the low level driver doesn't intervene.  Most well
written drivers have workaround code in them (including my aic7xxx_old
driver).  The mlqueue patch changes it so that the mid layer implements
delayed retries of QUEUE_FULL and BUSY commands.  It also makes the mid
layer recognize the situation where you get a QUEUE_FULL or BUSY command
status with 0 outstanding commands.  This used to cause the mid layer to
hang on that device because there would be no command completion to kick
the queue back into gear.  Now, with the mlqueue patch, it detects 0
outstanding commands and sets a 1/5 second timer that restarts command
flow after the delay.  This solved several IBM bugs in one go.  Again,
this is a problem that older drivers with workaround code didn't need,
but newer drivers such as the IBM zfcp (is that right?) and the open
source Emulex driver don't have the standard workarounds for the bug and
were getting hit by it.  By fixing the problem properly, a lot of
drivers could actually remove their workaround code and use the mid
layer code.  The first version of this patch just solves the hang and
immediate retry problem.  Ongoing work will fix the patch to do more
things correctly, such as replaying commands in their original order,
stuff like that.

-- 
  Doug Ledford <dledford@redhat.com>     919-754-3700 x44233
         Red Hat, Inc.
         1801 Varsity Dr.
         Raleigh, NC 27606



^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: smp dead lock of io_request_lock/queue_lock patch
  2004-01-15 17:17       ` Bill Davidsen
@ 2004-01-17 13:12         ` Doug Ledford
  2004-01-17 15:16           ` Bill Davidsen
  0 siblings, 1 reply; 23+ messages in thread
From: Doug Ledford @ 2004-01-17 13:12 UTC (permalink / raw)
  To: Bill Davidsen
  Cc: Christoph Hellwig, Arjan Van de Ven, Martin Peschke3, Jens Axboe,
	Peter Yao, linux-kernel, linux-scsi mailing list, ihno

On Thu, 2004-01-15 at 12:17, Bill Davidsen wrote:
> Doug Ledford wrote:
> 
> > More or less.  But part of it also is that a lot of the patches I've
> > written are on top of other patches that people don't want (aka, the
> > iorl patch).  I've got a mlqueue patch that actually *really* should go
> > into mainline because it solves a slew of various problems in one go,
> > but the current version of the patch depends on some semantic changes
> > made by the iorl patch.  So, sorting things out can sometimes be
> > difficult.  But, I've been told to go ahead and do what I can as far as
> > getting the stuff out, so I'm taking some time to try and get a bk tree
> > out there so people can see what I'm talking about.  Once I've got the
> > full tree out there, then it might be possible to start picking and
> > choosing which things to port against mainline so that they don't depend
> > on patches like the iorl patch.
> 
> If it leads to a more stable kernel, I don't see why iorl can't go in 
> (user perspective) because RH is going to maintain it instead of trying 
> to find a developer who is competent and willing to do the boring task 
> of backfitting bugfixes to sub-optimal code.

We actually intended to leave it out of RHEL3.  But, once we started
doing performance testing of RHEL3 vs. AS2.1, it was obvious that if we
didn't put the patch back in we could kiss all of our benchmark results
goodbye.  Seriously, it makes that much difference on server systems.

> The only problem I see would be getting testing before calling it 
> stable. Putting out a "giant SCSI patch" for test, then into a -test 
> kernel should solve that. The fact that RH is stuck supporting this for 
> at least five years is certainly both motivation and field test for any 
> changes ;-)

See my last email on the testing issue.

> Clearly Marcello has the decision, but it looks from here as if 
> stability would be improved by something like this. Assuming that no 
> other vendor objects, of course.

Stability, maybe.  Ease of writing drivers that work in both 2.4 and 2.6
using the same locking logic, absolutely.

-- 
  Doug Ledford <dledford@redhat.com>     919-754-3700 x44233
         Red Hat, Inc.
         1801 Varsity Dr.
         Raleigh, NC 27606



^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: smp dead lock of io_request_lock/queue_lock patch
  2004-01-17 13:12         ` Doug Ledford
@ 2004-01-17 15:16           ` Bill Davidsen
  2004-01-17 16:07             ` Doug Ledford
  0 siblings, 1 reply; 23+ messages in thread
From: Bill Davidsen @ 2004-01-17 15:16 UTC (permalink / raw)
  To: Doug Ledford
  Cc: Christoph Hellwig, Arjan Van de Ven, Martin Peschke3, Jens Axboe,
	Peter Yao, linux-kernel, linux-scsi mailing list, ihno

Doug Ledford wrote:
> On Thu, 2004-01-15 at 12:17, Bill Davidsen wrote:
> 
>>Doug Ledford wrote:
>>
>>
>>>More or less.  But part of it also is that a lot of the patches I've
>>>written are on top of other patches that people don't want (aka, the
>>>iorl patch).  I've got a mlqueue patch that actually *really* should go
>>>into mainline because it solves a slew of various problems in one go,
>>>but the current version of the patch depends on some semantic changes
>>>made by the iorl patch.  So, sorting things out can sometimes be
>>>difficult.  But, I've been told to go ahead and do what I can as far as
>>>getting the stuff out, so I'm taking some time to try and get a bk tree
>>>out there so people can see what I'm talking about.  Once I've got the
>>>full tree out there, then it might be possible to start picking and
>>>choosing which things to port against mainline so that they don't depend
>>>on patches like the iorl patch.
>>
>>If it leads to a more stable kernel, I don't see why iorl can't go in 
>>(user perspective) because RH is going to maintain it instead of trying 
>>to find a developer who is competent and willing to do the boring task 
>>of backfitting bugfixes to sub-optimal code.
> 
> 
> We actually intended to leave it out of RHEL3.  But, once we started
> doing performance testing of RHEL3 vs. AS2.1, it was obvious that if we
> didn't put the patch back in we could kiss all of our benchmark results
> goodbye.  Seriously, it makes that much difference on server systems.

I'm running 30+ usenet servers, currently on older RH versions. Better 
performance would certainly be a plus, although stability WRT lockups 
has been an issue, as has operation when the number of threads gets very 
high. First tests of RHEL looked very promising for stability, I'm 
hoping to get the okay to do serious production testing next week.

>>The only problem I see would be getting testing before calling it 
>>stable. Putting out a "giant SCSI patch" for test, then into a -test 
>>kernel should solve that. The fact that RH is stuck supporting this for 
>>at least five years is certainly both motivation and field test for any 
>>changes ;-)
> 
> 
> See me last email on the testing issue.
> 
> 
>>Clearly Marcello has the decision, but it looks from here as if 
>>stability would be improved by something like this. Assuming that no 
>>other vendor objects, of course.
> 
> 
> Stability, maybe.  Ease of writing drivers that work in both 2.4 and 2.6
> using the same locking logic, absolutely.
> 


-- 
bill davidsen <davidsen@tmr.com>
   CTO TMR Associates, Inc
   Doing interesting things with small computers since 1979

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: smp dead lock of io_request_lock/queue_lock patch
  2004-01-17 15:16           ` Bill Davidsen
@ 2004-01-17 16:07             ` Doug Ledford
  0 siblings, 0 replies; 23+ messages in thread
From: Doug Ledford @ 2004-01-17 16:07 UTC (permalink / raw)
  To: Bill Davidsen
  Cc: Christoph Hellwig, Arjan Van de Ven, Martin Peschke3, Jens Axboe,
	Peter Yao, linux-kernel, linux-scsi mailing list, ihno

On Sat, 2004-01-17 at 10:16, Bill Davidsen wrote:
> Doug Ledford wrote:
> > On Thu, 2004-01-15 at 12:17, Bill Davidsen wrote:
> > 
> >>Doug Ledford wrote:
> >>
> >>
> >>>More or less.  But part of it also is that a lot of the patches I've
> >>>written are on top of other patches that people don't want (aka, the
> >>>iorl patch).  I've got a mlqueue patch that actually *really* should go
> >>>into mainline because it solves a slew of various problems in one go,
> >>>but the current version of the patch depends on some semantic changes
> >>>made by the iorl patch.  So, sorting things out can sometimes be
> >>>difficult.  But, I've been told to go ahead and do what I can as far as
> >>>getting the stuff out, so I'm taking some time to try and get a bk tree
> >>>out there so people can see what I'm talking about.  Once I've got the
> >>>full tree out there, then it might be possible to start picking and
> >>>choosing which things to port against mainline so that they don't depend
> >>>on patches like the iorl patch.
> >>
> >>If it leads to a more stable kernel, I don't see why iorl can't go in 
> >>(user perspective) because RH is going to maintain it instead of trying 
> >>to find a developer who is competent and willing to do the boring task 
> >>of backfitting bugfixes to sub-optimal code.
> > 
> > 
> > We actually intended to leave it out of RHEL3.  But, once we started
> > doing performance testing of RHEL3 vs. AS2.1, it was obvious that if we
> > didn't put the patch back in we could kiss all of our benchmark results
> > goodbye.  Seriously, it makes that much difference on server systems.
> 
> I'm running 30+ usenet servers, currently on older RH versions. Better 
> performance would certainly be a plus, although stability WRT lockups 
> has been an issue, as has operation when the number of threads gets very 
> high. First tests of RHEL looked very promising for stability, I'm 
> hoping to get the okay to do serious production testing next week.

Well, when you think about it, the iorl patch makes lockups less likely,
and when they do occur they are likely to affect a smaller subset of the
scsi subsystem.  When you have one lock and you get a lockup, the whole
system stops.  When you have one lock per device and one lock per
controller, a deadlock only takes down a subset of the scsi subsystem.

-- 
  Doug Ledford <dledford@redhat.com>     919-754-3700 x44233
         Red Hat, Inc.
         1801 Varsity Dr.
         Raleigh, NC 27606




* Re: smp dead lock of io_request_lock/queue_lock patch
  2004-01-17 13:10         ` Doug Ledford
@ 2004-01-17 16:58           ` Christoph Hellwig
  2004-01-17 19:07             ` Doug Ledford
  2004-01-20  2:19           ` [2.4] scsi per-host lock was " Marcelo Tosatti
  2004-01-25  0:31           ` Kurt Garloff
  2 siblings, 1 reply; 23+ messages in thread
From: Christoph Hellwig @ 2004-01-17 16:58 UTC (permalink / raw)
  To: Doug Ledford
  Cc: Marcelo Tosatti, Arjan Van de Ven, Martin Peschke3, Jens Axboe,
	Peter Yao, linux-kernel, linux-scsi mailing list, ihno

On Sat, Jan 17, 2004 at 08:10:00AM -0500, Doug Ledford wrote:
> 4)  The last issue.  2.6 already has individual host locks for drivers. 
> The iorl patch for 2.4 adds the same thing.  So, adding the iorl patch
> to 2.4 makes it easier to have drivers be the same between 2.4 and 2.6. 
> Right now it takes some fairly convoluted #ifdef statements to get the
> locking right in a driver that supports both 2.4 and 2.6.  Adding the
> iorl patch allows driver authors to basically state that they don't
> support anything prior to whatever version of 2.4 it goes into and
> remove a bunch of #ifdef crap.

Well, no.  For one thing, all the iorl patches miss the scsi_assign_lock
interface from 2.6, which makes drivers a big ifdef hell (especially as
the AS2.1 patch uses a different name for the lock than 3.0), and even
if it were there, the use of that function is strongly discouraged in
2.6 in favour of just using the host_lock.



* Re: smp dead lock of io_request_lock/queue_lock patch
  2004-01-17 16:58           ` Christoph Hellwig
@ 2004-01-17 19:07             ` Doug Ledford
  2004-01-17 19:17               ` Christoph Hellwig
  2004-01-20  7:53               ` Jens Axboe
  0 siblings, 2 replies; 23+ messages in thread
From: Doug Ledford @ 2004-01-17 19:07 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: Marcelo Tosatti, Arjan Van de Ven, Martin Peschke3, Jens Axboe,
	Peter Yao, linux-kernel, linux-scsi mailing list, ihno

On Sat, 2004-01-17 at 11:58, Christoph Hellwig wrote:
> On Sat, Jan 17, 2004 at 08:10:00AM -0500, Doug Ledford wrote:
> > 4)  The last issue.  2.6 already has individual host locks for drivers. 
> > The iorl patch for 2.4 adds the same thing.  So, adding the iorl patch
> > to 2.4 makes it easier to have drivers be the same between 2.4 and 2.6. 
> > Right now it takes some fairly convoluted #ifdef statements to get the
> > locking right in a driver that supports both 2.4 and 2.6.  Adding the
> > iorl patch allows driver authors to basically state that they don't
> > support anything prior to whatever version of 2.4 it goes into and
> > remove a bunch of #ifdef crap.
> 
> Well, no.  For one thing, all the iorl patches miss the scsi_assign_lock
> interface from 2.6, which makes drivers a big ifdef hell (especially as
> the AS2.1 patch uses a different name for the lock than 3.0), and even
> if it were there, the use of that function is strongly discouraged in
> 2.6 in favour of just using the host_lock.

Yeah, I saw that too.  Of course, it's not like that's my fault :-P  I
had the 2.4.9 version of the iorl patch before 2.5 had a host lock, so
2.5 *could* have followed the 2.4.9-iorl patch convention of using
host->lock as the name instead of host->host_lock and then we wouldn't
be here.  But, because 2.6 uses host->host_lock, and because usage of
the scsi_assign_lock() is discouraged, I made the iorl patch in RHEL3
follow the 2.6 convention of using host->host_lock.  In hindsight, I
should have just followed the 2.4.9-iorl patch convention.  Then a
driver could have just done this:

#ifdef SCSI_HAS_HOST_LOCK
#if LINUX_VERSION_CODE < KERNEL_VERSION(2,6,0)
	adapter->lock_ptr = &adapter->lock;
	host->lock = &adapter->lock;
#else
	adapter->lock_ptr = &adapter->lock;
	host->host_lock = &adapter->lock;
#endif
#else
	adapter->lock_ptr = &io_request_lock;
#endif

Then you just always use adapter->lock_ptr for spin locks in the driver
and you magically work in all kernel releases.  Now, by going to
host->host_lock in 2.4 we get rid of one of the if statements.  This
isn't impossible to do if both Red Hat and SuSE just release their next
update kernel with host->lock changed to host->host_lock.

-- 
  Doug Ledford <dledford@redhat.com>     919-754-3700 x44233
         Red Hat, Inc.
         1801 Varsity Dr.
         Raleigh, NC 27606




* Re: smp dead lock of io_request_lock/queue_lock patch
  2004-01-17 19:07             ` Doug Ledford
@ 2004-01-17 19:17               ` Christoph Hellwig
  2004-01-17 19:21                 ` Doug Ledford
  2004-01-20  7:53               ` Jens Axboe
  1 sibling, 1 reply; 23+ messages in thread
From: Christoph Hellwig @ 2004-01-17 19:17 UTC (permalink / raw)
  To: Doug Ledford
  Cc: Marcelo Tosatti, Arjan Van de Ven, Martin Peschke3, Jens Axboe,
	Peter Yao, linux-kernel, linux-scsi mailing list, ihno

On Sat, Jan 17, 2004 at 02:07:33PM -0500, Doug Ledford wrote:
> #ifdef SCSI_HAS_HOST_LOCK
> #if LINUX_VERSION_CODE < KERNEL_VERSION(2,6,0)
> 	adapter->lock_ptr = &adapter->lock;
> 	host->lock = &adapter->lock;
> #else
> 	adapter->lock_ptr = &adapter->lock;
> 	host->host_lock = &adapter->lock;
> #endif
> #else
> 	adapter->lock_ptr = &io_request_lock;
> #endif

Still looks wrong for the 2.6 case, which should just be:

	adapter->lock_ptr = shost->host_lock;

as I just stated in the review for the megaraid update.



* Re: smp dead lock of io_request_lock/queue_lock patch
  2004-01-17 19:17               ` Christoph Hellwig
@ 2004-01-17 19:21                 ` Doug Ledford
  2004-01-17 19:29                   ` Christoph Hellwig
  0 siblings, 1 reply; 23+ messages in thread
From: Doug Ledford @ 2004-01-17 19:21 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: Marcelo Tosatti, Arjan Van de Ven, Martin Peschke3, Jens Axboe,
	Peter Yao, linux-kernel, linux-scsi mailing list, ihno

On Sat, 2004-01-17 at 14:17, Christoph Hellwig wrote:
> On Sat, Jan 17, 2004 at 02:07:33PM -0500, Doug Ledford wrote:
> > #ifdef SCSI_HAS_HOST_LOCK
> > #if LINUX_VERSION_CODE < KERNEL_VERSION(2,6,0)
> > 	adapter->lock_ptr = &adapter->lock;
> > 	host->lock = &adapter->lock;
> > #else
> > 	adapter->lock_ptr = &adapter->lock;
> > 	host->host_lock = &adapter->lock;
> > #endif
> > #else
> > 	adapter->lock_ptr = &io_request_lock;
> > #endif
> 
> Still looks wrong for the 2.6 case, which should just be:
> 
> 	adapter->lock_ptr = shost->host_lock;
> 
> as I just stated in the review for the megaraid update.

I should pay more attention to what goes on in 2.6.  That was a crap
decision.  There are drivers out there that use global data and need one
lock across all controllers.  Putting a lock in the scsi host struct for
per controller locks is a waste of space.  Let the driver decide if it
needs a global driver lock, a per controller lock (in which case it
should be part of the driver private host struct), or just the global
io_request_lock.  It really is up to the driver and putting it into the
scsi host struct doesn't make sense.

-- 
  Doug Ledford <dledford@redhat.com>     919-754-3700 x44233
         Red Hat, Inc.
         1801 Varsity Dr.
         Raleigh, NC 27606




* Re: smp dead lock of io_request_lock/queue_lock patch
  2004-01-17 19:21                 ` Doug Ledford
@ 2004-01-17 19:29                   ` Christoph Hellwig
  2004-01-17 20:36                     ` Doug Ledford
  0 siblings, 1 reply; 23+ messages in thread
From: Christoph Hellwig @ 2004-01-17 19:29 UTC (permalink / raw)
  To: Doug Ledford
  Cc: Marcelo Tosatti, Arjan Van de Ven, Martin Peschke3, Jens Axboe,
	Peter Yao, linux-kernel, linux-scsi mailing list, ihno

On Sat, Jan 17, 2004 at 02:21:43PM -0500, Doug Ledford wrote:
> I should pay more attention to what goes on in 2.6.  That was a crap
> decision.  There are drivers out there that use global data and need one
> lock across all controllers.  Putting a lock in the scsi host struct for
> per controller locks is a waste of space.  Let the driver decide if it
> needs a global driver lock, a per controller lock (in which case it
> should be part of the driver private host struct), or just the global
> io_request_lock.  It really is up to the driver and putting it into the
> scsi host struct doesn't make sense.

I tend to disagree.  Giving the driver control over the lock used by the
midlayer is just wrong.  If it has global state (which a good driver
shouldn't have), it's up to the driver to protect it.  Rather, the big
bug is that we still have a way to override that lock at all; in fact
only three drivers do so, and they all override it with a lock that's
per-HBA anyway.

> 
> -- 
>   Doug Ledford <dledford@redhat.com>     919-754-3700 x44233
>          Red Hat, Inc.
>          1801 Varsity Dr.
>          Raleigh, NC 27606
> 
> 
---end quoted text---

-- 
Christoph Hellwig <hch@lst.de>		-	Freelance Hacker
Contact me for driver hacking and kernel development consulting


* Re: smp dead lock of io_request_lock/queue_lock patch
  2004-01-17 19:29                   ` Christoph Hellwig
@ 2004-01-17 20:36                     ` Doug Ledford
  2004-01-17 20:54                       ` Christoph Hellwig
  0 siblings, 1 reply; 23+ messages in thread
From: Doug Ledford @ 2004-01-17 20:36 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: Marcelo Tosatti, Arjan Van de Ven, Martin Peschke3, Jens Axboe,
	Peter Yao, linux-kernel, linux-scsi mailing list, ihno

On Sat, 2004-01-17 at 14:29, Christoph Hellwig wrote:
> On Sat, Jan 17, 2004 at 02:21:43PM -0500, Doug Ledford wrote:
> > I should pay more attention to what goes on in 2.6.  That was a crap
> > decision.  There are drivers out there that use global data and need one
> > lock across all controllers.  Putting a lock in the scsi host struct for
> > per controller locks is a waste of space.  Let the driver decide if it
> > needs a global driver lock, a per controller lock (in which case it
> > should be part of the driver private host struct), or just the global
> > io_request_lock.  It really is up to the driver and putting it into the
> > scsi host struct doesn't make sense.
> 
> I tend to disagree.  Giving the driver control over the lock used by the
> midlayer is just wrong.  If it has global state (which a good driver
> shouldn't have), it's up to the driver to protect it.  Rather, the big
> bug is that we still have a way to override that lock at all; in fact
> only three drivers do so, and they all override it with a lock that's
> per-HBA anyway.

/me calls Christoph various unrepeatable names :-P

I'm exactly the opposite.  In my opinion, the mid layer should be using
the device locks for its internal operations and calling into the low
level drivers with no locks held.  The fact that the mid layer still
tries to lock drivers for the drivers is a bug IMO.  Every time I see a
driver do spin_unlock_irq(host->host_lock); do driver work;
spin_lock_irq(host->host_lock); return; it really points out that the
mid layer has no business locking down drivers for them.


-- 
  Doug Ledford <dledford@redhat.com>     919-754-3700 x44233
         Red Hat, Inc.
         1801 Varsity Dr.
         Raleigh, NC 27606




* Re: smp dead lock of io_request_lock/queue_lock patch
  2004-01-17 20:36                     ` Doug Ledford
@ 2004-01-17 20:54                       ` Christoph Hellwig
  0 siblings, 0 replies; 23+ messages in thread
From: Christoph Hellwig @ 2004-01-17 20:54 UTC (permalink / raw)
  To: Doug Ledford; +Cc: linux-scsi mailing list

On Sat, Jan 17, 2004 at 03:36:33PM -0500, Doug Ledford wrote:
> I'm exactly the opposite.  In my opinion, the mid layer should be using
> the device locks for its internal operations

well, I strongly disagree with that.

> and calling into the low
> level drivers with no locks held.  The fact that the mid layer still
> tries to lock drivers for the drivers is a bug IMO.  Every time I see a
> driver do spin_unlock_irq(host->host_lock); do driver work;
> spin_lock_irq(host->host_lock); return; it really points out that the
> mid layer has no business locking down drivers for them.

I completely agree with that, though.  I didn't have time to fix up all
the drivers before 2.6, so this won't happen before 2.7.

Btw, I just noticed how little this has to do with the original
discussion, so I've cut the Cc list down quite a lot.



* [2.4] scsi per-host lock was Re: smp dead lock of io_request_lock/queue_lock patch
  2004-01-17 13:10         ` Doug Ledford
  2004-01-17 16:58           ` Christoph Hellwig
@ 2004-01-20  2:19           ` Marcelo Tosatti
  2004-01-20  3:21             ` Doug Ledford
  2004-01-25  0:31           ` Kurt Garloff
  2 siblings, 1 reply; 23+ messages in thread
From: Marcelo Tosatti @ 2004-01-20  2:19 UTC (permalink / raw)
  To: Doug Ledford; +Cc: Jens Axboe, hch, James Bottomley, linux-kernel, akpm


Doug,

Where can I find a copy of uptodate iorl and mlqueue patchsets ?

I can't say I relish splitting a global lock at this point in 2.4's
lifetime, but it has been in Red Hat's and SuSE's trees for a long time.

One reason to have it is the mlqueue patch's dependency on it.  You said
most drivers handle the BUSY/QUEUE_FULL hangs by themselves, and you
mention "IBM zfcp" / Emulex, which don't.  What other drivers suffer
from the same problem?


On Sat, 17 Jan 2004, Doug Ledford wrote:

> On Tue, 2004-01-13 at 15:55, Marcelo Tosatti wrote:
> > On Mon, 12 Jan 2004, Doug Ledford wrote:
> >
> > > On Mon, 2004-01-12 at 14:48, Christoph Hellwig wrote:
> > > > On Mon, Jan 12, 2004 at 04:12:30PM +0100, Arjan van de Ven wrote:
> > > > > > as the patch discussed in this thread, i.e. pure (partially
> > > > > > vintage) bugfixes.
> > > > >
> > > > > Both SuSE and Red Hat submit bugfixes they put in the respective trees to
> > > > > marcelo already. There will not be many "pure bugfixes" that you can find in
> > > > > vendor trees but not in marcelo's tree.
> > > >
> > > > I haven't seen SCSI patch submissions for 2.4 from the vendors on linux-scsi
> > > > for ages.  In fact I asked Jens & Doug two times whether they could sort out
> > > > the distro patches for the 2.4 stack and post them, but it seems they're busy
> > > > enough with real work so this never happened.
> > >
> > > More or less.  But part of it also is that a lot of the patches I've
> > > written are on top of other patches that people don't want (aka, the
> > > iorl patch).  I've got a mlqueue patch that actually *really* should go
> > > into mainline because it solves a slew of various problems in one go,
> > > but the current version of the patch depends on some semantic changes
> > > made by the iorl patch.  So, sorting things out can sometimes be
> > > difficult.  But, I've been told to go ahead and do what I can as far as
> > > getting the stuff out, so I'm taking some time to try and get a bk tree
> > > out there so people can see what I'm talking about.  Once I've got the
> > > full tree out there, then it might be possible to start picking and
> > > choosing which things to port against mainline so that they don't depend
> > > on patches like the iorl patch.
> >
> > Hi,
> >
> > Merging "straight" _bugfixes_
>
> OK, I've got some new bk trees established on linux-scsi.bkbits.net.
> There is a 2.4-for-marcelo, which is where I will stick stuff I think is
> ready to go to you.  There is a 2.4-drivers.  Then there is a
> 2.4-everything.  The everything tree will be a place to put any and all
> patches that aren't just experimental patches.  So, just about any Red
> Hat SCSI patch is fair game.  Same for the mods SuSE has made.  Pushing
> changes from here to 2.4-for-marcelo on a case by case basis is more or
> less my plan.  Right now, these trees are all just perfect clones of
> Marcelo's tree.  As I make changes and commit things, I'll update the
> lists with the new stuff.
>
> > (I think all we agree that merging
> > performance enhancements at this point in 2.4 is not interesting)
>
> Well, I can actually make a fairly impressive argument for the iorl
> patch if you ask me.  It's your decision if you want it, but here's my
> take on the issue:
>
> 1)  Red Hat has shipped the iorl patch in our 2.4.9 Advanced Server 2.1
> release, our 2.4.18 based Advanced Server 2.1 for IPF release, and our
> current 2.4.21 based RHEL 3 release.  So, it's getting *lots* of testing
> outside the community.  Including things like all the specweb
> benchmarks, Oracle TPC benchmarks, and all of our customers are using
> this patch.  That's a *lot* of real world, beat it to death testing.
>
> 2)  As I recall, SuSE has shipped either their own iorl type patch, or
> they used one of our versions of the patch, don't really know which.
> But, I know they've got the same basic functionality.  Between 1 and 2,
> that greatly increases the field testing of the iorl patch and greatly
> reduces the breadth of testing of the regular kernel scsi stack.
>
> 3)  OK, you can call it a performance patch if you want, but that
> doesn't really do it justice.  So, here's a little history.  I used the
> lockmeter patch when I was writing the iorl patch to get a clue just how
> much difference it made.  Obviously, on single disk systems it makes
> very little difference (but it does make some difference because the
> host controller and scsi_request_fn use different locks even on a single
> device and that does help some).  But, on a test machine with 4 CPUs and
> something like 14 disks, without the patch a disk benchmark was using
> 100% of CPU and hitting something like 98% contention on the
> io_request_lock.  On the same system with the patch, benchmark CPU usage
> dropped to something like 12% and because we weren't spinning any more
> scsi_request_fn and make_request dropped off the profile radar entirely
> and contention was basically down to low single digits.  Considering my
> home server has 7 disks in a raid5 array, it certainly makes a
> difference to me ;-)
>
> 4)  The last issue.  2.6 already has individual host locks for drivers.
> The iorl patch for 2.4 adds the same thing.  So, adding the iorl patch
> to 2.4 makes it easier to have drivers be the same between 2.4 and 2.6.
> Right now it takes some fairly convoluted #ifdef statements to get the
> locking right in a driver that supports both 2.4 and 2.6.  Adding the
> iorl patch allows driver authors to basically state that they don't
> support anything prior to whatever version of 2.4 it goes into and
> remove a bunch of #ifdef crap.



* Re: [2.4] scsi per-host lock was Re: smp dead lock of io_request_lock/queue_lock patch
  2004-01-20  2:19           ` [2.4] scsi per-host lock was " Marcelo Tosatti
@ 2004-01-20  3:21             ` Doug Ledford
  0 siblings, 0 replies; 23+ messages in thread
From: Doug Ledford @ 2004-01-20  3:21 UTC (permalink / raw)
  To: Marcelo Tosatti; +Cc: Jens Axboe, hch, James Bottomley, linux-kernel, akpm

On Mon, 2004-01-19 at 21:19, Marcelo Tosatti wrote:
> Doug,
> 
> Where can I find a copy of uptodate iorl and mlqueue patchsets ?

Red Hat's 2.4.21-9.EL kernel?  ;-)

Seriously, I've already set up 3 new bk trees on linux-scsi.bkbits.net. 
They are 2.4-for-marcelo, 2.4-drivers, and 2.4-everything.  Right now
the trees are still just clones of your current tree.  The plan,
however, is to sort through all of the scsi patches we put into our
kernels and put them into the various trees.  Obviously, the everything
tree will have both the iorl and mlqueue patches plus the other stuff. 
The for-marcelo tree is stuff that is agreed to be appropriate for
mainline.  The drivers tree is where I'll start sticking driver updates
that come across linux-scsi for the 2.4 kernel.

> I can't say I relish splitting a global lock at this point in 2.4's
> lifetime, but it has been in Red Hat's and SuSE's trees for a long time.
> 
> One reason to have it is the mlqueue patch's dependency on it.  You said
> most drivers handle the BUSY/QUEUE_FULL hangs by themselves, and you
> mention "IBM zfcp" / Emulex, which don't.  What other drivers suffer
> from the same problem?

Well, that's a hard question to answer.  First, the problem only shows
up in drivers that actually enable tagged queuing and use a reasonably
deep queue depth (normally).  So that automatically rules out most of
the older ISA and crap hardware drivers, since they only do one command
at a time.  The better drivers (aic7xxx new and old, ncr53c8xx, sym1
and sym2, qla2x00, etc.) work around the problem.  It's mainly newer
drivers that vendors are starting to write, whose authors don't know
the history of the broken mid layer handling of QUEUE_FULL and BUSY
situations.  So I cited the two I know of off the top of my head; there
may be others, but I would actually have to look at the source code to
identify them (when 98% of the scsi market is QLogic and NCR/SYM and
aic7xxx, those other drivers fall through the cracks).  Of course, one
of the other issues is that each driver that tries to handle this
situation itself does it in a different manner, and the correctness of
those varying implementations is also an issue to consider.


-- 
  Doug Ledford <dledford@redhat.com>     919-754-3700 x44233
         Red Hat, Inc.
         1801 Varsity Dr.
         Raleigh, NC 27606




* Re: smp dead lock of io_request_lock/queue_lock patch
  2004-01-17 19:07             ` Doug Ledford
  2004-01-17 19:17               ` Christoph Hellwig
@ 2004-01-20  7:53               ` Jens Axboe
  1 sibling, 0 replies; 23+ messages in thread
From: Jens Axboe @ 2004-01-20  7:53 UTC (permalink / raw)
  To: Doug Ledford
  Cc: Christoph Hellwig, Marcelo Tosatti, Arjan Van de Ven,
	Martin Peschke3, Peter Yao, linux-kernel,
	linux-scsi mailing list, ihno

On Sat, Jan 17 2004, Doug Ledford wrote:
> On Sat, 2004-01-17 at 11:58, Christoph Hellwig wrote:
> > On Sat, Jan 17, 2004 at 08:10:00AM -0500, Doug Ledford wrote:
> > > 4)  The last issue.  2.6 already has individual host locks for drivers. 
> > > The iorl patch for 2.4 adds the same thing.  So, adding the iorl patch
> > > to 2.4 makes it easier to have drivers be the same between 2.4 and 2.6. 
> > > Right now it takes some fairly convoluted #ifdef statements to get the
> > > locking right in a driver that supports both 2.4 and 2.6.  Adding the
> > > iorl patch allows driver authors to basically state that they don't
> > > support anything prior to whatever version of 2.4 it goes into and
> > > remove a bunch of #ifdef crap.
> > 
> > Well, no.  For one thing, all the iorl patches miss the scsi_assign_lock
> > interface from 2.6, which makes drivers a big ifdef hell (especially as
> > the AS2.1 patch uses a different name for the lock than 3.0), and even
> > if it were there, the use of that function is strongly discouraged in
> > 2.6 in favour of just using the host_lock.
> 
> Yeah, I saw that too.  Of course, it's not like that's my fault :-P  I
> had the 2.4.9 version of the iorl patch before 2.5 had a host lock, so
> 2.5 *could* have followed the 2.4.9-iorl patch convention of using
> host->lock as the name instead of host->host_lock and then we wouldn't
> be here.  But, because 2.6 uses host->host_lock, and because usage of

First of all, ->lock is a _bad_ name. And secondly, we probably did this
at around the same time (the 2.5 block code was already "done" by the
time 2.5 opened, if you recall it was one of the first things merged in
2.5.1-pre1). The fact that I chose the better name (which is really a
rare incident, my names typically suck) is just history :-)

> the scsi_assign_lock() is discouraged, I made the iorl patch in RHEL3
> follow the 2.6 convention of using host->host_lock.  In hindsight, I
> should have just followed the 2.4.9-iorl patch convention.  Then a
> driver could have just done this:
> 
> #ifdef SCSI_HAS_HOST_LOCK
> #if LINUX_VERSION_CODE < KERNEL_VERSION(2,6,0)
> 	adapter->lock_ptr = &adapter->lock;
> 	host->lock = &adapter->lock;
> #else
> 	adapter->lock_ptr = &adapter->lock;
> 	host->host_lock = &adapter->lock;
> #endif
> #else
> 	adapter->lock_ptr = &io_request_lock;
> #endif
> 
> Then you just always use adapter->lock_ptr for spin locks in the driver
> and you magically work in all kernel releases.  Now, by going to
> host->host_lock in 2.4 we get rid of one of the if statements.  This
> isn't impossible to do if both Red Hat and SuSE just release their next
> update kernel with host->lock changed to host->host_lock.

I can certainly help make that happen, I completely agree it's needed.

-- 
Jens Axboe



* Re: smp dead lock of io_request_lock/queue_lock patch
  2004-01-17 13:10         ` Doug Ledford
  2004-01-17 16:58           ` Christoph Hellwig
  2004-01-20  2:19           ` [2.4] scsi per-host lock was " Marcelo Tosatti
@ 2004-01-25  0:31           ` Kurt Garloff
  2 siblings, 0 replies; 23+ messages in thread
From: Kurt Garloff @ 2004-01-25  0:31 UTC (permalink / raw)
  To: Doug Ledford
  Cc: Marcelo Tosatti, Christoph Hellwig, Arjan Van de Ven,
	Martin Peschke3, Jens Axboe, Peter Yao, linux-kernel,
	linux-scsi mailing list, ihno


Doug,

On Sat, Jan 17, 2004 at 08:10:00AM -0500, Doug Ledford wrote:
> On Tue, 2004-01-13 at 15:55, Marcelo Tosatti wrote:
> > Merging "straight" _bugfixes_ 
> 
> OK, I've got some new bk trees established on linux-scsi.bkbits.net. 
> There is a 2.4-for-marcelo, which is where I will stick stuff I think is
> ready to go to you.  There is a 2.4-drivers.  Then there is a
> 2.4-everything.  The everything tree will be a place to put any and all
> patches that aren't just experimental patches.  So, just about any Red
> Hat SCSI patch is fair game.  Same for the mods SuSE has made.

Not everything we have is straight bugfixes :-\

> Pushing
> changes from here to 2.4-for-marcelo on a case by case basis is more or
> less my plan.  Right now, these trees are all just perfect clones of
> Marcelo's tree.  As I make changes and commit things, I'll update the
> lists with the new stuff.

Find the collection of changes we've done to SCSI scanning at
http://www.suse.de/~garloff/linux/scsi-scan/scsi-scanning.html
(currently syncing, might take an hour before the patches are visible)

It contains:
* various blacklist updates
* various new parameters that influence the way SCSI scanning is done
  (one of the areas that generated an amazing number of support calls
   and HW vendor requests)
* a refactoring of the code to make it resemble 2.6
* support for REPORT_LUNS (backported from 2.6 as well)

The set of parameters is larger than in 2.6: you can instruct the code
not to use REPORT_LUNS at all, you can blacklist devices for REPORT_LUNS
(BLIST_NOREPLUN), and you can tell the code to try REPORT_LUNS on SCSI-2
devices (if the host adapter reports support for more than 8 LUNs, which
avoids the usual USB screwups).

I'd certainly appreciate it if this were merged into more 2.4 trees, and
I'll try to find the time to push some improvements into 2.6.  If
anybody is faster, I would not mind either ;-)

Destructive criticism appreciated ;-)

Doug, would you mind posting the BUSY and QUEUE_FULL fixes for review?

Regards,
-- 
Kurt Garloff  <garloff@suse.de>                            Cologne, DE 
SUSE LINUX AG, Nuernberg, DE                          SUSE Labs (Head)


