* Re: smp dead lock of io_request_lock/queue_lock patch
@ 2004-01-12 15:07 Martin Peschke3
  2004-01-12 15:12 ` Arjan van de Ven
  0 siblings, 1 reply; 48+ messages in thread
From: Martin Peschke3 @ 2004-01-12 15:07 UTC (permalink / raw)
  To: Jens Axboe
  Cc: Doug Ledford, Arjan Van de Ven, Peter Yao, linux-kernel,
	linux-scsi mailing list, ihno


Hi,

I understand that people tend to be careful with optimizations
or enhancements. In that regard, this particular patch is a
disputable example.

But my request was meant in a broader sense.
This iorl-patch is just one out of roughly a dozen on
both RH's side and SuSE's side. I have been asking
for us to get together on the list and try to match and merge
these piles of patches, which are possibly not all as disputable
as the patch discussed in this thread, i.e. pure (partly
vintage) bugfixes.

If, in the course of that, people also agree on a clean, common
iorl patch, that would be another step forward, in my opinion.


Mit freundlichen Grüßen / with kind regards

Martin Peschke

IBM Deutschland Entwicklung GmbH
Linux for eServer Development
Phone: +49-(0)7031-16-2349


Jens Axboe <axboe@suse.de>@vger.kernel.org on 12/01/2004 15:13:30

Sent by:    linux-scsi-owner@vger.kernel.org


To:    Martin Peschke3/Germany/IBM@IBMDE
cc:    Doug Ledford <dledford@redhat.com>, Arjan Van de Ven
       <arjanv@redhat.com>, Peter Yao <peter@exavio.com.cn>,
       linux-kernel@vger.kernel.org, linux-scsi mailing list
       <linux-scsi@vger.kernel.org>
Subject:    Re: smp dead lock of io_request_lock/queue_lock patch


On Mon, Jan 12 2004, Martin Peschke3 wrote:
> Hi,
>
> is there a way to merge all (or at least the common denominator) of
> Redhat's and SuSE's changes into the vanilla 2.4 SCSI stack?
> The SCSI part of Marcelo's kernel seems to be rather backlevel,
> considering all those fixes and enhancements added by the mentioned
> distributors and their SCSI experts. As this discussion underlines,
> there are a lot of common problems and sometimes even common
> approaches.  I am convinced that a number of patches ought to be
> incorporated into the mainline kernel, though I must admit
> that I am at a loss as to how this could be achieved given the
> unresolved question of 2.4 SCSI maintenance
> (which has certainly played a part in building up those piles
> of SCSI patches).

I have mixed feelings about that. On one side, I'd love to merge the
scalability patches in mainline. We've had a significant number of bugs
in this area in the past, and it seems a shame that we all have to fix
them independently because we deviate from mainline. On the other hand,
2.4 is pretty much closed. There won't be a big number of new distro 2.4
kernels.

Had you asked me 6 months ago I probably would have advocated merging
them, but right now I think it's a waste of time.

--
Jens Axboe

-
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html





* Re: smp dead lock of io_request_lock/queue_lock patch
@ 2004-01-19 21:36 ` Martin Peschke3
  0 siblings, 0 replies; 48+ messages in thread
From: Martin Peschke3 @ 2004-01-19 21:36 UTC (permalink / raw)
  To: Doug Ledford
  Cc: Marcelo Tosatti, Christoph Hellwig, Arjan Van de Ven, Jens Axboe,
	Peter Yao, linux-kernel, linux-scsi mailing list, ihno

Doug Ledford wrote:
> On Tue, 2004-01-13 at 15:55, Marcelo Tosatti wrote:
> > What is this mlqueue patchset fixing ? To tell you the truth, I'm not
> > aware of the SCSI stack problems.
>
> OK.  This is a different problem entirely.  The current 2.4 scsi stack
> has a problem handling QUEUE_FULL and BUSY status commands.  Amongst
> other things, it retries the command immediately and will basically hang
> the SCSI bus if the low level driver doesn't intervene.  Most well
> written drivers have workaround code in them (including my aic7xxx_old
> driver). The mlqueue patch changes it so that the mid layer implements
> delayed retries of QUEUE_FULL and BUSY commands.  It also makes the mid
> layer recognize the situation where you get a QUEUE_FULL or BUSY command
> status with 0 outstanding commands.  This used to cause the mid layer to
> hang on that device because there would be no command completion to kick
> the queue back into gear.

http://marc.theaimsgroup.com/?t=106001205900001&r=1&w=2
We also stumbled over this problem last year. Our patch wasn't
as sophisticated as Doug's new patch. However, it worked,
and some fix is certainly needed.

> Now, with the mlqueue patch, it detects 0
> outstanding commands and sets a 1/5 second timer that restarts command
> flow after the delay.  This solved several IBM bugs in one go.

This includes a panic
http://marc.theaimsgroup.com/?t=105167914900002&r=1&w=2
, hung SCSI hosts due to an error handling deadlock related to
commands snoozing in the mlqueue (can't find a link),
as well as the problem mentioned above.

> Again,
> this is a problem that older drivers with workaround code didn't need,
> but newer drivers such as the IBM zfcp (is that right?) and the open
> source Emulex driver don't have the standard workarounds for the bug and
> were getting hit by it.

Doug, you are right with regard to zfcp. But given the above list
of issues, I don't see how we could have fully worked around them :-)

> By fixing the problem properly, a lot of
> drivers could actually remove their workaround code and use the mid
> layer code.  The first version of this patch just solves the hang and
> immediate retry problem.  Ongoing work will fix the patch to do more
> things correctly, such as replaying commands in their original order,
> stuff like that.

Sounds good.


Mit freundlichen Grüßen / with kind regards

Martin Peschke

IBM Deutschland Entwicklung GmbH
Linux for eServer Development
Phone: +49-(0)7031-16-2349


Doug Ledford <dledford@redhat.com> on 17/01/2004 14:10:00

To:    Marcelo Tosatti <marcelo.tosatti@cyclades.com>
cc:    Christoph Hellwig <hch@infradead.org>, Arjan Van de Ven
       <arjanv@redhat.com>, Martin Peschke3/Germany/IBM@IBMDE, Jens Axboe
       <axboe@suse.de>, Peter Yao <peter@exavio.com.cn>,
       linux-kernel@vger.kernel.org, linux-scsi mailing list
       <linux-scsi@vger.kernel.org>, ihno@suse.de
Subject:    Re: smp dead lock of io_request_lock/queue_lock patch


On Tue, 2004-01-13 at 15:55, Marcelo Tosatti wrote:
> On Mon, 12 Jan 2004, Doug Ledford wrote:
>
> > On Mon, 2004-01-12 at 14:48, Christoph Hellwig wrote:
> > > On Mon, Jan 12, 2004 at 04:12:30PM +0100, Arjan van de Ven wrote:
> > > > > as the patch discussed in this thread, i.e. pure (partially
> > > > > vintage) bugfixes.
> > > >
> > > > Both SuSE and Red Hat submit bugfixes they put in the respective
> > > > trees to marcelo already. There will not be many "pure bugfixes"
> > > > that you can find in vendor trees but not in marcelo's tree.
> > >
> > > I haven't seen SCSI patch submissions for 2.4 from the vendors on
> > > linux-scsi for ages.  In fact I asked Jens & Doug two times whether
> > > they could sort out the distro patches for the 2.4 stack and post
> > > them, but it seems they're busy enough with real work so this never
> > > happened.
> >
> > More or less.  But part of it also is that a lot of the patches I've
> > written are on top of other patches that people don't want (aka, the
> > iorl patch).  I've got a mlqueue patch that actually *really* should go
> > into mainline because it solves a slew of various problems in one go,
> > but the current version of the patch depends on some semantic changes
> > made by the iorl patch.  So, sorting things out can sometimes be
> > difficult.  But, I've been told to go ahead and do what I can as far as
> > getting the stuff out, so I'm taking some time to try and get a bk tree
> > out there so people can see what I'm talking about.  Once I've got the
> > full tree out there, then it might be possible to start picking and
> > choosing which things to port against mainline so that they don't
> > depend on patches like the iorl patch.
>
> Hi,
>
> Merging "straight" _bugfixes_

OK, I've got some new bk trees established on linux-scsi.bkbits.net.
There is a 2.4-for-marcelo, which is where I will stick stuff I think is
ready to go to you.  There is a 2.4-drivers.  Then there is a
2.4-everything.  The everything tree will be a place to put any and all
patches that aren't just experimental patches.  So, just about any Red
Hat SCSI patch is fair game.  Same for the mods SuSE has made.  Pushing
changes from here to 2.4-for-marcelo on a case by case basis is more or
less my plan.  Right now, these trees are all just perfect clones of
Marcelo's tree.  As I make changes and commit things, I'll update the
lists with the new stuff.

> (I think we all agree that merging
> performance enhancements at this point in 2.4 is not interesting)

Well, I can actually make a fairly impressive argument for the iorl
patch if you ask me.  It's your decision if you want it, but here's my
take on the issue:

1)  Red Hat has shipped the iorl patch in our 2.4.9 Advanced Server 2.1
release, our 2.4.18 based Advanced Server 2.1 for IPF release, and our
current 2.4.21 based RHEL 3 release.  So, it's getting *lots* of testing
outside the community, including things like all the specweb and Oracle
TPC benchmarks; and all of our customers are using this patch.  That's
a *lot* of real world, beat-it-to-death testing.

2)  As I recall, SuSE has shipped either their own iorl type patch, or
they used one of our versions of the patch, don't really know which.
But, I know they've got the same basic functionality.  Between 1 and 2,
that greatly increases the field testing of the iorl patch and greatly
reduces the breadth of testing of the regular kernel scsi stack.

3)  OK, you can call it a performance patch if you want, but that
doesn't really do it justice.  So, here's a little history.  I used the
lockmeter patch when I was writing the iorl patch to get a clue just how
much difference it made.  Obviously, on single disk systems it makes
very little difference (but it does make some difference because the
host controller and scsi_request_fn use different locks even on a single
device and that does help some).  But, on a test machine with 4 CPUs and
something like 14 disks, without the patch a disk benchmark was using
100% of CPU and hitting something like 98% contention on the
io_request_lock.  On the same system with the patch, benchmark CPU usage
dropped to something like 12% and because we weren't spinning any more
scsi_request_fn and make_request dropped off the profile radar entirely
and contention was basically down to low single digits.  Considering my
home server has 7 disks in a raid5 array, it certainly makes a
difference to me ;-)

4)  The last issue.  2.6 already has individual host locks for drivers.
The iorl patch for 2.4 adds the same thing.  So, adding the iorl patch
to 2.4 makes it easier to have drivers be the same between 2.4 and 2.6.
Right now it takes some fairly convoluted #ifdef statements to get the
locking right in a driver that supports both 2.4 and 2.6.  Adding the
iorl patch allows driver authors to basically state that they don't
support anything prior to whatever version of 2.4 it goes into and
remove a bunch of #ifdef crap.

>  which,
> as you said, get reviewed by Jens/James/others is OK.

That's the reason for the bk trees.  Anyone can check them out that way.

> What is this mlqueue patchset fixing ? To tell you the truth, I'm not
> aware of the SCSI stack problems.

OK.  This is a different problem entirely.  The current 2.4 scsi stack
has a problem handling QUEUE_FULL and BUSY status commands.  Amongst
other things, it retries the command immediately and will basically hang
the SCSI bus if the low level driver doesn't intervene.  Most well
written drivers have workaround code in them (including my aic7xxx_old
driver).  The mlqueue patch changes it so that the mid layer implements
delayed retries of QUEUE_FULL and BUSY commands.  It also makes the mid
layer recognize the situation where you get a QUEUE_FULL or BUSY command
status with 0 outstanding commands.  This used to cause the mid layer to
hang on that device because there would be no command completion to kick
the queue back into gear.  Now, with the mlqueue patch, it detects 0
outstanding commands and sets a 1/5 second timer that restarts command
flow after the delay.  This solved several IBM bugs in one go.  Again,
this is a problem that older drivers with workaround code didn't need,
but newer drivers such as the IBM zfcp (is that right?) and the open
source Emulex driver don't have the standard workarounds for the bug and
were getting hit by it.  By fixing the problem properly, a lot of
drivers could actually remove their workaround code and use the mid
layer code.  The first version of this patch just solves the hang and
immediate retry problem.  Ongoing work will fix the patch to do more
things correctly, such as replaying commands in their original order,
stuff like that.

--
  Doug Ledford <dledford@redhat.com>     919-754-3700 x44233
         Red Hat, Inc.
         1801 Varsity Dr.
         Raleigh, NC 27606







[parent not found: <1d6yN-6HH-17@gated-at.bofh.it>]
* smp dead lock of io_request_lock/queue_lock patch
@ 2004-01-12 16:32 Peter Yao
  2004-01-12  9:08 ` Arjan van de Ven
  0 siblings, 1 reply; 48+ messages in thread
From: Peter Yao @ 2004-01-12 16:32 UTC (permalink / raw)
  To: linux-kernel; +Cc: linux-scsi

Hi,
I found an SMP deadlock in the io_request_lock/queue_lock patch in Red
Hat's 2.4.18-4 kernel. I don't know the current status of this patch,
so I'll just put my fix for it here. :)
The deadlock is between the scsi host->lock and the scsi q->queue_lock,
between scsi_restart_operations@scsi_error.c and
scsi_request_fn@scsi_lib.c.


Index: scsi_error.c
===================================================================
RCS file:
/home/cvsroot/ieee1394_driver/linux-2.4.18-3/drivers/scsi/scsi_error.c,v
retrieving revision 1.13
retrieving revision 1.13.8.1
diff -Llinux-2.4.18-3/drivers/scsi/scsi_error.c
-Llinux-2.4.18-3/drivers/scsi/scsi_error.c -u -d -r1.13 -r1.13.8.1
--- linux-2.4.18-3/drivers/scsi/scsi_error.c
+++ linux-2.4.18-3/drivers/scsi/scsi_error.c
@@ -1293,11 +1293,11 @@
 			break;
 		}
 		q = &SDpnt->request_queue;
-		spin_lock(q->queue_lock);
 		spin_unlock(host->lock);
+		spin_lock(q->queue_lock);
 		q->request_fn(q);
-		spin_lock(host->lock);
 		spin_unlock(q->queue_lock);
+		spin_lock(host->lock);
 	}
 	spin_unlock_irqrestore(host->lock, flags);
 }


* Re: smp dead lock of io_request_lock/queue_lock patch
@ 2004-01-12 14:07 ` Martin Peschke3
  0 siblings, 0 replies; 48+ messages in thread
From: Martin Peschke3 @ 2004-01-12 14:07 UTC (permalink / raw)
  To: Doug Ledford
  Cc: Jens Axboe, Arjan Van de Ven, Peter Yao, linux-kernel,
	linux-scsi mailing list

Hi,

is there a way to merge all (or at least the common denominator) of
Redhat's and SuSE's changes into the vanilla 2.4 SCSI stack?
The SCSI part of Marcelo's kernel seems to be rather backlevel,
considering all those fixes and enhancements added by the mentioned
distributors and their SCSI experts. As this discussion underlines,
there are a lot of common problems and sometimes even common
approaches.  I am convinced that a number of patches ought to be
incorporated into the mainline kernel, though I must admit
that I am at a loss as to how this could be achieved given the
unresolved question of 2.4 SCSI maintenance
(which has certainly played a part in building up those piles
of SCSI patches).

Mit freundlichen Grüßen / with kind regards

Martin Peschke

IBM Deutschland Entwicklung GmbH
Linux for eServer Development
Phone: +49-(0)7031-16-2349


Doug Ledford <dledford@redhat.com>@vger.kernel.org on 12/01/2004 14:27:53

Sent by:    linux-scsi-owner@vger.kernel.org


To:    Jens Axboe <axboe@suse.de>
cc:    Arjan Van de Ven <arjanv@redhat.com>, Peter Yao
       <peter@exavio.com.cn>, linux-kernel@vger.kernel.org, linux-scsi
       mailing list <linux-scsi@vger.kernel.org>
Subject:    Re: smp dead lock of io_request_lock/queue_lock patch


On Mon, 2004-01-12 at 04:22, Jens Axboe wrote:
> On Mon, Jan 12 2004, Arjan van de Ven wrote:
> > On Mon, Jan 12, 2004 at 10:19:46AM +0100, Jens Axboe wrote:
> > >
> > > ... and still exists in your 2.4.21 based kernel.
> >
> > The RHL 2.4.21 kernels don't have the locking patch at all...
>
> But RHEL3 does:
>
> http://kernelnewbies.org/kernels/rhel3/SOURCES/linux-2.4.21-iorl.patch
>
> and the bug is there.

But in RHEL3 the bug is fixed already (not in a released kernel, but the
fix went into our internal kernel some time back and will be in our next
update kernel).  From my internal bk tree for this stuff:

[dledford@compaq RHEL3-scsi]$ bk changes -r1.23
ChangeSet@1.23, 2003-11-10 17:19:54-05:00, dledford@compaq.xsintricity.com
  drivers/scsi/scsi_error.c
      Don't panic if the eh thread is dead, instead do the same thing that
      scsi_softirq_handler does and just complete the command as a failed
      command.
      Change when we wake the eh thread in scsi_times_out to accommodate
      the changes to the mlqueue operations.
      Clear blocked status on the host and all devices in
      scsi_restart_operations.
->    Don't grab the host_lock in scsi_restart_operations, we aren't doing
      anything that needs it.  Just goose the queues unconditionally,
      scsi_request_fn() will know to not send commands if they shouldn't
      go for some reason.
      Make sure we account SCSI_STATE_MLQUEUE commands as not being failed
      commands in scsi_unjam_host.

But, Jens is right, it's a real bug.  I just fixed it in a different
way.  And my fix is dependent on other changes in our scsi stack as
well.

--
  Doug Ledford <dledford@redhat.com>     919-754-3700 x44233
         Red Hat, Inc.
         1801 Varsity Dr.
         Raleigh, NC 27606








end of thread, other threads:[~2004-03-08 21:27 UTC | newest]

Thread overview: 48+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2004-01-12 15:07 smp dead lock of io_request_lock/queue_lock patch Martin Peschke3
2004-01-12 15:12 ` Arjan van de Ven
2004-01-12 19:48   ` Christoph Hellwig
2004-01-12 19:51     ` Doug Ledford
2004-01-12 20:03       ` Christoph Hellwig
2004-01-12 21:12         ` Jens Axboe
2004-01-13 20:55       ` Marcelo Tosatti
2004-01-17 13:10         ` Doug Ledford
2004-01-17 16:58           ` Christoph Hellwig
2004-01-17 19:07             ` Doug Ledford
2004-01-17 19:17               ` Christoph Hellwig
2004-01-17 19:21                 ` Doug Ledford
2004-01-17 19:29                   ` Christoph Hellwig
2004-01-17 20:36                     ` Doug Ledford
2004-01-17 20:54                       ` Christoph Hellwig
2004-01-20  7:53               ` Jens Axboe
2004-01-20  2:19           ` [2.4] scsi per-host lock was " Marcelo Tosatti
2004-01-20  3:21             ` Doug Ledford
2004-01-25  0:31           ` Kurt Garloff
2004-01-15 17:17       ` Bill Davidsen
2004-01-17 13:12         ` Doug Ledford
2004-01-17 15:16           ` Bill Davidsen
2004-01-17 16:07             ` Doug Ledford
  -- strict thread matches above, loose matches on Subject: below --
2004-01-19 21:36 Martin Peschke3
2004-01-19 21:36 ` Martin Peschke3
2004-03-08 21:25 ` Doug Ledford
     [not found] <1d6yN-6HH-17@gated-at.bofh.it>
     [not found] ` <1dasC-5Ww-5@gated-at.bofh.it>
     [not found]   ` <1ejkf-724-13@gated-at.bofh.it>
     [not found]     ` <1elvB-Jt-25@gated-at.bofh.it>
2004-01-16 15:40       ` Bill Davidsen
2004-01-12 16:32 Peter Yao
2004-01-12  9:08 ` Arjan van de Ven
2004-01-12  9:19   ` Jens Axboe
2004-01-12  9:19     ` Jens Axboe
2004-01-12  9:20       ` Arjan van de Ven
2004-01-12  9:22         ` Jens Axboe
2004-01-12 13:27           ` Doug Ledford
2004-01-15 17:01             ` Bill Davidsen
2004-01-15 17:05               ` Jens Axboe
2004-01-15 17:09               ` Arjan van de Ven
2004-01-15 19:30               ` Doug Ledford
2004-01-12 14:07 Martin Peschke3
2004-01-12 14:07 ` Martin Peschke3
2004-01-12 14:11 ` Arjan van de Ven
2004-01-12 14:13 ` Jens Axboe
2004-01-12 15:08   ` Doug Ledford
2004-01-12 15:24     ` James Bottomley
2004-01-12 15:43       ` Jens Axboe
2004-01-12 15:52         ` Doug Ledford
2004-01-12 16:04           ` James Bottomley
2004-01-12 16:05             ` Doug Ledford
