* Re: [patch] block/IDE/interrupt lockup
@ 2002-03-30 9:35 Manfred Spraul
2002-03-30 18:28 ` Andrew Morton
0 siblings, 1 reply; 7+ messages in thread
From: Manfred Spraul @ 2002-03-30 9:35 UTC (permalink / raw)
To: Andrew Morton; +Cc: linux-kernel, Marcelo Tosatti
> - spin_unlock_irq(&io_request_lock);
> + spin_unlock_irqrestore(&io_request_lock, flags);
> rq = kmem_cache_alloc(request_cachep, SLAB_KERNEL);
Great patch.
kmem_cache_alloc with SLAB_KERNEL can sleep, i.e. you've just converted
an obvious bug into a rare, difficult to find bug. What about trying to
fix it?
I agree that this won't happen during boot, but what about a hotplug PCI
ide controller?
--
Manfred
* Re: [patch] block/IDE/interrupt lockup
2002-03-30 9:35 [patch] block/IDE/interrupt lockup Manfred Spraul
@ 2002-03-30 18:28 ` Andrew Morton
2002-03-30 18:52 ` Alan Cox
0 siblings, 1 reply; 7+ messages in thread
From: Andrew Morton @ 2002-03-30 18:28 UTC (permalink / raw)
To: Manfred Spraul; +Cc: linux-kernel, Marcelo Tosatti
Manfred Spraul wrote:
>
> > - spin_unlock_irq(&io_request_lock);
> > + spin_unlock_irqrestore(&io_request_lock, flags);
> > rq = kmem_cache_alloc(request_cachep, SLAB_KERNEL);
>
> Great patch.
> kmem_cache_alloc with SLAB_KERNEL can sleep, i.e. you've just converted
> an obvious bug into a rare, difficult to find bug. What about trying to
> fix it?
Gimme a break, Manfred. The patch fixes the new bug. Which was
hardly obvious. The longstanding (as in years-old) bug was
pointed out to the maintainer.
It may not even be a bug. Certainly I don't think it's
worth my time to fiddle with it. But you're at liberty to.
> I agree that this won't happen during boot, but what about a hotplug PCI
> ide controller?
The kernel calls request_irq() inside cli() in lots of places.
That's the same bug: "if you called cli(), how come you're
allowing kmalloc to clear it?".
In 2.4, this is a design wart. In 2.5, it will go BUG() if
the page allocator performs I/O.
-
* Re: [patch] block/IDE/interrupt lockup
2002-03-30 18:28 ` Andrew Morton
@ 2002-03-30 18:52 ` Alan Cox
2002-03-30 19:06 ` Andrew Morton
0 siblings, 1 reply; 7+ messages in thread
From: Alan Cox @ 2002-03-30 18:52 UTC (permalink / raw)
To: Andrew Morton; +Cc: Manfred Spraul, linux-kernel, Marcelo Tosatti
> The kernel calls request_irq() inside cli() in lots of places.
> That's the same bug: "if you called cli(), how come you're
> allowing kmalloc to clear it?".
Those places should if possible be fixed. I take patches. If we can get 2.4
to BUG() on those kmalloc violations and clean them up it sounds like
progress.
* Re: [patch] block/IDE/interrupt lockup
2002-03-30 18:52 ` Alan Cox
@ 2002-03-30 19:06 ` Andrew Morton
2002-03-30 23:23 ` Keith Owens
0 siblings, 1 reply; 7+ messages in thread
From: Andrew Morton @ 2002-03-30 19:06 UTC (permalink / raw)
To: Alan Cox; +Cc: Manfred Spraul, linux-kernel, Marcelo Tosatti
Alan Cox wrote:
>
> > The kernel calls request_irq() inside cli() in lots of places.
> > That's the same bug: "if you called cli(), how come you're
> > allowing kmalloc to clear it?".
>
> Those places should if possible be fixed. I take patches. If we can get 2.4
> to BUG() on those kmalloc violations and clean them up it sounds like
> progress
What I'd like is a debugging function `can_sleep()'. This
is good for documentary purposes, and will catch bugs.
So kmalloc() would gain:
if (gfp_flags & __GFP_WAIT)
can_sleep();
can_sleep() would do the following:
- If CONFIG_PREEMPT, check the locking depth (minus BKL depth),
whine if non-zero.
- If inside cli(), whine.
- If inside __cli(), also whine (not really a bug, but a design error).
- whining will include generation of a backtrace.
I suspect a 2.4 version would generate too many bug reports :)
It would have to implement its own lock depth accounting if
we want the sleep-inside-spinlock checking.
There's some arch-dependent stuff in there. I'll do a 2.5
patch. I suspect it'll generate showers of stuff. We can
feed fixes back into 2.4.
-
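The checks listed above can be sketched as a small userspace model. This is only an illustration of the proposal, not kernel code: `struct model_ctx`, `model_can_sleep()`, `model_kmalloc_check()`, `model_whine_count()` and the value of `MODEL_GFP_WAIT` are all hypothetical names invented here, and the backtrace is reduced to a message on stderr.

```c
#include <assert.h>
#include <stdio.h>

#define MODEL_GFP_WAIT 0x10u   /* hypothetical flag value, for the model only */

/* Hypothetical snapshot of the caller's context. */
struct model_ctx {
    int lock_depth;     /* spinlocks held, BKL depth already subtracted */
    int irqs_disabled;  /* non-zero inside cli() or __cli() */
};

static int whine_count;

/* Stand-in for "whine and print a backtrace". */
static void whine(const char *why)
{
    whine_count++;
    fprintf(stderr, "can_sleep: %s\n", why);
}

/* Complain if the caller is in a context where sleeping is a bug. */
void model_can_sleep(const struct model_ctx *ctx)
{
    assert(ctx != NULL);
    if (ctx->lock_depth != 0)
        whine("called with a spinlock held");
    if (ctx->irqs_disabled)
        whine("called with interrupts disabled");
}

/* The hook kmalloc() would gain: only sleeping allocations are checked. */
void model_kmalloc_check(const struct model_ctx *ctx, unsigned int gfp_flags)
{
    if (gfp_flags & MODEL_GFP_WAIT)
        model_can_sleep(ctx);
}

int model_whine_count(void)
{
    return whine_count;
}
```

In this model an allocation without `MODEL_GFP_WAIT` is never checked, which mirrors the idea that atomic allocations are legal in these contexts.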
* Re: [patch] block/IDE/interrupt lockup
2002-03-30 19:06 ` Andrew Morton
@ 2002-03-30 23:23 ` Keith Owens
0 siblings, 0 replies; 7+ messages in thread
From: Keith Owens @ 2002-03-30 23:23 UTC (permalink / raw)
To: Andrew Morton; +Cc: linux-kernel
On Sat, 30 Mar 2002 11:06:25 -0800,
Andrew Morton <akpm@zip.com.au> wrote:
>What I'd like is a debugging function `can_sleep()'. This
>is good for documentary purposes, and will catch bugs.
>
>So kmalloc() would gain:
>
> if (gfp_flags & __GFP_WAIT)
> can_sleep();
can_sleep_if(gfp_flags & __GFP_WAIT) would be better. With debugging
off, can_sleep_if() is

	do { } while(0)

With debugging on, it is

	if (unlikely(condition)) {
		whine(__stringify(condition));
	}

One line instead of two, no references to variables when debugging is
off, automatically adds unlikely.
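A minimal userspace sketch of the macro described above, assuming a GCC-compatible compiler for `__builtin_expect`. The names `MODEL_DEBUG_SLEEP`, `model_stringify()`, `model_unlikely()` and `model_whine()` are hypothetical stand-ins for the kernel's config option, `__stringify()`, `unlikely()` and whatever reporting function would be used; here `model_whine()` only counts calls, where a real version would presumably check the calling context before complaining.

```c
#include <assert.h>
#include <stdio.h>

#define MODEL_DEBUG_SLEEP 1   /* set to 0 to compile the check away */

/* Two-level stringification, the same trick as the kernel's __stringify(). */
#define model_stringify_1(x) #x
#define model_stringify(x)   model_stringify_1(x)

#if defined(__GNUC__)
#define model_unlikely(x) __builtin_expect(!!(x), 0)
#else
#define model_unlikely(x) (x)
#endif

static int whine_count;

/* Stand-in reporter: records the call and prints the condition text. */
static void model_whine(const char *cond)
{
    assert(cond != NULL);
    whine_count++;
    fprintf(stderr, "may sleep: %s\n", cond);
}

#if MODEL_DEBUG_SLEEP
#define can_sleep_if(condition)                          \
    do {                                                 \
        if (model_unlikely(condition))                   \
            model_whine(model_stringify(condition));     \
    } while (0)
#else
/* The argument is never evaluated, so it may name variables that do not exist. */
#define can_sleep_if(condition) do { } while (0)
#endif
```

Wrapping the debug variant in `do { } while (0)` as well keeps both variants usable as a single statement, for example in an unbraced `if`/`else`.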
* Re: [patch] block/IDE/interrupt lockup
@ 2002-04-01 9:23 Manfred Spraul
0 siblings, 0 replies; 7+ messages in thread
From: Manfred Spraul @ 2002-04-01 9:23 UTC (permalink / raw)
To: Andrew Morton, linux-kernel, Marcelo Tosatti
[-- Attachment #1: Type: text/plain, Size: 454 bytes --]
I've attached an alternative patch:
ide assumes that blk_init_queue doesn't sleep or enable interrupts. As a
quick fix, make blk_grow_request_list() nonblocking: use both
spin_lock_irqsave() and SLAB_ATOMIC allocations. spin_lock_irqsave()
alone, with SLAB_KERNEL allocations, doesn't fix the problem.
The better fix would be cleaning up init_irq() in
drivers/ide/ide-probe.c, but that's something for 2.5 or for someone who
understands the ide code.
--
Manfred
[-- Attachment #2: patch-alternative --]
[-- Type: text/plain, Size: 1060 bytes --]
--- 2.4/drivers/block/ll_rw_blk.c Mon Apr 1 10:53:25 2002
+++ build-2.4/drivers/block/ll_rw_blk.c Mon Apr 1 11:00:21 2002
@@ -336,14 +336,17 @@
*/
int blk_grow_request_list(request_queue_t *q, int nr_requests)
{
- spin_lock_irq(&io_request_lock);
+ unsigned long flags;
+ /* Several broken drivers assume that this function doesn't sleep,
+ * this causes system hangs during boot.
+ * As a temporary fix, make the function non-blocking.
+ */
+ spin_lock_irqsave(&io_request_lock, flags);
while (q->nr_requests < nr_requests) {
struct request *rq;
int rw;
- spin_unlock_irq(&io_request_lock);
- rq = kmem_cache_alloc(request_cachep, SLAB_KERNEL);
- spin_lock_irq(&io_request_lock);
+ rq = kmem_cache_alloc(request_cachep, SLAB_ATOMIC);
if (rq == NULL)
break;
memset(rq, 0, sizeof(*rq));
@@ -356,7 +359,7 @@
q->batch_requests = q->nr_requests / 4;
if (q->batch_requests > 32)
q->batch_requests = 32;
- spin_unlock_irq(&io_request_lock);
+ spin_unlock_irqrestore(&io_request_lock, flags);
return q->nr_requests;
}
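One consequence of the non-blocking variant is that the allocation can fail where a sleeping one would have waited, so callers may get fewer requests than they asked for. The following userspace model illustrates that, under stated assumptions: `struct model_queue`, `model_alloc_atomic()` and `model_set_alloc_budget()` are hypothetical, the lock is not modelled, and the `nr_requests++` step is assumed from the part of the function the diff does not show.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical cut-down queue: only the two counters the patch touches. */
struct model_queue {
    int nr_requests;
    int batch_requests;
};

/* Hypothetical allocator: succeeds alloc_budget times, then fails at once. */
static int alloc_budget;

void model_set_alloc_budget(int n)
{
    alloc_budget = n;
}

static void *model_alloc_atomic(size_t size)
{
    if (alloc_budget <= 0)
        return NULL;            /* never waits for memory */
    alloc_budget--;
    return calloc(1, size);
}

/* Grow towards nr_requests; stop early if an allocation fails. */
int model_grow_request_list(struct model_queue *q, int nr_requests)
{
    assert(q != NULL);
    while (q->nr_requests < nr_requests) {
        void *rq = model_alloc_atomic(64);

        if (rq == NULL)
            break;
        free(rq);               /* the real code keeps rq on a free list */
        q->nr_requests++;       /* assumed; not visible in the diff context */
    }
    q->batch_requests = q->nr_requests / 4;
    if (q->batch_requests > 32)
        q->batch_requests = 32;
    return q->nr_requests;
}
```

The return value is the number of requests actually present, so a caller can tell a short allocation from a full one.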
* [patch] block/IDE/interrupt lockup
@ 2002-03-30 5:45 Andrew Morton
0 siblings, 0 replies; 7+ messages in thread
From: Andrew Morton @ 2002-03-30 5:45 UTC (permalink / raw)
To: Marcelo Tosatti; +Cc: Jens Axboe, Andre Hedrick, lkml
Marcelo,
my blk_grow_request_list() patch in -pre5 is buggy. It
can cause boot-time lockups. The window is fairly small,
but I just hit it.
drivers/ide/ide-probe.c:init_irq() does cli().
It calls down to blk_init_free_list() and
blk_grow_request_list().
blk_grow_request_list() does spin_unlock_irq(). Which
is illegal inside cli(). An interrupt comes in and
the CPU locks up in irq_enter(), spinning on global_irq_lock,
which this CPU already holds.
Below is the patch. (That's the last spin_lock_irq()
anyone will be seeing from me :))
Andre, init_irq() is somewhat broken - it appears to
be assuming that cli() will disable interrupts, but it's
calling functions which can sleep. If these functions
_do_ sleep, interrupts will be enabled, which is presumably
not what IDE wants to happen.
--- 2.4.19-pre5/drivers/block/ll_rw_blk.c~ide-lockup Fri Mar 29 21:19:11 2002
+++ 2.4.19-pre5-akpm/drivers/block/ll_rw_blk.c Fri Mar 29 21:20:04 2002
@@ -336,14 +336,16 @@ void generic_unplug_device(void *data)
*/
int blk_grow_request_list(request_queue_t *q, int nr_requests)
{
- spin_lock_irq(&io_request_lock);
+ unsigned long flags;
+
+ spin_lock_irqsave(&io_request_lock, flags);
while (q->nr_requests < nr_requests) {
struct request *rq;
int rw;
- spin_unlock_irq(&io_request_lock);
+ spin_unlock_irqrestore(&io_request_lock, flags);
rq = kmem_cache_alloc(request_cachep, SLAB_KERNEL);
- spin_lock_irq(&io_request_lock);
+ spin_lock_irqsave(&io_request_lock, flags);
if (rq == NULL)
break;
memset(rq, 0, sizeof(*rq));
@@ -356,7 +358,7 @@ int blk_grow_request_list(request_queue_
q->batch_requests = q->nr_requests / 4;
if (q->batch_requests > 32)
q->batch_requests = 32;
- spin_unlock_irq(&io_request_lock);
+ spin_unlock_irqrestore(&io_request_lock, flags);
return q->nr_requests;
}
-
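The difference the patch relies on can be shown with a small userspace model. It is a sketch under simplifying assumptions: a single flag stands in for the CPU's interrupt-enable state, the spinlock and the 2.4 global_irq_lock are not modelled, and every `model_*` name is a hypothetical stand-in for the kernel primitive it resembles.

```c
#include <assert.h>
#include <stdbool.h>

/* Userspace model only: one flag stands in for this CPU's interrupt-enable bit. */
static bool irqs_enabled = true;

/* Hypothetical stand-ins for the kernel primitives; the lock itself is not modelled. */
static void model_cli(void)             { irqs_enabled = false; }
static void model_spin_lock_irq(void)   { irqs_enabled = false; }
static void model_spin_unlock_irq(void) { irqs_enabled = true; }  /* always re-enables */

static unsigned long model_spin_lock_irqsave(void)
{
    unsigned long flags = irqs_enabled ? 1UL : 0UL;  /* remember the caller's state */

    irqs_enabled = false;
    return flags;
}

static void model_spin_unlock_irqrestore(unsigned long flags)
{
    irqs_enabled = (flags != 0);  /* put back whatever the caller had */
}

/* Old pairing, called from a cli() section as init_irq() does. */
bool irqs_on_after_irq_pair(void)
{
    model_cli();
    model_spin_lock_irq();
    model_spin_unlock_irq();
    return irqs_enabled;
}

/* Patched pairing, called from the same kind of section. */
bool irqs_on_after_irqsave_pair(void)
{
    unsigned long flags;

    model_cli();
    flags = model_spin_lock_irqsave();
    assert(!irqs_enabled);               /* still off while the lock is held */
    model_spin_unlock_irqrestore(flags);
    return irqs_enabled;
}
```

In the model, `irqs_on_after_irq_pair()` returns true, meaning interrupts were turned back on behind the caller's cli(), while `irqs_on_after_irqsave_pair()` returns false because the saved state is restored.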