linux-kernel.vger.kernel.org archive mirror
* Re: [patch] block/IDE/interrupt lockup
@ 2002-03-30  9:35 Manfred Spraul
  2002-03-30 18:28 ` Andrew Morton
  0 siblings, 1 reply; 7+ messages in thread
From: Manfred Spraul @ 2002-03-30  9:35 UTC (permalink / raw)
  To: Andrew Morton; +Cc: linux-kernel, Marcelo Tosatti

> -	spin_unlock_irq(&io_request_lock);
> +	spin_unlock_irqrestore(&io_request_lock, flags);
>  	rq = kmem_cache_alloc(request_cachep, SLAB_KERNEL);

Great patch.
kmem_cache_alloc with SLAB_KERNEL can sleep, i.e. you've just converted
an obvious bug into a rare, difficult to find bug. What about trying to
fix it?

I agree that this won't happen during boot, but what about a hotplug PCI
ide controller?

--
    Manfred



* Re: [patch] block/IDE/interrupt lockup
  2002-03-30  9:35 [patch] block/IDE/interrupt lockup Manfred Spraul
@ 2002-03-30 18:28 ` Andrew Morton
  2002-03-30 18:52   ` Alan Cox
  0 siblings, 1 reply; 7+ messages in thread
From: Andrew Morton @ 2002-03-30 18:28 UTC (permalink / raw)
  To: Manfred Spraul; +Cc: linux-kernel, Marcelo Tosatti

Manfred Spraul wrote:
> 
> > -     spin_unlock_irq(&io_request_lock);
> > +     spin_unlock_irqrestore(&io_request_lock, flags);
> >       rq = kmem_cache_alloc(request_cachep, SLAB_KERNEL);
> 
> Great patch.
> kmem_cache_alloc with SLAB_KERNEL can sleep, i.e. you've just converted
> an obvious bug into a rare, difficult to find bug. What about trying to
> fix it?

Gimme a break, Manfred.  The patch fixes the new bug. Which was
hardly obvious.  The longstanding (as in years-old) bug was
pointed out to the maintainer.  

It may not even be a bug.  Certainly I don't think it's
worth my time to fiddle with it.  But you're at liberty to.

> I agree that this won't happen during boot, but what about a hotplug PCI
> ide controller?

The kernel calls request_irq() inside cli() in lots of places.
That's the same bug: "if you called cli(), how come you're
allowing kmalloc to clear it?".

In 2.4, this is a design wart.  In 2.5, it will go BUG() if
the page allocator performs I/O.

-


* Re: [patch] block/IDE/interrupt lockup
  2002-03-30 18:28 ` Andrew Morton
@ 2002-03-30 18:52   ` Alan Cox
  2002-03-30 19:06     ` Andrew Morton
  0 siblings, 1 reply; 7+ messages in thread
From: Alan Cox @ 2002-03-30 18:52 UTC (permalink / raw)
  To: Andrew Morton; +Cc: Manfred Spraul, linux-kernel, Marcelo Tosatti

> The kernel calls request_irq() inside cli() in lots of places.
> That's the same bug: "if you called cli(), how come you're
> allowing kmalloc to clear it?".

Those places should if possible be fixed. I take patches. If we can get 2.4
to BUG() on those kmalloc violations and clean them up it sounds like
progress


* Re: [patch] block/IDE/interrupt lockup
  2002-03-30 18:52   ` Alan Cox
@ 2002-03-30 19:06     ` Andrew Morton
  2002-03-30 23:23       ` Keith Owens
  0 siblings, 1 reply; 7+ messages in thread
From: Andrew Morton @ 2002-03-30 19:06 UTC (permalink / raw)
  To: Alan Cox; +Cc: Manfred Spraul, linux-kernel, Marcelo Tosatti

Alan Cox wrote:
> 
> > The kernel calls request_irq() inside cli() in lots of places.
> > That's the same bug: "if you called cli(), how come you're
> > allowing kmalloc to clear it?".
> 
> Those places should if possible be fixed. I take patches. If we can get 2.4
> to BUG() on those kmalloc violations and clean them up it sounds like
> progress

What I'd like is a debugging function `can_sleep()'.  This
is good for documentary purposes, and will catch bugs.

So kmalloc() would gain:

	if (gfp_flags & __GFP_WAIT)
		can_sleep();

can_sleep() would do the following:

- If CONFIG_PREEMPT, check the locking depth (minus BKL depth),
  whine if non-zero.

- If inside cli(), whine.

- If inside __cli(), also whine (not really a bug, but a design error).

- whining will include generation of a backtrace.

I suspect a 2.4 version would generate too many bug reports :)
It would have to implement its own lock depth accounting if
we want the sleep-inside-spinlock checking.

There's some arch-dependent stuff in there.  I'll do a 2.5
patch.  I suspect it'll generate showers of stuff.  We can
feed fixes back into 2.4.
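
The checks listed above could look roughly like the following. This is a userspace sketch only, not a kernel patch: struct ctx_model, whine(), MODEL_GFP_WAIT and model_kmalloc_check() are hypothetical names made up for illustration, and a real version would read the lock depth and interrupt state from arch-dependent kernel state instead of a struct passed in by the caller.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical snapshot of the calling context (illustration only). */
struct ctx_model {
	int  lock_depth;	/* spinlocks held, BKL depth already subtracted */
	bool in_global_cli;	/* caller is inside cli() */
	bool in_local_cli;	/* caller is inside __cli() */
};

static int whine_count;

static void whine(const char *why)
{
	whine_count++;
	fprintf(stderr, "can_sleep: %s\n", why);
	/* A kernel version would also generate a backtrace here. */
}

/* Complain about every reason the caller should not be sleeping. */
static void can_sleep(const struct ctx_model *c)
{
	assert(c->lock_depth >= 0);
	if (c->lock_depth != 0)
		whine("sleeping with a spinlock held");
	if (c->in_global_cli)
		whine("sleeping inside cli()");
	if (c->in_local_cli)
		whine("sleeping inside __cli() (design error)");
}

#define MODEL_GFP_WAIT 0x01u	/* hypothetical stand-in for __GFP_WAIT */

/* The check kmalloc() would gain: only blocking allocations are tested. */
static void model_kmalloc_check(const struct ctx_model *c, unsigned gfp_flags)
{
	if (gfp_flags & MODEL_GFP_WAIT)
		can_sleep(c);
}
```

In this model a context holding one spinlock inside cli() produces two complaints for a blocking allocation and none for a non-blocking one.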

-


* Re: [patch] block/IDE/interrupt lockup
  2002-03-30 19:06     ` Andrew Morton
@ 2002-03-30 23:23       ` Keith Owens
  0 siblings, 0 replies; 7+ messages in thread
From: Keith Owens @ 2002-03-30 23:23 UTC (permalink / raw)
  To: Andrew Morton; +Cc: linux-kernel

On Sat, 30 Mar 2002 11:06:25 -0800, 
Andrew Morton <akpm@zip.com.au> wrote:
>What I'd like is a debugging function `can_sleep()'.  This
>is good for documentary purposes, and will catch bugs.
>
>So kmalloc() would gain:
>
>	if (gfp_flags & __GFP_WAIT)
>		can_sleep();

can_sleep_if(gfp_flags & __GFP_WAIT) would be better.  With debugging
off, can_sleep_if() is
  do { } while(0)
With debugging on, it is
  if (unlikely(condition)) {
  	whine(__stringify(condition));
  }

One line instead of two, no references to variables when debugging is
off, automatically adds unlikely.
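
Spelled out as a macro, the suggestion might look like the sketch below. It is a userspace approximation under stated assumptions: MODEL_DEBUG_SLEEP, MODEL_UNLIKELY, MODEL_STRINGIFY and whine() are hypothetical stand-ins (the kernel has its own unlikely() and __stringify()), and it follows the outline literally, so it complains whenever the condition is true. A real checker would presumably also test whether the current context is allowed to sleep.

```c
#include <stdio.h>

#ifndef MODEL_DEBUG_SLEEP
#define MODEL_DEBUG_SLEEP 1	/* hypothetical config switch; 0 compiles the check away */
#endif

/* Branch hint where the compiler supports it, plain expression otherwise. */
#if defined(__GNUC__)
#define MODEL_UNLIKELY(x) __builtin_expect(!!(x), 0)
#else
#define MODEL_UNLIKELY(x) (x)
#endif

/* Single-level stringification: the condition text is printed as written. */
#define MODEL_STRINGIFY(x) #x

static int whine_count;

static void whine(const char *condition)
{
	whine_count++;
	fprintf(stderr, "may sleep: %s\n", condition);
}

#if MODEL_DEBUG_SLEEP
#define can_sleep_if(condition)					\
	do {							\
		if (MODEL_UNLIKELY(condition))			\
			whine(MODEL_STRINGIFY(condition));	\
	} while (0)
#else
/* Debugging off: the condition is never evaluated or even referenced. */
#define can_sleep_if(condition) do { } while (0)
#endif
```

A call site is then the single line can_sleep_if(gfp_flags & __GFP_WAIT); and with the switch set to 0 the expression disappears from the generated code.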



* Re: [patch] block/IDE/interrupt lockup
@ 2002-04-01  9:23 Manfred Spraul
  0 siblings, 0 replies; 7+ messages in thread
From: Manfred Spraul @ 2002-04-01  9:23 UTC (permalink / raw)
  To: Andrew Morton, linux-kernel, Marcelo Tosatti

[-- Attachment #1: Type: text/plain, Size: 454 bytes --]

I've attached an alternative patch:
ide assumes that blk_init_queue doesn't sleep or enable interrupts. As a
quick fix, make blk_grow_request_list() nonblocking by using both
spin_lock_irqsave() and SLAB_ATOMIC allocations. spin_lock_irqsave()
alone, with SLAB_KERNEL allocations, doesn't fix the problem, because
the allocation can still sleep.

The better fix would be cleaning up init_irq() in
drivers/ide/ide-probe.c, but that's something for 2.5 or someone who
understands the ide code.
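
One consequence of a non-blocking allocation is that it can fail where a blocking one would wait, so the function may return fewer requests than were asked for. The sketch below models that behaviour in userspace. It is not the kernel code: struct model_request, struct model_queue, atomic_budget and model_alloc_atomic() are hypothetical, the free list is simplified to a single linked list, and the locking is only indicated in a comment.

```c
#include <stdlib.h>
#include <string.h>

struct model_request {
	struct model_request *next;
	int rw;
};

struct model_queue {
	struct model_request *free_list;
	int nr_requests;
	int batch_requests;
};

/* Hypothetical atomic allocator: fails once the budget is used up
 * instead of sleeping until memory becomes available. */
static int atomic_budget = 1000;

static struct model_request *model_alloc_atomic(void)
{
	if (atomic_budget <= 0)
		return NULL;
	atomic_budget--;
	return malloc(sizeof(struct model_request));
}

/* Same loop shape as the patched function; the real one holds
 * io_request_lock with interrupts saved across the whole loop. */
static int model_grow_request_list(struct model_queue *q, int nr_requests)
{
	while (q->nr_requests < nr_requests) {
		struct model_request *rq = model_alloc_atomic();

		if (rq == NULL)
			break;		/* give up, never wait */
		memset(rq, 0, sizeof(*rq));
		rq->next = q->free_list;
		q->free_list = rq;
		q->nr_requests++;
	}
	q->batch_requests = q->nr_requests / 4;
	if (q->batch_requests > 32)
		q->batch_requests = 32;
	return q->nr_requests;
}
```

A caller that needs a minimum number of requests has to compare the return value with what it asked for.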

--
	Manfred

[-- Attachment #2: patch-alternative --]
[-- Type: text/plain, Size: 1060 bytes --]

--- 2.4/drivers/block/ll_rw_blk.c	Mon Apr  1 10:53:25 2002
+++ build-2.4/drivers/block/ll_rw_blk.c	Mon Apr  1 11:00:21 2002
@@ -336,14 +336,17 @@
  */
 int blk_grow_request_list(request_queue_t *q, int nr_requests)
 {
-	spin_lock_irq(&io_request_lock);
+	unsigned long flags;
+	/* Several broken drivers assume that this function doesn't sleep,
+	 * this causes system hangs during boot.
+	 * As a temporary fix, make the function non-blocking.
+	 */
+	spin_lock_irqsave(&io_request_lock, flags);
 	while (q->nr_requests < nr_requests) {
 		struct request *rq;
 		int rw;
 
-		spin_unlock_irq(&io_request_lock);
-		rq = kmem_cache_alloc(request_cachep, SLAB_KERNEL);
-		spin_lock_irq(&io_request_lock);
+		rq = kmem_cache_alloc(request_cachep, SLAB_ATOMIC);
 		if (rq == NULL)
 			break;
 		memset(rq, 0, sizeof(*rq));
@@ -356,7 +359,7 @@
 	q->batch_requests = q->nr_requests / 4;
 	if (q->batch_requests > 32)
 		q->batch_requests = 32;
-	spin_unlock_irq(&io_request_lock);
+	spin_unlock_irqrestore(&io_request_lock, flags);
 	return q->nr_requests;
 }
 


* [patch] block/IDE/interrupt lockup
@ 2002-03-30  5:45 Andrew Morton
  0 siblings, 0 replies; 7+ messages in thread
From: Andrew Morton @ 2002-03-30  5:45 UTC (permalink / raw)
  To: Marcelo Tosatti; +Cc: Jens Axboe, Andre Hedrick, lkml

Marcelo,

my blk_grow_request_list() patch in -pre5 is buggy.  It
can cause boot-time lockups.  The window is fairly small,
but I just hit it.

drivers/ide/ide-probe.c:init_irq() does cli().
It calls down to blk_init_free_list() and
blk_grow_request_list().

blk_grow_request_list() does spin_unlock_irq().  Which
is illegal inside cli().  An interrupt comes in and
the CPU locks up in irq_enter(), spinning on global_irq_lock,
which this CPU already holds.
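
The difference between the two unlock variants can be shown with a toy model. This is a single-CPU userspace sketch, not kernel code: the one irqs_enabled flag and all model_* functions are hypothetical, and the global_irq_lock side of cli() on SMP is not modelled at all. It only shows that the _irq pair turns interrupts back on unconditionally, while the irqsave/irqrestore pair puts back whatever state the caller had.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy single-CPU model: one "interrupts enabled" bit. */
static bool irqs_enabled = true;

/* Stand-in for cli(): the caller wants interrupts off until it says so. */
static void model_cli(void) { irqs_enabled = false; }

/* spin_lock_irq()/spin_unlock_irq() style: unconditional off, then on. */
static void model_lock_irq(void)   { irqs_enabled = false; }
static void model_unlock_irq(void) { irqs_enabled = true; }

/* spin_lock_irqsave()/spin_unlock_irqrestore() style: remember, restore. */
static unsigned long model_lock_irqsave(void)
{
	unsigned long flags = irqs_enabled ? 1UL : 0UL;

	irqs_enabled = false;
	return flags;
}

static void model_unlock_irqrestore(unsigned long flags)
{
	irqs_enabled = (flags != 0);
}

/* Caller did cli(); callee uses the _irq pair. Returns final IRQ state. */
static bool irqs_on_after_irq_pair(void)
{
	irqs_enabled = true;
	model_cli();
	model_lock_irq();
	assert(!irqs_enabled);	/* critical section runs with IRQs off */
	model_unlock_irq();
	return irqs_enabled;	/* true: the caller's cli() has been undone */
}

/* Same caller; callee uses the irqsave/irqrestore pair instead. */
static bool irqs_on_after_irqsave_pair(void)
{
	unsigned long flags;

	irqs_enabled = true;
	model_cli();
	flags = model_lock_irqsave();
	assert(!irqs_enabled);
	model_unlock_irqrestore(flags);
	return irqs_enabled;	/* false: the caller's state is preserved */
}
```

In the model, the first function returns with interrupts enabled even though the caller asked for them to be off, which is the window in which the lockup can happen.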

Below is the patch.  (That's the last spin_lock_irq()
anyone will be seeing from me :))

Andre, init_irq() is somewhat broken - it appears to
be assuming that cli() will disable interrupts, but it's
calling functions which can sleep.   If these functions
_do_ sleep, interrupts will be enabled, which is presumably
not what IDE wants to happen.


--- 2.4.19-pre5/drivers/block/ll_rw_blk.c~ide-lockup	Fri Mar 29 21:19:11 2002
+++ 2.4.19-pre5-akpm/drivers/block/ll_rw_blk.c	Fri Mar 29 21:20:04 2002
@@ -336,14 +336,16 @@ void generic_unplug_device(void *data)
  */
 int blk_grow_request_list(request_queue_t *q, int nr_requests)
 {
-	spin_lock_irq(&io_request_lock);
+	unsigned long flags;
+
+	spin_lock_irqsave(&io_request_lock, flags);
 	while (q->nr_requests < nr_requests) {
 		struct request *rq;
 		int rw;
 
-		spin_unlock_irq(&io_request_lock);
+		spin_unlock_irqrestore(&io_request_lock, flags);
 		rq = kmem_cache_alloc(request_cachep, SLAB_KERNEL);
-		spin_lock_irq(&io_request_lock);
+		spin_lock_irqsave(&io_request_lock, flags);
 		if (rq == NULL)
 			break;
 		memset(rq, 0, sizeof(*rq));
@@ -356,7 +358,7 @@ int blk_grow_request_list(request_queue_
 	q->batch_requests = q->nr_requests / 4;
 	if (q->batch_requests > 32)
 		q->batch_requests = 32;
-	spin_unlock_irq(&io_request_lock);
+	spin_unlock_irqrestore(&io_request_lock, flags);
 	return q->nr_requests;
 }
 

-


