linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* Dumb question: BKL on reboot ?
@ 2003-08-20 10:22 Hannes Reinecke
  2003-08-20 18:29 ` Andrew Morton
  0 siblings, 1 reply; 16+ messages in thread
From: Hannes Reinecke @ 2003-08-20 10:22 UTC (permalink / raw)
  To: Linux Kernel

Hiya,

I've got a dumb question: Why is the BKL held on entering sys_reboot() 
in kernel/sys.c:405 ?
It is getting especially interesting on SMP, when one cpu is entering 
sys_reboot, acquires the BKL and then waits (via machine_restart) for 
all other cpus to shut down. If any of the other cpus is executing a 
task which also needs the BKL, we have a nice deadlock.
We've seen this here on 2-way s390, where the other cpu tried to execute 
kupdated() (what did it try that for? Anyway...), which of course 
resulted in a deadlock.

Any enlightenment welcome.

Cheers,

Hannes
P.S.: Please cc me directly, I'm not subscribed.
-- 
Dr. Hannes Reinecke			hare@suse.de
SuSE Linux AG				S390 & zSeries
Deutschherrnstr. 15-19			+49 911 74053 688
90429 Nürnberg				http://www.suse.de


^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: Dumb question: BKL on reboot ?
  2003-08-20 10:22 Dumb question: BKL on reboot ? Hannes Reinecke
@ 2003-08-20 18:29 ` Andrew Morton
  2003-08-20 18:35   ` David S. Miller
  0 siblings, 1 reply; 16+ messages in thread
From: Andrew Morton @ 2003-08-20 18:29 UTC (permalink / raw)
  To: Hannes Reinecke; +Cc: linux-kernel

Hannes Reinecke <Hannes.Reinecke@suse.de> wrote:
>
> Hiya,
> 
> I've got a dumb question: Why is the BKL held on entering sys_reboot() 
> in kernel/sys.c:405 ?

Probably for no good reason.

> It is getting especially interesting on SMP, when one cpu is entering 
> sys_reboot, acquires the BKL and then waits (via machine_restart) for 
> all other cpus to shut down. If any of the other cpus is executing a 
> task which also needs the BKL, we have a nice deadlock.
> We've seen this here on 2-way s390, where the other cpu tried to execute 
> kupdated() (what did it try that for? Anyway...), which of course 
> resulted in a deadlock.

I guess that dropping the BKL around the machine_restart() call would be an
appropriate 2.4 fix.

Where exactly does the rebooting CPU get stuck in machine_restart()?  If
someone has done lock_kernel() with local interrupts disabled then yes,
it'll deadlock.  But that's unlikely?  Confused.


^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: Dumb question: BKL on reboot ?
  2003-08-20 18:29 ` Andrew Morton
@ 2003-08-20 18:35   ` David S. Miller
  2003-08-20 20:23     ` Dave Hansen
  0 siblings, 1 reply; 16+ messages in thread
From: David S. Miller @ 2003-08-20 18:35 UTC (permalink / raw)
  To: Andrew Morton; +Cc: Hannes.Reinecke, linux-kernel

On Wed, 20 Aug 2003 11:29:18 -0700
Andrew Morton <akpm@osdl.org> wrote:

> Where exactly does the rebooting CPU get stuck in machine_restart()?  If
> someone has done lock_kernel() with local interrupts disabled then yes,
> it'll deadlock.  But that's unlikely?  Confused.

In fact I think that's illegal and we should bug check
such a state.

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: Dumb question: BKL on reboot ?
  2003-08-20 18:35   ` David S. Miller
@ 2003-08-20 20:23     ` Dave Hansen
  2003-08-21  8:05       ` Hannes Reinecke
  2003-08-21 15:33       ` Andrea Arcangeli
  0 siblings, 2 replies; 16+ messages in thread
From: Dave Hansen @ 2003-08-20 20:23 UTC (permalink / raw)
  To: David S. Miller; +Cc: Andrew Morton, Hannes.Reinecke, Linux Kernel Mailing List

On Wed, 2003-08-20 at 11:35, David S. Miller wrote:
> On Wed, 20 Aug 2003 11:29:18 -0700
> Andrew Morton <akpm@osdl.org> wrote:
> 
> > Where exactly does the rebooting CPU get stuck in machine_restart()?  If
> > someone has done lock_kernel() with local interrupts disabled then yes,
> > it'll deadlock.  But that's unlikely?  Confused.

I thought it was legal to do that.  The normal interrupts-off spinlock
deadlock happens because a thread does a spin_lock() with no irq
disable, then gets interrupted.  The interrupt handler tries to take the
lock, and gets stuck.  

If the BKL is put in that situation, the interrupt handler will see
current->lock_depth > 0, and not actually take the spinlock; it will
just increment the lock_depth.  It won't deadlock.

Or, are you saying that a CPU could have the BKL, and have
stop_this_cpu() called on it?  I guess we could add
release_kernel_lock() to stop_this_cpu().
-- 
Dave Hansen
haveblue@us.ibm.com


^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: Dumb question: BKL on reboot ?
  2003-08-20 20:23     ` Dave Hansen
@ 2003-08-21  8:05       ` Hannes Reinecke
  2003-08-21 15:41         ` Andrea Arcangeli
  2003-08-21 15:33       ` Andrea Arcangeli
  1 sibling, 1 reply; 16+ messages in thread
From: Hannes Reinecke @ 2003-08-21  8:05 UTC (permalink / raw)
  To: Dave Hansen; +Cc: Linux Kernel Mailing List

Dave Hansen wrote:
[ .. ]
> 
> Or, are you saying that a CPU could have the BKL, and have
> stop_this_cpu() called on it?  I guess we could add
> release_kernel_lock() to stop_this_cpu().
Exactly what happened here. CPU#1 entered sys_reboot, got BKL and 
prepared to stop. It will never release BKL, leaving a nice big window 
for CPU#0 to deadlock.

releasing BKL before calling do_machine_restart seems to do the trick, 
though.

Cheers,

Hannes
-- 
Dr. Hannes Reinecke			hare@suse.de
SuSE Linux AG				S390 & zSeries
Deutschherrnstr. 15-19			+49 911 74053 688
90429 Nürnberg				http://www.suse.de


^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: Dumb question: BKL on reboot ?
  2003-08-20 20:23     ` Dave Hansen
  2003-08-21  8:05       ` Hannes Reinecke
@ 2003-08-21 15:33       ` Andrea Arcangeli
  1 sibling, 0 replies; 16+ messages in thread
From: Andrea Arcangeli @ 2003-08-21 15:33 UTC (permalink / raw)
  To: Dave Hansen
  Cc: David S. Miller, Andrew Morton, Hannes.Reinecke,
	Linux Kernel Mailing List

On Wed, Aug 20, 2003 at 01:23:44PM -0700, Dave Hansen wrote:
> On Wed, 2003-08-20 at 11:35, David S. Miller wrote:
> > On Wed, 20 Aug 2003 11:29:18 -0700
> > Andrew Morton <akpm@osdl.org> wrote:
> > 
> > > Where exactly does the rebooting CPU get stuck in machine_restart()?  If
> > > someone has done lock_kernel() with local interrupts disabled then yes,
> > > it'll deadlock.  But that's unlikely?  Confused.
> 
> I thought it was legal to do that.  The normal interrupts-off spinlock
> deadlock happens because a thread does a spin_lock() with no irq
> disable, then gets interrupted.  The interrupt handler tries to take the
> lock, and gets stuck.  

that's the regular spinlock case but it has nothing to do with the big
kernel lock during smp_call_function. holding the big kernel lock while
broadcasting the IPI with smp_call_function is perfectly legal. Infact
if something it's safer (and it can't explain any deadlock), and since
it's a so low performance path, where scalability doesn't matter,
leaving the BKL around that code can be acceptable.

The only illegal thing is to call smp_call_function with _local_irqs_
disabled (as Andrew noted above). the big kernel lock doesn't matter for
the smp_call_function callers.

> If the BKL is put in that situation, the interrupt handler will see
> current->lock_depth > 0, and not actually take the spinlock; it will
> just increment the lock_depth.  It won't deadlock.
> 
> Or, are you saying that a CPU could have the BKL, and have
> stop_this_cpu() called on it?  I guess we could add
> release_kernel_lock() to stop_this_cpu().

the *real* bug IMHO is that machine_restart isn't shutting down the
other cpus properly in smp, it has nothing to do with lock_kernel in
kernel/, the bug is in arch/. Dropping the big kernel lock in kernel/
can hide the bug, but it will showup again later in any other spinlock.

The problem is that machine_restart in arch/s390 (and arch/i386 too) can
easily hang a cpu in the IPI handler while it had zillon of spinlocks
held, that may later interfere with the main "reboot" cpu.

So at the very least a better fix than the one that drops the big kernel
lock from kernel/ is to execute current->lock_depth = -1 before entering
the infinite loop in the arch/ code.

A real fix would be to use the unplug cpu framework to allow all cpus to
reach the quiescent point, in this case it's easily doable even w/o
using RCU and w/o the pluggable cpu framework in 2.4, by simply spawning
a kernel thread bind to interesting cpu, and having this kernel thread
setting the bitflag after it become runnable (cut and pasting the
spawn_ksoftirqd will do the trick). The kernel thread will only execute
the infinite loop then. Hanging a cpu forever in a IPI is simply
deadlock prone for all other cpus, and that's the real bug.

Andrea

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: Dumb question: BKL on reboot ?
  2003-08-21  8:05       ` Hannes Reinecke
@ 2003-08-21 15:41         ` Andrea Arcangeli
  2003-08-21 15:55           ` Hannes Reinecke
  0 siblings, 1 reply; 16+ messages in thread
From: Andrea Arcangeli @ 2003-08-21 15:41 UTC (permalink / raw)
  To: Hannes Reinecke; +Cc: Dave Hansen, Linux Kernel Mailing List

On Thu, Aug 21, 2003 at 10:05:20AM +0200, Hannes Reinecke wrote:
> Dave Hansen wrote:
> [ .. ]
> >
> >Or, are you saying that a CPU could have the BKL, and have
> >stop_this_cpu() called on it?  I guess we could add
> >release_kernel_lock() to stop_this_cpu().
> Exactly what happened here. CPU#1 entered sys_reboot, got BKL and 
> prepared to stop. It will never release BKL, leaving a nice big window 
> for CPU#0 to deadlock.

if that was really the case it shouldn't deadlock, because CPU0
shouldn't depend on a big kernel lock to be able to re-enable the irqs,
and to receive the IPI. Nor any IPI handler could ever be allowed to
take the big kernel lock.

Unless you can demonstrate a dependency on the big kernel lock for CPU0
to re-enable the irqs eventually, I can't see how the above could
deadlock.

The only clear deadlock prone thing I can see from the code is the IPI
handler looping forever w/o allowing the stopped cpu to release all its
underlying spinlocks. Not sure exactly how this generates the deadlock
in practice, but you only need to hit one of the held spinlocks
(including possibly the BKL) in the context of the reboot "cpu" to hang
forever during reboot. Looping forever and giving the OK to reboot the
machine from the context of a kernel thread instead of an IPI will fix
it as far as I can tell.

Andrea

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: Dumb question: BKL on reboot ?
  2003-08-21 15:41         ` Andrea Arcangeli
@ 2003-08-21 15:55           ` Hannes Reinecke
  2003-08-21 16:39             ` Andrea Arcangeli
  0 siblings, 1 reply; 16+ messages in thread
From: Hannes Reinecke @ 2003-08-21 15:55 UTC (permalink / raw)
  To: Andrea Arcangeli; +Cc: Dave Hansen, Linux Kernel Mailing List

Andrea Arcangeli wrote:
> On Thu, Aug 21, 2003 at 10:05:20AM +0200, Hannes Reinecke wrote:
[ .. ]
>>
>>Exactly what happened here. CPU#1 entered sys_reboot, got BKL and 
>>prepared to stop. It will never release BKL, leaving a nice big window 
>>for CPU#0 to deadlock.
> 
> 
> if that was really the case it shouldn't deadlock, because CPU0
> shouldn't depend on a big kernel lock to be able to re-enable the irqs,
> and to receive the IPI. Nor any IPI handler could ever be allowed to
> take the big kernel lock.
> 
> Unless you can demonstrate a dependency on the big kernel lock for CPU0
> to re-enable the irqs eventually, I can't see how the above could
> deadlock.
>

What happened was this:

================================================================
TASK HAS CPU (0): 0x904000 (kupdated):
  LOWCORE INFO:
   -psw      : 0x400100180000000 0x0000000007aa22
   -function : sync_old_buffers+58
   -prefix   : 0x009c8000
   -cpu timer: 0xfffffa77 0x33465f80
   -clock cmp: 0x00b9e503 0x235324f4
   -general registers:
      0000000000000000 0x000000002e1000
      0x0000000007aa22 0x0000000004919e
      0x00000000907f28 0x0000000022b428
      0000000000000000 0000000000000000
      0x00000000224340 0000000000000000
      0x00000000970000 0x00000000233b00
      0x00000000904000 0x000000001deb60
      0x00000000907e78 0x00000000907dd8
[ .. ]
  STACK:
  0 kupdate+424 [0x7af4c]
  1 arch_kernel_thread+70 [0x18d72]
================================================================
TASK HAS CPU (1): 0x3318000 (reboot):
  LOWCORE INFO:
   -psw      : 0x700100180000000 0x0000007ffe0d8e
   -function : not found in System map
   -prefix   : 0x009c2000
   -cpu timer: 0xfff79e2d 0x014244c0
   -clock cmp: 0x00b9e503 0x3a5d9504
   -general registers:
      0x00000000000001 0000000000000000
      0000000000000000 0000000000000000
      0x0000000001f268 0000000000000000
      0x00000000000003 0x000000000002c0
      0x000000000490f0 0x00000003318000
      0x00000000016422 0000000000000000
      0x00000000000001 0x000000001da280
      0x0000000001f268 0x0000000331bae0
[ .. ]
  STACK:
  0 machine_restart_smp+62 [0x1f5f2]
  1 machine_restart+48 [0x19884]
  2 sys_reboot+306 [0x49222]
  3 sysc_noemu+16 [0x16242]

Not sure why CPU#0 wanted to execute kupdate(), but hey ...

Cheers,

Hannes
-- 
Dr. Hannes Reinecke			hare@suse.de
SuSE Linux AG				S390 & zSeries
Deutschherrnstr. 15-19			+49 911 74053 688
90429 Nürnberg				http://www.suse.de


^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: Dumb question: BKL on reboot ?
  2003-08-21 15:55           ` Hannes Reinecke
@ 2003-08-21 16:39             ` Andrea Arcangeli
  2003-08-21 16:58               ` Andrea Arcangeli
  2003-08-22  6:39               ` Hannes Reinecke
  0 siblings, 2 replies; 16+ messages in thread
From: Andrea Arcangeli @ 2003-08-21 16:39 UTC (permalink / raw)
  To: Hannes Reinecke; +Cc: Dave Hansen, Linux Kernel Mailing List

On Thu, Aug 21, 2003 at 05:55:49PM +0200, Hannes Reinecke wrote:
> Andrea Arcangeli wrote:
> >On Thu, Aug 21, 2003 at 10:05:20AM +0200, Hannes Reinecke wrote:
> [ .. ]
> >>
> >>Exactly what happened here. CPU#1 entered sys_reboot, got BKL and 
> >>prepared to stop. It will never release BKL, leaving a nice big window 
> >>for CPU#0 to deadlock.
> >
> >
> >if that was really the case it shouldn't deadlock, because CPU0
> >shouldn't depend on a big kernel lock to be able to re-enable the irqs,
> >and to receive the IPI. Nor any IPI handler could ever be allowed to
> >take the big kernel lock.
> >
> >Unless you can demonstrate a dependency on the big kernel lock for CPU0
> >to re-enable the irqs eventually, I can't see how the above could
> >deadlock.
> >
> 
> What happened was this:
> 
> ================================================================
> TASK HAS CPU (0): 0x904000 (kupdated):
>  LOWCORE INFO:
>   -psw      : 0x400100180000000 0x0000000007aa22
>   -function : sync_old_buffers+58
>   -prefix   : 0x009c8000
>   -cpu timer: 0xfffffa77 0x33465f80
>   -clock cmp: 0x00b9e503 0x235324f4
>   -general registers:
>      0000000000000000 0x000000002e1000
>      0x0000000007aa22 0x0000000004919e
>      0x00000000907f28 0x0000000022b428
>      0000000000000000 0000000000000000
>      0x00000000224340 0000000000000000
>      0x00000000970000 0x00000000233b00
>      0x00000000904000 0x000000001deb60
>      0x00000000907e78 0x00000000907dd8
> [ .. ]
>  STACK:
>  0 kupdate+424 [0x7af4c]
>  1 arch_kernel_thread+70 [0x18d72]
> ================================================================
> TASK HAS CPU (1): 0x3318000 (reboot):
>  LOWCORE INFO:
>   -psw      : 0x700100180000000 0x0000007ffe0d8e
>   -function : not found in System map
>   -prefix   : 0x009c2000
>   -cpu timer: 0xfff79e2d 0x014244c0
>   -clock cmp: 0x00b9e503 0x3a5d9504
>   -general registers:
>      0x00000000000001 0000000000000000
>      0000000000000000 0000000000000000
>      0x0000000001f268 0000000000000000
>      0x00000000000003 0x000000000002c0
>      0x000000000490f0 0x00000003318000
>      0x00000000016422 0000000000000000
>      0x00000000000001 0x000000001da280
>      0x0000000001f268 0x0000000331bae0
> [ .. ]
>  STACK:
>  0 machine_restart_smp+62 [0x1f5f2]
>  1 machine_restart+48 [0x19884]
>  2 sys_reboot+306 [0x49222]
>  3 sysc_noemu+16 [0x16242]

This can't be the lock_kernel, you see, there's no lock_kernel
invocation at all in the machine_restart_smp path.

> 
> Not sure why CPU#0 wanted to execute kupdate(), but hey ...

kupdate always runs in the background, once every few seconds.

I perfectly see why it hangs on s390, that's a bug in the arch code
and it's unrelated to whatever lock_kernel and even unrelated to the
potential deadlock prone shutdown of cpus that I mentioned in my
previous email:

static void do_machine_restart(void * __unused)
{
	clear_bit(smp_processor_id(), &cpu_restart_map);
	if (smp_processor_id() == 0) {
	^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

if the `reboot` program doesn't run by luck on cpu 0, it will be the one
that will hang instead of the others. the above check must be changed to:

	smp_processor_id() == reboot_cpu

and you should snapshot reboot_cpu in machine_restart_smp. then it won't
hang anymore.

The x86 code may be safe in hanging in the IPI if it's extremely careful
in not touching any lock that maybe hold by other cpus, between the IPI
delivery and the hard reboot. So probably we don't need to replace the
IPI with a kernel thread in 2.4, despite I find the kernel thread way
more robust.

Overall the lock_kernel I think is _needed_ (not only if something it's
safer), to serialize reboot against reboot and I wouldn't bother
changing it to another spinlock (at least not in 2.4)

Andrea

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: Dumb question: BKL on reboot ?
  2003-08-21 16:39             ` Andrea Arcangeli
@ 2003-08-21 16:58               ` Andrea Arcangeli
  2003-08-22  6:39               ` Hannes Reinecke
  1 sibling, 0 replies; 16+ messages in thread
From: Andrea Arcangeli @ 2003-08-21 16:58 UTC (permalink / raw)
  To: Hannes Reinecke; +Cc: Dave Hansen, Linux Kernel Mailing List

On Thu, Aug 21, 2003 at 06:39:38PM +0200, Andrea Arcangeli wrote:
> static void do_machine_restart(void * __unused)
> {
> 	clear_bit(smp_processor_id(), &cpu_restart_map);
> 	if (smp_processor_id() == 0) {
> 	^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
> 
> if the `reboot` program doesn't run by luck on cpu 0, it will be the one
> that will hang instead of the others. the above check must be changed to:
> 
> 	smp_processor_id() == reboot_cpu
> 
> and you should snapshot reboot_cpu in machine_restart_smp. then it won't
> hang anymore.

this untested patch should do the trick:

--- 2.4.22pre7aa1/arch/s390x/kernel/smp.c.~1~	2003-07-19 02:34:04.000000000 +0200
+++ 2.4.22pre7aa1/arch/s390x/kernel/smp.c	2003-08-21 18:48:05.000000000 +0200
@@ -240,11 +240,12 @@ void smp_send_stop(void)
  * Reboot, halt and power_off routines for SMP.
  */
 static volatile unsigned long cpu_restart_map;
+static unsigned long restart_cpu;
 
 static void do_machine_restart(void * __unused)
 {
 	clear_bit(smp_processor_id(), &cpu_restart_map);
-	if (smp_processor_id() == 0) {
+	if (smp_processor_id() == restart_cpu) {
 		/* Wait for all other cpus to enter do_machine_restart. */
 		while (cpu_restart_map != 0);
 		/* Store status of other cpus. */
@@ -263,6 +264,7 @@ static void do_machine_restart(void * __
 
 void machine_restart_smp(char * __unused) 
 {
+	restart_cpu = smp_processor_id();
 	cpu_restart_map = cpu_online_map;
         smp_call_function(do_machine_restart, NULL, 0, 0);
 	do_machine_restart(NULL);

Andrea

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: Dumb question: BKL on reboot ?
  2003-08-21 16:39             ` Andrea Arcangeli
  2003-08-21 16:58               ` Andrea Arcangeli
@ 2003-08-22  6:39               ` Hannes Reinecke
  2003-08-22 13:57                 ` Andrea Arcangeli
  2003-08-24 21:43                 ` Andrea Arcangeli
  1 sibling, 2 replies; 16+ messages in thread
From: Hannes Reinecke @ 2003-08-22  6:39 UTC (permalink / raw)
  To: Andrea Arcangeli; +Cc: Dave Hansen, Linux Kernel Mailing List

Andrea Arcangeli wrote:
> On Thu, Aug 21, 2003 at 05:55:49PM +0200, Hannes Reinecke wrote:
> 
>>Andrea Arcangeli wrote:
>>
>>>On Thu, Aug 21, 2003 at 10:05:20AM +0200, Hannes Reinecke wrote:
>>
>>
>>
>>What happened was this:
>>
>>================================================================
>>TASK HAS CPU (0): 0x904000 (kupdated):
>> LOWCORE INFO:
>>  -psw      : 0x400100180000000 0x0000000007aa22
>>  -function : sync_old_buffers+58
>>  -prefix   : 0x009c8000
>>  -cpu timer: 0xfffffa77 0x33465f80
>>  -clock cmp: 0x00b9e503 0x235324f4
>>  -general registers:
>>     0000000000000000 0x000000002e1000
>>     0x0000000007aa22 0x0000000004919e
>>     0x00000000907f28 0x0000000022b428
>>     0000000000000000 0000000000000000
>>     0x00000000224340 0000000000000000
>>     0x00000000970000 0x00000000233b00
>>     0x00000000904000 0x000000001deb60
>>     0x00000000907e78 0x00000000907dd8
>>[ .. ]
>> STACK:
>> 0 kupdate+424 [0x7af4c]
>> 1 arch_kernel_thread+70 [0x18d72]
>>================================================================
>>TASK HAS CPU (1): 0x3318000 (reboot):
>> LOWCORE INFO:
>>  -psw      : 0x700100180000000 0x0000007ffe0d8e
>>  -function : not found in System map
>>  -prefix   : 0x009c2000
>>  -cpu timer: 0xfff79e2d 0x014244c0
>>  -clock cmp: 0x00b9e503 0x3a5d9504
>>  -general registers:
>>     0x00000000000001 0000000000000000
>>     0000000000000000 0000000000000000
>>     0x0000000001f268 0000000000000000
>>     0x00000000000003 0x000000000002c0
>>     0x000000000490f0 0x00000003318000
>>     0x00000000016422 0000000000000000
>>     0x00000000000001 0x000000001da280
>>     0x0000000001f268 0x0000000331bae0
>>[ .. ]
>> STACK:
>> 0 machine_restart_smp+62 [0x1f5f2]
>> 1 machine_restart+48 [0x19884]
>> 2 sys_reboot+306 [0x49222]
>> 3 sysc_noemu+16 [0x16242]
> 
> 
> This can't be the lock_kernel, you see, there's no lock_kernel
> invocation at all in the machine_restart_smp path.

Oh? sys_reboot() does call lock_kernel(). kernel/sys.c:303.

Agreed, this smp_processor_id() == 0 thing is interesting. I'll try you 
suggestion and see how far I'll progress.

Cheers,

Hannes
-- 
Dr. Hannes Reinecke			hare@suse.de
SuSE Linux AG				S390 & zSeries
Deutschherrnstr. 15-19			+49 911 74053 688
90429 Nürnberg				http://www.suse.de


^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: Dumb question: BKL on reboot ?
  2003-08-22  6:39               ` Hannes Reinecke
@ 2003-08-22 13:57                 ` Andrea Arcangeli
  2003-08-24 21:43                 ` Andrea Arcangeli
  1 sibling, 0 replies; 16+ messages in thread
From: Andrea Arcangeli @ 2003-08-22 13:57 UTC (permalink / raw)
  To: Hannes Reinecke; +Cc: Dave Hansen, Linux Kernel Mailing List

> >This can't be the lock_kernel, you see, there's no lock_kernel
> >invocation at all in the machine_restart_smp path.
> 
> Oh? sys_reboot() does call lock_kernel(). kernel/sys.c:303.

yes, I know, I meant that it's not spinning on the lock, sys_reboot gets
through lock_kernel w/o problems. It gets the lock successfully.

> Agreed, this smp_processor_id() == 0 thing is interesting. I'll try you 
> suggestion and see how far I'll progress.

thanks, that's the right fix as far as I can tell and the lock_kernel
would better stay to serialize concurrent sys_reboot.

Andrea

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: Dumb question: BKL on reboot ?
  2003-08-22  6:39               ` Hannes Reinecke
  2003-08-22 13:57                 ` Andrea Arcangeli
@ 2003-08-24 21:43                 ` Andrea Arcangeli
  1 sibling, 0 replies; 16+ messages in thread
From: Andrea Arcangeli @ 2003-08-24 21:43 UTC (permalink / raw)
  To: Hannes Reinecke; +Cc: Dave Hansen, Linux Kernel Mailing List, uweigand

Hi Hannes,

On Fri, Aug 22, 2003 at 08:39:03AM +0200, Hannes Reinecke wrote:
> Agreed, this smp_processor_id() == 0 thing is interesting. I'll try you 
> suggestion and see how far I'll progress.

Ulrich pointed me out that only cpu0 can call into the VM (thanks!), and
in turn the s390 code I tocuehd was infact correct (despite it looked
suspect given the trace and the special ==0 case). After a closer
inspection my (last ;) conclusion is this (cut-and-past from bugzilla)

----------------------
[..]
But the data.started counter is already enforcing a good enough
guarantee, that the CPU1 will execute
"signal_processor(smp_processor_id(), sigp_stop);" only when the CPU0 is
already executing inside the IPI handler. So I can't imagine the IPI on
CPU0 getting lost. 
 
the last explanation I can think, is that the CPU0 executes the IPI,
waits for cpu_restart_map to become 0 (i.e. CPU1 offline), and
eventually executes this code correctly: 
 
                do_store_status(); 
                /* 
                 * Finally call reipl. Because we waited for all other 
                 * cpus to enter this function we know that they do 
                 * not hold any s390irq-locks (the cpus have been 
                 * interrupted by an external interrupt and s390irq 
                 * locks are always held disabled). 
                 */ 
                reipl(S390_lowcore.ipl_device); 
        } 
        signal_processor(smp_processor_id(), sigp_stop); 
 
but the box doesn't reboot and the CPU0 returns from the IPI and goes
idle until kupdate kicks in (finds the first idle cpu and tries to take
the big kernel lock that is correctly held by CPU1 in sys_reboot). 
 
This seems a bug in the s390 lowlevel code above, it just doesn't reboot
the machine. 
----------------------

Maybe it's related to signal_processor(smp_processor_id(), sigp_stop)
failing inside IPI handlers or whatever similar arch specific, dunno.

The patch removing the BKL from sys_reboot shouldn't help either if my
above theory is correct: when the IPI runs in cpu0, it's not even
running on top of the BKL. It's just that the machine not rebooting,
eventually will have the first idle cpu (cpu0) execute kupdate in mean
in 2.5 sec, and it will eventually go in the state you found in lkcd
since the BKL is held by cpu1 in sys_reboot that is already offline (no
valid EIP in lkcd).

So if you want to test, probably one interesting info you could generate
to confirm my above theory, is to reproduce the very same deadlock even
with the BKL removal patch applied to sys_reboot. Not sure how easy it
is to reproduce it. If you can hang it again, it would not deadlock in
kupdate anymore: you should find cpu0 stuck in the idle loop instead of
sync_old_buffers.

It maybe something slightly different going wrong too, but still I'm
convinced the lock_kernel in sys_reboot is absoltely innocent and needed
(at least in 2.4).

Andrea

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: Dumb question: BKL on reboot ?
  2003-08-20 11:51   ` Hannes Reinecke
@ 2003-08-20 12:03     ` Andi Kleen
  0 siblings, 0 replies; 16+ messages in thread
From: Andi Kleen @ 2003-08-20 12:03 UTC (permalink / raw)
  To: Hannes Reinecke; +Cc: Andi Kleen, linux-kernel

> Is it worth to try to push it into mainstream kernel?

Yes definitely. My patch was for 2.6.0test3. 

-Andi

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: Dumb question: BKL on reboot ?
  2003-08-20 10:49 ` Andi Kleen
@ 2003-08-20 11:51   ` Hannes Reinecke
  2003-08-20 12:03     ` Andi Kleen
  0 siblings, 1 reply; 16+ messages in thread
From: Hannes Reinecke @ 2003-08-20 11:51 UTC (permalink / raw)
  To: Andi Kleen; +Cc: linux-kernel

Andi Kleen wrote:
> Hannes Reinecke <Hannes.Reinecke@suse.de> writes:
> 
> 
>>I've got a dumb question: Why is the BKL held on entering sys_reboot()
>>in kernel/sys.c:405 ?
> 
> 
> Interesting. I have a few SMP deadlocks on x86-64 in reboot too 
> and it's possible that it is the same problem.
> 
> I would hold it during exection of the notifiers, but drop
> it before calling into machine_*
> 
Indeed. Works on s390 and survived several reboots (test still 
continuing). I moved lock_kernel() into the sections that actually _do_ 
something; looks much cleaner IMHO.

THX for your help.

Is it worth to try to push it into mainstream kernel?

Cheers,

Hannes
--

--- linux-2.4.21/kernel/sys.c.orig      2003-08-20 10:52:39.000000000 +0200
+++ linux-2.4.21/kernel/sys.c   2003-08-20 10:52:39.000000000 +0200
@@ -340,11 +340,12 @@
                         magic2 != LINUX_REBOOT_MAGIC2B))
                 return -EINVAL;

-       lock_kernel();
         switch (cmd) {
         case LINUX_REBOOT_CMD_RESTART:
+               lock_kernel();
                 notifier_call_chain(&reboot_notifier_list, SYS_RESTART, 
NULL);
                 printk(KERN_EMERG "Restarting system.\n");
+               unlock_kernel();
                 machine_restart(NULL);
                 break;

@@ -357,20 +358,25 @@
                 break;

         case LINUX_REBOOT_CMD_HALT:
+               lock_kernel();
                 notifier_call_chain(&reboot_notifier_list, SYS_HALT, NULL);
                 printk(KERN_EMERG "System halted.\n");
+               unlock_kernel();
                 machine_halt();
                 do_exit(0);
                 break;

         case LINUX_REBOOT_CMD_POWER_OFF:
+               lock_kernel();
                 notifier_call_chain(&reboot_notifier_list, 
SYS_POWER_OFF, NULL);
                 printk(KERN_EMERG "Power down.\n");
+               unlock_kernel();
                 machine_power_off();
                 do_exit(0);
                 break;

         case LINUX_REBOOT_CMD_RESTART2:
+               lock_kernel();
                 if (strncpy_from_user(&buffer[0], (char *)arg, 
sizeof(buffer) -
1) < 0) {
                         unlock_kernel();
                         return -EFAULT;
@@ -379,14 +385,13 @@

                 notifier_call_chain(&reboot_notifier_list, SYS_RESTART, 
buffer);
                 printk(KERN_EMERG "Restarting system with command 
'%s'.\n", buff
er);
+               unlock_kernel();
                 machine_restart(buffer);
                 break;

         default:
-               unlock_kernel();
                 return -EINVAL;
         }
-       unlock_kernel();
         return 0;
  }


^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: Dumb question: BKL on reboot ?
       [not found] <3F434BD1.9050704@suse.de.suse.lists.linux.kernel>
@ 2003-08-20 10:49 ` Andi Kleen
  2003-08-20 11:51   ` Hannes Reinecke
  0 siblings, 1 reply; 16+ messages in thread
From: Andi Kleen @ 2003-08-20 10:49 UTC (permalink / raw)
  To: Hannes Reinecke; +Cc: linux-kernel

Hannes Reinecke <Hannes.Reinecke@suse.de> writes:

> I've got a dumb question: Why is the BKL held on entering sys_reboot()
> in kernel/sys.c:405 ?

Interesting. I have a few SMP deadlocks on x86-64 in reboot too 
and it's possible that it is the same problem.

I would hold it during exection of the notifiers, but drop
it before calling into machine_*

-Andi

--- linux-2.6.0test3-amd64/kernel/sys.c-o	2003-07-28 19:18:18.000000000 +0200
+++ linux-2.6.0test3-amd64/kernel/sys.c	2003-08-20 12:48:59.000000000 +0200
@@ -409,6 +409,7 @@
 		system_running = 0;
 		device_shutdown();
 		printk(KERN_EMERG "Restarting system.\n");
+		unlock_kernel();
 		machine_restart(NULL);
 		break;
 
@@ -425,8 +426,8 @@
 		system_running = 0;
 		device_shutdown();
 		printk(KERN_EMERG "System halted.\n");
-		machine_halt();
 		unlock_kernel();
+		machine_halt();
 		do_exit(0);
 		break;
 
@@ -435,8 +436,8 @@
 		system_running = 0;
 		device_shutdown();
 		printk(KERN_EMERG "Power down.\n");
-		machine_power_off();
 		unlock_kernel();
+		machine_power_off();
 		do_exit(0);
 		break;
 
@@ -451,6 +452,7 @@
 		system_running = 0;
 		device_shutdown();
 		printk(KERN_EMERG "Restarting system with command '%s'.\n", buffer);
+		unlock_kernel();
 		machine_restart(buffer);
 		break;
 

^ permalink raw reply	[flat|nested] 16+ messages in thread

end of thread, other threads:[~2003-08-24 21:42 UTC | newest]

Thread overview: 16+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2003-08-20 10:22 Dumb question: BKL on reboot ? Hannes Reinecke
2003-08-20 18:29 ` Andrew Morton
2003-08-20 18:35   ` David S. Miller
2003-08-20 20:23     ` Dave Hansen
2003-08-21  8:05       ` Hannes Reinecke
2003-08-21 15:41         ` Andrea Arcangeli
2003-08-21 15:55           ` Hannes Reinecke
2003-08-21 16:39             ` Andrea Arcangeli
2003-08-21 16:58               ` Andrea Arcangeli
2003-08-22  6:39               ` Hannes Reinecke
2003-08-22 13:57                 ` Andrea Arcangeli
2003-08-24 21:43                 ` Andrea Arcangeli
2003-08-21 15:33       ` Andrea Arcangeli
     [not found] <3F434BD1.9050704@suse.de.suse.lists.linux.kernel>
2003-08-20 10:49 ` Andi Kleen
2003-08-20 11:51   ` Hannes Reinecke
2003-08-20 12:03     ` Andi Kleen

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).