* Dumb question: BKL on reboot ? @ 2003-08-20 10:22 Hannes Reinecke 2003-08-20 18:29 ` Andrew Morton 0 siblings, 1 reply; 16+ messages in thread From: Hannes Reinecke @ 2003-08-20 10:22 UTC (permalink / raw) To: Linux Kernel Hiya, I've got a dumb question: Why is the BKL held on entering sys_reboot() in kernel/sys.c:405 ? It is getting especially interesting on SMP, when one cpu is entering sys_reboot, acquires the BKL and then waits (via machine_restart) for all other cpus to shut down. If any of the other cpus is executing a task which also needs the BKL, we have a nice deadlock. We've seen this here on 2-way s390, where the other cpu tried to execute kupdated() (what did it try that for? Anyway...), which of course resulted in a deadlock. Any enlightenment welcome. Cheers, Hannes P.S.: Please cc me directly, I'm not subscribed. -- Dr. Hannes Reinecke hare@suse.de SuSE Linux AG S390 & zSeries Deutschherrnstr. 15-19 +49 911 74053 688 90429 Nürnberg http://www.suse.de ^ permalink raw reply [flat|nested] 16+ messages in thread
* Re: Dumb question: BKL on reboot ? 2003-08-20 10:22 Dumb question: BKL on reboot ? Hannes Reinecke @ 2003-08-20 18:29 ` Andrew Morton 2003-08-20 18:35 ` David S. Miller 0 siblings, 1 reply; 16+ messages in thread From: Andrew Morton @ 2003-08-20 18:29 UTC (permalink / raw) To: Hannes Reinecke; +Cc: linux-kernel Hannes Reinecke <Hannes.Reinecke@suse.de> wrote: > > Hiya, > > I've got a dumb question: Why is the BKL held on entering sys_reboot() > in kernel/sys.c:405 ? Probably for no good reason. > It is getting especially interesting on SMP, when one cpu is entering > sys_reboot, acquires the BKL and then waits (via machine_restart) for > all other cpus to shut down. If any of the other cpus is executing a > task which also needs the BKL, we have a nice deadlock. > We've seen this here on 2-way s390, where the other cpu tried to execute > kupdated() (what did it try that for? Anyway...), which of course > resulted in a deadlock. I guess that dropping the BKL around the machine_restart() call would be an appropriate 2.4 fix. Where exactly does the rebooting CPU get stuck in machine_restart()? If someone has done lock_kernel() with local interrupts disabled then yes, it'll deadlock. But that's unlikely? Confused. ^ permalink raw reply [flat|nested] 16+ messages in thread
* Re: Dumb question: BKL on reboot ? 2003-08-20 18:29 ` Andrew Morton @ 2003-08-20 18:35 ` David S. Miller 2003-08-20 20:23 ` Dave Hansen 0 siblings, 1 reply; 16+ messages in thread From: David S. Miller @ 2003-08-20 18:35 UTC (permalink / raw) To: Andrew Morton; +Cc: Hannes.Reinecke, linux-kernel On Wed, 20 Aug 2003 11:29:18 -0700 Andrew Morton <akpm@osdl.org> wrote: > Where exactly does the rebooting CPU get stuck in machine_restart()? If > someone has done lock_kernel() with local interrupts disabled then yes, > it'll deadlock. But that's unlikely? Confused. In fact I think that's illegal and we should bug check such a state. ^ permalink raw reply [flat|nested] 16+ messages in thread
* Re: Dumb question: BKL on reboot ? 2003-08-20 18:35 ` David S. Miller @ 2003-08-20 20:23 ` Dave Hansen 2003-08-21 8:05 ` Hannes Reinecke 2003-08-21 15:33 ` Andrea Arcangeli 0 siblings, 2 replies; 16+ messages in thread From: Dave Hansen @ 2003-08-20 20:23 UTC (permalink / raw) To: David S. Miller; +Cc: Andrew Morton, Hannes.Reinecke, Linux Kernel Mailing List On Wed, 2003-08-20 at 11:35, David S. Miller wrote: > On Wed, 20 Aug 2003 11:29:18 -0700 > Andrew Morton <akpm@osdl.org> wrote: > > > Where exactly does the rebooting CPU get stuck in machine_restart()? If > > someone has done lock_kernel() with local interrupts disabled then yes, > > it'll deadlock. But that's unlikely? Confused. I thought it was legal to do that. The normal interrupts-off spinlock deadlock happens because a thread does a spin_lock() with no irq disable, then gets interrupted. The interrupt handler tries to take the lock, and gets stuck. If the BKL is put in that situation, the interrupt handler will see current->lock_depth > 0, and not actually take the spinlock; it will just increment the lock_depth. It won't deadlock. Or, are you saying that a CPU could have the BKL, and have stop_this_cpu() called on it? I guess we could add release_kernel_lock() to stop_this_cpu(). -- Dave Hansen haveblue@us.ibm.com ^ permalink raw reply [flat|nested] 16+ messages in thread
* Re: Dumb question: BKL on reboot ? 2003-08-20 20:23 ` Dave Hansen @ 2003-08-21 8:05 ` Hannes Reinecke 2003-08-21 15:41 ` Andrea Arcangeli 2003-08-21 15:33 ` Andrea Arcangeli 1 sibling, 1 reply; 16+ messages in thread From: Hannes Reinecke @ 2003-08-21 8:05 UTC (permalink / raw) To: Dave Hansen; +Cc: Linux Kernel Mailing List Dave Hansen wrote: [ .. ] > > Or, are you saying that a CPU could have the BKL, and have > stop_this_cpu() called on it? I guess we could add > release_kernel_lock() to stop_this_cpu(). Exactly what happened here. CPU#1 entered sys_reboot, got BKL and prepared to stop. It will never release BKL, leaving a nice big window for CPU#0 to deadlock. releasing BKL before calling do_machine_restart seems to do the trick, though. Cheers, Hannes -- Dr. Hannes Reinecke hare@suse.de SuSE Linux AG S390 & zSeries Deutschherrnstr. 15-19 +49 911 74053 688 90429 Nürnberg http://www.suse.de ^ permalink raw reply [flat|nested] 16+ messages in thread
* Re: Dumb question: BKL on reboot ? 2003-08-21 8:05 ` Hannes Reinecke @ 2003-08-21 15:41 ` Andrea Arcangeli 2003-08-21 15:55 ` Hannes Reinecke 0 siblings, 1 reply; 16+ messages in thread From: Andrea Arcangeli @ 2003-08-21 15:41 UTC (permalink / raw) To: Hannes Reinecke; +Cc: Dave Hansen, Linux Kernel Mailing List On Thu, Aug 21, 2003 at 10:05:20AM +0200, Hannes Reinecke wrote: > Dave Hansen wrote: > [ .. ] > > > >Or, are you saying that a CPU could have the BKL, and have > >stop_this_cpu() called on it? I guess we could add > >release_kernel_lock() to stop_this_cpu(). > Exactly what happened here. CPU#1 entered sys_reboot, got BKL and > prepared to stop. It will never release BKL, leaving a nice big window > for CPU#0 to deadlock. if that was really the case it shouldn't deadlock, because CPU0 shouldn't depend on a big kernel lock to be able to re-enable the irqs, and to receive the IPI. Nor any IPI handler could ever be allowed to take the big kernel lock. Unless you can demonstrate a dependency on the big kernel lock for CPU0 to re-enable the irqs eventually, I can't see how the above could deadlock. The only clear deadlock prone thing I can see from the code is the IPI handler looping forever w/o allowing the stopped cpu to release all its underlying spinlocks. Not sure exactly how this generates the deadlock in practice, but you only need to hit one of the held spinlocks (including possibly the BKL) in the context of the reboot "cpu" to hang forever during reboot. Looping forever and giving the OK to reboot the machine from the context of a kernel thread instead of an IPI will fix it as far as I can tell. Andrea ^ permalink raw reply [flat|nested] 16+ messages in thread
* Re: Dumb question: BKL on reboot ? 2003-08-21 15:41 ` Andrea Arcangeli @ 2003-08-21 15:55 ` Hannes Reinecke 2003-08-21 16:39 ` Andrea Arcangeli 0 siblings, 1 reply; 16+ messages in thread From: Hannes Reinecke @ 2003-08-21 15:55 UTC (permalink / raw) To: Andrea Arcangeli; +Cc: Dave Hansen, Linux Kernel Mailing List Andrea Arcangeli wrote: > On Thu, Aug 21, 2003 at 10:05:20AM +0200, Hannes Reinecke wrote: [ .. ] >> >>Exactly what happened here. CPU#1 entered sys_reboot, got BKL and >>prepared to stop. It will never release BKL, leaving a nice big window >>for CPU#0 to deadlock. > > > if that was really the case it shouldn't deadlock, because CPU0 > shouldn't depend on a big kernel lock to be able to re-enable the irqs, > and to receive the IPI. Nor any IPI handler could ever be allowed to > take the big kernel lock. > > Unless you can demonstrate a dependency on the big kernel lock for CPU0 > to re-enable the irqs eventually, I can't see how the above could > deadlock. > What happened was this: ================================================================ TASK HAS CPU (0): 0x904000 (kupdated): LOWCORE INFO: -psw : 0x400100180000000 0x0000000007aa22 -function : sync_old_buffers+58 -prefix : 0x009c8000 -cpu timer: 0xfffffa77 0x33465f80 -clock cmp: 0x00b9e503 0x235324f4 -general registers: 0000000000000000 0x000000002e1000 0x0000000007aa22 0x0000000004919e 0x00000000907f28 0x0000000022b428 0000000000000000 0000000000000000 0x00000000224340 0000000000000000 0x00000000970000 0x00000000233b00 0x00000000904000 0x000000001deb60 0x00000000907e78 0x00000000907dd8 [ .. ] STACK: 0 kupdate+424 [0x7af4c] 1 arch_kernel_thread+70 [0x18d72] ================================================================ TASK HAS CPU (1): 0x3318000 (reboot): LOWCORE INFO: -psw : 0x700100180000000 0x0000007ffe0d8e -function : not found in System map -prefix : 0x009c2000 -cpu timer: 0xfff79e2d 0x014244c0 -clock cmp: 0x00b9e503 0x3a5d9504 -general registers: 0x00000000000001 0000000000000000 0000000000000000 0000000000000000 0x0000000001f268 0000000000000000 0x00000000000003 0x000000000002c0 0x000000000490f0 0x00000003318000 0x00000000016422 0000000000000000 0x00000000000001 0x000000001da280 0x0000000001f268 0x0000000331bae0 [ .. ] STACK: 0 machine_restart_smp+62 [0x1f5f2] 1 machine_restart+48 [0x19884] 2 sys_reboot+306 [0x49222] 3 sysc_noemu+16 [0x16242] Not sure why CPU#0 wanted to execute kupdate(), but hey ... Cheers, Hannes -- Dr. Hannes Reinecke hare@suse.de SuSE Linux AG S390 & zSeries Deutschherrnstr. 15-19 +49 911 74053 688 90429 Nürnberg http://www.suse.de ^ permalink raw reply [flat|nested] 16+ messages in thread
* Re: Dumb question: BKL on reboot ? 2003-08-21 15:55 ` Hannes Reinecke @ 2003-08-21 16:39 ` Andrea Arcangeli 2003-08-21 16:58 ` Andrea Arcangeli 2003-08-22 6:39 ` Hannes Reinecke 0 siblings, 2 replies; 16+ messages in thread From: Andrea Arcangeli @ 2003-08-21 16:39 UTC (permalink / raw) To: Hannes Reinecke; +Cc: Dave Hansen, Linux Kernel Mailing List On Thu, Aug 21, 2003 at 05:55:49PM +0200, Hannes Reinecke wrote: > Andrea Arcangeli wrote: > >On Thu, Aug 21, 2003 at 10:05:20AM +0200, Hannes Reinecke wrote: > [ .. ] > >> > >>Exactly what happened here. CPU#1 entered sys_reboot, got BKL and > >>prepared to stop. It will never release BKL, leaving a nice big window > >>for CPU#0 to deadlock. > > > > > >if that was really the case it shouldn't deadlock, because CPU0 > >shouldn't depend on a big kernel lock to be able to re-enable the irqs, > >and to receive the IPI. Nor any IPI handler could ever be allowed to > >take the big kernel lock. > > > >Unless you can demonstrate a dependency on the big kernel lock for CPU0 > >to re-enable the irqs eventually, I can't see how the above could > >deadlock. > > > > What happened was this: > > ================================================================ > TASK HAS CPU (0): 0x904000 (kupdated): > LOWCORE INFO: > -psw : 0x400100180000000 0x0000000007aa22 > -function : sync_old_buffers+58 > -prefix : 0x009c8000 > -cpu timer: 0xfffffa77 0x33465f80 > -clock cmp: 0x00b9e503 0x235324f4 > -general registers: > 0000000000000000 0x000000002e1000 > 0x0000000007aa22 0x0000000004919e > 0x00000000907f28 0x0000000022b428 > 0000000000000000 0000000000000000 > 0x00000000224340 0000000000000000 > 0x00000000970000 0x00000000233b00 > 0x00000000904000 0x000000001deb60 > 0x00000000907e78 0x00000000907dd8 > [ .. ] > STACK: > 0 kupdate+424 [0x7af4c] > 1 arch_kernel_thread+70 [0x18d72] > ================================================================ > TASK HAS CPU (1): 0x3318000 (reboot): > LOWCORE INFO: > -psw : 0x700100180000000 0x0000007ffe0d8e > -function : not found in System map > -prefix : 0x009c2000 > -cpu timer: 0xfff79e2d 0x014244c0 > -clock cmp: 0x00b9e503 0x3a5d9504 > -general registers: > 0x00000000000001 0000000000000000 > 0000000000000000 0000000000000000 > 0x0000000001f268 0000000000000000 > 0x00000000000003 0x000000000002c0 > 0x000000000490f0 0x00000003318000 > 0x00000000016422 0000000000000000 > 0x00000000000001 0x000000001da280 > 0x0000000001f268 0x0000000331bae0 > [ .. ] > STACK: > 0 machine_restart_smp+62 [0x1f5f2] > 1 machine_restart+48 [0x19884] > 2 sys_reboot+306 [0x49222] > 3 sysc_noemu+16 [0x16242] This can't be the lock_kernel, you see, there's no lock_kernel invocation at all in the machine_restart_smp path. > > Not sure why CPU#0 wanted to execute kupdate(), but hey ... kupdate always runs in the background, once every few seconds. I perfectly see why it hangs on s390, that's a bug in the arch code and it's unrelated to whatever lock_kernel and even unrelated to the potential deadlock prone shutdown of cpus that I mentioned in my previous email: static void do_machine_restart(void * __unused) { clear_bit(smp_processor_id(), &cpu_restart_map); if (smp_processor_id() == 0) { ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ if the `reboot` program doesn't run by luck on cpu 0, it will be the one that will hang instead of the others. the above check must be changed to: smp_processor_id() == reboot_cpu and you should snapshot reboot_cpu in machine_restart_smp. then it won't hang anymore. The x86 code may be safe in hanging in the IPI if it's extremely careful in not touching any lock that maybe hold by other cpus, between the IPI delivery and the hard reboot. So probably we don't need to replace the IPI with a kernel thread in 2.4, despite I find the kernel thread way more robust. Overall the lock_kernel I think is _needed_ (not only if something it's safer), to serialize reboot against reboot and I wouldn't bother changing it to another spinlock (at least not in 2.4) Andrea ^ permalink raw reply [flat|nested] 16+ messages in thread
* Re: Dumb question: BKL on reboot ? 2003-08-21 16:39 ` Andrea Arcangeli @ 2003-08-21 16:58 ` Andrea Arcangeli 2003-08-22 6:39 ` Hannes Reinecke 1 sibling, 0 replies; 16+ messages in thread From: Andrea Arcangeli @ 2003-08-21 16:58 UTC (permalink / raw) To: Hannes Reinecke; +Cc: Dave Hansen, Linux Kernel Mailing List On Thu, Aug 21, 2003 at 06:39:38PM +0200, Andrea Arcangeli wrote: > static void do_machine_restart(void * __unused) > { > clear_bit(smp_processor_id(), &cpu_restart_map); > if (smp_processor_id() == 0) { > ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ > > if the `reboot` program doesn't run by luck on cpu 0, it will be the one > that will hang instead of the others. the above check must be changed to: > > smp_processor_id() == reboot_cpu > > and you should snapshot reboot_cpu in machine_restart_smp. then it won't > hang anymore. this untested patch should do the trick: --- 2.4.22pre7aa1/arch/s390x/kernel/smp.c.~1~ 2003-07-19 02:34:04.000000000 +0200 +++ 2.4.22pre7aa1/arch/s390x/kernel/smp.c 2003-08-21 18:48:05.000000000 +0200 @@ -240,11 +240,12 @@ void smp_send_stop(void) * Reboot, halt and power_off routines for SMP. */ static volatile unsigned long cpu_restart_map; +static unsigned long restart_cpu; static void do_machine_restart(void * __unused) { clear_bit(smp_processor_id(), &cpu_restart_map); - if (smp_processor_id() == 0) { + if (smp_processor_id() == restart_cpu) { /* Wait for all other cpus to enter do_machine_restart. */ while (cpu_restart_map != 0); /* Store status of other cpus. */ @@ -263,6 +264,7 @@ static void do_machine_restart(void * __ void machine_restart_smp(char * __unused) { + restart_cpu = smp_processor_id(); cpu_restart_map = cpu_online_map; smp_call_function(do_machine_restart, NULL, 0, 0); do_machine_restart(NULL); Andrea ^ permalink raw reply [flat|nested] 16+ messages in thread
* Re: Dumb question: BKL on reboot ? 2003-08-21 16:39 ` Andrea Arcangeli 2003-08-21 16:58 ` Andrea Arcangeli @ 2003-08-22 6:39 ` Hannes Reinecke 2003-08-22 13:57 ` Andrea Arcangeli 2003-08-24 21:43 ` Andrea Arcangeli 1 sibling, 2 replies; 16+ messages in thread From: Hannes Reinecke @ 2003-08-22 6:39 UTC (permalink / raw) To: Andrea Arcangeli; +Cc: Dave Hansen, Linux Kernel Mailing List Andrea Arcangeli wrote: > On Thu, Aug 21, 2003 at 05:55:49PM +0200, Hannes Reinecke wrote: > >>Andrea Arcangeli wrote: >> >>>On Thu, Aug 21, 2003 at 10:05:20AM +0200, Hannes Reinecke wrote: >> >> >> >>What happened was this: >> >>================================================================ >>TASK HAS CPU (0): 0x904000 (kupdated): >> LOWCORE INFO: >> -psw : 0x400100180000000 0x0000000007aa22 >> -function : sync_old_buffers+58 >> -prefix : 0x009c8000 >> -cpu timer: 0xfffffa77 0x33465f80 >> -clock cmp: 0x00b9e503 0x235324f4 >> -general registers: >> 0000000000000000 0x000000002e1000 >> 0x0000000007aa22 0x0000000004919e >> 0x00000000907f28 0x0000000022b428 >> 0000000000000000 0000000000000000 >> 0x00000000224340 0000000000000000 >> 0x00000000970000 0x00000000233b00 >> 0x00000000904000 0x000000001deb60 >> 0x00000000907e78 0x00000000907dd8 >>[ .. ] >> STACK: >> 0 kupdate+424 [0x7af4c] >> 1 arch_kernel_thread+70 [0x18d72] >>================================================================ >>TASK HAS CPU (1): 0x3318000 (reboot): >> LOWCORE INFO: >> -psw : 0x700100180000000 0x0000007ffe0d8e >> -function : not found in System map >> -prefix : 0x009c2000 >> -cpu timer: 0xfff79e2d 0x014244c0 >> -clock cmp: 0x00b9e503 0x3a5d9504 >> -general registers: >> 0x00000000000001 0000000000000000 >> 0000000000000000 0000000000000000 >> 0x0000000001f268 0000000000000000 >> 0x00000000000003 0x000000000002c0 >> 0x000000000490f0 0x00000003318000 >> 0x00000000016422 0000000000000000 >> 0x00000000000001 0x000000001da280 >> 0x0000000001f268 0x0000000331bae0 >>[ .. ] >> STACK: >> 0 machine_restart_smp+62 [0x1f5f2] >> 1 machine_restart+48 [0x19884] >> 2 sys_reboot+306 [0x49222] >> 3 sysc_noemu+16 [0x16242] > > > This can't be the lock_kernel, you see, there's no lock_kernel > invocation at all in the machine_restart_smp path. Oh? sys_reboot() does call lock_kernel(). kernel/sys.c:303. Agreed, this smp_processor_id() == 0 thing is interesting. I'll try you suggestion and see how far I'll progress. Cheers, Hannes -- Dr. Hannes Reinecke hare@suse.de SuSE Linux AG S390 & zSeries Deutschherrnstr. 15-19 +49 911 74053 688 90429 Nürnberg http://www.suse.de ^ permalink raw reply [flat|nested] 16+ messages in thread
* Re: Dumb question: BKL on reboot ? 2003-08-22 6:39 ` Hannes Reinecke @ 2003-08-22 13:57 ` Andrea Arcangeli 2003-08-24 21:43 ` Andrea Arcangeli 1 sibling, 0 replies; 16+ messages in thread From: Andrea Arcangeli @ 2003-08-22 13:57 UTC (permalink / raw) To: Hannes Reinecke; +Cc: Dave Hansen, Linux Kernel Mailing List > >This can't be the lock_kernel, you see, there's no lock_kernel > >invocation at all in the machine_restart_smp path. > > Oh? sys_reboot() does call lock_kernel(). kernel/sys.c:303. yes, I know, I meant that it's not spinning on the lock, sys_reboot gets through lock_kernel w/o problems. It gets the lock successfully. > Agreed, this smp_processor_id() == 0 thing is interesting. I'll try you > suggestion and see how far I'll progress. thanks, that's the right fix as far as I can tell and the lock_kernel would better stay to serialize concurrent sys_reboot. Andrea ^ permalink raw reply [flat|nested] 16+ messages in thread
* Re: Dumb question: BKL on reboot ? 2003-08-22 6:39 ` Hannes Reinecke 2003-08-22 13:57 ` Andrea Arcangeli @ 2003-08-24 21:43 ` Andrea Arcangeli 1 sibling, 0 replies; 16+ messages in thread From: Andrea Arcangeli @ 2003-08-24 21:43 UTC (permalink / raw) To: Hannes Reinecke; +Cc: Dave Hansen, Linux Kernel Mailing List, uweigand Hi Hannes, On Fri, Aug 22, 2003 at 08:39:03AM +0200, Hannes Reinecke wrote: > Agreed, this smp_processor_id() == 0 thing is interesting. I'll try you > suggestion and see how far I'll progress. Ulrich pointed me out that only cpu0 can call into the VM (thanks!), and in turn the s390 code I tocuehd was infact correct (despite it looked suspect given the trace and the special ==0 case). After a closer inspection my (last ;) conclusion is this (cut-and-past from bugzilla) ---------------------- [..] But the data.started counter is already enforcing a good enough guarantee, that the CPU1 will execute "signal_processor(smp_processor_id(), sigp_stop);" only when the CPU0 is already executing inside the IPI handler. So I can't imagine the IPI on CPU0 getting lost. the last explanation I can think, is that the CPU0 executes the IPI, waits for cpu_restart_map to become 0 (i.e. CPU1 offline), and eventually executes this code correctly: do_store_status(); /* * Finally call reipl. Because we waited for all other * cpus to enter this function we know that they do * not hold any s390irq-locks (the cpus have been * interrupted by an external interrupt and s390irq * locks are always held disabled). */ reipl(S390_lowcore.ipl_device); } signal_processor(smp_processor_id(), sigp_stop); but the box doesn't reboot and the CPU0 returns from the IPI and goes idle until kupdate kicks in (finds the first idle cpu and tries to take the big kernel lock that is correctly held by CPU1 in sys_reboot). This seems a bug in the s390 lowlevel code above, it just doesn't reboot the machine. ---------------------- Maybe it's related to signal_processor(smp_processor_id(), sigp_stop) failing inside IPI handlers or whatever similar arch specific, dunno. The patch removing the BKL from sys_reboot shouldn't help either if my above theory is correct: when the IPI runs in cpu0, it's not even running on top of the BKL. It's just that the machine not rebooting, eventually will have the first idle cpu (cpu0) execute kupdate in mean in 2.5 sec, and it will eventually go in the state you found in lkcd since the BKL is held by cpu1 in sys_reboot that is already offline (no valid EIP in lkcd). So if you want to test, probably one interesting info you could generate to confirm my above theory, is to reproduce the very same deadlock even with the BKL removal patch applied to sys_reboot. Not sure how easy it is to reproduce it. If you can hang it again, it would not deadlock in kupdate anymore: you should find cpu0 stuck in the idle loop instead of sync_old_buffers. It maybe something slightly different going wrong too, but still I'm convinced the lock_kernel in sys_reboot is absoltely innocent and needed (at least in 2.4). Andrea ^ permalink raw reply [flat|nested] 16+ messages in thread
* Re: Dumb question: BKL on reboot ? 2003-08-20 20:23 ` Dave Hansen 2003-08-21 8:05 ` Hannes Reinecke @ 2003-08-21 15:33 ` Andrea Arcangeli 1 sibling, 0 replies; 16+ messages in thread From: Andrea Arcangeli @ 2003-08-21 15:33 UTC (permalink / raw) To: Dave Hansen Cc: David S. Miller, Andrew Morton, Hannes.Reinecke, Linux Kernel Mailing List On Wed, Aug 20, 2003 at 01:23:44PM -0700, Dave Hansen wrote: > On Wed, 2003-08-20 at 11:35, David S. Miller wrote: > > On Wed, 20 Aug 2003 11:29:18 -0700 > > Andrew Morton <akpm@osdl.org> wrote: > > > > > Where exactly does the rebooting CPU get stuck in machine_restart()? If > > > someone has done lock_kernel() with local interrupts disabled then yes, > > > it'll deadlock. But that's unlikely? Confused. > > I thought it was legal to do that. The normal interrupts-off spinlock > deadlock happens because a thread does a spin_lock() with no irq > disable, then gets interrupted. The interrupt handler tries to take the > lock, and gets stuck. that's the regular spinlock case but it has nothing to do with the big kernel lock during smp_call_function. holding the big kernel lock while broadcasting the IPI with smp_call_function is perfectly legal. Infact if something it's safer (and it can't explain any deadlock), and since it's a so low performance path, where scalability doesn't matter, leaving the BKL around that code can be acceptable. The only illegal thing is to call smp_call_function with _local_irqs_ disabled (as Andrew noted above). the big kernel lock doesn't matter for the smp_call_function callers. > If the BKL is put in that situation, the interrupt handler will see > current->lock_depth > 0, and not actually take the spinlock; it will > just increment the lock_depth. It won't deadlock. > > Or, are you saying that a CPU could have the BKL, and have > stop_this_cpu() called on it? I guess we could add > release_kernel_lock() to stop_this_cpu(). the *real* bug IMHO is that machine_restart isn't shutting down the other cpus properly in smp, it has nothing to do with lock_kernel in kernel/, the bug is in arch/. Dropping the big kernel lock in kernel/ can hide the bug, but it will showup again later in any other spinlock. The problem is that machine_restart in arch/s390 (and arch/i386 too) can easily hang a cpu in the IPI handler while it had zillon of spinlocks held, that may later interfere with the main "reboot" cpu. So at the very least a better fix than the one that drops the big kernel lock from kernel/ is to execute current->lock_depth = -1 before entering the infinite loop in the arch/ code. A real fix would be to use the unplug cpu framework to allow all cpus to reach the quiescent point, in this case it's easily doable even w/o using RCU and w/o the pluggable cpu framework in 2.4, by simply spawning a kernel thread bind to interesting cpu, and having this kernel thread setting the bitflag after it become runnable (cut and pasting the spawn_ksoftirqd will do the trick). The kernel thread will only execute the infinite loop then. Hanging a cpu forever in a IPI is simply deadlock prone for all other cpus, and that's the real bug. Andrea ^ permalink raw reply [flat|nested] 16+ messages in thread
[parent not found: <3F434BD1.9050704@suse.de.suse.lists.linux.kernel>]
* Re: Dumb question: BKL on reboot ? [not found] <3F434BD1.9050704@suse.de.suse.lists.linux.kernel> @ 2003-08-20 10:49 ` Andi Kleen 2003-08-20 11:51 ` Hannes Reinecke 0 siblings, 1 reply; 16+ messages in thread From: Andi Kleen @ 2003-08-20 10:49 UTC (permalink / raw) To: Hannes Reinecke; +Cc: linux-kernel Hannes Reinecke <Hannes.Reinecke@suse.de> writes: > I've got a dumb question: Why is the BKL held on entering sys_reboot() > in kernel/sys.c:405 ? Interesting. I have a few SMP deadlocks on x86-64 in reboot too and it's possible that it is the same problem. I would hold it during exection of the notifiers, but drop it before calling into machine_* -Andi --- linux-2.6.0test3-amd64/kernel/sys.c-o 2003-07-28 19:18:18.000000000 +0200 +++ linux-2.6.0test3-amd64/kernel/sys.c 2003-08-20 12:48:59.000000000 +0200 @@ -409,6 +409,7 @@ system_running = 0; device_shutdown(); printk(KERN_EMERG "Restarting system.\n"); + unlock_kernel(); machine_restart(NULL); break; @@ -425,8 +426,8 @@ system_running = 0; device_shutdown(); printk(KERN_EMERG "System halted.\n"); - machine_halt(); unlock_kernel(); + machine_halt(); do_exit(0); break; @@ -435,8 +436,8 @@ system_running = 0; device_shutdown(); printk(KERN_EMERG "Power down.\n"); - machine_power_off(); unlock_kernel(); + machine_power_off(); do_exit(0); break; @@ -451,6 +452,7 @@ system_running = 0; device_shutdown(); printk(KERN_EMERG "Restarting system with command '%s'.\n", buffer); + unlock_kernel(); machine_restart(buffer); break; ^ permalink raw reply [flat|nested] 16+ messages in thread
* Re: Dumb question: BKL on reboot ? 2003-08-20 10:49 ` Andi Kleen @ 2003-08-20 11:51 ` Hannes Reinecke 2003-08-20 12:03 ` Andi Kleen 0 siblings, 1 reply; 16+ messages in thread From: Hannes Reinecke @ 2003-08-20 11:51 UTC (permalink / raw) To: Andi Kleen; +Cc: linux-kernel Andi Kleen wrote: > Hannes Reinecke <Hannes.Reinecke@suse.de> writes: > > >>I've got a dumb question: Why is the BKL held on entering sys_reboot() >>in kernel/sys.c:405 ? > > > Interesting. I have a few SMP deadlocks on x86-64 in reboot too > and it's possible that it is the same problem. > > I would hold it during exection of the notifiers, but drop > it before calling into machine_* > Indeed. Works on s390 and survived several reboots (test still continuing). I moved lock_kernel() into the sections that actually _do_ something; looks much cleaner IMHO. THX for your help. Is it worth to try to push it into mainstream kernel? Cheers, Hannes -- --- linux-2.4.21/kernel/sys.c.orig 2003-08-20 10:52:39.000000000 +0200 +++ linux-2.4.21/kernel/sys.c 2003-08-20 10:52:39.000000000 +0200 @@ -340,11 +340,12 @@ magic2 != LINUX_REBOOT_MAGIC2B)) return -EINVAL; - lock_kernel(); switch (cmd) { case LINUX_REBOOT_CMD_RESTART: + lock_kernel(); notifier_call_chain(&reboot_notifier_list, SYS_RESTART, NULL); printk(KERN_EMERG "Restarting system.\n"); + unlock_kernel(); machine_restart(NULL); break; @@ -357,20 +358,25 @@ break; case LINUX_REBOOT_CMD_HALT: + lock_kernel(); notifier_call_chain(&reboot_notifier_list, SYS_HALT, NULL); printk(KERN_EMERG "System halted.\n"); + unlock_kernel(); machine_halt(); do_exit(0); break; case LINUX_REBOOT_CMD_POWER_OFF: + lock_kernel(); notifier_call_chain(&reboot_notifier_list, SYS_POWER_OFF, NULL); printk(KERN_EMERG "Power down.\n"); + unlock_kernel(); machine_power_off(); do_exit(0); break; case LINUX_REBOOT_CMD_RESTART2: + lock_kernel(); if (strncpy_from_user(&buffer[0], (char *)arg, sizeof(buffer) - 1) < 0) { unlock_kernel(); return -EFAULT; @@ -379,14 +385,13 @@ notifier_call_chain(&reboot_notifier_list, SYS_RESTART, buffer); printk(KERN_EMERG "Restarting system with command '%s'.\n", buff er); + unlock_kernel(); machine_restart(buffer); break; default: - unlock_kernel(); return -EINVAL; } - unlock_kernel(); return 0; } ^ permalink raw reply [flat|nested] 16+ messages in thread
* Re: Dumb question: BKL on reboot ? 2003-08-20 11:51 ` Hannes Reinecke @ 2003-08-20 12:03 ` Andi Kleen 0 siblings, 0 replies; 16+ messages in thread From: Andi Kleen @ 2003-08-20 12:03 UTC (permalink / raw) To: Hannes Reinecke; +Cc: Andi Kleen, linux-kernel > Is it worth to try to push it into mainstream kernel? Yes definitely. My patch was for 2.6.0test3. -Andi ^ permalink raw reply [flat|nested] 16+ messages in thread
end of thread, other threads:[~2003-08-24 21:42 UTC | newest] Thread overview: 16+ messages (download: mbox.gz / follow: Atom feed) -- links below jump to the message on this page -- 2003-08-20 10:22 Dumb question: BKL on reboot ? Hannes Reinecke 2003-08-20 18:29 ` Andrew Morton 2003-08-20 18:35 ` David S. Miller 2003-08-20 20:23 ` Dave Hansen 2003-08-21 8:05 ` Hannes Reinecke 2003-08-21 15:41 ` Andrea Arcangeli 2003-08-21 15:55 ` Hannes Reinecke 2003-08-21 16:39 ` Andrea Arcangeli 2003-08-21 16:58 ` Andrea Arcangeli 2003-08-22 6:39 ` Hannes Reinecke 2003-08-22 13:57 ` Andrea Arcangeli 2003-08-24 21:43 ` Andrea Arcangeli 2003-08-21 15:33 ` Andrea Arcangeli [not found] <3F434BD1.9050704@suse.de.suse.lists.linux.kernel> 2003-08-20 10:49 ` Andi Kleen 2003-08-20 11:51 ` Hannes Reinecke 2003-08-20 12:03 ` Andi Kleen
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox; as well as URLs for NNTP newsgroup(s).