* [PATCH/RFC] sched: Remove SYSTEM_RUNNING checks from cond_resched*() @ 2009-07-07 23:58 Anton Vorontsov 2009-07-08 0:50 ` Oleg Nesterov 0 siblings, 1 reply; 27+ messages in thread From: Anton Vorontsov @ 2009-07-07 23:58 UTC (permalink / raw) To: Ingo Molnar Cc: Linus Torvalds, Andrew Morton, Oleg Nesterov, Peter Zijlstra, linux-kernel Using early netconsole and gianfar driver this error pops up: netconsole: timeout waiting for carrier It appears that net/core/netpoll.c:netpoll_setup() is using cond_resched() in a loop waiting for a carrier. The thing is that cond_resched() is a no-op when system_state != SYSTEM_RUNNING, and so drivers/net/phy/phy.c's state_queue is never scheduled, therefore link detection doesn't work. This patch fixes the issue by removing SYSTEM_RUNNING checks, which might be dangerous to do, but looking at 'git blame' I can see that the first SYSTEM_RUNNING check was introduced in this commit: commit 8ba7b0a14b2ec19583bedbcdbea7f1c5008fc922 Author: Linus Torvalds <torvalds@g5.osdl.org> Date: Mon Mar 6 17:38:49 2006 -0800 Add early-boot-safety check to cond_resched() Just to be safe, we should not trigger a conditional reschedule during the early boot sequence. We've historically done some questionable early on, and the safety warnings in __might_sleep() are generally turned off during that period, so there might be problems lurking. This affects CONFIG_PREEMPT_VOLUNTARY, which takes over might_sleep() to cause a voluntary conditional reschedule. The commit doesn't mention that it fixes some particular problem, though it surely breaks netconsole for drivers that are using phylib. I tested the patch with CONFIG_PREEMPT_VOLUNTARY (on PowerPC), and it didn't cause any issues. Surely, the other way to fix the netconsole is to call schedule() or msleep() directly, but I'm curious if removing the checks makes sense nowadays. Signed-off-by: Anton Vorontsov <avorontsov@ru.mvista.com> --- kernel/sched.c | 7 +++---- 1 files changed, 3 insertions(+), 4 deletions(-) diff --git a/kernel/sched.c b/kernel/sched.c index 7c9098d..3899baa 100644 --- a/kernel/sched.c +++ b/kernel/sched.c @@ -6560,8 +6560,7 @@ static void __cond_resched(void) int __sched _cond_resched(void) { - if (need_resched() && !(preempt_count() & PREEMPT_ACTIVE) && - system_state == SYSTEM_RUNNING) { + if (need_resched() && !(preempt_count() & PREEMPT_ACTIVE)) { __cond_resched(); return 1; } @@ -6579,7 +6578,7 @@ EXPORT_SYMBOL(_cond_resched); */ int cond_resched_lock(spinlock_t *lock) { - int resched = need_resched() && system_state == SYSTEM_RUNNING; + int resched = need_resched(); int ret = 0; if (spin_needbreak(lock) || resched) { @@ -6599,7 +6598,7 @@ int __sched cond_resched_softirq(void) { BUG_ON(!in_softirq()); - if (need_resched() && system_state == SYSTEM_RUNNING) { + if (need_resched()) { local_bh_enable(); __cond_resched(); local_bh_disable(); -- 1.6.3.3 ^ permalink raw reply related [flat|nested] 27+ messages in thread
* Re: [PATCH/RFC] sched: Remove SYSTEM_RUNNING checks from cond_resched*() 2009-07-07 23:58 [PATCH/RFC] sched: Remove SYSTEM_RUNNING checks from cond_resched*() Anton Vorontsov @ 2009-07-08 0:50 ` Oleg Nesterov 2009-07-08 6:24 ` Peter Zijlstra 0 siblings, 1 reply; 27+ messages in thread From: Oleg Nesterov @ 2009-07-08 0:50 UTC (permalink / raw) To: Anton Vorontsov Cc: Ingo Molnar, Linus Torvalds, Andrew Morton, Peter Zijlstra, linux-kernel On 07/08, Anton Vorontsov wrote: > > but I'm curious if removing the checks makes sense > nowadays. ... > --- a/kernel/sched.c > +++ b/kernel/sched.c > @@ -6560,8 +6560,7 @@ static void __cond_resched(void) > > int __sched _cond_resched(void) > { > - if (need_resched() && !(preempt_count() & PREEMPT_ACTIVE) && > - system_state == SYSTEM_RUNNING) { > + if (need_resched() && !(preempt_count() & PREEMPT_ACTIVE)) { > __cond_resched(); > return 1; > } and, with CONFIG_PREEMPT preempt_schedule() does not check system_state, so it looks really strange cond_resched() does check SYSTEM_RUNNING. debug_smp_processor_id() looks strange too: /* * It is valid to assume CPU-locality during early bootup: */ if (system_state != SYSTEM_RUNNING) goto out; this doesn't look right, smp_init() is called before we set SYSTEM_RUNNING. Hmm, and /* * Kernel threads bound to a single CPU can safely use * smp_processor_id(): */ if (cpumask_equal(¤t->cpus_allowed, cpumask_of(this_cpu))) goto out; perhaps this should use PF_THREAD_BOUND ? Oleg. ^ permalink raw reply [flat|nested] 27+ messages in thread
* Re: [PATCH/RFC] sched: Remove SYSTEM_RUNNING checks from cond_resched*() 2009-07-08 0:50 ` Oleg Nesterov @ 2009-07-08 6:24 ` Peter Zijlstra 2009-07-08 12:03 ` Anton Vorontsov 2009-07-08 16:12 ` [PATCH/RFC] sched: Remove SYSTEM_RUNNING checks from cond_resched*() Linus Torvalds 0 siblings, 2 replies; 27+ messages in thread From: Peter Zijlstra @ 2009-07-08 6:24 UTC (permalink / raw) To: Oleg Nesterov Cc: Anton Vorontsov, Ingo Molnar, Linus Torvalds, Andrew Morton, linux-kernel On Wed, 2009-07-08 at 02:50 +0200, Oleg Nesterov wrote: > /* > * It is valid to assume CPU-locality during early bootup: > */ > if (system_state != SYSTEM_RUNNING) > goto out; > > this doesn't look right, smp_init() is called before we set > SYSTEM_RUNNING. The thing is, there's also ton's of code that might end up calling cond_resched() and co before the scheduler is fully initialized. Doing so would indeed mess things up. Also, by definition we'd have to call smp_init() before SYSTEM_RUNNING, because you simply cannot declare a system up and running when your core functionality isn't initialized. So I'd really rather preserve these checks -- I can even remember running into some of these things a while back, but memory isn't providing specific cases. > Hmm, and > > /* > * Kernel threads bound to a single CPU can safely use > * smp_processor_id(): > */ > if (cpumask_equal(¤t->cpus_allowed, cpumask_of(this_cpu))) > goto out; > > perhaps this should use PF_THREAD_BOUND ? That might predate PF_THREAD_BOUND, also I think this is more generic, and I think we used it for that set_affinity dance we did Rusty 'fixed' a while back. ^ permalink raw reply [flat|nested] 27+ messages in thread
* Re: [PATCH/RFC] sched: Remove SYSTEM_RUNNING checks from cond_resched*() 2009-07-08 6:24 ` Peter Zijlstra @ 2009-07-08 12:03 ` Anton Vorontsov 2009-07-08 12:12 ` Peter Zijlstra 2009-07-08 16:12 ` [PATCH/RFC] sched: Remove SYSTEM_RUNNING checks from cond_resched*() Linus Torvalds 1 sibling, 1 reply; 27+ messages in thread From: Anton Vorontsov @ 2009-07-08 12:03 UTC (permalink / raw) To: Peter Zijlstra Cc: Oleg Nesterov, Ingo Molnar, Linus Torvalds, Andrew Morton, linux-kernel On Wed, Jul 08, 2009 at 08:24:23AM +0200, Peter Zijlstra wrote: > On Wed, 2009-07-08 at 02:50 +0200, Oleg Nesterov wrote: > > > /* > > * It is valid to assume CPU-locality during early bootup: > > */ > > if (system_state != SYSTEM_RUNNING) > > goto out; > > > > this doesn't look right, smp_init() is called before we set > > SYSTEM_RUNNING. > > The thing is, there's also ton's of code that might end up calling > cond_resched() and co before the scheduler is fully initialized. Hm. Speaking of cond_resched*() only, then it should be pretty safe to convert the SYSTEM_RUNNING checks to scheduler_running, no? scheduler_running is set after sched_init(). Something like this diff --git a/kernel/sched.c b/kernel/sched.c index 7c9098d..555360b 100644 --- a/kernel/sched.c +++ b/kernel/sched.c @@ -6561,7 +6561,7 @@ static void __cond_resched(void) int __sched _cond_resched(void) { if (need_resched() && !(preempt_count() & PREEMPT_ACTIVE) && - system_state == SYSTEM_RUNNING) { + scheduler_running) { __cond_resched(); return 1; } @@ -6579,7 +6579,7 @@ EXPORT_SYMBOL(_cond_resched); */ int cond_resched_lock(spinlock_t *lock) { - int resched = need_resched() && system_state == SYSTEM_RUNNING; + int resched = need_resched() && scheduler_running; int ret = 0; if (spin_needbreak(lock) || resched) { @@ -6599,7 +6599,7 @@ int __sched cond_resched_softirq(void) { BUG_ON(!in_softirq()); - if (need_resched() && system_state == SYSTEM_RUNNING) { + if (need_resched() && scheduler_running) { local_bh_enable(); __cond_resched(); local_bh_disable(); ^ permalink raw reply related [flat|nested] 27+ messages in thread
* Re: [PATCH/RFC] sched: Remove SYSTEM_RUNNING checks from cond_resched*() 2009-07-08 12:03 ` Anton Vorontsov @ 2009-07-08 12:12 ` Peter Zijlstra 2009-07-08 12:55 ` Anton Vorontsov 0 siblings, 1 reply; 27+ messages in thread From: Peter Zijlstra @ 2009-07-08 12:12 UTC (permalink / raw) To: avorontsov Cc: Oleg Nesterov, Ingo Molnar, Linus Torvalds, Andrew Morton, linux-kernel On Wed, 2009-07-08 at 16:03 +0400, Anton Vorontsov wrote: > On Wed, Jul 08, 2009 at 08:24:23AM +0200, Peter Zijlstra wrote: > > On Wed, 2009-07-08 at 02:50 +0200, Oleg Nesterov wrote: > > > > > /* > > > * It is valid to assume CPU-locality during early bootup: > > > */ > > > if (system_state != SYSTEM_RUNNING) > > > goto out; > > > > > > this doesn't look right, smp_init() is called before we set > > > SYSTEM_RUNNING. > > > > The thing is, there's also ton's of code that might end up calling > > cond_resched() and co before the scheduler is fully initialized. > > Hm. Speaking of cond_resched*() only, then it should be pretty > safe to convert the SYSTEM_RUNNING checks to scheduler_running, > no? scheduler_running is set after sched_init(). Hmm, that might work, I'd have to audit sched_init_smp() as it seems to do way too much... ^ permalink raw reply [flat|nested] 27+ messages in thread
* Re: [PATCH/RFC] sched: Remove SYSTEM_RUNNING checks from cond_resched*() 2009-07-08 12:12 ` Peter Zijlstra @ 2009-07-08 12:55 ` Anton Vorontsov 2009-07-08 12:58 ` Peter Zijlstra 0 siblings, 1 reply; 27+ messages in thread From: Anton Vorontsov @ 2009-07-08 12:55 UTC (permalink / raw) To: Peter Zijlstra Cc: Oleg Nesterov, Ingo Molnar, Linus Torvalds, Andrew Morton, linux-kernel On Wed, Jul 08, 2009 at 02:12:25PM +0200, Peter Zijlstra wrote: > On Wed, 2009-07-08 at 16:03 +0400, Anton Vorontsov wrote: > > On Wed, Jul 08, 2009 at 08:24:23AM +0200, Peter Zijlstra wrote: > > > On Wed, 2009-07-08 at 02:50 +0200, Oleg Nesterov wrote: > > > > > > > /* > > > > * It is valid to assume CPU-locality during early bootup: > > > > */ > > > > if (system_state != SYSTEM_RUNNING) > > > > goto out; > > > > > > > > this doesn't look right, smp_init() is called before we set > > > > SYSTEM_RUNNING. > > > > > > The thing is, there's also ton's of code that might end up calling > > > cond_resched() and co before the scheduler is fully initialized. > > > > Hm. Speaking of cond_resched*() only, then it should be pretty > > safe to convert the SYSTEM_RUNNING checks to scheduler_running, > > no? scheduler_running is set after sched_init(). > > Hmm, that might work, I'd have to audit sched_init_smp() as it seems to > do way too much... sched_init_smp() is called from the kernel_thread(), so if the scheduler is not functional prior to kernel_thread(), you're in trouble anyway, no? The point is that a lot of code is calling schedule() prior to sched_init_smp() (e.g. msleep(), mutexes), and there are no issues. So should be no issues with cond_resched()? -- Anton Vorontsov email: cbouatmailru@gmail.com irc://irc.freenode.net/bd2 ^ permalink raw reply [flat|nested] 27+ messages in thread
* Re: [PATCH/RFC] sched: Remove SYSTEM_RUNNING checks from cond_resched*() 2009-07-08 12:55 ` Anton Vorontsov @ 2009-07-08 12:58 ` Peter Zijlstra 2009-07-08 20:45 ` [PATCH] sched: Make cond_resched*() available earlier Anton Vorontsov 0 siblings, 1 reply; 27+ messages in thread From: Peter Zijlstra @ 2009-07-08 12:58 UTC (permalink / raw) To: avorontsov Cc: Oleg Nesterov, Ingo Molnar, Linus Torvalds, Andrew Morton, linux-kernel On Wed, 2009-07-08 at 16:55 +0400, Anton Vorontsov wrote: > On Wed, Jul 08, 2009 at 02:12:25PM +0200, Peter Zijlstra wrote: > > On Wed, 2009-07-08 at 16:03 +0400, Anton Vorontsov wrote: > > > On Wed, Jul 08, 2009 at 08:24:23AM +0200, Peter Zijlstra wrote: > > > > On Wed, 2009-07-08 at 02:50 +0200, Oleg Nesterov wrote: > > > > > > > > > /* > > > > > * It is valid to assume CPU-locality during early bootup: > > > > > */ > > > > > if (system_state != SYSTEM_RUNNING) > > > > > goto out; > > > > > > > > > > this doesn't look right, smp_init() is called before we set > > > > > SYSTEM_RUNNING. > > > > > > > > The thing is, there's also ton's of code that might end up calling > > > > cond_resched() and co before the scheduler is fully initialized. > > > > > > Hm. Speaking of cond_resched*() only, then it should be pretty > > > safe to convert the SYSTEM_RUNNING checks to scheduler_running, > > > no? scheduler_running is set after sched_init(). > > > > Hmm, that might work, I'd have to audit sched_init_smp() as it seems to > > do way too much... > > sched_init_smp() is called from the kernel_thread(), so > if the scheduler is not functional prior to kernel_thread(), > you're in trouble anyway, no? The point is that a lot of > code is calling schedule() prior to sched_init_smp() > (e.g. msleep(), mutexes), and there are no issues. So > should be no issues with cond_resched()? Yeah, it should be good, I just got paranoid looking at sched_init_smp(). ^ permalink raw reply [flat|nested] 27+ messages in thread
* [PATCH] sched: Make cond_resched*() available earlier 2009-07-08 12:58 ` Peter Zijlstra @ 2009-07-08 20:45 ` Anton Vorontsov 0 siblings, 0 replies; 27+ messages in thread From: Anton Vorontsov @ 2009-07-08 20:45 UTC (permalink / raw) To: Peter Zijlstra Cc: Oleg Nesterov, Ingo Molnar, Linus Torvalds, Andrew Morton, linux-kernel Using early netconsole and gianfar driver this error pops up: netconsole: timeout waiting for carrier It appears that net/core/netpoll.c:netpoll_setup() is using cond_resched() in a loop waiting for a carrier. The thing is that cond_resched() is a no-op when system_state != SYSTEM_RUNNING, and so drivers/net/phy/phy.c's state_queue is never scheduled, therefore link detection doesn't work. The system_state check exists to avoid very early calls to the scheduler. Though, using cond_resched() should be safe after scheduler is initialized, so instead of checking the system_state, we can test that the scheduler is running, therefore making cond_resched() and friends available much earlier (but not too much). Signed-off-by: Anton Vorontsov <avorontsov@ru.mvista.com> --- On Wed, Jul 08, 2009 at 02:58:46PM +0200, Peter Zijlstra wrote: [...] > > > > Hm. Speaking of cond_resched*() only, then it should be pretty > > > > safe to convert the SYSTEM_RUNNING checks to scheduler_running, > > > > no? scheduler_running is set after sched_init(). > > > > > > Hmm, that might work, I'd have to audit sched_init_smp() as it seems to > > > do way too much... > > > > sched_init_smp() is called from the kernel_thread(), so > > if the scheduler is not functional prior to kernel_thread(), > > you're in trouble anyway, no? The point is that a lot of > > code is calling schedule() prior to sched_init_smp() > > (e.g. msleep(), mutexes), and there are no issues. So > > should be no issues with cond_resched()? > > Yeah, it should be good, I just got paranoid looking at > sched_init_smp(). OK, thanks everyone for the reviews. Here is an updated patch. As usual, tested on UP PowerPC, with and without SMP, PREEMPT and PREEMPT_VOLUNTARY. kernel/sched.c | 6 +++--- 1 files changed, 3 insertions(+), 3 deletions(-) diff --git a/kernel/sched.c b/kernel/sched.c index 7c9098d..555360b 100644 --- a/kernel/sched.c +++ b/kernel/sched.c @@ -6561,7 +6561,7 @@ static void __cond_resched(void) int __sched _cond_resched(void) { if (need_resched() && !(preempt_count() & PREEMPT_ACTIVE) && - system_state == SYSTEM_RUNNING) { + scheduler_running) { __cond_resched(); return 1; } @@ -6579,7 +6579,7 @@ EXPORT_SYMBOL(_cond_resched); */ int cond_resched_lock(spinlock_t *lock) { - int resched = need_resched() && system_state == SYSTEM_RUNNING; + int resched = need_resched() && scheduler_running; int ret = 0; if (spin_needbreak(lock) || resched) { @@ -6599,7 +6599,7 @@ int __sched cond_resched_softirq(void) { BUG_ON(!in_softirq()); - if (need_resched() && system_state == SYSTEM_RUNNING) { + if (need_resched() && scheduler_running) { local_bh_enable(); __cond_resched(); local_bh_disable(); -- 1.6.3.3 ^ permalink raw reply related [flat|nested] 27+ messages in thread
* Re: [PATCH/RFC] sched: Remove SYSTEM_RUNNING checks from cond_resched*() 2009-07-08 6:24 ` Peter Zijlstra 2009-07-08 12:03 ` Anton Vorontsov @ 2009-07-08 16:12 ` Linus Torvalds 2009-07-08 21:10 ` Andrew Morton 1 sibling, 1 reply; 27+ messages in thread From: Linus Torvalds @ 2009-07-08 16:12 UTC (permalink / raw) To: Peter Zijlstra Cc: Oleg Nesterov, Anton Vorontsov, Ingo Molnar, Andrew Morton, linux-kernel On Wed, 8 Jul 2009, Peter Zijlstra wrote: > On Wed, 2009-07-08 at 02:50 +0200, Oleg Nesterov wrote: > > > /* > > * It is valid to assume CPU-locality during early bootup: > > */ > > if (system_state != SYSTEM_RUNNING) > > goto out; > > > > this doesn't look right, smp_init() is called before we set > > SYSTEM_RUNNING. > > The thing is, there's also ton's of code that might end up calling > cond_resched() and co before the scheduler is fully initialized. Doing > so would indeed mess things up. I forget what triggered me to add some of them, but we definitely had bugs with the scheduler being called early, and having some really nasty oopses happen early in the boot (and that early, they don't get saved, so the reporting percentage goes down to something very low). So I think that the patch Anton posted is probably the RightThing(tm) to do, and in that sense I like it - but they make me very nervous. There's a lot of "cond_resched()"s sprinkled about in helper routines, and if they are ever called early, you're basically screwed. > Also, by definition we'd have to call smp_init() before SYSTEM_RUNNING, > because you simply cannot declare a system up and running when your core > functionality isn't initialized. Now, that part I'm not sure about. We can consider a UP system to be running, and brining up the other CPU's to be a matter of CPU hotplug. It's not what we do _now_, of course. We don't force hotplug support on people just because they want SMP, and we don't want to go through the whole "rewrite locks" things twice etc. > So I'd really rather preserve these checks -- I can even remember > running into some of these things a while back, but memory isn't > providing specific cases. Yes. I get nervous too about the patch. That said, I do agree that maybe SYSTEM_RUNNING isn't the right check. Testing that the scheduler is initialized may be the more correct one. I think the SYSTEM_RUNNING one just comes from that being used for other debug issues. Linus ^ permalink raw reply [flat|nested] 27+ messages in thread
* Re: [PATCH/RFC] sched: Remove SYSTEM_RUNNING checks from cond_resched*() 2009-07-08 16:12 ` [PATCH/RFC] sched: Remove SYSTEM_RUNNING checks from cond_resched*() Linus Torvalds @ 2009-07-08 21:10 ` Andrew Morton 2009-07-08 21:33 ` Anton Vorontsov 2009-07-09 23:20 ` [PATCH/RFC] sched: Remove SYSTEM_RUNNING checks from cond_resched*() Pavel Machek 0 siblings, 2 replies; 27+ messages in thread From: Andrew Morton @ 2009-07-08 21:10 UTC (permalink / raw) To: Linus Torvalds; +Cc: a.p.zijlstra, oleg, avorontsov, mingo, linux-kernel > On Wed, 8 Jul 2009 09:12:30 -0700 (PDT) Linus Torvalds <torvalds@linux-foundation.org> wrote: > That said, I do agree that maybe SYSTEM_RUNNING isn't the right check. > Testing that the scheduler is initialized may be the more correct one. I > think the SYSTEM_RUNNING one just comes from that being used for other > debug issues. Agreed. system_state is too general. If we specifically want to know whether it is safe to call schedule() then let's create a global boolean it_is_safe_to_call_schedule and test that, rather than testing something which indirectly and unreliably implies "it is safe to call schedule". If that boolean already exists then no-brainer. All that being said, I wonder if the netconsole code should be using msleep(1) instead. Spinning on cond_resched() is a bit rude. But one would have to verify that it is safe to call schedule() at this time, and for the netconsole caller, this is dubious. ^ permalink raw reply [flat|nested] 27+ messages in thread
* Re: [PATCH/RFC] sched: Remove SYSTEM_RUNNING checks from cond_resched*() 2009-07-08 21:10 ` Andrew Morton @ 2009-07-08 21:33 ` Anton Vorontsov 2009-07-08 21:47 ` Andrew Morton 2009-07-09 23:20 ` [PATCH/RFC] sched: Remove SYSTEM_RUNNING checks from cond_resched*() Pavel Machek 1 sibling, 1 reply; 27+ messages in thread From: Anton Vorontsov @ 2009-07-08 21:33 UTC (permalink / raw) To: Andrew Morton; +Cc: Linus Torvalds, a.p.zijlstra, oleg, mingo, linux-kernel On Wed, Jul 08, 2009 at 02:10:24PM -0700, Andrew Morton wrote: > > On Wed, 8 Jul 2009 09:12:30 -0700 (PDT) Linus Torvalds <torvalds@linux-foundation.org> wrote: > > That said, I do agree that maybe SYSTEM_RUNNING isn't the right check. > > Testing that the scheduler is initialized may be the more correct one. I > > think the SYSTEM_RUNNING one just comes from that being used for other > > debug issues. > > Agreed. system_state is too general. > > If we specifically want to know whether it is safe to call schedule() then > let's create a global boolean it_is_safe_to_call_schedule and test that, > rather than testing something which indirectly and unreliably implies "it > is safe to call schedule". If that boolean already exists then no-brainer. > > All that being said, I wonder if the netconsole code should be using > msleep(1) instead. Spinning on cond_resched() is a bit rude. But one > would have to verify that it is safe to call schedule() at this time, and > for the netconsole caller, this is dubious. What do you mean by "verify that it is safe"? If it works, can I assume that it's safe? ;-) It works, fwiw. Thanks, -- Anton Vorontsov email: cbouatmailru@gmail.com irc://irc.freenode.net/bd2 ^ permalink raw reply [flat|nested] 27+ messages in thread
* Re: [PATCH/RFC] sched: Remove SYSTEM_RUNNING checks from cond_resched*() 2009-07-08 21:33 ` Anton Vorontsov @ 2009-07-08 21:47 ` Andrew Morton 2009-07-08 22:20 ` [PATCH] netpoll: Fix carrier detection for drivers that are using phylib Anton Vorontsov 0 siblings, 1 reply; 27+ messages in thread From: Andrew Morton @ 2009-07-08 21:47 UTC (permalink / raw) To: avorontsov; +Cc: torvalds, a.p.zijlstra, oleg, mingo, linux-kernel, netdev (belatedly cc'ing netdev) Original diagnosis: : Using early netconsole and gianfar driver this error pops up: : : netconsole: timeout waiting for carrier : : It appears that net/core/netpoll.c:netpoll_setup() is using : cond_resched() in a loop waiting for a carrier. : : The thing is that cond_resched() is a no-op when system_state != : SYSTEM_RUNNING, and so drivers/net/phy/phy.c's state_queue is never : scheduled, therefore link detection doesn't work > On Thu, 9 Jul 2009 01:33:31 +0400 Anton Vorontsov <avorontsov@ru.mvista.com> wrote: > On Wed, Jul 08, 2009 at 02:10:24PM -0700, Andrew Morton wrote: > > > On Wed, 8 Jul 2009 09:12:30 -0700 (PDT) Linus Torvalds <torvalds@linux-foundation.org> wrote: > > > That said, I do agree that maybe SYSTEM_RUNNING isn't the right check. > > > Testing that the scheduler is initialized may be the more correct one. I > > > think the SYSTEM_RUNNING one just comes from that being used for other > > > debug issues. > > > > Agreed. system_state is too general. > > > > If we specifically want to know whether it is safe to call schedule() then > > let's create a global boolean it_is_safe_to_call_schedule and test that, > > rather than testing something which indirectly and unreliably implies "it > > is safe to call schedule". If that boolean already exists then no-brainer. > > > > All that being said, I wonder if the netconsole code should be using > > msleep(1) instead. Spinning on cond_resched() is a bit rude. But one > > would have to verify that it is safe to call schedule() at this time, and > > for the netconsole caller, this is dubious. > > What do you mean by "verify that it is safe"? If it works, > can I assume that it's safe? ;-) It works, fwiw. > netconsole is supposed to be available as early as possible in boot for obvious reasons. I'd say there's a decent risk now and in the future that netconsole will be initialised prior to the scheduler being available. In fact, if "netconsole: timeout waiting for carrier" newly added to netpoll_setup() a depedency on the scheduler being available then perhaps that was an incorrect change. ^ permalink raw reply [flat|nested] 27+ messages in thread
* [PATCH] netpoll: Fix carrier detection for drivers that are using phylib 2009-07-08 21:47 ` Andrew Morton @ 2009-07-08 22:20 ` Anton Vorontsov 2009-07-09 0:01 ` Linus Torvalds 2009-07-09 12:52 ` Matt Mackall 0 siblings, 2 replies; 27+ messages in thread From: Anton Vorontsov @ 2009-07-08 22:20 UTC (permalink / raw) To: Andrew Morton; +Cc: torvalds, a.p.zijlstra, oleg, mingo, linux-kernel, netdev Using early netconsole and gianfar driver this error pops up: netconsole: timeout waiting for carrier It appears that net/core/netpoll.c:netpoll_setup() is using cond_resched() in a loop waiting for a carrier. The thing is that cond_resched() is a no-op when system_state != SYSTEM_RUNNING, and so drivers/net/phy/phy.c's state_queue is never scheduled, therefore link detection doesn't work. I belive that the main problem is in cond_resched()[1], but despite how the cond_resched() story ends, it might be a good idea to call msleep(1) instead of cond_resched(), as suggested by Andrew Morton. [1] http://lkml.org/lkml/2009/7/7/463 Signed-off-by: Anton Vorontsov <avorontsov@ru.mvista.com> --- On Wed, Jul 08, 2009 at 02:47:44PM -0700, Andrew Morton wrote: > (belatedly cc'ing netdev) > > Original diagnosis: > > : Using early netconsole and gianfar driver this error pops up: > : > : netconsole: timeout waiting for carrier > : > : It appears that net/core/netpoll.c:netpoll_setup() is using > : cond_resched() in a loop waiting for a carrier. > : > : The thing is that cond_resched() is a no-op when system_state != > : SYSTEM_RUNNING, and so drivers/net/phy/phy.c's state_queue is never > : scheduled, therefore link detection doesn't work > > > On Thu, 9 Jul 2009 01:33:31 +0400 Anton Vorontsov <avorontsov@ru.mvista.com> wrote: > > On Wed, Jul 08, 2009 at 02:10:24PM -0700, Andrew Morton wrote: > > > > On Wed, 8 Jul 2009 09:12:30 -0700 (PDT) Linus Torvalds <torvalds@linux-foundation.org> wrote: > > > > That said, I do agree that maybe SYSTEM_RUNNING isn't the right check. > > > > Testing that the scheduler is initialized may be the more correct one. I > > > > think the SYSTEM_RUNNING one just comes from that being used for other > > > > debug issues. > > > > > > Agreed. system_state is too general. > > > > > > If we specifically want to know whether it is safe to call schedule() then > > > let's create a global boolean it_is_safe_to_call_schedule and test that, > > > rather than testing something which indirectly and unreliably implies "it > > > is safe to call schedule". If that boolean already exists then no-brainer. > > > > > > All that being said, I wonder if the netconsole code should be using > > > msleep(1) instead. Spinning on cond_resched() is a bit rude. But one > > > would have to verify that it is safe to call schedule() at this time, and > > > for the netconsole caller, this is dubious. > > > > What do you mean by "verify that it is safe"? If it works, > > can I assume that it's safe? ;-) It works, fwiw. > > > > netconsole is supposed to be available as early as possible in boot for > obvious reasons. I'd say there's a decent risk now and in the future that > netconsole will be initialised prior to the scheduler being available. > > In fact, if "netconsole: timeout waiting for carrier" newly added to > netpoll_setup() a depedency on the scheduler being available then perhaps > that was an incorrect change. 'git blame' says that carrier detection code didn't change since 2.6.12 (where git history starts), PHYLIB is using workqueue since its submission (2.6.13). And SYSTEM_RUNNING check was added in 2.6.16. So it's not a new dependency. The netpoll code is using msleep() just a few lines below cond_resched(), so we won't make things worse. ;-) Thanks! net/core/netpoll.c | 2 +- 1 files changed, 1 insertions(+), 1 deletions(-) diff --git a/net/core/netpoll.c b/net/core/netpoll.c index 9675f31..df30feb 100644 --- a/net/core/netpoll.c +++ b/net/core/netpoll.c @@ -740,7 +740,7 @@ int netpoll_setup(struct netpoll *np) np->name); break; } - cond_resched(); + msleep(1); } /* If carrier appears to come up instantly, we don't -- 1.6.3.3 ^ permalink raw reply related [flat|nested] 27+ messages in thread
* Re: [PATCH] netpoll: Fix carrier detection for drivers that are using phylib 2009-07-08 22:20 ` [PATCH] netpoll: Fix carrier detection for drivers that are using phylib Anton Vorontsov @ 2009-07-09 0:01 ` Linus Torvalds 2009-07-09 3:08 ` David Miller ` (2 more replies) 2009-07-09 12:52 ` Matt Mackall 1 sibling, 3 replies; 27+ messages in thread From: Linus Torvalds @ 2009-07-09 0:01 UTC (permalink / raw) To: Anton Vorontsov Cc: Andrew Morton, a.p.zijlstra, oleg, mingo, linux-kernel, netdev On Thu, 9 Jul 2009, Anton Vorontsov wrote: > > The netpoll code is using msleep() just a few lines below cond_resched(), > so we won't make things worse. ;-) Yeah. That function is definitely sleeping. It does things like kmalloc(GFP_KERNEL), rtnl_lock() and synchronize_rcu() etc too, so an added msleep() is the least of our problems. Afaik, it's called from a bog-standard "module_init()", which happens late enough that everything works. In fact, I wonder if we should set SYSTEM_RUNNING much earlier - _before_ doing the whole "do_initcalls()". By then we've set up all the core stuff and enabled interrupts, so we really _are_ running. We just don't necessarily have drivers, filesystems etc loaded yet. But anything that happens late enough to be an initcall should be largely considered to be during "normal code". (The "early_initcall" cases are special - those really do happen pretty early). So ACK on Anton's patch, but I wonder if we _also_ should do the following? Looking at the people looking at SYSTEM_RUNNING, I do note some odd cases. Why the heck does kernel/perf_counter.c do it, for example? Linus --- init/main.c | 2 +- 1 files changed, 1 insertions(+), 1 deletions(-) diff --git a/init/main.c b/init/main.c index 2c5ade7..f10d9cd 100644 --- a/init/main.c +++ b/init/main.c @@ -788,6 +788,7 @@ static void __init do_initcalls(void) { initcall_t *call; + system_state = SYSTEM_RUNNING; for (call = __early_initcall_end; call < __initcall_end; call++) do_one_initcall(*call); @@ -839,7 +840,6 @@ static noinline int init_post(void) free_initmem(); unlock_kernel(); mark_rodata_ro(); - system_state = SYSTEM_RUNNING; numa_default_policy(); if (sys_open((const char __user *) "/dev/console", O_RDWR, 0) < 0) ^ permalink raw reply related [flat|nested] 27+ messages in thread
* Re: [PATCH] netpoll: Fix carrier detection for drivers that are using phylib 2009-07-09 0:01 ` Linus Torvalds @ 2009-07-09 3:08 ` David Miller 2009-07-09 7:56 ` Peter Zijlstra 2009-07-09 13:26 ` Matt Mackall 2 siblings, 0 replies; 27+ messages in thread From: David Miller @ 2009-07-09 3:08 UTC (permalink / raw) To: torvalds Cc: avorontsov, akpm, a.p.zijlstra, oleg, mingo, linux-kernel, netdev From: Linus Torvalds <torvalds@linux-foundation.org> Date: Wed, 8 Jul 2009 17:01:14 -0700 (PDT) > So ACK on Anton's patch, I'll integrate it into my tree, thanks. > but I wonder if we _also_ should do the following? I think we should set SYSTEM_RUNNING earlier, makes sense to me. ^ permalink raw reply [flat|nested] 27+ messages in thread
* Re: [PATCH] netpoll: Fix carrier detection for drivers that are using phylib 2009-07-09 0:01 ` Linus Torvalds 2009-07-09 3:08 ` David Miller @ 2009-07-09 7:56 ` Peter Zijlstra 2009-07-09 12:56 ` Matt Mackall 2009-07-09 13:26 ` Matt Mackall 2 siblings, 1 reply; 27+ messages in thread From: Peter Zijlstra @ 2009-07-09 7:56 UTC (permalink / raw) To: Linus Torvalds Cc: Anton Vorontsov, Andrew Morton, oleg, mingo, linux-kernel, netdev On Wed, 2009-07-08 at 17:01 -0700, Linus Torvalds wrote: > Looking at the people looking at SYSTEM_RUNNING, I do note some odd cases. > Why the heck does kernel/perf_counter.c do it, for example? Ah, those are the swcounter and other probe entry points. I've had several cases where we called into the perf counter code from those points before it was initialized, getting in kernel segfaults due to dereferencing uninitialized data etc.. I could keep a variable that tracked the perf_counter_init() state, and use that instead if you prefer? ^ permalink raw reply [flat|nested] 27+ messages in thread
* Re: [PATCH] netpoll: Fix carrier detection for drivers that are using phylib 2009-07-09 7:56 ` Peter Zijlstra @ 2009-07-09 12:56 ` Matt Mackall 0 siblings, 0 replies; 27+ messages in thread From: Matt Mackall @ 2009-07-09 12:56 UTC (permalink / raw) To: Peter Zijlstra Cc: Linus Torvalds, Anton Vorontsov, Andrew Morton, oleg, mingo, linux-kernel, netdev On Thu, 2009-07-09 at 09:56 +0200, Peter Zijlstra wrote: > On Wed, 2009-07-08 at 17:01 -0700, Linus Torvalds wrote: > > Looking at the people looking at SYSTEM_RUNNING, I do note some odd cases. > > Why the heck does kernel/perf_counter.c do it, for example? > > Ah, those are the swcounter and other probe entry points. I've had > several cases where we called into the perf counter code from those > points before it was initialized, getting in kernel segfaults due to > dereferencing uninitialized data etc.. > > I could keep a variable that tracked the perf_counter_init() state, and > use that instead if you prefer? Looks like that'd be more accurate. Linus's proposed patch might break your current assumptions. -- http://selenic.com : development and support for Mercurial and Linux ^ permalink raw reply [flat|nested] 27+ messages in thread
* Re: [PATCH] netpoll: Fix carrier detection for drivers that are using phylib 2009-07-09 0:01 ` Linus Torvalds 2009-07-09 3:08 ` David Miller 2009-07-09 7:56 ` Peter Zijlstra @ 2009-07-09 13:26 ` Matt Mackall 2009-07-09 13:46 ` Peter Zijlstra 2 siblings, 1 reply; 27+ messages in thread From: Matt Mackall @ 2009-07-09 13:26 UTC (permalink / raw) To: Linus Torvalds Cc: Anton Vorontsov, Andrew Morton, a.p.zijlstra, oleg, mingo, linux-kernel, netdev On Wed, 2009-07-08 at 17:01 -0700, Linus Torvalds wrote: > > On Thu, 9 Jul 2009, Anton Vorontsov wrote: > > > > The netpoll code is using msleep() just a few lines below cond_resched(), > > so we won't make things worse. ;-) > > Yeah. That function is definitely sleeping. It does things like > kmalloc(GFP_KERNEL), rtnl_lock() and synchronize_rcu() etc too, so an > added msleep() is the least of our problems. > > Afaik, it's called from a bog-standard "module_init()", which happens late > enough that everything works. > > In fact, I wonder if we should set SYSTEM_RUNNING much earlier - _before_ > doing the whole "do_initcalls()". Well there are two ways of consistently defining SYSTEM_RUNNING: a) define it with reference to the well-understood notion of booting vs running and don't switch it until handing off to init b) define it with reference to its usage by an arbitrary user like cond_resched() In the latter case, we obviously need to move it to the earliest point that scheduling is possible. But there are a number of things like http://lxr.linux.no/linux+v2.6.30/kernel/printk.c#L228 that assume the definition is actually (a). We're currently within a couple lines of a strict definition of (a) already, so I actually think cond_resched() is just wrong (and we already know it broke a previously-working user). It should perhaps be using another private flag that gets set as soon as scheduling is up and running. But I'd actually go further and say that it's unfortunate to be checking extra flags in such an important inline, especially since the check is false for all but the first couple seconds of run time. Seems like we could avoid adding an extra check by artificially elevating the preempt count in early boot (or at compile time) then dropping it when scheduling becomes available. -- http://selenic.com : development and support for Mercurial and Linux ^ permalink raw reply [flat|nested] 27+ messages in thread
* Re: [PATCH] netpoll: Fix carrier detection for drivers that are using phylib 2009-07-09 13:26 ` Matt Mackall @ 2009-07-09 13:46 ` Peter Zijlstra 2009-07-09 14:18 ` Matt Mackall 0 siblings, 1 reply; 27+ messages in thread From: Peter Zijlstra @ 2009-07-09 13:46 UTC (permalink / raw) To: Matt Mackall Cc: Linus Torvalds, Anton Vorontsov, Andrew Morton, oleg, mingo, linux-kernel, netdev On Thu, 2009-07-09 at 08:26 -0500, Matt Mackall wrote: > On Wed, 2009-07-08 at 17:01 -0700, Linus Torvalds wrote: > > > > On Thu, 9 Jul 2009, Anton Vorontsov wrote: > > > > > > The netpoll code is using msleep() just a few lines below cond_resched(), > > > so we won't make things worse. ;-) > > > > Yeah. That function is definitely sleeping. It does things like > > kmalloc(GFP_KERNEL), rtnl_lock() and synchronize_rcu() etc too, so an > > added msleep() is the least of our problems. > > > > Afaik, it's called from a bog-standard "module_init()", which happens late > > enough that everything works. > > > > In fact, I wonder if we should set SYSTEM_RUNNING much earlier - _before_ > > doing the whole "do_initcalls()". > > Well there are two ways of consistently defining SYSTEM_RUNNING: > > a) define it with reference to the well-understood notion of booting vs > running and don't switch it until handing off to init This makes the most sense IMHO. > b) define it with reference to its usage by an arbitrary user like > cond_resched() > > In the latter case, we obviously need to move it to the earliest point > that scheduling is possible. But there are a number of things like > > http://lxr.linux.no/linux+v2.6.30/kernel/printk.c#L228 > > that assume the definition is actually (a). We're currently within a > couple lines of a strict definition of (a) already, so I actually think > cond_resched() is just wrong (and we already know it broke a > previously-working user). It should perhaps be using another private > flag that gets set as soon as scheduling is up and running. Right as mentioned before in this thread, we grew scheduler_running a while back which could be used for this. > But I'd actually go further and say that it's unfortunate to be checking > extra flags in such an important inline, especially since the check is > false for all but the first couple seconds of run time. Seems like we > could avoid adding an extra check by artificially elevating the preempt > count in early boot (or at compile time) then dropping it when > scheduling becomes available. Calling cond_resched() and co when !preemptable is an error so this wouldn't actually work. ^ permalink raw reply [flat|nested] 27+ messages in thread
* Re: [PATCH] netpoll: Fix carrier detection for drivers that are using phylib 2009-07-09 13:46 ` Peter Zijlstra @ 2009-07-09 14:18 ` Matt Mackall 2009-07-09 14:31 ` Peter Zijlstra 0 siblings, 1 reply; 27+ messages in thread From: Matt Mackall @ 2009-07-09 14:18 UTC (permalink / raw) To: Peter Zijlstra Cc: Linus Torvalds, Anton Vorontsov, Andrew Morton, oleg, mingo, linux-kernel, netdev On Thu, 2009-07-09 at 15:46 +0200, Peter Zijlstra wrote: > On Thu, 2009-07-09 at 08:26 -0500, Matt Mackall wrote: > > On Wed, 2009-07-08 at 17:01 -0700, Linus Torvalds wrote: > > > > > > On Thu, 9 Jul 2009, Anton Vorontsov wrote: > > > > > > > > The netpoll code is using msleep() just a few lines below cond_resched(), > > > > so we won't make things worse. ;-) > > > > > > Yeah. That function is definitely sleeping. It does things like > > > kmalloc(GFP_KERNEL), rtnl_lock() and synchronize_rcu() etc too, so an > > > added msleep() is the least of our problems. > > > > > > Afaik, it's called from a bog-standard "module_init()", which happens late > > > enough that everything works. > > > > > > In fact, I wonder if we should set SYSTEM_RUNNING much earlier - _before_ > > > doing the whole "do_initcalls()". > > > > Well there are two ways of consistently defining SYSTEM_RUNNING: > > > > a) define it with reference to the well-understood notion of booting vs > > running and don't switch it until handing off to init > > This makes the most sense IMHO. > > > b) define it with reference to its usage by an arbitrary user like > > cond_resched() > > > > In the latter case, we obviously need to move it to the earliest point > > that scheduling is possible. But there are a number of things like > > > > http://lxr.linux.no/linux+v2.6.30/kernel/printk.c#L228 > > > > that assume the definition is actually (a). We're currently within a > > couple lines of a strict definition of (a) already, so I actually think > > cond_resched() is just wrong (and we already know it broke a > > previously-working user). It should perhaps be using another private > > flag that gets set as soon as scheduling is up and running. > > Right as mentioned before in this thread, we grew scheduler_running a > while back which could be used for this. > > > But I'd actually go further and say that it's unfortunate to be checking > > extra flags in such an important inline, especially since the check is > > false for all but the first couple seconds of run time. Seems like we > > could avoid adding an extra check by artificially elevating the preempt > > count in early boot (or at compile time) then dropping it when > > scheduling becomes available. > > Calling cond_resched() and co when !preemptable is an error so this > wouldn't actually work. Sorry if I was unclear. I'm suggesting setting the count so the existing PREEMPT_ACTIVE test here fires: int __sched _cond_resched(void) { if (need_resched() && !(preempt_count() & PREEMPT_ACTIVE) && system_state == SYSTEM_RUNNING) { __cond_resched(); return 1; } return 0; } That should be kosher. -- http://selenic.com : development and support for Mercurial and Linux ^ permalink raw reply [flat|nested] 27+ messages in thread
* Re: [PATCH] netpoll: Fix carrier detection for drivers that are using phylib 2009-07-09 14:18 ` Matt Mackall @ 2009-07-09 14:31 ` Peter Zijlstra 2009-07-09 14:43 ` Matt Mackall 2009-07-09 17:29 ` Linus Torvalds 0 siblings, 2 replies; 27+ messages in thread From: Peter Zijlstra @ 2009-07-09 14:31 UTC (permalink / raw) To: Matt Mackall Cc: Linus Torvalds, Anton Vorontsov, Andrew Morton, oleg, mingo, linux-kernel, netdev On Thu, 2009-07-09 at 09:18 -0500, Matt Mackall wrote: > > Sorry if I was unclear. I'm suggesting setting the count so the existing > PREEMPT_ACTIVE test here fires: > > int __sched _cond_resched(void) > { > if (need_resched() && !(preempt_count() & PREEMPT_ACTIVE) && > system_state == SYSTEM_RUNNING) { > __cond_resched(); > return 1; > } > return 0; > } Right, /me read preempt and thought a simple preempt inc but didn't read the code. Shame on me. So something like (utterly untested and such) --- arch/x86/include/asm/thread_info.h | 2 +- kernel/sched.c | 4 ++-- 2 files changed, 3 insertions(+), 3 deletions(-) diff --git a/arch/x86/include/asm/thread_info.h b/arch/x86/include/asm/thread_info.h index b078352..7b5dbce 100644 --- a/arch/x86/include/asm/thread_info.h +++ b/arch/x86/include/asm/thread_info.h @@ -49,7 +49,7 @@ struct thread_info { .exec_domain = &default_exec_domain, \ .flags = 0, \ .cpu = 0, \ - .preempt_count = 1, \ + .preempt_count = 1 + PREEMPT_ACTIVE, \ .addr_limit = KERNEL_DS, \ .restart_block = { \ .fn = do_no_restart_syscall, \ diff --git a/kernel/sched.c b/kernel/sched.c index fd3ac58..8e81162 100644 --- a/kernel/sched.c +++ b/kernel/sched.c @@ -6599,8 +6599,7 @@ static void __cond_resched(void) int __sched _cond_resched(void) { - if (need_resched() && !(preempt_count() & PREEMPT_ACTIVE) && - system_state == SYSTEM_RUNNING) { + if (need_resched() && !(preempt_count() & PREEMPT_ACTIVE)) { __cond_resched(); return 1; } @@ -9422,6 +9421,7 @@ void __init sched_init(void) perf_counter_init(); scheduler_running = 1; + sub_preempt_count(PREEMPT_ACTIVE); } #ifdef CONFIG_DEBUG_SPINLOCK_SLEEP ^ permalink raw reply related [flat|nested] 27+ messages in thread
* Re: [PATCH] netpoll: Fix carrier detection for drivers that are using phylib 2009-07-09 14:31 ` Peter Zijlstra @ 2009-07-09 14:43 ` Matt Mackall 2009-07-09 14:51 ` Peter Zijlstra 2009-07-09 17:29 ` Linus Torvalds 1 sibling, 1 reply; 27+ messages in thread From: Matt Mackall @ 2009-07-09 14:43 UTC (permalink / raw) To: Peter Zijlstra Cc: Linus Torvalds, Anton Vorontsov, Andrew Morton, oleg, mingo, linux-kernel, netdev On Thu, 2009-07-09 at 16:31 +0200, Peter Zijlstra wrote: > On Thu, 2009-07-09 at 09:18 -0500, Matt Mackall wrote: > > > > Sorry if I was unclear. I'm suggesting setting the count so the existing > > PREEMPT_ACTIVE test here fires: > > > > int __sched _cond_resched(void) > > { > > if (need_resched() && !(preempt_count() & PREEMPT_ACTIVE) && > > system_state == SYSTEM_RUNNING) { > > __cond_resched(); > > return 1; > > } > > return 0; > > } > > Right, /me read preempt and thought a simple preempt inc but didn't read > the code. Shame on me. > > So something like (utterly untested and such) Yeah, that's what I had in mind. Probably throw in a define: /* for disabling scheduling in early boot */ #define PREEMPT_EARLY (1 + PREEMPT_ACTIVE) and slap a comment on the sub_preempt_count(). Does anything actually use scheduler_running yet? Perhaps my tree is old. Also, might_sleep's use of system_state probably bears revisiting. -- http://selenic.com : development and support for Mercurial and Linux ^ permalink raw reply [flat|nested] 27+ messages in thread
* Re: [PATCH] netpoll: Fix carrier detection for drivers that are using phylib 2009-07-09 14:43 ` Matt Mackall @ 2009-07-09 14:51 ` Peter Zijlstra 2009-07-09 15:06 ` Matt Mackall 0 siblings, 1 reply; 27+ messages in thread From: Peter Zijlstra @ 2009-07-09 14:51 UTC (permalink / raw) To: Matt Mackall Cc: Linus Torvalds, Anton Vorontsov, Andrew Morton, oleg, mingo, linux-kernel, netdev On Thu, 2009-07-09 at 09:43 -0500, Matt Mackall wrote: > Yeah, that's what I had in mind. Probably throw in a define: > > /* for disabling scheduling in early boot */ > #define PREEMPT_EARLY (1 + PREEMPT_ACTIVE) > > and slap a comment on the sub_preempt_count(). Right, and visit all the other arch init code ;-) I'll wait to see if Ingo has anything to say about it and then complete this thing. > Does anything actually use scheduler_running yet? Perhaps my tree is > old. # git grep scheduler_running kernel/sched.c:static __read_mostly int scheduler_running; kernel/sched.c: scheduler_running = 1; kernel/sched_rt.c: if (unlikely(!scheduler_running)) kernel/sched_rt.c: if (unlikely(!scheduler_running)) If memory serves there used to be more, but I think that migrated into kernel/sched_clock.c, which has sched_clock_running. > Also, might_sleep's use of system_state probably bears revisiting. Yeah, all that code is from long before we had scheduler_running (which was introduced around CFS/.23). ^ permalink raw reply [flat|nested] 27+ messages in thread
* Re: [PATCH] netpoll: Fix carrier detection for drivers that are using phylib 2009-07-09 14:51 ` Peter Zijlstra @ 2009-07-09 15:06 ` Matt Mackall 0 siblings, 0 replies; 27+ messages in thread From: Matt Mackall @ 2009-07-09 15:06 UTC (permalink / raw) To: Peter Zijlstra Cc: Linus Torvalds, Anton Vorontsov, Andrew Morton, oleg, mingo, linux-kernel, netdev On Thu, 2009-07-09 at 16:51 +0200, Peter Zijlstra wrote: > > Does anything actually use scheduler_running yet? Perhaps my tree is > > old. > > # git grep scheduler_running > kernel/sched.c:static __read_mostly int scheduler_running; > kernel/sched.c: scheduler_running = 1; > kernel/sched_rt.c: if (unlikely(!scheduler_running)) > kernel/sched_rt.c: if (unlikely(!scheduler_running)) > > If memory serves there used to be more, but I think that migrated into > kernel/sched_clock.c, which has sched_clock_running. The static in the first line makes that a bit of a head-scratcher? Oh, I see: it's living in sin. Going to avert my eyes for now. -- http://selenic.com : development and support for Mercurial and Linux ^ permalink raw reply [flat|nested] 27+ messages in thread
* Re: [PATCH] netpoll: Fix carrier detection for drivers that are using phylib 2009-07-09 14:31 ` Peter Zijlstra 2009-07-09 14:43 ` Matt Mackall @ 2009-07-09 17:29 ` Linus Torvalds 1 sibling, 0 replies; 27+ messages in thread From: Linus Torvalds @ 2009-07-09 17:29 UTC (permalink / raw) To: Peter Zijlstra Cc: Matt Mackall, Anton Vorontsov, Andrew Morton, oleg, mingo, linux-kernel, netdev On Thu, 9 Jul 2009, Peter Zijlstra wrote: > > So something like (utterly untested and such) This looks like a good patch. Please make it so - who knows what other uses of cond_resched() we have in module init routines that might have deadlocks without it. The netpoll case got fixed, but please just do this. I'd like to do my system_state movement too (the thing is, when you load drivers as modules you _will_ have "system_state == SYSTEM_RUNNING", so any initcall that depends on it being "early boot" is already broken), but there's no way that patch is appropriate for post-rc2. This one, however, looks appropriate (modulo getting some testing, of course) Linus ^ permalink raw reply [flat|nested] 27+ messages in thread
* Re: [PATCH] netpoll: Fix carrier detection for drivers that are using phylib 2009-07-08 22:20 ` [PATCH] netpoll: Fix carrier detection for drivers that are using phylib Anton Vorontsov 2009-07-09 0:01 ` Linus Torvalds @ 2009-07-09 12:52 ` Matt Mackall 1 sibling, 0 replies; 27+ messages in thread From: Matt Mackall @ 2009-07-09 12:52 UTC (permalink / raw) To: avorontsov Cc: Andrew Morton, torvalds, a.p.zijlstra, oleg, mingo, linux-kernel, netdev On Thu, 2009-07-09 at 02:20 +0400, Anton Vorontsov wrote: > Using early netconsole and gianfar driver this error pops up: > > netconsole: timeout waiting for carrier > > It appears that net/core/netpoll.c:netpoll_setup() is using > cond_resched() in a loop waiting for a carrier. > > The thing is that cond_resched() is a no-op when system_state != > SYSTEM_RUNNING, and so drivers/net/phy/phy.c's state_queue is never > scheduled, therefore link detection doesn't work. > > I belive that the main problem is in cond_resched()[1], but despite > how the cond_resched() story ends, it might be a good idea to call > msleep(1) instead of cond_resched(), as suggested by Andrew Morton. > > [1] http://lkml.org/lkml/2009/7/7/463 > > Signed-off-by: Anton Vorontsov <avorontsov@ru.mvista.com> > --- > > On Wed, Jul 08, 2009 at 02:47:44PM -0700, Andrew Morton wrote: > > (belatedly cc'ing netdev) > > > > Original diagnosis: > > > > : Using early netconsole and gianfar driver this error pops up: > > : > > : netconsole: timeout waiting for carrier > > : > > : It appears that net/core/netpoll.c:netpoll_setup() is using > > : cond_resched() in a loop waiting for a carrier. > > : > > : The thing is that cond_resched() is a no-op when system_state != > > : SYSTEM_RUNNING, and so drivers/net/phy/phy.c's state_queue is never > > : scheduled, therefore link detection doesn't work > > > > > On Thu, 9 Jul 2009 01:33:31 +0400 Anton Vorontsov <avorontsov@ru.mvista.com> wrote: > > > On Wed, Jul 08, 2009 at 02:10:24PM -0700, Andrew Morton wrote: > > > > > On Wed, 8 Jul 2009 09:12:30 -0700 (PDT) Linus Torvalds <torvalds@linux-foundation.org> wrote: > > > > > That said, I do agree that maybe SYSTEM_RUNNING isn't the right check. > > > > > Testing that the scheduler is initialized may be the more correct one. I > > > > > think the SYSTEM_RUNNING one just comes from that being used for other > > > > > debug issues. > > > > > > > > Agreed. system_state is too general. > > > > > > > > If we specifically want to know whether it is safe to call schedule() then > > > > let's create a global boolean it_is_safe_to_call_schedule and test that, > > > > rather than testing something which indirectly and unreliably implies "it > > > > is safe to call schedule". If that boolean already exists then no-brainer. > > > > > > > > All that being said, I wonder if the netconsole code should be using > > > > msleep(1) instead. Spinning on cond_resched() is a bit rude. But one > > > > would have to verify that it is safe to call schedule() at this time, and > > > > for the netconsole caller, this is dubious. > > > > > > What do you mean by "verify that it is safe"? If it works, > > > can I assume that it's safe? ;-) It works, fwiw. > > > > > > > netconsole is supposed to be available as early as possible in boot for > > obvious reasons. I'd say there's a decent risk now and in the future that > > netconsole will be initialised prior to the scheduler being available. > > > > In fact, if "netconsole: timeout waiting for carrier" newly added to > > netpoll_setup() a depedency on the scheduler being available then perhaps > > that was an incorrect change. > > 'git blame' says that carrier detection code didn't change since 2.6.12 > (where git history starts), PHYLIB is using workqueue since its > submission (2.6.13). And SYSTEM_RUNNING check was added in 2.6.16. > So it's not a new dependency. > > The netpoll code is using msleep() just a few lines below cond_resched(), > so we won't make things worse. ;-) I think that's an improvement with or without the SYSTEM_RUNNING fix. Signed-off-by: Matt Mackall <mpm@selenic.com> > Thanks! > > net/core/netpoll.c | 2 +- > 1 files changed, 1 insertions(+), 1 deletions(-) > > diff --git a/net/core/netpoll.c b/net/core/netpoll.c > index 9675f31..df30feb 100644 > --- a/net/core/netpoll.c > +++ b/net/core/netpoll.c > @@ -740,7 +740,7 @@ int netpoll_setup(struct netpoll *np) > np->name); > break; > } > - cond_resched(); > + msleep(1); > } > > /* If carrier appears to come up instantly, we don't -- http://selenic.com : development and support for Mercurial and Linux ^ permalink raw reply [flat|nested] 27+ messages in thread
* Re: [PATCH/RFC] sched: Remove SYSTEM_RUNNING checks from cond_resched*() 2009-07-08 21:10 ` Andrew Morton 2009-07-08 21:33 ` Anton Vorontsov @ 2009-07-09 23:20 ` Pavel Machek 1 sibling, 0 replies; 27+ messages in thread From: Pavel Machek @ 2009-07-09 23:20 UTC (permalink / raw) To: Andrew Morton Cc: Linus Torvalds, a.p.zijlstra, oleg, avorontsov, mingo, linux-kernel Hi! > > That said, I do agree that maybe SYSTEM_RUNNING isn't the right check. > > Testing that the scheduler is initialized may be the more correct one. I > > think the SYSTEM_RUNNING one just comes from that being used for other > > debug issues. > > Agreed. system_state is too general. > > If we specifically want to know whether it is safe to call schedule() then > let's create a global boolean it_is_safe_to_call_schedule and test that, > rather than testing something which indirectly and unreliably implies "it > is safe to call schedule". If that boolean already exists then no-brainer. or maybe we could embed that check into schedule(), just returning when scheduler is not ready? And I always wondered... system_state is not protected by any kind of lock and is not atomic_t... Does it all work by mistake? Pavel -- (english) http://www.livejournal.com/~pavelmachek (cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html ^ permalink raw reply [flat|nested] 27+ messages in thread
end of thread, other threads:[~2009-07-10 12:06 UTC | newest] Thread overview: 27+ messages (download: mbox.gz / follow: Atom feed) -- links below jump to the message on this page -- 2009-07-07 23:58 [PATCH/RFC] sched: Remove SYSTEM_RUNNING checks from cond_resched*() Anton Vorontsov 2009-07-08 0:50 ` Oleg Nesterov 2009-07-08 6:24 ` Peter Zijlstra 2009-07-08 12:03 ` Anton Vorontsov 2009-07-08 12:12 ` Peter Zijlstra 2009-07-08 12:55 ` Anton Vorontsov 2009-07-08 12:58 ` Peter Zijlstra 2009-07-08 20:45 ` [PATCH] sched: Make cond_resched*() available earlier Anton Vorontsov 2009-07-08 16:12 ` [PATCH/RFC] sched: Remove SYSTEM_RUNNING checks from cond_resched*() Linus Torvalds 2009-07-08 21:10 ` Andrew Morton 2009-07-08 21:33 ` Anton Vorontsov 2009-07-08 21:47 ` Andrew Morton 2009-07-08 22:20 ` [PATCH] netpoll: Fix carrier detection for drivers that are using phylib Anton Vorontsov 2009-07-09 0:01 ` Linus Torvalds 2009-07-09 3:08 ` David Miller 2009-07-09 7:56 ` Peter Zijlstra 2009-07-09 12:56 ` Matt Mackall 2009-07-09 13:26 ` Matt Mackall 2009-07-09 13:46 ` Peter Zijlstra 2009-07-09 14:18 ` Matt Mackall 2009-07-09 14:31 ` Peter Zijlstra 2009-07-09 14:43 ` Matt Mackall 2009-07-09 14:51 ` Peter Zijlstra 2009-07-09 15:06 ` Matt Mackall 2009-07-09 17:29 ` Linus Torvalds 2009-07-09 12:52 ` Matt Mackall 2009-07-09 23:20 ` [PATCH/RFC] sched: Remove SYSTEM_RUNNING checks from cond_resched*() Pavel Machek
This is an external index of several public inboxes, see mirroring instructions on how to clone and mirror all data and code used by this external index.