* Re: [RFC GIT Pull] core watchdog sanitizing [not found] ` <CA+55aFxQsmSb=zohGZOZK7eR4n5xOHNArmCa9j1b7+wJ+khQrg@mail.gmail.com> @ 2017-10-02 18:46 ` Thomas Gleixner 2017-10-02 19:04 ` Linus Torvalds 0 siblings, 1 reply; 5+ messages in thread From: Thomas Gleixner @ 2017-10-02 18:46 UTC (permalink / raw) To: Linus Torvalds Cc: LKML, Ingo Molnar, H. Peter Anvin, Peter Zijlstra, Don Zickus, Benjamin Herrenschmidt, Michael Ellerman, Nicholas Piggin, linuxppc-dev On Sun, 1 Oct 2017, Linus Torvalds wrote: > So get rid of that kind of shit, and I may reconsider. But as is, I > look at that patch and say "no, this is worse than the garbage it used > to be". I agree that adding that 'run' argument was certainly not a piece of art. Though I disagree with the sentiment that non-functional garbage is preferrable over functionally correct code which merily contains a bad implementation choice. Without knowing you for years, it would certainly seem more rewarding to dump garbage and get it merged, than caring and mopping up garbage, which should not have been merged in the first place. Your motivation skills are truly outstanding. Enough vented. Find below the cure for that major offense. Thanks, tglx 8<-------------------- Subject: watchdog/core, powerpc: Replace watchdog_nmi_reconfigure() From: Thomas Gleixner <tglx@linutronix.de> Date: Mon, 02 Oct 2017 12:34:50 +0200 The recent cleanup of the watchdog code split watchdog_nmi_reconfigure() into two stages. One to stop the NMI and one to restart it after reconfiguration. That was done by adding a boolean 'run' argument to the function, which is functionally correct but not necessarily a piece of art. Replace it by two explicit functions: watchdog_nmi_stop() and watchdog_nmi_start(). Fixes: 6592ad2fcc8f ("watchdog/core, powerpc: Make watchdog_nmi_reconfigure() two stage") Requested-by: Linus 'Nursing his pet-peeve' Torvalds <torvalds@linuxfoundation.org> Signed-off-by: Thomas 'Mopping up garbage' Gleixner <tglx@linutronix.de> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Nicholas Piggin <npiggin@gmail.com> Cc: linuxppc-dev@lists.ozlabs.org --- arch/powerpc/kernel/watchdog.c | 23 ++++++++++++++--------- include/linux/nmi.h | 3 ++- kernel/watchdog.c | 33 ++++++++++++++++++--------------- 3 files changed, 34 insertions(+), 25 deletions(-) --- a/arch/powerpc/kernel/watchdog.c +++ b/arch/powerpc/kernel/watchdog.c @@ -355,19 +355,24 @@ static void watchdog_calc_timeouts(void) wd_timer_period_ms = watchdog_thresh * 1000 * 2 / 5; } -void watchdog_nmi_reconfigure(bool run) +void watchdog_nmi_stop(void) { int cpu; cpus_read_lock(); - if (!run) { - for_each_cpu(cpu, &wd_cpus_enabled) - stop_wd_on_cpu(cpu); - } else { - watchdog_calc_timeouts(); - for_each_cpu_and(cpu, cpu_online_mask, &watchdog_cpumask) - start_wd_on_cpu(cpu); - } + for_each_cpu(cpu, &wd_cpus_enabled) + stop_wd_on_cpu(cpu); + cpus_read_unlock(); +} + +void watchdog_nmi_start(void) +{ + int cpu; + + cpus_read_lock(); + watchdog_calc_timeouts(); + for_each_cpu_and(cpu, cpu_online_mask, &watchdog_cpumask) + start_wd_on_cpu(cpu); cpus_read_unlock(); } --- a/include/linux/nmi.h +++ b/include/linux/nmi.h @@ -109,7 +109,8 @@ static inline int hardlockup_detector_pe # endif #endif -void watchdog_nmi_reconfigure(bool run); +void watchdog_nmi_stop(void); +void watchdog_nmi_start(void); /** * touch_nmi_watchdog - restart NMI watchdog timeout. --- a/kernel/watchdog.c +++ b/kernel/watchdog.c @@ -123,24 +123,27 @@ int __weak __init watchdog_nmi_probe(voi } /** - * watchdog_nmi_reconfigure - Optional function to reconfigure NMI watchdogs - * @run: If false stop the watchdogs on all enabled CPUs - * If true start the watchdogs on all enabled CPUs + * watchdog_nmi_stop - Stop the watchdog for reconfiguration * - * The core call order is: - * watchdog_nmi_reconfigure(false); + * The reconfiguration steps are: + * watchdog_nmi_stop(); * update_variables(); - * watchdog_nmi_reconfigure(true); + * watchdog_nmi_start(); + */ +void __weak watchdog_nmi_stop(void) { } + +/** + * watchdog_nmi_start - Start the watchdog after reconfiguration * - * The second call which starts the watchdogs again guarantees that the - * following variables are stable across the call. + * Counterpart to watchdog_nmi_stop(). + * + * The following variables have been updated in update_variables() and + * contain the currently valid configuration: * - watchdog_enabled * - watchdog_thresh * - watchdog_cpumask - * - * After the call the variables can be changed again. */ -void __weak watchdog_nmi_reconfigure(bool run) { } +void __weak watchdog_nmi_start(void) { } /** * lockup_detector_update_enable - Update the sysctl enable bit @@ -551,13 +554,13 @@ static void softlockup_unpark_threads(vo static void softlockup_reconfigure_threads(void) { - watchdog_nmi_reconfigure(false); + watchdog_nmi_stop(); softlockup_park_all_threads(); set_sample_period(); lockup_detector_update_enable(); if (watchdog_enabled && watchdog_thresh) softlockup_unpark_threads(); - watchdog_nmi_reconfigure(true); + watchdog_nmi_start(); } /* @@ -602,9 +605,9 @@ static inline void watchdog_disable_all_ static inline void softlockup_init_threads(void) { } static void softlockup_reconfigure_threads(void) { - watchdog_nmi_reconfigure(false); + watchdog_nmi_stop(); lockup_detector_update_enable(); - watchdog_nmi_reconfigure(true); + watchdog_nmi_start(); } #endif /* !CONFIG_SOFTLOCKUP_DETECTOR */ ^ permalink raw reply [flat|nested] 5+ messages in thread
* Re: [RFC GIT Pull] core watchdog sanitizing 2017-10-02 18:46 ` [RFC GIT Pull] core watchdog sanitizing Thomas Gleixner @ 2017-10-02 19:04 ` Linus Torvalds 2017-10-02 19:32 ` Thomas Gleixner 0 siblings, 1 reply; 5+ messages in thread From: Linus Torvalds @ 2017-10-02 19:04 UTC (permalink / raw) To: Thomas Gleixner Cc: LKML, Ingo Molnar, H. Peter Anvin, Peter Zijlstra, Don Zickus, Benjamin Herrenschmidt, Michael Ellerman, Nicholas Piggin, ppc-dev On Mon, Oct 2, 2017 at 11:46 AM, Thomas Gleixner <tglx@linutronix.de> wrote: > > I agree that adding that 'run' argument was certainly not a piece of > art. Though I disagree with the sentiment that non-functional garbage is > preferrable over functionally correct code which merily contains a bad > implementation choice. I agree that it's somewhat arbitrary, but I also find it really hard to vet code where my initial reaction is just "this is too ugly". So it may be superficial, but .. > Enough vented. Find below the cure for that major offense. Looks much better to me. Thanks. Side note: would it perhaps make sense to have that cpus_read_lock/unlock() sequence around the whole reconfiguration section? Because while looking at that sequence, it looks a bit odd to me that cpu's can come and go in the middle of the nmi watchdog reconfiguration sequence. In particular, what happens if a new CPU is brought up just as the NMI matchdog is being reconfigured? The NMI's have been stopped for the old CPU's, what happens for the new one that came up in between that watchdog_nmi_stop/start? This may be all obviously safe, I'm just asking for clarification. Linus ^ permalink raw reply [flat|nested] 5+ messages in thread
* Re: [RFC GIT Pull] core watchdog sanitizing 2017-10-02 19:04 ` Linus Torvalds @ 2017-10-02 19:32 ` Thomas Gleixner 2017-10-02 20:32 ` Don Zickus 0 siblings, 1 reply; 5+ messages in thread From: Thomas Gleixner @ 2017-10-02 19:32 UTC (permalink / raw) To: Linus Torvalds Cc: LKML, Ingo Molnar, H. Peter Anvin, Peter Zijlstra, Don Zickus, Benjamin Herrenschmidt, Michael Ellerman, Nicholas Piggin, ppc-dev On Mon, 2 Oct 2017, Linus Torvalds wrote: > Side note: would it perhaps make sense to have that > cpus_read_lock/unlock() sequence around the whole reconfiguration > section? > > Because while looking at that sequence, it looks a bit odd to me that > cpu's can come and go in the middle of the nmi watchdog > reconfiguration sequence. > > In particular, what happens if a new CPU is brought up just as the NMI > matchdog is being reconfigured? The NMI's have been stopped for the > old CPU's, what happens for the new one that came up in between that > watchdog_nmi_stop/start? > > This may be all obviously safe, I'm just asking for clarification. It's safe because the newly upcoming CPU will see an empty enabled mask in the powerpc implementation. The perf based implementation has a similar protection. Though yes, it would be more obvious to expand the cpus locked section. That requires a bit of shuffling. Untested patch below. Thanks, tglx 8<------------------ --- a/arch/powerpc/kernel/watchdog.c +++ b/arch/powerpc/kernel/watchdog.c @@ -359,21 +359,17 @@ void watchdog_nmi_stop(void) { int cpu; - cpus_read_lock(); for_each_cpu(cpu, &wd_cpus_enabled) stop_wd_on_cpu(cpu); - cpus_read_unlock(); } void watchdog_nmi_start(void) { int cpu; - cpus_read_lock(); watchdog_calc_timeouts(); for_each_cpu_and(cpu, cpu_online_mask, &watchdog_cpumask) start_wd_on_cpu(cpu); - cpus_read_unlock(); } /* --- a/kernel/smpboot.c +++ b/kernel/smpboot.c @@ -351,7 +351,7 @@ void smpboot_update_cpumask_percpu_threa static struct cpumask tmp; unsigned int cpu; - get_online_cpus(); + lockdep_assert_cpus_held(); mutex_lock(&smpboot_threads_lock); /* Park threads that were exclusively enabled on the old mask. */ @@ -367,7 +367,6 @@ void smpboot_update_cpumask_percpu_threa cpumask_copy(old, new); mutex_unlock(&smpboot_threads_lock); - put_online_cpus(); } static DEFINE_PER_CPU(atomic_t, cpu_hotplug_state) = ATOMIC_INIT(CPU_POST_DEAD); --- a/kernel/watchdog.c +++ b/kernel/watchdog.c @@ -535,7 +535,6 @@ static void softlockup_update_smpboot_th smpboot_update_cpumask_percpu_thread(&watchdog_threads, &watchdog_allowed_mask); - __lockup_detector_cleanup(); } /* Temporarily park all watchdog threads */ @@ -554,6 +553,7 @@ static void softlockup_unpark_threads(vo static void softlockup_reconfigure_threads(void) { + cpus_read_lock(); watchdog_nmi_stop(); softlockup_park_all_threads(); set_sample_period(); @@ -561,6 +561,12 @@ static void softlockup_reconfigure_threa if (watchdog_enabled && watchdog_thresh) softlockup_unpark_threads(); watchdog_nmi_start(); + cpus_read_unlock(); + /* + * Must be called outside the cpus locked section to prevent + * recursive locking in the perf code. + */ + __lockup_detector_cleanup(); } /* ^ permalink raw reply [flat|nested] 5+ messages in thread
* Re: [RFC GIT Pull] core watchdog sanitizing 2017-10-02 19:32 ` Thomas Gleixner @ 2017-10-02 20:32 ` Don Zickus 2017-10-02 20:45 ` Thomas Gleixner 0 siblings, 1 reply; 5+ messages in thread From: Don Zickus @ 2017-10-02 20:32 UTC (permalink / raw) To: Thomas Gleixner Cc: Linus Torvalds, LKML, Ingo Molnar, H. Peter Anvin, Peter Zijlstra, Benjamin Herrenschmidt, Michael Ellerman, Nicholas Piggin, ppc-dev On Mon, Oct 02, 2017 at 07:32:57PM +0000, Thomas Gleixner wrote: > On Mon, 2 Oct 2017, Linus Torvalds wrote: > > Side note: would it perhaps make sense to have that > > cpus_read_lock/unlock() sequence around the whole reconfiguration > > section? > > > > Because while looking at that sequence, it looks a bit odd to me that > > cpu's can come and go in the middle of the nmi watchdog > > reconfiguration sequence. > > > > In particular, what happens if a new CPU is brought up just as the NMI > > matchdog is being reconfigured? The NMI's have been stopped for the > > old CPU's, what happens for the new one that came up in between that > > watchdog_nmi_stop/start? > > > > This may be all obviously safe, I'm just asking for clarification. > > It's safe because the newly upcoming CPU will see an empty enabled mask in > the powerpc implementation. The perf based implementation has a similar > protection. > > Though yes, it would be more obvious to expand the cpus locked > section. That requires a bit of shuffling. Untested patch below. > > Thanks, > > tglx > > 8<------------------ > > --- a/arch/powerpc/kernel/watchdog.c > +++ b/arch/powerpc/kernel/watchdog.c > @@ -359,21 +359,17 @@ void watchdog_nmi_stop(void) > { > int cpu; > > - cpus_read_lock(); > for_each_cpu(cpu, &wd_cpus_enabled) > stop_wd_on_cpu(cpu); > - cpus_read_unlock(); > } > > void watchdog_nmi_start(void) > { > int cpu; > > - cpus_read_lock(); > watchdog_calc_timeouts(); > for_each_cpu_and(cpu, cpu_online_mask, &watchdog_cpumask) > start_wd_on_cpu(cpu); > - cpus_read_unlock(); > } > > /* > --- a/kernel/smpboot.c > +++ b/kernel/smpboot.c > @@ -351,7 +351,7 @@ void smpboot_update_cpumask_percpu_threa > static struct cpumask tmp; > unsigned int cpu; > > - get_online_cpus(); > + lockdep_assert_cpus_held(); > mutex_lock(&smpboot_threads_lock); > > /* Park threads that were exclusively enabled on the old mask. */ > @@ -367,7 +367,6 @@ void smpboot_update_cpumask_percpu_threa > cpumask_copy(old, new); > > mutex_unlock(&smpboot_threads_lock); > - put_online_cpus(); > } > > static DEFINE_PER_CPU(atomic_t, cpu_hotplug_state) = ATOMIC_INIT(CPU_POST_DEAD); > --- a/kernel/watchdog.c > +++ b/kernel/watchdog.c > @@ -535,7 +535,6 @@ static void softlockup_update_smpboot_th > > smpboot_update_cpumask_percpu_thread(&watchdog_threads, > &watchdog_allowed_mask); > - __lockup_detector_cleanup(); > } > > /* Temporarily park all watchdog threads */ > @@ -554,6 +553,7 @@ static void softlockup_unpark_threads(vo > > static void softlockup_reconfigure_threads(void) There is a second copy of ^^^^, you will need to add identical locking there too. I can test both of these patches tomorrow. Cheers, Don > { > + cpus_read_lock(); > watchdog_nmi_stop(); > softlockup_park_all_threads(); > set_sample_period(); > @@ -561,6 +561,12 @@ static void softlockup_reconfigure_threa > if (watchdog_enabled && watchdog_thresh) > softlockup_unpark_threads(); > watchdog_nmi_start(); > + cpus_read_unlock(); > + /* > + * Must be called outside the cpus locked section to prevent > + * recursive locking in the perf code. > + */ > + __lockup_detector_cleanup(); > } > > /* > ^ permalink raw reply [flat|nested] 5+ messages in thread
* Re: [RFC GIT Pull] core watchdog sanitizing 2017-10-02 20:32 ` Don Zickus @ 2017-10-02 20:45 ` Thomas Gleixner 0 siblings, 0 replies; 5+ messages in thread From: Thomas Gleixner @ 2017-10-02 20:45 UTC (permalink / raw) To: Don Zickus Cc: Linus Torvalds, LKML, Ingo Molnar, H. Peter Anvin, Peter Zijlstra, Benjamin Herrenschmidt, Michael Ellerman, Nicholas Piggin, ppc-dev On Mon, 2 Oct 2017, Don Zickus wrote: > On Mon, Oct 02, 2017 at 07:32:57PM +0000, Thomas Gleixner wrote: > > static void softlockup_reconfigure_threads(void) > > There is a second copy of ^^^^, you will need to add identical locking there > too. Good catch, but then we might move it further out > > + cpus_read_unlock(); > > + /* > > + * Must be called outside the cpus locked section to prevent > > + * recursive locking in the perf code. > > + */ > > + __lockup_detector_cleanup(); Or not due to this .... Thanks, tglx ^ permalink raw reply [flat|nested] 5+ messages in thread
end of thread, other threads:[~2017-10-02 20:45 UTC | newest] Thread overview: 5+ messages (download: mbox.gz / follow: Atom feed) -- links below jump to the message on this page -- [not found] <alpine.DEB.2.20.1710011217510.3874@nanos> [not found] ` <CA+55aFxQsmSb=zohGZOZK7eR4n5xOHNArmCa9j1b7+wJ+khQrg@mail.gmail.com> 2017-10-02 18:46 ` [RFC GIT Pull] core watchdog sanitizing Thomas Gleixner 2017-10-02 19:04 ` Linus Torvalds 2017-10-02 19:32 ` Thomas Gleixner 2017-10-02 20:32 ` Don Zickus 2017-10-02 20:45 ` Thomas Gleixner
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox; as well as URLs for NNTP newsgroup(s).