linux-next.vger.kernel.org archive mirror
* [PATCH] watchdog: Fix a watchdog crash in some configurations
@ 2015-05-04 23:17 john.hubbard
  2015-05-05 13:35 ` Don Zickus
  0 siblings, 1 reply; 6+ messages in thread
From: john.hubbard @ 2015-05-04 23:17 UTC (permalink / raw)
  To: Chris Metcalf
  Cc: Don Zickus, Ingo Molnar, Ulrich Obergfell, Thomas Gleixner,
	Peter Zijlstra, Andrew Morton, Stephen Rothwell, linux-next,
	John Hubbard

From: John Hubbard <jhubbard@nvidia.com>

Commit 8fcf2cc768acd845c1fed837bf9cfe2d7106336d in linux-next
introduced a regression in some configurations. Specifically,
with CONFIG_NO_HZ_FULL set, and CONFIG_NO_HZ_FULL_ALL *not* set,
the kernel will crash in lockup_detector_init(), due to a
NULL tick_nohz_full_mask pointer.

This is because the above commit uses tick_nohz_full_mask
(in lockup_detector_init), if CONFIG_NO_HZ_FULL is set, but
tick_nohz_full_mask only gets allocated if either:

    a) CONFIG_NO_HZ_FULL_ALL is set, or

    b) Someone passes in nohz_full=<any_value> on the boot
      args line.

To correct this, change lockup_detector_init so that it does
a runtime check (in addition to the ifdef check). This now
matches the way most of the other CONFIG_NO_HZ_FULL code does
its checking. This fix is a little simpler than my original
proposed fix, thanks to Chris Metcalf for that.

Signed-off-by: John Hubbard <jhubbard@nvidia.com>
---
 kernel/watchdog.c | 12 ++++++++----
 1 file changed, 8 insertions(+), 4 deletions(-)

diff --git a/kernel/watchdog.c b/kernel/watchdog.c
index 40fda2f..910d73f 100644
--- a/kernel/watchdog.c
+++ b/kernel/watchdog.c
@@ -921,10 +921,14 @@ void __init lockup_detector_init(void)
 	set_sample_period();
 
 #ifdef CONFIG_NO_HZ_FULL
-	if (!cpumask_empty(tick_nohz_full_mask))
-		pr_info("Disabling watchdog on nohz_full cores by default\n");
-	cpumask_andnot(&watchdog_cpumask, cpu_possible_mask,
-		       tick_nohz_full_mask);
+	if (tick_nohz_full_enabled()) {
+		if (!cpumask_empty(tick_nohz_full_mask))
+			pr_info("Disabling watchdog on nohz_full cores by default\n");
+		cpumask_andnot(&watchdog_cpumask, cpu_possible_mask,
+			       tick_nohz_full_mask);
+	}
+	else
+		cpumask_copy(&watchdog_cpumask, cpu_possible_mask);
 #else
 	cpumask_copy(&watchdog_cpumask, cpu_possible_mask);
 #endif
-- 
2.3.7
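
The failure mode and the guard described in the patch above can be illustrated with a small userspace model. This is only a sketch under stated assumptions: a single 64-bit word stands in for a cpumask, and every `model_*` name is hypothetical rather than a kernel symbol. It shows the mask pointer staying NULL unless setup ran, and the runtime check ensuring the pointer is dereferenced only when it exists.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Userspace model only: one 64-bit word stands in for a struct cpumask. */
typedef uint64_t model_mask_t;

static model_mask_t model_storage;
static model_mask_t *model_nohz_full_mask;	/* NULL until "allocated" */
static bool model_nohz_full_running;

/* Models early boot: the mask is set up only when nohz_full= was passed. */
void model_nohz_setup(bool have_boot_arg, model_mask_t nohz_cpus)
{
	model_nohz_full_mask = NULL;
	model_nohz_full_running = false;
	if (have_boot_arg) {
		model_storage = nohz_cpus;
		model_nohz_full_mask = &model_storage;
		model_nohz_full_running = true;
	}
}

/* Hypothetical stand-in for the runtime check that the patch adds. */
bool model_nohz_full_enabled(void)
{
	return model_nohz_full_running;
}

/* Mirrors the shape of the patched logic in lockup_detector_init(). */
model_mask_t model_watchdog_mask(model_mask_t possible)
{
	if (model_nohz_full_enabled()) {
		/* The mask is dereferenced only once it is known to exist. */
		assert(model_nohz_full_mask != NULL);
		return possible & ~*model_nohz_full_mask;	/* like cpumask_andnot() */
	}
	return possible;	/* like cpumask_copy() */
}
```

Without the `model_nohz_full_enabled()` guard, the first call after `model_nohz_setup(false, 0)` would dereference a NULL pointer, which is the userspace analogue of the crash reported here.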


* Re: [PATCH] watchdog: Fix a watchdog crash in some configurations
  2015-05-04 23:17 [PATCH] watchdog: Fix a watchdog crash in some configurations john.hubbard
@ 2015-05-05 13:35 ` Don Zickus
  2015-05-05 13:44   ` Chris Metcalf
  0 siblings, 1 reply; 6+ messages in thread
From: Don Zickus @ 2015-05-05 13:35 UTC (permalink / raw)
  To: john.hubbard
  Cc: Chris Metcalf, Ingo Molnar, Ulrich Obergfell, Thomas Gleixner,
	Peter Zijlstra, Andrew Morton, Stephen Rothwell, linux-next,
	John Hubbard

On Mon, May 04, 2015 at 04:17:07PM -0700, john.hubbard@gmail.com wrote:
> From: John Hubbard <jhubbard@nvidia.com>
> 
> Commit 8fcf2cc768acd845c1fed837bf9cfe2d7106336d in linux-next
> introduced a regression in some configurations. Specifically,
> with CONFIG_NO_HZ_FULL set, and CONFIG_NO_HZ_FULL_ALL *not* set,
> the kernel will crash in lockup_detector_init(), due to a
> NULL tick_nohz_full_mask pointer.
> 
> This is because the above commit uses tick_nohz_full_mask
> (in lockup_detector_init), if CONFIG_NO_HZ_FULL is set, but
> tick_nohz_full_mask only gets allocated if either:
> 
>     a) CONFIG_NO_HZ_FULL_ALL is set, or
> 
>     b) Someone passes in nohz_full=<any_value> on the boot
>       args line.
> 
> To correct this, change lockup_detector_init so that it does
> a runtime check (in addition to the ifdef check). This now
> matches the way most of the other CONFIG_NO_HZ_FULL code does
> its checking. This fix is a little simpler than my original
> proposed fix, thanks to Chris Metcalf for that.

Hi Chris,

If you are ok with this, I can forward it along.

Cheers,
Don



* Re: [PATCH] watchdog: Fix a watchdog crash in some configurations
  2015-05-05 13:35 ` Don Zickus
@ 2015-05-05 13:44   ` Chris Metcalf
  2015-05-05 14:06     ` Don Zickus
  0 siblings, 1 reply; 6+ messages in thread
From: Chris Metcalf @ 2015-05-05 13:44 UTC (permalink / raw)
  To: Don Zickus
  Cc: john.hubbard, Ingo Molnar, Ulrich Obergfell, Thomas Gleixner,
	Peter Zijlstra, Andrew Morton, Stephen Rothwell, linux-next,
	John Hubbard


> On May 5, 2015, at 9:35 AM, Don Zickus <dzickus@redhat.com> wrote:
> 
>> On Mon, May 04, 2015 at 04:17:07PM -0700, john.hubbard@gmail.com wrote:
>> From: John Hubbard <jhubbard@nvidia.com>
>> 
>> Commit 8fcf2cc768acd845c1fed837bf9cfe2d7106336d in linux-next
>> introduced a regression in some configurations. Specifically,
>> with CONFIG_NO_HZ_FULL set, and CONFIG_NO_HZ_FULL_ALL *not* set,
>> the kernel will crash in lockup_detector_init(), due to a
>> NULL tick_nohz_full_mask pointer.
>> 
>> This is because the above commit uses tick_nohz_full_mask
>> (in lockup_detector_init), if CONFIG_NO_HZ_FULL is set, but
>> tick_nohz_full_mask only gets allocated if either:
>> 
>>    a) CONFIG_NO_HZ_FULL_ALL is set, or
>> 
>>    b) Someone passes in nohz_full=<any_value> on the boot
>>      args line.
>> 
>> To correct this, change lockup_detector_init so that it does
>> a runtime check (in addition to the ifdef check). This now
>> matches the way most of the other CONFIG_NO_HZ_FULL code does
>> its checking. This fix is a little simpler than my original
>> proposed fix, thanks to Chris Metcalf for that.
> 
> Hi Chris,
> 
> If you are ok with this, I can forward it along.
> 
> Cheers,
> Don

With the new dynamic test, we don't actually need the ifdef anymore. I asked John if he could respin it without that. 



* Re: [PATCH] watchdog: Fix a watchdog crash in some configurations
  2015-05-05 13:44   ` Chris Metcalf
@ 2015-05-05 14:06     ` Don Zickus
  2015-05-05 19:38       ` [PATCH v2] " john.hubbard
  0 siblings, 1 reply; 6+ messages in thread
From: Don Zickus @ 2015-05-05 14:06 UTC (permalink / raw)
  To: Chris Metcalf
  Cc: john.hubbard, Ingo Molnar, Ulrich Obergfell, Thomas Gleixner,
	Peter Zijlstra, Andrew Morton, Stephen Rothwell, linux-next,
	John Hubbard

On Tue, May 05, 2015 at 01:44:57PM +0000, Chris Metcalf wrote:
> 
> > On May 5, 2015, at 9:35 AM, Don Zickus <dzickus@redhat.com> wrote:
> > 
> >> On Mon, May 04, 2015 at 04:17:07PM -0700, john.hubbard@gmail.com wrote:
> >> From: John Hubbard <jhubbard@nvidia.com>
> >> 
> >> Commit 8fcf2cc768acd845c1fed837bf9cfe2d7106336d in linux-next
> >> introduced a regression in some configurations. Specifically,
> >> with CONFIG_NO_HZ_FULL set, and CONFIG_NO_HZ_FULL_ALL *not* set,
> >> the kernel will crash in lockup_detector_init(), due to a
> >> NULL tick_nohz_full_mask pointer.
> >> 
> >> This is because the above commit uses tick_nohz_full_mask
> >> (in lockup_detector_init), if CONFIG_NO_HZ_FULL is set, but
> >> tick_nohz_full_mask only gets allocated if either:
> >> 
> >>    a) CONFIG_NO_HZ_FULL_ALL is set, or
> >> 
> >>    b) Someone passes in nohz_full=<any_value> on the boot
> >>      args line.
> >> 
> >> To correct this, change lockup_detector_init so that it does
> >> a runtime check (in addition to the ifdef check). This now
> >> matches the way most of the other CONFIG_NO_HZ_FULL code does
> >> its checking. This fix is a little simpler than my original
> >> proposed fix, thanks to Chris Metcalf for that.
> > 
> > Hi Chris,
> > 
> > If you are ok with this, I can forward it along.
> > 
> > Cheers,
> > Don
> 
> With the new dynamic test, we don't actually need the ifdef anymore. I asked John if he could respin it without that. 

Ok, I will wait for the respin. Thanks!

Cheers,
Don



* [PATCH v2] watchdog: Fix a watchdog crash in some configurations
  2015-05-05 14:06     ` Don Zickus
@ 2015-05-05 19:38       ` john.hubbard
  2015-05-05 22:12         ` Andrew Morton
  0 siblings, 1 reply; 6+ messages in thread
From: john.hubbard @ 2015-05-05 19:38 UTC (permalink / raw)
  To: Don Zickus, Chris Metcalf
  Cc: Ingo Molnar, Ulrich Obergfell, Thomas Gleixner, Peter Zijlstra,
	Andrew Morton, Stephen Rothwell, linux-next, John Hubbard

From: John Hubbard <jhubbard@nvidia.com>

Commit 8fcf2cc768acd845c1fed837bf9cfe2d7106336d in linux-next
introduced a regression in some configurations. Specifically,
with CONFIG_NO_HZ_FULL set, and CONFIG_NO_HZ_FULL_ALL *not* set,
the kernel will crash in lockup_detector_init(), due to a
NULL tick_nohz_full_mask pointer.

This is because the above commit uses tick_nohz_full_mask
(in lockup_detector_init), if CONFIG_NO_HZ_FULL is set, but
tick_nohz_full_mask only gets allocated if either:

    a) CONFIG_NO_HZ_FULL_ALL is set, or

    b) Someone passes in nohz_full=<any_value> on the boot
      args line.

To correct this, change lockup_detector_init so that it does
a runtime check instead of the ifdef check. This fix is
simpler than my original proposed fix, thanks to Chris Metcalf
for that.

Signed-off-by: John Hubbard <jhubbard@nvidia.com>
---
 kernel/watchdog.c | 16 ++++++++--------
 1 file changed, 8 insertions(+), 8 deletions(-)

diff --git a/kernel/watchdog.c b/kernel/watchdog.c
index 40fda2f..c2eb97c 100644
--- a/kernel/watchdog.c
+++ b/kernel/watchdog.c
@@ -920,14 +920,14 @@ void __init lockup_detector_init(void)
 {
 	set_sample_period();
 
-#ifdef CONFIG_NO_HZ_FULL
-	if (!cpumask_empty(tick_nohz_full_mask))
-		pr_info("Disabling watchdog on nohz_full cores by default\n");
-	cpumask_andnot(&watchdog_cpumask, cpu_possible_mask,
-		       tick_nohz_full_mask);
-#else
-	cpumask_copy(&watchdog_cpumask, cpu_possible_mask);
-#endif
+	if (tick_nohz_full_enabled()) {
+		if (!cpumask_empty(tick_nohz_full_mask))
+			pr_info("Disabling watchdog on nohz_full cores by default\n");
+		cpumask_andnot(&watchdog_cpumask, cpu_possible_mask,
+			       tick_nohz_full_mask);
+	}
+	else
+		cpumask_copy(&watchdog_cpumask, cpu_possible_mask);
 
 	if (watchdog_enabled)
 		watchdog_enable_all_cpus();
-- 
2.3.7


* Re: [PATCH v2] watchdog: Fix a watchdog crash in some configurations
  2015-05-05 19:38       ` [PATCH v2] " john.hubbard
@ 2015-05-05 22:12         ` Andrew Morton
  0 siblings, 0 replies; 6+ messages in thread
From: Andrew Morton @ 2015-05-05 22:12 UTC (permalink / raw)
  To: john.hubbard
  Cc: Don Zickus, Chris Metcalf, Ingo Molnar, Ulrich Obergfell,
	Thomas Gleixner, Peter Zijlstra, Stephen Rothwell, linux-next,
	John Hubbard

On Tue,  5 May 2015 12:38:11 -0700 john.hubbard@gmail.com wrote:

> From: John Hubbard <jhubbard@nvidia.com>
> 
> Commit 8fcf2cc768acd845c1fed837bf9cfe2d7106336d in linux-next
> introduced a regression in some configurations. Specifically,
> with CONFIG_NO_HZ_FULL set, and CONFIG_NO_HZ_FULL_ALL *not* set,
> the kernel will crash in lockup_detector_init(), due to a
> NULL tick_nohz_full_mask pointer.
> 
> This is because the above commit uses tick_nohz_full_mask
> (in lockup_detector_init), if CONFIG_NO_HZ_FULL is set, but
> tick_nohz_full_mask only gets allocated if either:
> 
>     a) CONFIG_NO_HZ_FULL_ALL is set, or
> 
>     b) Someone passes in nohz_full=<any_value> on the boot
>       args line.
> 
> To correct this, change lockup_detector_init so that it does
> a runtime check instead of the ifdef check. This fix is
> simpler than my original proposed fix, thanks to Chris Metcalf
> for that.
> 
> ...
>
> --- a/kernel/watchdog.c
> +++ b/kernel/watchdog.c
> @@ -920,14 +920,14 @@ void __init lockup_detector_init(void)
>  {
>  	set_sample_period();
>  
> -#ifdef CONFIG_NO_HZ_FULL
> -	if (!cpumask_empty(tick_nohz_full_mask))
> -		pr_info("Disabling watchdog on nohz_full cores by default\n");
> -	cpumask_andnot(&watchdog_cpumask, cpu_possible_mask,
> -		       tick_nohz_full_mask);
> -#else
> -	cpumask_copy(&watchdog_cpumask, cpu_possible_mask);
> -#endif
> +	if (tick_nohz_full_enabled()) {
> +		if (!cpumask_empty(tick_nohz_full_mask))
> +			pr_info("Disabling watchdog on nohz_full cores by default\n");
> +		cpumask_andnot(&watchdog_cpumask, cpu_possible_mask,
> +			       tick_nohz_full_mask);
> +	}
> +	else
> +		cpumask_copy(&watchdog_cpumask, cpu_possible_mask);

Breaks x86_64 allmodconfig:

kernel/watchdog.c: In function 'lockup_detector_init':
kernel/watchdog.c:924: error: 'tick_nohz_full_mask' undeclared (first use in this function)
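
A note on why dropping the ifdef can fail this way: the body of an ordinary `if` still has to compile even when its condition is always false, so every identifier used in it needs a declaration in every configuration. The error suggests that tick_nohz_full_mask is presumably not declared when CONFIG_NO_HZ_FULL is off. One possible way to keep the call site free of `#ifdef`, shown purely as a sketch and not as the fix that was eventually applied, is to reach the mask through an accessor that has a stub in the compiled-out case. `MODEL_NO_HZ_FULL` and all `model_*` names are hypothetical, and a 64-bit word again stands in for a cpumask.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Userspace model only: one 64-bit word stands in for a struct cpumask. */
typedef uint64_t model_mask_t;

#ifdef MODEL_NO_HZ_FULL
/* Feature compiled in: the mask variable exists. */
static model_mask_t model_nohz_mask = 0xC;

static bool model_nohz_full_enabled(void)
{
	return true;
}

static model_mask_t model_nohz_cpus(void)
{
	return model_nohz_mask;
}
#else
/* Feature compiled out: no mask variable exists, only stubs. */
static bool model_nohz_full_enabled(void)
{
	return false;
}

static model_mask_t model_nohz_cpus(void)
{
	return 0;
}
#endif

/* The call site builds in both configurations without any #ifdef. */
model_mask_t model_watchdog_mask(model_mask_t possible)
{
	assert(possible != 0);	/* at least one possible CPU is expected */
	if (model_nohz_full_enabled())
		return possible & ~model_nohz_cpus();
	return possible;
}
```

Replacing `model_nohz_cpus()` in the call site with a direct reference to `model_nohz_mask` would reproduce the same kind of "undeclared" error whenever `MODEL_NO_HZ_FULL` is not defined, even though that branch could never run.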

