* [PATCH] x86, sched: Prevent divisions by zero in frequency invariant accounting
@ 2020-04-22 14:40 Giovanni Gherdovich
From: Giovanni Gherdovich @ 2020-04-22 14:40 UTC (permalink / raw)
  To: Ingo Molnar, Peter Zijlstra, Rafael J . Wysocki
  Cc: x86, linux-pm, linux-kernel, Linus Torvalds, Srinivas Pandruvada,
	Thomas Gleixner, Borislav Petkov, Len Brown, Giovanni Gherdovich

The product mcnt * arch_max_freq_ratio could be zero if it overflows u64.

For context, a large value for arch_max_freq_ratio would be 5000,
corresponding to a turbo_freq/base_freq ratio of 5 (normally it's more like
1500-2000). A large increment frequency for the MPERF counter would be 5GHz
(the base clock of all CPUs on the market today is less than that). With
these figures, a CPU would need to go without a scheduler tick for around 8
days for the u64 overflow to happen. It is unlikely, but the check is
warranted.

Signed-off-by: Giovanni Gherdovich <ggherdovich@suse.cz>
Fixes: 1567c3e3467c ("x86, sched: Add support for frequency invariance")
---
 arch/x86/kernel/smpboot.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/arch/x86/kernel/smpboot.c b/arch/x86/kernel/smpboot.c
index 8c89e4d9ad28..fb71395cbcad 100644
--- a/arch/x86/kernel/smpboot.c
+++ b/arch/x86/kernel/smpboot.c
@@ -2055,14 +2055,14 @@ void arch_scale_freq_tick(void)
 
 	acnt = aperf - this_cpu_read(arch_prev_aperf);
 	mcnt = mperf - this_cpu_read(arch_prev_mperf);
-	if (!mcnt)
-		return;
 
 	this_cpu_write(arch_prev_aperf, aperf);
 	this_cpu_write(arch_prev_mperf, mperf);
 
 	acnt <<= 2*SCHED_CAPACITY_SHIFT;
 	mcnt *= arch_max_freq_ratio;
+	if (!mcnt)
+		return;
 
 	freq_scale = div64_u64(acnt, mcnt);
 
-- 
2.16.4



* Re: [PATCH] x86, sched: Prevent divisions by zero in frequency invariant accounting
From: Peter Zijlstra @ 2020-04-22 14:53 UTC (permalink / raw)
  To: Giovanni Gherdovich
  Cc: Ingo Molnar, Rafael J . Wysocki, x86, linux-pm, linux-kernel,
	Linus Torvalds, Srinivas Pandruvada, Thomas Gleixner,
	Borislav Petkov, Len Brown

On Wed, Apr 22, 2020 at 04:40:55PM +0200, Giovanni Gherdovich wrote:
> The product mcnt * arch_max_freq_ratio could be zero if it overflows u64.
> 
> For context, a large value for arch_max_freq_ratio would be 5000,
> corresponding to a turbo_freq/base_freq ratio of 5 (normally it's more like
> 1500-2000). A large increment frequency for the MPERF counter would be 5GHz
> (the base clock of all CPUs on the market today is less than that). With
> these figures, a CPU would need to go without a scheduler tick for around 8
> days for the u64 overflow to happen. It is unlikely, but the check is
> warranted.
> 
> Signed-off-by: Giovanni Gherdovich <ggherdovich@suse.cz>
> Fixes: 1567c3e3467c ("x86, sched: Add support for frequency invariance")
> ---
>  arch/x86/kernel/smpboot.c | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
> 
> diff --git a/arch/x86/kernel/smpboot.c b/arch/x86/kernel/smpboot.c
> index 8c89e4d9ad28..fb71395cbcad 100644
> --- a/arch/x86/kernel/smpboot.c
> +++ b/arch/x86/kernel/smpboot.c
> @@ -2055,14 +2055,14 @@ void arch_scale_freq_tick(void)
>  
>  	acnt = aperf - this_cpu_read(arch_prev_aperf);
>  	mcnt = mperf - this_cpu_read(arch_prev_mperf);
> -	if (!mcnt)
> -		return;
>  
>  	this_cpu_write(arch_prev_aperf, aperf);
>  	this_cpu_write(arch_prev_mperf, mperf);
>  
>  	acnt <<= 2*SCHED_CAPACITY_SHIFT;
>  	mcnt *= arch_max_freq_ratio;
> +	if (!mcnt)
> +		return;

Should we not pr_warn() and disable the whole thing when this happens?


* Re: [PATCH] x86, sched: Prevent divisions by zero in frequency invariant accounting
From: Giovanni Gherdovich @ 2020-04-22 17:17 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: Ingo Molnar, Rafael J . Wysocki, x86, linux-pm, linux-kernel,
	Linus Torvalds, Srinivas Pandruvada, Thomas Gleixner,
	Borislav Petkov, Len Brown

On Wed, 2020-04-22 at 16:53 +0200, Peter Zijlstra wrote:
> On Wed, Apr 22, 2020 at 04:40:55PM +0200, Giovanni Gherdovich wrote:
> > The product mcnt * arch_max_freq_ratio could be zero if it overflows u64.
> > 
> > For context, a large value for arch_max_freq_ratio would be 5000,
> > corresponding to a turbo_freq/base_freq ratio of 5 (normally it's more like
> > 1500-2000). A large increment frequency for the MPERF counter would be 5GHz
> > (the base clock of all CPUs on the market today is less than that). With
> > these figures, a CPU would need to go without a scheduler tick for around 8
> > days for the u64 overflow to happen. It is unlikely, but the check is
> > warranted.
> > 
> > Signed-off-by: Giovanni Gherdovich <ggherdovich@suse.cz>
> > Fixes: 1567c3e3467c ("x86, sched: Add support for frequency invariance")
> > ---
> >  arch/x86/kernel/smpboot.c | 4 ++--
> >  1 file changed, 2 insertions(+), 2 deletions(-)
> > 
> > diff --git a/arch/x86/kernel/smpboot.c b/arch/x86/kernel/smpboot.c
> > index 8c89e4d9ad28..fb71395cbcad 100644
> > --- a/arch/x86/kernel/smpboot.c
> > +++ b/arch/x86/kernel/smpboot.c
> > @@ -2055,14 +2055,14 @@ void arch_scale_freq_tick(void)
> >  
> >  	acnt = aperf - this_cpu_read(arch_prev_aperf);
> >  	mcnt = mperf - this_cpu_read(arch_prev_mperf);
> > -	if (!mcnt)
> > -		return;
> >  
> >  	this_cpu_write(arch_prev_aperf, aperf);
> >  	this_cpu_write(arch_prev_mperf, mperf);
> >  
> >  	acnt <<= 2*SCHED_CAPACITY_SHIFT;
> >  	mcnt *= arch_max_freq_ratio;
> > +	if (!mcnt)
> > +		return;
> 
> Should we not pr_warn() and disable the whole thing when this happens?

Ok, I will resend this patch disabling freq invariant accounting when this
overflow happens.

To elaborate further, your comment touches on an area where x86 freq
invariance is currently very weak: what happens if the tick doesn't run on
a CPU for a long time? (Answer: the freq_scale estimate is garbage.)
And by "a long time" I mean a few seconds; the patch I'm about to resend only
covers a minuscule fraction of those cases. That is, not only has the tick
been missing for days (?!), we only noticed because the product
mcnt * arch_max_freq_ratio landed on an exact multiple of 2^64 (aka 0). Had it
waited one more millisecond, we wouldn't have seen the issue at all.

Anyway, I agree on the principle: even if we can't address all problems now,
let's at least cover those where the solution is easy.


Giovanni

