From: Viresh Kumar <viresh.kumar@linaro.org>
To: "Rafael J. Wysocki" <rjw@rjwysocki.net>
Cc: Linux PM <linux-pm@vger.kernel.org>,
	x86 Maintainers <x86@kernel.org>,
	Peter Zijlstra <peterz@infradead.org>,
	Thomas Gleixner <tglx@linutronix.de>,
	Borislav Petkov <bp@suse.de>, LKML <linux-kernel@vger.kernel.org>
Subject: Re: [PATCH] x86: tsc: Rework time_cpufreq_notifier()
Date: Mon, 22 Apr 2019 13:47:09 +0530
Message-ID: <20190422081709.i5mv4fbket5ls4xc@vireshk-i7>
In-Reply-To: <38900622.ao2n2t5aPS@kreacher>

On 18-04-19, 16:11, Rafael J. Wysocki wrote:
> From: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
> 
> There are problems with running time_cpufreq_notifier() on SMP
> systems.
> 
> First off, the rdtsc() called from there runs on the CPU executing
> that code and not necessarily on the CPU whose sched_clock() rate is
> updated which is questionable at best.
> 
> Second, in the cases when the frequencies of all CPUs in an SMP
> system are always in sync, it is not sufficient to update just
> one of them or the set associated with a given cpufreq policy on
> frequency changes - all CPUs in the system should be updated and
> that would require more than a simple transition notifier.
> 
> Note, however, that the underlying issue (the TSC rate depending on
> the CPU frequency) has not been present in hardware shipping for the
> last few years and in quite a few relevant cases (acpi-cpufreq in
> particular) running time_cpufreq_notifier() will cause the TSC to
> be marked as unstable anyway.
> 
> For this reason, make time_cpufreq_notifier() simply mark the TSC
> as unstable and give up when run on SMP and only try to carry out
> any adjustments otherwise.
> 
> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
> ---
>  arch/x86/kernel/tsc.c |   29 ++++++++++++++---------------
>  1 file changed, 14 insertions(+), 15 deletions(-)
> 
> Index: linux-pm/arch/x86/kernel/tsc.c
> ===================================================================
> --- linux-pm.orig/arch/x86/kernel/tsc.c
> +++ linux-pm/arch/x86/kernel/tsc.c
> @@ -185,8 +185,7 @@ static void __init cyc2ns_init_boot_cpu(
>  /*
>   * Secondary CPUs do not run through tsc_init(), so set up
>   * all the scale factors for all CPUs, assuming the same
> - * speed as the bootup CPU. (cpufreq notifiers will fix this
> - * up if their speed diverges)
> + * speed as the bootup CPU.
>   */
>  static void __init cyc2ns_init_secondary_cpus(void)
>  {
> @@ -937,12 +936,12 @@ void tsc_restore_sched_clock_state(void)
>  }
>  
>  #ifdef CONFIG_CPU_FREQ
> -/* Frequency scaling support. Adjust the TSC based timer when the cpu frequency
> +/*
> + * Frequency scaling support. Adjust the TSC based timer when the CPU frequency
>   * changes.
>   *
> - * RED-PEN: On SMP we assume all CPUs run with the same frequency.  It's
> - * not that important because current Opteron setups do not support
> - * scaling on SMP anyroads.
> + * NOTE: On SMP the situation is not fixable in general, so simply mark the TSC
> + * as unstable and give up in those cases.
>   *
>   * Should fix up last_tsc too. Currently gettimeofday in the
>   * first tick after the change will be slightly wrong.
> @@ -956,22 +955,22 @@ static int time_cpufreq_notifier(struct
>  				void *data)
>  {
>  	struct cpufreq_freqs *freq = data;
> -	unsigned long *lpj;
>  
> -	lpj = &boot_cpu_data.loops_per_jiffy;
> -#ifdef CONFIG_SMP
> -	if (!(freq->flags & CPUFREQ_CONST_LOOPS))
> -		lpj = &cpu_data(freq->cpu).loops_per_jiffy;
> -#endif
> +	if (num_online_cpus() > 1) {

What about checking num_possible_cpus() instead? That way we reliably
bail out every time, even if some CPUs have been offlined.

And can we do this check before registering the notifier, so that it
never gets called on such systems?
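
To illustrate the second point, here is a rough userspace model of the
idea (untested against a real kernel, and not kernel code). Everything
prefixed with fake_ is a hypothetical stand-in: the struct fields only
model num_possible_cpus() and the boot_cpu_has() feature checks, and
fake_register_tsc_scaling() models whatever init-time function registers
the notifier. One assumption to flag: this sketch marks the TSC unstable
at registration time, whereas the patch does that only when a frequency
transition is actually notified, so it is a behavior change that would
need to be agreed on.

```c
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical model of the relevant system state; not a kernel structure. */
struct fake_system {
	unsigned int possible_cpus;	/* models num_possible_cpus() */
	bool has_tsc;			/* models boot_cpu_has(X86_FEATURE_TSC) */
	bool constant_tsc;		/* models boot_cpu_has(X86_FEATURE_CONSTANT_TSC) */
	bool tsc_unstable;		/* set by the mark_tsc_unstable() stand-in */
	bool notifier_registered;	/* set when the notifier would be registered */
};

/* Stand-in for mark_tsc_unstable(): record the state and log the reason. */
static void fake_mark_tsc_unstable(struct fake_system *sys, const char *reason)
{
	sys->tsc_unstable = true;
	printf("TSC marked unstable: %s\n", reason);
}

/*
 * Stand-in for the init-time registration step. The SMP check is done
 * here, once, so the notifier is never registered (and therefore never
 * called) when more than one CPU is possible.
 */
static void fake_register_tsc_scaling(struct fake_system *sys)
{
	/* Nothing to adjust without a TSC, or when its rate is constant. */
	if (!sys->has_tsc || sys->constant_tsc)
		return;

	/* Possible, not online: the answer does not change with CPU hotplug. */
	if (sys->possible_cpus > 1) {
		fake_mark_tsc_unstable(sys, "cpufreq changes on SMP");
		return;
	}

	sys->notifier_registered = true;
}
```

With possible_cpus = 4 the model marks the TSC unstable and registers
nothing; with possible_cpus = 1 it registers the notifier and leaves the
TSC alone.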

> +		mark_tsc_unstable("cpufreq changes on SMP");
> +		return 0;
> +	}
>  
>  	if (!ref_freq) {
>  		ref_freq = freq->old;
> -		loops_per_jiffy_ref = *lpj;
> +		loops_per_jiffy_ref = boot_cpu_data.loops_per_jiffy;
>  		tsc_khz_ref = tsc_khz;
>  	}
> +
>  	if ((val == CPUFREQ_PRECHANGE  && freq->old < freq->new) ||
> -			(val == CPUFREQ_POSTCHANGE && freq->old > freq->new)) {
> -		*lpj = cpufreq_scale(loops_per_jiffy_ref, ref_freq, freq->new);
> +	    (val == CPUFREQ_POSTCHANGE && freq->old > freq->new)) {
> +		boot_cpu_data.loops_per_jiffy =
> +			cpufreq_scale(loops_per_jiffy_ref, ref_freq, freq->new);
>  
>  		tsc_khz = cpufreq_scale(tsc_khz_ref, ref_freq, freq->new);
>  		if (!(freq->flags & CPUFREQ_CONST_LOOPS))
> 
> 

-- 
viresh
