From: Giovanni Gherdovich <ggherdovich@suse.cz>
To: Ricardo Neri <ricardo.neri-calderon@linux.intel.com>
Cc: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com>,
	Thomas Gleixner <tglx@linutronix.de>,
	Ingo Molnar <mingo@redhat.com>,
	Peter Zijlstra <peterz@infradead.org>,
	Borislav Petkov <bp@suse.de>, Len Brown <lenb@kernel.org>,
	"Rafael J . Wysocki" <rjw@rjwysocki.net>,
	x86@kernel.org, linux-pm@vger.kernel.org,
	linux-kernel@vger.kernel.org,
	Mel Gorman <mgorman@techsingularity.net>,
	Doug Smythies <dsmythies@telus.net>,
	Like Xu <like.xu@linux.intel.com>,
	Neil Rickert <nwr10cst-oslnx@yahoo.com>,
	Chris Wilson <chris@chris-wilson.co.uk>
Subject: Re: [PATCH 1/4] x86, sched: Bail out of frequency invariance if base frequency is unknown
Date: Thu, 23 Apr 2020 10:06:04 +0200
Message-ID: <1587629164.28094.11.camel@suse.cz>
In-Reply-To: <20200422171547.GA11942@ranerica-svr.sc.intel.com>

On Wed, 2020-04-22 at 10:15 -0700, Ricardo Neri wrote:
> On Thu, Apr 16, 2020 at 07:47:42AM +0200, Giovanni Gherdovich wrote:
> > Some hypervisors such as VMWare ESXi 5.5 advertise support for
> > X86_FEATURE_APERFMPERF but then fill all MSR's with zeroes. In particular,
> > MSR_PLATFORM_INFO set to zero tricks the code that wants to know the base
> > clock frequency of the CPU (highest non-turbo frequency), producing a
> > division by zero when computing the ratio turbo_freq/base_freq necessary
> > for frequency invariant accounting.
> > 
> > It is to be noted that even if MSR_PLATFORM_INFO contained the appropriate
> > data, APERF and MPERF are constantly zero on ESXi 5.5, thus freq-invariance
> > couldn't be done in principle (not that it would make a lot of sense in a
> > VM anyway). The real problem is advertising X86_FEATURE_APERFMPERF. This
> > appears to be fixed in more recent versions: ESXi 6.7 doesn't advertise
> > that feature.
> > 
> > Signed-off-by: Giovanni Gherdovich <ggherdovich@suse.cz>
> > Fixes: 1567c3e3467c ("x86, sched: Add support for frequency invariance")
> > ---
> >  arch/x86/kernel/smpboot.c | 9 +++++++++
> >  1 file changed, 9 insertions(+)
> > 
> > diff --git a/arch/x86/kernel/smpboot.c b/arch/x86/kernel/smpboot.c
> > index fe3ab9632f3b..3a318ec9bc17 100644
> > --- a/arch/x86/kernel/smpboot.c
> > +++ b/arch/x86/kernel/smpboot.c
> > @@ -1985,6 +1985,15 @@ static bool intel_set_max_freq_ratio(void)
> >  	return false;
> >  
> >  out:
> > +	/*
> > +	 * Some hypervisors advertise X86_FEATURE_APERFMPERF
> > +	 * but then fill all MSR's with zeroes.
> > +	 */
> > +	if (!base_freq) {
> > +		pr_debug("Couldn't determine cpu base frequency, necessary for scale-invariant accounting.\n");
> > +		return false;
> > +	}
> 
> It may be possible that MSR_TURBO_RATIO_LIMIT is also all-zeros. In
> such case, turbo_freq will be also zero. If that is the case,
> arch_max_freq_ratio will be zero and we will see a division by zero
> exception in arch_scale_freq_tick() because mcnt is multiplied by
> arch_max_freq_ratio().

Thanks Ricardo for clarifying this.
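
A minimal userspace sketch of the failure mode described above, not the
kernel code itself. The function name and signature are hypothetical; acnt
and mcnt stand in for the APERF/MPERF deltas and max_freq_ratio for
arch_max_freq_ratio. It assumes the tick computation divides the scaled
acnt by mcnt times the ratio, so a zero ratio makes the divisor zero:

```c
#include <stdbool.h>
#include <stdint.h>

#define SCHED_CAPACITY_SHIFT 10

/*
 * Hypothetical model of the per-tick frequency scale computation.
 * Returns false instead of dividing when the divisor would be zero,
 * which happens if either mcnt or max_freq_ratio is zero.
 * Overflow of the shift and the multiplication is not handled here.
 */
static bool freq_scale_from_deltas(uint64_t acnt, uint64_t mcnt,
				   uint64_t max_freq_ratio, uint64_t *scale)
{
	uint64_t num, den;

	den = mcnt * max_freq_ratio;
	if (!den)
		return false;

	num = acnt << (2 * SCHED_CAPACITY_SHIFT);
	*scale = num / den;
	return true;
}
```

With a ratio of 1024 (turbo equal to base) and acnt equal to mcnt, the
result is 1024, i.e. full capacity. With a ratio of zero the function
refuses to compute anything.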

Follow-up question: when I see an all-zeros MSR_TURBO_RATIO_LIMIT, can I
assume the CPU doesn't support turbo boost? Or is it possible that such a CPU
has turbo boost, but the turbo ratios just aren't declared in the MSR?

Some context: this feature (called "frequency invariance") needs to know
the maximum clock frequency a CPU can reach, which it uses in some scheduler
calculations. That is hard to know precisely, because turbo can kick in at
any time and depends on many factors. So it settles for an "average maximum
frequency", and I chose the 4-core turbo frequency as a good estimate of it.
Now, if an all-zeros MSR_TURBO_RATIO_LIMIT means "turbo boost unsupported",
that is actually the easy case, because then I know exactly what the max
frequency is (the base frequency). If, on the other hand, an all-zeros MSR
means "there may or may not be turbo, and you can't tell how much", then I
must disable frequency invariance.
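
The two readings of an all-zeros MSR can be sketched as a small decision
function. This is a userspace illustration under stated assumptions, not
the patch: all names are hypothetical, the base ratio is assumed to sit in
bits 15:8 of MSR_PLATFORM_INFO and the 4-core turbo ratio in bits 31:24 of
MSR_TURBO_RATIO_LIMIT (the layout on typical Core parts), and the open
question itself is passed in as the parameter zero_turbo_means_no_turbo:

```c
#include <stdbool.h>
#include <stdint.h>

#define SCHED_CAPACITY_SCALE 1024ULL

enum freq_inv_policy {
	FREQ_INV_DISABLE,	/* base unknown, or turbo cannot be determined */
	FREQ_INV_USE_BASE,	/* no turbo: max frequency is the base frequency */
	FREQ_INV_USE_TURBO,	/* 4-core turbo as the "average maximum" */
};

/* Assumed field layout; see the lead-in. */
static uint64_t base_ratio(uint64_t platform_info)
{
	return (platform_info >> 8) & 0xFF;
}

static uint64_t turbo_4c_ratio(uint64_t turbo_ratio_limit)
{
	return (turbo_ratio_limit >> 24) & 0xFF;
}

static enum freq_inv_policy pick_policy(uint64_t base, uint64_t turbo,
					bool zero_turbo_means_no_turbo)
{
	if (!base)
		return FREQ_INV_DISABLE;
	if (turbo)
		return FREQ_INV_USE_TURBO;
	/* turbo == 0: the answer depends on what an all-zeros MSR means. */
	return zero_turbo_means_no_turbo ? FREQ_INV_USE_BASE : FREQ_INV_DISABLE;
}

/* Ratio max_freq/base_freq in SCHED_CAPACITY_SCALE units, 0 if disabled. */
static uint64_t max_freq_ratio(enum freq_inv_policy policy,
			       uint64_t base, uint64_t turbo)
{
	switch (policy) {
	case FREQ_INV_USE_TURBO:
		return turbo * SCHED_CAPACITY_SCALE / base;
	case FREQ_INV_USE_BASE:
		return SCHED_CAPACITY_SCALE;
	default:
		return 0;
	}
}
```

For example, a base ratio of 24 and a 4-core turbo ratio of 30 give a
max_freq_ratio of 1280; with a zero turbo ratio the result is either 1024
or "disabled", depending on the answer to the question above.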


Thanks,
Giovanni


Thread overview: 17+ messages
2020-04-16  5:47 [PATCH 0/4] Frequency invariance fixes for x86 Giovanni Gherdovich
2020-04-16  5:47 ` [PATCH 1/4] x86, sched: Bail out of frequency invariance if base frequency is unknown Giovanni Gherdovich
2020-04-16  6:41   ` Giovanni Gherdovich
2020-04-16 15:41   ` Rafael J. Wysocki
2020-04-22 17:15   ` Ricardo Neri
2020-04-23  8:06     ` Giovanni Gherdovich [this message]
2020-04-24  1:32       ` Ricardo Neri
2020-04-24  5:53         ` Giovanni Gherdovich
2020-04-22 21:20   ` [tip: sched/urgent] " tip-bot2 for Giovanni Gherdovich
2020-04-16  5:47 ` [PATCH 2/4] x86, sched: Account for CPUs with less than 4 cores in freq. invariance Giovanni Gherdovich
2020-04-16 15:26   ` Dave Kleikamp
2020-04-22 21:20   ` [tip: sched/urgent] " tip-bot2 for Giovanni Gherdovich
2020-04-16  5:47 ` [PATCH 3/4] x86, sched: Don't enable static key when starting secondary CPUs Giovanni Gherdovich
2020-04-22 21:20   ` [tip: sched/urgent] " tip-bot2 for Peter Zijlstra (Intel)
2020-04-16  5:47 ` [PATCH 4/4] x86, sched: Move check for CPU type to caller function Giovanni Gherdovich
2020-04-22 21:20   ` [tip: sched/urgent] " tip-bot2 for Giovanni Gherdovich
2020-04-16  7:20 ` [PATCH 0/4] Frequency invariance fixes for x86 Peter Zijlstra
