linux-pm.vger.kernel.org archive mirror
From: Doug Smythies <dsmythies@telus.net>
To: Feng Tang <feng.tang@intel.com>
Cc: Thomas Gleixner <tglx@linutronix.de>,
	"paulmck@kernel.org" <paulmck@kernel.org>,
	"stable@vger.kernel.org" <stable@vger.kernel.org>,
	"x86@kernel.org" <x86@kernel.org>,
	"linux-pm@vger.kernel.org" <linux-pm@vger.kernel.org>,
	srinivas pandruvada <srinivas.pandruvada@linux.intel.com>,
	dsmythies <dsmythies@telus.net>
Subject: Re: CPU excessively long times between frequency scaling driver calls - bisected
Date: Tue, 8 Feb 2022 22:23:13 -0800
Message-ID: <CAAYoRsXkyWf0vmEE2HvjF6pzCC4utxTF=7AFx1PJv4Evh=C+Ow@mail.gmail.com>
In-Reply-To: <20220208091525.GA7898@shbuild999.sh.intel.com>

On Tue, Feb 8, 2022 at 1:15 AM Feng Tang <feng.tang@intel.com> wrote:
> On Mon, Feb 07, 2022 at 11:13:00PM -0800, Doug Smythies wrote:
> > > >
> > > > Since kernel 5.16-rc4 and commit: b50db7095fe002fa3e16605546cba66bf1b68a3e
> > > > " x86/tsc: Disable clocksource watchdog for TSC on qualified platorms"
> > > >
> > > > There are now occasions where times between calls to the driver can be
> > > > over 100's of seconds and can result in the CPU frequency being left
> > > > unnecessarily high for extended periods.
> > > >
> > > > From the number of clock cycles executed between these long
> > > > durations one can tell that the CPU has been running code, but
> > > > the driver never got called.
> > > >
> > > > Attached are some graphs from some trace data acquired using
> > > > intel_pstate_tracer.py where one can observe an idle system between
> > > > about 42 and well over 200 seconds elapsed time, yet CPU10 never gets
> > > > called, which would have resulted in reducing its pstate request, until
> > > > an elapsed time of 167.616 seconds, 126 seconds since the last call. The
> > > > CPU frequency never does go to minimum.
> > > >
> > > > For reference, a similar CPU frequency graph is also attached, with
> > > > the commit reverted. The CPU frequency drops to minimum,
> > > > over about 10 or 15 seconds.
> > >
> > > commit b50db7095fe0 essentially disables the clocksource watchdog,
> > > which literally doesn't have much to do with cpufreq code.
> > >
> > > One thing I can think of is, without the patch, there is a periodic
> > > clocksource timer running every 500 ms, and it loops to run on
> > > all CPUs in turn. For your HW, it has 12 CPUs (from the graph),
> > > so each CPU will get a timer (HW timer interrupt backed) every 6
> > > seconds. Could this affect the cpufreq governor's work flow (I just
> > > quickly read some cpufreq code, and seem there is irq_work/workqueue
> > > involved).
> >
> > 6 Seconds is the longest duration I have ever seen on this
> > processor before commit b50db7095fe0.
> >
> > I said "the times between calls to the driver have never
> > exceeded 10 seconds" originally, but that involved other processors.
> >
> > I also did longer, 9000 second tests:
> >
> > For a reverted kernel the driver was called 131,743 times,
> > and 0 times the duration was longer than 6.1 seconds.
> >
> > For a non-reverted kernel the driver was called 110,241 times,
> > and 1397 times the duration was longer than 6.1 seconds,
> > and the maximum duration was 303.6 seconds.
>
> Thanks for the data, which shows it is related to the removal of
> clocksource watchdog timers. And under this specific configuration,
> the cpufreq work flow has some dependence on those watchdog timers.
>
> Also could you share your kernel config, boot message and some
> system settings like for tickless mode, so that other people can
> try to reproduce? thanks

I steal the kernel configuration file from the Ubuntu mainline PPA
[1], the one they call "lowlatency", which uses a 1000 Hz tick. I make
these changes before compiling:

scripts/config --disable DEBUG_INFO
scripts/config --disable SYSTEM_TRUSTED_KEYS
scripts/config --disable SYSTEM_REVOCATION_KEYS

I also sent you the config and dmesg files in an off-list email.

This is an idle system test, with only very low periodic loads.
My test computer has no GUI and very few services running.
Notice that I have not used the word "regression" yet in this thread,
because I don't know for certain that it is. In the end, we don't
care about CPU frequency, we care about wasting energy.
It is definitely a change, and I am able to measure small increases
in energy use, but this is all at the low end of the power curve.
So far I have not found a significant example of increased power
use, but I also have not looked very hard.

During any test, many monitoring tools might shorten durations.
For example if I run turbostat, say:

sudo turbostat --Summary --quiet \
  --show Busy%,Bzy_MHz,IRQ,PkgWatt,PkgTmp,RAMWatt,GFXWatt,CorWatt \
  --interval 2.5

Well, yes, then the maximum duration would be 2.5 seconds,
because turbostat wakes up each CPU to inquire about things,
causing a call to the CPU scaling driver. (I tested this, for about
900 seconds.)

For my power tests I use a sample interval of >= 300 seconds.
For duration only tests, turbostat is not run at the same time.

My grub line:

GRUB_CMDLINE_LINUX_DEFAULT="ipv6.disable=1 consoleblank=314 intel_pstate=active intel_pstate=no_hwp msr.allow_writes=on cpuidle.governor=teo"

A typical pstate tracer command (with the script copied to the
directory where I run this stuff):

sudo ./intel_pstate_tracer.py --interval 600 --name vnew02 --memory 800000
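
The duration figures quoted earlier (number of driver calls, how many
gaps exceeded 6.1 seconds, and the maximum gap) boil down to a per-CPU
scan of call timestamps. Below is a minimal sketch of that calculation,
not the tracer's own code. The function name gap_stats, the
(cpu, time) input layout, and the sample values are all assumptions
made for illustration; only the 6.1 second threshold and the 126 second
CPU10 gap come from this thread.

```python
from collections import defaultdict


def gap_stats(samples, threshold_s=6.1):
    """Summarize time between driver calls, tracked separately per CPU.

    samples: iterable of (cpu, time_s) pairs, one pair per driver call.
    Returns (total_calls, gaps_longer_than_threshold, max_gap_s).
    """
    by_cpu = defaultdict(list)
    for cpu, time_s in samples:
        by_cpu[cpu].append(time_s)

    calls = 0
    long_gaps = 0
    max_gap = 0.0
    for times in by_cpu.values():
        times.sort()
        calls += len(times)
        # A gap is the time between consecutive calls on the same CPU.
        for prev, cur in zip(times, times[1:]):
            gap = cur - prev
            if gap > threshold_s:
                long_gaps += 1
            max_gap = max(max_gap, gap)
    return calls, long_gaps, max_gap


# Made-up samples: CPU 10 goes 126 seconds without a call, CPU 3 does not.
samples = [(10, 41.5), (10, 167.5), (3, 40.0), (3, 44.0), (3, 50.0)]
print(gap_stats(samples))
```

Gaps are computed per CPU rather than across the merged stream, because
a busy CPU would otherwise hide a long silence on an idle one.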

>
> > > Can you try one test that keep all the current setting and change
> > > the irq affinity of disk/network-card to 0xfff to let interrupts
> > > from them be distributed to all CPUs?
> >
> > I am willing to do the test, but I do not know how to change the
> > irq affinity.
>
> I might say that too soon. I used to "echo fff > /proc/irq/xxx/smp_affinity"
> (xx is the irq number of a device) to let interrupts be distributed
> to all CPUs long time ago, but it doesn't work on my 2 desktops at hand.
> Seems it only support one-cpu irq affinity in recent kernel.
>
> You can still try that command, though it may not work.

I did not try this yet.
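
For reference, the "fff" value suggested above is just a hexadecimal
bitmask with one bit set per CPU, which for 12 CPUs is 0xfff. Here is a
minimal sketch of how such a mask could be derived. The helper name
affinity_mask and the IRQ number are hypothetical, and the sketch only
prints the command rather than writing to /proc, since the write needs
root and, as noted above, may not take effect on recent kernels.

```python
def affinity_mask(num_cpus):
    """Return a hex mask string with the low num_cpus bits set.

    Limited to 32 CPUs to keep the sketch simple; larger systems use
    a longer mask format that is not handled here.
    """
    if not 1 <= num_cpus <= 32:
        raise ValueError("this sketch only handles 1 to 32 CPUs")
    return format((1 << num_cpus) - 1, "x")


irq = 123  # hypothetical IRQ number; real ones are listed in /proc/interrupts
print(f"echo {affinity_mask(12)} > /proc/irq/{irq}/smp_affinity")
```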

[1] https://kernel.ubuntu.com/~kernel-ppa/mainline/v5.17-rc3/
