From: Julia Lawall <julia.lawall@inria.fr>
To: Francisco Jerez <currojerez@riseup.net>
Cc: "Rafael J. Wysocki" <rafael@kernel.org>,
	Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com>,
	Len Brown <lenb@kernel.org>,
	Viresh Kumar <viresh.kumar@linaro.org>,
	Linux PM <linux-pm@vger.kernel.org>,
	Linux Kernel Mailing List <linux-kernel@vger.kernel.org>,
	Ingo Molnar <mingo@redhat.com>,
	Peter Zijlstra <peterz@infradead.org>,
	Juri Lelli <juri.lelli@redhat.com>,
	Vincent Guittot <vincent.guittot@linaro.org>
Subject: Re: cpufreq: intel_pstate: map utilization into the pstate range
Date: Thu, 6 Jan 2022 20:49:08 +0100 (CET)
Message-ID: <alpine.DEB.2.22.394.2201062044340.3098@hadrien>
In-Reply-To: <87a6g9rbje.fsf@riseup.net>



On Wed, 5 Jan 2022, Francisco Jerez wrote:

> Julia Lawall <julia.lawall@inria.fr> writes:
>
> > On Tue, 4 Jan 2022, Rafael J. Wysocki wrote:
> >
> >> On Tue, Jan 4, 2022 at 4:49 PM Julia Lawall <julia.lawall@inria.fr> wrote:
> >> >
> >> > I tried the whole experiment again on an Intel w2155 (one socket, 10
> >> > physical cores, pstates 12, 33, and 45).
> >> >
> >> > For the CPU there is a small jump between 32 and 33 - less than for the
> >> > 6130.
> >> >
> >> > For the RAM, there is a big jump between 21 and 22.
> >> >
> >> > Combining them leaves a big jump between 21 and 22.
> >>
> >> These jumps are most likely related to voltage increases.
> >>
> >> > It seems that the definition of efficient is that there is no more cost
> >> > for the computation than the cost of simply having the machine doing any
> >> > computation at all.  It doesn't take into account the time and energy
> >> > required to do some actual amount of work.
> >>
> >> Well, that's not what I wanted to say.
> >
> > I was referring to Francisco's comment that the lowest indicated frequency
> > should be the most efficient one.  Turbostat also reports the lowest
> > frequency as the most efficient one.  In my graph, there are the pstates 7
> > and 10, which give exactly the same energy consumption as 12.  7 and 10
> > are certainly less efficient, because the energy consumption is the same,
> > but the execution speed is lower.
> >
> >> Of course, the configuration that requires less energy to be spent to
> >> do a given amount of work is more energy-efficient.  To measure this,
> >> the system needs to be given exactly the same amount of work for each
> >> run and the energy spent by it during each run needs to be compared.
>
> I disagree that the system needs to be given the exact same amount of
> work in order to measure differences in energy efficiency.  The average
> energy efficiency of Julia's 10s workloads can be calculated easily in
> both cases (e.g. as the W/E ratio below, W will just be a different
> value for each run), and the result will likely approximate the
> instantaneous energy efficiency of the fixed P-states we're comparing,
> since her workload seems to be fairly close to a steady state.
>
> >
> > This is basically my point of view, but there is a question about it.  If
> > over 10 seconds you consume 10J and by running twice as fast you would
> > consume only 6J, then how do you account for the next 5 seconds?  If the
> > machine is then idle for the next 5 seconds, maybe you would end up
> > consuming 8J in total over the 10 seconds.  But if you take advantage of
> > the free 5 seconds to pack in another job, then you end up consuming 12J.
> >
>
> Geometrically, such an oscillatory workload with periods of idling and
> periods of activity would give an average power consumption along the
> line that passes through the points corresponding to both states on the
> CPU's power curve -- IOW your average power consumption will just be the
> weighted average of the power consumption of each state (with the duty
> cycle t_i/t_total of each state being its weight):
>
> P_avg = t_0/t_total * P_0 + t_1/t_total * P_1
>
> Your energy usage would just be 10s times that P_avg, since you're
> assuming that the total runtime of the workload is fixed at 10s
> independent of how long the CPU actually takes to complete the
> computation.  In cases where the P-state during the period of activity
> t_1 is equal to or lower than the maximum efficiency P-state, that line
> segment is guaranteed to lie below the power curve, indicating that such
> oscillation is more efficient than running the workload fixed to its
> average P-state.
>
> That said, this scenario doesn't really seem very relevant to your case,
> since the last workload you've provided turbostat traces for seems to
> show almost no oscillation.  If there was such an oscillation, your
> total energy usage would still be greater for oscillations between idle
> and some P-state different from the most efficient one.  Such an
> oscillation doesn't explain the anomaly we're seeing on your traces,
> which show more energy-efficient instantaneous behavior for a P-state 2x
> the one reported by your processor as the most energy-efficient.
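
The weighted-average formula quoted above can be checked against the
10J / 6J / 8J / 12J example from earlier in the thread.  The following is
only a sketch: the function names are made up for illustration, and the
power values are derived from the example's round numbers (1.0 W for the
slow run, 1.2 W for the fast run, and 0.4 W when idle, the last one inferred
from 8J - 6J over 5s), not measured on any machine.

```c
/* Duty-cycle-weighted average power of two states, in watts:
 *   P_avg = t_0/t_total * P_0 + t_1/t_total * P_1
 * t0/p0 describe the first state (e.g. idle), t1/p1 the second (e.g. busy). */
double avg_power(double t0, double p0, double t1, double p1)
{
    double t_total = t0 + t1;

    return t0 / t_total * p0 + t1 / t_total * p1;
}

/* Energy in joules over the whole window of t0 + t1 seconds. */
double window_energy(double t0, double p0, double t1, double p1)
{
    return (t0 + t1) * avg_power(t0, p0, t1, p1);
}
```

With these assumed values, window_energy(5, 0.4, 5, 1.2) gives about 8J
(fast run, then idle), window_energy(0, 0.4, 10, 1.2) gives about 12J (a
second job packed into the free 5 seconds), and
window_energy(0, 0.4, 10, 1.0) gives about 10J (the slow run).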

All the turbostat output and graphs I have sent recently were just for
continuous spinning:

for(;;);

Now I am trying to run for the fraction of the time corresponding to
10 / P for pstate P (i.e. 0.5 of the time for pstate 20), and then sleeping,
to see whether one can just add the sleeping power consumption of the
machine to compute the efficiency, as Rafael suggested.
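
A minimal sketch of such a spin-then-sleep loop, assuming a POSIX system.
The function names, the fixed period length passed in by the caller, and
the clamping to 1 for P <= 10 are assumptions made for illustration; this
is not the actual test program.

```c
#define _POSIX_C_SOURCE 200809L
#include <time.h>

/* Fraction of each period to spend spinning for P-state `pstate`,
 * following the 10 / P rule (clamped to 1 for P <= 10, an assumption). */
double duty_fraction(int pstate)
{
    if (pstate <= 10)
        return 1.0;
    return 10.0 / pstate;
}

/* Monotonic time in seconds. */
double now_s(void)
{
    struct timespec ts;

    clock_gettime(CLOCK_MONOTONIC, &ts);
    return ts.tv_sec + ts.tv_nsec / 1e9;
}

/* Spin for busy_frac of every period_s seconds and sleep for the rest,
 * until total_s seconds have elapsed.  Returns the number of periods run. */
long run_duty_cycle(double busy_frac, double period_s, double total_s)
{
    double start = now_s();
    long periods = 0;

    while (now_s() - start < total_s) {
        double period_start = now_s();
        double busy_end = period_start + busy_frac * period_s;

        while (now_s() < busy_end)
            ;  /* busy phase: same spinning as for(;;); */

        double idle = period_start + period_s - now_s();
        if (idle > 0) {
            struct timespec ts;

            ts.tv_sec = (time_t)idle;
            ts.tv_nsec = (long)((idle - (double)ts.tv_sec) * 1e9);
            if (ts.tv_nsec > 999999999L)
                ts.tv_nsec = 999999999L;
            nanosleep(&ts, NULL);  /* idle phase */
        }
        periods++;
    }
    return periods;
}
```

For pstate 20, run_duty_cycle(duty_fraction(20), 0.1, 10.0) would spin for
half of each 100 ms period over 10 seconds.  The period length matters in
practice, since very short sleeps may not let the CPU reach a deep idle
state.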

julia

>
> >> However, I think that you are interested in answering a different
> >> question: Given a specific amount of time (say T) to run the workload,
> >> what frequency to run the CPUs doing the work at in order to get the
> >> maximum amount of work done per unit of energy spent by the system (as
> >> a whole)?  Or, given 2 different frequency levels, which of them to
> >> run the CPUs at to get more work done per energy unit?
> >
> > This is the approach where you assume that the machine will be idle in any
> > leftover time.  And it accounts for the energy consumed in that idle time.
> >
> >> The work / energy ratio can be estimated as
> >>
> >> W / E = C * f / P(f)
> >>
> >> where C is a constant and P(f) is the power drawn by the whole system
> >> while the CPUs doing the work are running at frequency f, and of
> >> course for the system discussed previously it is greater in the 2 GHz
> >> case.
> >>
> >> However P(f) can be divided into two parts, P_1(f) that really depends
> >> on the frequency and P_0 that does not depend on it.  If P_0 is large
> >> enough to dominate P(f), which is the case in the 10-20 range of
> >> P-states on the system in question, it is better to run the CPUs doing
> >> the work faster (as long as there is always enough work to do for
> >> them; see below).  This doesn't mean that P(f) is not a convex
> >> function of f, though.
> >>
> >> Moreover, this assumes that there will always be enough work for the
> >> system to do when running the busy CPUs at 2 GHz, or that it can go
> >> completely idle when it doesn't do any work, but let's see what
> >> happens if the amount of work to do is W_1 = C * 1 GHz * T and the
> >> system cannot go completely idle when the work is done.
> >>
> >> Then, nothing changes for the busy CPUs running at 1 GHz, but in the 2
> >> GHz case we get W = W_1 and E = P(2 GHz) * T/2 + P_0 * T/2, because
> >> the busy CPUs are only busy 1/2 of the time, but power P_0 is drawn by
> >> the system regardless.  Hence, in the 2 GHz case (assuming P(2 GHz) =
> >> 120 W and P_0 = 90 W), we get
> >>
> >> W / E = 2 * C * 1 GHz / (P(2 GHz) + P_0) = 0.0095 * C * 1 GHz
> >>
> >> which is slightly less than the W / E ratio at 1 GHz approximately
> >> equal to 0.01 * C * 1 GHz (assuming P(1 GHz) = 100 W), so in these
> >> conditions it would be better to run the busy CPUs at 1 GHz.
> >
> > OK, I'll try to measure this.
> >
> > thanks,
> > julia
>
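
The W / E estimate quoted above can be reproduced with a few lines of C.
This is only a sketch: the function name and the "busy fraction" parameter
are made up for illustration, and the 100 W, 120 W and 90 W figures are the
assumed values from the quoted text, not measurements.

```c
/* Work per unit of energy, in units of C * 1 GHz per watt, when the CPUs run
 * at f_ghz for busy_frac of the interval T and the system draws p_idle (P_0)
 * for the rest of it. */
double work_per_energy(double f_ghz, double busy_frac,
                       double p_busy, double p_idle)
{
    /* W / (C * T): work is proportional to frequency times busy time. */
    double work = f_ghz * busy_frac;
    /* E / T: busy power while working, P_0 while waiting. */
    double energy = p_busy * busy_frac + p_idle * (1.0 - busy_frac);

    return work / energy;
}
```

With those figures, work_per_energy(1.0, 1.0, 100.0, 90.0) is 0.01 and
work_per_energy(2.0, 0.5, 120.0, 90.0) is about 0.0095, matching the two
ratios in the quoted text.  If there is always enough work,
work_per_energy(2.0, 1.0, 120.0, 90.0) is about 0.0167, so 2 GHz wins.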
