linux-hwmon.vger.kernel.org archive mirror
From: Naveen Krishna Ch <naveenkrishna.ch@gmail.com>
To: Guenter Roeck <linux@roeck-us.net>
Cc: Naveen Krishna Chatradhi <nchatrad@amd.com>, linux-hwmon@vger.kernel.org
Subject: Re: [PATCH 2/3] hwmon: (amd_energy) Add documentation
Date: Mon, 11 May 2020 22:19:18 +0530
Message-ID: <CAHfPSqDkuUuejO-b2wJnMzr1TNRzhNogiF4_ov3TKytrwS+JtA@mail.gmail.com>
In-Reply-To: <498b7568-3b59-35f7-79d4-a6e4a972aec0@roeck-us.net>

Hi Guenter

On Thu, 7 May 2020 at 00:57, Guenter Roeck <linux@roeck-us.net> wrote:
>
> Hi,
>
> On 5/6/20 10:11 AM, Naveen Krishna Ch wrote:
> > Hi Guenter
> >
> > On Wed, 6 May 2020 at 22:03, Guenter Roeck <linux@roeck-us.net> wrote:
> >>
> >> On Fri, May 01, 2020 at 11:20:02PM +0530, Naveen Krishna Chatradhi wrote:
> >>> Document amd_energy driver with all chips supported by it.
> >>>
> >>> Cc: Guenter Roeck <linux@roeck-us.net>
> >>> Signed-off-by: Naveen Krishna Chatradhi <nchatrad@amd.com>
> >>> ---
> >>> Changes in v5: None
> >>>
> >>>  Documentation/hwmon/amd_energy.rst | 100 +++++++++++++++++++++++++++++
> >>>  Documentation/hwmon/index.rst      |   1 +
> >>>  2 files changed, 101 insertions(+)
> >>>  create mode 100644 Documentation/hwmon/amd_energy.rst
> >>>
> >>> diff --git a/Documentation/hwmon/amd_energy.rst b/Documentation/hwmon/amd_energy.rst
> >>> new file mode 100644
> >>> index 000000000000..2216c8b13e58
> >>> --- /dev/null
> >>> +++ b/Documentation/hwmon/amd_energy.rst
> >>> @@ -0,0 +1,100 @@
> >>> +.. SPDX-License-Identifier: GPL-2.0
> >>> +
> >>> +Kernel driver amd_energy
> >>> +==========================
> >>> +
> >>> +Supported chips:
> >>> +
> >>> +* AMD Family 17h Processors
> >>> +
> >>> +  Prefix: 'amd_energy'
> >>> +
> >>> +  Addresses used:  RAPL MSRs
> >>> +
> >>> +  Datasheets:
> >>> +
> >>> +  - Processor Programming Reference (PPR) for AMD Family 17h Model 01h, Revision B1 Processors
> >>> +
> >>> +     https://developer.amd.com/wp-content/resources/55570-B1_PUB.zip
> >>> +
> >>> +  - Preliminary Processor Programming Reference (PPR) for AMD Family 17h Model 31h, Revision B0 Processors
> >>> +
> >>> +     https://developer.amd.com/wp-content/resources/56176_ppr_Family_17h_Model_71h_B0_pub_Rev_3.06.zip
> >>> +
> >>> +Author: Naveen Krishna Chatradhi <nchatrad@amd.com>
> >>> +
> >>> +Description
> >>> +-----------
> >>> +
> >>> +The Energy driver exposes the energy counters that are
> >>> +reported by the Running Average Power Limit (RAPL)
> >>> +Model-specific Registers (MSRs) via the hardware monitor
> >>> +(HWMON) sysfs interface.
> >>> +
> >>> +1. Power, Energy and Time Units
> >>> +   MSR_RAPL_POWER_UNIT/ C001_0299:
> >>> +   shared with all cores in the socket
> >>> +
> >>> +2. Energy consumed by each Core
> >>> +   MSR_CORE_ENERGY_STATUS/ C001_029A:
> >>> +   32-bit RO, Accumulator, core-level power reporting
> >>> +
> >>> +3. Energy consumed by Socket
> >>> +   MSR_PACKAGE_ENERGY_STATUS/ C001_029B:
> >>> +   32-bit RO, Accumulator, socket-level power reporting,
> >>> +   shared with all cores in socket
> >>> +
> >>> +These registers are updated every 1ms and cleared on
> >>> +reset of the system.
> >>> +
> >>> +Energy Calculation
> >>> +------------------
> >>> +
> >>> +Energy information (in Joules) is based on the multiplier
> >>> +1/2^ESU, where ESU is an unsigned integer read from the
> >>> +MSR_RAPL_POWER_UNIT register. The default value is 10000b,
> >>> +indicating an energy status unit of 15.3 micro-Joules.
> >>> +
> >>> +Reported values are scaled as per the formula
> >>> +
> >>> +scaled value = ((1/2^ESU) * (Raw value) * 1000000UL) in micro-Joules
> >>> +
> >>> +Users can calculate power for a given domain by computing
> >>> +     dEnergy/dTime for that domain.
> >>> +
> >>> +Socket energy accumulation
> >>> +--------------------------
> >>> +
> >>> +The socket energy status register is currently 32 bits wide.
> >>> +Assuming a 240W system, the register would wrap around in
> >>> +
> >>> +     2^32*15.3 e-6/240 = 273.80416512 secs to wrap(~4.5 mins)
> >>> +
> >>> +To improve the wrap around time, a kernel thread is implemented
> >>> +to accumulate the socket energy counter to a 64-bit counter. The
> >>> +kernel thread starts running during probe, wakes up at 100secs
> >>
> >> wakes up every 100 seconds
> >>
> >>> +and stops running in remove.
> >>
> >> stops running when the driver is removed.
> > Will correct them
> >>
> >> All counters need to be updated by the kernel thread, not just the socket
> >> counter. If the socket counter can wrap in 4.5 minutes, the matching per-core
> >> counters on a 64-core system can wrap every 4.5 * 64 = 288 minutes, which
> >> isn't much better. This might be even worse on a system with fewer cores and
> >> higher per-core power.
> >
> > Agreed, I just need a few clarifications, though.
> > 1. Is it OK to implement another thread for the cores alone, since it need not run as
> > frequently as the socket thread?
>
> Your call, but personally I think it is not worth the overhead; see below.
>
> > 2. We have a scenario on servers where a thread accumulating energy for all 128 cores
> > might compromise compute performance. So, I would like to provide a configuration
> > symbol or sysfs mechanism to enable/disable the core accumulation.
> >
>
> Another option would be to use a single thread but only update a single core
> per socket at a time. If the socket thread needs to run every N seconds,
> one would assume that the core thread only needs to run every N * (number
> of cores) seconds (assuming that it uses the same scale). If so, reading
> the data for one core (or maybe a couple of cores if the scale is different)
> plus the data for the socket should not be that expensive.
This is good and possible. Thanks
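
The single-thread scheme suggested above can be illustrated with a minimal
sketch. This is not the driver code: it is a user-space Python model, and
the names Accumulator, run_ticks, read_socket and read_core are hypothetical.
It assumes 32-bit counters that wrap at most once between two reads of the
same counter, and one socket with nr_cores cores.

```python
class Accumulator:
    """Software accumulator fed by a wrapping 32-bit hardware counter."""

    MASK32 = 0xFFFFFFFF

    def __init__(self, first_raw):
        self.prev = first_raw & self.MASK32  # last raw value seen
        self.total = 0                       # unbounded running total

    def update(self, raw):
        raw &= self.MASK32
        # Modular subtraction yields the correct delta across a single wrap.
        self.total += (raw - self.prev) & self.MASK32
        self.prev = raw
        return self.total


def run_ticks(read_socket, read_core, nr_cores, ticks):
    """On every tick, update the socket counter and exactly one core counter."""
    socket = Accumulator(read_socket())
    cores = [Accumulator(read_core(i)) for i in range(nr_cores)]
    next_core = 0
    for _ in range(ticks):
        socket.update(read_socket())
        cores[next_core].update(read_core(next_core))
        next_core = (next_core + 1) % nr_cores  # round-robin over the cores
    return socket.total, [c.total for c in cores]
```

With this layout each core is visited once every nr_cores ticks, so the cost
per tick stays at two counter reads regardless of the core count.
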
>
> If that is not acceptable, it might make more sense to blacklist the driver
> entirely in such situations; without accumulation the reported values are
> pretty much worthless.
Sure, will implement core accumulation as well.
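
For reference, the wrap-around times discussed in this thread can be
reproduced with a small calculation. This is only a sketch under stated
assumptions: wrap_seconds is a hypothetical helper, ESU is taken as 16
(the 10000b default mentioned in the patch), and the 240W socket power is
assumed to be split evenly across 64 cores. It uses the exact unit 1/2^16 J
rather than the rounded 15.3 micro-Joules, so the results differ slightly
from the figures quoted in the thread.

```python
def wrap_seconds(power_watts, esu=16, counter_bits=32):
    """Seconds until a counter of counter_bits wraps at a constant power draw."""
    joules_per_count = 1.0 / (1 << esu)          # energy status unit in Joules
    full_range_joules = (1 << counter_bits) * joules_per_count
    return full_range_joules / power_watts


socket_wrap = wrap_seconds(240.0)        # whole socket at 240W
core_wrap = wrap_seconds(240.0 / 64)     # one of 64 equally loaded cores

print(f"socket: {socket_wrap:.1f} s, core: {core_wrap / 60:.1f} min")
```

A core drawing more than its even share would wrap correspondingly sooner,
which is the concern raised in the review.
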
>
> Thanks,
> Guenter
>
> >>
> >>> +
> >>> +A socket energy read would return the current register value
> >>> +added to the respective energy accumulator.
> >>> +
> >>> +Sysfs attributes
> >>> +----------------
> >>> +
> >>> +=============== ======== ======================================
> >>> +Attribute       Label    Description
> >>> +=============== ======== ======================================
> >>> +
> >>> +* For index N between [1] and [nr_cpus]
> >>> +
> >>> +=============== ======== ======================================
> >>> +energy[N]_input EcoreX   Core Energy   X = [0] to [nr_cpus - 1]
> >>> +                         Measured input core energy
> >>> +=============== ======== ======================================
> >>> +
> >>> +* For N between [nr_cpus] and [nr_cpus + nr_socks]
> >>> +
> >>> +=============== ======== ======================================
> >>> +energy[N]_input EsocketX Socket Energy X = [0] to [nr_socks -1]
> >>> +                         Measured input socket energy
> >>> +=============== ======== ======================================
> >>> diff --git a/Documentation/hwmon/index.rst b/Documentation/hwmon/index.rst
> >>> index 8ef62fd39787..fc4b89810e67 100644
> >>> --- a/Documentation/hwmon/index.rst
> >>> +++ b/Documentation/hwmon/index.rst
> >>> @@ -39,6 +39,7 @@ Hardware Monitoring Kernel Drivers
> >>>     adt7470
> >>>     adt7475
> >>>     amc6821
> >>> +   amd_energy
> >>>     asb100
> >>>     asc7621
> >>>     aspeed-pwm-tacho
> >
> >
> >
>


-- 
Shine bright,
(: Nav :)
