* [PATCH v3 0/6] Track and expose idle PURR and SPURR ticks
@ 2020-03-11  9:37 Gautham R. Shenoy
  2020-03-11  9:37 ` [PATCH v3 1/6] powerpc: Move idle_loop_prolog()/epilog() functions to header file Gautham R. Shenoy
                   ` (5 more replies)
  0 siblings, 6 replies; 7+ messages in thread
From: Gautham R. Shenoy @ 2020-03-11  9:37 UTC (permalink / raw)
  To: Nathan Lynch, Michael Ellerman, Vaidyanathan Srinivasan,
	Kamalesh Babulal, Naveen N. Rao, Tyrel Datwyler
  Cc: linuxppc-dev, linux-kernel, Gautham R. Shenoy

From: "Gautham R. Shenoy" <ego@linux.vnet.ibm.com>

Hi,

This is the third version of the patches to track and expose idle PURR
and SPURR ticks. These patches are required by tools such as lparstat
to compute system utilization for capacity planning purposes.

The previous versions can be found here:
v2: https://lkml.org/lkml/2020/2/21/21
v1: https://lore.kernel.org/patchwork/cover/1159341/

The key changes from v2 are:

    - The prolog and epilog functions have been named
      pseries_idle_prolog() and pseries_idle_epilog() respectively to
      indicate their pseries specific nature.

    - Fixed the Documentation for
      /sys/devices/system/cpu/cpuX/idle_spurr as pointed out by
      Nathan Lynch.

    - Introduced a patch (Patch 6/6) to send an IPI in order to read
      and cache the values of purr, spurr, idle_purr and idle_spurr of
      the target CPU when any one of them is read via sysfs. These
      cached values will be presented if any of these sysfs files are
      read within the next 10ms. If these sysfs files are read more
      than 10ms after the earlier IPI, a fresh IPI is issued to read
      and cache the values again. This minimizes the number of IPIs
      required to be sent when these values are read back-to-back via
      the sysfs interface.

Test-results: While reading the four sysfs files back-to-back for a
given CPU every second for 100 seconds.
    
    Without patch 6/6 (Without caching): 
                 16 [XICS 2 Edge IPI] = 422 times
                 DBL [Doorbell interrupts] = 13 times
                 Total : 435 IPIs.
    
    With patch 6/6 (With caching):
                  16 [XICS 2 Edge IPI] = 111 times
                  DBL [Doorbell interrupts] = 17 times
                  Total : 128 IPIs.

Motivation:
===========
On pseries LPARs, data center planners want a more accurate view of
system utilization per resource, such as CPU, to plan system capacity
requirements better. Such accuracy can be obtained by reading the
PURR/SPURR registers for CPU resource utilization.

Tools such as lparstat, which are used to compute utilization, need
to know the [S]PURR ticks when the CPU was busy or idle. The [S]PURR
counters are already exposed through sysfs. We already account for
PURR ticks when we go idle so that we can update the VPA area. This
patchset extends support to account for SPURR ticks when idle, and
exposes both via per-cpu sysfs files.
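
As a rough illustration of how a tool might consume these counters, the
sketch below computes a busy percentage from two samples of the total
and idle PURR ticks. The struct and function names are hypothetical,
and the formula is only one plausible way to combine the counters; it
is not claimed to be what lparstat actually does.

```c
#include <stdint.h>

/* Hypothetical sample of one CPU's counters at a point in time. */
struct purr_sample {
	uint64_t purr;       /* total PURR ticks */
	uint64_t idle_purr;  /* PURR ticks accumulated while idle */
};

/*
 * Percentage of PURR ticks between two samples that were not idle.
 * Returns 0.0 if the PURR did not advance between the samples.
 */
static double purr_busy_percent(const struct purr_sample *prev,
				const struct purr_sample *cur)
{
	uint64_t total = cur->purr - prev->purr;
	uint64_t idle = cur->idle_purr - prev->idle_purr;

	if (total == 0)
		return 0.0;

	/* The two counters are not sampled atomically, so clamp. */
	if (idle > total)
		idle = total;

	return 100.0 * (double)(total - idle) / (double)total;
}
```

The same calculation could be applied to the spurr/idle_spurr pair to
obtain a frequency-scaled figure.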

These patches are required for an enhancement to the lparstat utility
that computes CPU utilization based on PURR and SPURR, which can be
found here:
https://groups.google.com/forum/#!topic/powerpc-utils-devel/fYRo69xO9r4


With the patches, when lparstat is run on an LPAR running CPU hogs,
=========================================================================
$ sudo ./src/lparstat -E 1 3
System Configuration
type=Dedicated mode=Capped smt=8 lcpu=2 mem=4834176 kB cpus=0 ent=2.00 
---Actual---                 -Normalized-
%busy  %idle   Frequency     %busy  %idle
------ ------  ------------- ------ ------
1  99.99   0.00  3.35GHz[111%] 110.99   0.00
2 100.00   0.00  3.35GHz[111%] 111.00   0.00
3 100.00   0.00  3.35GHz[111%] 111.00   0.00
=========================================================================

When lparstat is run on an LPAR that is idle,
=========================================================================
$ sudo ./src/lparstat -E 1 3
System Configuration
type=Dedicated mode=Capped smt=8 lcpu=2 mem=4834176 kB cpus=0 ent=2.00 
---Actual---                 -Normalized-
%busy  %idle   Frequency     %busy  %idle
------ ------  ------------- ------ ------
1   0.71  99.30  2.18GHz[ 72%]   0.53  71.48
2   0.56  99.44  2.11GHz[ 70%]   0.43  69.57
3   0.54  99.46  2.11GHz[ 70%]   0.43  69.57
=========================================================================
     

Gautham R. Shenoy (6):
  powerpc: Move idle_loop_prolog()/epilog() functions to header file
  powerpc/idle: Add accessor function to always read latest idle PURR
  powerpc/pseries: Account for SPURR ticks on idle CPUs
  powerpc/sysfs: Show idle_purr and idle_spurr for every CPU
  Documentation: Document sysfs interfaces purr, spurr, idle_purr, idle_spurr
  pseries/sysfs: Minimise IPI noise while reading [idle_][s]purr

 Documentation/ABI/testing/sysfs-devices-system-cpu |  39 ++++++
 arch/powerpc/include/asm/idle.h                    |  89 ++++++++++++++
 arch/powerpc/kernel/sysfs.c                        | 133 +++++++++++++++++++--
 arch/powerpc/platforms/pseries/setup.c             |   8 +-
 drivers/cpuidle/cpuidle-pseries.c                  |  39 ++----
 5 files changed, 267 insertions(+), 41 deletions(-)
 create mode 100644 arch/powerpc/include/asm/idle.h

-- 
1.9.4



* [PATCH v3 1/6] powerpc: Move idle_loop_prolog()/epilog() functions to header file
  2020-03-11  9:37 [PATCH v3 0/6] Track and expose idle PURR and SPURR ticks Gautham R. Shenoy
@ 2020-03-11  9:37 ` Gautham R. Shenoy
  2020-03-11  9:37 ` [PATCH v3 2/6] powerpc/idle: Add accessor function to always read latest idle PURR Gautham R. Shenoy
                   ` (4 subsequent siblings)
  5 siblings, 0 replies; 7+ messages in thread
From: Gautham R. Shenoy @ 2020-03-11  9:37 UTC (permalink / raw)
  To: Nathan Lynch, Michael Ellerman, Vaidyanathan Srinivasan,
	Kamalesh Babulal, Naveen N. Rao, Tyrel Datwyler
  Cc: linuxppc-dev, linux-kernel, Gautham R. Shenoy

From: "Gautham R. Shenoy" <ego@linux.vnet.ibm.com>

Currently, prior to entering an idle state on a Linux guest, the
pseries cpuidle driver implements the idle_loop_prolog() and
idle_loop_epilog() functions, which ensure that idle_purr is correctly
computed and that the hypervisor is informed that the CPU cycles have
been donated.

These prolog and epilog functions are also required in the default
idle call, i.e. pseries_lpar_idle(). Hence move these accessor
functions to a common header file and call them from
pseries_lpar_idle(). Since the existing header files such as
asm/processor.h have enough clutter, create a new header file
asm/idle.h. Finally, rename idle_loop_prolog() and idle_loop_epilog()
to pseries_idle_prolog() and pseries_idle_epilog() as they are only
relevant on pseries guests.

Signed-off-by: Gautham R. Shenoy <ego@linux.vnet.ibm.com>
---
 arch/powerpc/include/asm/idle.h        | 28 ++++++++++++++++++++++++++
 arch/powerpc/platforms/pseries/setup.c |  7 +++++--
 drivers/cpuidle/cpuidle-pseries.c      | 36 +++++++---------------------------
 3 files changed, 40 insertions(+), 31 deletions(-)
 create mode 100644 arch/powerpc/include/asm/idle.h

diff --git a/arch/powerpc/include/asm/idle.h b/arch/powerpc/include/asm/idle.h
new file mode 100644
index 0000000..e838ea5
--- /dev/null
+++ b/arch/powerpc/include/asm/idle.h
@@ -0,0 +1,28 @@
+/* SPDX-License-Identifier: GPL-2.0-or-later */
+#ifndef _ASM_POWERPC_IDLE_H
+#define _ASM_POWERPC_IDLE_H
+#include <asm/runlatch.h>
+
+static inline void pseries_idle_prolog(unsigned long *in_purr)
+{
+	ppc64_runlatch_off();
+	*in_purr = mfspr(SPRN_PURR);
+	/*
+	 * Indicate to the HV that we are idle. Now would be
+	 * a good time to find other work to dispatch.
+	 */
+	get_lppaca()->idle = 1;
+}
+
+static inline void pseries_idle_epilog(unsigned long in_purr)
+{
+	u64 wait_cycles;
+
+	wait_cycles = be64_to_cpu(get_lppaca()->wait_state_cycles);
+	wait_cycles += mfspr(SPRN_PURR) - in_purr;
+	get_lppaca()->wait_state_cycles = cpu_to_be64(wait_cycles);
+	get_lppaca()->idle = 0;
+
+	ppc64_runlatch_on();
+}
+#endif
diff --git a/arch/powerpc/platforms/pseries/setup.c b/arch/powerpc/platforms/pseries/setup.c
index 0c8421d..2f53e6b 100644
--- a/arch/powerpc/platforms/pseries/setup.c
+++ b/arch/powerpc/platforms/pseries/setup.c
@@ -68,6 +68,7 @@
 #include <asm/isa-bridge.h>
 #include <asm/security_features.h>
 #include <asm/asm-const.h>
+#include <asm/idle.h>
 #include <asm/swiotlb.h>
 #include <asm/svm.h>
 
@@ -319,6 +320,8 @@ static int alloc_dispatch_log_kmem_cache(void)
 
 static void pseries_lpar_idle(void)
 {
+	unsigned long in_purr;
+
 	/*
 	 * Default handler to go into low thread priority and possibly
 	 * low power mode by ceding processor to hypervisor
@@ -328,7 +331,7 @@ static void pseries_lpar_idle(void)
 		return;
 
 	/* Indicate to hypervisor that we are idle. */
-	get_lppaca()->idle = 1;
+	pseries_idle_prolog(&in_purr);
 
 	/*
 	 * Yield the processor to the hypervisor.  We return if
@@ -339,7 +342,7 @@ static void pseries_lpar_idle(void)
 	 */
 	cede_processor();
 
-	get_lppaca()->idle = 0;
+	pseries_idle_epilog(in_purr);
 }
 
 /*
diff --git a/drivers/cpuidle/cpuidle-pseries.c b/drivers/cpuidle/cpuidle-pseries.c
index 74c2479..46d5e05 100644
--- a/drivers/cpuidle/cpuidle-pseries.c
+++ b/drivers/cpuidle/cpuidle-pseries.c
@@ -19,6 +19,7 @@
 #include <asm/machdep.h>
 #include <asm/firmware.h>
 #include <asm/runlatch.h>
+#include <asm/idle.h>
 #include <asm/plpar_wrappers.h>
 
 struct cpuidle_driver pseries_idle_driver = {
@@ -31,29 +32,6 @@ struct cpuidle_driver pseries_idle_driver = {
 static u64 snooze_timeout __read_mostly;
 static bool snooze_timeout_en __read_mostly;
 
-static inline void idle_loop_prolog(unsigned long *in_purr)
-{
-	ppc64_runlatch_off();
-	*in_purr = mfspr(SPRN_PURR);
-	/*
-	 * Indicate to the HV that we are idle. Now would be
-	 * a good time to find other work to dispatch.
-	 */
-	get_lppaca()->idle = 1;
-}
-
-static inline void idle_loop_epilog(unsigned long in_purr)
-{
-	u64 wait_cycles;
-
-	wait_cycles = be64_to_cpu(get_lppaca()->wait_state_cycles);
-	wait_cycles += mfspr(SPRN_PURR) - in_purr;
-	get_lppaca()->wait_state_cycles = cpu_to_be64(wait_cycles);
-	get_lppaca()->idle = 0;
-
-	ppc64_runlatch_on();
-}
-
 static int snooze_loop(struct cpuidle_device *dev,
 			struct cpuidle_driver *drv,
 			int index)
@@ -63,7 +41,7 @@ static int snooze_loop(struct cpuidle_device *dev,
 
 	set_thread_flag(TIF_POLLING_NRFLAG);
 
-	idle_loop_prolog(&in_purr);
+	pseries_idle_prolog(&in_purr);
 	local_irq_enable();
 	snooze_exit_time = get_tb() + snooze_timeout;
 
@@ -87,7 +65,7 @@ static int snooze_loop(struct cpuidle_device *dev,
 
 	local_irq_disable();
 
-	idle_loop_epilog(in_purr);
+	pseries_idle_epilog(in_purr);
 
 	return index;
 }
@@ -115,7 +93,7 @@ static int dedicated_cede_loop(struct cpuidle_device *dev,
 {
 	unsigned long in_purr;
 
-	idle_loop_prolog(&in_purr);
+	pseries_idle_prolog(&in_purr);
 	get_lppaca()->donate_dedicated_cpu = 1;
 
 	HMT_medium();
@@ -124,7 +102,7 @@ static int dedicated_cede_loop(struct cpuidle_device *dev,
 	local_irq_disable();
 	get_lppaca()->donate_dedicated_cpu = 0;
 
-	idle_loop_epilog(in_purr);
+	pseries_idle_epilog(in_purr);
 
 	return index;
 }
@@ -135,7 +113,7 @@ static int shared_cede_loop(struct cpuidle_device *dev,
 {
 	unsigned long in_purr;
 
-	idle_loop_prolog(&in_purr);
+	pseries_idle_prolog(&in_purr);
 
 	/*
 	 * Yield the processor to the hypervisor.  We return if
@@ -147,7 +125,7 @@ static int shared_cede_loop(struct cpuidle_device *dev,
 	check_and_cede_processor();
 
 	local_irq_disable();
-	idle_loop_epilog(in_purr);
+	pseries_idle_epilog(in_purr);
 
 	return index;
 }
-- 
1.9.4



* [PATCH v3 2/6] powerpc/idle: Add accessor function to always read latest idle PURR
  2020-03-11  9:37 [PATCH v3 0/6] Track and expose idle PURR and SPURR ticks Gautham R. Shenoy
  2020-03-11  9:37 ` [PATCH v3 1/6] powerpc: Move idle_loop_prolog()/epilog() functions to header file Gautham R. Shenoy
@ 2020-03-11  9:37 ` Gautham R. Shenoy
  2020-03-11  9:37 ` [PATCH v3 3/6] powerpc/pseries: Account for SPURR ticks on idle CPUs Gautham R. Shenoy
                   ` (3 subsequent siblings)
  5 siblings, 0 replies; 7+ messages in thread
From: Gautham R. Shenoy @ 2020-03-11  9:37 UTC (permalink / raw)
  To: Nathan Lynch, Michael Ellerman, Vaidyanathan Srinivasan,
	Kamalesh Babulal, Naveen N. Rao, Tyrel Datwyler
  Cc: linuxppc-dev, linux-kernel, Gautham R. Shenoy

From: "Gautham R. Shenoy" <ego@linux.vnet.ibm.com>

Currently, when a CPU goes idle, we take a snapshot of PURR via
pseries_idle_prolog(), which is used at CPU idle exit to compute
the idle PURR cycles via the function pseries_idle_epilog(). Thus,
the value of the idle PURR cycles read before pseries_idle_prolog() or
after pseries_idle_epilog() is always correct.

However, if we were to read the idle PURR cycles from an interrupt
context between pseries_idle_prolog() and pseries_idle_epilog() (this will
be done in a future patch), then the value of the idle PURR thus read
will not include the cycles spent in the most recent idle period.

This patch addresses the issue by providing an accessor function to
read the idle PURR such that it includes the cycles spent in the most
recent idle period, if we read it between pseries_idle_prolog() and
pseries_idle_epilog(). In order to achieve this, the patch saves the
snapshot of PURR in pseries_idle_prolog() in a per-cpu variable,
instead of on the stack, so that it can be accessed from an interrupt
context.
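
The accounting described above can be modelled in plain userspace C.
This is only a sketch of the idea: the variables below are hypothetical
stand-ins for the PURR register, the per-cpu snapshot, the VPA
wait_state_cycles field and the lppaca idle flag, and the model_*
functions are not kernel code.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-ins for hardware and per-cpu state. */
static uint64_t fake_purr;    /* models mfspr(SPRN_PURR) */
static uint64_t entry_snap;   /* models idle_entry_purr_snap */
static uint64_t idle_cycles;  /* models wait_state_cycles */
static bool in_idle;          /* models get_lppaca()->idle */

/* Fold the cycles elapsed since the last snapshot into the idle total. */
static void model_account(void)
{
	idle_cycles += fake_purr - entry_snap;
}

static void model_idle_prolog(void)
{
	entry_snap = fake_purr;
	in_idle = true;
}

static void model_idle_epilog(void)
{
	model_account();
	in_idle = false;
}

/*
 * Read the idle total. If called in the middle of an idle period,
 * account for the cycles so far and take a fresh snapshot so that
 * the epilog does not count them a second time.
 */
static uint64_t model_read_idle_purr(void)
{
	if (in_idle) {
		model_account();
		entry_snap = fake_purr;
	}
	return idle_cycles;
}
```

Taking a fresh snapshot after the mid-idle update is what keeps the
later epilog from double counting the same cycles.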

Signed-off-by: Gautham R. Shenoy <ego@linux.vnet.ibm.com>
---
 arch/powerpc/include/asm/idle.h        | 46 +++++++++++++++++++++++++++-------
 arch/powerpc/platforms/pseries/setup.c |  7 +++---
 drivers/cpuidle/cpuidle-pseries.c      | 15 +++++------
 3 files changed, 46 insertions(+), 22 deletions(-)

diff --git a/arch/powerpc/include/asm/idle.h b/arch/powerpc/include/asm/idle.h
index e838ea5..7552823 100644
--- a/arch/powerpc/include/asm/idle.h
+++ b/arch/powerpc/include/asm/idle.h
@@ -3,10 +3,27 @@
 #define _ASM_POWERPC_IDLE_H
 #include <asm/runlatch.h>
 
-static inline void pseries_idle_prolog(unsigned long *in_purr)
+DECLARE_PER_CPU(u64, idle_entry_purr_snap);
+
+static inline void snapshot_purr_idle_entry(void)
+{
+	*this_cpu_ptr(&idle_entry_purr_snap) = mfspr(SPRN_PURR);
+}
+
+static inline void update_idle_purr_accounting(void)
+{
+	u64 wait_cycles;
+	u64 in_purr = *this_cpu_ptr(&idle_entry_purr_snap);
+
+	wait_cycles = be64_to_cpu(get_lppaca()->wait_state_cycles);
+	wait_cycles += mfspr(SPRN_PURR) - in_purr;
+	get_lppaca()->wait_state_cycles = cpu_to_be64(wait_cycles);
+}
+
+static inline void pseries_idle_prolog(void)
 {
 	ppc64_runlatch_off();
-	*in_purr = mfspr(SPRN_PURR);
+	snapshot_purr_idle_entry();
 	/*
 	 * Indicate to the HV that we are idle. Now would be
 	 * a good time to find other work to dispatch.
@@ -14,15 +31,26 @@ static inline void pseries_idle_prolog(unsigned long *in_purr)
 	get_lppaca()->idle = 1;
 }
 
-static inline void pseries_idle_epilog(unsigned long in_purr)
+static inline void pseries_idle_epilog(void)
 {
-	u64 wait_cycles;
-
-	wait_cycles = be64_to_cpu(get_lppaca()->wait_state_cycles);
-	wait_cycles += mfspr(SPRN_PURR) - in_purr;
-	get_lppaca()->wait_state_cycles = cpu_to_be64(wait_cycles);
+	update_idle_purr_accounting();
 	get_lppaca()->idle = 0;
-
 	ppc64_runlatch_on();
 }
+
+static inline u64 read_this_idle_purr(void)
+{
+	/*
+	 * If we are reading from an idle context, update the
+	 * idle-purr cycles corresponding to the last idle period.
+	 * Since the idle context is not yet over, take a fresh
+	 * snapshot of the idle-purr.
+	 */
+	if (unlikely(get_lppaca()->idle == 1)) {
+		update_idle_purr_accounting();
+		snapshot_purr_idle_entry();
+	}
+
+	return be64_to_cpu(get_lppaca()->wait_state_cycles);
+}
 #endif
diff --git a/arch/powerpc/platforms/pseries/setup.c b/arch/powerpc/platforms/pseries/setup.c
index 2f53e6b..4905c96 100644
--- a/arch/powerpc/platforms/pseries/setup.c
+++ b/arch/powerpc/platforms/pseries/setup.c
@@ -318,10 +318,9 @@ static int alloc_dispatch_log_kmem_cache(void)
 }
 machine_early_initcall(pseries, alloc_dispatch_log_kmem_cache);
 
+DEFINE_PER_CPU(u64, idle_entry_purr_snap);
 static void pseries_lpar_idle(void)
 {
-	unsigned long in_purr;
-
 	/*
 	 * Default handler to go into low thread priority and possibly
 	 * low power mode by ceding processor to hypervisor
@@ -331,7 +330,7 @@ static void pseries_lpar_idle(void)
 		return;
 
 	/* Indicate to hypervisor that we are idle. */
-	pseries_idle_prolog(&in_purr);
+	pseries_idle_prolog();
 
 	/*
 	 * Yield the processor to the hypervisor.  We return if
@@ -342,7 +341,7 @@ static void pseries_lpar_idle(void)
 	 */
 	cede_processor();
 
-	pseries_idle_epilog(in_purr);
+	pseries_idle_epilog();
 }
 
 /*
diff --git a/drivers/cpuidle/cpuidle-pseries.c b/drivers/cpuidle/cpuidle-pseries.c
index 46d5e05..6513ef2 100644
--- a/drivers/cpuidle/cpuidle-pseries.c
+++ b/drivers/cpuidle/cpuidle-pseries.c
@@ -36,12 +36,11 @@ static int snooze_loop(struct cpuidle_device *dev,
 			struct cpuidle_driver *drv,
 			int index)
 {
-	unsigned long in_purr;
 	u64 snooze_exit_time;
 
 	set_thread_flag(TIF_POLLING_NRFLAG);
 
-	pseries_idle_prolog(&in_purr);
+	pseries_idle_prolog();
 	local_irq_enable();
 	snooze_exit_time = get_tb() + snooze_timeout;
 
@@ -65,7 +64,7 @@ static int snooze_loop(struct cpuidle_device *dev,
 
 	local_irq_disable();
 
-	pseries_idle_epilog(in_purr);
+	pseries_idle_epilog();
 
 	return index;
 }
@@ -91,9 +90,8 @@ static int dedicated_cede_loop(struct cpuidle_device *dev,
 				struct cpuidle_driver *drv,
 				int index)
 {
-	unsigned long in_purr;
 
-	pseries_idle_prolog(&in_purr);
+	pseries_idle_prolog();
 	get_lppaca()->donate_dedicated_cpu = 1;
 
 	HMT_medium();
@@ -102,7 +100,7 @@ static int dedicated_cede_loop(struct cpuidle_device *dev,
 	local_irq_disable();
 	get_lppaca()->donate_dedicated_cpu = 0;
 
-	pseries_idle_epilog(in_purr);
+	pseries_idle_epilog();
 
 	return index;
 }
@@ -111,9 +109,8 @@ static int shared_cede_loop(struct cpuidle_device *dev,
 			struct cpuidle_driver *drv,
 			int index)
 {
-	unsigned long in_purr;
 
-	pseries_idle_prolog(&in_purr);
+	pseries_idle_prolog();
 
 	/*
 	 * Yield the processor to the hypervisor.  We return if
@@ -125,7 +122,7 @@ static int shared_cede_loop(struct cpuidle_device *dev,
 	check_and_cede_processor();
 
 	local_irq_disable();
-	pseries_idle_epilog(in_purr);
+	pseries_idle_epilog();
 
 	return index;
 }
-- 
1.9.4



* [PATCH v3 3/6] powerpc/pseries: Account for SPURR ticks on idle CPUs
  2020-03-11  9:37 [PATCH v3 0/6] Track and expose idle PURR and SPURR ticks Gautham R. Shenoy
  2020-03-11  9:37 ` [PATCH v3 1/6] powerpc: Move idle_loop_prolog()/epilog() functions to header file Gautham R. Shenoy
  2020-03-11  9:37 ` [PATCH v3 2/6] powerpc/idle: Add accessor function to always read latest idle PURR Gautham R. Shenoy
@ 2020-03-11  9:37 ` Gautham R. Shenoy
  2020-03-11  9:37 ` [PATCH v3 4/6] powerpc/sysfs: Show idle_purr and idle_spurr for every CPU Gautham R. Shenoy
                   ` (2 subsequent siblings)
  5 siblings, 0 replies; 7+ messages in thread
From: Gautham R. Shenoy @ 2020-03-11  9:37 UTC (permalink / raw)
  To: Nathan Lynch, Michael Ellerman, Vaidyanathan Srinivasan,
	Kamalesh Babulal, Naveen N. Rao, Tyrel Datwyler
  Cc: linuxppc-dev, linux-kernel, Gautham R. Shenoy

From: "Gautham R. Shenoy" <ego@linux.vnet.ibm.com>

On pseries LPARs, to calculate utilization, we need to know the
[S]PURR ticks when the CPUs were busy or idle.

Via pseries_idle_prolog() and pseries_idle_epilog(), we track the idle
PURR ticks in the VPA variable "wait_state_cycles". This patch extends
the support to account for the idle SPURR ticks. It also provides an
accessor function to accurately read the idle SPURR ticks.

Signed-off-by: Gautham R. Shenoy <ego@linux.vnet.ibm.com>
---
 arch/powerpc/include/asm/idle.h        | 33 +++++++++++++++++++++++++++++++++
 arch/powerpc/platforms/pseries/setup.c |  2 ++
 2 files changed, 35 insertions(+)

diff --git a/arch/powerpc/include/asm/idle.h b/arch/powerpc/include/asm/idle.h
index 7552823..a375589 100644
--- a/arch/powerpc/include/asm/idle.h
+++ b/arch/powerpc/include/asm/idle.h
@@ -3,13 +3,20 @@
 #define _ASM_POWERPC_IDLE_H
 #include <asm/runlatch.h>
 
+DECLARE_PER_CPU(u64, idle_spurr_cycles);
 DECLARE_PER_CPU(u64, idle_entry_purr_snap);
+DECLARE_PER_CPU(u64, idle_entry_spurr_snap);
 
 static inline void snapshot_purr_idle_entry(void)
 {
 	*this_cpu_ptr(&idle_entry_purr_snap) = mfspr(SPRN_PURR);
 }
 
+static inline void snapshot_spurr_idle_entry(void)
+{
+	*this_cpu_ptr(&idle_entry_spurr_snap) = mfspr(SPRN_SPURR);
+}
+
 static inline void update_idle_purr_accounting(void)
 {
 	u64 wait_cycles;
@@ -20,10 +27,19 @@ static inline void update_idle_purr_accounting(void)
 	get_lppaca()->wait_state_cycles = cpu_to_be64(wait_cycles);
 }
 
+static inline void update_idle_spurr_accounting(void)
+{
+	u64 *idle_spurr_cycles_ptr = this_cpu_ptr(&idle_spurr_cycles);
+	u64 in_spurr = *this_cpu_ptr(&idle_entry_spurr_snap);
+
+	*idle_spurr_cycles_ptr += mfspr(SPRN_SPURR) - in_spurr;
+}
+
 static inline void pseries_idle_prolog(void)
 {
 	ppc64_runlatch_off();
 	snapshot_purr_idle_entry();
+	snapshot_spurr_idle_entry();
 	/*
 	 * Indicate to the HV that we are idle. Now would be
 	 * a good time to find other work to dispatch.
@@ -34,6 +50,7 @@ static inline void pseries_idle_prolog(void)
 static inline void pseries_idle_epilog(void)
 {
 	update_idle_purr_accounting();
+	update_idle_spurr_accounting();
 	get_lppaca()->idle = 0;
 	ppc64_runlatch_on();
 }
@@ -53,4 +70,20 @@ static inline u64 read_this_idle_purr(void)
 
 	return be64_to_cpu(get_lppaca()->wait_state_cycles);
 }
+
+static inline u64 read_this_idle_spurr(void)
+{
+	/*
+	 * If we are reading from an idle context, update the
+	 * idle-spurr cycles corresponding to the last idle period.
+	 * Since the idle context is not yet over, take a fresh
+	 * snapshot of the idle-spurr.
+	 */
+	if (get_lppaca()->idle == 1) {
+		update_idle_spurr_accounting();
+		snapshot_spurr_idle_entry();
+	}
+
+	return *this_cpu_ptr(&idle_spurr_cycles);
+}
 #endif
diff --git a/arch/powerpc/platforms/pseries/setup.c b/arch/powerpc/platforms/pseries/setup.c
index 4905c96..1b55e80 100644
--- a/arch/powerpc/platforms/pseries/setup.c
+++ b/arch/powerpc/platforms/pseries/setup.c
@@ -318,7 +318,9 @@ static int alloc_dispatch_log_kmem_cache(void)
 }
 machine_early_initcall(pseries, alloc_dispatch_log_kmem_cache);
 
+DEFINE_PER_CPU(u64, idle_spurr_cycles);
 DEFINE_PER_CPU(u64, idle_entry_purr_snap);
+DEFINE_PER_CPU(u64, idle_entry_spurr_snap);
 static void pseries_lpar_idle(void)
 {
 	/*
-- 
1.9.4



* [PATCH v3 4/6] powerpc/sysfs: Show idle_purr and idle_spurr for every CPU
  2020-03-11  9:37 [PATCH v3 0/6] Track and expose idle PURR and SPURR ticks Gautham R. Shenoy
                   ` (2 preceding siblings ...)
  2020-03-11  9:37 ` [PATCH v3 3/6] powerpc/pseries: Account for SPURR ticks on idle CPUs Gautham R. Shenoy
@ 2020-03-11  9:37 ` Gautham R. Shenoy
  2020-03-11  9:37 ` [PATCH v3 5/6] Documentation: Document sysfs interfaces purr, spurr, idle_purr, idle_spurr Gautham R. Shenoy
  2020-03-11  9:37 ` [PATCH v3 6/6] pseries/sysfs: Minimise IPI noise while reading [idle_][s]purr Gautham R. Shenoy
  5 siblings, 0 replies; 7+ messages in thread
From: Gautham R. Shenoy @ 2020-03-11  9:37 UTC (permalink / raw)
  To: Nathan Lynch, Michael Ellerman, Vaidyanathan Srinivasan,
	Kamalesh Babulal, Naveen N. Rao, Tyrel Datwyler
  Cc: linuxppc-dev, linux-kernel, Gautham R. Shenoy

From: "Gautham R. Shenoy" <ego@linux.vnet.ibm.com>

On Pseries LPARs, to calculate utilization, we need to know the
[S]PURR ticks when the CPUs were busy or idle.

The total PURR and SPURR ticks are already exposed via the per-cpu
sysfs files "purr" and "spurr". This patch adds support for exposing
the idle PURR and SPURR ticks via new per-cpu sysfs files named
"idle_purr" and "idle_spurr".
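
For userspace consumers, note that the show functions in this patch
print the value with "%llx\n", i.e. as hexadecimal text without a "0x"
prefix. A minimal sketch of parsing such a string follows; the helper
name parse_counter() is hypothetical and is not part of this series.

```c
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>

/*
 * Parse the text read from one of the counter files, for example
 * "1a2b\n", as a hexadecimal 64-bit value.
 * Returns 0 on success and -1 if the text is not a valid number.
 */
static int parse_counter(const char *text, uint64_t *out)
{
	char *end;
	unsigned long long v;

	errno = 0;
	v = strtoull(text, &end, 16);
	if (errno != 0 || end == text)
		return -1;

	/* Only a trailing newline or end of string may follow. */
	if (*end != '\n' && *end != '\0')
		return -1;

	*out = (uint64_t)v;
	return 0;
}
```

Since the files are created with mode 0400, a tool reading them would
need to run with sufficient privileges.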

Signed-off-by: Gautham R. Shenoy <ego@linux.vnet.ibm.com>
---
 arch/powerpc/kernel/sysfs.c | 54 ++++++++++++++++++++++++++++++++++++++++++---
 1 file changed, 51 insertions(+), 3 deletions(-)

diff --git a/arch/powerpc/kernel/sysfs.c b/arch/powerpc/kernel/sysfs.c
index 479c706..c9ddb83 100644
--- a/arch/powerpc/kernel/sysfs.c
+++ b/arch/powerpc/kernel/sysfs.c
@@ -19,6 +19,7 @@
 #include <asm/smp.h>
 #include <asm/pmc.h>
 #include <asm/firmware.h>
+#include <asm/idle.h>
 #include <asm/svm.h>
 
 #include "cacheinfo.h"
@@ -760,6 +761,42 @@ static void create_svm_file(void)
 }
 #endif /* CONFIG_PPC_SVM */
 
+static void read_idle_purr(void *val)
+{
+	u64 *ret = val;
+
+	*ret = read_this_idle_purr();
+}
+
+static ssize_t idle_purr_show(struct device *dev,
+			      struct device_attribute *attr, char *buf)
+{
+	struct cpu *cpu = container_of(dev, struct cpu, dev);
+	u64 val;
+
+	smp_call_function_single(cpu->dev.id, read_idle_purr, &val, 1);
+	return sprintf(buf, "%llx\n", val);
+}
+static DEVICE_ATTR(idle_purr, 0400, idle_purr_show, NULL);
+
+static void read_idle_spurr(void *val)
+{
+	u64 *ret = val;
+
+	*ret = read_this_idle_spurr();
+}
+
+static ssize_t idle_spurr_show(struct device *dev,
+			       struct device_attribute *attr, char *buf)
+{
+	struct cpu *cpu = container_of(dev, struct cpu, dev);
+	u64 val;
+
+	smp_call_function_single(cpu->dev.id, read_idle_spurr, &val, 1);
+	return sprintf(buf, "%llx\n", val);
+}
+static DEVICE_ATTR(idle_spurr, 0400, idle_spurr_show, NULL);
+
 static int register_cpu_online(unsigned int cpu)
 {
 	struct cpu *c = &per_cpu(cpu_devices, cpu);
@@ -823,10 +860,15 @@ static int register_cpu_online(unsigned int cpu)
 		if (!firmware_has_feature(FW_FEATURE_LPAR))
 			add_write_permission_dev_attr(&dev_attr_purr);
 		device_create_file(s, &dev_attr_purr);
+		if (firmware_has_feature(FW_FEATURE_LPAR))
+			device_create_file(s, &dev_attr_idle_purr);
 	}
 
-	if (cpu_has_feature(CPU_FTR_SPURR))
+	if (cpu_has_feature(CPU_FTR_SPURR)) {
 		device_create_file(s, &dev_attr_spurr);
+		if (firmware_has_feature(FW_FEATURE_LPAR))
+			device_create_file(s, &dev_attr_idle_spurr);
+	}
 
 	if (cpu_has_feature(CPU_FTR_DSCR))
 		device_create_file(s, &dev_attr_dscr);
@@ -910,11 +952,17 @@ static int unregister_cpu_online(unsigned int cpu)
 		device_remove_file(s, &dev_attr_mmcra);
 #endif /* CONFIG_PMU_SYSFS */
 
-	if (cpu_has_feature(CPU_FTR_PURR))
+	if (cpu_has_feature(CPU_FTR_PURR)) {
 		device_remove_file(s, &dev_attr_purr);
+		if (firmware_has_feature(FW_FEATURE_LPAR))
+			device_remove_file(s, &dev_attr_idle_purr);
+	}
 
-	if (cpu_has_feature(CPU_FTR_SPURR))
+	if (cpu_has_feature(CPU_FTR_SPURR)) {
 		device_remove_file(s, &dev_attr_spurr);
+		if (firmware_has_feature(FW_FEATURE_LPAR))
+			device_remove_file(s, &dev_attr_idle_spurr);
+	}
 
 	if (cpu_has_feature(CPU_FTR_DSCR))
 		device_remove_file(s, &dev_attr_dscr);
-- 
1.9.4



* [PATCH v3 5/6] Documentation: Document sysfs interfaces purr, spurr, idle_purr, idle_spurr
  2020-03-11  9:37 [PATCH v3 0/6] Track and expose idle PURR and SPURR ticks Gautham R. Shenoy
                   ` (3 preceding siblings ...)
  2020-03-11  9:37 ` [PATCH v3 4/6] powerpc/sysfs: Show idle_purr and idle_spurr for every CPU Gautham R. Shenoy
@ 2020-03-11  9:37 ` Gautham R. Shenoy
  2020-03-11  9:37 ` [PATCH v3 6/6] pseries/sysfs: Minimise IPI noise while reading [idle_][s]purr Gautham R. Shenoy
  5 siblings, 0 replies; 7+ messages in thread
From: Gautham R. Shenoy @ 2020-03-11  9:37 UTC (permalink / raw)
  To: Nathan Lynch, Michael Ellerman, Vaidyanathan Srinivasan,
	Kamalesh Babulal, Naveen N. Rao, Tyrel Datwyler
  Cc: linuxppc-dev, linux-kernel, Gautham R. Shenoy

From: "Gautham R. Shenoy" <ego@linux.vnet.ibm.com>

Add documentation for the following sysfs interfaces:
/sys/devices/system/cpu/cpuX/purr
/sys/devices/system/cpu/cpuX/spurr
/sys/devices/system/cpu/cpuX/idle_purr
/sys/devices/system/cpu/cpuX/idle_spurr

Signed-off-by: Gautham R. Shenoy <ego@linux.vnet.ibm.com>
---
 Documentation/ABI/testing/sysfs-devices-system-cpu | 39 ++++++++++++++++++++++
 1 file changed, 39 insertions(+)

diff --git a/Documentation/ABI/testing/sysfs-devices-system-cpu b/Documentation/ABI/testing/sysfs-devices-system-cpu
index 2e0e3b4..bc07677 100644
--- a/Documentation/ABI/testing/sysfs-devices-system-cpu
+++ b/Documentation/ABI/testing/sysfs-devices-system-cpu
@@ -580,3 +580,42 @@ Description:	Secure Virtual Machine
 		If 1, it means the system is using the Protected Execution
 		Facility in POWER9 and newer processors. i.e., it is a Secure
 		Virtual Machine.
+
+What: 		/sys/devices/system/cpu/cpuX/purr
+Date:		Apr 2005
+Contact:	Linux for PowerPC mailing list <linuxppc-dev@ozlabs.org>
+Description:	PURR ticks for this CPU since the system boot.
+
+		The Processor Utilization Resources Register (PURR) is
+		a 64-bit counter which provides an estimate of the
+		resources used by the CPU thread. The contents of this
+		register increases monotonically. This sysfs interface
+		exposes the number of PURR ticks for cpuX.
+
+What: 		/sys/devices/system/cpu/cpuX/spurr
+Date:		Dec 2006
+Contact:	Linux for PowerPC mailing list <linuxppc-dev@ozlabs.org>
+Description:	SPURR ticks for this CPU since the system boot.
+
+		The Scaled Processor Utilization Resources Register
+		(SPURR) is a 64-bit counter that provides a frequency
+		invariant estimate of the resources used by the CPU
+		thread. The contents of this register increases
+		monotonically. This sysfs interface exposes the number
+		of SPURR ticks for cpuX.
+
+What: 		/sys/devices/system/cpu/cpuX/idle_purr
+Date:		Mar 2020
+Contact:	Linux for PowerPC mailing list <linuxppc-dev@ozlabs.org>
+Description:	PURR ticks for cpuX when it was idle.
+
+		This sysfs interface exposes the number of PURR ticks
+		for cpuX when it was idle.
+
+What: 		/sys/devices/system/cpu/cpuX/idle_spurr
+Date:		Mar 2020
+Contact:	Linux for PowerPC mailing list <linuxppc-dev@ozlabs.org>
+Description:	SPURR ticks for cpuX when it was idle.
+
+		This sysfs interface exposes the number of SPURR ticks
+		for cpuX when it was idle.
-- 
1.9.4



* [PATCH v3 6/6] pseries/sysfs: Minimise IPI noise while reading [idle_][s]purr
  2020-03-11  9:37 [PATCH v3 0/6] Track and expose idle PURR and SPURR ticks Gautham R. Shenoy
                   ` (4 preceding siblings ...)
  2020-03-11  9:37 ` [PATCH v3 5/6] Documentation: Document sysfs interfaces purr, spurr, idle_purr, idle_spurr Gautham R. Shenoy
@ 2020-03-11  9:37 ` Gautham R. Shenoy
  5 siblings, 0 replies; 7+ messages in thread
From: Gautham R. Shenoy @ 2020-03-11  9:37 UTC (permalink / raw)
  To: Nathan Lynch, Michael Ellerman, Vaidyanathan Srinivasan,
	Kamalesh Babulal, Naveen N. Rao, Tyrel Datwyler
  Cc: linuxppc-dev, linux-kernel, Gautham R. Shenoy

From: "Gautham R. Shenoy" <ego@linux.vnet.ibm.com>

Currently purr, spurr, idle_purr, idle_spurr are exposed for every CPU
via the sysfs interface
/sys/devices/system/cpu/cpuX/[idle_][s]purr. Each sysfs read currently
generates an IPI to obtain the desired value from the target CPU X.
Since these aforementioned sysfs files are typically read one after
another, we end up generating 4 IPIs per CPU in a short duration.

In order to minimize the IPI noise, this patch caches the values of
all the four entities whenever one of them is read. If subsequently
any of these are read within the next 10ms, the cached value is
returned. With this, we will generate at most one IPI every 10ms for
every CPU.
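
The caching policy described above can be sketched outside the kernel
as follows. This is an illustrative model only: the struct, the refresh
callback (standing in for the IPI) and the millisecond timestamps are
assumptions made for the example, and treating a read at exactly 10ms
as stale is a choice made here, not something taken from the patch.

```c
#include <stdbool.h>
#include <stdint.h>

/* Mirrors the 10ms window described above. */
#define STALENESS_TOLERANCE_MS 10

/* Hypothetical cache of the four values for one CPU. */
struct cached_stats {
	uint64_t purr, spurr, idle_purr, idle_spurr;
	uint64_t last_update_ms;
	bool valid;
};

/* Hypothetical refresh hook; in the patch this work is done by an IPI. */
typedef void (*refresh_fn)(struct cached_stats *stats);

/*
 * Ensure the cache is fresh enough for a read at time now_ms.
 * Returns true if a refresh was needed, false if the cache was reused.
 */
static bool get_stats(struct cached_stats *stats, uint64_t now_ms,
		      refresh_fn refresh)
{
	if (stats->valid &&
	    now_ms - stats->last_update_ms < STALENESS_TOLERANCE_MS)
		return false;

	refresh(stats);
	stats->last_update_ms = now_ms;
	stats->valid = true;
	return true;
}

/* Example refresh hook that only counts how often it is invoked. */
static unsigned int refresh_count;

static void counting_refresh(struct cached_stats *stats)
{
	(void)stats;
	refresh_count++;
}
```

With this policy, four back-to-back reads for one CPU trigger a single
refresh, which is the effect the patch aims for.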

Test-results: While reading the four sysfs files back-to-back for a
given CPU every second for 100 seconds.

Without the patch:
		 16 [XICS 2 Edge IPI] = 422 times
		 DBL [Doorbell interrupts] = 13 times
		 Total : 435 IPIs.

With the patch:
		 16 [XICS 2 Edge IPI] = 111 times
		 DBL [Doorbell interrupts] = 17 times
		 Total : 128 IPIs.
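
As a rough cross-check of these totals, the ideal counts can be worked out as below. This is a sketch with hypothetical helper names; it assumes one IPI per uncached read and that all four reads of a sample fall inside the same 10ms window. The ideal counts come to 400 and 100, close to the measured 435 and 128, and the measured drop is about 70%. The message does not break down the remaining IPIs, so attributing them to unrelated activity is an assumption.

```c
/* Reads that would each need an IPI if nothing is cached. */
static unsigned int ipis_uncached(unsigned int files, unsigned int samples)
{
	return files * samples;
}

/* With caching, one IPI serves every file read in the same sample,
 * assuming all reads of a sample land inside the 10ms window. */
static unsigned int ipis_cached(unsigned int samples)
{
	return samples;
}

/* Percentage reduction from `before` to `after` (requires before > 0). */
static double reduction_pct(unsigned int before, unsigned int after)
{
	return 100.0 * ((double)before - (double)after) / (double)before;
}
```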

Signed-off-by: Gautham R. Shenoy <ego@linux.vnet.ibm.com>
---
 arch/powerpc/kernel/sysfs.c | 109 ++++++++++++++++++++++++++++++++++++--------
 1 file changed, 90 insertions(+), 19 deletions(-)

diff --git a/arch/powerpc/kernel/sysfs.c b/arch/powerpc/kernel/sysfs.c
index c9ddb83..db8fc90 100644
--- a/arch/powerpc/kernel/sysfs.c
+++ b/arch/powerpc/kernel/sysfs.c
@@ -586,8 +586,6 @@ void ppc_enable_pmcs(void)
  * SPRs which are not related to PMU.
  */
 #ifdef CONFIG_PPC64
-SYSFS_SPRSETUP(purr, SPRN_PURR);
-SYSFS_SPRSETUP(spurr, SPRN_SPURR);
 SYSFS_SPRSETUP(pir, SPRN_PIR);
 SYSFS_SPRSETUP(tscr, SPRN_TSCR);
 
@@ -596,8 +594,6 @@ void ppc_enable_pmcs(void)
   enable write when needed with a separate function.
   Lets be conservative and default to pseries.
 */
-static DEVICE_ATTR(spurr, 0400, show_spurr, NULL);
-static DEVICE_ATTR(purr, 0400, show_purr, store_purr);
 static DEVICE_ATTR(pir, 0400, show_pir, NULL);
 static DEVICE_ATTR(tscr, 0600, show_tscr, store_tscr);
 #endif /* CONFIG_PPC64 */
@@ -761,39 +757,114 @@ static void create_svm_file(void)
 }
 #endif /* CONFIG_PPC_SVM */
 
-static void read_idle_purr(void *val)
+/*
+ * The duration (in ms) from the last IPI to the target CPU until
+ * which a cached value of purr, spurr, idle_purr, idle_spurr can be
+ * reported to the user on a corresponding sysfs file read. Beyond
+ * this duration, fresh values need to be obtained by sending IPIs to
+ * the target CPU when the sysfs files are read.
+ */
+static unsigned long util_stats_staleness_tolerance_ms = 10;
+struct util_acct_stats {
+	u64 latest_purr;
+	u64 latest_spurr;
+	u64 latest_idle_purr;
+	u64 latest_idle_spurr;
+	unsigned long last_update_jiffies;
+};
+
+DEFINE_PER_CPU(struct util_acct_stats, util_acct_stats);
+
+static void update_util_acct_stats(void *ptr)
 {
-	u64 *ret = val;
+	struct util_acct_stats *stats = ptr;
 
-	*ret = read_this_idle_purr();
+	stats->latest_purr = mfspr(SPRN_PURR);
+	stats->latest_spurr = mfspr(SPRN_SPURR);
+	stats->latest_idle_purr = read_this_idle_purr();
+	stats->latest_idle_spurr = read_this_idle_spurr();
+	stats->last_update_jiffies = jiffies;
 }
 
-static ssize_t idle_purr_show(struct device *dev,
-			      struct device_attribute *attr, char *buf)
+struct util_acct_stats *get_util_stats_ptr(int cpu)
+{
+	struct util_acct_stats *stats = per_cpu_ptr(&util_acct_stats, cpu);
+	unsigned long delta_jiffies;
+
+	delta_jiffies = jiffies - stats->last_update_jiffies;
+
+	/*
+	 * If we have recent enough data, reuse that instead of
+	 * sending an IPI.
+	 */
+	if (jiffies_to_msecs(delta_jiffies) < util_stats_staleness_tolerance_ms)
+		return stats;
+
+	smp_call_function_single(cpu, update_util_acct_stats, stats, 1);
+	return stats;
+}
+
+static ssize_t show_purr(struct device *dev,
+			 struct device_attribute *attr, char *buf)
 {
 	struct cpu *cpu = container_of(dev, struct cpu, dev);
-	u64 val;
+	struct util_acct_stats *stats;
 
-	smp_call_function_single(cpu->dev.id, read_idle_purr, &val, 1);
-	return sprintf(buf, "%llx\n", val);
+	stats = get_util_stats_ptr(cpu->dev.id);
+	return sprintf(buf, "%llx\n", stats->latest_purr);
 }
-static DEVICE_ATTR(idle_purr, 0400, idle_purr_show, NULL);
 
-static void read_idle_spurr(void *val)
+static void write_purr(void *val)
 {
-	u64 *ret = val;
+	mtspr(SPRN_PURR, *(unsigned long *)val);
+}
 
-	*ret = read_this_idle_spurr();
+static ssize_t __used store_purr(struct device *dev,
+				 struct device_attribute *attr,
+				 const char *buf, size_t count)
+{
+	struct cpu *cpu = container_of(dev, struct cpu, dev);
+	unsigned long val;
+	int ret = kstrtoul(buf, 16, &val);
+
+	if (ret != 0)
+		return -EINVAL;
+
+	smp_call_function_single(cpu->dev.id, write_purr, &val, 1);
+	return count;
+}
+static DEVICE_ATTR(purr, 0400, show_purr, store_purr);
+
+static ssize_t show_spurr(struct device *dev,
+			  struct device_attribute *attr, char *buf)
+{
+	struct cpu *cpu = container_of(dev, struct cpu, dev);
+	struct util_acct_stats *stats;
+
+	stats = get_util_stats_ptr(cpu->dev.id);
+	return sprintf(buf, "%llx\n", stats->latest_spurr);
 }
+static DEVICE_ATTR(spurr, 0400, show_spurr, NULL);
+
+static ssize_t idle_purr_show(struct device *dev,
+			      struct device_attribute *attr, char *buf)
+{
+	struct cpu *cpu = container_of(dev, struct cpu, dev);
+	struct util_acct_stats *stats;
+
+	stats = get_util_stats_ptr(cpu->dev.id);
+	return sprintf(buf, "%llx\n", stats->latest_idle_purr);
+}
+static DEVICE_ATTR(idle_purr, 0400, idle_purr_show, NULL);
 
 static ssize_t idle_spurr_show(struct device *dev,
 			       struct device_attribute *attr, char *buf)
 {
 	struct cpu *cpu = container_of(dev, struct cpu, dev);
-	u64 val;
+	struct util_acct_stats *stats;
 
-	smp_call_function_single(cpu->dev.id, read_idle_spurr, &val, 1);
-	return sprintf(buf, "%llx\n", val);
+	stats = get_util_stats_ptr(cpu->dev.id);
+	return sprintf(buf, "%llx\n", stats->latest_idle_spurr);
 }
 static DEVICE_ATTR(idle_spurr, 0400, idle_spurr_show, NULL);
 
-- 
1.9.4
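
The staleness check in get_util_stats_ptr() subtracts two unsigned long jiffies values. A standalone sketch of why that subtraction stays correct across a counter wrap, using hypothetical helper names and plain tick counts rather than the kernel's jiffies helpers:

```c
#include <limits.h>
#include <stdbool.h>

/* Ticks elapsed between two samples of a free-running unsigned counter.
 * Unsigned subtraction is modulo ULONG_MAX + 1, so the result stays
 * correct even if the counter wrapped once between the samples. */
static unsigned long ticks_elapsed(unsigned long now, unsigned long then)
{
	return now - then;
}

/* True if the sample taken at `then` is younger than `tolerance` ticks. */
static bool is_fresh(unsigned long now, unsigned long then,
		     unsigned long tolerance)
{
	return ticks_elapsed(now, then) < tolerance;
}
```

For example, a sample taken 5 ticks before the counter wraps and checked 5 ticks after the wrap is reported as 10 ticks old, not as a huge or negative age.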

