linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH v2 0/5] Track and expose idle PURR and SPURR ticks
@ 2020-02-21  5:18 Gautham R. Shenoy
  2020-02-21  5:18 ` [PATCH v2 1/5] powerpc: Move idle_loop_prolog()/epilog() functions to header file Gautham R. Shenoy
                   ` (5 more replies)
  0 siblings, 6 replies; 19+ messages in thread
From: Gautham R. Shenoy @ 2020-02-21  5:18 UTC (permalink / raw)
  To: Nathan Lynch, Michael Ellerman, Vaidyanathan Srinivasan,
	Kamalesh Babulal, Naveen N. Rao, Tyrel Datwyler
  Cc: linuxppc-dev, linux-kernel, Gautham R. Shenoy

From: "Gautham R. Shenoy" <ego@linux.vnet.ibm.com>

Hi,

This is the second version of the patches to track and expose idle
PURR and SPURR ticks. These patches are required by tools such as
lparstat to compute system utilization for capacity planning purposes.

v1 can be found here: https://lore.kernel.org/patchwork/cover/1159341/

The key changes from v1 are

    - The sysfs reads of idle PURR and SPURR now send an
      smp_call_function to the target CPU in order to read the most
      recent value of idle PURR and SPURR. This is required if the
      target CPU was idle for a long duration, in which case the
      cycles corresponding to its latest idle duration would not be
      updated in the variable tracking idle PURR/SPURR. Thus merely
      reading the variable would not reflect the most accurate idle
      PURR/SPURR ticks.
    
    - Ensured that even when idle PURR/SPURR values are read in an
      interrupt context in-between idle_loop_prolog() and
      idle_loop_epilog(), we return the value that includes the cycles
      spent in the most recent idle period.

    - The sysfs files for idle_purr and idle_spurr are created only
      when the FW_FEATURE_LPAR is enabled (the earlier version was
      checking for FW_FEATURE_SPLPAR)

Motivation:
===========
On PSeries LPARs, the data centers planners desire a more accurate
view of system utilization per resource such as CPU to plan the system
capacity requirements better. Such accuracy can be obtained by reading
PURR/SPURR registers for CPU resource utilization.

Tools such as lparstat which are used to compute the utilization need
to know [S]PURR ticks when the cpu was busy or idle. The [S]PURR
counters are already exposed through sysfs.  We already account for
PURR ticks when we go to idle so that we can update the VPA area. This
patchset extends support to account for SPURR ticks when idle, and
expose both via per-cpu sysfs files.

These patches are required for enhancement to the lparstat utility
that compute the CPU utilization based on PURR and SPURR which can be
found here :
https://groups.google.com/forum/#!topic/powerpc-utils-devel/fYRo69xO9r4

With the patches, when lparstat is run on a LPAR running CPU-Hogs,
=========================================================================
$sudo ./src/lparstat -E 1 3
System Configuration
type=Dedicated mode=Capped smt=8 lcpu=2 mem=4834176 kB cpus=0 ent=2.00 
---Actual---                 -Normalized-
%busy  %idle   Frequency     %busy  %idle
------ ------  ------------- ------ ------
 99.99   0.00  3.35GHz[111%] 110.99   0.00
100.00   0.00  3.35GHz[111%] 111.00   0.00
100.00   0.00  3.35GHz[111%] 111.00   0.00
=========================================================================

When lparstat is run on an LPAR that is idle,
=========================================================================
$ sudo ./src/lparstat -E 1 3
System Configuration
type=Dedicated mode=Capped smt=8 lcpu=2 mem=4834176 kB cpus=0 ent=2.00 
---Actual---                 -Normalized-
%busy  %idle   Frequency     %busy  %idle
------ ------  ------------- ------ ------
  0.09  99.91  2.11GHz[ 70%]   0.11  69.90
  0.32  99.68  2.17GHz[ 72%]   0.25  71.75
  0.56  99.44  2.18GHz[ 72%]   0.42  71.58
=========================================================================

Gautham R. Shenoy (5):
  powerpc: Move idle_loop_prolog()/epilog() functions to header file
  powerpc/idle: Add accessor function to always read latest idle PURR
  powerpc/pseries: Account for SPURR ticks on idle CPUs
  powerpc/sysfs: Show idle_purr and idle_spurr for every CPU
  Documentation: Document sysfs interfaces purr, spurr, idle_purr,
    idle_spurr

 Documentation/ABI/testing/sysfs-devices-system-cpu | 39 ++++++++++
 arch/powerpc/include/asm/idle.h                    | 88 ++++++++++++++++++++++
 arch/powerpc/kernel/sysfs.c                        | 54 ++++++++++++-
 arch/powerpc/platforms/pseries/setup.c             |  8 +-
 drivers/cpuidle/cpuidle-pseries.c                  | 39 ++--------
 5 files changed, 191 insertions(+), 37 deletions(-)
 create mode 100644 arch/powerpc/include/asm/idle.h

-- 
1.9.4


^ permalink raw reply	[flat|nested] 19+ messages in thread

* [PATCH v2 1/5] powerpc: Move idle_loop_prolog()/epilog() functions to header file
  2020-02-21  5:18 [PATCH v2 0/5] Track and expose idle PURR and SPURR ticks Gautham R. Shenoy
@ 2020-02-21  5:18 ` Gautham R. Shenoy
  2020-02-21 15:03   ` Nathan Lynch
  2020-02-21  5:18 ` [PATCH v2 2/5] powerpc/idle: Add accessor function to always read latest idle PURR Gautham R. Shenoy
                   ` (4 subsequent siblings)
  5 siblings, 1 reply; 19+ messages in thread
From: Gautham R. Shenoy @ 2020-02-21  5:18 UTC (permalink / raw)
  To: Nathan Lynch, Michael Ellerman, Vaidyanathan Srinivasan,
	Kamalesh Babulal, Naveen N. Rao, Tyrel Datwyler
  Cc: linuxppc-dev, linux-kernel, Gautham R. Shenoy

From: "Gautham R. Shenoy" <ego@linux.vnet.ibm.com>

Currently prior to entering an idle state on a Linux Guest, the
pseries cpuidle driver implement an idle_loop_prolog() and
idle_loop_epilog() functions which ensure that idle_purr is correctly
computed, and the hypervisor is informed that the CPU cycles have been
donated.

These prolog and epilog functions are also required in the default
idle call, i.e pseries_lpar_idle(). Hence move these accessor
functions to a common header file and call them from
pseries_lpar_idle(). Since the existing header files such as
asm/processor.h have enough clutter, create a new header file
asm/idle.h.

Signed-off-by: Gautham R. Shenoy <ego@linux.vnet.ibm.com>
---
 arch/powerpc/include/asm/idle.h        | 27 +++++++++++++++++++++++++++
 arch/powerpc/platforms/pseries/setup.c |  7 +++++--
 drivers/cpuidle/cpuidle-pseries.c      | 24 +-----------------------
 3 files changed, 33 insertions(+), 25 deletions(-)
 create mode 100644 arch/powerpc/include/asm/idle.h

diff --git a/arch/powerpc/include/asm/idle.h b/arch/powerpc/include/asm/idle.h
new file mode 100644
index 0000000..f32a7d8
--- /dev/null
+++ b/arch/powerpc/include/asm/idle.h
@@ -0,0 +1,27 @@
+#ifndef _ASM_POWERPC_IDLE_H
+#define _ASM_POWERPC_IDLE_H
+#include <asm/runlatch.h>
+
+static inline void idle_loop_prolog(unsigned long *in_purr)
+{
+	ppc64_runlatch_off();
+	*in_purr = mfspr(SPRN_PURR);
+	/*
+	 * Indicate to the HV that we are idle. Now would be
+	 * a good time to find other work to dispatch.
+	 */
+	get_lppaca()->idle = 1;
+}
+
+static inline void idle_loop_epilog(unsigned long in_purr)
+{
+	u64 wait_cycles;
+
+	wait_cycles = be64_to_cpu(get_lppaca()->wait_state_cycles);
+	wait_cycles += mfspr(SPRN_PURR) - in_purr;
+	get_lppaca()->wait_state_cycles = cpu_to_be64(wait_cycles);
+	get_lppaca()->idle = 0;
+
+	ppc64_runlatch_on();
+}
+#endif
diff --git a/arch/powerpc/platforms/pseries/setup.c b/arch/powerpc/platforms/pseries/setup.c
index 0c8421d..ffd4d59 100644
--- a/arch/powerpc/platforms/pseries/setup.c
+++ b/arch/powerpc/platforms/pseries/setup.c
@@ -68,6 +68,7 @@
 #include <asm/isa-bridge.h>
 #include <asm/security_features.h>
 #include <asm/asm-const.h>
+#include <asm/idle.h>
 #include <asm/swiotlb.h>
 #include <asm/svm.h>
 
@@ -319,6 +320,8 @@ static int alloc_dispatch_log_kmem_cache(void)
 
 static void pseries_lpar_idle(void)
 {
+	unsigned long in_purr;
+
 	/*
 	 * Default handler to go into low thread priority and possibly
 	 * low power mode by ceding processor to hypervisor
@@ -328,7 +331,7 @@ static void pseries_lpar_idle(void)
 		return;
 
 	/* Indicate to hypervisor that we are idle. */
-	get_lppaca()->idle = 1;
+	idle_loop_prolog(&in_purr);
 
 	/*
 	 * Yield the processor to the hypervisor.  We return if
@@ -339,7 +342,7 @@ static void pseries_lpar_idle(void)
 	 */
 	cede_processor();
 
-	get_lppaca()->idle = 0;
+	idle_loop_epilog(in_purr);
 }
 
 /*
diff --git a/drivers/cpuidle/cpuidle-pseries.c b/drivers/cpuidle/cpuidle-pseries.c
index 74c2479..fc9dee9c 100644
--- a/drivers/cpuidle/cpuidle-pseries.c
+++ b/drivers/cpuidle/cpuidle-pseries.c
@@ -19,6 +19,7 @@
 #include <asm/machdep.h>
 #include <asm/firmware.h>
 #include <asm/runlatch.h>
+#include <asm/idle.h>
 #include <asm/plpar_wrappers.h>
 
 struct cpuidle_driver pseries_idle_driver = {
@@ -31,29 +32,6 @@ struct cpuidle_driver pseries_idle_driver = {
 static u64 snooze_timeout __read_mostly;
 static bool snooze_timeout_en __read_mostly;
 
-static inline void idle_loop_prolog(unsigned long *in_purr)
-{
-	ppc64_runlatch_off();
-	*in_purr = mfspr(SPRN_PURR);
-	/*
-	 * Indicate to the HV that we are idle. Now would be
-	 * a good time to find other work to dispatch.
-	 */
-	get_lppaca()->idle = 1;
-}
-
-static inline void idle_loop_epilog(unsigned long in_purr)
-{
-	u64 wait_cycles;
-
-	wait_cycles = be64_to_cpu(get_lppaca()->wait_state_cycles);
-	wait_cycles += mfspr(SPRN_PURR) - in_purr;
-	get_lppaca()->wait_state_cycles = cpu_to_be64(wait_cycles);
-	get_lppaca()->idle = 0;
-
-	ppc64_runlatch_on();
-}
-
 static int snooze_loop(struct cpuidle_device *dev,
 			struct cpuidle_driver *drv,
 			int index)
-- 
1.9.4


^ permalink raw reply related	[flat|nested] 19+ messages in thread

* [PATCH v2 2/5] powerpc/idle: Add accessor function to always read latest idle PURR
  2020-02-21  5:18 [PATCH v2 0/5] Track and expose idle PURR and SPURR ticks Gautham R. Shenoy
  2020-02-21  5:18 ` [PATCH v2 1/5] powerpc: Move idle_loop_prolog()/epilog() functions to header file Gautham R. Shenoy
@ 2020-02-21  5:18 ` Gautham R. Shenoy
  2020-02-21  5:18 ` [PATCH v2 3/5] powerpc/pseries: Account for SPURR ticks on idle CPUs Gautham R. Shenoy
                   ` (3 subsequent siblings)
  5 siblings, 0 replies; 19+ messages in thread
From: Gautham R. Shenoy @ 2020-02-21  5:18 UTC (permalink / raw)
  To: Nathan Lynch, Michael Ellerman, Vaidyanathan Srinivasan,
	Kamalesh Babulal, Naveen N. Rao, Tyrel Datwyler
  Cc: linuxppc-dev, linux-kernel, Gautham R. Shenoy

From: "Gautham R. Shenoy" <ego@linux.vnet.ibm.com>

Currently when CPU goes idle, we take a snapshot of PURR via
idle_loop_prolog() which is used at the CPU idle exit to compute the
idle PURR cycles via the function idle_loop_epilog().  Thus, the value
of idle PURR cycle thus read before idle_loop_prolog() and after
idle_loop_epilog() is always correct.

However, if we were to read the idle PURR cycles from an interrupt
context between idle_loop_prolog() and idle_loop_epilog() (this will
be done in a future patch), then, the value of the idle PURR thus read
will not include the cycles spent in the most recent idle period.

This patch addresses the issue by providing accessor function to read
the idle PURR such such that it includes the cycles spent in the most
recent idle period, if we read it between idle_loop_prolog() and
idle_loop_epilog(). In order to achieve it, the patch saves the
snapshot of PURR in idle_loop_prolog() in a per-cpu variable, instead
of on the stack, so that it can be accessed from an interrupt context.

Signed-off-by: Gautham R. Shenoy <ego@linux.vnet.ibm.com>
---
 arch/powerpc/include/asm/idle.h        | 46 +++++++++++++++++++++++++++-------
 arch/powerpc/platforms/pseries/setup.c |  7 +++---
 drivers/cpuidle/cpuidle-pseries.c      | 15 +++++------
 3 files changed, 46 insertions(+), 22 deletions(-)

diff --git a/arch/powerpc/include/asm/idle.h b/arch/powerpc/include/asm/idle.h
index f32a7d8..126a217 100644
--- a/arch/powerpc/include/asm/idle.h
+++ b/arch/powerpc/include/asm/idle.h
@@ -2,10 +2,27 @@
 #define _ASM_POWERPC_IDLE_H
 #include <asm/runlatch.h>
 
-static inline void idle_loop_prolog(unsigned long *in_purr)
+DECLARE_PER_CPU(u64, idle_entry_purr_snap);
+
+static inline void snapshot_purr_idle_entry(void)
+{
+	*this_cpu_ptr(&idle_entry_purr_snap) = mfspr(SPRN_PURR);
+}
+
+static inline void update_idle_purr_accounting(void)
+{
+	u64 wait_cycles;
+	u64 in_purr = *this_cpu_ptr(&idle_entry_purr_snap);
+
+	wait_cycles = be64_to_cpu(get_lppaca()->wait_state_cycles);
+	wait_cycles += mfspr(SPRN_PURR) - in_purr;
+	get_lppaca()->wait_state_cycles = cpu_to_be64(wait_cycles);
+}
+
+static inline void idle_loop_prolog(void)
 {
 	ppc64_runlatch_off();
-	*in_purr = mfspr(SPRN_PURR);
+	snapshot_purr_idle_entry();
 	/*
 	 * Indicate to the HV that we are idle. Now would be
 	 * a good time to find other work to dispatch.
@@ -13,15 +30,26 @@ static inline void idle_loop_prolog(unsigned long *in_purr)
 	get_lppaca()->idle = 1;
 }
 
-static inline void idle_loop_epilog(unsigned long in_purr)
+static inline void idle_loop_epilog(void)
 {
-	u64 wait_cycles;
-
-	wait_cycles = be64_to_cpu(get_lppaca()->wait_state_cycles);
-	wait_cycles += mfspr(SPRN_PURR) - in_purr;
-	get_lppaca()->wait_state_cycles = cpu_to_be64(wait_cycles);
+	update_idle_purr_accounting();
 	get_lppaca()->idle = 0;
-
 	ppc64_runlatch_on();
 }
+
+static inline u64 read_this_idle_purr(void)
+{
+	/*
+	 * If we are reading from an idle context, update the
+	 * idle-purr cycles corresponding to the last idle period.
+	 * Since the idle context is not yet over, take a fresh
+	 * snapshot of the idle-purr.
+	 */
+	if (unlikely(get_lppaca()->idle == 1)) {
+		update_idle_purr_accounting();
+		snapshot_purr_idle_entry();
+	}
+
+	return be64_to_cpu(get_lppaca()->wait_state_cycles);
+}
 #endif
diff --git a/arch/powerpc/platforms/pseries/setup.c b/arch/powerpc/platforms/pseries/setup.c
index ffd4d59..e9f2cefa 100644
--- a/arch/powerpc/platforms/pseries/setup.c
+++ b/arch/powerpc/platforms/pseries/setup.c
@@ -318,10 +318,9 @@ static int alloc_dispatch_log_kmem_cache(void)
 }
 machine_early_initcall(pseries, alloc_dispatch_log_kmem_cache);
 
+DEFINE_PER_CPU(u64, idle_entry_purr_snap);
 static void pseries_lpar_idle(void)
 {
-	unsigned long in_purr;
-
 	/*
 	 * Default handler to go into low thread priority and possibly
 	 * low power mode by ceding processor to hypervisor
@@ -331,7 +330,7 @@ static void pseries_lpar_idle(void)
 		return;
 
 	/* Indicate to hypervisor that we are idle. */
-	idle_loop_prolog(&in_purr);
+	idle_loop_prolog();
 
 	/*
 	 * Yield the processor to the hypervisor.  We return if
@@ -342,7 +341,7 @@ static void pseries_lpar_idle(void)
 	 */
 	cede_processor();
 
-	idle_loop_epilog(in_purr);
+	idle_loop_epilog();
 }
 
 /*
diff --git a/drivers/cpuidle/cpuidle-pseries.c b/drivers/cpuidle/cpuidle-pseries.c
index fc9dee9c..98d3832 100644
--- a/drivers/cpuidle/cpuidle-pseries.c
+++ b/drivers/cpuidle/cpuidle-pseries.c
@@ -36,12 +36,11 @@ static int snooze_loop(struct cpuidle_device *dev,
 			struct cpuidle_driver *drv,
 			int index)
 {
-	unsigned long in_purr;
 	u64 snooze_exit_time;
 
 	set_thread_flag(TIF_POLLING_NRFLAG);
 
-	idle_loop_prolog(&in_purr);
+	idle_loop_prolog();
 	local_irq_enable();
 	snooze_exit_time = get_tb() + snooze_timeout;
 
@@ -65,7 +64,7 @@ static int snooze_loop(struct cpuidle_device *dev,
 
 	local_irq_disable();
 
-	idle_loop_epilog(in_purr);
+	idle_loop_epilog();
 
 	return index;
 }
@@ -91,9 +90,8 @@ static int dedicated_cede_loop(struct cpuidle_device *dev,
 				struct cpuidle_driver *drv,
 				int index)
 {
-	unsigned long in_purr;
 
-	idle_loop_prolog(&in_purr);
+	idle_loop_prolog();
 	get_lppaca()->donate_dedicated_cpu = 1;
 
 	HMT_medium();
@@ -102,7 +100,7 @@ static int dedicated_cede_loop(struct cpuidle_device *dev,
 	local_irq_disable();
 	get_lppaca()->donate_dedicated_cpu = 0;
 
-	idle_loop_epilog(in_purr);
+	idle_loop_epilog();
 
 	return index;
 }
@@ -111,9 +109,8 @@ static int shared_cede_loop(struct cpuidle_device *dev,
 			struct cpuidle_driver *drv,
 			int index)
 {
-	unsigned long in_purr;
 
-	idle_loop_prolog(&in_purr);
+	idle_loop_prolog();
 
 	/*
 	 * Yield the processor to the hypervisor.  We return if
@@ -125,7 +122,7 @@ static int shared_cede_loop(struct cpuidle_device *dev,
 	check_and_cede_processor();
 
 	local_irq_disable();
-	idle_loop_epilog(in_purr);
+	idle_loop_epilog();
 
 	return index;
 }
-- 
1.9.4


^ permalink raw reply related	[flat|nested] 19+ messages in thread

* [PATCH v2 3/5] powerpc/pseries: Account for SPURR ticks on idle CPUs
  2020-02-21  5:18 [PATCH v2 0/5] Track and expose idle PURR and SPURR ticks Gautham R. Shenoy
  2020-02-21  5:18 ` [PATCH v2 1/5] powerpc: Move idle_loop_prolog()/epilog() functions to header file Gautham R. Shenoy
  2020-02-21  5:18 ` [PATCH v2 2/5] powerpc/idle: Add accessor function to always read latest idle PURR Gautham R. Shenoy
@ 2020-02-21  5:18 ` Gautham R. Shenoy
  2020-02-21 16:47   ` Nathan Lynch
  2020-02-21  5:18 ` [PATCH v2 4/5] powerpc/sysfs: Show idle_purr and idle_spurr for every CPU Gautham R. Shenoy
                   ` (2 subsequent siblings)
  5 siblings, 1 reply; 19+ messages in thread
From: Gautham R. Shenoy @ 2020-02-21  5:18 UTC (permalink / raw)
  To: Nathan Lynch, Michael Ellerman, Vaidyanathan Srinivasan,
	Kamalesh Babulal, Naveen N. Rao, Tyrel Datwyler
  Cc: linuxppc-dev, linux-kernel, Gautham R. Shenoy

From: "Gautham R. Shenoy" <ego@linux.vnet.ibm.com>

On Pseries LPARs, to calculate utilization, we need to know the
[S]PURR ticks when the CPUs were busy or idle.

Via idle_loop_prolog(), idle_loop_epilog(), we track the idle PURR
ticks in the VPA variable "wait_state_cycles". This patch extends the
support to account for the idle SPURR ticks. It also provides an
accessor function to accurately reads idle SPURR ticks.

Signed-off-by: Gautham R. Shenoy <ego@linux.vnet.ibm.com>
---
 arch/powerpc/include/asm/idle.h        | 33 +++++++++++++++++++++++++++++++++
 arch/powerpc/platforms/pseries/setup.c |  2 ++
 2 files changed, 35 insertions(+)

diff --git a/arch/powerpc/include/asm/idle.h b/arch/powerpc/include/asm/idle.h
index 126a217..db82fc1 100644
--- a/arch/powerpc/include/asm/idle.h
+++ b/arch/powerpc/include/asm/idle.h
@@ -2,13 +2,20 @@
 #define _ASM_POWERPC_IDLE_H
 #include <asm/runlatch.h>
 
+DECLARE_PER_CPU(u64, idle_spurr_cycles);
 DECLARE_PER_CPU(u64, idle_entry_purr_snap);
+DECLARE_PER_CPU(u64, idle_entry_spurr_snap);
 
 static inline void snapshot_purr_idle_entry(void)
 {
 	*this_cpu_ptr(&idle_entry_purr_snap) = mfspr(SPRN_PURR);
 }
 
+static inline void snapshot_spurr_idle_entry(void)
+{
+	*this_cpu_ptr(&idle_entry_spurr_snap) = mfspr(SPRN_SPURR);
+}
+
 static inline void update_idle_purr_accounting(void)
 {
 	u64 wait_cycles;
@@ -19,10 +26,19 @@ static inline void update_idle_purr_accounting(void)
 	get_lppaca()->wait_state_cycles = cpu_to_be64(wait_cycles);
 }
 
+static inline void update_idle_spurr_accounting(void)
+{
+	u64 *idle_spurr_cycles_ptr = this_cpu_ptr(&idle_spurr_cycles);
+	u64 in_spurr = *this_cpu_ptr(&idle_entry_spurr_snap);
+
+	*idle_spurr_cycles_ptr += mfspr(SPRN_SPURR) - in_spurr;
+}
+
 static inline void idle_loop_prolog(void)
 {
 	ppc64_runlatch_off();
 	snapshot_purr_idle_entry();
+	snapshot_spurr_idle_entry();
 	/*
 	 * Indicate to the HV that we are idle. Now would be
 	 * a good time to find other work to dispatch.
@@ -33,6 +49,7 @@ static inline void idle_loop_prolog(void)
 static inline void idle_loop_epilog(void)
 {
 	update_idle_purr_accounting();
+	update_idle_spurr_accounting();
 	get_lppaca()->idle = 0;
 	ppc64_runlatch_on();
 }
@@ -52,4 +69,20 @@ static inline u64 read_this_idle_purr(void)
 
 	return be64_to_cpu(get_lppaca()->wait_state_cycles);
 }
+
+static inline u64 read_this_idle_spurr(void)
+{
+	/*
+	 * If we are reading from an idle context, update the
+	 * idle-spurr cycles corresponding to the last idle period.
+	 * Since the idle context is not yet over, take a fresh
+	 * snapshot of the idle-spurr.
+	 */
+	if (get_lppaca()->idle == 1) {
+		update_idle_spurr_accounting();
+		snapshot_spurr_idle_entry();
+	}
+
+	return *this_cpu_ptr(&idle_spurr_cycles);
+}
 #endif
diff --git a/arch/powerpc/platforms/pseries/setup.c b/arch/powerpc/platforms/pseries/setup.c
index e9f2cefa..5ef5c82 100644
--- a/arch/powerpc/platforms/pseries/setup.c
+++ b/arch/powerpc/platforms/pseries/setup.c
@@ -318,7 +318,9 @@ static int alloc_dispatch_log_kmem_cache(void)
 }
 machine_early_initcall(pseries, alloc_dispatch_log_kmem_cache);
 
+DEFINE_PER_CPU(u64, idle_spurr_cycles);
 DEFINE_PER_CPU(u64, idle_entry_purr_snap);
+DEFINE_PER_CPU(u64, idle_entry_spurr_snap);
 static void pseries_lpar_idle(void)
 {
 	/*
-- 
1.9.4


^ permalink raw reply related	[flat|nested] 19+ messages in thread

* [PATCH v2 4/5] powerpc/sysfs: Show idle_purr and idle_spurr for every CPU
  2020-02-21  5:18 [PATCH v2 0/5] Track and expose idle PURR and SPURR ticks Gautham R. Shenoy
                   ` (2 preceding siblings ...)
  2020-02-21  5:18 ` [PATCH v2 3/5] powerpc/pseries: Account for SPURR ticks on idle CPUs Gautham R. Shenoy
@ 2020-02-21  5:18 ` Gautham R. Shenoy
  2020-02-21 16:50   ` Nathan Lynch
  2020-02-21  5:18 ` [PATCH v2 5/5] Documentation: Document sysfs interfaces purr, spurr, idle_purr, idle_spurr Gautham R. Shenoy
  2020-02-24  4:17 ` [PATCH v2 0/5] Track and expose idle PURR and SPURR ticks Kamalesh Babulal
  5 siblings, 1 reply; 19+ messages in thread
From: Gautham R. Shenoy @ 2020-02-21  5:18 UTC (permalink / raw)
  To: Nathan Lynch, Michael Ellerman, Vaidyanathan Srinivasan,
	Kamalesh Babulal, Naveen N. Rao, Tyrel Datwyler
  Cc: linuxppc-dev, linux-kernel, Gautham R. Shenoy

From: "Gautham R. Shenoy" <ego@linux.vnet.ibm.com>

On Pseries LPARs, to calculate utilization, we need to know the
[S]PURR ticks when the CPUs were busy or idle.

The total PURR and SPURR ticks are already exposed via the per-cpu
sysfs files "purr" and "spurr". This patch adds support for exposing
the idle PURR and SPURR ticks via new per-cpu sysfs files named
"idle_purr" and "idle_spurr".

Signed-off-by: Gautham R. Shenoy <ego@linux.vnet.ibm.com>
---
 arch/powerpc/kernel/sysfs.c | 54 ++++++++++++++++++++++++++++++++++++++++++---
 1 file changed, 51 insertions(+), 3 deletions(-)

diff --git a/arch/powerpc/kernel/sysfs.c b/arch/powerpc/kernel/sysfs.c
index 80a676d..5b4b450 100644
--- a/arch/powerpc/kernel/sysfs.c
+++ b/arch/powerpc/kernel/sysfs.c
@@ -19,6 +19,7 @@
 #include <asm/smp.h>
 #include <asm/pmc.h>
 #include <asm/firmware.h>
+#include <asm/idle.h>
 #include <asm/svm.h>
 
 #include "cacheinfo.h"
@@ -733,6 +734,42 @@ static void create_svm_file(void)
 }
 #endif /* CONFIG_PPC_SVM */
 
+static void read_idle_purr(void *val)
+{
+	u64 *ret = (u64 *)val;
+
+	*ret = read_this_idle_purr();
+}
+
+static ssize_t idle_purr_show(struct device *dev,
+			      struct device_attribute *attr, char *buf)
+{
+	struct cpu *cpu = container_of(dev, struct cpu, dev);
+	u64 val;
+
+	smp_call_function_single(cpu->dev.id, read_idle_purr, &val, 1);
+	return sprintf(buf, "%llx\n", val);
+}
+static DEVICE_ATTR(idle_purr, 0400, idle_purr_show, NULL);
+
+static void read_idle_spurr(void *val)
+{
+	u64 *ret = (u64 *)val;
+
+	*ret = read_this_idle_spurr();
+}
+
+static ssize_t idle_spurr_show(struct device *dev,
+			       struct device_attribute *attr, char *buf)
+{
+	struct cpu *cpu = container_of(dev, struct cpu, dev);
+	u64 val;
+
+	smp_call_function_single(cpu->dev.id, read_idle_spurr, &val, 1);
+	return sprintf(buf, "%llx\n", val);
+}
+static DEVICE_ATTR(idle_spurr, 0400, idle_spurr_show, NULL);
+
 static int register_cpu_online(unsigned int cpu)
 {
 	struct cpu *c = &per_cpu(cpu_devices, cpu);
@@ -794,10 +831,15 @@ static int register_cpu_online(unsigned int cpu)
 		if (!firmware_has_feature(FW_FEATURE_LPAR))
 			add_write_permission_dev_attr(&dev_attr_purr);
 		device_create_file(s, &dev_attr_purr);
+		if (firmware_has_feature(FW_FEATURE_LPAR))
+			device_create_file(s, &dev_attr_idle_purr);
 	}
 
-	if (cpu_has_feature(CPU_FTR_SPURR))
+	if (cpu_has_feature(CPU_FTR_SPURR)) {
 		device_create_file(s, &dev_attr_spurr);
+		if (firmware_has_feature(FW_FEATURE_LPAR))
+			device_create_file(s, &dev_attr_idle_spurr);
+	}
 
 	if (cpu_has_feature(CPU_FTR_DSCR))
 		device_create_file(s, &dev_attr_dscr);
@@ -879,11 +921,17 @@ static int unregister_cpu_online(unsigned int cpu)
 	if (cpu_has_feature(CPU_FTR_MMCRA))
 		device_remove_file(s, &dev_attr_mmcra);
 
-	if (cpu_has_feature(CPU_FTR_PURR))
+	if (cpu_has_feature(CPU_FTR_PURR)) {
 		device_remove_file(s, &dev_attr_purr);
+		if (firmware_has_feature(FW_FEATURE_LPAR))
+			device_remove_file(s, &dev_attr_idle_purr);
+	}
 
-	if (cpu_has_feature(CPU_FTR_SPURR))
+	if (cpu_has_feature(CPU_FTR_SPURR)) {
 		device_remove_file(s, &dev_attr_spurr);
+		if (firmware_has_feature(FW_FEATURE_LPAR))
+			device_remove_file(s, &dev_attr_idle_spurr);
+	}
 
 	if (cpu_has_feature(CPU_FTR_DSCR))
 		device_remove_file(s, &dev_attr_dscr);
-- 
1.9.4


^ permalink raw reply related	[flat|nested] 19+ messages in thread

* [PATCH v2 5/5] Documentation: Document sysfs interfaces purr, spurr, idle_purr, idle_spurr
  2020-02-21  5:18 [PATCH v2 0/5] Track and expose idle PURR and SPURR ticks Gautham R. Shenoy
                   ` (3 preceding siblings ...)
  2020-02-21  5:18 ` [PATCH v2 4/5] powerpc/sysfs: Show idle_purr and idle_spurr for every CPU Gautham R. Shenoy
@ 2020-02-21  5:18 ` Gautham R. Shenoy
  2020-02-21 16:55   ` Nathan Lynch
  2020-02-24  4:17 ` [PATCH v2 0/5] Track and expose idle PURR and SPURR ticks Kamalesh Babulal
  5 siblings, 1 reply; 19+ messages in thread
From: Gautham R. Shenoy @ 2020-02-21  5:18 UTC (permalink / raw)
  To: Nathan Lynch, Michael Ellerman, Vaidyanathan Srinivasan,
	Kamalesh Babulal, Naveen N. Rao, Tyrel Datwyler
  Cc: linuxppc-dev, linux-kernel, Gautham R. Shenoy

From: "Gautham R. Shenoy" <ego@linux.vnet.ibm.com>

Add documentation for the following sysfs interfaces:
/sys/devices/system/cpu/cpuX/purr
/sys/devices/system/cpu/cpuX/spurr
/sys/devices/system/cpu/cpuX/idle_purr
/sys/devices/system/cpu/cpuX/idle_spurr

Signed-off-by: Gautham R. Shenoy <ego@linux.vnet.ibm.com>
---
 Documentation/ABI/testing/sysfs-devices-system-cpu | 39 ++++++++++++++++++++++
 1 file changed, 39 insertions(+)

diff --git a/Documentation/ABI/testing/sysfs-devices-system-cpu b/Documentation/ABI/testing/sysfs-devices-system-cpu
index 2e0e3b4..799dc737a 100644
--- a/Documentation/ABI/testing/sysfs-devices-system-cpu
+++ b/Documentation/ABI/testing/sysfs-devices-system-cpu
@@ -580,3 +580,42 @@ Description:	Secure Virtual Machine
 		If 1, it means the system is using the Protected Execution
 		Facility in POWER9 and newer processors. i.e., it is a Secure
 		Virtual Machine.
+
+What: 		/sys/devices/system/cpu/cpuX/purr
+Date:		Apr 2005
+Contact:	Linux for PowerPC mailing list <linuxppc-dev@ozlabs.org>
+Description:	PURR ticks for this CPU since the system boot.
+
+		The Processor Utilization Resources Register (PURR) is
+		a 64-bit counter which provides an estimate of the
+		resources used by the CPU thread. The contents of this
+		register increases monotonically. This sysfs interface
+		exposes the number of PURR ticks for cpuX.
+
+What: 		/sys/devices/system/cpu/cpuX/spurr
+Date:		Dec 2006
+Contact:	Linux for PowerPC mailing list <linuxppc-dev@ozlabs.org>
+Description:	SPURR ticks for this CPU since the system boot.
+
+		The Scaled Processor Utilization Resources Register
+		(SPURR) is a 64-bit counter that provides a frequency
+		invariant estimate of the resources used by the CPU
+		thread. The contents of this register increases
+		monotonically. This sysfs interface exposes the number
+		of SPURR ticks for cpuX.
+
+What: 		/sys/devices/system/cpu/cpuX/idle_purr
+Date:		Nov 2019
+Contact:	Linux for PowerPC mailing list <linuxppc-dev@ozlabs.org>
+Description:	PURR ticks for cpuX when it was idle.
+
+		This sysfs interface exposes the number of PURR ticks
+		for cpuX when it was idle.
+
+What: 		/sys/devices/system/cpu/cpuX/spurr
+Date:		Nov 2019
+Contact:	Linux for PowerPC mailing list <linuxppc-dev@ozlabs.org>
+Description:	SPURR ticks for cpuX when it was idle.
+
+		This sysfs interface exposes the number of SPURR ticks
+		for cpuX when it was idle.
-- 
1.9.4


^ permalink raw reply related	[flat|nested] 19+ messages in thread

* Re: [PATCH v2 1/5] powerpc: Move idle_loop_prolog()/epilog() functions to header file
  2020-02-21  5:18 ` [PATCH v2 1/5] powerpc: Move idle_loop_prolog()/epilog() functions to header file Gautham R. Shenoy
@ 2020-02-21 15:03   ` Nathan Lynch
  2020-02-24  4:55     ` Gautham R Shenoy
  0 siblings, 1 reply; 19+ messages in thread
From: Nathan Lynch @ 2020-02-21 15:03 UTC (permalink / raw)
  To: Gautham R. Shenoy
  Cc: linuxppc-dev, linux-kernel, Michael Ellerman,
	Vaidyanathan Srinivasan, Kamalesh Babulal, Naveen N. Rao,
	Tyrel Datwyler

"Gautham R. Shenoy" <ego@linux.vnet.ibm.com> writes:

> From: "Gautham R. Shenoy" <ego@linux.vnet.ibm.com>
>
> Currently prior to entering an idle state on a Linux Guest, the
> pseries cpuidle driver implement an idle_loop_prolog() and
> idle_loop_epilog() functions which ensure that idle_purr is correctly
> computed, and the hypervisor is informed that the CPU cycles have been
> donated.
>
> These prolog and epilog functions are also required in the default
> idle call, i.e pseries_lpar_idle(). Hence move these accessor
> functions to a common header file and call them from
> pseries_lpar_idle(). Since the existing header files such as
> asm/processor.h have enough clutter, create a new header file
> asm/idle.h.
>
> Signed-off-by: Gautham R. Shenoy <ego@linux.vnet.ibm.com>
> ---
>  arch/powerpc/include/asm/idle.h        | 27 +++++++++++++++++++++++++++
>  arch/powerpc/platforms/pseries/setup.c |  7 +++++--
>  drivers/cpuidle/cpuidle-pseries.c      | 24 +-----------------------
>  3 files changed, 33 insertions(+), 25 deletions(-)
>  create mode 100644 arch/powerpc/include/asm/idle.h
>
> diff --git a/arch/powerpc/include/asm/idle.h b/arch/powerpc/include/asm/idle.h
> new file mode 100644
> index 0000000..f32a7d8
> --- /dev/null
> +++ b/arch/powerpc/include/asm/idle.h
> @@ -0,0 +1,27 @@
> +#ifndef _ASM_POWERPC_IDLE_H
> +#define _ASM_POWERPC_IDLE_H
> +#include <asm/runlatch.h>
> +
> +static inline void idle_loop_prolog(unsigned long *in_purr)
> +{
> +	ppc64_runlatch_off();
> +	*in_purr = mfspr(SPRN_PURR);
> +	/*
> +	 * Indicate to the HV that we are idle. Now would be
> +	 * a good time to find other work to dispatch.
> +	 */
> +	get_lppaca()->idle = 1;
> +}
> +
> +static inline void idle_loop_epilog(unsigned long in_purr)
> +{
> +	u64 wait_cycles;
> +
> +	wait_cycles = be64_to_cpu(get_lppaca()->wait_state_cycles);
> +	wait_cycles += mfspr(SPRN_PURR) - in_purr;
> +	get_lppaca()->wait_state_cycles = cpu_to_be64(wait_cycles);
> +	get_lppaca()->idle = 0;
> +
> +	ppc64_runlatch_on();
> +}
> +#endif

Looks fine and correct as a cleanup, but asm/include/idle.h and
idle_loop_prolog, idle_loop_epilog, strike me as too generic for
pseries-specific code.

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH v2 3/5] powerpc/pseries: Account for SPURR ticks on idle CPUs
  2020-02-21  5:18 ` [PATCH v2 3/5] powerpc/pseries: Account for SPURR ticks on idle CPUs Gautham R. Shenoy
@ 2020-02-21 16:47   ` Nathan Lynch
  2020-02-24  5:05     ` Gautham R Shenoy
  0 siblings, 1 reply; 19+ messages in thread
From: Nathan Lynch @ 2020-02-21 16:47 UTC (permalink / raw)
  To: Gautham R. Shenoy
  Cc: linuxppc-dev, linux-kernel, Michael Ellerman,
	Vaidyanathan Srinivasan, Kamalesh Babulal, Naveen N. Rao,
	Tyrel Datwyler

"Gautham R. Shenoy" <ego@linux.vnet.ibm.com> writes:
> +static inline void snapshot_spurr_idle_entry(void)
> +{
> +	*this_cpu_ptr(&idle_entry_spurr_snap) = mfspr(SPRN_SPURR);
> +}
> +

[...]

> +static inline void update_idle_spurr_accounting(void)
> +{
> +	u64 *idle_spurr_cycles_ptr = this_cpu_ptr(&idle_spurr_cycles);
> +	u64 in_spurr = *this_cpu_ptr(&idle_entry_spurr_snap);
> +
> +	*idle_spurr_cycles_ptr += mfspr(SPRN_SPURR) - in_spurr;
> +}

[...]

> +static inline u64 read_this_idle_spurr(void)
> +{
> +	/*
> +	 * If we are reading from an idle context, update the
> +	 * idle-spurr cycles corresponding to the last idle period.
> +	 * Since the idle context is not yet over, take a fresh
> +	 * snapshot of the idle-spurr.
> +	 */
> +	if (get_lppaca()->idle == 1) {
> +		update_idle_spurr_accounting();
> +		snapshot_spurr_idle_entry();

This samples spurr twice when it could do with just one. I don't know
the performance implications, but will the results be coherent?

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH v2 4/5] powerpc/sysfs: Show idle_purr and idle_spurr for every CPU
  2020-02-21  5:18 ` [PATCH v2 4/5] powerpc/sysfs: Show idle_purr and idle_spurr for every CPU Gautham R. Shenoy
@ 2020-02-21 16:50   ` Nathan Lynch
  2020-02-24  5:14     ` Gautham R Shenoy
  0 siblings, 1 reply; 19+ messages in thread
From: Nathan Lynch @ 2020-02-21 16:50 UTC (permalink / raw)
  To: Gautham R. Shenoy
  Cc: linuxppc-dev, linux-kernel, Michael Ellerman,
	Vaidyanathan Srinivasan, Kamalesh Babulal, Naveen N. Rao,
	Tyrel Datwyler

"Gautham R. Shenoy" <ego@linux.vnet.ibm.com> writes:
> diff --git a/arch/powerpc/kernel/sysfs.c b/arch/powerpc/kernel/sysfs.c
> index 80a676d..5b4b450 100644
> --- a/arch/powerpc/kernel/sysfs.c
> +++ b/arch/powerpc/kernel/sysfs.c
> @@ -19,6 +19,7 @@
>  #include <asm/smp.h>
>  #include <asm/pmc.h>
>  #include <asm/firmware.h>
> +#include <asm/idle.h>
>  #include <asm/svm.h>
>  
>  #include "cacheinfo.h"
> @@ -733,6 +734,42 @@ static void create_svm_file(void)
>  }
>  #endif /* CONFIG_PPC_SVM */
>  
> +static void read_idle_purr(void *val)
> +{
> +	u64 *ret = (u64 *)val;

No cast from void* needed.


> +
> +	*ret = read_this_idle_purr();
> +}
> +
> +static ssize_t idle_purr_show(struct device *dev,
> +			      struct device_attribute *attr, char *buf)
> +{
> +	struct cpu *cpu = container_of(dev, struct cpu, dev);
> +	u64 val;
> +
> +	smp_call_function_single(cpu->dev.id, read_idle_purr, &val, 1);
> +	return sprintf(buf, "%llx\n", val);
> +}
> +static DEVICE_ATTR(idle_purr, 0400, idle_purr_show, NULL);
> +
> +static void read_idle_spurr(void *val)
> +{
> +	u64 *ret = (u64 *)val;
> +
> +	*ret = read_this_idle_spurr();
> +}
> +
> +static ssize_t idle_spurr_show(struct device *dev,
> +			       struct device_attribute *attr, char *buf)
> +{
> +	struct cpu *cpu = container_of(dev, struct cpu, dev);
> +	u64 val;
> +
> +	smp_call_function_single(cpu->dev.id, read_idle_spurr, &val, 1);
> +	return sprintf(buf, "%llx\n", val);
> +}
> +static DEVICE_ATTR(idle_spurr, 0400, idle_spurr_show, NULL);

It's regrettable that we have to wake up potentially idle CPUs in order
to derive correct idle statistics for them, but I suppose the main user
(lparstat) of these interfaces already is causing this to happen by
polling the existing per-cpu purr and spurr attributes.

So now lparstat will incur at minimum four syscalls and four IPIs per
CPU per polling interval -- one for each of purr, spurr, idle_purr and
idle_spurr. Correct?

At some point it's going to make sense to batch sampling of remote CPUs'
SPRs.


>  static int register_cpu_online(unsigned int cpu)
>  {
>  	struct cpu *c = &per_cpu(cpu_devices, cpu);
> @@ -794,10 +831,15 @@ static int register_cpu_online(unsigned int cpu)
>  		if (!firmware_has_feature(FW_FEATURE_LPAR))
>  			add_write_permission_dev_attr(&dev_attr_purr);
>  		device_create_file(s, &dev_attr_purr);
> +		if (firmware_has_feature(FW_FEATURE_LPAR))
> +			device_create_file(s, &dev_attr_idle_purr);
>  	}
>  
> -	if (cpu_has_feature(CPU_FTR_SPURR))
> +	if (cpu_has_feature(CPU_FTR_SPURR)) {
>  		device_create_file(s, &dev_attr_spurr);
> +		if (firmware_has_feature(FW_FEATURE_LPAR))
> +			device_create_file(s, &dev_attr_idle_spurr);
> +	}
>  
>  	if (cpu_has_feature(CPU_FTR_DSCR))
>  		device_create_file(s, &dev_attr_dscr);
> @@ -879,11 +921,17 @@ static int unregister_cpu_online(unsigned int cpu)
>  	if (cpu_has_feature(CPU_FTR_MMCRA))
>  		device_remove_file(s, &dev_attr_mmcra);
>  
> -	if (cpu_has_feature(CPU_FTR_PURR))
> +	if (cpu_has_feature(CPU_FTR_PURR)) {
>  		device_remove_file(s, &dev_attr_purr);
> +		if (firmware_has_feature(FW_FEATURE_LPAR))
> +			device_remove_file(s, &dev_attr_idle_purr);
> +	}
>  
> -	if (cpu_has_feature(CPU_FTR_SPURR))
> +	if (cpu_has_feature(CPU_FTR_SPURR)) {
>  		device_remove_file(s, &dev_attr_spurr);
> +		if (firmware_has_feature(FW_FEATURE_LPAR))
> +			device_remove_file(s, &dev_attr_idle_spurr);
> +	}
>  
>  	if (cpu_has_feature(CPU_FTR_DSCR))
>  		device_remove_file(s, &dev_attr_dscr);

The cpu register/unregister stuff here looks correct.

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH v2 5/5] Documentation: Document sysfs interfaces purr, spurr, idle_purr, idle_spurr
  2020-02-21  5:18 ` [PATCH v2 5/5] Documentation: Document sysfs interfaces purr, spurr, idle_purr, idle_spurr Gautham R. Shenoy
@ 2020-02-21 16:55   ` Nathan Lynch
  2020-02-24  5:15     ` Gautham R Shenoy
  0 siblings, 1 reply; 19+ messages in thread
From: Nathan Lynch @ 2020-02-21 16:55 UTC (permalink / raw)
  To: Gautham R. Shenoy
  Cc: linuxppc-dev, linux-kernel, Michael Ellerman,
	Vaidyanathan Srinivasan, Kamalesh Babulal, Naveen N. Rao,
	Tyrel Datwyler

"Gautham R. Shenoy" <ego@linux.vnet.ibm.com> writes:
> diff --git a/Documentation/ABI/testing/sysfs-devices-system-cpu b/Documentation/ABI/testing/sysfs-devices-system-cpu
> index 2e0e3b4..799dc737a 100644
> --- a/Documentation/ABI/testing/sysfs-devices-system-cpu
> +++ b/Documentation/ABI/testing/sysfs-devices-system-cpu
> @@ -580,3 +580,42 @@ Description:	Secure Virtual Machine
>  		If 1, it means the system is using the Protected Execution
>  		Facility in POWER9 and newer processors. i.e., it is a Secure
>  		Virtual Machine.
> +
> +What: 		/sys/devices/system/cpu/cpuX/purr
> +Date:		Apr 2005
> +Contact:	Linux for PowerPC mailing list <linuxppc-dev@ozlabs.org>
> +Description:	PURR ticks for this CPU since the system boot.
> +
> +		The Processor Utilization Resources Register (PURR) is
> +		a 64-bit counter which provides an estimate of the
> +		resources used by the CPU thread. The contents of this
> +		register increases monotonically. This sysfs interface
> +		exposes the number of PURR ticks for cpuX.
> +
> +What: 		/sys/devices/system/cpu/cpuX/spurr
> +Date:		Dec 2006
> +Contact:	Linux for PowerPC mailing list <linuxppc-dev@ozlabs.org>
> +Description:	SPURR ticks for this CPU since the system boot.
> +
> +		The Scaled Processor Utilization Resources Register
> +		(SPURR) is a 64-bit counter that provides a frequency
> +		invariant estimate of the resources used by the CPU
> +		thread. The contents of this register increases
> +		monotonically. This sysfs interface exposes the number
> +		of SPURR ticks for cpuX.
> +
> +What: 		/sys/devices/system/cpu/cpuX/idle_purr
> +Date:		Nov 2019
> +Contact:	Linux for PowerPC mailing list <linuxppc-dev@ozlabs.org>
> +Description:	PURR ticks for cpuX when it was idle.
> +
> +		This sysfs interface exposes the number of PURR ticks
> +		for cpuX when it was idle.
> +
> +What: 		/sys/devices/system/cpu/cpuX/spurr

Copy-paste error? This should be:

                        /sys/devices/system/cpu/cpuX/idle_spurr

> +Date:		Nov 2019

And I suppose Nov 2019 is no longer accurate.


> +Contact:	Linux for PowerPC mailing list <linuxppc-dev@ozlabs.org>
> +Description:	SPURR ticks for cpuX when it was idle.
> +
> +		This sysfs interface exposes the number of SPURR ticks
> +		for cpuX when it was idle.
> -- 
> 1.9.4

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH v2 0/5] Track and expose idle PURR and SPURR ticks
  2020-02-21  5:18 [PATCH v2 0/5] Track and expose idle PURR and SPURR ticks Gautham R. Shenoy
                   ` (4 preceding siblings ...)
  2020-02-21  5:18 ` [PATCH v2 5/5] Documentation: Document sysfs interfaces purr, spurr, idle_purr, idle_spurr Gautham R. Shenoy
@ 2020-02-24  4:17 ` Kamalesh Babulal
  5 siblings, 0 replies; 19+ messages in thread
From: Kamalesh Babulal @ 2020-02-24  4:17 UTC (permalink / raw)
  To: Gautham R. Shenoy, Nathan Lynch, Michael Ellerman,
	Vaidyanathan Srinivasan, Naveen N. Rao, Tyrel Datwyler
  Cc: linuxppc-dev, linux-kernel

On 2/21/20 10:48 AM, Gautham R. Shenoy wrote:
> From: "Gautham R. Shenoy" <ego@linux.vnet.ibm.com>

[...]

> 
> Gautham R. Shenoy (5):
>   powerpc: Move idle_loop_prolog()/epilog() functions to header file
>   powerpc/idle: Add accessor function to always read latest idle PURR
>   powerpc/pseries: Account for SPURR ticks on idle CPUs
>   powerpc/sysfs: Show idle_purr and idle_spurr for every CPU
>   Documentation: Document sysfs interfaces purr, spurr, idle_purr,
>     idle_spurr
> 
>  Documentation/ABI/testing/sysfs-devices-system-cpu | 39 ++++++++++
>  arch/powerpc/include/asm/idle.h                    | 88 ++++++++++++++++++++++
>  arch/powerpc/kernel/sysfs.c                        | 54 ++++++++++++-
>  arch/powerpc/platforms/pseries/setup.c             |  8 +-
>  drivers/cpuidle/cpuidle-pseries.c                  | 39 ++--------
>  5 files changed, 191 insertions(+), 37 deletions(-)
>  create mode 100644 arch/powerpc/include/asm/idle.h
> 

For the whole series:

Tested-by: Kamalesh Babulal <kamalesh@linux.vnet.ibm.com>

-- 
Kamalesh


^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH v2 1/5] powerpc: Move idle_loop_prolog()/epilog() functions to header file
  2020-02-21 15:03   ` Nathan Lynch
@ 2020-02-24  4:55     ` Gautham R Shenoy
  2020-03-06 17:06       ` Nathan Lynch
  0 siblings, 1 reply; 19+ messages in thread
From: Gautham R Shenoy @ 2020-02-24  4:55 UTC (permalink / raw)
  To: Nathan Lynch
  Cc: Gautham R. Shenoy, linuxppc-dev, linux-kernel, Michael Ellerman,
	Vaidyanathan Srinivasan, Kamalesh Babulal, Naveen N. Rao,
	Tyrel Datwyler

Hello Nathan,

On Fri, Feb 21, 2020 at 09:03:16AM -0600, Nathan Lynch wrote:
> "Gautham R. Shenoy" <ego@linux.vnet.ibm.com> writes:
> 
> > From: "Gautham R. Shenoy" <ego@linux.vnet.ibm.com>
> >
> > Currently prior to entering an idle state on a Linux Guest, the
> > pseries cpuidle driver implement an idle_loop_prolog() and
> > idle_loop_epilog() functions which ensure that idle_purr is correctly
> > computed, and the hypervisor is informed that the CPU cycles have been
> > donated.
> >
> > These prolog and epilog functions are also required in the default
> > idle call, i.e pseries_lpar_idle(). Hence move these accessor
> > functions to a common header file and call them from
> > pseries_lpar_idle(). Since the existing header files such as
> > asm/processor.h have enough clutter, create a new header file
> > asm/idle.h.
> >
> > Signed-off-by: Gautham R. Shenoy <ego@linux.vnet.ibm.com>
> > ---
> >  arch/powerpc/include/asm/idle.h        | 27 +++++++++++++++++++++++++++
> >  arch/powerpc/platforms/pseries/setup.c |  7 +++++--
> >  drivers/cpuidle/cpuidle-pseries.c      | 24 +-----------------------
> >  3 files changed, 33 insertions(+), 25 deletions(-)
> >  create mode 100644 arch/powerpc/include/asm/idle.h
> >
> > diff --git a/arch/powerpc/include/asm/idle.h b/arch/powerpc/include/asm/idle.h
> > new file mode 100644
> > index 0000000..f32a7d8
> > --- /dev/null
> > +++ b/arch/powerpc/include/asm/idle.h
> > @@ -0,0 +1,27 @@
> > +#ifndef _ASM_POWERPC_IDLE_H
> > +#define _ASM_POWERPC_IDLE_H
> > +#include <asm/runlatch.h>
> > +
> > +static inline void idle_loop_prolog(unsigned long *in_purr)
> > +{
> > +	ppc64_runlatch_off();
> > +	*in_purr = mfspr(SPRN_PURR);
> > +	/*
> > +	 * Indicate to the HV that we are idle. Now would be
> > +	 * a good time to find other work to dispatch.
> > +	 */
> > +	get_lppaca()->idle = 1;
> > +}
> > +
> > +static inline void idle_loop_epilog(unsigned long in_purr)
> > +{
> > +	u64 wait_cycles;
> > +
> > +	wait_cycles = be64_to_cpu(get_lppaca()->wait_state_cycles);
> > +	wait_cycles += mfspr(SPRN_PURR) - in_purr;
> > +	get_lppaca()->wait_state_cycles = cpu_to_be64(wait_cycles);
> > +	get_lppaca()->idle = 0;
> > +
> > +	ppc64_runlatch_on();
> > +}
> > +#endif
> 
> Looks fine and correct as a cleanup, but asm/include/idle.h and
> idle_loop_prolog, idle_loop_epilog, strike me as too generic for
> pseries-specific code.

Should it be prefixed with pseries , i.e pseries_idle_prolog()
and pseries_idle_epilog() ?

Also, I am planning another round of cleanup to move all the
idle-related declaration from asm/include/processor.h to
asm/include/idle.h



^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH v2 3/5] powerpc/pseries: Account for SPURR ticks on idle CPUs
  2020-02-21 16:47   ` Nathan Lynch
@ 2020-02-24  5:05     ` Gautham R Shenoy
  0 siblings, 0 replies; 19+ messages in thread
From: Gautham R Shenoy @ 2020-02-24  5:05 UTC (permalink / raw)
  To: Nathan Lynch
  Cc: Gautham R. Shenoy, linuxppc-dev, linux-kernel, Michael Ellerman,
	Vaidyanathan Srinivasan, Kamalesh Babulal, Naveen N. Rao,
	Tyrel Datwyler

Hello Nathan,

On Fri, Feb 21, 2020 at 10:47:41AM -0600, Nathan Lynch wrote:
> "Gautham R. Shenoy" <ego@linux.vnet.ibm.com> writes:
> > +static inline void snapshot_spurr_idle_entry(void)
> > +{
> > +	*this_cpu_ptr(&idle_entry_spurr_snap) = mfspr(SPRN_SPURR);
> > +}
> > +
> 
> [...]
> 
> > +static inline void update_idle_spurr_accounting(void)
> > +{
> > +	u64 *idle_spurr_cycles_ptr = this_cpu_ptr(&idle_spurr_cycles);
> > +	u64 in_spurr = *this_cpu_ptr(&idle_entry_spurr_snap);
> > +
> > +	*idle_spurr_cycles_ptr += mfspr(SPRN_SPURR) - in_spurr;
> > +}
> 
> [...]
> 
> > +static inline u64 read_this_idle_spurr(void)
> > +{
> > +	/*
> > +	 * If we are reading from an idle context, update the
> > +	 * idle-spurr cycles corresponding to the last idle period.
> > +	 * Since the idle context is not yet over, take a fresh
> > +	 * snapshot of the idle-spurr.
> > +	 */
> > +	if (get_lppaca()->idle == 1) {
> > +		update_idle_spurr_accounting();
> > +		snapshot_spurr_idle_entry();
> 
> This samples spurr twice when it could do with just one. I don't know
> the performance implications, but will the results be coherent?


We would have taken the snapshot in idle_loop_prolog() just before
entering idle. That fact that the "if" condition is true above in
read_this_idle_spurr() implies that we are reading the idle_spurr
value from an interrupt context and since get_lppaca()->idle == 1, we
haven't yet called idle_loop_epilog(), where we would have updated the
idle_spurr ticks for the last idle period.

Hence, in this function, we first update the idle_spurr accounting
from the time of the last snapshot to now. We update the snapshot to
the current SPURR value so that when we eventually call
idle_loop_epilog(), we will account for the remaining idle duration,
i.e from the read_this_idle_spurr() call to idle_loop_epilog()

The results are therefore coherant, in that we do not perform double
accounting the second time we invoke update_idle_spurr_accounting()
from idle_loop_epilog(), but only add the spurr ticks from
read_this_idle_spurr() to idle_loop_epilog().

--
Thanks and Regards
gautham.

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH v2 4/5] powerpc/sysfs: Show idle_purr and idle_spurr for every CPU
  2020-02-21 16:50   ` Nathan Lynch
@ 2020-02-24  5:14     ` Gautham R Shenoy
  2020-02-25 10:20       ` Naveen N. Rao
  0 siblings, 1 reply; 19+ messages in thread
From: Gautham R Shenoy @ 2020-02-24  5:14 UTC (permalink / raw)
  To: Nathan Lynch
  Cc: Gautham R. Shenoy, linuxppc-dev, linux-kernel, Michael Ellerman,
	Vaidyanathan Srinivasan, Kamalesh Babulal, Naveen N. Rao,
	Tyrel Datwyler

On Fri, Feb 21, 2020 at 10:50:12AM -0600, Nathan Lynch wrote:
> "Gautham R. Shenoy" <ego@linux.vnet.ibm.com> writes:
> > diff --git a/arch/powerpc/kernel/sysfs.c b/arch/powerpc/kernel/sysfs.c
> > index 80a676d..5b4b450 100644
> > --- a/arch/powerpc/kernel/sysfs.c
> > +++ b/arch/powerpc/kernel/sysfs.c
> > @@ -19,6 +19,7 @@
> >  #include <asm/smp.h>
> >  #include <asm/pmc.h>
> >  #include <asm/firmware.h>
> > +#include <asm/idle.h>
> >  #include <asm/svm.h>
> >  
> >  #include "cacheinfo.h"
> > @@ -733,6 +734,42 @@ static void create_svm_file(void)
> >  }
> >  #endif /* CONFIG_PPC_SVM */
> >  
> > +static void read_idle_purr(void *val)
> > +{
> > +	u64 *ret = (u64 *)val;
> 
> No cast from void* needed.

Will fix this. Thanks.

> 
> 
> > +
> > +	*ret = read_this_idle_purr();
> > +}
> > +
> > +static ssize_t idle_purr_show(struct device *dev,
> > +			      struct device_attribute *attr, char *buf)
> > +{
> > +	struct cpu *cpu = container_of(dev, struct cpu, dev);
> > +	u64 val;
> > +
> > +	smp_call_function_single(cpu->dev.id, read_idle_purr, &val, 1);
> > +	return sprintf(buf, "%llx\n", val);
> > +}
> > +static DEVICE_ATTR(idle_purr, 0400, idle_purr_show, NULL);
> > +
> > +static void read_idle_spurr(void *val)
> > +{
> > +	u64 *ret = (u64 *)val;
> > +
> > +	*ret = read_this_idle_spurr();
> > +}
> > +
> > +static ssize_t idle_spurr_show(struct device *dev,
> > +			       struct device_attribute *attr, char *buf)
> > +{
> > +	struct cpu *cpu = container_of(dev, struct cpu, dev);
> > +	u64 val;
> > +
> > +	smp_call_function_single(cpu->dev.id, read_idle_spurr, &val, 1);
> > +	return sprintf(buf, "%llx\n", val);
> > +}
> > +static DEVICE_ATTR(idle_spurr, 0400, idle_spurr_show, NULL);
> 
> It's regrettable that we have to wake up potentially idle CPUs in order
> to derive correct idle statistics for them, but I suppose the main user
> (lparstat) of these interfaces already is causing this to happen by
> polling the existing per-cpu purr and spurr attributes.
> 
> So now lparstat will incur at minimum four syscalls and four IPIs per
> CPU per polling interval -- one for each of purr, spurr, idle_purr and
> idle_spurr. Correct?

Yes, it is unforunate that we will end up making four syscalls and
generating IPI noise, and this is something that I discussed with
Naveen and Kamalesh. We have the following two constraints:

1) These values of PURR and SPURR required are per-cpu. Hence putting
them in lparcfg is not an option.

2) sysfs semantics encourages a single value per key, the key being
the sysfs-file. Something like the following would have made far more
sense.

cat /sys/devices/system/cpu/cpuX/purr_spurr_accounting
purr:A
idle_purr:B
spurr:C
idle_spurr:D

There are some sysfs files which allow something like this. Eg: 
/sys/devices/system/cpu/cpu0/cpufreq/stats/time_in_state

Thoughts on any other alternatives?


> 
> At some point it's going to make sense to batch sampling of remote CPUs'
> SPRs.
> 
> 
> >  static int register_cpu_online(unsigned int cpu)
> >  {
> >  	struct cpu *c = &per_cpu(cpu_devices, cpu);
> > @@ -794,10 +831,15 @@ static int register_cpu_online(unsigned int cpu)
> >  		if (!firmware_has_feature(FW_FEATURE_LPAR))
> >  			add_write_permission_dev_attr(&dev_attr_purr);
> >  		device_create_file(s, &dev_attr_purr);
> > +		if (firmware_has_feature(FW_FEATURE_LPAR))
> > +			device_create_file(s, &dev_attr_idle_purr);
> >  	}
> >  
> > -	if (cpu_has_feature(CPU_FTR_SPURR))
> > +	if (cpu_has_feature(CPU_FTR_SPURR)) {
> >  		device_create_file(s, &dev_attr_spurr);
> > +		if (firmware_has_feature(FW_FEATURE_LPAR))
> > +			device_create_file(s, &dev_attr_idle_spurr);
> > +	}
> >  
> >  	if (cpu_has_feature(CPU_FTR_DSCR))
> >  		device_create_file(s, &dev_attr_dscr);
> > @@ -879,11 +921,17 @@ static int unregister_cpu_online(unsigned int cpu)
> >  	if (cpu_has_feature(CPU_FTR_MMCRA))
> >  		device_remove_file(s, &dev_attr_mmcra);
> >  
> > -	if (cpu_has_feature(CPU_FTR_PURR))
> > +	if (cpu_has_feature(CPU_FTR_PURR)) {
> >  		device_remove_file(s, &dev_attr_purr);
> > +		if (firmware_has_feature(FW_FEATURE_LPAR))
> > +			device_remove_file(s, &dev_attr_idle_purr);
> > +	}
> >  
> > -	if (cpu_has_feature(CPU_FTR_SPURR))
> > +	if (cpu_has_feature(CPU_FTR_SPURR)) {
> >  		device_remove_file(s, &dev_attr_spurr);
> > +		if (firmware_has_feature(FW_FEATURE_LPAR))
> > +			device_remove_file(s, &dev_attr_idle_spurr);
> > +	}
> >  
> >  	if (cpu_has_feature(CPU_FTR_DSCR))
> >  		device_remove_file(s, &dev_attr_dscr);
> 
> The cpu register/unregister stuff here looks correct.

Thanks for reviewing the patch.

--
Thanks and Regards
gautham.

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH v2 5/5] Documentation: Document sysfs interfaces purr, spurr, idle_purr, idle_spurr
  2020-02-21 16:55   ` Nathan Lynch
@ 2020-02-24  5:15     ` Gautham R Shenoy
  0 siblings, 0 replies; 19+ messages in thread
From: Gautham R Shenoy @ 2020-02-24  5:15 UTC (permalink / raw)
  To: Nathan Lynch
  Cc: Gautham R. Shenoy, linuxppc-dev, linux-kernel, Michael Ellerman,
	Vaidyanathan Srinivasan, Kamalesh Babulal, Naveen N. Rao,
	Tyrel Datwyler

On Fri, Feb 21, 2020 at 10:55:07AM -0600, Nathan Lynch wrote:
> "Gautham R. Shenoy" <ego@linux.vnet.ibm.com> writes:
> > diff --git a/Documentation/ABI/testing/sysfs-devices-system-cpu b/Documentation/ABI/testing/sysfs-devices-system-cpu
> > index 2e0e3b4..799dc737a 100644
> > --- a/Documentation/ABI/testing/sysfs-devices-system-cpu
> > +++ b/Documentation/ABI/testing/sysfs-devices-system-cpu
> > @@ -580,3 +580,42 @@ Description:	Secure Virtual Machine
> >  		If 1, it means the system is using the Protected Execution
> >  		Facility in POWER9 and newer processors. i.e., it is a Secure
> >  		Virtual Machine.
> > +
> > +What: 		/sys/devices/system/cpu/cpuX/purr
> > +Date:		Apr 2005
> > +Contact:	Linux for PowerPC mailing list <linuxppc-dev@ozlabs.org>
> > +Description:	PURR ticks for this CPU since the system boot.
> > +
> > +		The Processor Utilization Resources Register (PURR) is
> > +		a 64-bit counter which provides an estimate of the
> > +		resources used by the CPU thread. The contents of this
> > +		register increases monotonically. This sysfs interface
> > +		exposes the number of PURR ticks for cpuX.
> > +
> > +What: 		/sys/devices/system/cpu/cpuX/spurr
> > +Date:		Dec 2006
> > +Contact:	Linux for PowerPC mailing list <linuxppc-dev@ozlabs.org>
> > +Description:	SPURR ticks for this CPU since the system boot.
> > +
> > +		The Scaled Processor Utilization Resources Register
> > +		(SPURR) is a 64-bit counter that provides a frequency
> > +		invariant estimate of the resources used by the CPU
> > +		thread. The contents of this register increases
> > +		monotonically. This sysfs interface exposes the number
> > +		of SPURR ticks for cpuX.
> > +
> > +What: 		/sys/devices/system/cpu/cpuX/idle_purr
> > +Date:		Nov 2019
> > +Contact:	Linux for PowerPC mailing list <linuxppc-dev@ozlabs.org>
> > +Description:	PURR ticks for cpuX when it was idle.
> > +
> > +		This sysfs interface exposes the number of PURR ticks
> > +		for cpuX when it was idle.
> > +
> > +What: 		/sys/devices/system/cpu/cpuX/spurr
> 
> Copy-paste error? This should be:

Yes, this should have been idle_spurr. Will fix it in the next
version.

> 
>                         /sys/devices/system/cpu/cpuX/idle_spurr
> 
> > +Date:		Nov 2019
> 
> And I suppose Nov 2019 is no longer accurate.

My bad. I will resend this with the updated date.


> 
> 
> > +Contact:	Linux for PowerPC mailing list <linuxppc-dev@ozlabs.org>
> > +Description:	SPURR ticks for cpuX when it was idle.
> > +
> > +		This sysfs interface exposes the number of SPURR ticks
> > +		for cpuX when it was idle.
> > -- 
> > 1.9.4

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH v2 4/5] powerpc/sysfs: Show idle_purr and idle_spurr for every CPU
  2020-02-24  5:14     ` Gautham R Shenoy
@ 2020-02-25 10:20       ` Naveen N. Rao
  2020-03-06 17:03         ` Nathan Lynch
  0 siblings, 1 reply; 19+ messages in thread
From: Naveen N. Rao @ 2020-02-25 10:20 UTC (permalink / raw)
  To: ego, Nathan Lynch
  Cc: Kamalesh Babulal, linux-kernel, linuxppc-dev, Michael Ellerman,
	Vaidyanathan Srinivasan, Tyrel Datwyler

Gautham R Shenoy wrote:
> On Fri, Feb 21, 2020 at 10:50:12AM -0600, Nathan Lynch wrote:
>> "Gautham R. Shenoy" <ego@linux.vnet.ibm.com> writes:
>> > diff --git a/arch/powerpc/kernel/sysfs.c b/arch/powerpc/kernel/sysfs.c
>> > index 80a676d..5b4b450 100644
>> > --- a/arch/powerpc/kernel/sysfs.c
>> > +++ b/arch/powerpc/kernel/sysfs.c
>> > @@ -19,6 +19,7 @@
>> >  #include <asm/smp.h>
>> >  #include <asm/pmc.h>
>> >  #include <asm/firmware.h>
>> > +#include <asm/idle.h>
>> >  #include <asm/svm.h>
>> >  
>> >  #include "cacheinfo.h"
>> > @@ -733,6 +734,42 @@ static void create_svm_file(void)
>> >  }
>> >  #endif /* CONFIG_PPC_SVM */
>> >  
>> > +static void read_idle_purr(void *val)
>> > +{
>> > +	u64 *ret = (u64 *)val;
>> 
>> No cast from void* needed.
> 
> Will fix this. Thanks.
> 
>> 
>> 
>> > +
>> > +	*ret = read_this_idle_purr();
>> > +}
>> > +
>> > +static ssize_t idle_purr_show(struct device *dev,
>> > +			      struct device_attribute *attr, char *buf)
>> > +{
>> > +	struct cpu *cpu = container_of(dev, struct cpu, dev);
>> > +	u64 val;
>> > +
>> > +	smp_call_function_single(cpu->dev.id, read_idle_purr, &val, 1);
>> > +	return sprintf(buf, "%llx\n", val);
>> > +}
>> > +static DEVICE_ATTR(idle_purr, 0400, idle_purr_show, NULL);
>> > +
>> > +static void read_idle_spurr(void *val)
>> > +{
>> > +	u64 *ret = (u64 *)val;
>> > +
>> > +	*ret = read_this_idle_spurr();
>> > +}
>> > +
>> > +static ssize_t idle_spurr_show(struct device *dev,
>> > +			       struct device_attribute *attr, char *buf)
>> > +{
>> > +	struct cpu *cpu = container_of(dev, struct cpu, dev);
>> > +	u64 val;
>> > +
>> > +	smp_call_function_single(cpu->dev.id, read_idle_spurr, &val, 1);
>> > +	return sprintf(buf, "%llx\n", val);
>> > +}
>> > +static DEVICE_ATTR(idle_spurr, 0400, idle_spurr_show, NULL);
>> 
>> It's regrettable that we have to wake up potentially idle CPUs in order
>> to derive correct idle statistics for them, but I suppose the main user
>> (lparstat) of these interfaces already is causing this to happen by
>> polling the existing per-cpu purr and spurr attributes.
>> 
>> So now lparstat will incur at minimum four syscalls and four IPIs per
>> CPU per polling interval -- one for each of purr, spurr, idle_purr and
>> idle_spurr. Correct?
> 
> Yes, it is unforunate that we will end up making four syscalls and
> generating IPI noise, and this is something that I discussed with
> Naveen and Kamalesh. We have the following two constraints:
> 
> 1) These values of PURR and SPURR required are per-cpu. Hence putting
> them in lparcfg is not an option.
> 
> 2) sysfs semantics encourages a single value per key, the key being
> the sysfs-file. Something like the following would have made far more
> sense.
> 
> cat /sys/devices/system/cpu/cpuX/purr_spurr_accounting
> purr:A
> idle_purr:B
> spurr:C
> idle_spurr:D
> 
> There are some sysfs files which allow something like this. Eg: 
> /sys/devices/system/cpu/cpu0/cpufreq/stats/time_in_state
> 
> Thoughts on any other alternatives?

Umm... procfs?
/me ducks

> 
> 
>> 
>> At some point it's going to make sense to batch sampling of remote CPUs'
>> SPRs.

How did you mean this? It looks like we first need to provide a separate 
user interface, since with the existing sysfs interface providing 
separate files, I am not sure if we can batch such reads.


- Naveen


^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH v2 4/5] powerpc/sysfs: Show idle_purr and idle_spurr for every CPU
  2020-02-25 10:20       ` Naveen N. Rao
@ 2020-03-06 17:03         ` Nathan Lynch
  2020-03-06 17:35           ` Naveen N. Rao
  0 siblings, 1 reply; 19+ messages in thread
From: Nathan Lynch @ 2020-03-06 17:03 UTC (permalink / raw)
  To: Naveen N. Rao
  Cc: Kamalesh Babulal, linux-kernel, linuxppc-dev, Michael Ellerman,
	Vaidyanathan Srinivasan, Tyrel Datwyler, ego

"Naveen N. Rao" <naveen.n.rao@linux.vnet.ibm.com> writes:
> Gautham R Shenoy wrote:
>> On Fri, Feb 21, 2020 at 10:50:12AM -0600, Nathan Lynch wrote:
>>> It's regrettable that we have to wake up potentially idle CPUs in order
>>> to derive correct idle statistics for them, but I suppose the main user
>>> (lparstat) of these interfaces already is causing this to happen by
>>> polling the existing per-cpu purr and spurr attributes.
>>> 
>>> So now lparstat will incur at minimum four syscalls and four IPIs per
>>> CPU per polling interval -- one for each of purr, spurr, idle_purr and
>>> idle_spurr. Correct?
>> 
>> Yes, it is unforunate that we will end up making four syscalls and
>> generating IPI noise, and this is something that I discussed with
>> Naveen and Kamalesh. We have the following two constraints:
>> 
>> 1) These values of PURR and SPURR required are per-cpu. Hence putting
>> them in lparcfg is not an option.
>> 
>> 2) sysfs semantics encourages a single value per key, the key being
>> the sysfs-file. Something like the following would have made far more
>> sense.
>> 
>> cat /sys/devices/system/cpu/cpuX/purr_spurr_accounting
>> purr:A
>> idle_purr:B
>> spurr:C
>> idle_spurr:D
>> 
>> There are some sysfs files which allow something like this. Eg: 
>> /sys/devices/system/cpu/cpu0/cpufreq/stats/time_in_state
>> 
>> Thoughts on any other alternatives?
>
> Umm... procfs?
> /me ducks

I had wondered about perf events but I'm not sure that's any more suitable.


>>> At some point it's going to make sense to batch sampling of remote CPUs'
>>> SPRs.
>
> How did you mean this? It looks like we first need to provide a separate 
> user interface, since with the existing sysfs interface providing 
> separate files, I am not sure if we can batch such reads.

I mean in order to minimize IPI traffic something like: sample/calculate
all of a CPU's purr, idle_purr, spurr, idle_spurr in a single IPI upon a
read of any of the attributes, and cache the result for some time, so
that the anticipated subsequent reads of the other attributes use the
cached results instead of generating more IPIs.

That would keep the current sysfs interface at the cost of imposing a
certain coarseness in the results.

Anyway, that's a mitigation that could be considered if the
implementation in this patch is found to be too expensive in practice.

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH v2 1/5] powerpc: Move idle_loop_prolog()/epilog() functions to header file
  2020-02-24  4:55     ` Gautham R Shenoy
@ 2020-03-06 17:06       ` Nathan Lynch
  0 siblings, 0 replies; 19+ messages in thread
From: Nathan Lynch @ 2020-03-06 17:06 UTC (permalink / raw)
  To: Gautham R Shenoy
  Cc: linuxppc-dev, linux-kernel, Michael Ellerman,
	Vaidyanathan Srinivasan, Kamalesh Babulal, Naveen N. Rao,
	Tyrel Datwyler

Gautham R Shenoy <ego@linux.vnet.ibm.com> writes:
> On Fri, Feb 21, 2020 at 09:03:16AM -0600, Nathan Lynch wrote:
>> Looks fine and correct as a cleanup, but asm/include/idle.h and
>> idle_loop_prolog, idle_loop_epilog, strike me as too generic for
>> pseries-specific code.
>
> Should it be prefixed with pseries , i.e pseries_idle_prolog()
> and pseries_idle_epilog() ?

Yes that seems appropriate to me.


^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH v2 4/5] powerpc/sysfs: Show idle_purr and idle_spurr for every CPU
  2020-03-06 17:03         ` Nathan Lynch
@ 2020-03-06 17:35           ` Naveen N. Rao
  0 siblings, 0 replies; 19+ messages in thread
From: Naveen N. Rao @ 2020-03-06 17:35 UTC (permalink / raw)
  To: Nathan Lynch
  Cc: ego, Kamalesh Babulal, linux-kernel, linuxppc-dev,
	Michael Ellerman, Vaidyanathan Srinivasan, Tyrel Datwyler

Nathan Lynch wrote:
> "Naveen N. Rao" <naveen.n.rao@linux.vnet.ibm.com> writes:
>> Gautham R Shenoy wrote:
>>> On Fri, Feb 21, 2020 at 10:50:12AM -0600, Nathan Lynch wrote:
>>>> It's regrettable that we have to wake up potentially idle CPUs in order
>>>> to derive correct idle statistics for them, but I suppose the main user
>>>> (lparstat) of these interfaces already is causing this to happen by
>>>> polling the existing per-cpu purr and spurr attributes.
>>>> 
>>>> So now lparstat will incur at minimum four syscalls and four IPIs per
>>>> CPU per polling interval -- one for each of purr, spurr, idle_purr and
>>>> idle_spurr. Correct?
>>> 
>>> Yes, it is unforunate that we will end up making four syscalls and
>>> generating IPI noise, and this is something that I discussed with
>>> Naveen and Kamalesh. We have the following two constraints:
>>> 
>>> 1) These values of PURR and SPURR required are per-cpu. Hence putting
>>> them in lparcfg is not an option.
>>> 
>>> 2) sysfs semantics encourages a single value per key, the key being
>>> the sysfs-file. Something like the following would have made far more
>>> sense.
>>> 
>>> cat /sys/devices/system/cpu/cpuX/purr_spurr_accounting
>>> purr:A
>>> idle_purr:B
>>> spurr:C
>>> idle_spurr:D
>>> 
>>> There are some sysfs files which allow something like this. Eg: 
>>> /sys/devices/system/cpu/cpu0/cpufreq/stats/time_in_state
>>> 
>>> Thoughts on any other alternatives?
>>
>> Umm... procfs?
>> /me ducks
> 
> I had wondered about perf events but I'm not sure that's any more suitable.

Yes, we considered that, but it looks like the event reads are not 
"batched" in any manner. So, the IPI overhead will be similar.

> 
>>>> At some point it's going to make sense to batch sampling of remote CPUs'
>>>> SPRs.
>>
>> How did you mean this? It looks like we first need to provide a separate 
>> user interface, since with the existing sysfs interface providing 
>> separate files, I am not sure if we can batch such reads.
> 
> I mean in order to minimize IPI traffic something like: sample/calculate
> all of a CPU's purr, idle_purr, spurr, idle_spurr in a single IPI upon a
> read of any of the attributes, and cache the result for some time, so
> that the anticipated subsequent reads of the other attributes use the
> cached results instead of generating more IPIs.
> 
> That would keep the current sysfs interface at the cost of imposing a
> certain coarseness in the results.

Thanks for clarifying, that makes sense. Though we need to be careful in 
ensuring the sysfs semantics work as expected.

> 
> Anyway, that's a mitigation that could be considered if the
> implementation in this patch is found to be too expensive in practice.

That's a good point. We can optimize later if this turns out to be a 
problem in practice, if we end up using this approach.


- Naveen


^ permalink raw reply	[flat|nested] 19+ messages in thread

end of thread, other threads:[~2020-03-06 17:39 UTC | newest]

Thread overview: 19+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2020-02-21  5:18 [PATCH v2 0/5] Track and expose idle PURR and SPURR ticks Gautham R. Shenoy
2020-02-21  5:18 ` [PATCH v2 1/5] powerpc: Move idle_loop_prolog()/epilog() functions to header file Gautham R. Shenoy
2020-02-21 15:03   ` Nathan Lynch
2020-02-24  4:55     ` Gautham R Shenoy
2020-03-06 17:06       ` Nathan Lynch
2020-02-21  5:18 ` [PATCH v2 2/5] powerpc/idle: Add accessor function to always read latest idle PURR Gautham R. Shenoy
2020-02-21  5:18 ` [PATCH v2 3/5] powerpc/pseries: Account for SPURR ticks on idle CPUs Gautham R. Shenoy
2020-02-21 16:47   ` Nathan Lynch
2020-02-24  5:05     ` Gautham R Shenoy
2020-02-21  5:18 ` [PATCH v2 4/5] powerpc/sysfs: Show idle_purr and idle_spurr for every CPU Gautham R. Shenoy
2020-02-21 16:50   ` Nathan Lynch
2020-02-24  5:14     ` Gautham R Shenoy
2020-02-25 10:20       ` Naveen N. Rao
2020-03-06 17:03         ` Nathan Lynch
2020-03-06 17:35           ` Naveen N. Rao
2020-02-21  5:18 ` [PATCH v2 5/5] Documentation: Document sysfs interfaces purr, spurr, idle_purr, idle_spurr Gautham R. Shenoy
2020-02-21 16:55   ` Nathan Lynch
2020-02-24  5:15     ` Gautham R Shenoy
2020-02-24  4:17 ` [PATCH v2 0/5] Track and expose idle PURR and SPURR ticks Kamalesh Babulal

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).