linuxppc-dev.lists.ozlabs.org archive mirror
* [PATCH v2 0/4] powerpc: stolen time accounting for VIRT_CPU_ACCOUNTING_GEN
@ 2022-09-02  8:53 Nicholas Piggin
  2022-09-02  8:53 ` [PATCH v2 1/4] powerpc/pseries: Add wait interval counter definitions to struct lppaca Nicholas Piggin
                   ` (4 more replies)
  0 siblings, 5 replies; 7+ messages in thread
From: Nicholas Piggin @ 2022-09-02  8:53 UTC (permalink / raw)
  To: linuxppc-dev; +Cc: Nicholas Piggin

pseries provides stolen time accounting when VIRT_CPU_ACCOUNTING_NATIVE
is selected, but not when VIRT_CPU_ACCOUNTING_GEN is. We like GEN
because it's less code in arch/powerpc, allows full nohz, and distros
have moved to it, so this series adds stolen time accounting for GEN,
and moves our pseries configs over to it.

Thanks,
Nick

Since v1:
- Move the KVM patches out of this series to make it smaller.
  I'll post them separately.
- Fix compilation bug in patch 2 due to missing header in patch.
- Add defconfig changes to patch 3.
- Add tidy up patch 4.
- Improve changelogs.

Nicholas Piggin (4):
  powerpc/pseries: Add wait interval counter definitions to struct
    lppaca
  powerpc/pseries: Implement CONFIG_PARAVIRT_TIME_ACCOUNTING
  powerpc/64: Remove PPC64 special case for cputime accounting default
  powerpc/pseries: Move dtl scanning and steal time accounting to
    pseries platform

 .../admin-guide/kernel-parameters.txt         |  6 +-
 arch/powerpc/configs/ppc64_defconfig          |  2 +
 arch/powerpc/configs/pseries_defconfig        |  2 +
 arch/powerpc/include/asm/cputime.h            |  2 +-
 arch/powerpc/include/asm/dtl.h                |  8 --
 arch/powerpc/include/asm/lppaca.h             | 10 +-
 arch/powerpc/include/asm/paravirt.h           | 12 +++
 arch/powerpc/include/asm/paravirt_api_clock.h |  1 +
 arch/powerpc/include/asm/time.h               |  5 +-
 arch/powerpc/kernel/time.c                    | 92 +------------------
 arch/powerpc/platforms/pseries/Kconfig        |  8 ++
 arch/powerpc/platforms/pseries/dtl.c          | 81 ++++++++++++++++
 arch/powerpc/platforms/pseries/lpar.c         | 11 +++
 arch/powerpc/platforms/pseries/setup.c        | 19 ++++
 init/Kconfig                                  |  3 +-
 15 files changed, 156 insertions(+), 106 deletions(-)
 create mode 100644 arch/powerpc/include/asm/paravirt_api_clock.h

-- 
2.37.2



* [PATCH v2 1/4] powerpc/pseries: Add wait interval counter definitions to struct lppaca
  2022-09-02  8:53 [PATCH v2 0/4] powerpc: stolen time accounting for VIRT_CPU_ACCOUNTING_GEN Nicholas Piggin
@ 2022-09-02  8:53 ` Nicholas Piggin
  2022-09-02  8:53 ` [PATCH v2 2/4] powerpc/pseries: Implement CONFIG_PARAVIRT_TIME_ACCOUNTING Nicholas Piggin
                   ` (3 subsequent siblings)
  4 siblings, 0 replies; 7+ messages in thread
From: Nicholas Piggin @ 2022-09-02  8:53 UTC (permalink / raw)
  To: linuxppc-dev; +Cc: Nicholas Piggin, Fabiano Rosas

The hypervisor exposes accumulated partition scheduling interval times
in the VPA (lppaca). These can be used to implement a simple stolen time
in the guest without complex and costly dtl scanning.
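
As a rough check of the layout change, the sketch below uses cut-down
stand-in structs (old_tail and new_tail are hypothetical names, not kernel
types, and plain uint64_t stands in for volatile __be64). It only shows
that the three new counters plus the remaining reserved bytes take exactly
the space of the old 104-byte reserved area, so later fields do not move:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the reserved region before the patch. */
struct old_tail {
	uint8_t reserved10[104];
};

/* Hypothetical stand-in for the same region after the patch. */
struct new_tail {
	uint8_t  reserved10[64];	/* [S]PURR expropriated/donated */
	uint64_t enqueue_dispatch_tb;	/* total TB enqueue->dispatch */
	uint64_t ready_enqueue_tb;	/* total TB ready->enqueue */
	uint64_t wait_ready_tb;		/* total TB wait->ready */
	uint8_t  reserved11[16];
};

/* 64 + 3 * 8 + 16 == 104, so the region keeps its size. */
static_assert(sizeof(struct new_tail) == sizeof(struct old_tail),
	      "new fields must not change the size of the region");
static_assert(offsetof(struct new_tail, enqueue_dispatch_tb) == 64,
	      "first counter starts right after the 64 reserved bytes");
```

The offsets here are relative to the start of the stand-in struct only;
they say nothing about absolute offsets within the real struct lppaca.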

Reviewed-by: Fabiano Rosas <farosas@linux.ibm.com>
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
---
 arch/powerpc/include/asm/lppaca.h | 10 +++++++---
 1 file changed, 7 insertions(+), 3 deletions(-)

diff --git a/arch/powerpc/include/asm/lppaca.h b/arch/powerpc/include/asm/lppaca.h
index c390ec377bae..34d44cb17c87 100644
--- a/arch/powerpc/include/asm/lppaca.h
+++ b/arch/powerpc/include/asm/lppaca.h
@@ -104,14 +104,18 @@ struct lppaca {
 	volatile __be32 dispersion_count; /* dispatch changed physical cpu */
 	volatile __be64 cmo_faults;	/* CMO page fault count */
 	volatile __be64 cmo_fault_time;	/* CMO page fault time */
-	u8	reserved10[104];
+	u8	reserved10[64];		/* [S]PURR expropriated/donated */
+	volatile __be64 enqueue_dispatch_tb; /* Total TB enqueue->dispatch */
+	volatile __be64 ready_enqueue_tb; /* Total TB ready->enqueue */
+	volatile __be64 wait_ready_tb;	/* Total TB wait->ready */
+	u8	reserved11[16];
 
 	/* cacheline 4-5 */
 
 	__be32	page_ins;		/* CMO Hint - # page ins by OS */
-	u8	reserved11[148];
+	u8	reserved12[148];
 	volatile __be64 dtl_idx;	/* Dispatch Trace Log head index */
-	u8	reserved12[96];
+	u8	reserved13[96];
 } ____cacheline_aligned;
 
 #define lppaca_of(cpu)	(*paca_ptrs[cpu]->lppaca_ptr)
-- 
2.37.2



* [PATCH v2 2/4] powerpc/pseries: Implement CONFIG_PARAVIRT_TIME_ACCOUNTING
  2022-09-02  8:53 [PATCH v2 0/4] powerpc: stolen time accounting for VIRT_CPU_ACCOUNTING_GEN Nicholas Piggin
  2022-09-02  8:53 ` [PATCH v2 1/4] powerpc/pseries: Add wait interval counter definitions to struct lppaca Nicholas Piggin
@ 2022-09-02  8:53 ` Nicholas Piggin
  2022-09-02  8:53 ` [PATCH v2 3/4] powerpc/64: Remove PPC64 special case for cputime accounting default Nicholas Piggin
                   ` (2 subsequent siblings)
  4 siblings, 0 replies; 7+ messages in thread
From: Nicholas Piggin @ 2022-09-02  8:53 UTC (permalink / raw)
  To: linuxppc-dev; +Cc: Shrikanth Hegde, Nicholas Piggin

CONFIG_VIRT_CPU_ACCOUNTING_GEN under pseries does not provide stolen
time accounting unless CONFIG_PARAVIRT_TIME_ACCOUNTING is enabled.
Implement this using the VPA accumulated wait counters.

Note this will not work on current KVM hosts because KVM does not
implement the VPA dispatch counters (yet). It could be implemented
with the dispatch trace log as it is for VIRT_CPU_ACCOUNTING_NATIVE,
but that is not necessary for the more limited accounting provided
by PARAVIRT_TIME_ACCOUNTING, and it is more expensive, complex, and
has downsides like potential log wrap.
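
To illustrate the idea outside the kernel, here is a minimal userspace
sketch, assuming a cut-down stand-in for the VPA (struct vpa_counters,
load_be64(), steal_clock_tb() and steal_delta_tb() are all hypothetical
names, not kernel API). The running steal total is the sum of the two
big-endian wait counters, in timebase ticks, and a consumer takes the
difference between two samples:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical, cut-down stand-in for the two VPA counters the patch reads.
 * Each counter is kept as raw bytes so the big-endian layout is explicit.
 */
struct vpa_counters {
	uint8_t enqueue_dispatch_tb[8];	/* total TB enqueue->dispatch */
	uint8_t ready_enqueue_tb[8];	/* total TB ready->enqueue */
};

/* Decode a 64-bit big-endian value, independent of host endianness. */
uint64_t load_be64(const volatile uint8_t *p)
{
	uint64_t v = 0;
	size_t i;

	for (i = 0; i < 8; i++)
		v = (v << 8) | p[i];
	return v;
}

/* Running steal total in timebase ticks: sum of the two wait counters. */
uint64_t steal_clock_tb(const volatile struct vpa_counters *vpa)
{
	assert(vpa != NULL);
	return load_be64(vpa->enqueue_dispatch_tb) +
		load_be64(vpa->ready_enqueue_tb);
}

/* Steal over an interval: difference of two samples (unsigned wrap is safe). */
uint64_t steal_delta_tb(uint64_t prev, uint64_t now)
{
	return now - prev;
}
```

Because the hypervisor maintains the totals, nothing has to be scanned or
replayed here, which is the contrast with the dispatch trace log drawn
above.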

From Shrikanth:

  [...] it was tested on a Power10 [PowerVM] shared LPAR system with two
  LPARs, referred to below as LPAR1 and LPAR2. The test was carried out
  with SMT=1; similar observations were made with SMT=8.

  The LPAR config header from each LPAR is below. LPAR1 is twice as big
  as LPAR2. Since both share the same underlying hardware, stealing
  will happen when both LPARs contend for the same resource.

  LPAR1:
  type=Shared mode=Uncapped smt=Off lcpu=40 cpus=40 ent=20.00
  LPAR2:
  type=Shared mode=Uncapped smt=Off lcpu=20 cpus=40 ent=10.00

  mpstat was used to check utilization, with stress-ng as the workload.
  A few cases were tested. When both LPARs are idle there is no steal
  time. When LPAR1 runs at 100%, consuming all of the physical
  resource, steal time starts to get accounted. With LPAR1 running at
  100%, steal time increases once LPAR2 starts running, as expected,
  and increases further as the LPAR2 load is increased.

  Case 1: 0% LPAR1; 0% LPAR2
   %usr  %nice   %sys %iowait  %irq  %soft %steal %guest %gnice  %idle
   0.00   0.00   0.05   0.00   0.00   0.00   0.00   0.00   0.00  99.95

  Case 2: 100% LPAR1; 0% LPAR2
   %usr  %nice   %sys %iowait  %irq  %soft %steal %guest %gnice  %idle
  97.68   0.00   0.00   0.00   0.00   0.00   2.32   0.00   0.00   0.00

  Case 3: 100% LPAR1; 50% LPAR2
   %usr  %nice   %sys %iowait  %irq  %soft %steal %guest %gnice  %idle
  86.34   0.00   0.10   0.00   0.00   0.03  13.54   0.00   0.00   0.00

  Case 4: 100% LPAR1; 100% LPAR2
   %usr  %nice   %sys %iowait  %irq  %soft %steal %guest %gnice  %idle
  78.54   0.00   0.07   0.00   0.00   0.02  21.36   0.00   0.00   0.00

  Case 5: 50% LPAR1; 100% LPAR2
   %usr  %nice   %sys %iowait  %irq  %soft %steal %guest %gnice  %idle
  49.37   0.00   0.00   0.00   0.00   0.00   1.17   0.00   0.00  49.47

  The patch accounts for steal time and the basic tests hold up.

Tested-by: Shrikanth Hegde <sshegde@linux.ibm.com>
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
---
 .../admin-guide/kernel-parameters.txt         |  6 +++---
 arch/powerpc/include/asm/paravirt.h           | 12 ++++++++++++
 arch/powerpc/include/asm/paravirt_api_clock.h |  1 +
 arch/powerpc/platforms/pseries/Kconfig        |  8 ++++++++
 arch/powerpc/platforms/pseries/lpar.c         | 11 +++++++++++
 arch/powerpc/platforms/pseries/setup.c        | 19 +++++++++++++++++++
 6 files changed, 54 insertions(+), 3 deletions(-)
 create mode 100644 arch/powerpc/include/asm/paravirt_api_clock.h

diff --git a/Documentation/admin-guide/kernel-parameters.txt b/Documentation/admin-guide/kernel-parameters.txt
index 426fa892d311..7172a91539f2 100644
--- a/Documentation/admin-guide/kernel-parameters.txt
+++ b/Documentation/admin-guide/kernel-parameters.txt
@@ -3741,9 +3741,9 @@
 			[X86,PV_OPS] Disable paravirtualized VMware scheduler
 			clock and use the default one.
 
-	no-steal-acc	[X86,PV_OPS,ARM64] Disable paravirtualized steal time
-			accounting. steal time is computed, but won't
-			influence scheduler behaviour
+	no-steal-acc	[X86,PV_OPS,ARM64,PPC/PSERIES] Disable paravirtualized
+			steal time accounting. steal time is computed, but
+			won't influence scheduler behaviour
 
 	nolapic		[X86-32,APIC] Do not enable or use the local APIC.
 
diff --git a/arch/powerpc/include/asm/paravirt.h b/arch/powerpc/include/asm/paravirt.h
index eb7df559ae74..f5ba1a3c41f8 100644
--- a/arch/powerpc/include/asm/paravirt.h
+++ b/arch/powerpc/include/asm/paravirt.h
@@ -21,6 +21,18 @@ static inline bool is_shared_processor(void)
 	return static_branch_unlikely(&shared_processor);
 }
 
+#ifdef CONFIG_PARAVIRT_TIME_ACCOUNTING
+extern struct static_key paravirt_steal_enabled;
+extern struct static_key paravirt_steal_rq_enabled;
+
+u64 pseries_paravirt_steal_clock(int cpu);
+
+static inline u64 paravirt_steal_clock(int cpu)
+{
+	return pseries_paravirt_steal_clock(cpu);
+}
+#endif
+
 /* If bit 0 is set, the cpu has been ceded, conferred, or preempted */
 static inline u32 yield_count_of(int cpu)
 {
diff --git a/arch/powerpc/include/asm/paravirt_api_clock.h b/arch/powerpc/include/asm/paravirt_api_clock.h
new file mode 100644
index 000000000000..65ac7cee0dad
--- /dev/null
+++ b/arch/powerpc/include/asm/paravirt_api_clock.h
@@ -0,0 +1 @@
+#include <asm/paravirt.h>
diff --git a/arch/powerpc/platforms/pseries/Kconfig b/arch/powerpc/platforms/pseries/Kconfig
index fb6499977f99..a3b4d99567cb 100644
--- a/arch/powerpc/platforms/pseries/Kconfig
+++ b/arch/powerpc/platforms/pseries/Kconfig
@@ -23,13 +23,21 @@ config PPC_PSERIES
 	select SWIOTLB
 	default y
 
+config PARAVIRT
+	bool
+
 config PARAVIRT_SPINLOCKS
 	bool
 
+config PARAVIRT_TIME_ACCOUNTING
+	select PARAVIRT
+	bool
+
 config PPC_SPLPAR
 	bool "Support for shared-processor logical partitions"
 	depends on PPC_PSERIES
 	select PARAVIRT_SPINLOCKS if PPC_QUEUED_SPINLOCKS
+	select PARAVIRT_TIME_ACCOUNTING if VIRT_CPU_ACCOUNTING_GEN
 	default y
 	help
 	  Enabling this option will make the kernel run more efficiently
diff --git a/arch/powerpc/platforms/pseries/lpar.c b/arch/powerpc/platforms/pseries/lpar.c
index e6c117fb6491..97ef6499e501 100644
--- a/arch/powerpc/platforms/pseries/lpar.c
+++ b/arch/powerpc/platforms/pseries/lpar.c
@@ -660,6 +660,17 @@ static int __init vcpudispatch_stats_procfs_init(void)
 }
 
 machine_device_initcall(pseries, vcpudispatch_stats_procfs_init);
+
+#ifdef CONFIG_PARAVIRT_TIME_ACCOUNTING
+u64 pseries_paravirt_steal_clock(int cpu)
+{
+	struct lppaca *lppaca = &lppaca_of(cpu);
+
+	return be64_to_cpu(READ_ONCE(lppaca->enqueue_dispatch_tb)) +
+		be64_to_cpu(READ_ONCE(lppaca->ready_enqueue_tb));
+}
+#endif
+
 #endif /* CONFIG_PPC_SPLPAR */
 
 void vpa_init(int cpu)
diff --git a/arch/powerpc/platforms/pseries/setup.c b/arch/powerpc/platforms/pseries/setup.c
index 489f4c4df468..5e44c65a032c 100644
--- a/arch/powerpc/platforms/pseries/setup.c
+++ b/arch/powerpc/platforms/pseries/setup.c
@@ -80,6 +80,20 @@
 DEFINE_STATIC_KEY_FALSE(shared_processor);
 EXPORT_SYMBOL(shared_processor);
 
+#ifdef CONFIG_PARAVIRT_TIME_ACCOUNTING
+struct static_key paravirt_steal_enabled;
+struct static_key paravirt_steal_rq_enabled;
+
+static bool steal_acc = true;
+static int __init parse_no_stealacc(char *arg)
+{
+	steal_acc = false;
+	return 0;
+}
+
+early_param("no-steal-acc", parse_no_stealacc);
+#endif
+
 int CMO_PrPSP = -1;
 int CMO_SecPSP = -1;
 unsigned long CMO_PageSize = (ASM_CONST(1) << IOMMU_PAGE_SHIFT_4K);
@@ -834,6 +848,11 @@ static void __init pSeries_setup_arch(void)
 		if (lppaca_shared_proc(get_lppaca())) {
 			static_branch_enable(&shared_processor);
 			pv_spinlocks_init();
+#ifdef CONFIG_PARAVIRT_TIME_ACCOUNTING
+			static_key_slow_inc(&paravirt_steal_enabled);
+			if (steal_acc)
+				static_key_slow_inc(&paravirt_steal_rq_enabled);
+#endif
 		}
 
 		ppc_md.power_save = pseries_lpar_idle;
-- 
2.37.2



* [PATCH v2 3/4] powerpc/64: Remove PPC64 special case for cputime accounting default
  2022-09-02  8:53 [PATCH v2 0/4] powerpc: stolen time accounting for VIRT_CPU_ACCOUNTING_GEN Nicholas Piggin
  2022-09-02  8:53 ` [PATCH v2 1/4] powerpc/pseries: Add wait interval counter definitions to struct lppaca Nicholas Piggin
  2022-09-02  8:53 ` [PATCH v2 2/4] powerpc/pseries: Implement CONFIG_PARAVIRT_TIME_ACCOUNTING Nicholas Piggin
@ 2022-09-02  8:53 ` Nicholas Piggin
  2022-09-02  8:53 ` [PATCH v2 4/4] powerpc/pseries: Move dtl scanning and steal time accounting to pseries platform Nicholas Piggin
  2022-09-09 12:06 ` [PATCH v2 0/4] powerpc: stolen time accounting for VIRT_CPU_ACCOUNTING_GEN Michael Ellerman
  4 siblings, 0 replies; 7+ messages in thread
From: Nicholas Piggin @ 2022-09-02  8:53 UTC (permalink / raw)
  To: linuxppc-dev; +Cc: Nicholas Piggin

Distro kernels tend to be moving to VIRT_CPU_ACCOUNTING_GEN, and there
is not much reason why PPC64 should be special here. Remove the special
case and make the ppc64 and pseries defconfigs use GEN accounting
(others will use TICK, as per Kconfig defaults).

VIRT_CPU_ACCOUNTING_NATIVE does provide scaled vtime and stolen time
apportioned between system and user time, and vtime accounting is not
unconditionally enabled, and possibly other things. But it would be
better at this point to extend GEN to cover important missing features
rather than directing users back to a less used option.

Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
---
 arch/powerpc/configs/ppc64_defconfig   | 2 ++
 arch/powerpc/configs/pseries_defconfig | 2 ++
 init/Kconfig                           | 3 +--
 3 files changed, 5 insertions(+), 2 deletions(-)

diff --git a/arch/powerpc/configs/ppc64_defconfig b/arch/powerpc/configs/ppc64_defconfig
index c8b0e80d613b..6be0c43397b4 100644
--- a/arch/powerpc/configs/ppc64_defconfig
+++ b/arch/powerpc/configs/ppc64_defconfig
@@ -1,7 +1,9 @@
 CONFIG_SYSVIPC=y
 CONFIG_POSIX_MQUEUE=y
+# CONFIG_CONTEXT_TRACKING_USER_FORCE is not set
 CONFIG_NO_HZ=y
 CONFIG_HIGH_RES_TIMERS=y
+CONFIG_VIRT_CPU_ACCOUNTING_GEN=y
 CONFIG_TASKSTATS=y
 CONFIG_TASK_DELAY_ACCT=y
 CONFIG_IKCONFIG=y
diff --git a/arch/powerpc/configs/pseries_defconfig b/arch/powerpc/configs/pseries_defconfig
index b571d084c148..4723ede5e10d 100644
--- a/arch/powerpc/configs/pseries_defconfig
+++ b/arch/powerpc/configs/pseries_defconfig
@@ -3,8 +3,10 @@ CONFIG_NR_CPUS=2048
 CONFIG_SYSVIPC=y
 CONFIG_POSIX_MQUEUE=y
 CONFIG_AUDIT=y
+# CONFIG_CONTEXT_TRACKING_USER_FORCE is not set
 CONFIG_NO_HZ=y
 CONFIG_HIGH_RES_TIMERS=y
+CONFIG_VIRT_CPU_ACCOUNTING_GEN=y
 CONFIG_TASKSTATS=y
 CONFIG_TASK_DELAY_ACCT=y
 CONFIG_TASK_XACCT=y
diff --git a/init/Kconfig b/init/Kconfig
index 532362fcfe31..94ce5a46a802 100644
--- a/init/Kconfig
+++ b/init/Kconfig
@@ -461,8 +461,7 @@ config VIRT_CPU_ACCOUNTING
 
 choice
 	prompt "Cputime accounting"
-	default TICK_CPU_ACCOUNTING if !PPC64
-	default VIRT_CPU_ACCOUNTING_NATIVE if PPC64
+	default TICK_CPU_ACCOUNTING
 
 # Kind of a stub config for the pure tick based cputime accounting
 config TICK_CPU_ACCOUNTING
-- 
2.37.2



* [PATCH v2 4/4] powerpc/pseries: Move dtl scanning and steal time accounting to pseries platform
  2022-09-02  8:53 [PATCH v2 0/4] powerpc: stolen time accounting for VIRT_CPU_ACCOUNTING_GEN Nicholas Piggin
                   ` (2 preceding siblings ...)
  2022-09-02  8:53 ` [PATCH v2 3/4] powerpc/64: Remove PPC64 special case for cputime accounting default Nicholas Piggin
@ 2022-09-02  8:53 ` Nicholas Piggin
  2022-10-10 20:49   ` Guenter Roeck
  2022-09-09 12:06 ` [PATCH v2 0/4] powerpc: stolen time accounting for VIRT_CPU_ACCOUNTING_GEN Michael Ellerman
  4 siblings, 1 reply; 7+ messages in thread
From: Nicholas Piggin @ 2022-09-02  8:53 UTC (permalink / raw)
  To: linuxppc-dev; +Cc: Nicholas Piggin

dtl is the PAPR Dispatch Trace Log, which is entirely a pseries feature.
The pseries platform already has a file dealing with the dtl, so move
scanning for stolen time accounting there from kernel/time.c.

Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
---
 arch/powerpc/include/asm/cputime.h   |  2 +-
 arch/powerpc/include/asm/dtl.h       |  8 ---
 arch/powerpc/include/asm/time.h      |  5 +-
 arch/powerpc/kernel/time.c           | 92 ++--------------------------
 arch/powerpc/platforms/pseries/dtl.c | 81 ++++++++++++++++++++++++
 5 files changed, 90 insertions(+), 98 deletions(-)

diff --git a/arch/powerpc/include/asm/cputime.h b/arch/powerpc/include/asm/cputime.h
index 6d2b27997492..431ae2343022 100644
--- a/arch/powerpc/include/asm/cputime.h
+++ b/arch/powerpc/include/asm/cputime.h
@@ -95,7 +95,7 @@ static notrace inline void account_stolen_time(void)
 		struct lppaca *lp = local_paca->lppaca_ptr;
 
 		if (unlikely(local_paca->dtl_ridx != be64_to_cpu(lp->dtl_idx)))
-			accumulate_stolen_time();
+			pseries_accumulate_stolen_time();
 	}
 #endif
 }
diff --git a/arch/powerpc/include/asm/dtl.h b/arch/powerpc/include/asm/dtl.h
index 1625888f27ef..4bcb9f9ac764 100644
--- a/arch/powerpc/include/asm/dtl.h
+++ b/arch/powerpc/include/asm/dtl.h
@@ -37,14 +37,6 @@ struct dtl_entry {
 extern struct kmem_cache *dtl_cache;
 extern rwlock_t dtl_access_lock;
 
-/*
- * When CONFIG_VIRT_CPU_ACCOUNTING_NATIVE = y, the cpu accounting code controls
- * reading from the dispatch trace log.  If other code wants to consume
- * DTL entries, it can set this pointer to a function that will get
- * called once for each DTL entry that gets processed.
- */
-extern void (*dtl_consumer)(struct dtl_entry *entry, u64 index);
-
 extern void register_dtl_buffer(int cpu);
 extern void alloc_dtl_buffers(unsigned long *time_limit);
 extern long hcall_vphn(unsigned long cpu, u64 flags, __be32 *associativity);
diff --git a/arch/powerpc/include/asm/time.h b/arch/powerpc/include/asm/time.h
index 1e5643a9b1f2..9f50766c4623 100644
--- a/arch/powerpc/include/asm/time.h
+++ b/arch/powerpc/include/asm/time.h
@@ -116,8 +116,9 @@ unsigned long long tb_to_ns(unsigned long long tb_ticks);
 
 void timer_broadcast_interrupt(void);
 
-/* SPLPAR */
-void accumulate_stolen_time(void);
+/* SPLPAR and VIRT_CPU_ACCOUNTING_NATIVE */
+void pseries_accumulate_stolen_time(void);
+u64 pseries_calculate_stolen_time(u64 stop_tb);
 
 #endif /* __KERNEL__ */
 #endif /* __POWERPC_TIME_H */
diff --git a/arch/powerpc/kernel/time.c b/arch/powerpc/kernel/time.c
index 587adcc12860..ae3e33b4ef95 100644
--- a/arch/powerpc/kernel/time.c
+++ b/arch/powerpc/kernel/time.c
@@ -178,92 +178,6 @@ static inline unsigned long read_spurr(unsigned long tb)
 	return tb;
 }
 
-#ifdef CONFIG_PPC_SPLPAR
-
-#include <asm/dtl.h>
-
-void (*dtl_consumer)(struct dtl_entry *, u64);
-
-/*
- * Scan the dispatch trace log and count up the stolen time.
- * Should be called with interrupts disabled.
- */
-static u64 scan_dispatch_log(u64 stop_tb)
-{
-	u64 i = local_paca->dtl_ridx;
-	struct dtl_entry *dtl = local_paca->dtl_curr;
-	struct dtl_entry *dtl_end = local_paca->dispatch_log_end;
-	struct lppaca *vpa = local_paca->lppaca_ptr;
-	u64 tb_delta;
-	u64 stolen = 0;
-	u64 dtb;
-
-	if (!dtl)
-		return 0;
-
-	if (i == be64_to_cpu(vpa->dtl_idx))
-		return 0;
-	while (i < be64_to_cpu(vpa->dtl_idx)) {
-		dtb = be64_to_cpu(dtl->timebase);
-		tb_delta = be32_to_cpu(dtl->enqueue_to_dispatch_time) +
-			be32_to_cpu(dtl->ready_to_enqueue_time);
-		barrier();
-		if (i + N_DISPATCH_LOG < be64_to_cpu(vpa->dtl_idx)) {
-			/* buffer has overflowed */
-			i = be64_to_cpu(vpa->dtl_idx) - N_DISPATCH_LOG;
-			dtl = local_paca->dispatch_log + (i % N_DISPATCH_LOG);
-			continue;
-		}
-		if (dtb > stop_tb)
-			break;
-		if (dtl_consumer)
-			dtl_consumer(dtl, i);
-		stolen += tb_delta;
-		++i;
-		++dtl;
-		if (dtl == dtl_end)
-			dtl = local_paca->dispatch_log;
-	}
-	local_paca->dtl_ridx = i;
-	local_paca->dtl_curr = dtl;
-	return stolen;
-}
-
-/*
- * Accumulate stolen time by scanning the dispatch trace log.
- * Called on entry from user mode.
- */
-void notrace accumulate_stolen_time(void)
-{
-	u64 sst, ust;
-	struct cpu_accounting_data *acct = &local_paca->accounting;
-
-	sst = scan_dispatch_log(acct->starttime_user);
-	ust = scan_dispatch_log(acct->starttime);
-	acct->stime -= sst;
-	acct->utime -= ust;
-	acct->steal_time += ust + sst;
-}
-
-static inline u64 calculate_stolen_time(u64 stop_tb)
-{
-	if (!firmware_has_feature(FW_FEATURE_SPLPAR))
-		return 0;
-
-	if (get_paca()->dtl_ridx != be64_to_cpu(get_lppaca()->dtl_idx))
-		return scan_dispatch_log(stop_tb);
-
-	return 0;
-}
-
-#else /* CONFIG_PPC_SPLPAR */
-static inline u64 calculate_stolen_time(u64 stop_tb)
-{
-	return 0;
-}
-
-#endif /* CONFIG_PPC_SPLPAR */
-
 /*
  * Account time for a transition between system, hard irq
  * or soft irq state.
@@ -322,7 +236,11 @@ static unsigned long vtime_delta(struct cpu_accounting_data *acct,
 
 	*stime_scaled = vtime_delta_scaled(acct, now, stime);
 
-	*steal_time = calculate_stolen_time(now);
+	if (IS_ENABLED(CONFIG_PPC_SPLPAR) &&
+			firmware_has_feature(FW_FEATURE_SPLPAR))
+		*steal_time = pseries_calculate_stolen_time(now);
+	else
+		*steal_time = 0;
 
 	return stime;
 }
diff --git a/arch/powerpc/platforms/pseries/dtl.c b/arch/powerpc/platforms/pseries/dtl.c
index 352af5b14a0f..1b1977bc78e7 100644
--- a/arch/powerpc/platforms/pseries/dtl.c
+++ b/arch/powerpc/platforms/pseries/dtl.c
@@ -37,6 +37,15 @@ static u8 dtl_event_mask = DTL_LOG_ALL;
 static int dtl_buf_entries = N_DISPATCH_LOG;
 
 #ifdef CONFIG_VIRT_CPU_ACCOUNTING_NATIVE
+
+/*
+ * When CONFIG_VIRT_CPU_ACCOUNTING_NATIVE = y, the cpu accounting code controls
+ * reading from the dispatch trace log.  If other code wants to consume
+ * DTL entries, it can set this pointer to a function that will get
+ * called once for each DTL entry that gets processed.
+ */
+static void (*dtl_consumer)(struct dtl_entry *entry, u64 index);
+
 struct dtl_ring {
 	u64	write_index;
 	struct dtl_entry *write_ptr;
@@ -48,6 +57,78 @@ static DEFINE_PER_CPU(struct dtl_ring, dtl_rings);
 
 static atomic_t dtl_count;
 
+/*
+ * Scan the dispatch trace log and count up the stolen time.
+ * Should be called with interrupts disabled.
+ */
+static notrace u64 scan_dispatch_log(u64 stop_tb)
+{
+	u64 i = local_paca->dtl_ridx;
+	struct dtl_entry *dtl = local_paca->dtl_curr;
+	struct dtl_entry *dtl_end = local_paca->dispatch_log_end;
+	struct lppaca *vpa = local_paca->lppaca_ptr;
+	u64 tb_delta;
+	u64 stolen = 0;
+	u64 dtb;
+
+	if (!dtl)
+		return 0;
+
+	if (i == be64_to_cpu(vpa->dtl_idx))
+		return 0;
+	while (i < be64_to_cpu(vpa->dtl_idx)) {
+		dtb = be64_to_cpu(dtl->timebase);
+		tb_delta = be32_to_cpu(dtl->enqueue_to_dispatch_time) +
+			be32_to_cpu(dtl->ready_to_enqueue_time);
+		barrier();
+		if (i + N_DISPATCH_LOG < be64_to_cpu(vpa->dtl_idx)) {
+			/* buffer has overflowed */
+			i = be64_to_cpu(vpa->dtl_idx) - N_DISPATCH_LOG;
+			dtl = local_paca->dispatch_log + (i % N_DISPATCH_LOG);
+			continue;
+		}
+		if (dtb > stop_tb)
+			break;
+		if (dtl_consumer)
+			dtl_consumer(dtl, i);
+		stolen += tb_delta;
+		++i;
+		++dtl;
+		if (dtl == dtl_end)
+			dtl = local_paca->dispatch_log;
+	}
+	local_paca->dtl_ridx = i;
+	local_paca->dtl_curr = dtl;
+	return stolen;
+}
+
+/*
+ * Accumulate stolen time by scanning the dispatch trace log.
+ * Called on entry from user mode.
+ */
+void notrace pseries_accumulate_stolen_time(void)
+{
+	u64 sst, ust;
+	struct cpu_accounting_data *acct = &local_paca->accounting;
+
+	sst = scan_dispatch_log(acct->starttime_user);
+	ust = scan_dispatch_log(acct->starttime);
+	acct->stime -= sst;
+	acct->utime -= ust;
+	acct->steal_time += ust + sst;
+}
+
+u64 pseries_calculate_stolen_time(u64 stop_tb)
+{
+	if (!firmware_has_feature(FW_FEATURE_SPLPAR))
+		return 0;
+
+	if (get_paca()->dtl_ridx != be64_to_cpu(get_lppaca()->dtl_idx))
+		return scan_dispatch_log(stop_tb);
+
+	return 0;
+}
+
 /*
  * The cpu accounting code controls the DTL ring buffer, and we get
  * given entries as they are processed.
-- 
2.37.2



* Re: [PATCH v2 0/4] powerpc: stolen time accounting for VIRT_CPU_ACCOUNTING_GEN
  2022-09-02  8:53 [PATCH v2 0/4] powerpc: stolen time accounting for VIRT_CPU_ACCOUNTING_GEN Nicholas Piggin
                   ` (3 preceding siblings ...)
  2022-09-02  8:53 ` [PATCH v2 4/4] powerpc/pseries: Move dtl scanning and steal time accounting to pseries platform Nicholas Piggin
@ 2022-09-09 12:06 ` Michael Ellerman
  4 siblings, 0 replies; 7+ messages in thread
From: Michael Ellerman @ 2022-09-09 12:06 UTC (permalink / raw)
  To: linuxppc-dev, Nicholas Piggin

On Fri, 2 Sep 2022 18:53:12 +1000, Nicholas Piggin wrote:
> pseries provides stolen time accounting when VIRT_CPU_ACCOUNTING_NATIVE
> is selected, but not when VIRT_CPU_ACCOUNTING_GEN is. We like GEN
> because it's less code in arch/powerpc, allows full nohz, and distros
> have moved to it, so this series adds stolen time accounting for GEN,
> and moves our pseries configs over to it.
> 
> Thanks,
> Nick
> 
> [...]

Applied to powerpc/next.

[1/4] powerpc/pseries: Add wait interval counter definitions to struct lppaca
      https://git.kernel.org/powerpc/c/a8933c8d55c37f4d5eb617b4bdb72b8888b88681
[2/4] powerpc/pseries: Implement CONFIG_PARAVIRT_TIME_ACCOUNTING
      https://git.kernel.org/powerpc/c/0e8a63132800dd8ae5fcb19113f79bea43ea18d9
[3/4] powerpc/64: Remove PPC64 special case for cputime accounting default
      https://git.kernel.org/powerpc/c/02382aff72357727f9eee5107fd32c6cd070f1d6
[4/4] powerpc/pseries: Move dtl scanning and steal time accounting to pseries platform
      https://git.kernel.org/powerpc/c/6ba5aa541aaa079c0ca888f7fe564b2035d5ca0a

cheers


* Re: [PATCH v2 4/4] powerpc/pseries: Move dtl scanning and steal time accounting to pseries platform
  2022-09-02  8:53 ` [PATCH v2 4/4] powerpc/pseries: Move dtl scanning and steal time accounting to pseries platform Nicholas Piggin
@ 2022-10-10 20:49   ` Guenter Roeck
  0 siblings, 0 replies; 7+ messages in thread
From: Guenter Roeck @ 2022-10-10 20:49 UTC (permalink / raw)
  To: Nicholas Piggin; +Cc: linuxppc-dev

On Fri, Sep 02, 2022 at 06:53:16PM +1000, Nicholas Piggin wrote:
> dtl is the PAPR Dispatch Trace Log, which is entirely a pseries feature.
> The pseries platform already has a file dealing with the dtl, so move
> scanning for stolen time accounting there from kernel/time.c.
> 
> Signed-off-by: Nicholas Piggin <npiggin@gmail.com>

This patch ties DTL to PPC_SPLPAR without updating configuration
dependencies. As a result, the following build error may now be seen
if CONFIG_PPC_SPLPAR=y and CONFIG_DTL=n.

arch/powerpc/kernel/irq.o: in function `.do_IRQ':
irq.c:(.text+0x2798): undefined reference to `.pseries_accumulate_stolen_time'

I updated my own configurations to avoid the problem, but you might see
randconfig failures with this error in the future.

Guenter

