Linux-PM Archive on lore.kernel.org
* [PATCH v3 0/7] arm64: ARMv8.4 Activity Monitors support
@ 2020-02-11 18:45 Ionela Voinescu
  2020-02-11 18:45 ` [PATCH v3 1/7] arm64: add support for the AMU extension v1 Ionela Voinescu
                   ` (6 more replies)
  0 siblings, 7 replies; 42+ messages in thread
From: Ionela Voinescu @ 2020-02-11 18:45 UTC (permalink / raw)
  To: catalin.marinas, will, mark.rutland, maz, suzuki.poulose,
	sudeep.holla, lukasz.luba, valentin.schneider, rjw,
	ionela.voinescu
  Cc: peterz, mingo, vincent.guittot, viresh.kumar, linux-arm-kernel,
	linux-doc, linux-kernel, linux-pm

These patches introduce support for the Activity Monitors Unit (AMU)
CPU extension, an optional extension in ARMv8.4 CPUs. This provides
performance counters intended for system management use. Two of these
counters are then used to compute the frequency scale correction
factor needed to achieve frequency invariance.
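
As a rough, hypothetical sketch (not the kernel implementation), the scale
factor derived from the two counters could look like the following, where
freq_scale(), core_delta, const_delta and max_freq_ratio are illustrative
names and SCHED_CAPACITY_SHIFT mirrors the scheduler's fixed-point
convention:

```c
#include <stdint.h>

#define SCHED_CAPACITY_SHIFT 10

/*
 * Illustrative only: compute a fixed-point frequency scale factor from
 * per-tick deltas of the AMU core-cycle counter and the constant-rate
 * counter. max_freq_ratio folds in the normalisation to the maximum
 * frequency, also in fixed point (1 << SCHED_CAPACITY_SHIFT == 1.0).
 */
static uint64_t freq_scale(uint64_t core_delta, uint64_t const_delta,
			   uint64_t max_freq_ratio)
{
	/* (core_delta / const_delta) / max_freq_ratio, in fixed point */
	return (core_delta << (2 * SCHED_CAPACITY_SHIFT)) /
	       (const_delta * max_freq_ratio);
}
```

With equal counter deltas and a max_freq_ratio of 1.0, this yields a scale
of 1024 (i.e. running at maximum frequency); half the core cycles per
constant-counter tick yields 512.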

With CONFIG_ARM64_AMU_EXTN enabled and in the absence of the disable_amu
kernel parameter, the kernel is able to safely run a mix of CPUs with
and without support for the AMU extension. The AMU capability is
unconditionally enabled in the kernel so as to allow any late CPU to
use the feature: the cpu_enable function will be called for all CPUs
that match the criteria, including secondary and hotplugged CPUs,
marking this feature as present on each such CPU.

Note that firmware must implement AMU support when running on CPUs
that implement the activity monitors extension: it must allow access
to the registers from lower exception levels, enable the counters, and
implement save and restore functionality. More details can be found in
the documentation.

Given that the activity counters expose information about activity on
the CPUs, and that not all CPUs might implement the extension, access
to the AMU registers from userspace (EL0) and KVM guests is disabled
for functional and security reasons.

In the last patch of the series, two of the AMU counters are used to
compute the frequency scale factor needed to achieve frequency
invariance of signals in the scheduler, based on an interface added
to support counter-based frequency invariance - arch_scale_freq_tick.
The interface and the update point for the counter-based frequency
scale factor are based on the similar approach in the patch that
introduces frequency invariance for x86 [1].

The current series is based on tip/sched/core.

Testing:
 - Build tested for multiple architectures and defconfigs.
 - AMU feature detection, EL0 and KVM guest access to AMU registers,
   and feature support in firmware (version 1.5 and later of the ARM
   Trusted Firmware) were tested on an Armv8-A Base Platform FVP:
   Architecture Envelope Model [2] (supports versions 8.0 to 8.5),
   with the following configurations:

   cluster0.has_arm_v8-4=1
   cluster1.has_arm_v8-4=1
   cluster0.has_amu=1
   cluster1.has_amu=1

v2 -> v3:
 - v2 can be found at [4]
 - [1/7] used a cpumask instead of a per-cpu variable to flag AMU
   presence; introduced the disable_amu kernel parameter; removed the
   ftr_id_pfr0 AMU bits - recommended by Suzuki.
 - [2/7] replaced obscure label as recommended by Valentin.
 - [3/7] clarified the activate_traps_vhe comment.
 - [4/7] dropped changes in arm64/cpu-feature-registers.txt; removed
   use of variable names - recommended by Suzuki
 - previous [5/6] - dropped, as [1] was added to tip/sched/core
 - [5/7] new patch introduced to cleanly obtain maximum hardware
   frequency from cpufreq
 - [6/7] (previously [6/6]):
   - Removed the use of workqueues by limiting the validation work done
     on each CPU to the setting of the reference per-cpu counter
     variables. This is now called directly from cpu_enable
     (cpufeature.c). Also, further CPU, policy and system validation is
     done in a late_initcall_sync function, which also waits for
     deferred probe work to finish, to ensure the maximum frequency is
     set by either cpufreq or platform drivers - recommended by Lukasz.
   - Improved AMU use detection for CPUs in arch_set_freq_scale -
     recommended by Lukasz.
   - Properly validated arch_max_freq_scale and added detailed
     documentation for how arch_max_freq_scale and freq_scale are
     obtained based on counters - recommended by Valentin.
   - Overall - limited tight coupling between AMU use and cpufreq
     (use of maximum frequency information and policy validation).
 - [7/7] introduced patch to warn if arch_timer_rate is too low
   - functionality provided by Valentin.
   

v1 -> v2:
 - v1 can be found at [3]
 - Added patches that use the counters for the scheduler's frequency
   invariance engine
 - In patch "arm64: add support for the AMU extension v1":
    - Defined an accessor function cpu_has_amu_feat to allow a read
      of amu_feat only from the current CPU, to ensure the safe use
      of the per-cpu variable for the current user (arm64 topology
      driver) and future users.
    - Modified type of amu_feat from bool to u8 to satisfy sparse
      checker's warning 'expression using sizeof _Bool [sparse]',
      as the size of bool is compiler dependent.

[1] https://lore.kernel.org/lkml/20200122151617.531-1-ggherdovich@suse.cz/
[2] https://developer.arm.com/tools-and-software/simulation-models/fixed-virtual-platforms
[3] https://lore.kernel.org/lkml/20190917134228.5369-1-ionela.voinescu@arm.com/
[4] https://lore.kernel.org/lkml/20191218182607.21607-1-ionela.voinescu@arm.com/

Ionela Voinescu (6):
  arm64: add support for the AMU extension v1
  arm64: trap to EL1 accesses to AMU counters from EL0
  arm64/kvm: disable access to AMU registers from kvm guests
  Documentation: arm64: document support for the AMU extension
  cpufreq: add function to get the hardware max frequency
  arm64: use activity monitors for frequency invariance

Valentin Schneider (1):
  clocksource/drivers/arm_arch_timer: validate arch_timer_rate

 .../admin-guide/kernel-parameters.txt         |  10 +
 Documentation/arm64/amu.rst                   | 114 +++++++++++
 Documentation/arm64/booting.rst               |  14 ++
 Documentation/arm64/index.rst                 |   1 +
 arch/arm64/Kconfig                            |  31 +++
 arch/arm64/include/asm/assembler.h            |  10 +
 arch/arm64/include/asm/cpucaps.h              |   3 +-
 arch/arm64/include/asm/cpufeature.h           |   5 +
 arch/arm64/include/asm/kvm_arm.h              |   1 +
 arch/arm64/include/asm/sysreg.h               |  38 ++++
 arch/arm64/include/asm/topology.h             |  16 ++
 arch/arm64/kernel/cpufeature.c                | 101 ++++++++++
 arch/arm64/kernel/topology.c                  | 185 ++++++++++++++++++
 arch/arm64/kvm/hyp/switch.c                   |  14 +-
 arch/arm64/kvm/sys_regs.c                     |  93 ++++++++-
 arch/arm64/mm/proc.S                          |   3 +
 drivers/base/arch_topology.c                  |   8 +
 drivers/clocksource/arm_arch_timer.c          |  18 +-
 drivers/cpufreq/cpufreq.c                     |  20 ++
 include/linux/cpufreq.h                       |   5 +
 include/linux/topology.h                      |   7 +
 21 files changed, 690 insertions(+), 7 deletions(-)
 create mode 100644 Documentation/arm64/amu.rst


base-commit: 25ac227a25ac946e0356772012398cd1710a8bab
-- 
2.17.1


^ permalink raw reply	[flat|nested] 42+ messages in thread

* [PATCH v3 1/7] arm64: add support for the AMU extension v1
  2020-02-11 18:45 [PATCH v3 0/7] arm64: ARMv8.4 Activity Monitors support Ionela Voinescu
@ 2020-02-11 18:45 ` Ionela Voinescu
  2020-02-12 11:30   ` Suzuki Kuruppassery Poulose
  2020-02-11 18:45 ` [PATCH v3 2/7] arm64: trap to EL1 accesses to AMU counters from EL0 Ionela Voinescu
                   ` (5 subsequent siblings)
  6 siblings, 1 reply; 42+ messages in thread
From: Ionela Voinescu @ 2020-02-11 18:45 UTC (permalink / raw)
  To: catalin.marinas, will, mark.rutland, maz, suzuki.poulose,
	sudeep.holla, lukasz.luba, valentin.schneider, rjw,
	ionela.voinescu
  Cc: peterz, mingo, vincent.guittot, viresh.kumar, linux-arm-kernel,
	linux-doc, linux-kernel, linux-pm

The activity monitors extension is an optional extension introduced
by the ARMv8.4 CPU architecture. This implements basic support for
version 1 of the activity monitors architecture, AMUv1.

This support includes:
- Extension detection on each CPU (boot, secondary, hotplugged)
- Register interface for AMU aarch64 registers
- disable_amu kernel parameter to disable detection/counter access
  at runtime
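
The detection relies on the AMU field of ID_AA64PFR0_EL1 (handled through
the kernel's cpufeature infrastructure, not open-coded). Purely as an
illustrative sketch, where id_aa64pfr0_amu() is a hypothetical helper,
the field extraction amounts to:

```c
#include <stdint.h>

/* AMU field of ID_AA64PFR0_EL1 lives in bits [47:44] */
#define ID_AA64PFR0_AMU_SHIFT	44

/*
 * Illustrative only: extract the 4-bit AMU field from a raw
 * ID_AA64PFR0_EL1 value. A non-zero value indicates AMUv1 (or a later
 * version of the activity monitors architecture).
 */
static unsigned int id_aa64pfr0_amu(uint64_t pfr0)
{
	return (pfr0 >> ID_AA64PFR0_AMU_SHIFT) & 0xf;
}
```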

Signed-off-by: Ionela Voinescu <ionela.voinescu@arm.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Will Deacon <will@kernel.org>
Cc: Suzuki K Poulose <suzuki.poulose@arm.com>
Cc: Marc Zyngier <maz@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
---
 .../admin-guide/kernel-parameters.txt         | 10 ++
 arch/arm64/Kconfig                            | 31 ++++++
 arch/arm64/include/asm/cpucaps.h              |  3 +-
 arch/arm64/include/asm/cpufeature.h           |  5 +
 arch/arm64/include/asm/sysreg.h               | 38 ++++++++
 arch/arm64/kernel/cpufeature.c                | 97 +++++++++++++++++++
 6 files changed, 183 insertions(+), 1 deletion(-)

diff --git a/Documentation/admin-guide/kernel-parameters.txt b/Documentation/admin-guide/kernel-parameters.txt
index ade4e6ec23e0..6f0c6d22fa4c 100644
--- a/Documentation/admin-guide/kernel-parameters.txt
+++ b/Documentation/admin-guide/kernel-parameters.txt
@@ -843,6 +843,16 @@
 			can be useful when debugging issues that require an SLB
 			miss to occur.
 
+	disable_amu	[ARM64]
+			Disables detection, enablement and access to counter
+			registers of the Activity Monitors Unit (AMU). By
+			default these are enabled. "disable_amu=0/1" is also
+			allowed.
+			disable_amu / disable_amu=1 - ensures access to AMU's
+			 counter registers is not attempted.
+			disable_amu=0 - enables or maintains detection and
+			 access to AMU's counter registers.
+
 	disable=	[IPV6]
 			See Documentation/networking/ipv6.txt.
 
diff --git a/arch/arm64/Kconfig b/arch/arm64/Kconfig
index 3ab05857ca8f..b3408d7629fd 100644
--- a/arch/arm64/Kconfig
+++ b/arch/arm64/Kconfig
@@ -1484,6 +1484,37 @@ config ARM64_PTR_AUTH
 
 endmenu
 
+menu "ARMv8.4 architectural features"
+
+config ARM64_AMU_EXTN
+	bool "Enable support for the Activity Monitors Unit CPU extension"
+	default y
+	help
+          The activity monitors extension is an optional extension introduced
+          by the ARMv8.4 CPU architecture. This enables support for version 1
+          of the activity monitors architecture, AMUv1.
+
+          To enable the use of this extension on CPUs that implement it, say Y.
+
+          Note that for architectural reasons, firmware _must_ implement AMU
+          support when running on CPUs that present the activity monitors
+          extension. The required support is present in:
+            * Version 1.5 and later of the ARM Trusted Firmware
+
+          For kernels that have this configuration enabled but boot with broken
+          firmware, you may need to say N here until the firmware is fixed.
+          Otherwise you may experience firmware panics or lockups when
+          accessing the counter registers. Even if you are not observing these
+          symptoms, the values returned by the register reads might not
+          correctly reflect reality. Most commonly, the value read will be 0,
+          indicating that the counter is not enabled.
+
+          Alternatively, a kernel parameter is provided to disable detection,
+          enablement and access to the counter registers of the Activity
+          Monitors Unit at runtime.
+
+endmenu
+
 config ARM64_SVE
 	bool "ARM Scalable Vector Extension support"
 	default y
diff --git a/arch/arm64/include/asm/cpucaps.h b/arch/arm64/include/asm/cpucaps.h
index b92683871119..7dde890bde50 100644
--- a/arch/arm64/include/asm/cpucaps.h
+++ b/arch/arm64/include/asm/cpucaps.h
@@ -56,7 +56,8 @@
 #define ARM64_WORKAROUND_CAVIUM_TX2_219_PRFM	46
 #define ARM64_WORKAROUND_1542419		47
 #define ARM64_WORKAROUND_1319367		48
+#define ARM64_HAS_AMU_EXTN			49
 
-#define ARM64_NCAPS				49
+#define ARM64_NCAPS				50
 
 #endif /* __ASM_CPUCAPS_H */
diff --git a/arch/arm64/include/asm/cpufeature.h b/arch/arm64/include/asm/cpufeature.h
index 4261d55e8506..5ae6e00ccabb 100644
--- a/arch/arm64/include/asm/cpufeature.h
+++ b/arch/arm64/include/asm/cpufeature.h
@@ -673,6 +673,11 @@ static inline bool cpu_has_hw_af(void)
 						ID_AA64MMFR1_HADBS_SHIFT);
 }
 
+#ifdef CONFIG_ARM64_AMU_EXTN
+/* Check whether the current CPU supports the Activity Monitors Unit (AMU) */
+extern bool cpu_has_amu_feat(int cpu);
+#endif
+
 #endif /* __ASSEMBLY__ */
 
 #endif
diff --git a/arch/arm64/include/asm/sysreg.h b/arch/arm64/include/asm/sysreg.h
index 6e919fafb43d..eba13b8994ce 100644
--- a/arch/arm64/include/asm/sysreg.h
+++ b/arch/arm64/include/asm/sysreg.h
@@ -382,6 +382,42 @@
 #define SYS_TPIDR_EL0			sys_reg(3, 3, 13, 0, 2)
 #define SYS_TPIDRRO_EL0			sys_reg(3, 3, 13, 0, 3)
 
+/* Definitions for system register interface to AMU for ARMv8.4 onwards */
+#define SYS_AM_EL0(crm, op2)		sys_reg(3, 3, 13, (crm), (op2))
+#define SYS_AMCR_EL0			SYS_AM_EL0(2, 0)
+#define SYS_AMCFGR_EL0			SYS_AM_EL0(2, 1)
+#define SYS_AMCGCR_EL0			SYS_AM_EL0(2, 2)
+#define SYS_AMUSERENR_EL0		SYS_AM_EL0(2, 3)
+#define SYS_AMCNTENCLR0_EL0		SYS_AM_EL0(2, 4)
+#define SYS_AMCNTENSET0_EL0		SYS_AM_EL0(2, 5)
+#define SYS_AMCNTENCLR1_EL0		SYS_AM_EL0(3, 0)
+#define SYS_AMCNTENSET1_EL0		SYS_AM_EL0(3, 1)
+
+/*
+ * Group 0 of activity monitors (architected):
+ *                op0  op1  CRn   CRm       op2
+ * Counter:       11   011  1101  010:n<3>  n<2:0>
+ * Type:          11   011  1101  011:n<3>  n<2:0>
+ * n: 0-15
+ *
+ * Group 1 of activity monitors (auxiliary):
+ *                op0  op1  CRn   CRm       op2
+ * Counter:       11   011  1101  110:n<3>  n<2:0>
+ * Type:          11   011  1101  111:n<3>  n<2:0>
+ * n: 0-15
+ */
+
+#define SYS_AMEVCNTR0_EL0(n)		SYS_AM_EL0(4 + ((n) >> 3), (n) & 7)
+#define SYS_AMEVTYPE0_EL0(n)		SYS_AM_EL0(6 + ((n) >> 3), (n) & 7)
+#define SYS_AMEVCNTR1_EL0(n)		SYS_AM_EL0(12 + ((n) >> 3), (n) & 7)
+#define SYS_AMEVTYPE1_EL0(n)		SYS_AM_EL0(14 + ((n) >> 3), (n) & 7)
+
+/* AMU v1: Fixed (architecturally defined) activity monitors */
+#define SYS_AMEVCNTR0_CORE_EL0          SYS_AMEVCNTR0_EL0(0)
+#define SYS_AMEVCNTR0_CONST_EL0         SYS_AMEVCNTR0_EL0(1)
+#define SYS_AMEVCNTR0_INST_RET_EL0      SYS_AMEVCNTR0_EL0(2)
+#define SYS_AMEVCNTR0_MEM_STALL         SYS_AMEVCNTR0_EL0(3)
+
 #define SYS_CNTFRQ_EL0			sys_reg(3, 3, 14, 0, 0)
 
 #define SYS_CNTP_TVAL_EL0		sys_reg(3, 3, 14, 2, 0)
@@ -577,6 +613,7 @@
 #define ID_AA64PFR0_CSV3_SHIFT		60
 #define ID_AA64PFR0_CSV2_SHIFT		56
 #define ID_AA64PFR0_DIT_SHIFT		48
+#define ID_AA64PFR0_AMU_SHIFT		44
 #define ID_AA64PFR0_SVE_SHIFT		32
 #define ID_AA64PFR0_RAS_SHIFT		28
 #define ID_AA64PFR0_GIC_SHIFT		24
@@ -587,6 +624,7 @@
 #define ID_AA64PFR0_EL1_SHIFT		4
 #define ID_AA64PFR0_EL0_SHIFT		0
 
+#define ID_AA64PFR0_AMU			0x1
 #define ID_AA64PFR0_SVE			0x1
 #define ID_AA64PFR0_RAS_V1		0x1
 #define ID_AA64PFR0_FP_NI		0xf
diff --git a/arch/arm64/kernel/cpufeature.c b/arch/arm64/kernel/cpufeature.c
index 04cf64e9f0c9..029a473ad273 100644
--- a/arch/arm64/kernel/cpufeature.c
+++ b/arch/arm64/kernel/cpufeature.c
@@ -156,6 +156,7 @@ static const struct arm64_ftr_bits ftr_id_aa64pfr0[] = {
 	ARM64_FTR_BITS(FTR_HIDDEN, FTR_NONSTRICT, FTR_LOWER_SAFE, ID_AA64PFR0_CSV3_SHIFT, 4, 0),
 	ARM64_FTR_BITS(FTR_HIDDEN, FTR_NONSTRICT, FTR_LOWER_SAFE, ID_AA64PFR0_CSV2_SHIFT, 4, 0),
 	ARM64_FTR_BITS(FTR_VISIBLE, FTR_STRICT, FTR_LOWER_SAFE, ID_AA64PFR0_DIT_SHIFT, 4, 0),
+	ARM64_FTR_BITS(FTR_HIDDEN, FTR_NONSTRICT, FTR_LOWER_SAFE, ID_AA64PFR0_AMU_SHIFT, 4, 0),
 	ARM64_FTR_BITS(FTR_VISIBLE_IF_IS_ENABLED(CONFIG_ARM64_SVE),
 				   FTR_STRICT, FTR_LOWER_SAFE, ID_AA64PFR0_SVE_SHIFT, 4, 0),
 	ARM64_FTR_BITS(FTR_HIDDEN, FTR_STRICT, FTR_LOWER_SAFE, ID_AA64PFR0_RAS_SHIFT, 4, 0),
@@ -1150,6 +1151,84 @@ static bool has_hw_dbm(const struct arm64_cpu_capabilities *cap,
 
 #endif
 
+#ifdef CONFIG_ARM64_AMU_EXTN
+
+/*
+ * The "amu_cpus" cpumask only signals that the CPU implementation for the
+ * flagged CPUs supports the Activity Monitors Unit (AMU) but does not provide
+ * information regarding all the events that it supports. When a CPU bit is
+ * set in the cpumask, the user of this feature can only rely on the presence
+ * of the 4 fixed counters for that CPU. But this does not guarantee that the
+ * counters are enabled or access to these counters is enabled by code
+ * executed at higher exception levels (firmware).
+ */
+static cpumask_var_t amu_cpus;
+
+bool cpu_has_amu_feat(int cpu)
+{
+	if (cpumask_available(amu_cpus))
+		return cpumask_test_cpu(cpu, amu_cpus);
+
+	return false;
+}
+
+static void cpu_amu_enable(struct arm64_cpu_capabilities const *cap)
+{
+	if (has_cpuid_feature(cap, SCOPE_LOCAL_CPU)) {
+		pr_info("detected CPU%d: Activity Monitors Unit (AMU)\n",
+			smp_processor_id());
+		cpumask_set_cpu(smp_processor_id(), amu_cpus);
+	}
+}
+
+/*
+ * For known broken firmware, a kernel parameter ("disable_amu") is provided
+ * to ensure access to AMU counter registers is not attempted. By default,
+ * the feature is enabled, but disable_amu can both be used to disable or
+ * enable the capability at runtime in case the default changes in the future.
+ *
+ * To be noted that for security considerations, this does not bypass the
+ * setting of AMUSERENR_EL0 to trap accesses from EL0 (userspace) to EL1
+ * (kernel). Therefore, firmware should still ensure accesses to AMU registers
+ * are not trapped in EL2/EL3.
+ */
+static bool disable_amu;
+
+static int __init set_disable_amu(char *str)
+{
+	int value = 0;
+
+	disable_amu = get_option(&str, &value) ? !!value : true;
+
+	return 0;
+}
+early_param("disable_amu", set_disable_amu);
+
+static bool has_amu(const struct arm64_cpu_capabilities *cap,
+		       int __unused)
+{
+	/*
+	 * The AMU extension is a non-conflicting feature: the kernel can
+	 * safely run a mix of CPUs with and without support for the
+	 * activity monitors extension. Therefore, if not disabled through
+	 * the kernel command line early parameter, enable the capability
+	 * to allow any late CPU to use the feature.
+	 *
+	 * With this feature enabled, the cpu_enable function will be called
+	 * for all CPUs that match the criteria, including secondary and
+	 * hotplugged, marking this feature as present on that respective CPU.
+	 * The enable function will also print a detection message.
+	 */
+
+	if (!disable_amu && !zalloc_cpumask_var(&amu_cpus, GFP_KERNEL)) {
+		pr_err("Activity Monitors Unit (AMU): failed to allocate memory\n");
+		disable_amu = true;
+	}
+
+	return !disable_amu;
+}
+#endif
+
 #ifdef CONFIG_ARM64_VHE
 static bool runs_at_el2(const struct arm64_cpu_capabilities *entry, int __unused)
 {
@@ -1419,6 +1498,24 @@ static const struct arm64_cpu_capabilities arm64_features[] = {
 		.cpu_enable = cpu_clear_disr,
 	},
 #endif /* CONFIG_ARM64_RAS_EXTN */
+#ifdef CONFIG_ARM64_AMU_EXTN
+	{
+		/*
+		 * The feature is enabled by default if CONFIG_ARM64_AMU_EXTN=y.
+		 * Therefore, don't provide .desc as we don't want the detection
+		 * message to be shown until at least one CPU is detected to
+		 * support the feature.
+		 */
+		.capability = ARM64_HAS_AMU_EXTN,
+		.type = ARM64_CPUCAP_WEAK_LOCAL_CPU_FEATURE,
+		.matches = has_amu,
+		.sys_reg = SYS_ID_AA64PFR0_EL1,
+		.sign = FTR_UNSIGNED,
+		.field_pos = ID_AA64PFR0_AMU_SHIFT,
+		.min_field_value = ID_AA64PFR0_AMU,
+		.cpu_enable = cpu_amu_enable,
+	},
+#endif /* CONFIG_ARM64_AMU_EXTN */
 	{
 		.desc = "Data cache clean to the PoU not required for I/D coherence",
 		.capability = ARM64_HAS_CACHE_IDC,
-- 
2.17.1



* [PATCH v3 2/7] arm64: trap to EL1 accesses to AMU counters from EL0
  2020-02-11 18:45 [PATCH v3 0/7] arm64: ARMv8.4 Activity Monitors support Ionela Voinescu
  2020-02-11 18:45 ` [PATCH v3 1/7] arm64: add support for the AMU extension v1 Ionela Voinescu
@ 2020-02-11 18:45 ` Ionela Voinescu
  2020-02-12 11:44   ` Suzuki Kuruppassery Poulose
  2020-02-12 15:36   ` Valentin Schneider
  2020-02-11 18:45 ` [PATCH v3 3/7] arm64/kvm: disable access to AMU registers from kvm guests Ionela Voinescu
                   ` (4 subsequent siblings)
  6 siblings, 2 replies; 42+ messages in thread
From: Ionela Voinescu @ 2020-02-11 18:45 UTC (permalink / raw)
  To: catalin.marinas, will, mark.rutland, maz, suzuki.poulose,
	sudeep.holla, lukasz.luba, valentin.schneider, rjw,
	ionela.voinescu
  Cc: peterz, mingo, vincent.guittot, viresh.kumar, linux-arm-kernel,
	linux-doc, linux-kernel, linux-pm, Steve Capper

The activity monitors extension is an optional extension introduced
by the ARMv8.4 CPU architecture. In order to access the activity
monitors counters safely, if desired, the kernel should detect the
presence of the extension through the feature register, and mediate
the access.

Therefore, disable direct accesses to activity monitors counters
from EL0 (userspace) and trap them to EL1 (kernel).

Note that the ARM64_AMU_EXTN kernel config and the disable_amu kernel
parameter do not have an effect on this code. Given that AMUSERENR_EL0
resets to an UNKNOWN value, the trap of EL0 accesses to EL1 is always
set up, for safety and security reasons. Therefore firmware should
still ensure that accesses to AMU registers are not trapped in
EL2/EL3, as this code cannot be bypassed if the CPU implements the
Activity Monitors Unit.

Signed-off-by: Ionela Voinescu <ionela.voinescu@arm.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Will Deacon <will@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Steve Capper <steve.capper@arm.com>
---
 arch/arm64/include/asm/assembler.h | 10 ++++++++++
 arch/arm64/mm/proc.S               |  3 +++
 2 files changed, 13 insertions(+)

diff --git a/arch/arm64/include/asm/assembler.h b/arch/arm64/include/asm/assembler.h
index 2cc0dd8bd9f7..2dc6d7b19831 100644
--- a/arch/arm64/include/asm/assembler.h
+++ b/arch/arm64/include/asm/assembler.h
@@ -443,6 +443,16 @@ USER(\label, ic	ivau, \tmp2)			// invalidate I line PoU
 9000:
 	.endm
 
+/*
+ * reset_amuserenr_el0 - reset AMUSERENR_EL0 if AMUv1 present
+ */
+	.macro	reset_amuserenr_el0, tmpreg
+	mrs	\tmpreg, id_aa64pfr0_el1	// Check ID_AA64PFR0_EL1
+	ubfx	\tmpreg, \tmpreg, #ID_AA64PFR0_AMU_SHIFT, #4
+	cbz	\tmpreg, .Lskip_\@		// Skip if no AMU present
+	msr_s	SYS_AMUSERENR_EL0, xzr		// Disable AMU access from EL0
+.Lskip_\@:
+	.endm
 /*
  * copy_page - copy src to dest using temp registers t1-t8
  */
diff --git a/arch/arm64/mm/proc.S b/arch/arm64/mm/proc.S
index a1e0592d1fbc..d8aae1152c08 100644
--- a/arch/arm64/mm/proc.S
+++ b/arch/arm64/mm/proc.S
@@ -124,6 +124,7 @@ alternative_endif
 	ubfx	x11, x11, #1, #1
 	msr	oslar_el1, x11
 	reset_pmuserenr_el0 x0			// Disable PMU access from EL0
+	reset_amuserenr_el0 x0			// Disable AMU access from EL0
 
 alternative_if ARM64_HAS_RAS_EXTN
 	msr_s	SYS_DISR_EL1, xzr
@@ -415,6 +416,8 @@ ENTRY(__cpu_setup)
 	isb					// Unmask debug exceptions now,
 	enable_dbg				// since this is per-cpu
 	reset_pmuserenr_el0 x0			// Disable PMU access from EL0
+	reset_amuserenr_el0 x0			// Disable AMU access from EL0
+
 	/*
 	 * Memory region attributes for LPAE:
 	 *
-- 
2.17.1



* [PATCH v3 3/7] arm64/kvm: disable access to AMU registers from kvm guests
  2020-02-11 18:45 [PATCH v3 0/7] arm64: ARMv8.4 Activity Monitors support Ionela Voinescu
  2020-02-11 18:45 ` [PATCH v3 1/7] arm64: add support for the AMU extension v1 Ionela Voinescu
  2020-02-11 18:45 ` [PATCH v3 2/7] arm64: trap to EL1 accesses to AMU counters from EL0 Ionela Voinescu
@ 2020-02-11 18:45 ` Ionela Voinescu
  2020-02-12 15:36   ` Valentin Schneider
  2020-02-12 16:33   ` Suzuki Kuruppassery Poulose
  2020-02-11 18:45 ` [PATCH v3 4/7] Documentation: arm64: document support for the AMU extension Ionela Voinescu
                   ` (3 subsequent siblings)
  6 siblings, 2 replies; 42+ messages in thread
From: Ionela Voinescu @ 2020-02-11 18:45 UTC (permalink / raw)
  To: catalin.marinas, will, mark.rutland, maz, suzuki.poulose,
	sudeep.holla, lukasz.luba, valentin.schneider, rjw,
	ionela.voinescu
  Cc: peterz, mingo, vincent.guittot, viresh.kumar, linux-arm-kernel,
	linux-doc, linux-kernel, linux-pm, James Morse, Julien Thierry

Access to the AMU counters should be disabled by default in kvm guests,
as information from the counters might reveal activity in other guests
or activity on the host.

Therefore, disable access to AMU registers from EL0 and EL1 in kvm
guests by:
 - Hiding the presence of the extension in the feature register
   (SYS_ID_AA64PFR0_EL1) on the VCPU.
 - Disabling access to the AMU registers before switching to the guest.
 - Trapping accesses and injecting an undefined instruction into the
   guest.
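
Hiding the extension in the feature register amounts to clearing the AMU
field of the sanitised ID_AA64PFR0_EL1 value before it is presented to
the guest, as the read_id_reg() change below does. A minimal sketch,
with hide_amu() as a hypothetical helper name:

```c
#include <stdint.h>

#define ID_AA64PFR0_AMU_SHIFT	44

/*
 * Illustrative only: mask out the 4-bit AMU field (bits [47:44]) of an
 * ID_AA64PFR0_EL1 value so a guest reading the register sees no AMU
 * support, leaving all other feature fields untouched.
 */
static uint64_t hide_amu(uint64_t pfr0)
{
	return pfr0 & ~(0xfULL << ID_AA64PFR0_AMU_SHIFT);
}
```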

Signed-off-by: Ionela Voinescu <ionela.voinescu@arm.com>
Cc: Marc Zyngier <maz@kernel.org>
Cc: James Morse <james.morse@arm.com>
Cc: Julien Thierry <julien.thierry.kdev@gmail.com>
Cc: Suzuki K Poulose <suzuki.poulose@arm.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Will Deacon <will@kernel.org>
---
 arch/arm64/include/asm/kvm_arm.h |  1 +
 arch/arm64/kvm/hyp/switch.c      | 14 ++++-
 arch/arm64/kvm/sys_regs.c        | 93 +++++++++++++++++++++++++++++++-
 3 files changed, 105 insertions(+), 3 deletions(-)

diff --git a/arch/arm64/include/asm/kvm_arm.h b/arch/arm64/include/asm/kvm_arm.h
index 6e5d839f42b5..51c1d9918999 100644
--- a/arch/arm64/include/asm/kvm_arm.h
+++ b/arch/arm64/include/asm/kvm_arm.h
@@ -267,6 +267,7 @@
 
 /* Hyp Coprocessor Trap Register */
 #define CPTR_EL2_TCPAC	(1 << 31)
+#define CPTR_EL2_TAM	(1 << 30)
 #define CPTR_EL2_TTA	(1 << 20)
 #define CPTR_EL2_TFP	(1 << CPTR_EL2_TFP_SHIFT)
 #define CPTR_EL2_TZ	(1 << 8)
diff --git a/arch/arm64/kvm/hyp/switch.c b/arch/arm64/kvm/hyp/switch.c
index 72fbbd86eb5e..11b43a365743 100644
--- a/arch/arm64/kvm/hyp/switch.c
+++ b/arch/arm64/kvm/hyp/switch.c
@@ -90,6 +90,18 @@ static void activate_traps_vhe(struct kvm_vcpu *vcpu)
 	val = read_sysreg(cpacr_el1);
 	val |= CPACR_EL1_TTA;
 	val &= ~CPACR_EL1_ZEN;
+
+	/*
+	 * With VHE (HCR.E2H == 1), accesses to CPACR_EL1 are routed to
+	 * CPTR_EL2. In general, CPACR_EL1 has the same layout as CPTR_EL2,
+	 * except for some missing controls, such as TAM.
+	 * In this case, CPTR_EL2.TAM has the same position with or without
+	 * VHE (HCR.E2H == 1) which allows us to use here the CPTR_EL2.TAM
+	 * shift value for trapping the AMU accesses.
+	 */
+
+	val |= CPTR_EL2_TAM;
+
 	if (update_fp_enabled(vcpu)) {
 		if (vcpu_has_sve(vcpu))
 			val |= CPACR_EL1_ZEN;
@@ -111,7 +123,7 @@ static void __hyp_text __activate_traps_nvhe(struct kvm_vcpu *vcpu)
 	__activate_traps_common(vcpu);
 
 	val = CPTR_EL2_DEFAULT;
-	val |= CPTR_EL2_TTA | CPTR_EL2_TZ;
+	val |= CPTR_EL2_TTA | CPTR_EL2_TZ | CPTR_EL2_TAM;
 	if (!update_fp_enabled(vcpu)) {
 		val |= CPTR_EL2_TFP;
 		__activate_traps_fpsimd32(vcpu);
diff --git a/arch/arm64/kvm/sys_regs.c b/arch/arm64/kvm/sys_regs.c
index 9f2165937f7d..adeaa65a4cf0 100644
--- a/arch/arm64/kvm/sys_regs.c
+++ b/arch/arm64/kvm/sys_regs.c
@@ -1003,6 +1003,20 @@ static bool access_pmuserenr(struct kvm_vcpu *vcpu, struct sys_reg_params *p,
 	{ SYS_DESC(SYS_PMEVTYPERn_EL0(n)),					\
 	  access_pmu_evtyper, reset_unknown, (PMEVTYPER0_EL0 + n), }
 
+static bool access_amu(struct kvm_vcpu *vcpu, struct sys_reg_params *p,
+			     const struct sys_reg_desc *r)
+{
+	kvm_inject_undefined(vcpu);
+
+	return false;
+}
+
+/* Macro to expand the AMU counter and type registers */
+#define AMU_AMEVCNTR0_EL0(n) { SYS_DESC(SYS_AMEVCNTR0_EL0(n)), access_amu }
+#define AMU_AMEVTYPE0_EL0(n) { SYS_DESC(SYS_AMEVTYPE0_EL0(n)), access_amu }
+#define AMU_AMEVCNTR1_EL0(n) { SYS_DESC(SYS_AMEVCNTR1_EL0(n)), access_amu }
+#define AMU_AMEVTYPE1_EL0(n) { SYS_DESC(SYS_AMEVTYPE1_EL0(n)), access_amu }
+
 static bool trap_ptrauth(struct kvm_vcpu *vcpu,
 			 struct sys_reg_params *p,
 			 const struct sys_reg_desc *rd)
@@ -1078,8 +1092,10 @@ static u64 read_id_reg(const struct kvm_vcpu *vcpu,
 			 (u32)r->CRn, (u32)r->CRm, (u32)r->Op2);
 	u64 val = raz ? 0 : read_sanitised_ftr_reg(id);
 
-	if (id == SYS_ID_AA64PFR0_EL1 && !vcpu_has_sve(vcpu)) {
-		val &= ~(0xfUL << ID_AA64PFR0_SVE_SHIFT);
+	if (id == SYS_ID_AA64PFR0_EL1) {
+		if (!vcpu_has_sve(vcpu))
+			val &= ~(0xfUL << ID_AA64PFR0_SVE_SHIFT);
+		val &= ~(0xfUL << ID_AA64PFR0_AMU_SHIFT);
 	} else if (id == SYS_ID_AA64ISAR1_EL1 && !vcpu_has_ptrauth(vcpu)) {
 		val &= ~((0xfUL << ID_AA64ISAR1_APA_SHIFT) |
 			 (0xfUL << ID_AA64ISAR1_API_SHIFT) |
@@ -1565,6 +1581,79 @@ static const struct sys_reg_desc sys_reg_descs[] = {
 	{ SYS_DESC(SYS_TPIDR_EL0), NULL, reset_unknown, TPIDR_EL0 },
 	{ SYS_DESC(SYS_TPIDRRO_EL0), NULL, reset_unknown, TPIDRRO_EL0 },
 
+	{ SYS_DESC(SYS_AMCR_EL0), access_amu },
+	{ SYS_DESC(SYS_AMCFGR_EL0), access_amu },
+	{ SYS_DESC(SYS_AMCGCR_EL0), access_amu },
+	{ SYS_DESC(SYS_AMUSERENR_EL0), access_amu },
+	{ SYS_DESC(SYS_AMCNTENCLR0_EL0), access_amu },
+	{ SYS_DESC(SYS_AMCNTENSET0_EL0), access_amu },
+	{ SYS_DESC(SYS_AMCNTENCLR1_EL0), access_amu },
+	{ SYS_DESC(SYS_AMCNTENSET1_EL0), access_amu },
+	AMU_AMEVCNTR0_EL0(0),
+	AMU_AMEVCNTR0_EL0(1),
+	AMU_AMEVCNTR0_EL0(2),
+	AMU_AMEVCNTR0_EL0(3),
+	AMU_AMEVCNTR0_EL0(4),
+	AMU_AMEVCNTR0_EL0(5),
+	AMU_AMEVCNTR0_EL0(6),
+	AMU_AMEVCNTR0_EL0(7),
+	AMU_AMEVCNTR0_EL0(8),
+	AMU_AMEVCNTR0_EL0(9),
+	AMU_AMEVCNTR0_EL0(10),
+	AMU_AMEVCNTR0_EL0(11),
+	AMU_AMEVCNTR0_EL0(12),
+	AMU_AMEVCNTR0_EL0(13),
+	AMU_AMEVCNTR0_EL0(14),
+	AMU_AMEVCNTR0_EL0(15),
+	AMU_AMEVTYPE0_EL0(0),
+	AMU_AMEVTYPE0_EL0(1),
+	AMU_AMEVTYPE0_EL0(2),
+	AMU_AMEVTYPE0_EL0(3),
+	AMU_AMEVTYPE0_EL0(4),
+	AMU_AMEVTYPE0_EL0(5),
+	AMU_AMEVTYPE0_EL0(6),
+	AMU_AMEVTYPE0_EL0(7),
+	AMU_AMEVTYPE0_EL0(8),
+	AMU_AMEVTYPE0_EL0(9),
+	AMU_AMEVTYPE0_EL0(10),
+	AMU_AMEVTYPE0_EL0(11),
+	AMU_AMEVTYPE0_EL0(12),
+	AMU_AMEVTYPE0_EL0(13),
+	AMU_AMEVTYPE0_EL0(14),
+	AMU_AMEVTYPE0_EL0(15),
+	AMU_AMEVCNTR1_EL0(0),
+	AMU_AMEVCNTR1_EL0(1),
+	AMU_AMEVCNTR1_EL0(2),
+	AMU_AMEVCNTR1_EL0(3),
+	AMU_AMEVCNTR1_EL0(4),
+	AMU_AMEVCNTR1_EL0(5),
+	AMU_AMEVCNTR1_EL0(6),
+	AMU_AMEVCNTR1_EL0(7),
+	AMU_AMEVCNTR1_EL0(8),
+	AMU_AMEVCNTR1_EL0(9),
+	AMU_AMEVCNTR1_EL0(10),
+	AMU_AMEVCNTR1_EL0(11),
+	AMU_AMEVCNTR1_EL0(12),
+	AMU_AMEVCNTR1_EL0(13),
+	AMU_AMEVCNTR1_EL0(14),
+	AMU_AMEVCNTR1_EL0(15),
+	AMU_AMEVTYPE1_EL0(0),
+	AMU_AMEVTYPE1_EL0(1),
+	AMU_AMEVTYPE1_EL0(2),
+	AMU_AMEVTYPE1_EL0(3),
+	AMU_AMEVTYPE1_EL0(4),
+	AMU_AMEVTYPE1_EL0(5),
+	AMU_AMEVTYPE1_EL0(6),
+	AMU_AMEVTYPE1_EL0(7),
+	AMU_AMEVTYPE1_EL0(8),
+	AMU_AMEVTYPE1_EL0(9),
+	AMU_AMEVTYPE1_EL0(10),
+	AMU_AMEVTYPE1_EL0(11),
+	AMU_AMEVTYPE1_EL0(12),
+	AMU_AMEVTYPE1_EL0(13),
+	AMU_AMEVTYPE1_EL0(14),
+	AMU_AMEVTYPE1_EL0(15),
+
 	{ SYS_DESC(SYS_CNTP_TVAL_EL0), access_arch_timer },
 	{ SYS_DESC(SYS_CNTP_CTL_EL0), access_arch_timer },
 	{ SYS_DESC(SYS_CNTP_CVAL_EL0), access_arch_timer },
-- 
2.17.1



* [PATCH v3 4/7] Documentation: arm64: document support for the AMU extension
  2020-02-11 18:45 [PATCH v3 0/7] arm64: ARMv8.4 Activity Monitors support Ionela Voinescu
                   ` (2 preceding siblings ...)
  2020-02-11 18:45 ` [PATCH v3 3/7] arm64/kvm: disable access to AMU registers from kvm guests Ionela Voinescu
@ 2020-02-11 18:45 ` Ionela Voinescu
  2020-02-12 15:36   ` Valentin Schneider
  2020-02-11 18:45 ` [PATCH v3 5/7] cpufreq: add function to get the hardware max frequency Ionela Voinescu
                   ` (2 subsequent siblings)
  6 siblings, 1 reply; 42+ messages in thread
From: Ionela Voinescu @ 2020-02-11 18:45 UTC (permalink / raw)
  To: catalin.marinas, will, mark.rutland, maz, suzuki.poulose,
	sudeep.holla, lukasz.luba, valentin.schneider, rjw,
	ionela.voinescu
  Cc: peterz, mingo, vincent.guittot, viresh.kumar, linux-arm-kernel,
	linux-doc, linux-kernel, linux-pm, Jonathan Corbet

The activity monitors extension is an optional extension introduced
by the ARMv8.4 CPU architecture.

Add initial documentation for the AMUv1 extension:
 - arm64/amu.txt: AMUv1 documentation
 - arm64/booting.txt: system registers initialisation

Signed-off-by: Ionela Voinescu <ionela.voinescu@arm.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Will Deacon <will@kernel.org>
Cc: Jonathan Corbet <corbet@lwn.net>
---
 Documentation/arm64/amu.rst     | 114 ++++++++++++++++++++++++++++++++
 Documentation/arm64/booting.rst |  14 ++++
 Documentation/arm64/index.rst   |   1 +
 3 files changed, 129 insertions(+)
 create mode 100644 Documentation/arm64/amu.rst

diff --git a/Documentation/arm64/amu.rst b/Documentation/arm64/amu.rst
new file mode 100644
index 000000000000..ad609ada2d8e
--- /dev/null
+++ b/Documentation/arm64/amu.rst
@@ -0,0 +1,114 @@
+=======================================================
+Activity Monitors Unit (AMU) extension in AArch64 Linux
+=======================================================
+
+Author: Ionela Voinescu <ionela.voinescu@arm.com>
+
+Date: 2019-09-10
+
+This document briefly describes the provision of Activity Monitors Unit
+support in AArch64 Linux.
+
+
+Architecture overview
+---------------------
+
+The activity monitors extension is an optional extension introduced by the
+ARMv8.4 CPU architecture.
+
+The activity monitors unit, implemented in each CPU, provides performance
+counters intended for system management use. The AMU extension provides a
+system register interface to the counter registers and also supports an
+optional external memory-mapped interface.
+
+Version 1 of the Activity Monitors architecture implements a counter group
+of four fixed and architecturally defined 64-bit event counters.
+  - CPU cycle counter: increments at the frequency of the CPU.
+  - Constant counter: increments at the fixed frequency of the system
+    clock.
+  - Instructions retired: increments with every architecturally executed
+    instruction.
+  - Memory stall cycles: counts instruction dispatch stall cycles caused by
+    misses in the last level cache within the clock domain.
+
+When in WFI or WFE these counters do not increment.
+
+The Activity Monitors architecture provides space for up to 16 architected
+event counters. Future versions of the architecture may use this space to
+implement additional architected event counters.
+
+Additionally, version 1 implements a counter group of up to 16 auxiliary
+64-bit event counters.
+
+On cold reset all counters reset to 0.
+
+
+Basic support
+-------------
+
+The kernel can safely run a mix of CPUs with and without support for the
+activity monitors extension. Therefore, if the capability is not disabled
+at system level (either through CONFIG_ARM64_AMU_EXTN or kernel parameter)
+we unconditionally enable the capability to allow any late CPU (secondary
+or hotplugged) to detect and use the feature.
+
+When the feature is detected on a CPU, we flag the availability of the
+feature but this does not guarantee the correct functionality of the
+counters, only the presence of the extension.
+
+Firmware (code running at higher exception levels, e.g. arm-tf) support is
+needed to:
+ - Enable access for lower exception levels (EL2 and EL1) to the AMU
+   registers.
+ - Enable the counters. If not enabled these will read as 0.
+ - Save/restore the counters before/after the CPU is being put/brought up
+   from the 'off' power state.
+
+When using kernels that have this feature enabled but boot with broken
+firmware, the user may experience panics or lockups when accessing the
+counter registers. Even if these symptoms are not observed, the values
+returned by the register reads might not correctly reflect reality. Most
+commonly, the counters will read as 0, indicating that they are not
+enabled.
+
+If proper support is not provided in firmware, it's best to disable
+CONFIG_ARM64_AMU_EXTN or disable the capability at runtime through the
+corresponding kernel parameter. Note that, for security reasons,
+this does not bypass the setting of AMUSERENR_EL0 to trap accesses from
+EL0 (userspace) to EL1 (kernel). Therefore, firmware should still ensure
+accesses to AMU registers are not trapped in EL2/EL3.
+
+The fixed counters of AMUv1 are accessible through the following system
+register definitions:
+ - SYS_AMEVCNTR0_CORE_EL0
+ - SYS_AMEVCNTR0_CONST_EL0
+ - SYS_AMEVCNTR0_INST_RET_EL0
+ - SYS_AMEVCNTR0_MEM_STALL_EL0
+
+Auxiliary platform specific counters can be accessed using
+SYS_AMEVCNTR1_EL0(n), where n is a value between 0 and 15.
+
+Details can be found in: arch/arm64/include/asm/sysreg.h.
+
+
+Userspace access
+----------------
+
+Currently, access from userspace to the AMU registers is disabled due to:
+ - Security reasons: they might expose information about code executed in
+   secure mode.
+ - Purpose: AMU counters are intended for system management use.
+
+Also, the presence of the feature is not visible to userspace.
+
+
+Virtualization
+--------------
+
+Currently, access from userspace (EL0) and kernelspace (EL1) on the KVM
+guest side is disabled due to:
+ - Security reasons: they might expose information about code executed
+   by other guests or the host.
+
+Any attempt to access the AMU registers will result in an UNDEFINED
+exception being injected into the guest.
diff --git a/Documentation/arm64/booting.rst b/Documentation/arm64/booting.rst
index 5d78a6f5b0ae..a3f1a47b6f1c 100644
--- a/Documentation/arm64/booting.rst
+++ b/Documentation/arm64/booting.rst
@@ -248,6 +248,20 @@ Before jumping into the kernel, the following conditions must be met:
     - HCR_EL2.APK (bit 40) must be initialised to 0b1
     - HCR_EL2.API (bit 41) must be initialised to 0b1
 
+  For CPUs with Activity Monitors Unit v1 (AMUv1) extension present:
+  - If EL3 is present:
+    CPTR_EL3.TAM (bit 30) must be initialised to 0b0
+    CPTR_EL2.TAM (bit 30) must be initialised to 0b0
+    AMCNTENSET0_EL0 must be initialised to 0b1111
+    AMCNTENSET1_EL0 must be initialised to a platform specific value
+    having 0b1 set for the corresponding bit for each of the auxiliary
+    counters present.
+  - If the kernel is entered at EL1:
+    AMCNTENSET0_EL0 must be initialised to 0b1111
+    AMCNTENSET1_EL0 must be initialised to a platform specific value
+    having 0b1 set for the corresponding bit for each of the auxiliary
+    counters present.
+
 The requirements described above for CPU mode, caches, MMUs, architected
 timers, coherency and system registers apply to all CPUs.  All CPUs must
 enter the kernel in the same exception level.
diff --git a/Documentation/arm64/index.rst b/Documentation/arm64/index.rst
index 5c0c69dc58aa..09cbb4ed2237 100644
--- a/Documentation/arm64/index.rst
+++ b/Documentation/arm64/index.rst
@@ -6,6 +6,7 @@ ARM64 Architecture
     :maxdepth: 1
 
     acpi_object_usage
+    amu
     arm-acpi
     booting
     cpu-feature-registers
-- 
2.17.1



* [PATCH v3 5/7] cpufreq: add function to get the hardware max frequency
  2020-02-11 18:45 [PATCH v3 0/7] arm64: ARMv8.4 Activity Monitors support Ionela Voinescu
                   ` (3 preceding siblings ...)
  2020-02-11 18:45 ` [PATCH v3 4/7] Documentation: arm64: document support for the AMU extension Ionela Voinescu
@ 2020-02-11 18:45 ` Ionela Voinescu
  2020-02-12  4:14   ` Viresh Kumar
  2020-02-13 11:59   ` Valentin Schneider
  2020-02-11 18:45 ` [PATCH v3 6/7] arm64: use activity monitors for frequency invariance Ionela Voinescu
  2020-02-11 18:45 ` [PATCH v3 7/7] clocksource/drivers/arm_arch_timer: validate arch_timer_rate Ionela Voinescu
  6 siblings, 2 replies; 42+ messages in thread
From: Ionela Voinescu @ 2020-02-11 18:45 UTC (permalink / raw)
  To: catalin.marinas, will, mark.rutland, maz, suzuki.poulose,
	sudeep.holla, lukasz.luba, valentin.schneider, rjw,
	ionela.voinescu
  Cc: peterz, mingo, vincent.guittot, viresh.kumar, linux-arm-kernel,
	linux-doc, linux-kernel, linux-pm

Add a weak function to return the hardware maximum frequency of a
CPU, with the default implementation returning cpuinfo.max_freq.

The default can be overridden by a strong function in platforms
that want to provide an alternative implementation.

Signed-off-by: Ionela Voinescu <ionela.voinescu@arm.com>
Cc: Rafael J. Wysocki <rjw@rjwysocki.net>
Cc: Viresh Kumar <viresh.kumar@linaro.org>
---
 drivers/cpufreq/cpufreq.c | 20 ++++++++++++++++++++
 include/linux/cpufreq.h   |  5 +++++
 2 files changed, 25 insertions(+)

diff --git a/drivers/cpufreq/cpufreq.c b/drivers/cpufreq/cpufreq.c
index 77114a3897fb..d2ff3018861d 100644
--- a/drivers/cpufreq/cpufreq.c
+++ b/drivers/cpufreq/cpufreq.c
@@ -1733,6 +1733,26 @@ unsigned int cpufreq_quick_get_max(unsigned int cpu)
 }
 EXPORT_SYMBOL(cpufreq_quick_get_max);
 
+/**
+ * cpufreq_get_hw_max_freq - get the max hardware frequency of the CPU
+ * @cpu: CPU number
+ *
+ * The default return value is the max_freq field of cpuinfo.
+ */
+__weak unsigned int cpufreq_get_hw_max_freq(unsigned int cpu)
+{
+	struct cpufreq_policy *policy = cpufreq_cpu_get(cpu);
+	unsigned int ret_freq = 0;
+
+	if (policy) {
+		ret_freq = policy->cpuinfo.max_freq;
+		cpufreq_cpu_put(policy);
+	}
+
+	return ret_freq;
+}
+EXPORT_SYMBOL(cpufreq_get_hw_max_freq);
+
 static unsigned int __cpufreq_get(struct cpufreq_policy *policy)
 {
 	if (unlikely(policy_is_inactive(policy)))
diff --git a/include/linux/cpufreq.h b/include/linux/cpufreq.h
index 31b1b0e03df8..b4423ff619f4 100644
--- a/include/linux/cpufreq.h
+++ b/include/linux/cpufreq.h
@@ -194,6 +194,7 @@ extern struct kobject *cpufreq_global_kobject;
 unsigned int cpufreq_get(unsigned int cpu);
 unsigned int cpufreq_quick_get(unsigned int cpu);
 unsigned int cpufreq_quick_get_max(unsigned int cpu);
+unsigned int cpufreq_get_hw_max_freq(unsigned int cpu);
 void disable_cpufreq(void);
 
 u64 get_cpu_idle_time(unsigned int cpu, u64 *wall, int io_busy);
@@ -223,6 +224,10 @@ static inline unsigned int cpufreq_quick_get_max(unsigned int cpu)
 {
 	return 0;
 }
+static inline unsigned int cpufreq_get_hw_max_freq(unsigned int cpu)
+{
+	return 0;
+}
 static inline void disable_cpufreq(void) { }
 #endif
 
-- 
2.17.1



* [PATCH v3 6/7] arm64: use activity monitors for frequency invariance
  2020-02-11 18:45 [PATCH v3 0/7] arm64: ARMv8.4 Activity Monitors support Ionela Voinescu
                   ` (4 preceding siblings ...)
  2020-02-11 18:45 ` [PATCH v3 5/7] cpufreq: add function to get the hardware max frequency Ionela Voinescu
@ 2020-02-11 18:45 ` Ionela Voinescu
  2020-02-12 18:59   ` Lukasz Luba
  2020-02-17 16:59   ` Valentin Schneider
  2020-02-11 18:45 ` [PATCH v3 7/7] clocksource/drivers/arm_arch_timer: validate arch_timer_rate Ionela Voinescu
  6 siblings, 2 replies; 42+ messages in thread
From: Ionela Voinescu @ 2020-02-11 18:45 UTC (permalink / raw)
  To: catalin.marinas, will, mark.rutland, maz, suzuki.poulose,
	sudeep.holla, lukasz.luba, valentin.schneider, rjw,
	ionela.voinescu
  Cc: peterz, mingo, vincent.guittot, viresh.kumar, linux-arm-kernel,
	linux-doc, linux-kernel, linux-pm

The Frequency Invariance Engine (FIE) provides a frequency scaling
correction factor that helps achieve more accurate load-tracking.

So far, for arm and arm64 platforms, this scale factor has been
obtained based on the ratio between the current frequency and the
maximum supported frequency recorded by the cpufreq policy. The
setting of this scale factor is triggered from cpufreq drivers by
calling arch_set_freq_scale. The current frequency used in computation
is the frequency requested by a governor, but it may not be the
frequency that was implemented by the platform.

This correction factor can also be obtained using a core counter and a
constant counter to get information on the performance (frequency based
only) obtained in a period of time. This will more accurately reflect
the actual current frequency of the CPU, compared with the alternative
implementation that reflects the request of a performance level from
the OS.

Therefore, implement arch_scale_freq_tick to use activity monitors, if
present, for the computation of the frequency scale factor.

The use of AMU counters depends on:
 - CONFIG_ARM64_AMU_EXTN - depends on the AMU extension being present
 - (optional) CONFIG_CPU_FREQ - the current frequency obtained using
   counter information is divided by the maximum frequency obtained from
   the cpufreq policy. The cpufreq policy maximum is only a weak default,
   though: cpu_get_max_freq can be redefined to provide the data some
   other way.

While it is possible to have a combination of CPUs in the system with
and without support for activity monitors, the use of counters for
frequency invariance is only enabled for a CPU if all related CPUs
(CPUs in the same frequency domain) support and have enabled the core
and constant activity monitor counters. In this way, there is a clear
separation between the policies for which arch_set_freq_scale (cpufreq
based FIE) is used, and the policies for which arch_scale_freq_tick
(counter based FIE) is used to set the frequency scale factor. For
this purpose, a late_initcall_sync is registered to trigger validation
work for policies that will enable or disable the use of AMU counters
for frequency invariance. If CONFIG_CPU_FREQ is not defined, the use
of counters is enabled on all CPUs only if all possible CPUs correctly
support the necessary counters.

Signed-off-by: Ionela Voinescu <ionela.voinescu@arm.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Will Deacon <will@kernel.org>
Cc: Sudeep Holla <sudeep.holla@arm.com>
---
 arch/arm64/include/asm/topology.h |  16 +++
 arch/arm64/kernel/cpufeature.c    |   4 +
 arch/arm64/kernel/topology.c      | 185 ++++++++++++++++++++++++++++++
 drivers/base/arch_topology.c      |   8 ++
 include/linux/topology.h          |   7 ++
 5 files changed, 220 insertions(+)

diff --git a/arch/arm64/include/asm/topology.h b/arch/arm64/include/asm/topology.h
index a4d945db95a2..d910d463cf76 100644
--- a/arch/arm64/include/asm/topology.h
+++ b/arch/arm64/include/asm/topology.h
@@ -16,6 +16,22 @@ int pcibus_to_node(struct pci_bus *bus);
 
 #include <linux/arch_topology.h>
 
+#ifdef CONFIG_ARM64_AMU_EXTN
+extern unsigned int cpu_get_max_freq(unsigned int cpu);
+/*
+ * Replace default function that signals use of counters
+ * for frequency invariance for given CPUs.
+ */
+bool topology_cpu_freq_counters(struct cpumask *cpus);
+#define arch_cpu_freq_counters topology_cpu_freq_counters
+/*
+ * Replace task scheduler's default counter-based
+ * frequency-invariance scale factor setting.
+ */
+void topology_scale_freq_tick(void);
+#define arch_scale_freq_tick topology_scale_freq_tick
+#endif /* CONFIG_ARM64_AMU_EXTN */
+
 /* Replace task scheduler's default frequency-invariant accounting */
 #define arch_scale_freq_capacity topology_get_freq_scale
 
diff --git a/arch/arm64/kernel/cpufeature.c b/arch/arm64/kernel/cpufeature.c
index 029a473ad273..a4620b269b56 100644
--- a/arch/arm64/kernel/cpufeature.c
+++ b/arch/arm64/kernel/cpufeature.c
@@ -1172,12 +1172,16 @@ bool cpu_has_amu_feat(int cpu)
 	return false;
 }
 
+/* Initialize the use of AMU counters for frequency invariance */
+extern void init_cpu_freq_invariance_counters(void);
+
 static void cpu_amu_enable(struct arm64_cpu_capabilities const *cap)
 {
 	if (has_cpuid_feature(cap, SCOPE_LOCAL_CPU)) {
 		pr_info("detected CPU%d: Activity Monitors Unit (AMU)\n",
 			smp_processor_id());
 		cpumask_set_cpu(smp_processor_id(), amu_cpus);
+		init_cpu_freq_invariance_counters();
 	}
 }
 
diff --git a/arch/arm64/kernel/topology.c b/arch/arm64/kernel/topology.c
index fa9528dfd0ce..3f6e379f07b3 100644
--- a/arch/arm64/kernel/topology.c
+++ b/arch/arm64/kernel/topology.c
@@ -14,6 +14,7 @@
 #include <linux/acpi.h>
 #include <linux/arch_topology.h>
 #include <linux/cacheinfo.h>
+#include <linux/cpufreq.h>
 #include <linux/init.h>
 #include <linux/percpu.h>
 
@@ -120,4 +121,188 @@ int __init parse_acpi_topology(void)
 }
 #endif
 
+#ifdef CONFIG_ARM64_AMU_EXTN
 
+#undef pr_fmt
+#define pr_fmt(fmt) "AMU: " fmt
+
+static DEFINE_PER_CPU_READ_MOSTLY(unsigned long, arch_max_freq_scale);
+static DEFINE_PER_CPU(u64, arch_const_cycles_prev);
+static DEFINE_PER_CPU(u64, arch_core_cycles_prev);
+static cpumask_var_t amu_fie_cpus;
+
+/* Obtain max frequency (in KHz) as reported by hardware */
+__weak unsigned int cpu_get_max_freq(unsigned int cpu)
+{
+	return 0;
+}
+
+#ifdef CONFIG_CPU_FREQ
+/* Replace max frequency getter with cpufreq based function */
+#define cpu_get_max_freq cpufreq_get_hw_max_freq
+#endif
+
+/* Initialize counter reference per-cpu variables for the current CPU */
+void init_cpu_freq_invariance_counters(void)
+{
+	this_cpu_write(arch_core_cycles_prev,
+		       read_sysreg_s(SYS_AMEVCNTR0_CORE_EL0));
+	this_cpu_write(arch_const_cycles_prev,
+		       read_sysreg_s(SYS_AMEVCNTR0_CONST_EL0));
+}
+
+static int validate_cpu_freq_invariance_counters(int cpu)
+{
+	u64 max_freq_hz, ratio;
+
+	if (!cpu_has_amu_feat(cpu)) {
+		pr_debug("CPU%d: counters are not supported.\n", cpu);
+		return -EINVAL;
+	}
+
+	if (unlikely(!per_cpu(arch_const_cycles_prev, cpu) ||
+		     !per_cpu(arch_core_cycles_prev, cpu))) {
+		pr_debug("CPU%d: cycle counters are not enabled.\n", cpu);
+		return -EINVAL;
+	}
+
+	/* Convert maximum frequency from KHz to Hz and validate */
+	max_freq_hz = cpu_get_max_freq(cpu) * 1000;
+	if (unlikely(!max_freq_hz)) {
+		pr_debug("CPU%d: invalid maximum frequency.\n", cpu);
+		return -EINVAL;
+	}
+
+	/*
+	 * Pre-compute the fixed ratio between the frequency of the constant
+	 * counter and the maximum frequency of the CPU.
+	 *
+	 *			      const_freq
+	 * arch_max_freq_scale =   ---------------- * SCHED_CAPACITY_SCALE²
+	 *			   cpuinfo_max_freq
+	 *
+	 * We use a factor of 2 * SCHED_CAPACITY_SHIFT -> SCHED_CAPACITY_SCALE²
+	 * in order to ensure a good resolution for arch_max_freq_scale for
+	 * very low arch timer frequencies (up to the KHz range which should be
+	 * unlikely).
+	 */
+	ratio = (u64)arch_timer_get_rate() << (2 * SCHED_CAPACITY_SHIFT);
+	ratio = div64_u64(ratio, max_freq_hz);
+	if (!ratio) {
+		pr_err("System timer frequency too low.\n");
+		return -EINVAL;
+	}
+
+	per_cpu(arch_max_freq_scale, cpu) = (unsigned long)ratio;
+
+	return 0;
+}
+
+static inline bool
+enable_policy_freq_counters(int cpu, cpumask_var_t valid_cpus)
+{
+	struct cpufreq_policy *policy = cpufreq_cpu_get(cpu);
+
+	if (!policy) {
+		pr_debug("CPU%d: No cpufreq policy found.\n", cpu);
+		return false;
+	}
+
+	if (cpumask_subset(policy->related_cpus, valid_cpus)) {
+		cpumask_or(amu_fie_cpus, policy->related_cpus,
+			   amu_fie_cpus);
+		pr_info("CPUs[%*pbl]: counters will be used for FIE.",
+			cpumask_pr_args(amu_fie_cpus));
+	}
+
+	cpufreq_cpu_put(policy);
+
+	return true;
+}
+
+static int __init init_amu_fie(void)
+{
+	cpumask_var_t valid_cpus;
+	bool have_policy = false;
+	int cpu;
+
+	if (!zalloc_cpumask_var(&valid_cpus, GFP_KERNEL) ||
+	    !zalloc_cpumask_var(&amu_fie_cpus, GFP_KERNEL))
+		return -ENOMEM;
+
+	for_each_possible_cpu(cpu) {
+		if (validate_cpu_freq_invariance_counters(cpu))
+			continue;
+		cpumask_set_cpu(cpu, valid_cpus);
+		have_policy = enable_policy_freq_counters(cpu, valid_cpus) ||
+			      have_policy;
+	}
+
+	if (!have_policy) {
+		/*
+		 * If we are not restricted by cpufreq policies, we only enable
+		 * the use of the AMU feature for FIE if all CPUs support AMU.
+		 * Otherwise, enable_policy_freq_counters has already enabled
+		 * policy cpus.
+		 */
+		if (cpumask_equal(valid_cpus, cpu_possible_mask)) {
+			cpumask_or(amu_fie_cpus, amu_fie_cpus, valid_cpus);
+			pr_info("CPUs[%*pbl]: counters will be used for FIE.",
+				cpumask_pr_args(amu_fie_cpus));
+		}
+	}
+
+	free_cpumask_var(valid_cpus);
+
+	return 0;
+}
+late_initcall_sync(init_amu_fie);
+
+bool topology_cpu_freq_counters(struct cpumask *cpus)
+{
+	return cpumask_available(amu_fie_cpus) &&
+	       cpumask_subset(cpus, amu_fie_cpus);
+}
+
+void topology_scale_freq_tick(void)
+{
+	u64 prev_core_cnt, prev_const_cnt;
+	u64 core_cnt, const_cnt, scale;
+	int cpu = smp_processor_id();
+
+	if (!cpumask_available(amu_fie_cpus) ||
+	    !cpumask_test_cpu(cpu, amu_fie_cpus))
+		return;
+
+	const_cnt = read_sysreg_s(SYS_AMEVCNTR0_CONST_EL0);
+	core_cnt = read_sysreg_s(SYS_AMEVCNTR0_CORE_EL0);
+	prev_const_cnt = this_cpu_read(arch_const_cycles_prev);
+	prev_core_cnt = this_cpu_read(arch_core_cycles_prev);
+
+	if (unlikely(core_cnt <= prev_core_cnt ||
+		     const_cnt <= prev_const_cnt))
+		goto store_and_exit;
+
+	/*
+	 *	    /\core    arch_max_freq_scale
+	 * scale =  ------- * --------------------
+	 *	    /\const   SCHED_CAPACITY_SCALE
+	 *
+	 * We shift by SCHED_CAPACITY_SHIFT (divide by SCHED_CAPACITY_SCALE)
+	 * in order to compensate for the SCHED_CAPACITY_SCALE² factor in
+	 * arch_max_freq_scale (used to ensure its resolution) while keeping
+	 * the scale value in the 0-SCHED_CAPACITY_SCALE capacity range.
+	 */
+	scale = core_cnt - prev_core_cnt;
+	scale *= this_cpu_read(arch_max_freq_scale);
+	scale = div64_u64(scale >> SCHED_CAPACITY_SHIFT,
+			  const_cnt - prev_const_cnt);
+
+	scale = min_t(unsigned long, scale, SCHED_CAPACITY_SCALE);
+	this_cpu_write(freq_scale, (unsigned long)scale);
+
+store_and_exit:
+	this_cpu_write(arch_core_cycles_prev, core_cnt);
+	this_cpu_write(arch_const_cycles_prev, const_cnt);
+}
+#endif /* CONFIG_ARM64_AMU_EXTN */
diff --git a/drivers/base/arch_topology.c b/drivers/base/arch_topology.c
index 1eb81f113786..1ab2b7503d63 100644
--- a/drivers/base/arch_topology.c
+++ b/drivers/base/arch_topology.c
@@ -29,6 +29,14 @@ void arch_set_freq_scale(struct cpumask *cpus, unsigned long cur_freq,
 	unsigned long scale;
 	int i;
 
+	/*
+	 * If the use of counters for FIE is enabled, just return as we don't
+	 * want to update the scale factor with information from CPUFREQ.
+	 * Instead the scale factor will be updated from arch_scale_freq_tick.
+	 */
+	if (arch_cpu_freq_counters(cpus))
+		return;
+
 	scale = (cur_freq << SCHED_CAPACITY_SHIFT) / max_freq;
 
 	for_each_cpu(i, cpus)
diff --git a/include/linux/topology.h b/include/linux/topology.h
index eb2fe6edd73c..397aad6ae163 100644
--- a/include/linux/topology.h
+++ b/include/linux/topology.h
@@ -227,5 +227,12 @@ static inline const struct cpumask *cpu_cpu_mask(int cpu)
 	return cpumask_of_node(cpu_to_node(cpu));
 }
 
+#ifndef arch_cpu_freq_counters
+static __always_inline
+bool arch_cpu_freq_counters(struct cpumask *cpus)
+{
+	return false;
+}
+#endif
 
 #endif /* _LINUX_TOPOLOGY_H */
-- 
2.17.1



* [PATCH v3 7/7] clocksource/drivers/arm_arch_timer: validate arch_timer_rate
  2020-02-11 18:45 [PATCH v3 0/7] arm64: ARMv8.4 Activity Monitors support Ionela Voinescu
                   ` (5 preceding siblings ...)
  2020-02-11 18:45 ` [PATCH v3 6/7] arm64: use activity monitors for frequency invariance Ionela Voinescu
@ 2020-02-11 18:45 ` Ionela Voinescu
  2020-02-12  9:30   ` Valentin Schneider
                     ` (2 more replies)
  6 siblings, 3 replies; 42+ messages in thread
From: Ionela Voinescu @ 2020-02-11 18:45 UTC (permalink / raw)
  To: catalin.marinas, will, mark.rutland, maz, suzuki.poulose,
	sudeep.holla, lukasz.luba, valentin.schneider, rjw,
	ionela.voinescu
  Cc: peterz, mingo, vincent.guittot, viresh.kumar, linux-arm-kernel,
	linux-doc, linux-kernel, linux-pm

From: Valentin Schneider <valentin.schneider@arm.com>

Using an arch timer with a frequency of less than 1MHz can result in
incorrect behaviour of a system which assumes a reasonable timer rate.

One example is the use of activity monitors for frequency invariance,
which uses the rate of the arch timer as the known rate of the constant
cycle counter when computing its ratio relative to the maximum frequency
of a CPU. For arch timer frequencies below 1MHz this ratio could
end up being 0, which is an invalid value for its use.

Therefore, warn if the arch timer rate is below 1MHz, which falls
outside the architecturally recommended range of 1 to 50MHz.

Signed-off-by: Ionela Voinescu <ionela.voinescu@arm.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Marc Zyngier <maz@kernel.org>
---
 drivers/clocksource/arm_arch_timer.c | 18 +++++++++++++++---
 1 file changed, 15 insertions(+), 3 deletions(-)

diff --git a/drivers/clocksource/arm_arch_timer.c b/drivers/clocksource/arm_arch_timer.c
index 9a5464c625b4..4faa930eabf8 100644
--- a/drivers/clocksource/arm_arch_timer.c
+++ b/drivers/clocksource/arm_arch_timer.c
@@ -885,6 +885,17 @@ static int arch_timer_starting_cpu(unsigned int cpu)
 	return 0;
 }
 
+static int validate_timer_rate(void)
+{
+	if (!arch_timer_rate)
+		return -EINVAL;
+
+	/* Arch timer frequency < 1MHz can cause trouble */
+	WARN_ON(arch_timer_rate < 1000000);
+
+	return 0;
+}
+
 /*
  * For historical reasons, when probing with DT we use whichever (non-zero)
  * rate was probed first, and don't verify that others match. If the first node
@@ -900,7 +911,7 @@ static void arch_timer_of_configure_rate(u32 rate, struct device_node *np)
 		arch_timer_rate = rate;
 
 	/* Check the timer frequency. */
-	if (arch_timer_rate == 0)
+	if (validate_timer_rate())
 		pr_warn("frequency not available\n");
 }
 
@@ -1594,9 +1605,10 @@ static int __init arch_timer_acpi_init(struct acpi_table_header *table)
 	 * CNTFRQ value. This *must* be correct.
 	 */
 	arch_timer_rate = arch_timer_get_cntfrq();
-	if (!arch_timer_rate) {
+	ret = validate_timer_rate();
+	if (ret) {
 		pr_err(FW_BUG "frequency not available.\n");
-		return -EINVAL;
+		return ret;
 	}
 
 	arch_timer_uses_ppi = arch_timer_select_ppi();
-- 
2.17.1



* Re: [PATCH v3 5/7] cpufreq: add function to get the hardware max frequency
  2020-02-11 18:45 ` [PATCH v3 5/7] cpufreq: add function to get the hardware max frequency Ionela Voinescu
@ 2020-02-12  4:14   ` Viresh Kumar
  2020-02-13 11:59   ` Valentin Schneider
  1 sibling, 0 replies; 42+ messages in thread
From: Viresh Kumar @ 2020-02-12  4:14 UTC (permalink / raw)
  To: Ionela Voinescu
  Cc: catalin.marinas, will, mark.rutland, maz, suzuki.poulose,
	sudeep.holla, lukasz.luba, valentin.schneider, rjw, peterz,
	mingo, vincent.guittot, linux-arm-kernel, linux-doc,
	linux-kernel, linux-pm

On 11-02-20, 18:45, Ionela Voinescu wrote:
> Add weak function to return the hardware maximum frequency of a
> CPU, with the default implementation returning cpuinfo.max_freq.
> 
> The default can be overwritten by a strong function in platforms
> that want to provide an alternative implementation.
> 
> Signed-off-by: Ionela Voinescu <ionela.voinescu@arm.com>
> Cc: Rafael J. Wysocki <rjw@rjwysocki.net>
> Cc: Viresh Kumar <viresh.kumar@linaro.org>
> ---
>  drivers/cpufreq/cpufreq.c | 20 ++++++++++++++++++++
>  include/linux/cpufreq.h   |  5 +++++
>  2 files changed, 25 insertions(+)
> 
> diff --git a/drivers/cpufreq/cpufreq.c b/drivers/cpufreq/cpufreq.c
> index 77114a3897fb..d2ff3018861d 100644
> --- a/drivers/cpufreq/cpufreq.c
> +++ b/drivers/cpufreq/cpufreq.c
> @@ -1733,6 +1733,26 @@ unsigned int cpufreq_quick_get_max(unsigned int cpu)
>  }
>  EXPORT_SYMBOL(cpufreq_quick_get_max);
>  
> +/**
> + * cpufreq_get_hw_max_freq - get the max hardware frequency of the CPU
> + * @cpu: CPU number
> + *
> + * The default return value is the max_freq field of cpuinfo.
> + */
> +__weak unsigned int cpufreq_get_hw_max_freq(unsigned int cpu)
> +{
> +	struct cpufreq_policy *policy = cpufreq_cpu_get(cpu);
> +	unsigned int ret_freq = 0;
> +
> +	if (policy) {
> +		ret_freq = policy->cpuinfo.max_freq;
> +		cpufreq_cpu_put(policy);
> +	}
> +
> +	return ret_freq;
> +}
> +EXPORT_SYMBOL(cpufreq_get_hw_max_freq);
> +
>  static unsigned int __cpufreq_get(struct cpufreq_policy *policy)
>  {
>  	if (unlikely(policy_is_inactive(policy)))
> diff --git a/include/linux/cpufreq.h b/include/linux/cpufreq.h
> index 31b1b0e03df8..b4423ff619f4 100644
> --- a/include/linux/cpufreq.h
> +++ b/include/linux/cpufreq.h
> @@ -194,6 +194,7 @@ extern struct kobject *cpufreq_global_kobject;
>  unsigned int cpufreq_get(unsigned int cpu);
>  unsigned int cpufreq_quick_get(unsigned int cpu);
>  unsigned int cpufreq_quick_get_max(unsigned int cpu);
> +unsigned int cpufreq_get_hw_max_freq(unsigned int cpu);
>  void disable_cpufreq(void);
>  
>  u64 get_cpu_idle_time(unsigned int cpu, u64 *wall, int io_busy);
> @@ -223,6 +224,10 @@ static inline unsigned int cpufreq_quick_get_max(unsigned int cpu)
>  {
>  	return 0;
>  }
> +static inline unsigned int cpufreq_get_hw_max_freq(unsigned int cpu)
> +{
> +	return 0;
> +}
>  static inline void disable_cpufreq(void) { }
>  #endif

Acked-by: Viresh Kumar <viresh.kumar@linaro.org>

-- 
viresh


* Re: [PATCH v3 7/7] clocksource/drivers/arm_arch_timer: validate arch_timer_rate
  2020-02-11 18:45 ` [PATCH v3 7/7] clocksource/drivers/arm_arch_timer: validate arch_timer_rate Ionela Voinescu
@ 2020-02-12  9:30   ` Valentin Schneider
  2020-02-12 10:32     ` Ionela Voinescu
  2020-02-12 10:01   ` Lukasz Luba
  2020-02-14  0:35   ` Thomas Gleixner
  2 siblings, 1 reply; 42+ messages in thread
From: Valentin Schneider @ 2020-02-12  9:30 UTC (permalink / raw)
  To: Ionela Voinescu, catalin.marinas, will, mark.rutland, maz,
	suzuki.poulose, sudeep.holla, lukasz.luba, rjw
  Cc: peterz, mingo, vincent.guittot, viresh.kumar, linux-arm-kernel,
	linux-doc, linux-kernel, linux-pm

On 11/02/2020 18:45, Ionela Voinescu wrote:
> From: Valentin Schneider <valentin.schneider@arm.com>
> 
> Using an arch timer with a frequency of less than 1MHz can result in an
> incorrect functionality of the system which assumes a reasonable rate.
> 
> One example is the use of activity monitors for frequency invariance
> which uses the rate of the arch timer as the known rate of the constant
> cycle counter in computing its ratio compared to the maximum frequency
> of a CPU. For arch timer frequencies less than 1MHz this ratio could
> end up being 0 which is an invalid value for its use.
> 

I'm being pedantic here (as usual), but I'd contrast this a bit more. The
activity monitor code checks by itself that the ratio doesn't end up being
0, which is why we don't slam the brakes if the arch timer freq is < 1MHz.

It's just a CNTFRQ sanity check that goes a bit beyond the 0 value check,
IMO.

> Therefore, warn if the arch timer rate is below 1MHz which contravenes
> the recommended architecture interval of 1 to 50MHz.
> 
> Signed-off-by: Ionela Voinescu <ionela.voinescu@arm.com>
> Cc: Mark Rutland <mark.rutland@arm.com>
> Cc: Marc Zyngier <maz@kernel.org>

ISTR something somewhere that says the first signoff should be that of the
author of the patch, and seeing as I just provided an untested diff that
ought to be you :)



* Re: [PATCH v3 7/7] clocksource/drivers/arm_arch_timer: validate arch_timer_rate
  2020-02-11 18:45 ` [PATCH v3 7/7] clocksource/drivers/arm_arch_timer: validate arch_timer_rate Ionela Voinescu
  2020-02-12  9:30   ` Valentin Schneider
@ 2020-02-12 10:01   ` Lukasz Luba
  2020-02-12 10:12     ` Marc Zyngier
  2020-02-14  0:35   ` Thomas Gleixner
  2 siblings, 1 reply; 42+ messages in thread
From: Lukasz Luba @ 2020-02-12 10:01 UTC (permalink / raw)
  To: Ionela Voinescu, catalin.marinas, will, mark.rutland, maz,
	suzuki.poulose, sudeep.holla, valentin.schneider, rjw
  Cc: peterz, mingo, vincent.guittot, viresh.kumar, linux-arm-kernel,
	linux-doc, linux-kernel, linux-pm

Hi Ionela, Valentin

On 2/11/20 6:45 PM, Ionela Voinescu wrote:
> From: Valentin Schneider <valentin.schneider@arm.com>
> 
> Using an arch timer with a frequency of less than 1MHz can result in an
> incorrect functionality of the system which assumes a reasonable rate.
> 
> One example is the use of activity monitors for frequency invariance
> which uses the rate of the arch timer as the known rate of the constant
> cycle counter in computing its ratio compared to the maximum frequency
> of a CPU. For arch timer frequencies less than 1MHz this ratio could
> end up being 0 which is an invalid value for its use.
> 
> Therefore, warn if the arch timer rate is below 1MHz which contravenes
> the recommended architecture interval of 1 to 50MHz.
> 
> Signed-off-by: Ionela Voinescu <ionela.voinescu@arm.com>
> Cc: Mark Rutland <mark.rutland@arm.com>
> Cc: Marc Zyngier <maz@kernel.org>
> ---
>   drivers/clocksource/arm_arch_timer.c | 18 +++++++++++++++---
>   1 file changed, 15 insertions(+), 3 deletions(-)
> 
> diff --git a/drivers/clocksource/arm_arch_timer.c b/drivers/clocksource/arm_arch_timer.c
> index 9a5464c625b4..4faa930eabf8 100644
> --- a/drivers/clocksource/arm_arch_timer.c
> +++ b/drivers/clocksource/arm_arch_timer.c
> @@ -885,6 +885,17 @@ static int arch_timer_starting_cpu(unsigned int cpu)
>   	return 0;
>   }
>   
> +static int validate_timer_rate(void)
> +{
> +	if (!arch_timer_rate)
> +		return -EINVAL;
> +
> +	/* Arch timer frequency < 1MHz can cause trouble */
> +	WARN_ON(arch_timer_rate < 1000000);

I don't see much value in having a patch just to add one extra warning,
in a situation which we already handle in our code in 6/7 with:

+	if (!ratio) {
+		pr_err("System timer frequency too low.\n");
+		return -EINVAL;
+	}

Furthermore, the value '1000000' here is because of our code and
the calculation in there, so it does not belong in the arch timer. Someone
might ask why it's not 2000000, or a define in our header...
Or why we warn when the arch timer and CPU are not AMU capable...

> +
> +	return 0;
> +}
> +
>   /*
>    * For historical reasons, when probing with DT we use whichever (non-zero)
>    * rate was probed first, and don't verify that others match. If the first node
> @@ -900,7 +911,7 @@ static void arch_timer_of_configure_rate(u32 rate, struct device_node *np)
>   		arch_timer_rate = rate;
>   
>   	/* Check the timer frequency. */
> -	if (arch_timer_rate == 0)
> +	if (validate_timer_rate())
>   		pr_warn("frequency not available\n");
>   }
>   
> @@ -1594,9 +1605,10 @@ static int __init arch_timer_acpi_init(struct acpi_table_header *table)
>   	 * CNTFRQ value. This *must* be correct.
>   	 */
>   	arch_timer_rate = arch_timer_get_cntfrq();
> -	if (!arch_timer_rate) {
> +	ret = validate_timer_rate();
> +	if (ret) {
>   		pr_err(FW_BUG "frequency not available.\n");
> -		return -EINVAL;
> +		return ret;
>   	}
>   
>   	arch_timer_uses_ppi = arch_timer_select_ppi();
> 

Lastly, this is the arch timer.
To increase the chances of getting this merged soon, I would recommend
dropping the patch from this series.

Regards,
Lukasz


* Re: [PATCH v3 7/7] clocksource/drivers/arm_arch_timer: validate arch_timer_rate
  2020-02-12 10:01   ` Lukasz Luba
@ 2020-02-12 10:12     ` Marc Zyngier
  2020-02-12 10:54       ` Ionela Voinescu
  2020-02-12 10:55       ` Lukasz Luba
  0 siblings, 2 replies; 42+ messages in thread
From: Marc Zyngier @ 2020-02-12 10:12 UTC (permalink / raw)
  To: Lukasz Luba
  Cc: Ionela Voinescu, catalin.marinas, will, mark.rutland,
	suzuki.poulose, sudeep.holla, valentin.schneider, rjw, peterz,
	mingo, vincent.guittot, viresh.kumar, linux-arm-kernel,
	linux-doc, linux-kernel, linux-pm

On 2020-02-12 10:01, Lukasz Luba wrote:
> Hi Ionela, Valentin
> 
> On 2/11/20 6:45 PM, Ionela Voinescu wrote:
>> From: Valentin Schneider <valentin.schneider@arm.com>
>> 
>> Using an arch timer with a frequency of less than 1MHz can result in 
>> an
>> incorrect functionality of the system which assumes a reasonable rate.
>> 
>> One example is the use of activity monitors for frequency invariance
>> which uses the rate of the arch timer as the known rate of the 
>> constant
>> cycle counter in computing its ratio compared to the maximum frequency
>> of a CPU. For arch timer frequencies less than 1MHz this ratio could
>> end up being 0 which is an invalid value for its use.
>> 
>> Therefore, warn if the arch timer rate is below 1MHz which contravenes
>> the recommended architecture interval of 1 to 50MHz.
>> 
>> Signed-off-by: Ionela Voinescu <ionela.voinescu@arm.com>
>> Cc: Mark Rutland <mark.rutland@arm.com>
>> Cc: Marc Zyngier <maz@kernel.org>
>> ---
>>   drivers/clocksource/arm_arch_timer.c | 18 +++++++++++++++---
>>   1 file changed, 15 insertions(+), 3 deletions(-)
>> 
>> diff --git a/drivers/clocksource/arm_arch_timer.c 
>> b/drivers/clocksource/arm_arch_timer.c
>> index 9a5464c625b4..4faa930eabf8 100644
>> --- a/drivers/clocksource/arm_arch_timer.c
>> +++ b/drivers/clocksource/arm_arch_timer.c
>> @@ -885,6 +885,17 @@ static int arch_timer_starting_cpu(unsigned int 
>> cpu)
>>   	return 0;
>>   }
>>   +static int validate_timer_rate(void)
>> +{
>> +	if (!arch_timer_rate)
>> +		return -EINVAL;
>> +
>> +	/* Arch timer frequency < 1MHz can cause trouble */
>> +	WARN_ON(arch_timer_rate < 1000000);
> 
> I don't see much value in having a patch just to add one extra warning,
> in a situation which we already handle in our code in 6/7 with:
> 
> +	if (!ratio) {
> +		pr_err("System timer frequency too low.\n");
> +		return -EINVAL;
> +	}
> 
> Furthermore, the value '1000000' here is because of our code and
> the calculation in there, so it does not belong in the arch timer. Someone
> might ask why it's not 2000000, or a define in our header...
> Or why we warn when the arch timer and CPU are not AMU capable...

Because, as the commit message outlines it, such a frequency is terribly
out of spec?

> 
>> +
>> +	return 0;
>> +}
>> +
>>   /*
>>    * For historical reasons, when probing with DT we use whichever 
>> (non-zero)
>>    * rate was probed first, and don't verify that others match. If the 
>> first node
>> @@ -900,7 +911,7 @@ static void arch_timer_of_configure_rate(u32 rate, 
>> struct device_node *np)
>>   		arch_timer_rate = rate;
>>     	/* Check the timer frequency. */
>> -	if (arch_timer_rate == 0)
>> +	if (validate_timer_rate())
>>   		pr_warn("frequency not available\n");
>>   }
>>   @@ -1594,9 +1605,10 @@ static int __init arch_timer_acpi_init(struct 
>> acpi_table_header *table)
>>   	 * CNTFRQ value. This *must* be correct.
>>   	 */
>>   	arch_timer_rate = arch_timer_get_cntfrq();
>> -	if (!arch_timer_rate) {
>> +	ret = validate_timer_rate();
>> +	if (ret) {
>>   		pr_err(FW_BUG "frequency not available.\n");
>> -		return -EINVAL;
>> +		return ret;
>>   	}
>>     	arch_timer_uses_ppi = arch_timer_select_ppi();
>> 
> 
> Lastly, this is the arch timer.
> To increase the chances of getting this merged soon, I would recommend
> dropping the patch from this series.

And? It seems to address a potential issue where the timer frequency
is out of spec, and makes sure we don't end up with additional problems
in the AMU code.

On its own, it is perfectly sensible and could be merged as part of this
series with my

Acked-by: Marc Zyngier <maz@kernel.org>

         M.
-- 
Jazz is not dead. It just smells funny...


* Re: [PATCH v3 7/7] clocksource/drivers/arm_arch_timer: validate arch_timer_rate
  2020-02-12  9:30   ` Valentin Schneider
@ 2020-02-12 10:32     ` Ionela Voinescu
  0 siblings, 0 replies; 42+ messages in thread
From: Ionela Voinescu @ 2020-02-12 10:32 UTC (permalink / raw)
  To: Valentin Schneider
  Cc: catalin.marinas, will, mark.rutland, maz, suzuki.poulose,
	sudeep.holla, lukasz.luba, rjw, peterz, mingo, vincent.guittot,
	viresh.kumar, linux-arm-kernel, linux-doc, linux-kernel,
	linux-pm

Hi Valentin,

On Wednesday 12 Feb 2020 at 09:30:34 (+0000), Valentin Schneider wrote:
> On 11/02/2020 18:45, Ionela Voinescu wrote:
> > From: Valentin Schneider <valentin.schneider@arm.com>
> > 
> > Using an arch timer with a frequency of less than 1MHz can result in an
> > incorrect functionality of the system which assumes a reasonable rate.
> > 
> > One example is the use of activity monitors for frequency invariance
> > which uses the rate of the arch timer as the known rate of the constant
> > cycle counter in computing its ratio compared to the maximum frequency
> > of a CPU. For arch timer frequencies less than 1MHz this ratio could
> > end up being 0 which is an invalid value for its use.
> > 
> 
> I'm being pedantic here (as usual), but I'd contrast this a bit more. The
> activity monitor code checks by itself that the ratio doesn't end up being
> 0, which is why we don't slam the brakes if the arch timer freq is < 1MHz.
> 
> It's just a CNTFRQ sanity check that goes a bit beyond the 0 value check,
> IMO.
> 

I agree, but this part was just given as an example of functionality
that relies on a reasonable arch timer rate. The AMU code checks that the
ratio is not 0 so it does not end up breaking frequency invariance.
But if the ratio does end up being 0 due to the value of arch_timer_rate,
we bypass the use of activity monitors, which I'd argue is incorrect
functionality, as it disables a potentially better source of information
for frequency invariance.

But I can rewrite this part for more clarity.

> > Therefore, warn if the arch timer rate is below 1MHz which contravenes
> > the recommended architecture interval of 1 to 50MHz.
> > 
> > Signed-off-by: Ionela Voinescu <ionela.voinescu@arm.com>
> > Cc: Mark Rutland <mark.rutland@arm.com>
> > Cc: Marc Zyngier <maz@kernel.org>
> 
> ISTR something somewhere that says the first signoff should be that of the
> author of the patch, and seeing as I just provided an untested diff that
> ought to be you :)

Will do!

Thanks,
Ionela.


* Re: [PATCH v3 7/7] clocksource/drivers/arm_arch_timer: validate arch_timer_rate
  2020-02-12 10:12     ` Marc Zyngier
@ 2020-02-12 10:54       ` Ionela Voinescu
  2020-02-12 10:55       ` Lukasz Luba
  1 sibling, 0 replies; 42+ messages in thread
From: Ionela Voinescu @ 2020-02-12 10:54 UTC (permalink / raw)
  To: Marc Zyngier
  Cc: Lukasz Luba, catalin.marinas, will, mark.rutland, suzuki.poulose,
	sudeep.holla, valentin.schneider, rjw, peterz, mingo,
	vincent.guittot, viresh.kumar, linux-arm-kernel, linux-doc,
	linux-kernel, linux-pm

Hi guys,

On Wednesday 12 Feb 2020 at 10:12:32 (+0000), Marc Zyngier wrote:
> On 2020-02-12 10:01, Lukasz Luba wrote:
> > Hi Ionela, Valentin
> > 
> > On 2/11/20 6:45 PM, Ionela Voinescu wrote:
> > > From: Valentin Schneider <valentin.schneider@arm.com>
> > > 
> > > Using an arch timer with a frequency of less than 1MHz can result in
> > > an
> > > incorrect functionality of the system which assumes a reasonable rate.
> > > 
> > > One example is the use of activity monitors for frequency invariance
> > > which uses the rate of the arch timer as the known rate of the
> > > constant
> > > cycle counter in computing its ratio compared to the maximum frequency
> > > of a CPU. For arch timer frequencies less than 1MHz this ratio could
> > > end up being 0 which is an invalid value for its use.
> > > 
> > > Therefore, warn if the arch timer rate is below 1MHz which contravenes
> > > the recommended architecture interval of 1 to 50MHz.
> > > 
> > > Signed-off-by: Ionela Voinescu <ionela.voinescu@arm.com>
> > > Cc: Mark Rutland <mark.rutland@arm.com>
> > > Cc: Marc Zyngier <maz@kernel.org>
> > > ---
> > >   drivers/clocksource/arm_arch_timer.c | 18 +++++++++++++++---
> > >   1 file changed, 15 insertions(+), 3 deletions(-)
> > > 
> > > diff --git a/drivers/clocksource/arm_arch_timer.c
> > > b/drivers/clocksource/arm_arch_timer.c
> > > index 9a5464c625b4..4faa930eabf8 100644
> > > --- a/drivers/clocksource/arm_arch_timer.c
> > > +++ b/drivers/clocksource/arm_arch_timer.c
> > > @@ -885,6 +885,17 @@ static int arch_timer_starting_cpu(unsigned int
> > > cpu)
> > >   	return 0;
> > >   }
> > >   +static int validate_timer_rate(void)
> > > +{
> > > +	if (!arch_timer_rate)
> > > +		return -EINVAL;
> > > +
> > > +	/* Arch timer frequency < 1MHz can cause trouble */
> > > +	WARN_ON(arch_timer_rate < 1000000);
> > 
> > I don't see much value in having a patch just to add one extra warning,
> > in a situation which we already handle in our code in 6/7 with:
> > 
> > +	if (!ratio) {
> > +		pr_err("System timer frequency too low.\n");
> > +		return -EINVAL;
> > +	}
> > 
> > Furthermore, the value '1000000' here is because of our code and
> > the calculation in there, so it does not belong in the arch timer. Someone
> > might ask why it's not 2000000, or a define in our header...
> > Or why we warn when the arch timer and CPU are not AMU capable...
> 
> Because, as the commit message outlines it, such a frequency is terribly
> out of spec?
> 

Probably it could have been better emphasised in the commit message but,
yes, [1] specifies a typical range of 1-50MHz. Therefore, taken
independently, the role of this patch is to warn about an out of spec
arch timer rate.
The link to AMU is here just as an example of a scenario where an out of
spec rate can affect functionality, but there is no dependency between
the threshold chosen here and AMU. AMU functionality only assumes the rate
recommended in the spec.

[1] https://static.docs.arm.com/ddi0487/ea/DDI0487E_a_armv8_arm.pdf

> > 
> > > +
> > > +	return 0;
> > > +}
> > > +
> > >   /*
> > >    * For historical reasons, when probing with DT we use whichever
> > > (non-zero)
> > >    * rate was probed first, and don't verify that others match. If
> > > the first node
> > > @@ -900,7 +911,7 @@ static void arch_timer_of_configure_rate(u32
> > > rate, struct device_node *np)
> > >   		arch_timer_rate = rate;
> > >     	/* Check the timer frequency. */
> > > -	if (arch_timer_rate == 0)
> > > +	if (validate_timer_rate())
> > >   		pr_warn("frequency not available\n");
> > >   }
> > >   @@ -1594,9 +1605,10 @@ static int __init
> > > arch_timer_acpi_init(struct acpi_table_header *table)
> > >   	 * CNTFRQ value. This *must* be correct.
> > >   	 */
> > >   	arch_timer_rate = arch_timer_get_cntfrq();
> > > -	if (!arch_timer_rate) {
> > > +	ret = validate_timer_rate();
> > > +	if (ret) {
> > >   		pr_err(FW_BUG "frequency not available.\n");
> > > -		return -EINVAL;
> > > +		return ret;
> > >   	}
> > >     	arch_timer_uses_ppi = arch_timer_select_ppi();
> > > 
> > 
> > Lastly, this is the arch timer.
> > To increase the chances of getting this merged soon, I would recommend
> > dropping the patch from this series.
> 
> And? It seems to address a potential issue where the timer frequency
> is out of spec, and makes sure we don't end up with additional problems
> in the AMU code.
> 
> On its own, it is perfectly sensible and could be merged as part of this
> series with my
> 
> Acked-by: Marc Zyngier <maz@kernel.org>
> 
>         M.

Thanks, Marc! I'll keep this patch here then with some changes in the commit
message for more clarity and the change in author.

Thank you all,
Ionela.

> -- 
> Jazz is not dead. It just smells funny...


* Re: [PATCH v3 7/7] clocksource/drivers/arm_arch_timer: validate arch_timer_rate
  2020-02-12 10:12     ` Marc Zyngier
  2020-02-12 10:54       ` Ionela Voinescu
@ 2020-02-12 10:55       ` Lukasz Luba
  2020-02-12 11:10         ` Marc Zyngier
  2020-02-12 11:12         ` Valentin Schneider
  1 sibling, 2 replies; 42+ messages in thread
From: Lukasz Luba @ 2020-02-12 10:55 UTC (permalink / raw)
  To: Marc Zyngier
  Cc: Ionela Voinescu, catalin.marinas, will, mark.rutland,
	suzuki.poulose, sudeep.holla, valentin.schneider, rjw, peterz,
	mingo, vincent.guittot, viresh.kumar, linux-arm-kernel,
	linux-doc, linux-kernel, linux-pm



On 2/12/20 10:12 AM, Marc Zyngier wrote:
> On 2020-02-12 10:01, Lukasz Luba wrote:
>> Hi Ionela, Valentin
>>
>> On 2/11/20 6:45 PM, Ionela Voinescu wrote:
>>> From: Valentin Schneider <valentin.schneider@arm.com>
>>>
>>> Using an arch timer with a frequency of less than 1MHz can result in an
>>> incorrect functionality of the system which assumes a reasonable rate.
>>>
>>> One example is the use of activity monitors for frequency invariance
>>> which uses the rate of the arch timer as the known rate of the constant
>>> cycle counter in computing its ratio compared to the maximum frequency
>>> of a CPU. For arch timer frequencies less than 1MHz this ratio could
>>> end up being 0 which is an invalid value for its use.
>>>
>>> Therefore, warn if the arch timer rate is below 1MHz which contravenes
>>> the recommended architecture interval of 1 to 50MHz.
>>>
>>> Signed-off-by: Ionela Voinescu <ionela.voinescu@arm.com>
>>> Cc: Mark Rutland <mark.rutland@arm.com>
>>> Cc: Marc Zyngier <maz@kernel.org>
>>> ---
>>>   drivers/clocksource/arm_arch_timer.c | 18 +++++++++++++++---
>>>   1 file changed, 15 insertions(+), 3 deletions(-)
>>>
>>> diff --git a/drivers/clocksource/arm_arch_timer.c 
>>> b/drivers/clocksource/arm_arch_timer.c
>>> index 9a5464c625b4..4faa930eabf8 100644
>>> --- a/drivers/clocksource/arm_arch_timer.c
>>> +++ b/drivers/clocksource/arm_arch_timer.c
>>> @@ -885,6 +885,17 @@ static int arch_timer_starting_cpu(unsigned int 
>>> cpu)
>>>       return 0;
>>>   }
>>>   +static int validate_timer_rate(void)
>>> +{
>>> +    if (!arch_timer_rate)
>>> +        return -EINVAL;
>>> +
>>> +    /* Arch timer frequency < 1MHz can cause trouble */
>>> +    WARN_ON(arch_timer_rate < 1000000);
>>
>> I don't see much value in having a patch just to add one extra warning,
>> in a situation which we already handle in our code in 6/7 with:
>>
>> +    if (!ratio) {
>> +        pr_err("System timer frequency too low.\n");
>> +        return -EINVAL;
>> +    }
>>
>> Furthermore, the value '1000000' here is because of our code and
>> the calculation in there, so it does not belong in the arch timer. Someone
>> might ask why it's not 2000000, or a define in our header...
>> Or why we warn when the arch timer and CPU are not AMU capable...
> 
> Because, as the commit message outlines it, such a frequency is terribly
> out of spec?

I don't see in the RM that < 1MHz is terribly out of spec:
'Frequency
Increments at a fixed frequency, typically in the range 1-50MHz.
Can support one or more alternative operating modes in which it increments
by larger amounts at a lower frequency, typically for power-saving.'

There is even an example of how to operate at 20kHz and increment by 500.

I don't know if the code supports that, though.

> 
>>
>>> +
>>> +    return 0;
>>> +}
>>> +
>>>   /*
>>>    * For historical reasons, when probing with DT we use whichever 
>>> (non-zero)
>>>    * rate was probed first, and don't verify that others match. If 
>>> the first node
>>> @@ -900,7 +911,7 @@ static void arch_timer_of_configure_rate(u32 
>>> rate, struct device_node *np)
>>>           arch_timer_rate = rate;
>>>         /* Check the timer frequency. */
>>> -    if (arch_timer_rate == 0)
>>> +    if (validate_timer_rate())
>>>           pr_warn("frequency not available\n");
>>>   }
>>>   @@ -1594,9 +1605,10 @@ static int __init 
>>> arch_timer_acpi_init(struct acpi_table_header *table)
>>>        * CNTFRQ value. This *must* be correct.
>>>        */
>>>       arch_timer_rate = arch_timer_get_cntfrq();
>>> -    if (!arch_timer_rate) {
>>> +    ret = validate_timer_rate();
>>> +    if (ret) {
>>>           pr_err(FW_BUG "frequency not available.\n");
>>> -        return -EINVAL;
>>> +        return ret;
>>>       }
>>>         arch_timer_uses_ppi = arch_timer_select_ppi();
>>>
>>
>> Lastly, this is the arch timer.
>> To increase the chances of getting this merged soon, I would recommend
>> dropping the patch from this series.
> 
> And? It seems to address a potential issue where the timer frequency
> is out of spec, and makes sure we don't end up with additional problems
> in the AMU code.

This patch just prints a warning; it does not change anything in booting
or in any code related to AMU.

> 
> On its own, it is perfectly sensible and could be merged as part of this
> series with my
> 
> Acked-by: Marc Zyngier <maz@kernel.org>
> 
>          M.

Regards,
Lukasz


* Re: [PATCH v3 7/7] clocksource/drivers/arm_arch_timer: validate arch_timer_rate
  2020-02-12 10:55       ` Lukasz Luba
@ 2020-02-12 11:10         ` Marc Zyngier
  2020-02-12 11:43           ` Lukasz Luba
  2020-02-12 11:12         ` Valentin Schneider
  1 sibling, 1 reply; 42+ messages in thread
From: Marc Zyngier @ 2020-02-12 11:10 UTC (permalink / raw)
  To: Lukasz Luba
  Cc: Ionela Voinescu, catalin.marinas, will, mark.rutland,
	suzuki.poulose, sudeep.holla, valentin.schneider, rjw, peterz,
	mingo, vincent.guittot, viresh.kumar, linux-arm-kernel,
	linux-doc, linux-kernel, linux-pm

On 2020-02-12 10:55, Lukasz Luba wrote:
> On 2/12/20 10:12 AM, Marc Zyngier wrote:
>> On 2020-02-12 10:01, Lukasz Luba wrote:
>>> Hi Ionela, Valentin
>>> 
>>> On 2/11/20 6:45 PM, Ionela Voinescu wrote:
>>>> From: Valentin Schneider <valentin.schneider@arm.com>
>>>> 
>>>> Using an arch timer with a frequency of less than 1MHz can result in 
>>>> an
>>>> incorrect functionality of the system which assumes a reasonable 
>>>> rate.
>>>> 
>>>> One example is the use of activity monitors for frequency invariance
>>>> which uses the rate of the arch timer as the known rate of the 
>>>> constant
>>>> cycle counter in computing its ratio compared to the maximum 
>>>> frequency
>>>> of a CPU. For arch timer frequencies less than 1MHz this ratio could
>>>> end up being 0 which is an invalid value for its use.
>>>> 
>>>> Therefore, warn if the arch timer rate is below 1MHz which 
>>>> contravenes
>>>> the recommended architecture interval of 1 to 50MHz.
>>>> 
>>>> Signed-off-by: Ionela Voinescu <ionela.voinescu@arm.com>
>>>> Cc: Mark Rutland <mark.rutland@arm.com>
>>>> Cc: Marc Zyngier <maz@kernel.org>
>>>> ---
>>>>   drivers/clocksource/arm_arch_timer.c | 18 +++++++++++++++---
>>>>   1 file changed, 15 insertions(+), 3 deletions(-)
>>>> 
>>>> diff --git a/drivers/clocksource/arm_arch_timer.c 
>>>> b/drivers/clocksource/arm_arch_timer.c
>>>> index 9a5464c625b4..4faa930eabf8 100644
>>>> --- a/drivers/clocksource/arm_arch_timer.c
>>>> +++ b/drivers/clocksource/arm_arch_timer.c
>>>> @@ -885,6 +885,17 @@ static int arch_timer_starting_cpu(unsigned int 
>>>> cpu)
>>>>       return 0;
>>>>   }
>>>>   +static int validate_timer_rate(void)
>>>> +{
>>>> +    if (!arch_timer_rate)
>>>> +        return -EINVAL;
>>>> +
>>>> +    /* Arch timer frequency < 1MHz can cause trouble */
>>>> +    WARN_ON(arch_timer_rate < 1000000);
>>> 
>>> I don't see much value in having a patch just to add one extra warning,
>>> in a situation which we already handle in our code in 6/7 with:
>>> 
>>> +    if (!ratio) {
>>> +        pr_err("System timer frequency too low.\n");
>>> +        return -EINVAL;
>>> +    }
>>> 
>>> Furthermore, the value '1000000' here is because of our code and
>>> the calculation in there, so it does not belong in the arch timer. Someone
>>> might ask why it's not 2000000, or a define in our header...
>>> Or why we warn when the arch timer and CPU are not AMU capable...
>> 
>> Because, as the commit message outlines it, such a frequency is 
>> terribly
>> out of spec?
> 
> I don't see in the RM that < 1MHz is terribly out of spec.
> 'Frequency
> Increments at a fixed frequency, typically in the range 1-50MHz.
> Can support one or more alternative operating modes in which it
> increments by larger amounts at a
> lower frequency, typically for power-saving.'

Hint: constant apparent frequency.

> There is even an example of how to operate at 20kHz and increment by 500.
> 
> I don't know if the code supports that, though.

You're completely missing the point, I'm afraid. Nobody has to know that
this is happening. For all intents and purposes, the counter always has
the same frequency, even if the HW does fewer ticks of larger increments.


[...]

>>> Lastly, this is the arch timer.
>>> To increase the chances of getting this merged soon, I would recommend
>>> dropping the patch from this series.
>> 
>> And? It seems to address a potential issue where the timer frequency
>> is out of spec, and makes sure we don't end up with additional 
>> problems
>> in the AMU code.
> 
> This patch just prints a warning; it does not change anything in booting
> or in any code related to AMU.

It seems to solve an issue with an assumption made in the AMU driver,
and would help debugging the problem on broken systems. Are you saying
that this is not the case and that the AMU code can perfectly cope with
the frequency being less than 1MHz?

         M.
-- 
Jazz is not dead. It just smells funny...


* Re: [PATCH v3 7/7] clocksource/drivers/arm_arch_timer: validate arch_timer_rate
  2020-02-12 10:55       ` Lukasz Luba
  2020-02-12 11:10         ` Marc Zyngier
@ 2020-02-12 11:12         ` Valentin Schneider
  1 sibling, 0 replies; 42+ messages in thread
From: Valentin Schneider @ 2020-02-12 11:12 UTC (permalink / raw)
  To: Lukasz Luba, Marc Zyngier
  Cc: Ionela Voinescu, catalin.marinas, will, mark.rutland,
	suzuki.poulose, sudeep.holla, rjw, peterz, mingo,
	vincent.guittot, viresh.kumar, linux-arm-kernel, linux-doc,
	linux-kernel, linux-pm

On 12/02/2020 10:55, Lukasz Luba wrote:
>> Because, as the commit message outlines it, such a frequency is terribly
>> out of spec?
> 
> I don't see in the RM that < 1MHz is terribly out of spec.
> 'Frequency
> Increments at a fixed frequency, typically in the range 1-50MHz.
> Can support one or more alternative operating modes in which it increments by larger amounts at a
> lower frequency, typically for power-saving.'
> 
> There is even an example of how to operate at 20kHz and increment by 500.
> 
> I don't know if the code supports that, though.
> 

For that one case the value reported by CNTFRQ shouldn't change - it's still
a timer that looks like it is operating at 10MHz, but under the hood is doing
bigger increments at a lower freq.

As I was trying to get at, this patch isn't validating the actual frequency
the timer operates on, but rather checking that whatever is reported by
CNTFRQ is somewhat sane (which here means [1, 50]MHz, although we just
check the lower bound).

[...]

>> And? It seems to address a potential issue where the timer frequency
>> is out of spec, and makes sure we don't end up with additional problems
>> in the AMU code.
> 
> This patch just prints a warning; it does not change anything in booting
> or in any code related to AMU.
> 

Right, but it should still be worth having - at least it shows up in
dmesg, and when someone reports something fishy we get a hint that we can
blame the hardware.

>>
>> On its own, it is perfectly sensible and could be merged as part of this
>> series with my
>>
>> Acked-by: Marc Zyngier <maz@kernel.org>
>>
>>          M.
> 
> Regards,
> Lukasz


* Re: [PATCH v3 1/7] arm64: add support for the AMU extension v1
  2020-02-11 18:45 ` [PATCH v3 1/7] arm64: add support for the AMU extension v1 Ionela Voinescu
@ 2020-02-12 11:30   ` Suzuki Kuruppassery Poulose
  2020-02-12 14:54     ` Valentin Schneider
  2020-02-12 16:10     ` Ionela Voinescu
  0 siblings, 2 replies; 42+ messages in thread
From: Suzuki Kuruppassery Poulose @ 2020-02-12 11:30 UTC (permalink / raw)
  To: Ionela Voinescu, catalin.marinas, will, mark.rutland, maz,
	sudeep.holla, lukasz.luba, valentin.schneider, rjw
  Cc: peterz, mingo, vincent.guittot, viresh.kumar, linux-arm-kernel,
	linux-doc, linux-kernel, linux-pm

Hi Ionela,

On 11/02/2020 18:45, Ionela Voinescu wrote:
> The activity monitors extension is an optional extension introduced
> by the ARMv8.4 CPU architecture. This implements basic support for
> version 1 of the activity monitors architecture, AMUv1.
> 
> This support includes:
> - Extension detection on each CPU (boot, secondary, hotplugged)
> - Register interface for AMU aarch64 registers
> - disable_amu kernel parameter to disable detection/counter access
>    at runtime
> 
> Signed-off-by: Ionela Voinescu <ionela.voinescu@arm.com>
> Cc: Catalin Marinas <catalin.marinas@arm.com>
> Cc: Will Deacon <will@kernel.org>
> Cc: Suzuki K Poulose <suzuki.poulose@arm.com>
> Cc: Marc Zyngier <maz@kernel.org>
> Cc: Mark Rutland <mark.rutland@arm.com>
> ---
>   .../admin-guide/kernel-parameters.txt         | 10 ++
>   arch/arm64/Kconfig                            | 31 ++++++
>   arch/arm64/include/asm/cpucaps.h              |  3 +-
>   arch/arm64/include/asm/cpufeature.h           |  5 +
>   arch/arm64/include/asm/sysreg.h               | 38 ++++++++
>   arch/arm64/kernel/cpufeature.c                | 97 +++++++++++++++++++
>   6 files changed, 183 insertions(+), 1 deletion(-)
> 
...

> diff --git a/arch/arm64/kernel/cpufeature.c b/arch/arm64/kernel/cpufeature.c
> index 04cf64e9f0c9..029a473ad273 100644
> --- a/arch/arm64/kernel/cpufeature.c
> +++ b/arch/arm64/kernel/cpufeature.c
> @@ -156,6 +156,7 @@ static const struct arm64_ftr_bits ftr_id_aa64pfr0[] = {
>   	ARM64_FTR_BITS(FTR_HIDDEN, FTR_NONSTRICT, FTR_LOWER_SAFE, ID_AA64PFR0_CSV3_SHIFT, 4, 0),
>   	ARM64_FTR_BITS(FTR_HIDDEN, FTR_NONSTRICT, FTR_LOWER_SAFE, ID_AA64PFR0_CSV2_SHIFT, 4, 0),
>   	ARM64_FTR_BITS(FTR_VISIBLE, FTR_STRICT, FTR_LOWER_SAFE, ID_AA64PFR0_DIT_SHIFT, 4, 0),
> +	ARM64_FTR_BITS(FTR_HIDDEN, FTR_NONSTRICT, FTR_LOWER_SAFE, ID_AA64PFR0_AMU_SHIFT, 4, 0),
>   	ARM64_FTR_BITS(FTR_VISIBLE_IF_IS_ENABLED(CONFIG_ARM64_SVE),
>   				   FTR_STRICT, FTR_LOWER_SAFE, ID_AA64PFR0_SVE_SHIFT, 4, 0),
>   	ARM64_FTR_BITS(FTR_HIDDEN, FTR_STRICT, FTR_LOWER_SAFE, ID_AA64PFR0_RAS_SHIFT, 4, 0),
> @@ -1150,6 +1151,84 @@ static bool has_hw_dbm(const struct arm64_cpu_capabilities *cap,
>   
>   #endif
>   
> +#ifdef CONFIG_ARM64_AMU_EXTN
> +
> +/*
> + * The "amu_cpus" cpumask only signals that the CPU implementation for the
> + * flagged CPUs supports the Activity Monitors Unit (AMU) but does not provide
> + * information regarding all the events that it supports. When a CPU bit is
> + * set in the cpumask, the user of this feature can only rely on the presence
> + * of the 4 fixed counters for that CPU. But this does not guarantee that the
> + * counters are enabled or access to these counters is enabled by code
> + * executed at higher exception levels (firmware).
> + */
> +static cpumask_var_t amu_cpus;
> +
> +bool cpu_has_amu_feat(int cpu)
> +{
> +	if (cpumask_available(amu_cpus))
> +		return cpumask_test_cpu(cpu, amu_cpus);
> +
> +	return false;
> +}
> +
> +static void cpu_amu_enable(struct arm64_cpu_capabilities const *cap)
> +{
> +	if (has_cpuid_feature(cap, SCOPE_LOCAL_CPU)) {
> +		pr_info("detected CPU%d: Activity Monitors Unit (AMU)\n",
> +			smp_processor_id());
> +		cpumask_set_cpu(smp_processor_id(), amu_cpus);
> +	}
> +}
> +
> +/*
> + * For known broken firmware, a kernel parameter ("disable_amu") is provided
> + * to ensure access to AMU counter registers is not attempted. By default,
> + * the feature is enabled, but disable_amu can both be used to disable or
> + * enable the capability at runtime in case the default changes in the future.
> + *
> + * To be noted that for security considerations, this does not bypass the
> + * setting of AMUSERENR_EL0 to trap accesses from EL0 (userspace) to EL1
> + * (kernel). Therefore, firmware should still ensure accesses to AMU registers
> + * are not trapped in EL2/EL3.
> + */
> +static bool disable_amu;
> +
> +static int __init set_disable_amu(char *str)
> +{
> +	int value = 0;
> +
> +	disable_amu = get_option(&str, &value) ? !!value : true;

minor nit: You could simply use strtobool(str) here, which accepts:

disable_amu= [0/1/on/off/y/n]


> +
> +	return 0;
> +}
> +early_param("disable_amu", set_disable_amu);
> +
> +static bool has_amu(const struct arm64_cpu_capabilities *cap,
> +		       int __unused)
> +{
> +	/*
> +	 * The AMU extension is a non-conflicting feature: the kernel can
> +	 * safely run a mix of CPUs with and without support for the
> +	 * activity monitors extension. Therefore, if not disabled through
> +	 * the kernel command line early parameter, enable the capability
> +	 * to allow any late CPU to use the feature.
> +	 *
> +	 * With this feature enabled, the cpu_enable function will be called
> +	 * for all CPUs that match the criteria, including secondary and
> +	 * hotplugged, marking this feature as present on that respective CPU.
> +	 * The enable function will also print a detection message.
> +	 */
> +
> +	if (!disable_amu && !zalloc_cpumask_var(&amu_cpus, GFP_KERNEL)) {

This looks problematic. Don't we end up in allocating the memory during
"each CPU" check and thus leaking memory ? Do we really need to allocate
this dynamically ?

> +		pr_err("Activity Monitors Unit (AMU): fail to allocate memory");
> +		disable_amu = true;
> +	}
> +
> +	return !disable_amu;
> +}
> +#endif
> +
>   #ifdef CONFIG_ARM64_VHE
>   static bool runs_at_el2(const struct arm64_cpu_capabilities *entry, int __unused)
>   {
> @@ -1419,6 +1498,24 @@ static const struct arm64_cpu_capabilities arm64_features[] = {
>   		.cpu_enable = cpu_clear_disr,
>   	},
>   #endif /* CONFIG_ARM64_RAS_EXTN */
> +#ifdef CONFIG_ARM64_AMU_EXTN
> +	{
> +		/*
> +		 * The feature is enabled by default if CONFIG_ARM64_AMU_EXTN=y.
> +		 * Therefore, don't provide .desc as we don't want the detection
> +		 * message to be shown until at least one CPU is detected to
> +		 * support the feature.
> +		 */
> +		.capability = ARM64_HAS_AMU_EXTN,
> +		.type = ARM64_CPUCAP_WEAK_LOCAL_CPU_FEATURE,
> +		.matches = has_amu,
> +		.sys_reg = SYS_ID_AA64PFR0_EL1,
> +		.sign = FTR_UNSIGNED,
> +		.field_pos = ID_AA64PFR0_AMU_SHIFT,
> +		.min_field_value = ID_AA64PFR0_AMU,
> +		.cpu_enable = cpu_amu_enable,
> +	},
> +#endif /* CONFIG_ARM64_AMU_EXTN */
>   	{
>   		.desc = "Data cache clean to the PoU not required for I/D coherence",
>   		.capability = ARM64_HAS_CACHE_IDC,
> 


Thanks
Suzuki

^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: [PATCH v3 7/7] clocksource/drivers/arm_arch_timer: validate arch_timer_rate
  2020-02-12 11:10         ` Marc Zyngier
@ 2020-02-12 11:43           ` Lukasz Luba
  0 siblings, 0 replies; 42+ messages in thread
From: Lukasz Luba @ 2020-02-12 11:43 UTC (permalink / raw)
  To: Marc Zyngier
  Cc: Ionela Voinescu, catalin.marinas, will, mark.rutland,
	suzuki.poulose, sudeep.holla, valentin.schneider, rjw, peterz,
	mingo, vincent.guittot, viresh.kumar, linux-arm-kernel,
	linux-doc, linux-kernel, linux-pm



On 2/12/20 11:10 AM, Marc Zyngier wrote:
> On 2020-02-12 10:55, Lukasz Luba wrote:
>> On 2/12/20 10:12 AM, Marc Zyngier wrote:
>>> On 2020-02-12 10:01, Lukasz Luba wrote:
>>>> Hi Ionela, Valentin
>>>>
>>>> On 2/11/20 6:45 PM, Ionela Voinescu wrote:
>>>>> From: Valentin Schneider <valentin.schneider@arm.com>
>>>>>
>>>>> Using an arch timer with a frequency of less than 1MHz can result in
>>>>> incorrect functionality on a system which assumes a reasonable rate.
>>>>>
>>>>> One example is the use of activity monitors for frequency invariance
>>>>> which uses the rate of the arch timer as the known rate of the 
>>>>> constant
>>>>> cycle counter in computing its ratio compared to the maximum frequency
>>>>> of a CPU. For arch timer frequencies less than 1MHz this ratio could
>>>>> end up being 0 which is an invalid value for its use.
>>>>>
>>>>> Therefore, warn if the arch timer rate is below 1MHz, which contravenes
>>>>> the architecturally recommended range of 1 to 50MHz.
>>>>>
>>>>> Signed-off-by: Ionela Voinescu <ionela.voinescu@arm.com>
>>>>> Cc: Mark Rutland <mark.rutland@arm.com>
>>>>> Cc: Marc Zyngier <maz@kernel.org>
>>>>> ---
>>>>>   drivers/clocksource/arm_arch_timer.c | 18 +++++++++++++++---
>>>>>   1 file changed, 15 insertions(+), 3 deletions(-)
>>>>>
>>>>> diff --git a/drivers/clocksource/arm_arch_timer.c 
>>>>> b/drivers/clocksource/arm_arch_timer.c
>>>>> index 9a5464c625b4..4faa930eabf8 100644
>>>>> --- a/drivers/clocksource/arm_arch_timer.c
>>>>> +++ b/drivers/clocksource/arm_arch_timer.c
>>>>> @@ -885,6 +885,17 @@ static int arch_timer_starting_cpu(unsigned 
>>>>> int cpu)
>>>>>       return 0;
>>>>>   }
>>>>>   +static int validate_timer_rate(void)
>>>>> +{
>>>>> +    if (!arch_timer_rate)
>>>>> +        return -EINVAL;
>>>>> +
>>>>> +    /* Arch timer frequency < 1MHz can cause trouble */
>>>>> +    WARN_ON(arch_timer_rate < 1000000);
>>>>
>>>> I don't see much value in having a patch just to add one extra
>>>> warning, in a situation which we already handle in our code in 6/7
>>>> with:
>>>>
>>>> +    if (!ratio) {
>>>> +        pr_err("System timer frequency too low.\n");
>>>> +        return -EINVAL;
>>>> +    }
>>>>
>>>> Furthermore, the value '1000000' here comes from our code and the
>>>> calculation in there, so it does not belong in the arch timer. Someone
>>>> might ask why it's not 2000000, or a define in our header...
>>>> Or they might ask why you warn when that arch timer and CPU are not
>>>> AMU capable...
>>>
>>> Because, as the commit message outlines it, such a frequency is terribly
>>> out of spec?
>>
>> I don't see in the RM that < 1MHz is terribly out of spec.
>> 'Frequency
>> Increments at a fixed frequency, typically in the range 1-50MHz.
>> Can support one or more alternative operating modes in which it
>> increments by larger amounts at a
>> lower frequency, typically for power-saving.'
> 
> Hint: constant apparent frequency.
> 
>> There is even an example how to operate at 20kHz and increment by 500.
>>
I don't know if that's supported in the code, though.
> 
> You're completely missing the point, I'm afraid. Nobody has to know that
> this is happening. For all intents and purposes, the counter always has
> the same frequency, even if the HW does fewer ticks of larger increments.

Fair enough. As I said I don't know details of that code.

> 
> 
> [...]
> 
>>>> Lastly, this is arch timer.
>>>> To increase chances of getting merge soon, I would recommend to drop
>>>> the patch from this series.
>>>
>>> And? It seems to address a potential issue where the time frequency
>>> is out of spec, and makes sure we don't end up with additional problems
>>> in the AMU code.
>>
>> This patch just prints warning, does not change anything in booting or
>> in any code related to AMU.
> 
> It seems to solve an issue with an assumption made in the AMU driver,
> and would help debugging the problem on broken systems. Are you saying
> that this is not the case and that the AMU code can perfectly cope with
> the frequency being less than 1MHz?

What I was saying is that patch 6/7 has the code which checks the rate
and reacts, so it does not need this patch. As for helping with
debugging, patch 6/7 also prints the error
"System timer frequency too low" and bails out.
The commit message could have emphasized this better.

Regards,
Lukasz


* Re: [PATCH v3 2/7] arm64: trap to EL1 accesses to AMU counters from EL0
  2020-02-11 18:45 ` [PATCH v3 2/7] arm64: trap to EL1 accesses to AMU counters from EL0 Ionela Voinescu
@ 2020-02-12 11:44   ` Suzuki Kuruppassery Poulose
  2020-02-12 15:36   ` Valentin Schneider
  1 sibling, 0 replies; 42+ messages in thread
From: Suzuki Kuruppassery Poulose @ 2020-02-12 11:44 UTC (permalink / raw)
  To: Ionela Voinescu, catalin.marinas, will, mark.rutland, maz,
	sudeep.holla, lukasz.luba, valentin.schneider, rjw
  Cc: peterz, mingo, vincent.guittot, viresh.kumar, linux-arm-kernel,
	linux-doc, linux-kernel, linux-pm, Steve Capper

On 11/02/2020 18:45, Ionela Voinescu wrote:
> The activity monitors extension is an optional extension introduced
> by the ARMv8.4 CPU architecture. In order to access the activity
> monitors counters safely, if desired, the kernel should detect the
> presence of the extension through the feature register, and mediate
> the access.
> 
> Therefore, disable direct accesses to activity monitors counters
> from EL0 (userspace) and trap them to EL1 (kernel).
> 
> To be noted that the ARM64_AMU_EXTN kernel config and the disable_amu
> kernel parameter do not have an effect on this code. Given that the
> amuserenr_el0 resets to an UNKNOWN value, setting the trap of EL0
> accesses to EL1 is always attempted for safety and security
> considerations. Therefore firmware should still ensure accesses to
> AMU registers are not trapped in EL2/EL3 as this code cannot be
> bypassed if the CPU implements the Activity Monitors Unit.
> 
> Signed-off-by: Ionela Voinescu <ionela.voinescu@arm.com>
> Cc: Catalin Marinas <catalin.marinas@arm.com>
> Cc: Will Deacon <will@kernel.org>
> Cc: Mark Rutland <mark.rutland@arm.com>
> Cc: Steve Capper <steve.capper@arm.com>
> ---

Reviewed-by: Suzuki K Poulose <suzuki.poulose@arm.com>


* Re: [PATCH v3 1/7] arm64: add support for the AMU extension v1
  2020-02-12 11:30   ` Suzuki Kuruppassery Poulose
@ 2020-02-12 14:54     ` Valentin Schneider
  2020-02-12 16:10     ` Ionela Voinescu
  1 sibling, 0 replies; 42+ messages in thread
From: Valentin Schneider @ 2020-02-12 14:54 UTC (permalink / raw)
  To: Suzuki Kuruppassery Poulose, Ionela Voinescu, catalin.marinas,
	will, mark.rutland, maz, sudeep.holla, lukasz.luba, rjw
  Cc: peterz, mingo, vincent.guittot, viresh.kumar, linux-arm-kernel,
	linux-doc, linux-kernel, linux-pm

On 12/02/2020 11:30, Suzuki Kuruppassery Poulose wrote:
>> +static bool has_amu(const struct arm64_cpu_capabilities *cap,
>> +               int __unused)
>> +{
>> +    /*
>> +     * The AMU extension is a non-conflicting feature: the kernel can
>> +     * safely run a mix of CPUs with and without support for the
>> +     * activity monitors extension. Therefore, if not disabled through
>> +     * the kernel command line early parameter, enable the capability
>> +     * to allow any late CPU to use the feature.
>> +     *
>> +     * With this feature enabled, the cpu_enable function will be called
>> +     * for all CPUs that match the criteria, including secondary and
>> +     * hotplugged, marking this feature as present on that respective CPU.
>> +     * The enable function will also print a detection message.
>> +     */
>> +
>> +    if (!disable_amu && !zalloc_cpumask_var(&amu_cpus, GFP_KERNEL)) {
> 
> This looks problematic. Don't we end up in allocating the memory during
> "each CPU" check and thus leaking memory ? Do we really need to allocate
> this dynamically ?
> 

For the static vs dynamic thing, I think it's not *too* important here since
we don't risk pwning the stack because of the cpumask. That said, if we are
somewhat pedantic about memory usage, the static allocation is done
against NR_CPUS whereas the dynamic one is done against nr_cpu_ids.
Pretty inconsequential for a single cpumask, but I guess it all adds up
eventually...


* Re: [PATCH v3 2/7] arm64: trap to EL1 accesses to AMU counters from EL0
  2020-02-11 18:45 ` [PATCH v3 2/7] arm64: trap to EL1 accesses to AMU counters from EL0 Ionela Voinescu
  2020-02-12 11:44   ` Suzuki Kuruppassery Poulose
@ 2020-02-12 15:36   ` Valentin Schneider
  1 sibling, 0 replies; 42+ messages in thread
From: Valentin Schneider @ 2020-02-12 15:36 UTC (permalink / raw)
  To: Ionela Voinescu, catalin.marinas, will, mark.rutland, maz,
	suzuki.poulose, sudeep.holla, lukasz.luba, rjw
  Cc: peterz, mingo, vincent.guittot, viresh.kumar, linux-arm-kernel,
	linux-doc, linux-kernel, linux-pm, Steve Capper

On 11/02/2020 18:45, Ionela Voinescu wrote:
> The activity monitors extension is an optional extension introduced
> by the ARMv8.4 CPU architecture. In order to access the activity
> monitors counters safely, if desired, the kernel should detect the
> presence of the extension through the feature register, and mediate
> the access.
> 
> Therefore, disable direct accesses to activity monitors counters
> from EL0 (userspace) and trap them to EL1 (kernel).
> 
> To be noted that the ARM64_AMU_EXTN kernel config and the disable_amu
> kernel parameter do not have an effect on this code. Given that the
> amuserenr_el0 resets to an UNKNOWN value, setting the trap of EL0
> accesses to EL1 is always attempted for safety and security
> considerations. Therefore firmware should still ensure accesses to
> AMU registers are not trapped in EL2/EL3 as this code cannot be
> bypassed if the CPU implements the Activity Monitors Unit.
> 
> Signed-off-by: Ionela Voinescu <ionela.voinescu@arm.com>
> Cc: Catalin Marinas <catalin.marinas@arm.com>
> Cc: Will Deacon <will@kernel.org>
> Cc: Mark Rutland <mark.rutland@arm.com>
> Cc: Steve Capper <steve.capper@arm.com>

Reviewed-by: Valentin Schneider <valentin.schneider@arm.com>


* Re: [PATCH v3 3/7] arm64/kvm: disable access to AMU registers from kvm guests
  2020-02-11 18:45 ` [PATCH v3 3/7] arm64/kvm: disable access to AMU registers from kvm guests Ionela Voinescu
@ 2020-02-12 15:36   ` Valentin Schneider
  2020-02-12 16:33   ` Suzuki Kuruppassery Poulose
  1 sibling, 0 replies; 42+ messages in thread
From: Valentin Schneider @ 2020-02-12 15:36 UTC (permalink / raw)
  To: Ionela Voinescu, catalin.marinas, will, mark.rutland, maz,
	suzuki.poulose, sudeep.holla, lukasz.luba, rjw
  Cc: peterz, mingo, vincent.guittot, viresh.kumar, linux-arm-kernel,
	linux-doc, linux-kernel, linux-pm, James Morse, Julien Thierry

On 11/02/2020 18:45, Ionela Voinescu wrote:
> Access to the AMU counters should be disabled by default in kvm guests,
> as information from the counters might reveal activity in other guests
> or activity on the host.
> 
> Therefore, disable access to AMU registers from EL0 and EL1 in kvm
> guests by:
>  - Hiding the presence of the extension in the feature register
>    (SYS_ID_AA64PFR0_EL1) on the VCPU.
>  - Disabling access to the AMU registers before switching to the guest.
>  - Trapping accesses and injecting an undefined instruction into the
>    guest.
> 
> Signed-off-by: Ionela Voinescu <ionela.voinescu@arm.com>
> Cc: Marc Zyngier <maz@kernel.org>
> Cc: James Morse <james.morse@arm.com>
> Cc: Julien Thierry <julien.thierry.kdev@gmail.com>
> Cc: Suzuki K Poulose <suzuki.poulose@arm.com>
> Cc: Catalin Marinas <catalin.marinas@arm.com>
> Cc: Will Deacon <will@kernel.org>

Reviewed-by: Valentin Schneider <valentin.schneider@arm.com> 


* Re: [PATCH v3 4/7] Documentation: arm64: document support for the AMU extension
  2020-02-11 18:45 ` [PATCH v3 4/7] Documentation: arm64: document support for the AMU extension Ionela Voinescu
@ 2020-02-12 15:36   ` Valentin Schneider
  0 siblings, 0 replies; 42+ messages in thread
From: Valentin Schneider @ 2020-02-12 15:36 UTC (permalink / raw)
  To: Ionela Voinescu, catalin.marinas, will, mark.rutland, maz,
	suzuki.poulose, sudeep.holla, lukasz.luba, rjw
  Cc: peterz, mingo, vincent.guittot, viresh.kumar, linux-arm-kernel,
	linux-doc, linux-kernel, linux-pm, Jonathan Corbet

On 11/02/2020 18:45, Ionela Voinescu wrote:
> The activity monitors extension is an optional extension introduced
> by the ARMv8.4 CPU architecture.
> 
> Add initial documentation for the AMUv1 extension:
>  - arm64/amu.txt: AMUv1 documentation
>  - arm64/booting.txt: system registers initialisation
> 
> Signed-off-by: Ionela Voinescu <ionela.voinescu@arm.com>
> Cc: Catalin Marinas <catalin.marinas@arm.com>
> Cc: Will Deacon <will@kernel.org>
> Cc: Jonathan Corbet <corbet@lwn.net>


Reviewed-by: Valentin Schneider <valentin.schneider@arm.com>


* Re: [PATCH v3 1/7] arm64: add support for the AMU extension v1
  2020-02-12 11:30   ` Suzuki Kuruppassery Poulose
  2020-02-12 14:54     ` Valentin Schneider
@ 2020-02-12 16:10     ` Ionela Voinescu
  2020-02-12 16:20       ` Suzuki Kuruppassery Poulose
  2020-02-12 16:24       ` Vladimir Murzin
  1 sibling, 2 replies; 42+ messages in thread
From: Ionela Voinescu @ 2020-02-12 16:10 UTC (permalink / raw)
  To: Suzuki Kuruppassery Poulose
  Cc: catalin.marinas, will, mark.rutland, maz, sudeep.holla,
	lukasz.luba, valentin.schneider, rjw, peterz, mingo,
	vincent.guittot, viresh.kumar, linux-arm-kernel, linux-doc,
	linux-kernel, linux-pm

Hi Suzuki,

On Wednesday 12 Feb 2020 at 11:30:44 (+0000), Suzuki Kuruppassery Poulose wrote:
> > +static int __init set_disable_amu(char *str)
> > +{
> > +	int value = 0;
> > +
> > +	disable_amu = get_option(&str, &value) ? !!value : true;
> 
> minor nit: You could simply use strtobool(str) here, which accepts:
> 
> disable_amu= [0/1/on/off/y/n]
>

Yes, this was intentional as I wanted "disable_amu" to be a valid option
as well, not only "disable_amu=<option>".

If you don't mind I'd like to keep it like this. Currently the use of
AMU is enabled by default, and the most common kernel parameter to
disable it would be "disable_amu". Allowing "disable_amu=0" is just in
case we change the default in the kernel to not support AMU and we'd
like platforms to be able to enable it. 

> 
> > +
> > +	return 0;
> > +}
> > +early_param("disable_amu", set_disable_amu);
> > +
> > +static bool has_amu(const struct arm64_cpu_capabilities *cap,
> > +		       int __unused)
> > +{
> > +	/*
> > +	 * The AMU extension is a non-conflicting feature: the kernel can
> > +	 * safely run a mix of CPUs with and without support for the
> > +	 * activity monitors extension. Therefore, if not disabled through
> > +	 * the kernel command line early parameter, enable the capability
> > +	 * to allow any late CPU to use the feature.
> > +	 *
> > +	 * With this feature enabled, the cpu_enable function will be called
> > +	 * for all CPUs that match the criteria, including secondary and
> > +	 * hotplugged, marking this feature as present on that respective CPU.
> > +	 * The enable function will also print a detection message.
> > +	 */
> > +
> > +	if (!disable_amu && !zalloc_cpumask_var(&amu_cpus, GFP_KERNEL)) {
> 
> This looks problematic. Don't we end up in allocating the memory during
> "each CPU" check and thus leaking memory ? Do we really need to allocate
> this dynamically ?
> 

Yes, it does make some assumptions. Given that the AMU capability is
a WEAK_LOCAL_CPU_FEATURE I relied on the match function being called
only once, when the return value is true. If the return value is false,
which will result in it being called multiple times, it's either because
disable_amu == true, or it has become true due to a previous failed
allocation, in which case a new allocation will not be attempted.

For better handling I could have a cpumask_available check before the
allocation just in case the capability type changes in the future, or to
at least not rely on assumptions based on the type of the capability.

The reason this is dynamic is that I wanted to avoid the memory being
allocated when disable_amu is true - as Valentin mentioned in a comment
in the meantime "the static allocation is done against NR_CPUS whereas
the dynamic one is done against nr_cpu_ids".

Would this be alright?

diff --git a/arch/arm64/kernel/cpufeature.c b/arch/arm64/kernel/cpufeature.c
index 182e05ca3410..4cee6b147ddd 100644
--- a/arch/arm64/kernel/cpufeature.c
+++ b/arch/arm64/kernel/cpufeature.c
@@ -1222,7 +1222,11 @@ static bool has_amu(const struct arm64_cpu_capabilities *cap,
         * The enable function will also print a detection message.
         */
 
-       if (!disable_amu && !zalloc_cpumask_var(&amu_cpus, GFP_KERNEL)) {
+       if (disable_amu)
+               return false;
+
+       if (!cpumask_available(amu_cpus) &&
+           !zalloc_cpumask_var(&amu_cpus, GFP_KERNEL)) {
                pr_err("Activity Monitors Unit (AMU): fail to allocate memory");
                disable_amu = true;
        }

Otherwise I can go for static allocation.

Thank you,
Ionela.


* Re: [PATCH v3 1/7] arm64: add support for the AMU extension v1
  2020-02-12 16:10     ` Ionela Voinescu
@ 2020-02-12 16:20       ` Suzuki Kuruppassery Poulose
  2020-02-12 18:20         ` Ionela Voinescu
  2020-02-12 20:19         ` Ionela Voinescu
  2020-02-12 16:24       ` Vladimir Murzin
  1 sibling, 2 replies; 42+ messages in thread
From: Suzuki Kuruppassery Poulose @ 2020-02-12 16:20 UTC (permalink / raw)
  To: Ionela Voinescu
  Cc: catalin.marinas, will, mark.rutland, maz, sudeep.holla,
	lukasz.luba, valentin.schneider, rjw, peterz, mingo,
	vincent.guittot, viresh.kumar, linux-arm-kernel, linux-doc,
	linux-kernel, linux-pm

Hi Ionela,

On 12/02/2020 16:10, Ionela Voinescu wrote:
> Hi Suzuki,
> 
> On Wednesday 12 Feb 2020 at 11:30:44 (+0000), Suzuki Kuruppassery Poulose wrote:
>>> +static int __init set_disable_amu(char *str)
>>> +{
>>> +	int value = 0;
>>> +
>>> +	disable_amu = get_option(&str, &value) ? !!value : true;
>>
>> minor nit: You could simply use strtobool(str) here, which accepts:
>>
>> disable_amu= [0/1/on/off/y/n]
>>
> 
> Yes, this was intentional as I wanted "disable_amu" to be a valid option
> as well, not only "disable_amu=<option>".
> 
> If you don't mind I'd like to keep it like this. Currently the use of

Sure, thats fine.

>>> +
>>> +	return 0;
>>> +}
>>> +early_param("disable_amu", set_disable_amu);
>>> +
>>> +static bool has_amu(const struct arm64_cpu_capabilities *cap,
>>> +		       int __unused)
>>> +{
>>> +	/*
>>> +	 * The AMU extension is a non-conflicting feature: the kernel can
>>> +	 * safely run a mix of CPUs with and without support for the
>>> +	 * activity monitors extension. Therefore, if not disabled through
>>> +	 * the kernel command line early parameter, enable the capability
>>> +	 * to allow any late CPU to use the feature.
>>> +	 *
>>> +	 * With this feature enabled, the cpu_enable function will be called
>>> +	 * for all CPUs that match the criteria, including secondary and
>>> +	 * hotplugged, marking this feature as present on that respective CPU.
>>> +	 * The enable function will also print a detection message.
>>> +	 */
>>> +
>>> +	if (!disable_amu && !zalloc_cpumask_var(&amu_cpus, GFP_KERNEL)) {
>>
>> This looks problematic. Don't we end up in allocating the memory during
>> "each CPU" check and thus leaking memory ? Do we really need to allocate
>> this dynamically ?
>>
> 
> Yes, it does make some assumptions. Given that the AMU capability is
> a WEAK_LOCAL_CPU_FEATURE I relied on the match function being called
> only once, when the return value is true. If the return value is false,

That is not correct. A WEAK_LOCAL_CPU_FEATURE is still SCOPE_LOCAL_CPU,
which implies it is run on all the booting CPUs (including the hotplugged
ones). The WEAK is there to imply that it is "permitted" or "optional"
for a hotplugged CPU. So, eventually you will re-allocate this variable
every single time a CPU turns up, where you could also lose the current
state.

> which will result in it being called multiple times, it's either because
> disable_amu == true, or it has become true due to a previous failed
> allocation, in which case a new allocation will not be attempted.
> 
> For better handling I could have a cpumask_available check before the
> allocation just in case the capability type changes in the future, or to
> at least not rely on assumptions based on the type of the capability.
> 
> The reason this is dynamic is that I wanted to avoid the memory being
> allocated when disable_amu is true - as Valentin mentioned in a comment
> in the meantime "the static allocation is done against NR_CPUS whereas
> the dynamic one is done against nr_cpu_ids".
> 
> Would this be alright?
> 
> diff --git a/arch/arm64/kernel/cpufeature.c b/arch/arm64/kernel/cpufeature.c
> index 182e05ca3410..4cee6b147ddd 100644
> --- a/arch/arm64/kernel/cpufeature.c
> +++ b/arch/arm64/kernel/cpufeature.c
> @@ -1222,7 +1222,11 @@ static bool has_amu(const struct arm64_cpu_capabilities *cap,
>           * The enable function will also print a detection message.
>           */
>   
> -       if (!disable_amu && !zalloc_cpumask_var(&amu_cpus, GFP_KERNEL)) {
> +       if (disable_amu)
> +               return false;
> +
> +       if (!cpumask_available(amu_cpus) &&
> +           !zalloc_cpumask_var(&amu_cpus, GFP_KERNEL)) {
>                  pr_err("Activity Monitors Unit (AMU): fail to allocate memory");
>                  disable_amu = true;
>          }

This looks fine.

Cheers
Suzuki


* Re: [PATCH v3 1/7] arm64: add support for the AMU extension v1
  2020-02-12 16:10     ` Ionela Voinescu
  2020-02-12 16:20       ` Suzuki Kuruppassery Poulose
@ 2020-02-12 16:24       ` Vladimir Murzin
  2020-02-12 18:27         ` Ionela Voinescu
  1 sibling, 1 reply; 42+ messages in thread
From: Vladimir Murzin @ 2020-02-12 16:24 UTC (permalink / raw)
  To: Ionela Voinescu, Suzuki Kuruppassery Poulose
  Cc: mark.rutland, maz, linux-doc, peterz, catalin.marinas, linux-pm,
	rjw, linux-kernel, mingo, viresh.kumar, linux-arm-kernel,
	sudeep.holla, will, valentin.schneider, lukasz.luba

Hi,

On 2/12/20 4:10 PM, Ionela Voinescu wrote:
> Hi Suzuki,
> 
> On Wednesday 12 Feb 2020 at 11:30:44 (+0000), Suzuki Kuruppassery Poulose wrote:
>>> +static int __init set_disable_amu(char *str)
>>> +{
>>> +	int value = 0;
>>> +
>>> +	disable_amu = get_option(&str, &value) ? !!value : true;
>> minor nit: You could simply use strtobool(str) here, which accepts:
>>
>> disable_amu= [0/1/on/off/y/n]
>>
> Yes, this was intentional as I wanted "disable_amu" to be a valid option
> as well, not only "disable_amu=<option>".
> 
> If you don't mind I'd like to keep it like this. Currently the use of
> AMU is enabled by default, and the most common kernel parameter to
> disable it would be "disable_amu". Allowing "disable_amu=0" is just in
> case we change the default in the kernel to not support AMU and we'd
> like platforms to be able to enable it. 
> 

Sorry for jumping into the thread, but can we avoid negatives in the naming
of options which accept values? It is always tricky to get the expected
effect when both are combined.

If the value doesn't really matter, then can it be just "noamu"?

If the value does matter, can it be (per Suzuki) amu=[0/1/on/off/y/n]?

Or can you postpone introduction of "just in case" option till that case happens?

Cheers
Vladimir


* Re: [PATCH v3 3/7] arm64/kvm: disable access to AMU registers from kvm guests
  2020-02-11 18:45 ` [PATCH v3 3/7] arm64/kvm: disable access to AMU registers from kvm guests Ionela Voinescu
  2020-02-12 15:36   ` Valentin Schneider
@ 2020-02-12 16:33   ` Suzuki Kuruppassery Poulose
  1 sibling, 0 replies; 42+ messages in thread
From: Suzuki Kuruppassery Poulose @ 2020-02-12 16:33 UTC (permalink / raw)
  To: Ionela Voinescu, catalin.marinas, will, mark.rutland, maz,
	sudeep.holla, lukasz.luba, valentin.schneider, rjw
  Cc: peterz, mingo, vincent.guittot, viresh.kumar, linux-arm-kernel,
	linux-doc, linux-kernel, linux-pm, James Morse, Julien Thierry

On 11/02/2020 18:45, Ionela Voinescu wrote:
> Access to the AMU counters should be disabled by default in kvm guests,
> as information from the counters might reveal activity in other guests
> or activity on the host.
> 
> Therefore, disable access to AMU registers from EL0 and EL1 in kvm
> guests by:
>   - Hiding the presence of the extension in the feature register
>     (SYS_ID_AA64PFR0_EL1) on the VCPU.
>   - Disabling access to the AMU registers before switching to the guest.
>   - Trapping accesses and injecting an undefined instruction into the
>     guest.
> 
> Signed-off-by: Ionela Voinescu <ionela.voinescu@arm.com>
> Cc: Marc Zyngier <maz@kernel.org>
> Cc: James Morse <james.morse@arm.com>
> Cc: Julien Thierry <julien.thierry.kdev@gmail.com>
> Cc: Suzuki K Poulose <suzuki.poulose@arm.com>
> Cc: Catalin Marinas <catalin.marinas@arm.com>
> Cc: Will Deacon <will@kernel.org>
> ---

Reviewed-by: Suzuki K Poulose <suzuki.poulose@arm.com>


* Re: [PATCH v3 1/7] arm64: add support for the AMU extension v1
  2020-02-12 16:20       ` Suzuki Kuruppassery Poulose
@ 2020-02-12 18:20         ` Ionela Voinescu
  2020-02-12 19:24           ` Suzuki K Poulose
  2020-02-12 20:19         ` Ionela Voinescu
  1 sibling, 1 reply; 42+ messages in thread
From: Ionela Voinescu @ 2020-02-12 18:20 UTC (permalink / raw)
  To: Suzuki Kuruppassery Poulose
  Cc: catalin.marinas, will, mark.rutland, maz, sudeep.holla,
	lukasz.luba, valentin.schneider, rjw, peterz, mingo,
	vincent.guittot, viresh.kumar, linux-arm-kernel, linux-doc,
	linux-kernel, linux-pm

Hi Suzuki,

On Wednesday 12 Feb 2020 at 16:20:56 (+0000), Suzuki Kuruppassery Poulose wrote:
> > > > +static bool has_amu(const struct arm64_cpu_capabilities *cap,
> > > > +		       int __unused)
> > > > +{
> > > > +	/*
> > > > +	 * The AMU extension is a non-conflicting feature: the kernel can
> > > > +	 * safely run a mix of CPUs with and without support for the
> > > > +	 * activity monitors extension. Therefore, if not disabled through
> > > > +	 * the kernel command line early parameter, enable the capability
> > > > +	 * to allow any late CPU to use the feature.
> > > > +	 *
> > > > +	 * With this feature enabled, the cpu_enable function will be called
> > > > +	 * for all CPUs that match the criteria, including secondary and
> > > > +	 * hotplugged, marking this feature as present on that respective CPU.
> > > > +	 * The enable function will also print a detection message.
> > > > +	 */
> > > > +
> > > > +	if (!disable_amu && !zalloc_cpumask_var(&amu_cpus, GFP_KERNEL)) {
> > > 
> > > This looks problematic. Don't we end up in allocating the memory during
> > > "each CPU" check and thus leaking memory ? Do we really need to allocate
> > > this dynamically ?
> > > 
> > 
> > Yes, it does make some assumptions. Given that the AMU capability is
> > a WEAK_LOCAL_CPU_FEATURE I relied on the match function being called
> > only once, when the return value is true. If the return value is false,
> 
> That is not correct. A WEAK_LOCAL_CPU_FEATURE is still SCOPE_LOCAL_CPU,
> which implies it is run on all the booting CPUs (including the hotplugged
> ones). The WEAK is there to imply that it is "permitted" or "optional"
> for a hotplugged CPU. So, eventually you will re-allocate this variable
> every single time a CPU turns up, where you could also lose the current
> state.
> 

> > which will result in it being called multiple times, it's either because
> > disable_amu == true, or it has become true due to a previous failed
> > allocation, in which case a new allocation will not be attempted.

First of all, I agree with you that this should be corrected.

But for completion (and my education) I retraced my steps in regards
to my assumption above. While cpu_enable is called for all CPUs - boot,
secondary, hotplugged, the matches function (in this case has_amu) is
not always called for all CPUs, and that's where the confusion came
from.

Looking over the update_cpu_capabilities function, that's called from
both setup_boot_cpu_capabilities and check_local_cpu_capabilities
(secondary CPUs) for SCOPE_LOCAL_CPU:

-----
static void update_cpu_capabilities(u16 scope_mask)
{
        int i;
        const struct arm64_cpu_capabilities *caps;

        scope_mask &= ARM64_CPUCAP_SCOPE_MASK;
        for (i = 0; i < ARM64_NCAPS; i++) {
                caps = cpu_hwcaps_ptrs[i];
                if (!caps || !(caps->type & scope_mask) ||
                    cpus_have_cap(caps->capability) ||
                    !caps->matches(caps, cpucap_default_scope(caps)))
                        continue;

--> The matches function is only called if !cpus_have_cap


                if (caps->desc)
                        pr_info("detected: %s\n", caps->desc);
                cpus_set_cap(caps->capability);

--> If matches returns true we mark it as present in cpu_hwcaps.

                if ((scope_mask & SCOPE_BOOT_CPU) && (caps->type & SCOPE_BOOT_CPU))
                        set_bit(caps->capability, boot_capabilities);
        }   
}
---

Therefore caps->matches (in this case has_amu) will only be called as
long as it returns false. This is where my assumption above came from.
Also, this is the reason it was working nicely in my testing, as I did
not test hotplug this time.

Where the has_amu code breaks is when we end up calling
verify_local_cpu_capabilities instead of update_cpu_capabilities after
sys_caps_initialised, which will happen for hotplugged CPUs.
In that case we call caps->matches for all CPUs. It would also break
if anyone in the future ends up calling this_cpu_has_cap for the AMU
capability.
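To make the flow concrete, here is a toy userspace model of it (purely
illustrative, not kernel code; function names mirror the kernel's but
the bodies are stand-ins): the update path short-circuits matches()
once the capability bit is set, while the hotplug verification path
calls it unconditionally.

```c
#include <stdbool.h>

static bool cap_set;		/* stand-in for the cpu_hwcaps bit */
static int matches_calls;

static bool has_amu_matches(void)
{
	matches_calls++;
	return true;
}

/* Boot/secondary path: matches() is skipped once the cap is set. */
static void update_cpu_capabilities(void)
{
	if (cap_set)			/* cpus_have_cap() short-circuit */
		return;
	if (has_amu_matches())
		cap_set = true;		/* cpus_set_cap() */
}

/* Hotplug path after sys_caps_initialised: matches() runs every time. */
static void verify_local_cpu_capabilities(void)
{
	has_amu_matches();
}
```

Running update_cpu_capabilities() twice calls matches() only once,
while every verify_local_cpu_capabilities() call invokes it again.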

I will fix this.

Thank you,
Ionela.


^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: [PATCH v3 1/7] arm64: add support for the AMU extension v1
  2020-02-12 16:24       ` Vladimir Murzin
@ 2020-02-12 18:27         ` Ionela Voinescu
  0 siblings, 0 replies; 42+ messages in thread
From: Ionela Voinescu @ 2020-02-12 18:27 UTC (permalink / raw)
  To: Vladimir Murzin
  Cc: Suzuki Kuruppassery Poulose, mark.rutland, maz, linux-doc,
	peterz, catalin.marinas, linux-pm, rjw, linux-kernel, mingo,
	viresh.kumar, linux-arm-kernel, sudeep.holla, will,
	valentin.schneider, lukasz.luba

On Wednesday 12 Feb 2020 at 16:24:42 (+0000), Vladimir Murzin wrote:
> Hi,
> 
> On 2/12/20 4:10 PM, Ionela Voinescu wrote:
> > Hi Suzuki,
> > 
> > On Wednesday 12 Feb 2020 at 11:30:44 (+0000), Suzuki Kuruppassery Poulose wrote:
> >>> +static int __init set_disable_amu(char *str)
> >>> +{
> >>> +	int value = 0;
> >>> +
> >>> +	disable_amu = get_option(&str, &value) ? !!value : true;
> >> minor nit: You could simply use strtobool(str) here, which accepts:
> >>
> >> disable_amu= [0/1/on/off/y/n]
> >>
> > Yes, this was intentional as I wanted "disable_amu" to be a valid option
> > as well, not only "disable_amu=<option>".
> > 
> > If you don't mind I'd like to keep it like this. Currently the use of
> > AMU is enabled by default, and the most common kernel parameter to
> > disable it would be "disable_amu". Allowing "disable_amu=0" is just in
> > case we change the default in the kernel to not support AMU and we'd
> > like platforms to be able to enable it. 
> > 
> 
> Sorry for jumping into the thread, but can we avoid negatives in names which
> accept values? It is always tricky to get the expected effect when both are combined.
> 
> If the value doesn't really matter then can it be just "noamu"?
> 
> If the value does matter can it be (per Suzuki) amu=[0/1/on/off/y/n]?
> 
> Or can you postpone the introduction of the "just in case" option until that case happens?
>

No worries, thank you very much for the input. I'll change it to
amu=[0/1/on/off/y/n] for clarity.
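For the record, a quick userspace sketch of the values amu= would then
accept (my approximation of the kernel's strtobool()/kstrtobool()
semantics, not a copy of its code):

```c
#include <stdbool.h>

/* Approximation of kernel strtobool(): only the first character
 * (and the second, for "on"/"off") is significant. */
static int parse_amu_bool(const char *s, bool *res)
{
	if (!s)
		return -1;
	switch (s[0]) {
	case 'y': case 'Y': case '1':
		*res = true;
		return 0;
	case 'n': case 'N': case '0':
		*res = false;
		return 0;
	case 'o': case 'O':
		switch (s[1]) {
		case 'n': case 'N':
			*res = true;
			return 0;
		case 'f': case 'F':
			*res = false;
			return 0;
		}
		/* fall through */
	default:
		return -1;
	}
}
```

So "amu=1", "amu=on" and "amu=y" enable the feature, "amu=0",
"amu=off" and "amu=n" disable it, and anything else is rejected.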

Thank you,
Ionela.

> Cheers
> Vladimir


* Re: [PATCH v3 6/7] arm64: use activity monitors for frequency invariance
  2020-02-11 18:45 ` [PATCH v3 6/7] arm64: use activity monitors for frequency invariance Ionela Voinescu
@ 2020-02-12 18:59   ` Lukasz Luba
  2020-02-13  9:47     ` Ionela Voinescu
  2020-02-17 16:59   ` Valentin Schneider
  1 sibling, 1 reply; 42+ messages in thread
From: Lukasz Luba @ 2020-02-12 18:59 UTC (permalink / raw)
  To: Ionela Voinescu, catalin.marinas, will, mark.rutland, maz,
	suzuki.poulose, sudeep.holla, valentin.schneider, rjw
  Cc: peterz, mingo, vincent.guittot, viresh.kumar, linux-arm-kernel,
	linux-doc, linux-kernel, linux-pm

Hi Ionela,

On 2/11/20 6:45 PM, Ionela Voinescu wrote:
> The Frequency Invariance Engine (FIE) is providing a frequency
> scaling correction factor that helps achieve more accurate
> load-tracking.
> 
> So far, for arm and arm64 platforms, this scale factor has been
> obtained based on the ratio between the current frequency and the
> maximum supported frequency recorded by the cpufreq policy. The
> setting of this scale factor is triggered from cpufreq drivers by
> calling arch_set_freq_scale. The current frequency used in computation
> is the frequency requested by a governor, but it may not be the
> frequency that was implemented by the platform.
> 
> This correction factor can also be obtained using a core counter and a
> constant counter to get information on the performance (frequency based
> only) obtained in a period of time. This will more accurately reflect
> the actual current frequency of the CPU, compared with the alternative
> implementation that reflects the request of a performance level from
> the OS.
> 
> Therefore, implement arch_scale_freq_tick to use activity monitors, if
> present, for the computation of the frequency scale factor.
> 
> The use of AMU counters depends on:
> >   - CONFIG_ARM64_AMU_EXTN - depends on the AMU extension being present
>   - (optional) CONFIG_CPU_FREQ - the current frequency obtained using
>     counter information is divided by the maximum frequency obtained from
>     the cpufreq policy. But the use of cpufreq policy maximum frequency
>     is weak and cpu_get_max_freq can be redefined to provide the data
>     some other way.
> 
> While it is possible to have a combination of CPUs in the system with
> and without support for activity monitors, the use of counters for
> frequency invariance is only enabled for a CPU if all related CPUs
> (CPUs in the same frequency domain) support and have enabled the core
> and constant activity monitor counters. In this way, there is a clear
> separation between the policies for which arch_set_freq_scale (cpufreq
> based FIE) is used, and the policies for which arch_scale_freq_tick
> (counter based FIE) is used to set the frequency scale factor. For
> this purpose, a late_initcall_sync is registered to trigger validation
> work for policies that will enable or disable the use of AMU counters
> for frequency invariance. If CONFIG_CPU_FREQ is not defined, the use
> of counters is enabled on all CPUs only if all possible CPUs correctly
> support the necessary counters.
> 
> Signed-off-by: Ionela Voinescu <ionela.voinescu@arm.com>
> Cc: Catalin Marinas <catalin.marinas@arm.com>
> Cc: Will Deacon <will@kernel.org>
> Cc: Sudeep Holla <sudeep.holla@arm.com>
> ---
>   arch/arm64/include/asm/topology.h |  16 +++
>   arch/arm64/kernel/cpufeature.c    |   4 +
>   arch/arm64/kernel/topology.c      | 185 ++++++++++++++++++++++++++++++
>   drivers/base/arch_topology.c      |   8 ++
>   include/linux/topology.h          |   7 ++
>   5 files changed, 220 insertions(+)
> 
> diff --git a/arch/arm64/include/asm/topology.h b/arch/arm64/include/asm/topology.h
> index a4d945db95a2..d910d463cf76 100644
> --- a/arch/arm64/include/asm/topology.h
> +++ b/arch/arm64/include/asm/topology.h
> @@ -16,6 +16,22 @@ int pcibus_to_node(struct pci_bus *bus);
>   
>   #include <linux/arch_topology.h>
>   
> +#ifdef CONFIG_ARM64_AMU_EXTN
> +extern unsigned int cpu_get_max_freq(unsigned int cpu);
> +/*
> + * Replace default function that signals use of counters
> + * for frequency invariance for given CPUs.
> + */
> +bool topology_cpu_freq_counters(struct cpumask *cpus);
> +#define arch_cpu_freq_counters topology_cpu_freq_counters
> +/*
> + * Replace task scheduler's default counter-based
> + * frequency-invariance scale factor setting.
> + */
> +void topology_scale_freq_tick(void);
> +#define arch_scale_freq_tick topology_scale_freq_tick
> +#endif /* CONFIG_ARM64_AMU_EXTN */
> +
>   /* Replace task scheduler's default frequency-invariant accounting */
>   #define arch_scale_freq_capacity topology_get_freq_scale
>   
> diff --git a/arch/arm64/kernel/cpufeature.c b/arch/arm64/kernel/cpufeature.c
> index 029a473ad273..a4620b269b56 100644
> --- a/arch/arm64/kernel/cpufeature.c
> +++ b/arch/arm64/kernel/cpufeature.c
> @@ -1172,12 +1172,16 @@ bool cpu_has_amu_feat(int cpu)
>   	return false;
>   }
>   
> +/* Initialize the use of AMU counters for frequency invariance */
> +extern void init_cpu_freq_invariance_counters(void);
> +
>   static void cpu_amu_enable(struct arm64_cpu_capabilities const *cap)
>   {
>   	if (has_cpuid_feature(cap, SCOPE_LOCAL_CPU)) {
>   		pr_info("detected CPU%d: Activity Monitors Unit (AMU)\n",
>   			smp_processor_id());
>   		cpumask_set_cpu(smp_processor_id(), amu_cpus);
> +		init_cpu_freq_invariance_counters();
>   	}
>   }
>   
> diff --git a/arch/arm64/kernel/topology.c b/arch/arm64/kernel/topology.c
> index fa9528dfd0ce..3f6e379f07b3 100644
> --- a/arch/arm64/kernel/topology.c
> +++ b/arch/arm64/kernel/topology.c
> @@ -14,6 +14,7 @@
>   #include <linux/acpi.h>
>   #include <linux/arch_topology.h>
>   #include <linux/cacheinfo.h>
> +#include <linux/cpufreq.h>
>   #include <linux/init.h>
>   #include <linux/percpu.h>
>   
> @@ -120,4 +121,188 @@ int __init parse_acpi_topology(void)
>   }
>   #endif
>   
> +#ifdef CONFIG_ARM64_AMU_EXTN
>   
> +#undef pr_fmt
> +#define pr_fmt(fmt) "AMU: " fmt
> +
> +static DEFINE_PER_CPU_READ_MOSTLY(unsigned long, arch_max_freq_scale);
> +static DEFINE_PER_CPU(u64, arch_const_cycles_prev);
> +static DEFINE_PER_CPU(u64, arch_core_cycles_prev);
> +static cpumask_var_t amu_fie_cpus;
> +
> +/* Obtain max frequency (in KHz) as reported by hardware */
> +__weak unsigned int cpu_get_max_freq(unsigned int cpu)
> +{
> +	return 0;
> +}
> +
> +#ifdef CONFIG_CPU_FREQ
> +/* Replace max frequency getter with cpufreq based function */
> +#define cpu_get_max_freq cpufreq_get_hw_max_freq
> +#endif

Can we just use cpufreq_get_hw_max_freq()?
We have cpufreq_get_hw_max_freq returning 0 in such a case, so it should
be OK.

Is there a possibility that some platform which has !CONFIG_CPU_FREQ
would define its own cpu_get_max_freq() overriding the weak function
above?
Based on the code which checks 'if (unlikely(!max_freq_hz))' it should,
otherwise 'valid_cpus' is not set.

I would assume that we won't see such platform, interested
in AMU freq invariance without CONFIG_CPU_FREQ.

We already have a lot of these defines or __weak functions, which is
hard to follow.

Apart from that and the issue with cpu_has_amu_feat() it looks good.

> +
> +/* Initialize counter reference per-cpu variables for the current CPU */
> +void init_cpu_freq_invariance_counters(void)
> +{
> +	this_cpu_write(arch_core_cycles_prev,
> +		       read_sysreg_s(SYS_AMEVCNTR0_CORE_EL0));
> +	this_cpu_write(arch_const_cycles_prev,
> +		       read_sysreg_s(SYS_AMEVCNTR0_CONST_EL0));
> +}
> +
> +static int validate_cpu_freq_invariance_counters(int cpu)
> +{
> +	u64 max_freq_hz, ratio;
> +
> +	if (!cpu_has_amu_feat(cpu)) {

As Suzuki pointed out with 'amu_cpus', make sure we read
one instance of the mask here.


Regards,
Lukasz


* Re: [PATCH v3 1/7] arm64: add support for the AMU extension v1
  2020-02-12 18:20         ` Ionela Voinescu
@ 2020-02-12 19:24           ` Suzuki K Poulose
  0 siblings, 0 replies; 42+ messages in thread
From: Suzuki K Poulose @ 2020-02-12 19:24 UTC (permalink / raw)
  To: Ionela.Voinescu
  Cc: catalin.marinas, will, mark.rutland, maz, sudeep.holla,
	lukasz.luba, valentin.schneider, rjw, peterz, mingo,
	vincent.guittot, viresh.kumar, linux-arm-kernel, linux-doc,
	linux-kernel, linux-pm

Hi Ionela,

On 02/12/2020 06:20 PM, Ionela Voinescu wrote:
> Hi Suzuki,
> 
> On Wednesday 12 Feb 2020 at 16:20:56 (+0000), Suzuki Kuruppassery Poulose wrote:
>>>>> +static bool has_amu(const struct arm64_cpu_capabilities *cap,
>>>>> +		       int __unused)
>>>>> +{
>>>>> +	/*
>>>>> +	 * The AMU extension is a non-conflicting feature: the kernel can
>>>>> +	 * safely run a mix of CPUs with and without support for the
>>>>> +	 * activity monitors extension. Therefore, if not disabled through
>>>>> +	 * the kernel command line early parameter, enable the capability
>>>>> +	 * to allow any late CPU to use the feature.
>>>>> +	 *
>>>>> +	 * With this feature enabled, the cpu_enable function will be called
>>>>> +	 * for all CPUs that match the criteria, including secondary and
>>>>> +	 * hotplugged, marking this feature as present on that respective CPU.
>>>>> +	 * The enable function will also print a detection message.
>>>>> +	 */
>>>>> +
>>>>> +	if (!disable_amu && !zalloc_cpumask_var(&amu_cpus, GFP_KERNEL)) {
>>>>
>>>> This looks problematic. Don't we end up in allocating the memory during
>>>> "each CPU" check and thus leaking memory ? Do we really need to allocate
>>>> this dynamically ?
>>>>
>>>
>>> Yes, it does make some assumptions. Given that the AMU capability is
>>> a WEAK_LOCAL_CPU_FEATURE I relied on the match function being called
>>> only once, when the return value is true. If the return value is false,
>>
>> That is not correct. A WEAK_LOCAL_CPU_FEATURE is still SCOPE_LOCAL_CPU,
>> which implies it is run on all the booting CPUs (including the hotplugged
>> ones). The WEAK is there to imply that it's "permitted" or "optional"
>> for a hotplugged CPU. So, eventually you will re-allocate this variable
>> every single time a CPU turns up, where you could also lose the current
>> state.
>>
> 
>>> which will result in it being called multiple times, it's either because
>>> disable_amu == false, or it has become false due to a previous failed
>>> allocation, in which case a new allocation will not be attempted.
> 
> First of all, I agree with you that this should be corrected.
> 
> But for completeness (and my own education) I retraced my steps with
> regard to my assumption above. While cpu_enable is called for all CPUs
> (boot, secondary, hotplugged), the matches function (in this case
> has_amu) is not always called for all CPUs, and that's where the
> confusion came from.
> 
> Looking over the update_cpu_capabilities function, which is called from
> both setup_boot_cpu_capabilities and check_local_cpu_capabilities
> (secondary CPUs) for SCOPE_LOCAL_CPU:
> 
> -----
> static void update_cpu_capabilities(u16 scope_mask)
> {
>          int i;
>          const struct arm64_cpu_capabilities *caps;
> 
>          scope_mask &= ARM64_CPUCAP_SCOPE_MASK;
>          for (i = 0; i < ARM64_NCAPS; i++) {
>                  caps = cpu_hwcaps_ptrs[i];
>                  if (!caps || !(caps->type & scope_mask) ||
>                      cpus_have_cap(caps->capability) ||
>                      !caps->matches(caps, cpucap_default_scope(caps)))
>                          continue;
> 
> --> The matches function is only called if !cpus_have_cap

Agreed. Your analysis is correct. This was done as a micro
optimization(!) as it is pointless to check if something should be set, 
when it is already set.

> 
> 
>                  if (caps->desc)
>                          pr_info("detected: %s\n", caps->desc);
>                  cpus_set_cap(caps->capability);
> 
> --> If matches returns true we mark it as present in cpu_hwcaps.
> 
>                  if ((scope_mask & SCOPE_BOOT_CPU) && (caps->type & SCOPE_BOOT_CPU))
>                          set_bit(caps->capability, boot_capabilities);
>          }
> }
> ---
> 
> Therefore caps->matches (in this case has_amu) will only be called as
> long as it returns false. This is where my assumption above came from.
> Also, this is the reason it was working nicely in my testing, as I did
> not test hotplug this time.
> 
> Where the has_amu code breaks is when we end up calling
> verify_local_cpu_capabilities instead of update_cpu_capabilities after
> sys_caps_initialised, which will happen for hotplugged CPUs.
> In that case we call caps->matches for all CPUs. It would also break
> if anyone in the future ends up calling this_cpu_has_cap for the AMU
> capability.

True.

> 
> I will fix this.

Cheers
Suzuki


* Re: [PATCH v3 1/7] arm64: add support for the AMU extension v1
  2020-02-12 16:20       ` Suzuki Kuruppassery Poulose
  2020-02-12 18:20         ` Ionela Voinescu
@ 2020-02-12 20:19         ` Ionela Voinescu
  1 sibling, 0 replies; 42+ messages in thread
From: Ionela Voinescu @ 2020-02-12 20:19 UTC (permalink / raw)
  To: Suzuki Kuruppassery Poulose
  Cc: catalin.marinas, will, mark.rutland, maz, sudeep.holla,
	lukasz.luba, valentin.schneider, rjw, peterz, mingo,
	vincent.guittot, viresh.kumar, linux-arm-kernel, linux-doc,
	linux-kernel, linux-pm

Hi guys,

On Wednesday 12 Feb 2020 at 16:20:56 (+0000), Suzuki Kuruppassery Poulose wrote:
> > For better handling I could have a cpumask_available check before the
> > allocation just in case the capability type changes in the future, or to
> > at least not rely on assumptions based on the type of the capability.
> > 
> > The reason this is dynamic is that I wanted to avoid the memory being
> > allocated when disable_amu is true - as Valentin mentioned in a comment
> > in the meantime "the static allocation is done against NR_CPUS whereas
> > the dynamic one is done against nr_cpu_ids".
> > 
> > Would this be alright?
> > 
> > diff --git a/arch/arm64/kernel/cpufeature.c b/arch/arm64/kernel/cpufeature.c
> > index 182e05ca3410..4cee6b147ddd 100644
> > --- a/arch/arm64/kernel/cpufeature.c
> > +++ b/arch/arm64/kernel/cpufeature.c
> > @@ -1222,7 +1222,11 @@ static bool has_amu(const struct arm64_cpu_capabilities *cap,
> >           * The enable function will also print a detection message.
> >           */
> > -       if (!disable_amu && !zalloc_cpumask_var(&amu_cpus, GFP_KERNEL)) {
> > +       if (disable_amu)
> > +               return false;
> > +
> > +       if (!cpumask_available(amu_cpus) &&
> > +           !zalloc_cpumask_var(&amu_cpus, GFP_KERNEL)) {
> >                  pr_err("Activity Monitors Unit (AMU): fail to allocate memory");
> >                  disable_amu = true;
> >          }

Going down the rabbit hole with regard to this section, it seems it's
actually not fine. When CONFIG_CPUMASK_OFFSTACK=y it fails to
allocate memory because zalloc_cpumask_var cannot be used to allocate
a cpumask before the slub allocator is initialized (mm_init).

The alternative alloc_bootmem_cpumask_var is an __init function that
can be used only during the initialization phase, which is not the case
for has_amu that can be called later (for hotplugged CPUs). Therefore,
dynamic allocation is not an option here.
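A minimal sketch of the static alternative mentioned earlier (my
assumption of one possible shape of the fix, not the actual patch): a
bitmap sized against NR_CPUS needs no allocator, so it is safe no
matter how early or how often the matches function runs.

```c
#include <stdbool.h>

/* Illustrative stand-in for a statically sized cpumask;
 * NR_CPUS here is an arbitrary example value, not the kernel config. */
#define NR_CPUS 8

static unsigned long amu_cpus_bits[(NR_CPUS + 63) / 64];

static void amu_cpu_set(int cpu)
{
	amu_cpus_bits[cpu / 64] |= 1UL << (cpu % 64);
}

static bool amu_cpu_test(int cpu)
{
	return amu_cpus_bits[cpu / 64] & (1UL << (cpu % 64));
}
```

The trade-off Valentin noted still applies: the static mask is sized
against NR_CPUS rather than nr_cpu_ids.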

Thanks,
Ionela.


* Re: [PATCH v3 6/7] arm64: use activity monitors for frequency invariance
  2020-02-12 18:59   ` Lukasz Luba
@ 2020-02-13  9:47     ` Ionela Voinescu
  0 siblings, 0 replies; 42+ messages in thread
From: Ionela Voinescu @ 2020-02-13  9:47 UTC (permalink / raw)
  To: Lukasz Luba
  Cc: catalin.marinas, will, mark.rutland, maz, suzuki.poulose,
	sudeep.holla, valentin.schneider, rjw, peterz, mingo,
	vincent.guittot, viresh.kumar, linux-arm-kernel, linux-doc,
	linux-kernel, linux-pm

Hi Lukasz,

[..]
> > +
> > +/* Obtain max frequency (in KHz) as reported by hardware */
> > +__weak unsigned int cpu_get_max_freq(unsigned int cpu)
> > +{
> > +	return 0;
> > +}
> > +
> > +#ifdef CONFIG_CPU_FREQ
> > +/* Replace max frequency getter with cpufreq based function */
> > +#define cpu_get_max_freq cpufreq_get_hw_max_freq
> > +#endif
> 
> Can we just use cpufreq_get_hw_max_freq()?
> We have cpufreq_get_hw_max_freq returning 0 in such case, so it should
> be OK.
> 

The reasoning for the implementation is the following:
 - For CONFIG_CPU_FREQ we use cpufreq_get_hw_max_freq (weak default or
   strong alternative)
 - For !CONFIG_CPU_FREQ cpufreq_get_hw_max_freq returns 0 - it signals
   that cpufreq cannot return the hardware max frequency. In this case
   cpu_get_max_freq is used (weak default or strong alternative
   implementation).

> Is there a possibility that some platform which has !CONFIG_CPU_FREQ
> would define its own cpu_get_max_freq() overriding the weak function
> above?
> Based on the code which checks 'if (unlikely(!max_freq_hz))' it should,
> otherwise 'valid_cpus' is not set.
> 
> I would assume that we won't see such platform, interested
> in AMU freq invariance without CONFIG_CPU_FREQ.
> 
> We already have a lot of these defines or __weak functions, which is
> hard to follow.

There is no dependency between CONFIG_CPU_FREQ and frequency invariance.
Therefore, I did not see a reason to potentially bypass the use of AMU
for frequency invariance for !CONFIG_CPU_FREQ.

But I agree it makes the code harder to read, so I can remove
cpu_get_max_freq and keep only cpufreq_get_hw_max_freq until there is a
proven need for it.

Thank you for the review,
Ionela.


* Re: [PATCH v3 5/7] cpufreq: add function to get the hardware max frequency
  2020-02-11 18:45 ` [PATCH v3 5/7] cpufreq: add function to get the hardware max frequency Ionela Voinescu
  2020-02-12  4:14   ` Viresh Kumar
@ 2020-02-13 11:59   ` Valentin Schneider
  2020-02-13 12:59     ` Ionela Voinescu
  1 sibling, 1 reply; 42+ messages in thread
From: Valentin Schneider @ 2020-02-13 11:59 UTC (permalink / raw)
  To: Ionela Voinescu, catalin.marinas, will, mark.rutland, maz,
	suzuki.poulose, sudeep.holla, lukasz.luba, rjw
  Cc: peterz, mingo, vincent.guittot, viresh.kumar, linux-arm-kernel,
	linux-doc, linux-kernel, linux-pm

On 2/11/20 6:45 PM, Ionela Voinescu wrote:
> +/**
> + * cpufreq_get_hw_max_freq - get the max hardware frequency of the CPU
> + * @cpu: CPU number
> + *
> + * The default return value is the max_freq field of cpuinfo.
> + */
> +__weak unsigned int cpufreq_get_hw_max_freq(unsigned int cpu)
> +{
> +	struct cpufreq_policy *policy = cpufreq_cpu_get(cpu);
> +	unsigned int ret_freq = 0;
> +
> +	if (policy) {
> +		ret_freq = policy->cpuinfo.max_freq;
> +		cpufreq_cpu_put(policy);

What about intel_pstate / turbo stuff? IIRC one of Giovanni's issues was that
turbo freq is not always reported as the max freq. Dunno if we can do
anything about it; at the very least maybe document the caveat?

> +	}
> +
> +	return ret_freq;
> +}
> +EXPORT_SYMBOL(cpufreq_get_hw_max_freq);
> +
>  static unsigned int __cpufreq_get(struct cpufreq_policy *policy)
>  {
>  	if (unlikely(policy_is_inactive(policy)))



* Re: [PATCH v3 5/7] cpufreq: add function to get the hardware max frequency
  2020-02-13 11:59   ` Valentin Schneider
@ 2020-02-13 12:59     ` Ionela Voinescu
  2020-02-13 15:22       ` Valentin Schneider
  0 siblings, 1 reply; 42+ messages in thread
From: Ionela Voinescu @ 2020-02-13 12:59 UTC (permalink / raw)
  To: Valentin Schneider
  Cc: catalin.marinas, will, mark.rutland, maz, suzuki.poulose,
	sudeep.holla, lukasz.luba, rjw, peterz, mingo, vincent.guittot,
	viresh.kumar, linux-arm-kernel, linux-doc, linux-kernel,
	linux-pm

On Thursday 13 Feb 2020 at 11:59:56 (+0000), Valentin Schneider wrote:
> On 2/11/20 6:45 PM, Ionela Voinescu wrote:
> > +/**
> > + * cpufreq_get_hw_max_freq - get the max hardware frequency of the CPU
> > + * @cpu: CPU number
> > + *
> > + * The default return value is the max_freq field of cpuinfo.
> > + */
> > +__weak unsigned int cpufreq_get_hw_max_freq(unsigned int cpu)
> > +{
> > +	struct cpufreq_policy *policy = cpufreq_cpu_get(cpu);
> > +	unsigned int ret_freq = 0;
> > +
> > +	if (policy) {
> > +		ret_freq = policy->cpuinfo.max_freq;
> > +		cpufreq_cpu_put(policy);
> 
> What about intel_pstate / turbo stuff? IIRC one of Giovanni's issues was that
> turbo freq is not always reported as the max freq. Dunno if we can do
> anything about it; at the very least maybe document the caveat?
>

Okay, I can add details in the description with regard to potential
reasons to override this function. But basically this is one of the
reasons for making this a weak function. The best information we can
generically get for the maximum hardware frequency is cpuinfo.max_freq.
But if platforms have the possibility to obtain this differently from
either hardware or firmware they can override it.

Thanks,
Ionela.

> > +	}
> > +
> > +	return ret_freq;
> > +}
> > +EXPORT_SYMBOL(cpufreq_get_hw_max_freq);
> > +
> >  static unsigned int __cpufreq_get(struct cpufreq_policy *policy)
> >  {
> >  	if (unlikely(policy_is_inactive(policy)))
> 


* Re: [PATCH v3 5/7] cpufreq: add function to get the hardware max frequency
  2020-02-13 12:59     ` Ionela Voinescu
@ 2020-02-13 15:22       ` Valentin Schneider
  0 siblings, 0 replies; 42+ messages in thread
From: Valentin Schneider @ 2020-02-13 15:22 UTC (permalink / raw)
  To: Ionela Voinescu
  Cc: catalin.marinas, will, mark.rutland, maz, suzuki.poulose,
	sudeep.holla, lukasz.luba, rjw, peterz, mingo, vincent.guittot,
	viresh.kumar, linux-arm-kernel, linux-doc, linux-kernel,
	linux-pm

On 2/13/20 12:59 PM, Ionela Voinescu wrote:
>> What about intel_pstate / turbo stuff? IIRC one of Giovanni's issues was that
>> turbo freq is not always reported as the max freq. Dunno if we can do
>> anything about it; at the very least maybe document the caveat?
>>
> 
> Okay, I can add details in the description in regards to potential
> reasons to overwrite this function. But basically this is one of the
> reasons for making this a weak function. The best information we can
> generically get for maximum hardware frequency is cpuinfo.max_freq.
> But if platforms have the possibility to obtain this differently from
> either hardware or firmware they can overwrite this.
> 

Right, that would be handled by a different implementation of
that function, so this wasn't a very relevant comment. Sorry!



* Re: [PATCH v3 7/7] clocksource/drivers/arm_arch_timer: validate arch_timer_rate
  2020-02-11 18:45 ` [PATCH v3 7/7] clocksource/drivers/arm_arch_timer: validate arch_timer_rate Ionela Voinescu
  2020-02-12  9:30   ` Valentin Schneider
  2020-02-12 10:01   ` Lukasz Luba
@ 2020-02-14  0:35   ` Thomas Gleixner
  2020-02-14 15:45     ` Ionela Voinescu
  2 siblings, 1 reply; 42+ messages in thread
From: Thomas Gleixner @ 2020-02-14  0:35 UTC (permalink / raw)
  To: Ionela Voinescu, catalin.marinas, will, mark.rutland, maz,
	suzuki.poulose, sudeep.holla, lukasz.luba, valentin.schneider,
	rjw, ionela.voinescu
  Cc: peterz, mingo, vincent.guittot, viresh.kumar, linux-arm-kernel,
	linux-doc, linux-kernel, linux-pm

Ionela Voinescu <ionela.voinescu@arm.com> writes:

> From: Valentin Schneider <valentin.schneider@arm.com>
>
> Using an arch timer with a frequency of less than 1MHz can result in
> incorrect functionality in parts of the system that assume a reasonable rate.
>
> One example is the use of activity monitors for frequency invariance
> which uses the rate of the arch timer as the known rate of the constant
> cycle counter in computing its ratio compared to the maximum frequency
> of a CPU. For arch timer frequencies less than 1MHz this ratio could
> end up being 0 which is an invalid value for its use.
>
> Therefore, warn if the arch timer rate is below 1MHz which contravenes
> the recommended architecture interval of 1 to 50MHz.
>
> Signed-off-by: Ionela Voinescu <ionela.voinescu@arm.com>

So this patch is from Valentin. Where is his Signed-off-by?

>  
> +static int validate_timer_rate(void)
> +{
> +	if (!arch_timer_rate)
> +		return -EINVAL;
> +
> +	/* Arch timer frequency < 1MHz can cause trouble */
> +	WARN_ON(arch_timer_rate < 1000000);

This does not make sense to me. If the rate is out of bounds then why
warn and just continue instead of making it fail?

Thanks,

        tglx


* Re: [PATCH v3 7/7] clocksource/drivers/arm_arch_timer: validate arch_timer_rate
  2020-02-14  0:35   ` Thomas Gleixner
@ 2020-02-14 15:45     ` Ionela Voinescu
  2020-02-14 15:57       ` Ionela Voinescu
  0 siblings, 1 reply; 42+ messages in thread
From: Ionela Voinescu @ 2020-02-14 15:45 UTC (permalink / raw)
  To: Thomas Gleixner
  Cc: catalin.marinas, will, mark.rutland, maz, suzuki.poulose,
	sudeep.holla, lukasz.luba, valentin.schneider, rjw, peterz,
	mingo, vincent.guittot, viresh.kumar, linux-arm-kernel,
	linux-doc, linux-kernel, linux-pm

Hi Thomas,

On Friday 14 Feb 2020 at 01:35:58 (+0100), Thomas Gleixner wrote:
> Ionela Voinescu <ionela.voinescu@arm.com> writes:
> 
> > From: Valentin Schneider <valentin.schneider@arm.com>
> >
> > Using an arch timer with a frequency of less than 1MHz can result in
> > incorrect functionality in parts of the system that assume a reasonable rate.
> >
> > One example is the use of activity monitors for frequency invariance
> > which uses the rate of the arch timer as the known rate of the constant
> > cycle counter in computing its ratio compared to the maximum frequency
> > of a CPU. For arch timer frequencies less than 1MHz this ratio could
> > end up being 0 which is an invalid value for its use.
> >
> > Therefore, warn if the arch timer rate is below 1MHz which contravenes
> > the recommended architecture interval of 1 to 50MHz.
> >
> > Signed-off-by: Ionela Voinescu <ionela.voinescu@arm.com>
> 
> So this patch is from Valentin. Where is his Signed-off-by?
> 

Yes, sorry about this. This was based on a diff that Valentin provided
in v2. I'll change the author as agreed at:
https://lore.kernel.org/lkml/20200212103249.GA19041@arm.com/

> >  
> > +static int validate_timer_rate(void)
> > +{
> > +	if (!arch_timer_rate)
> > +		return -EINVAL;
> > +
> > +	/* Arch timer frequency < 1MHz can cause trouble */
> > +	WARN_ON(arch_timer_rate < 1000000);
> 
> This does not make sense to me. If the rate is out of bounds then why
> warn and just continue instead of making it fail?
> 

Because it's not a hard restriction, it's just atypical for the rate to
be below 1MHz. The spec only mentions a typical range of 1 to 50MHz and
the warning is only here to flag a potentially problematic rate, below
what is assumed typical in the spec.

In [1], where I'm actually relying on arch_timer_rate being higher than
1/SCHED_CAPACITY_SCALE² of the maximum frequency, I am making it
fail, as, for that scenario, it is a hard restriction.


+	 * We use a factor of 2 * SCHED_CAPACITY_SHIFT -> SCHED_CAPACITY_SCALE²
+	 * in order to ensure a good resolution for arch_max_freq_scale for
+	 * very low arch timer frequencies (up to the KHz range which should be
+	 * unlikely).
+	 */
+	ratio = (u64)arch_timer_get_rate() << (2 * SCHED_CAPACITY_SHIFT);
+	ratio = div64_u64(ratio, max_freq_hz);
+	if (!ratio) {
+		pr_err("System timer frequency too low.\n");
+		return -EINVAL;
+	}
+

[1] https://lore.kernel.org/lkml/89339501-5ee4-e871-3076-c8b02c6fbf6e@arm.com/
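The resolution argument above can be checked with plain arithmetic.
This is a standalone userspace sketch of the quoted computation (the
3GHz maximum frequency is just an example value of mine):

```c
#include <stdint.h>

#define SCHED_CAPACITY_SHIFT 10	/* kernel value; SCHED_CAPACITY_SCALE = 1024 */

/* Mirrors the quoted diff: scale the constant-counter rate by
 * SCHED_CAPACITY_SCALE^2 (a shift by 2 * SCHED_CAPACITY_SHIFT) before
 * dividing by the CPU's maximum frequency. A result of 0 means the
 * timer is too slow for a usable arch_max_freq_scale. */
static uint64_t amu_freq_ratio(uint64_t timer_rate_hz, uint64_t max_freq_hz)
{
	return (timer_rate_hz << (2 * SCHED_CAPACITY_SHIFT)) / max_freq_hz;
}
```

For a 3GHz CPU, a 1MHz timer still yields a non-zero ratio, while a
timer in the low KHz range (below max_freq_hz / 2^20, about 2.9KHz
here) underflows to 0 and would trip the -EINVAL path.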

Thanks,
Ionela.

> Thanks,
> 
>         tglx


* Re: [PATCH v3 7/7] clocksource/drivers/arm_arch_timer: validate arch_timer_rate
  2020-02-14 15:45     ` Ionela Voinescu
@ 2020-02-14 15:57       ` Ionela Voinescu
  0 siblings, 0 replies; 42+ messages in thread
From: Ionela Voinescu @ 2020-02-14 15:57 UTC (permalink / raw)
  To: Thomas Gleixner
  Cc: catalin.marinas, will, mark.rutland, maz, suzuki.poulose,
	sudeep.holla, lukasz.luba, valentin.schneider, rjw, peterz,
	mingo, vincent.guittot, viresh.kumar, linux-arm-kernel,
	linux-doc, linux-kernel, linux-pm

On Friday 14 Feb 2020 at 15:45:25 (+0000), Ionela Voinescu wrote:
> Hi Thomas,
> 
> On Friday 14 Feb 2020 at 01:35:58 (+0100), Thomas Gleixner wrote:
> > Ionela Voinescu <ionela.voinescu@arm.com> writes:
> > 
> > > From: Valentin Schneider <valentin.schneider@arm.com>
> > >
> > > Using an arch timer with a frequency of less than 1MHz can result in
> > > incorrect functionality in parts of the system that assume a reasonable rate.
> > >
> > > One example is the use of activity monitors for frequency invariance
> > > which uses the rate of the arch timer as the known rate of the constant
> > > cycle counter in computing its ratio compared to the maximum frequency
> > > of a CPU. For arch timer frequencies less than 1MHz this ratio could
> > > end up being 0 which is an invalid value for its use.
> > >
> > > Therefore, warn if the arch timer rate is below 1MHz which contravenes
> > > the recommended architecture interval of 1 to 50MHz.
> > >
> > > Signed-off-by: Ionela Voinescu <ionela.voinescu@arm.com>
> > 
> > So this patch is from Valentin. Where is his Signed-off-by?
> > 
> 
> Yes, sorry about this. This was based on a diff that Valentin provided
> in v2. I'll change the author as agreed at:
> https://lore.kernel.org/lkml/20200212103249.GA19041@arm.com/
> 
> > >  
> > > +static int validate_timer_rate(void)
> > > +{
> > > +	if (!arch_timer_rate)
> > > +		return -EINVAL;
> > > +
> > > +	/* Arch timer frequency < 1MHz can cause trouble */
> > > +	WARN_ON(arch_timer_rate < 1000000);
> > 
> > This does not make sense to me. If the rate is out of bounds then why
> > warn and just continue instead of making it fail?
> > 
> 
> Because it's not a hard restriction; it's just atypical for the rate to
> be below 1 MHz. The spec only mentions a typical range of 1 to 50MHz and
> the warning is only here to flag a potentially problematic rate, below
> what is assumed typical in the spec.
> 
> In [1], where I'm actually relying on arch_timer_rate being higher than
> 1/SCHED_CAPACITY_SCALE² of the maximum frequency, I am making it
> fail, as, for that scenario, it is a hard restriction.
> 
> 
> +	 * We use a factor of 2 * SCHED_CAPACITY_SHIFT -> SCHED_CAPACITY_SCALE²
> +	 * in order to ensure a good resolution for arch_max_freq_scale for
> +	 * very low arch timer frequencies (up to the KHz range which should be
> +	 * unlikely).
> +	 */
> +	ratio = (u64)arch_timer_get_rate() << (2 * SCHED_CAPACITY_SHIFT);
> +	ratio = div64_u64(ratio, max_freq_hz);
> +	if (!ratio) {
> +		pr_err("System timer frequency too low.\n");
> +		return -EINVAL;
> +	}
> +
> 
> [1] https://lore.kernel.org/lkml/89339501-5ee4-e871-3076-c8b02c6fbf6e@arm.com/

I've mistakenly referenced a bad link ^

It was supposed to be:

[1] https://lore.kernel.org/lkml/20200211184542.29585-7-ionela.voinescu@arm.com/

Thanks,
Ionela.


* Re: [PATCH v3 6/7] arm64: use activity monitors for frequency invariance
  2020-02-11 18:45 ` [PATCH v3 6/7] arm64: use activity monitors for frequency invariance Ionela Voinescu
  2020-02-12 18:59   ` Lukasz Luba
@ 2020-02-17 16:59   ` Valentin Schneider
  2020-02-23 18:49     ` Ionela Voinescu
  1 sibling, 1 reply; 42+ messages in thread
From: Valentin Schneider @ 2020-02-17 16:59 UTC (permalink / raw)
  To: Ionela Voinescu, catalin.marinas, will, mark.rutland, maz,
	suzuki.poulose, sudeep.holla, lukasz.luba, rjw
  Cc: peterz, mingo, vincent.guittot, viresh.kumar, linux-arm-kernel,
	linux-doc, linux-kernel, linux-pm

Hi Ionela,

Overall I think this approach is much better, and apart from a few nits below
this is looking pretty good.

On 2/11/20 6:45 PM, Ionela Voinescu wrote:
> @@ -120,4 +121,188 @@ int __init parse_acpi_topology(void)
>  }
>  #endif
>  
> +#ifdef CONFIG_ARM64_AMU_EXTN
>  
> +#undef pr_fmt
> +#define pr_fmt(fmt) "AMU: " fmt
> +
> +static DEFINE_PER_CPU_READ_MOSTLY(unsigned long, arch_max_freq_scale);
> +static DEFINE_PER_CPU(u64, arch_const_cycles_prev);
> +static DEFINE_PER_CPU(u64, arch_core_cycles_prev);
> +static cpumask_var_t amu_fie_cpus;
> +
> +/* Obtain max frequency (in KHz) as reported by hardware */
> +__weak unsigned int cpu_get_max_freq(unsigned int cpu)
> +{
> +	return 0;
> +}
> +
> +#ifdef CONFIG_CPU_FREQ
> +/* Replace max frequency getter with cpufreq based function */
> +#define cpu_get_max_freq cpufreq_get_hw_max_freq
> +#endif
> +
> +/* Initialize counter reference per-cpu variables for the current CPU */
> +void init_cpu_freq_invariance_counters(void)
> +{
> +	this_cpu_write(arch_core_cycles_prev,
> +		       read_sysreg_s(SYS_AMEVCNTR0_CORE_EL0));
> +	this_cpu_write(arch_const_cycles_prev,
> +		       read_sysreg_s(SYS_AMEVCNTR0_CONST_EL0));
> +}
> +
> +static int validate_cpu_freq_invariance_counters(int cpu)
> +{
> +	u64 max_freq_hz, ratio;
> +
> +	if (!cpu_has_amu_feat(cpu)) {
> +		pr_debug("CPU%d: counters are not supported.\n", cpu);
> +		return -EINVAL;
> +	}
> +
> +	if (unlikely(!per_cpu(arch_const_cycles_prev, cpu) ||
> +		     !per_cpu(arch_core_cycles_prev, cpu))) {
> +		pr_debug("CPU%d: cycle counters are not enabled.\n", cpu);
> +		return -EINVAL;
> +	}
> +
> +	/* Convert maximum frequency from KHz to Hz and validate */
> +	max_freq_hz = cpu_get_max_freq(cpu) * 1000;
> +	if (unlikely(!max_freq_hz)) {
> +		pr_debug("CPU%d: invalid maximum frequency.\n", cpu);
> +		return -EINVAL;
> +	}
> +
> +	/*
> +	 * Pre-compute the fixed ratio between the frequency of the constant
> +	 * counter and the maximum frequency of the CPU.
> +	 *
> +	 *			      const_freq
> +	 * arch_max_freq_scale =   ---------------- * SCHED_CAPACITY_SCALE²
> +	 *			   cpuinfo_max_freq
> +	 *
> +	 * We use a factor of 2 * SCHED_CAPACITY_SHIFT -> SCHED_CAPACITY_SCALE²
> +	 * in order to ensure a good resolution for arch_max_freq_scale for
> +	 * very low arch timer frequencies (up to the KHz range which should be
                                            ^^^^^
<pedantic hat on>: s/up to/down to/

> +	 * unlikely).
> +	 */
> +	ratio = (u64)arch_timer_get_rate() << (2 * SCHED_CAPACITY_SHIFT);
> +	ratio = div64_u64(ratio, max_freq_hz);
> +	if (!ratio) {
> +		pr_err("System timer frequency too low.\n");

Should that be a WARN_ONCE() instead? If the arch timer freq is too low,
we'll end up spamming this message, since we go through this for all CPUs.


> +		return -EINVAL;
> +	}
> +
> +	per_cpu(arch_max_freq_scale, cpu) = (unsigned long)ratio;
> +

It occurred to me that this isn't strictly speaking a per-CPU information as
it only depends on the max possible frequency. Not really worth bothering
about though, I think.

> +	return 0;
> +}
> +
> +static inline int
> +enable_policy_freq_counters(int cpu, cpumask_var_t valid_cpus)
> +{
> +	struct cpufreq_policy *policy = cpufreq_cpu_get(cpu);
> +
> +	if (!policy) {
> +		pr_debug("CPU%d: No cpufreq policy found.\n", cpu);
> +		return false;
> +	}
> +
> +	if (cpumask_subset(policy->related_cpus, valid_cpus)) {
> +		cpumask_or(amu_fie_cpus, policy->related_cpus,
> +			   amu_fie_cpus);
> +		pr_info("CPUs[%*pbl]: counters will be used for FIE.",
> +			cpumask_pr_args(amu_fie_cpus));

Could we have a single print of all CPUs in one go? AIUI this will generate a
line per cpufreq policy. Maybe just something at the tail of init_amu_fie():

if (!cpumask_empty(amu_fie_cpus))
	pr_info(<blah>);

> +	}
> +
> +	cpufreq_cpu_put(policy);
> +
> +	return true;
> +}
> +
> +static int __init init_amu_fie(void)
> +{
> +	cpumask_var_t valid_cpus;
> +	bool have_policy = false;
> +	int cpu;
> +
> +	if (!zalloc_cpumask_var(&valid_cpus, GFP_KERNEL) ||
> +	    !zalloc_cpumask_var(&amu_fie_cpus, GFP_KERNEL))
> +		return -ENOMEM;
> +
> +	for_each_possible_cpu(cpu) {
> +		if (validate_cpu_freq_invariance_counters(cpu))
> +			continue;
> +		cpumask_set_cpu(cpu, valid_cpus);
> +		have_policy = enable_policy_freq_counters(cpu, valid_cpus) ||
> +			      have_policy;

What about:
		have_policy |= enable_policy_freq_counters(cpu, valid_cpus);

> +	}
> +
> +	if (!have_policy) {
> +		/*
> +		 * If we are not restricted by cpufreq policies, we only enable
> +		 * the use of the AMU feature for FIE if all CPUs support AMU.
> +		 * Otherwise, enable_policy_freq_counters has already enabled
> +		 * policy cpus.
> +		 */
> +		if (cpumask_equal(valid_cpus, cpu_possible_mask)) {

Mmm so I'm thinking what we want here is the cpu_present_mask rather than
the possible one. This is very corner-casy, but I think that if we fail to
boot a secondary, we'll have it possible but not present.

While at it you could make the loop only target present CPUs, but I think the
one bit that matters is this check right here (!present should fail at
validate_cpu_freq_invariance_counters()).

> +			cpumask_or(amu_fie_cpus, amu_fie_cpus, valid_cpus);
> +			pr_info("CPUs[%*pbl]: counters will be used for FIE.",
> +				cpumask_pr_args(amu_fie_cpus));
> +		}
> +	}
> +
> +	free_cpumask_var(valid_cpus);
> +
> +	return 0;
> +}
> +late_initcall_sync(init_amu_fie);
> +
> +bool topology_cpu_freq_counters(struct cpumask *cpus)
> +{
> +	return cpumask_available(amu_fie_cpus) &&
> +	       cpumask_subset(cpus, amu_fie_cpus);
> +}
> +
> +void topology_scale_freq_tick(void)
> +{
> +	u64 prev_core_cnt, prev_const_cnt;
> +	u64 core_cnt, const_cnt, scale;
> +	int cpu = smp_processor_id();
> +
> +	if (!cpumask_available(amu_fie_cpus) ||
> +	    !cpumask_test_cpu(cpu, amu_fie_cpus))
> +		return;

It might be a good idea to have a static key to gate our entry into this
function - that way we can lessen our impact on older platforms (without AMUs)
running a recent kernel with CONFIG_ARM64_AMU_EXTN=y.

x86 does just that, if you look at their arch_scale_freq_tick()
implementation. FWIW I don't think we should bother with playing with the
key counter to count AMU-enabled CPUs, just enable it at startup if we have
> 1 such CPU and let the cpumask drive the rest.

In your check here, the static key check could replace the cpumask_available()
check. The static key could also be used for topology_cpu_freq_counters().

> +
> +	const_cnt = read_sysreg_s(SYS_AMEVCNTR0_CONST_EL0);
> +	core_cnt = read_sysreg_s(SYS_AMEVCNTR0_CORE_EL0);
> +	prev_const_cnt = this_cpu_read(arch_const_cycles_prev);
> +	prev_core_cnt = this_cpu_read(arch_core_cycles_prev);
> +
> +	if (unlikely(core_cnt <= prev_core_cnt ||
> +		     const_cnt <= prev_const_cnt))
> +		goto store_and_exit;
> +
> +	/*
> +	 *	    /\core    arch_max_freq_scale
> +	 * scale =  ------- * --------------------
> +	 *	    /\const   SCHED_CAPACITY_SCALE
> +	 *
> +	 * We shift by SCHED_CAPACITY_SHIFT (divide by SCHED_CAPACITY_SCALE)
> +	 * in order to compensate for the SCHED_CAPACITY_SCALE² factor in
> +	 * arch_max_freq_scale (used to ensure its resolution) while keeping
> +	 * the scale value in the 0-SCHED_CAPACITY_SCALE capacity range.
> +	 */

A simple "See validate_cpu_freq_invariance_counters() for details on the
scale factor" would suffice wrt the shifting details.

> +	scale = core_cnt - prev_core_cnt;
> +	scale *= this_cpu_read(arch_max_freq_scale);
> +	scale = div64_u64(scale >> SCHED_CAPACITY_SHIFT,
> +			  const_cnt - prev_const_cnt);
> +
> +	scale = min_t(unsigned long, scale, SCHED_CAPACITY_SCALE);
> +	this_cpu_write(freq_scale, (unsigned long)scale);
> +
> +store_and_exit:
> +	this_cpu_write(arch_core_cycles_prev, core_cnt);
> +	this_cpu_write(arch_const_cycles_prev, const_cnt);
> +}
> +#endif /* CONFIG_ARM64_AMU_EXTN */
> diff --git a/drivers/base/arch_topology.c b/drivers/base/arch_topology.c
> index 1eb81f113786..1ab2b7503d63 100644
> --- a/drivers/base/arch_topology.c
> +++ b/drivers/base/arch_topology.c
> @@ -29,6 +29,14 @@ void arch_set_freq_scale(struct cpumask *cpus, unsigned long cur_freq,
>  	unsigned long scale;
>  	int i;
>  
> +	/*
> +	 * If the use of counters for FIE is enabled, just return as we don't
> +	 * want to update the scale factor with information from CPUFREQ.
> +	 * Instead the scale factor will be updated from arch_scale_freq_tick.
> +	 */
> +	if (arch_cpu_freq_counters(cpus))
> +		return;
> +
>  	scale = (cur_freq << SCHED_CAPACITY_SHIFT) / max_freq;
>  
>  	for_each_cpu(i, cpus)
> diff --git a/include/linux/topology.h b/include/linux/topology.h
> index eb2fe6edd73c..397aad6ae163 100644
> --- a/include/linux/topology.h
> +++ b/include/linux/topology.h
> @@ -227,5 +227,12 @@ static inline const struct cpumask *cpu_cpu_mask(int cpu)
>  	return cpumask_of_node(cpu_to_node(cpu));
>  }
>  
> +#ifndef arch_cpu_freq_counters
> +static __always_inline
> +bool arch_cpu_freq_counters(struct cpumask *cpus)
> +{
> +	return false;
> +}
> +#endif
>  
>  #endif /* _LINUX_TOPOLOGY_H */
> 


* Re: [PATCH v3 6/7] arm64: use activity monitors for frequency invariance
  2020-02-17 16:59   ` Valentin Schneider
@ 2020-02-23 18:49     ` Ionela Voinescu
  0 siblings, 0 replies; 42+ messages in thread
From: Ionela Voinescu @ 2020-02-23 18:49 UTC (permalink / raw)
  To: Valentin Schneider
  Cc: catalin.marinas, will, mark.rutland, maz, suzuki.poulose,
	sudeep.holla, lukasz.luba, rjw, peterz, mingo, vincent.guittot,
	viresh.kumar, linux-arm-kernel, linux-doc, linux-kernel,
	linux-pm

Hi Valentin,

Sorry for the delay in my reply and thank you very much for the review!

I will push v4 very soon with these changes.

On Monday 17 Feb 2020 at 16:59:24 (+0000), Valentin Schneider wrote:
> > +	 * Pre-compute the fixed ratio between the frequency of the constant
> > +	 * counter and the maximum frequency of the CPU.
> > +	 *
> > +	 *			      const_freq
> > +	 * arch_max_freq_scale =   ---------------- * SCHED_CAPACITY_SCALE²
> > +	 *			   cpuinfo_max_freq
> > +	 *
> > +	 * We use a factor of 2 * SCHED_CAPACITY_SHIFT -> SCHED_CAPACITY_SCALE²
> > +	 * in order to ensure a good resolution for arch_max_freq_scale for
> > +	 * very low arch timer frequencies (up to the KHz range which should be
>                                             ^^^^^
> <pedantic hat on>: s/up to/down to/

Done!

> > +	 * unlikely).
> > +	 */
> > +	ratio = (u64)arch_timer_get_rate() << (2 * SCHED_CAPACITY_SHIFT);
> > +	ratio = div64_u64(ratio, max_freq_hz);
> > +	if (!ratio) {
> > +		pr_err("System timer frequency too low.\n");
> 
> Should that be a WARN_ONCE() instead? If the arch timer freq is too low,
> we'll end up spamming this message, since we go through this for all CPUs.

Done!

> > +		return -EINVAL;
> > +	}
> > +
> > +	per_cpu(arch_max_freq_scale, cpu) = (unsigned long)ratio;
> > +
> 
> It occurred to me that this isn't strictly speaking a per-CPU information as
> it only depends on the max possible frequency. Not really worth bothering
> about though, I think.
> 

Yes, it depends on the max possible frequency of all CPUs in a frequency
domain. But I wanted to put this factor in a per-cpu variable in order
to be able to retrieve it faster in topology_scale_freq_tick, rather
than having to consider policies and related CPUs in that function.

> > +	return 0;
> > +}
> > +
> > +static inline int
> > +enable_policy_freq_counters(int cpu, cpumask_var_t valid_cpus)
> > +{
> > +	struct cpufreq_policy *policy = cpufreq_cpu_get(cpu);
> > +
> > +	if (!policy) {
> > +		pr_debug("CPU%d: No cpufreq policy found.\n", cpu);
> > +		return false;
> > +	}
> > +
> > +	if (cpumask_subset(policy->related_cpus, valid_cpus)) {
> > +		cpumask_or(amu_fie_cpus, policy->related_cpus,
> > +			   amu_fie_cpus);
> > +		pr_info("CPUs[%*pbl]: counters will be used for FIE.",
> > +			cpumask_pr_args(amu_fie_cpus));
> 
> Could we have a single print of all CPUs in one go? AIUI this will generate a
> line per cpufreq policy. Maybe just something at the tail of init_amu_fie():
> 
> if (!cpumask_empty(amu_fie_cpus))
> 	pr_info(<blah>);
> 

Done. I've used this location as well to set the static key that you've
suggested below.

> > +	}
> > +
> > +	cpufreq_cpu_put(policy);
> > +
> > +	return true;
> > +}
> > +
> > +static int __init init_amu_fie(void)
> > +{
> > +	cpumask_var_t valid_cpus;
> > +	bool have_policy = false;
> > +	int cpu;
> > +
> > +	if (!zalloc_cpumask_var(&valid_cpus, GFP_KERNEL) ||
> > +	    !zalloc_cpumask_var(&amu_fie_cpus, GFP_KERNEL))
> > +		return -ENOMEM;
> > +
> > +	for_each_possible_cpu(cpu) {
> > +		if (validate_cpu_freq_invariance_counters(cpu))
> > +			continue;
> > +		cpumask_set_cpu(cpu, valid_cpus);
> > +		have_policy = enable_policy_freq_counters(cpu, valid_cpus) ||
> > +			      have_policy;
> 
> What about:
> 		have_policy |= enable_policy_freq_counters(cpu, valid_cpus);
> 

Done as well.

> > +	}
> > +
> > +	if (!have_policy) {
> > +		/*
> > +		 * If we are not restricted by cpufreq policies, we only enable
> > +		 * the use of the AMU feature for FIE if all CPUs support AMU.
> > +		 * Otherwise, enable_policy_freq_counters has already enabled
> > +		 * policy cpus.
> > +		 */
> > +		if (cpumask_equal(valid_cpus, cpu_possible_mask)) {
> 
> Mmm so I'm thinking what we want here is the cpu_present_mask rather than
> the possible one. This is very corner-casy, but I think that if we fail to
> boot a secondary, we'll have it possible but not present.
>

Yes, this is correct. It does depend on the stage at which it fails: for
example, if some feature checks fail, a CPU will not be marked in
cpu_present_mask (see cpu_die_early()), while the failure below results
in possible == present.

---
[    0.056524] EFI services will not be available.
[    0.065690] smp: Bringing up secondary CPUs ...
[    0.098010] psci: failed to boot CPU1 (-22)
[    0.098037] CPU1: failed to boot: -22
[    0.130290] psci: failed to boot CPU2 (-22)
[    0.130315] CPU2: failed to boot: -22
[    0.162568] psci: failed to boot CPU3 (-22)
[    0.162594] CPU3: failed to boot: -22
[    0.194890] Detected PIPT I-cache on CPU4
[    0.194990] GICv3: CPU4: found redistributor 100 region
0:0x000000002f120000
[    0.195046] GICv3: CPU4: using allocated LPI pending table
@0x00000000fc0d0000
[    0.195133] CPU4: Booted secondary processor 0x0000000100
[0x410fd0f0]
[    0.227190] psci: failed to boot CPU5 (-22)
[    0.227412] CPU5: failed to boot: -22
[    0.259431] psci: failed to boot CPU6 (-22)
[    0.259522] CPU6: failed to boot: -22
[    0.291683] psci: failed to boot CPU7 (-22)
[    0.291709] CPU7: failed to boot: -22
[    0.291990] smp: Brought up 1 node, 2 CPUs
[..]
root@buildroot:~# cat present
0-7
root@buildroot:~# cat possible
0-7

This failure happens while the CPU is being brought up (__cpu_up).
I'm not sure if this should result in set_cpu_present(cpu, 0) as well.
But it's unrelated to this..

In any case, your suggestion is valid and cpu_present_mask is better to
be used here.


> While at it you could make the loop only target present CPUs, but I think the
> one bit that matters is this check right here (!present should fail at
> validate_cpu_freq_invariance_counters()).
> 

Will change the loop as well. Thanks!

> > +			cpumask_or(amu_fie_cpus, amu_fie_cpus, valid_cpus);
> > +			pr_info("CPUs[%*pbl]: counters will be used for FIE.",
> > +				cpumask_pr_args(amu_fie_cpus));
> > +		}
> > +	}
> > +
> > +	free_cpumask_var(valid_cpus);
> > +
> > +	return 0;
> > +}
> > +late_initcall_sync(init_amu_fie);
> > +
> > +bool topology_cpu_freq_counters(struct cpumask *cpus)
> > +{
> > +	return cpumask_available(amu_fie_cpus) &&
> > +	       cpumask_subset(cpus, amu_fie_cpus);
> > +}
> > +
> > +void topology_scale_freq_tick(void)
> > +{
> > +	u64 prev_core_cnt, prev_const_cnt;
> > +	u64 core_cnt, const_cnt, scale;
> > +	int cpu = smp_processor_id();
> > +
> > +	if (!cpumask_available(amu_fie_cpus) ||
> > +	    !cpumask_test_cpu(cpu, amu_fie_cpus))
> > +		return;
> 
> It might be a good idea to have a static key to gate our entry into this
> function - that way we can lessen our impact on older platforms (without AMUs)
> running a recent kernel with CONFIG_ARM64_AMU_EXTN=y.
> 
> x86 does just that, if you look at their arch_scale_freq_tick()
> implementation. FWIW I don't think we should bother with playing with the
> key counter to count AMU-enabled CPUs, just enable it at startup if we have
> > 1 such CPU and let the cpumask drive the rest.
> 
> In your check here, the static key check could replace the cpumask_available()
> check. The static key could also be used for topology_cpu_freq_counters().
>

Very good idea! Done as well. Yes, the counter (number of AMU enabled
CPUs) would not be of much help for the moment.

> > +
> > +	const_cnt = read_sysreg_s(SYS_AMEVCNTR0_CONST_EL0);
> > +	core_cnt = read_sysreg_s(SYS_AMEVCNTR0_CORE_EL0);
> > +	prev_const_cnt = this_cpu_read(arch_const_cycles_prev);
> > +	prev_core_cnt = this_cpu_read(arch_core_cycles_prev);
> > +
> > +	if (unlikely(core_cnt <= prev_core_cnt ||
> > +		     const_cnt <= prev_const_cnt))
> > +		goto store_and_exit;
> > +
> > +	/*
> > +	 *	    /\core    arch_max_freq_scale
> > +	 * scale =  ------- * --------------------
> > +	 *	    /\const   SCHED_CAPACITY_SCALE
> > +	 *
> > +	 * We shift by SCHED_CAPACITY_SHIFT (divide by SCHED_CAPACITY_SCALE)
> > +	 * in order to compensate for the SCHED_CAPACITY_SCALE² factor in
> > +	 * arch_max_freq_scale (used to ensure its resolution) while keeping
> > +	 * the scale value in the 0-SCHED_CAPACITY_SCALE capacity range.
> > +	 */
> 
> A simple "See validate_cpu_freq_invariance_counters() for details on the
> scale factor" would suffice wrt the shifting details.
>

Done!

Thank you,
Ionela.

> > +	scale = core_cnt - prev_core_cnt;
> > +	scale *= this_cpu_read(arch_max_freq_scale);
> > +	scale = div64_u64(scale >> SCHED_CAPACITY_SHIFT,
> > +			  const_cnt - prev_const_cnt);
> > +
> > +	scale = min_t(unsigned long, scale, SCHED_CAPACITY_SCALE);
> > +	this_cpu_write(freq_scale, (unsigned long)scale);
> > +
> > +store_and_exit:
> > +	this_cpu_write(arch_core_cycles_prev, core_cnt);
> > +	this_cpu_write(arch_const_cycles_prev, const_cnt);
> > +}
> > +#endif /* CONFIG_ARM64_AMU_EXTN */
> > diff --git a/drivers/base/arch_topology.c b/drivers/base/arch_topology.c
> > index 1eb81f113786..1ab2b7503d63 100644
> > --- a/drivers/base/arch_topology.c
> > +++ b/drivers/base/arch_topology.c
> > @@ -29,6 +29,14 @@ void arch_set_freq_scale(struct cpumask *cpus, unsigned long cur_freq,
> >  	unsigned long scale;
> >  	int i;
> >  
> > +	/*
> > +	 * If the use of counters for FIE is enabled, just return as we don't
> > +	 * want to update the scale factor with information from CPUFREQ.
> > +	 * Instead the scale factor will be updated from arch_scale_freq_tick.
> > +	 */
> > +	if (arch_cpu_freq_counters(cpus))
> > +		return;
> > +
> >  	scale = (cur_freq << SCHED_CAPACITY_SHIFT) / max_freq;
> >  
> >  	for_each_cpu(i, cpus)
> > diff --git a/include/linux/topology.h b/include/linux/topology.h
> > index eb2fe6edd73c..397aad6ae163 100644
> > --- a/include/linux/topology.h
> > +++ b/include/linux/topology.h
> > @@ -227,5 +227,12 @@ static inline const struct cpumask *cpu_cpu_mask(int cpu)
> >  	return cpumask_of_node(cpu_to_node(cpu));
> >  }
> >  
> > +#ifndef arch_cpu_freq_counters
> > +static __always_inline
> > +bool arch_cpu_freq_counters(struct cpumask *cpus)
> > +{
> > +	return false;
> > +}
> > +#endif
> >  
> >  #endif /* _LINUX_TOPOLOGY_H */
> > 



Thread overview: 42+ messages
2020-02-11 18:45 [PATCH v3 0/7] arm64: ARMv8.4 Activity Monitors support Ionela Voinescu
2020-02-11 18:45 ` [PATCH v3 1/7] arm64: add support for the AMU extension v1 Ionela Voinescu
2020-02-12 11:30   ` Suzuki Kuruppassery Poulose
2020-02-12 14:54     ` Valentin Schneider
2020-02-12 16:10     ` Ionela Voinescu
2020-02-12 16:20       ` Suzuki Kuruppassery Poulose
2020-02-12 18:20         ` Ionela Voinescu
2020-02-12 19:24           ` Suzuki K Poulose
2020-02-12 20:19         ` Ionela Voinescu
2020-02-12 16:24       ` Vladimir Murzin
2020-02-12 18:27         ` Ionela Voinescu
2020-02-11 18:45 ` [PATCH v3 2/7] arm64: trap to EL1 accesses to AMU counters from EL0 Ionela Voinescu
2020-02-12 11:44   ` Suzuki Kuruppassery Poulose
2020-02-12 15:36   ` Valentin Schneider
2020-02-11 18:45 ` [PATCH v3 3/7] arm64/kvm: disable access to AMU registers from kvm guests Ionela Voinescu
2020-02-12 15:36   ` Valentin Schneider
2020-02-12 16:33   ` Suzuki Kuruppassery Poulose
2020-02-11 18:45 ` [PATCH v3 4/7] Documentation: arm64: document support for the AMU extension Ionela Voinescu
2020-02-12 15:36   ` Valentin Schneider
2020-02-11 18:45 ` [PATCH v3 5/7] cpufreq: add function to get the hardware max frequency Ionela Voinescu
2020-02-12  4:14   ` Viresh Kumar
2020-02-13 11:59   ` Valentin Schneider
2020-02-13 12:59     ` Ionela Voinescu
2020-02-13 15:22       ` Valentin Schneider
2020-02-11 18:45 ` [PATCH v3 6/7] arm64: use activity monitors for frequency invariance Ionela Voinescu
2020-02-12 18:59   ` Lukasz Luba
2020-02-13  9:47     ` Ionela Voinescu
2020-02-17 16:59   ` Valentin Schneider
2020-02-23 18:49     ` Ionela Voinescu
2020-02-11 18:45 ` [PATCH v3 7/7] clocksource/drivers/arm_arch_timer: validate arch_timer_rate Ionela Voinescu
2020-02-12  9:30   ` Valentin Schneider
2020-02-12 10:32     ` Ionela Voinescu
2020-02-12 10:01   ` Lukasz Luba
2020-02-12 10:12     ` Marc Zyngier
2020-02-12 10:54       ` Ionela Voinescu
2020-02-12 10:55       ` Lukasz Luba
2020-02-12 11:10         ` Marc Zyngier
2020-02-12 11:43           ` Lukasz Luba
2020-02-12 11:12         ` Valentin Schneider
2020-02-14  0:35   ` Thomas Gleixner
2020-02-14 15:45     ` Ionela Voinescu
2020-02-14 15:57       ` Ionela Voinescu
