linuxppc-dev.lists.ozlabs.org archive mirror
 help / color / mirror / Atom feed
* [PATCH v2 0/6] powernv:cpufreq: Dynamic cpu-frequency scaling
@ 2014-03-10 11:10 Gautham R. Shenoy
  2014-03-10 11:10 ` [PATCH v2 1/6] powernv: cpufreq driver for powernv platform Gautham R. Shenoy
                   ` (5 more replies)
  0 siblings, 6 replies; 13+ messages in thread
From: Gautham R. Shenoy @ 2014-03-10 11:10 UTC (permalink / raw)
  To: linuxppc-dev; +Cc: srivatsa.bhat, Gautham R. Shenoy

From: "Gautham R. Shenoy" <ego@linux.vnet.ibm.com>

Hi,

This is the v2 of the consolidated patchset consisting
patches for enabling cpufreq on IBM POWERNV platforms
along with some enhancements. 

The v1 of these patches have been previously 
submitted on linuxppc-dev [1][2].

- This patchset contains code for the platform driver to support CPU
  frequency scaling on IBM POWERNV platforms.

- In addition to the standard control and status files exposed by the
  cpufreq core, the patchset exposes the nominal frequency through the
  file named "cpuinfo_nominal_freq".

The patchset is based against commit c3bebc71c4bcdafa24b506adf0c1de3c1f77e2e0
of the mainline tree.

[1]: https://lists.ozlabs.org/pipermail/linuxppc-dev/2014-February/115244.html
[2]: https://lists.ozlabs.org/pipermail/linuxppc-dev/2014-March/115703.html

Gautham R. Shenoy (3):
  powernv:cpufreq: Create pstate_id_to_freq() helper
  powernv:cpufreq: Export nominal frequency via sysfs.
  powernv:cpufreq: Implement the driver->get() method

Srivatsa S. Bhat (2):
  powernv:cpufreq: Create a powernv_cpu_to_core_mask() helper.
  powernv,cpufreq:Add per-core locking to serialize frequency
    transitions

Vaidyanathan Srinivasan (1):
  powernv: cpufreq driver for powernv platform

 arch/powerpc/include/asm/reg.h         |   4 +
 arch/powerpc/platforms/powernv/Kconfig |   1 +
 drivers/cpufreq/Kconfig                |   1 +
 drivers/cpufreq/Kconfig.powerpc        |  13 ++
 drivers/cpufreq/Makefile               |   1 +
 drivers/cpufreq/powernv-cpufreq.c      | 397 +++++++++++++++++++++++++++++++++
 6 files changed, 417 insertions(+)
 create mode 100644 drivers/cpufreq/powernv-cpufreq.c

-- 
1.8.3.1

^ permalink raw reply	[flat|nested] 13+ messages in thread

* [PATCH v2 1/6] powernv: cpufreq driver for powernv platform
  2014-03-10 11:10 [PATCH v2 0/6] powernv:cpufreq: Dynamic cpu-frequency scaling Gautham R. Shenoy
@ 2014-03-10 11:10 ` Gautham R. Shenoy
  2014-03-10 11:10 ` [PATCH v2 2/6] powernv:cpufreq: Create a powernv_cpu_to_core_mask() helper Gautham R. Shenoy
                   ` (4 subsequent siblings)
  5 siblings, 0 replies; 13+ messages in thread
From: Gautham R. Shenoy @ 2014-03-10 11:10 UTC (permalink / raw)
  To: linuxppc-dev; +Cc: Anton Blanchard, srivatsa.bhat, Gautham R. Shenoy

From: Vaidyanathan Srinivasan <svaidy@linux.vnet.ibm.com>

Backend driver to dynamically set voltage and frequency on
IBM POWER non-virtualized platforms.  Power management SPRs
are used to set the required PState.

This driver works in conjunction with cpufreq governors
like 'ondemand' to provide a demand based frequency and
voltage setting on IBM POWER non-virtualized platforms.

PState table is obtained from OPAL v3 firmware through device
tree.

powernv_cpufreq back-end driver would parse the relevant device-tree
nodes and initialise the cpufreq subsystem on powernv platform.

Signed-off-by: Vaidyanathan Srinivasan <svaidy@linux.vnet.ibm.com>
Signed-off-by: Srivatsa S. Bhat <srivatsa.bhat@linux.vnet.ibm.com>
Signed-off-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Gautham R. Shenoy <ego@linux.vnet.ibm.com>
---
 arch/powerpc/include/asm/reg.h         |   4 +
 arch/powerpc/platforms/powernv/Kconfig |   1 +
 drivers/cpufreq/Kconfig                |   1 +
 drivers/cpufreq/Kconfig.powerpc        |  13 ++
 drivers/cpufreq/Makefile               |   1 +
 drivers/cpufreq/powernv-cpufreq.c      | 277 +++++++++++++++++++++++++++++++++
 6 files changed, 297 insertions(+)
 create mode 100644 drivers/cpufreq/powernv-cpufreq.c

diff --git a/arch/powerpc/include/asm/reg.h b/arch/powerpc/include/asm/reg.h
index 90c06ec..84f92ca 100644
--- a/arch/powerpc/include/asm/reg.h
+++ b/arch/powerpc/include/asm/reg.h
@@ -271,6 +271,10 @@
 #define SPRN_HSRR1	0x13B	/* Hypervisor Save/Restore 1 */
 #define SPRN_IC		0x350	/* Virtual Instruction Count */
 #define SPRN_VTB	0x351	/* Virtual Time Base */
+#define SPRN_PMICR	0x354   /* Power Management Idle Control Reg */
+#define SPRN_PMSR	0x355   /* Power Management Status Reg */
+#define SPRN_PMCR	0x374	/* Power Management Control Register */
+
 /* HFSCR and FSCR bit numbers are the same */
 #define FSCR_TAR_LG	8	/* Enable Target Address Register */
 #define FSCR_EBB_LG	7	/* Enable Event Based Branching */
diff --git a/arch/powerpc/platforms/powernv/Kconfig b/arch/powerpc/platforms/powernv/Kconfig
index 895e8a2..1fe12b1 100644
--- a/arch/powerpc/platforms/powernv/Kconfig
+++ b/arch/powerpc/platforms/powernv/Kconfig
@@ -11,6 +11,7 @@ config PPC_POWERNV
 	select PPC_UDBG_16550
 	select PPC_SCOM
 	select ARCH_RANDOM
+	select CPU_FREQ
 	default y
 
 config PPC_POWERNV_RTAS
diff --git a/drivers/cpufreq/Kconfig b/drivers/cpufreq/Kconfig
index 4b029c0..4ba1632 100644
--- a/drivers/cpufreq/Kconfig
+++ b/drivers/cpufreq/Kconfig
@@ -48,6 +48,7 @@ config CPU_FREQ_STAT_DETAILS
 choice
 	prompt "Default CPUFreq governor"
 	default CPU_FREQ_DEFAULT_GOV_USERSPACE if ARM_SA1100_CPUFREQ || ARM_SA1110_CPUFREQ
+	default CPU_FREQ_DEFAULT_GOV_ONDEMAND if POWERNV_CPUFREQ
 	default CPU_FREQ_DEFAULT_GOV_PERFORMANCE
 	help
 	  This option sets which CPUFreq governor shall be loaded at
diff --git a/drivers/cpufreq/Kconfig.powerpc b/drivers/cpufreq/Kconfig.powerpc
index ca0021a..93f8689 100644
--- a/drivers/cpufreq/Kconfig.powerpc
+++ b/drivers/cpufreq/Kconfig.powerpc
@@ -54,3 +54,16 @@ config PPC_PASEMI_CPUFREQ
 	help
 	  This adds the support for frequency switching on PA Semi
 	  PWRficient processors.
+
+config POWERNV_CPUFREQ
+       tristate "CPU frequency scaling for IBM POWERNV platform"
+       depends on PPC_POWERNV
+       select CPU_FREQ_GOV_PERFORMANCE
+       select CPU_FREQ_GOV_POWERSAVE
+       select CPU_FREQ_GOV_USERSPACE
+       select CPU_FREQ_GOV_ONDEMAND
+       select CPU_FREQ_GOV_CONSERVATIVE
+       default y
+       help
+	 This adds support for CPU frequency switching on IBM POWERNV
+	 platform
diff --git a/drivers/cpufreq/Makefile b/drivers/cpufreq/Makefile
index 7494565..0dbb963 100644
--- a/drivers/cpufreq/Makefile
+++ b/drivers/cpufreq/Makefile
@@ -86,6 +86,7 @@ obj-$(CONFIG_PPC_CORENET_CPUFREQ)   += ppc-corenet-cpufreq.o
 obj-$(CONFIG_CPU_FREQ_PMAC)		+= pmac32-cpufreq.o
 obj-$(CONFIG_CPU_FREQ_PMAC64)		+= pmac64-cpufreq.o
 obj-$(CONFIG_PPC_PASEMI_CPUFREQ)	+= pasemi-cpufreq.o
+obj-$(CONFIG_POWERNV_CPUFREQ)		+= powernv-cpufreq.o
 
 ##################################################################################
 # Other platform drivers
diff --git a/drivers/cpufreq/powernv-cpufreq.c b/drivers/cpufreq/powernv-cpufreq.c
new file mode 100644
index 0000000..ab1551f
--- /dev/null
+++ b/drivers/cpufreq/powernv-cpufreq.c
@@ -0,0 +1,277 @@
+/*
+ * POWERNV cpufreq driver for the IBM POWER processors
+ *
+ * (C) Copyright IBM 2014
+ *
+ * Author: Vaidyanathan Srinivasan <svaidy at linux.vnet.ibm.com>
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License as published by
+ * the Free Software Foundation; either version 2, or (at your option)
+ * any later version.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ *
+ */
+
+#define pr_fmt(fmt)	"powernv-cpufreq: " fmt
+
+#include <linux/module.h>
+#include <linux/cpufreq.h>
+#include <linux/of.h>
+#include <asm/cputhreads.h>
+
+/* FIXME: Make this per-core */
+static DEFINE_MUTEX(freq_switch_mutex);
+
+#define POWERNV_MAX_PSTATES	256
+
+static struct cpufreq_frequency_table powernv_freqs[POWERNV_MAX_PSTATES+1];
+static int powernv_pstate_ids[POWERNV_MAX_PSTATES+1];
+
+/*
+ * Initialize the freq table based on data obtained
+ * from the firmware passed via device-tree
+ */
+
+static int init_powernv_pstates(void)
+{
+	struct device_node *power_mgt;
+	int nr_pstates = 0;
+	int pstate_min, pstate_max, pstate_nominal;
+	const __be32 *pstate_ids, *pstate_freqs;
+	int i;
+	u32 len_ids, len_freqs;
+
+	power_mgt = of_find_node_by_path("/ibm,opal/power-mgt");
+	if (!power_mgt) {
+		pr_warn("power-mgt node not found\n");
+		return -ENODEV;
+	}
+
+	if (of_property_read_u32(power_mgt, "ibm,pstate-min", &pstate_min)) {
+		pr_warn("ibm,pstate-min node not found\n");
+		return -ENODEV;
+	}
+
+	if (of_property_read_u32(power_mgt, "ibm,pstate-max", &pstate_max)) {
+		pr_warn("ibm,pstate-max node not found\n");
+		return -ENODEV;
+	}
+
+	if (of_property_read_u32(power_mgt, "ibm,pstate-nominal",
+				 &pstate_nominal)) {
+		pr_warn("ibm,pstate-nominal not found\n");
+		return -ENODEV;
+	}
+	pr_info("cpufreq pstate min %d nominal %d max %d\n", pstate_min,
+		pstate_nominal, pstate_max);
+
+	pstate_ids = of_get_property(power_mgt, "ibm,pstate-ids", &len_ids);
+	if (!pstate_ids) {
+		pr_warn("ibm,pstate-ids not found\n");
+		return -ENODEV;
+	}
+
+	pstate_freqs = of_get_property(power_mgt, "ibm,pstate-frequencies-mhz",
+				      &len_freqs);
+	if (!pstate_freqs) {
+		pr_warn("ibm,pstate-frequencies-mhz not found\n");
+		return -ENODEV;
+	}
+
+	WARN_ON(len_ids != len_freqs);
+	nr_pstates = min(len_ids, len_freqs) / sizeof(u32);
+	WARN_ON(!nr_pstates);
+
+	pr_debug("NR PStates %d\n", nr_pstates);
+	for (i = 0; i < nr_pstates; i++) {
+		u32 id = be32_to_cpu(pstate_ids[i]);
+		u32 freq = be32_to_cpu(pstate_freqs[i]);
+
+		pr_debug("PState id %d freq %d MHz\n", id, freq);
+		powernv_freqs[i].driver_data = i;
+		powernv_freqs[i].frequency = freq * 1000; /* kHz */
+		powernv_pstate_ids[i] = id;
+	}
+	/* End of list marker entry */
+	powernv_freqs[i].driver_data = 0;
+	powernv_freqs[i].frequency = CPUFREQ_TABLE_END;
+
+	/* Print frequency table */
+	for (i = 0; powernv_freqs[i].frequency != CPUFREQ_TABLE_END; i++)
+		pr_debug("%d: %d\n", i, powernv_freqs[i].frequency);
+
+	return 0;
+}
+
+static struct freq_attr *powernv_cpu_freq_attr[] = {
+	&cpufreq_freq_attr_scaling_available_freqs,
+	NULL,
+};
+
+/* Helper routines */
+
+/* Access helpers to power mgt SPR */
+
+static inline unsigned long get_pmspr(unsigned long sprn)
+{
+	switch (sprn) {
+	case SPRN_PMCR:
+		return mfspr(SPRN_PMCR);
+
+	case SPRN_PMICR:
+		return mfspr(SPRN_PMICR);
+
+	case SPRN_PMSR:
+		return mfspr(SPRN_PMSR);
+	}
+	BUG();
+}
+
+static inline void set_pmspr(unsigned long sprn, unsigned long val)
+{
+	switch (sprn) {
+	case SPRN_PMCR:
+		mtspr(SPRN_PMCR, val);
+		return;
+
+	case SPRN_PMICR:
+		mtspr(SPRN_PMICR, val);
+		return;
+
+	case SPRN_PMSR:
+		mtspr(SPRN_PMSR, val);
+		return;
+	}
+	BUG();
+}
+
+static void set_pstate(void *pstate)
+{
+	unsigned long val;
+	unsigned long pstate_ul = *(unsigned long *) pstate;
+
+	val = get_pmspr(SPRN_PMCR);
+	val = val & 0x0000ffffffffffffULL;
+	/* Set both global(bits 56..63) and local(bits 48..55) PStates */
+	val = val | (pstate_ul << 56) | (pstate_ul << 48);
+	pr_debug("Setting cpu %d pmcr to %016lX\n", smp_processor_id(), val);
+	set_pmspr(SPRN_PMCR, val);
+}
+
+static int powernv_set_freq(cpumask_var_t cpus, unsigned int new_index)
+{
+	unsigned long val = (unsigned long) powernv_pstate_ids[new_index];
+
+	/*
+	 * Use smp_call_function to send IPI and execute the
+	 * mtspr on target cpu.  We could do that without IPI
+	 * if current CPU is within policy->cpus (core)
+	 */
+
+	val = val & 0xFF;
+	smp_call_function_any(cpus, set_pstate, &val, 1);
+	return 0;
+}
+
+static int powernv_cpufreq_cpu_init(struct cpufreq_policy *policy)
+{
+	int base, i;
+
+#ifdef CONFIG_SMP
+	base = cpu_first_thread_sibling(policy->cpu);
+
+	for (i = 0; i < threads_per_core; i++)
+		cpumask_set_cpu(base + i, policy->cpus);
+#endif
+	policy->cpuinfo.transition_latency = 25000;
+
+	policy->cur = powernv_freqs[0].frequency;
+	cpufreq_frequency_table_get_attr(powernv_freqs, policy->cpu);
+	return cpufreq_frequency_table_cpuinfo(policy, powernv_freqs);
+}
+
+static int powernv_cpufreq_cpu_exit(struct cpufreq_policy *policy)
+{
+	cpufreq_frequency_table_put_attr(policy->cpu);
+	return 0;
+}
+
+static int powernv_cpufreq_verify(struct cpufreq_policy *policy)
+{
+	return cpufreq_frequency_table_verify(policy, powernv_freqs);
+}
+
+static int powernv_cpufreq_target(struct cpufreq_policy *policy,
+			      unsigned int target_freq,
+			      unsigned int relation)
+{
+	int rc;
+	struct cpufreq_freqs freqs;
+	unsigned int new_index;
+
+	cpufreq_frequency_table_target(policy, powernv_freqs, target_freq,
+				       relation, &new_index);
+
+	freqs.old = policy->cur;
+	freqs.new = powernv_freqs[new_index].frequency;
+	freqs.cpu = policy->cpu;
+
+	mutex_lock(&freq_switch_mutex);
+	cpufreq_notify_transition(policy, &freqs, CPUFREQ_PRECHANGE);
+
+	pr_debug("setting frequency for cpu %d to %d kHz index %d pstate %d",
+		 policy->cpu,
+		 powernv_freqs[new_index].frequency,
+		 new_index,
+		 powernv_pstate_ids[new_index]);
+
+	rc = powernv_set_freq(policy->cpus, new_index);
+
+	cpufreq_notify_transition(policy, &freqs, CPUFREQ_POSTCHANGE);
+	mutex_unlock(&freq_switch_mutex);
+
+	return rc;
+}
+
+static struct cpufreq_driver powernv_cpufreq_driver = {
+	.verify		= powernv_cpufreq_verify,
+	.target		= powernv_cpufreq_target,
+	.init		= powernv_cpufreq_cpu_init,
+	.exit		= powernv_cpufreq_cpu_exit,
+	.name		= "powernv-cpufreq",
+	.flags		= CPUFREQ_CONST_LOOPS,
+	.attr		= powernv_cpu_freq_attr,
+};
+
+static int __init powernv_cpufreq_init(void)
+{
+	int rc = 0;
+
+	/* Discover pstates from device tree and init */
+
+	rc = init_powernv_pstates();
+
+	if (rc) {
+		pr_info("powernv-cpufreq disabled\n");
+		return rc;
+	}
+
+	rc = cpufreq_register_driver(&powernv_cpufreq_driver);
+	return rc;
+}
+
+static void __exit powernv_cpufreq_exit(void)
+{
+	cpufreq_unregister_driver(&powernv_cpufreq_driver);
+}
+
+module_init(powernv_cpufreq_init);
+module_exit(powernv_cpufreq_exit);
+
+MODULE_LICENSE("GPL");
+MODULE_AUTHOR("Vaidyanathan Srinivasan <svaidy at linux.vnet.ibm.com>");
-- 
1.8.3.1

^ permalink raw reply related	[flat|nested] 13+ messages in thread

* [PATCH v2 2/6] powernv:cpufreq: Create a powernv_cpu_to_core_mask() helper.
  2014-03-10 11:10 [PATCH v2 0/6] powernv:cpufreq: Dynamic cpu-frequency scaling Gautham R. Shenoy
  2014-03-10 11:10 ` [PATCH v2 1/6] powernv: cpufreq driver for powernv platform Gautham R. Shenoy
@ 2014-03-10 11:10 ` Gautham R. Shenoy
  2014-03-18 23:37   ` Benjamin Herrenschmidt
  2014-03-10 11:10 ` [PATCH v2 3/6] powernv, cpufreq:Add per-core locking to serialize frequency transitions Gautham R. Shenoy
                   ` (3 subsequent siblings)
  5 siblings, 1 reply; 13+ messages in thread
From: Gautham R. Shenoy @ 2014-03-10 11:10 UTC (permalink / raw)
  To: linuxppc-dev; +Cc: srivatsa.bhat, Gautham R. Shenoy

From: "Srivatsa S. Bhat" <srivatsa.bhat@linux.vnet.ibm.com>

Create a helper method that computes the cpumask corresponding to the
thread-siblings of a cpu. Use this for initializing the policy->cpus
mask for a given cpu.

(Original code written by Srivatsa S. Bhat. Gautham moved this to a
helper function!)

Signed-off-by: Srivatsa S. Bhat <srivatsa.bhat@linux.vnet.ibm.com>
Signed-off-by: Gautham R. Shenoy <ego@linux.vnet.ibm.com>
---
 drivers/cpufreq/powernv-cpufreq.c | 24 ++++++++++++++++++------
 1 file changed, 18 insertions(+), 6 deletions(-)

diff --git a/drivers/cpufreq/powernv-cpufreq.c b/drivers/cpufreq/powernv-cpufreq.c
index ab1551f..4cad727 100644
--- a/drivers/cpufreq/powernv-cpufreq.c
+++ b/drivers/cpufreq/powernv-cpufreq.c
@@ -115,6 +115,23 @@ static struct freq_attr *powernv_cpu_freq_attr[] = {
 
 /* Helper routines */
 
+/**
+ * Sets the bits corresponding to the thread-siblings of cpu in its core
+ * in 'cpus'.
+ */
+static void powernv_cpu_to_core_mask(unsigned int cpu, cpumask_var_t cpus)
+{
+	int base, i;
+
+	base = cpu_first_thread_sibling(cpu);
+
+	for (i = 0; i < threads_per_core; i++) {
+		cpumask_set_cpu(base + i, cpus);
+	}
+
+	return;
+}
+
 /* Access helpers to power mgt SPR */
 
 static inline unsigned long get_pmspr(unsigned long sprn)
@@ -180,13 +197,8 @@ static int powernv_set_freq(cpumask_var_t cpus, unsigned int new_index)
 
 static int powernv_cpufreq_cpu_init(struct cpufreq_policy *policy)
 {
-	int base, i;
-
 #ifdef CONFIG_SMP
-	base = cpu_first_thread_sibling(policy->cpu);
-
-	for (i = 0; i < threads_per_core; i++)
-		cpumask_set_cpu(base + i, policy->cpus);
+	powernv_cpu_to_core_mask(policy->cpu, policy->cpus);
 #endif
 	policy->cpuinfo.transition_latency = 25000;
 
-- 
1.8.3.1

^ permalink raw reply related	[flat|nested] 13+ messages in thread

* [PATCH v2 3/6] powernv, cpufreq:Add per-core locking to serialize frequency transitions
  2014-03-10 11:10 [PATCH v2 0/6] powernv:cpufreq: Dynamic cpu-frequency scaling Gautham R. Shenoy
  2014-03-10 11:10 ` [PATCH v2 1/6] powernv: cpufreq driver for powernv platform Gautham R. Shenoy
  2014-03-10 11:10 ` [PATCH v2 2/6] powernv:cpufreq: Create a powernv_cpu_to_core_mask() helper Gautham R. Shenoy
@ 2014-03-10 11:10 ` Gautham R. Shenoy
  2014-03-17  9:04   ` Preeti U Murthy
  2014-03-10 11:10 ` [PATCH v2 4/6] powernv:cpufreq: Create pstate_id_to_freq() helper Gautham R. Shenoy
                   ` (2 subsequent siblings)
  5 siblings, 1 reply; 13+ messages in thread
From: Gautham R. Shenoy @ 2014-03-10 11:10 UTC (permalink / raw)
  To: linuxppc-dev; +Cc: srivatsa.bhat, Gautham R. Shenoy

From: "Srivatsa S. Bhat" <srivatsa.bhat@linux.vnet.ibm.com>

On POWER systems, the CPU frequency is controlled at a core-level and
hence we need to serialize so that only one of the threads in the core
switches the core's frequency at a time.

Using a global mutex lock would needlessly serialize _all_ frequency
transitions in the system (across all cores). So introduce per-core
locking to enable finer-grained synchronization and thereby enhance
the speed and responsiveness of the cpufreq driver to varying workload
demands.

The design of per-core locking is very simple and straight-forward: we
first define a Per-CPU lock and use the ones that belongs to the first
thread sibling of the core.

cpu_first_thread_sibling() macro is used to find the *common* lock for
all thread siblings belonging to a core.

Signed-off-by: Srivatsa S. Bhat <srivatsa.bhat@linux.vnet.ibm.com>
Signed-off-by: Vaidyanathan Srinivasan <svaidy@linux.vnet.ibm.com>
Signed-off-by: Gautham R. Shenoy <ego@linux.vnet.ibm.com>
---
 drivers/cpufreq/powernv-cpufreq.c | 21 ++++++++++++++++-----
 1 file changed, 16 insertions(+), 5 deletions(-)

diff --git a/drivers/cpufreq/powernv-cpufreq.c b/drivers/cpufreq/powernv-cpufreq.c
index 4cad727..4c2e8ca 100644
--- a/drivers/cpufreq/powernv-cpufreq.c
+++ b/drivers/cpufreq/powernv-cpufreq.c
@@ -24,8 +24,15 @@
 #include <linux/of.h>
 #include <asm/cputhreads.h>
 
-/* FIXME: Make this per-core */
-static DEFINE_MUTEX(freq_switch_mutex);
+/* Per-Core locking for frequency transitions */
+static DEFINE_PER_CPU(struct mutex, freq_switch_lock);
+
+#define lock_core_freq(cpu)				\
+			mutex_lock(&per_cpu(freq_switch_lock,\
+				cpu_first_thread_sibling(cpu)));
+#define unlock_core_freq(cpu)				\
+			mutex_unlock(&per_cpu(freq_switch_lock,\
+				cpu_first_thread_sibling(cpu)));
 
 #define POWERNV_MAX_PSTATES	256
 
@@ -233,7 +240,7 @@ static int powernv_cpufreq_target(struct cpufreq_policy *policy,
 	freqs.new = powernv_freqs[new_index].frequency;
 	freqs.cpu = policy->cpu;
 
-	mutex_lock(&freq_switch_mutex);
+	lock_core_freq(policy->cpu);
 	cpufreq_notify_transition(policy, &freqs, CPUFREQ_PRECHANGE);
 
 	pr_debug("setting frequency for cpu %d to %d kHz index %d pstate %d",
@@ -245,7 +252,7 @@ static int powernv_cpufreq_target(struct cpufreq_policy *policy,
 	rc = powernv_set_freq(policy->cpus, new_index);
 
 	cpufreq_notify_transition(policy, &freqs, CPUFREQ_POSTCHANGE);
-	mutex_unlock(&freq_switch_mutex);
+	unlock_core_freq(policy->cpu);
 
 	return rc;
 }
@@ -262,7 +269,7 @@ static struct cpufreq_driver powernv_cpufreq_driver = {
 
 static int __init powernv_cpufreq_init(void)
 {
-	int rc = 0;
+	int cpu, rc = 0;
 
 	/* Discover pstates from device tree and init */
 
@@ -272,6 +279,10 @@ static int __init powernv_cpufreq_init(void)
 		pr_info("powernv-cpufreq disabled\n");
 		return rc;
 	}
+	/* Init per-core mutex */
+	for_each_possible_cpu(cpu) {
+		mutex_init(&per_cpu(freq_switch_lock, cpu));
+	}
 
 	rc = cpufreq_register_driver(&powernv_cpufreq_driver);
 	return rc;
-- 
1.8.3.1

^ permalink raw reply related	[flat|nested] 13+ messages in thread

* [PATCH v2 4/6] powernv:cpufreq: Create pstate_id_to_freq() helper
  2014-03-10 11:10 [PATCH v2 0/6] powernv:cpufreq: Dynamic cpu-frequency scaling Gautham R. Shenoy
                   ` (2 preceding siblings ...)
  2014-03-10 11:10 ` [PATCH v2 3/6] powernv, cpufreq:Add per-core locking to serialize frequency transitions Gautham R. Shenoy
@ 2014-03-10 11:10 ` Gautham R. Shenoy
  2014-03-17  9:06   ` Preeti U Murthy
  2014-03-10 11:11 ` [PATCH v2 5/6] powernv:cpufreq: Export nominal frequency via sysfs Gautham R. Shenoy
  2014-03-10 11:11 ` [PATCH v2 6/6] powernv:cpufreq: Implement the driver->get() method Gautham R. Shenoy
  5 siblings, 1 reply; 13+ messages in thread
From: Gautham R. Shenoy @ 2014-03-10 11:10 UTC (permalink / raw)
  To: linuxppc-dev; +Cc: srivatsa.bhat, Gautham R. Shenoy

From: "Gautham R. Shenoy" <ego@linux.vnet.ibm.com>

Create a helper routine that can return the cpu-frequency for the
corresponding pstate_id.

Also, cache the values of the pstate_max, pstate_min and
pstate_nominal and nr_pstates in a static structure so that they can
be reused in the future to perform any validations.

Signed-off-by: Gautham R. Shenoy <ego@linux.vnet.ibm.com>
---
 drivers/cpufreq/powernv-cpufreq.c | 27 +++++++++++++++++++++++++++
 1 file changed, 27 insertions(+)

diff --git a/drivers/cpufreq/powernv-cpufreq.c b/drivers/cpufreq/powernv-cpufreq.c
index 4c2e8ca..0ecd163 100644
--- a/drivers/cpufreq/powernv-cpufreq.c
+++ b/drivers/cpufreq/powernv-cpufreq.c
@@ -39,6 +39,14 @@ static DEFINE_PER_CPU(struct mutex, freq_switch_lock);
 static struct cpufreq_frequency_table powernv_freqs[POWERNV_MAX_PSTATES+1];
 static int powernv_pstate_ids[POWERNV_MAX_PSTATES+1];
 
+struct powernv_pstate_info {
+	int pstate_min_id;
+	int pstate_max_id;
+	int pstate_nominal_id;
+	int nr_pstates;
+};
+static struct powernv_pstate_info powernv_pstate_info;
+
 /*
  * Initialize the freq table based on data obtained
  * from the firmware passed via device-tree
@@ -112,9 +120,28 @@ static int init_powernv_pstates(void)
 	for (i = 0; powernv_freqs[i].frequency != CPUFREQ_TABLE_END; i++)
 		pr_debug("%d: %d\n", i, powernv_freqs[i].frequency);
 
+	powernv_pstate_info.pstate_min_id = pstate_min;
+	powernv_pstate_info.pstate_max_id = pstate_max;
+	powernv_pstate_info.pstate_nominal_id = pstate_nominal;
+	powernv_pstate_info.nr_pstates = nr_pstates;
+
 	return 0;
 }
 
+/**
+ * Returns the cpu frequency corresponding to the pstate_id.
+ */
+static unsigned int pstate_id_to_freq(int pstate_id)
+{
+	int i;
+
+	i = powernv_pstate_info.pstate_max_id - pstate_id;
+
+	BUG_ON(i >= powernv_pstate_info.nr_pstates || i < 0);
+	WARN_ON(powernv_pstate_ids[i] != pstate_id);
+	return powernv_freqs[i].frequency;
+}
+
 static struct freq_attr *powernv_cpu_freq_attr[] = {
 	&cpufreq_freq_attr_scaling_available_freqs,
 	NULL,
-- 
1.8.3.1

^ permalink raw reply related	[flat|nested] 13+ messages in thread

* [PATCH v2 5/6] powernv:cpufreq: Export nominal frequency via sysfs.
  2014-03-10 11:10 [PATCH v2 0/6] powernv:cpufreq: Dynamic cpu-frequency scaling Gautham R. Shenoy
                   ` (3 preceding siblings ...)
  2014-03-10 11:10 ` [PATCH v2 4/6] powernv:cpufreq: Create pstate_id_to_freq() helper Gautham R. Shenoy
@ 2014-03-10 11:11 ` Gautham R. Shenoy
  2014-03-10 11:11 ` [PATCH v2 6/6] powernv:cpufreq: Implement the driver->get() method Gautham R. Shenoy
  5 siblings, 0 replies; 13+ messages in thread
From: Gautham R. Shenoy @ 2014-03-10 11:11 UTC (permalink / raw)
  To: linuxppc-dev; +Cc: srivatsa.bhat, Gautham R. Shenoy

From: "Gautham R. Shenoy" <ego@linux.vnet.ibm.com>

Create a driver attribute named cpuinfo_nominal_freq which
creates a sysfs read-only file named cpuinfo_nominal_freq. Export
the frequency corresponding to the nominal_pstate through this
interface.

Nominal frequency is the highest non-turbo frequency for the
platform.  This is generally used for setting governor policies from
user space for optimal energy efficiency.

Signed-off-by: Gautham R. Shenoy <ego@linux.vnet.ibm.com>
---
 drivers/cpufreq/powernv-cpufreq.c | 22 ++++++++++++++++++++++
 1 file changed, 22 insertions(+)

diff --git a/drivers/cpufreq/powernv-cpufreq.c b/drivers/cpufreq/powernv-cpufreq.c
index 0ecd163..183bbc4 100644
--- a/drivers/cpufreq/powernv-cpufreq.c
+++ b/drivers/cpufreq/powernv-cpufreq.c
@@ -142,8 +142,30 @@ static unsigned int pstate_id_to_freq(int pstate_id)
 	return powernv_freqs[i].frequency;
 }
 
+/**
+ * show_cpuinfo_nominal_freq - Show the nominal CPU frequency as indicated by
+ * the firmware
+ */
+static ssize_t show_cpuinfo_nominal_freq(struct cpufreq_policy *policy,
+					char *buf)
+{
+	int nominal_freq;
+	nominal_freq = pstate_id_to_freq(powernv_pstate_info.pstate_nominal_id);
+	return sprintf(buf, "%u\n", nominal_freq);
+}
+
+
+struct freq_attr cpufreq_freq_attr_cpuinfo_nominal_freq = {
+	.attr = { .name = "cpuinfo_nominal_freq",
+		  .mode = 0444,
+		},
+	.show = show_cpuinfo_nominal_freq,
+};
+
+
 static struct freq_attr *powernv_cpu_freq_attr[] = {
 	&cpufreq_freq_attr_scaling_available_freqs,
+	&cpufreq_freq_attr_cpuinfo_nominal_freq,
 	NULL,
 };
 
-- 
1.8.3.1

^ permalink raw reply related	[flat|nested] 13+ messages in thread

* [PATCH v2 6/6] powernv:cpufreq: Implement the driver->get() method
  2014-03-10 11:10 [PATCH v2 0/6] powernv:cpufreq: Dynamic cpu-frequency scaling Gautham R. Shenoy
                   ` (4 preceding siblings ...)
  2014-03-10 11:11 ` [PATCH v2 5/6] powernv:cpufreq: Export nominal frequency via sysfs Gautham R. Shenoy
@ 2014-03-10 11:11 ` Gautham R. Shenoy
  2014-03-17  9:11   ` Preeti U Murthy
  5 siblings, 1 reply; 13+ messages in thread
From: Gautham R. Shenoy @ 2014-03-10 11:11 UTC (permalink / raw)
  To: linuxppc-dev; +Cc: srivatsa.bhat, Gautham R. Shenoy

From: "Gautham R. Shenoy" <ego@linux.vnet.ibm.com>

The current frequency of a cpu is reported through the sysfs file
cpuinfo_cur_freq. This requires the driver to implement a
"->get(unsigned int cpu)" method which will return the current
operating frequency.

Implement a function named powernv_cpufreq_get() which reads the local
pstate from the PMSR and returns the corresponding frequency.

Set the powernv_cpufreq_driver.get hook to powernv_cpufreq_get().

Signed-off-by: Gautham R. Shenoy <ego@linux.vnet.ibm.com>
---
 drivers/cpufreq/powernv-cpufreq.c | 48 +++++++++++++++++++++++++++++++++++++++
 1 file changed, 48 insertions(+)

diff --git a/drivers/cpufreq/powernv-cpufreq.c b/drivers/cpufreq/powernv-cpufreq.c
index 183bbc4..6f3b6e1 100644
--- a/drivers/cpufreq/powernv-cpufreq.c
+++ b/drivers/cpufreq/powernv-cpufreq.c
@@ -223,6 +223,53 @@ static inline void set_pmspr(unsigned long sprn, unsigned long val)
 	BUG();
 }
 
+/*
+ * Computes the current frequency on this cpu
+ * and stores the result in *ret_freq.
+ */
+static void powernv_read_cpu_freq(void *ret_freq)
+{
+	unsigned long pmspr_val;
+	s8 local_pstate_id;
+	int *cur_freq, freq, pstate_id;
+
+	cur_freq = (int *)ret_freq;
+	pmspr_val = get_pmspr(SPRN_PMSR);
+
+	/* The local pstate id corresponds bits 48..55 in the PMSR.
+         * Note: Watch out for the sign! */
+	local_pstate_id = (pmspr_val >> 48) & 0xFF;
+	pstate_id = local_pstate_id;
+
+	freq = pstate_id_to_freq(pstate_id);
+	pr_debug("cpu %d pmsr %lx pstate_id %d frequency %d \n",
+		smp_processor_id(), pmspr_val, pstate_id, freq);
+	*cur_freq = freq;
+}
+
+/*
+ * Returns the cpu frequency as reported by the firmware for 'cpu'.
+ * This value is reported through the sysfs file cpuinfo_cur_freq.
+ */
+unsigned int powernv_cpufreq_get(unsigned int cpu)
+{
+	int ret_freq;
+	cpumask_var_t sibling_mask;
+
+	if (unlikely(!zalloc_cpumask_var(&sibling_mask, GFP_KERNEL))) {
+		smp_call_function_single(cpu, powernv_read_cpu_freq,
+					&ret_freq, 1);
+		return ret_freq;
+	}
+
+	powernv_cpu_to_core_mask(cpu, sibling_mask);
+	smp_call_function_any(sibling_mask, powernv_read_cpu_freq,
+			&ret_freq, 1);
+
+	free_cpumask_var(sibling_mask);
+	return ret_freq;
+}
+
 static void set_pstate(void *pstate)
 {
 	unsigned long val;
@@ -309,6 +356,7 @@ static int powernv_cpufreq_target(struct cpufreq_policy *policy,
 static struct cpufreq_driver powernv_cpufreq_driver = {
 	.verify		= powernv_cpufreq_verify,
 	.target		= powernv_cpufreq_target,
+	.get		= powernv_cpufreq_get,
 	.init		= powernv_cpufreq_cpu_init,
 	.exit		= powernv_cpufreq_cpu_exit,
 	.name		= "powernv-cpufreq",
-- 
1.8.3.1

^ permalink raw reply related	[flat|nested] 13+ messages in thread

* Re: [PATCH v2 3/6] powernv, cpufreq:Add per-core locking to serialize frequency transitions
  2014-03-10 11:10 ` [PATCH v2 3/6] powernv, cpufreq:Add per-core locking to serialize frequency transitions Gautham R. Shenoy
@ 2014-03-17  9:04   ` Preeti U Murthy
  0 siblings, 0 replies; 13+ messages in thread
From: Preeti U Murthy @ 2014-03-17  9:04 UTC (permalink / raw)
  To: Gautham R. Shenoy; +Cc: linuxppc-dev, srivatsa.bhat

On 03/10/2014 04:40 PM, Gautham R. Shenoy wrote:
> From: "Srivatsa S. Bhat" <srivatsa.bhat@linux.vnet.ibm.com>
> 
> On POWER systems, the CPU frequency is controlled at a core-level and
> hence we need to serialize so that only one of the threads in the core
> switches the core's frequency at a time.
> 
> Using a global mutex lock would needlessly serialize _all_ frequency
> transitions in the system (across all cores). So introduce per-core
> locking to enable finer-grained synchronization and thereby enhance
> the speed and responsiveness of the cpufreq driver to varying workload
> demands.
> 
> The design of per-core locking is very simple and straight-forward: we
> first define a Per-CPU lock and use the ones that belongs to the first
> thread sibling of the core.
> 
> cpu_first_thread_sibling() macro is used to find the *common* lock for
> all thread siblings belonging to a core.
> 
> Signed-off-by: Srivatsa S. Bhat <srivatsa.bhat@linux.vnet.ibm.com>
> Signed-off-by: Vaidyanathan Srinivasan <svaidy@linux.vnet.ibm.com>
> Signed-off-by: Gautham R. Shenoy <ego@linux.vnet.ibm.com>
> ---
>  drivers/cpufreq/powernv-cpufreq.c | 21 ++++++++++++++++-----
>  1 file changed, 16 insertions(+), 5 deletions(-)
> 
> diff --git a/drivers/cpufreq/powernv-cpufreq.c b/drivers/cpufreq/powernv-cpufreq.c
> index 4cad727..4c2e8ca 100644
> --- a/drivers/cpufreq/powernv-cpufreq.c
> +++ b/drivers/cpufreq/powernv-cpufreq.c
> @@ -24,8 +24,15 @@
>  #include <linux/of.h>
>  #include <asm/cputhreads.h>
>  
> -/* FIXME: Make this per-core */
> -static DEFINE_MUTEX(freq_switch_mutex);
> +/* Per-Core locking for frequency transitions */
> +static DEFINE_PER_CPU(struct mutex, freq_switch_lock);
> +
> +#define lock_core_freq(cpu)				\
> +			mutex_lock(&per_cpu(freq_switch_lock,\
> +				cpu_first_thread_sibling(cpu)));
> +#define unlock_core_freq(cpu)				\
> +			mutex_unlock(&per_cpu(freq_switch_lock,\
> +				cpu_first_thread_sibling(cpu)));
>  
>  #define POWERNV_MAX_PSTATES	256
>  
> @@ -233,7 +240,7 @@ static int powernv_cpufreq_target(struct cpufreq_policy *policy,
>  	freqs.new = powernv_freqs[new_index].frequency;
>  	freqs.cpu = policy->cpu;
>  
> -	mutex_lock(&freq_switch_mutex);
> +	lock_core_freq(policy->cpu);
>  	cpufreq_notify_transition(policy, &freqs, CPUFREQ_PRECHANGE);
>  
>  	pr_debug("setting frequency for cpu %d to %d kHz index %d pstate %d",
> @@ -245,7 +252,7 @@ static int powernv_cpufreq_target(struct cpufreq_policy *policy,
>  	rc = powernv_set_freq(policy->cpus, new_index);
>  
>  	cpufreq_notify_transition(policy, &freqs, CPUFREQ_POSTCHANGE);
> -	mutex_unlock(&freq_switch_mutex);
> +	unlock_core_freq(policy->cpu);
>  
>  	return rc;
>  }
> @@ -262,7 +269,7 @@ static struct cpufreq_driver powernv_cpufreq_driver = {
>  
>  static int __init powernv_cpufreq_init(void)
>  {
> -	int rc = 0;
> +	int cpu, rc = 0;
>  
>  	/* Discover pstates from device tree and init */
>  
> @@ -272,6 +279,10 @@ static int __init powernv_cpufreq_init(void)
>  		pr_info("powernv-cpufreq disabled\n");
>  		return rc;
>  	}
> +	/* Init per-core mutex */
> +	for_each_possible_cpu(cpu) {
> +		mutex_init(&per_cpu(freq_switch_lock, cpu));
> +	}
>  
>  	rc = cpufreq_register_driver(&powernv_cpufreq_driver);
>  	return rc;
> 
Reviewed-by: Preeti U Murthy <preeti@linux.vnet.ibm.com>

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH v2 4/6] powernv:cpufreq: Create pstate_id_to_freq() helper
  2014-03-10 11:10 ` [PATCH v2 4/6] powernv:cpufreq: Create pstate_id_to_freq() helper Gautham R. Shenoy
@ 2014-03-17  9:06   ` Preeti U Murthy
  0 siblings, 0 replies; 13+ messages in thread
From: Preeti U Murthy @ 2014-03-17  9:06 UTC (permalink / raw)
  To: Gautham R. Shenoy; +Cc: linuxppc-dev, srivatsa.bhat

On 03/10/2014 04:40 PM, Gautham R. Shenoy wrote:
> From: "Gautham R. Shenoy" <ego@linux.vnet.ibm.com>
> 
> Create a helper routine that can return the cpu-frequency for the
> corresponding pstate_id.
> 
> Also, cache the values of the pstate_max, pstate_min and
> pstate_nominal and nr_pstates in a static structure so that they can
> be reused in the future to perform any validations.
> 
> Signed-off-by: Gautham R. Shenoy <ego@linux.vnet.ibm.com>
> ---
>  drivers/cpufreq/powernv-cpufreq.c | 27 +++++++++++++++++++++++++++
>  1 file changed, 27 insertions(+)
> 
> diff --git a/drivers/cpufreq/powernv-cpufreq.c b/drivers/cpufreq/powernv-cpufreq.c
> index 4c2e8ca..0ecd163 100644
> --- a/drivers/cpufreq/powernv-cpufreq.c
> +++ b/drivers/cpufreq/powernv-cpufreq.c
> @@ -39,6 +39,14 @@ static DEFINE_PER_CPU(struct mutex, freq_switch_lock);
>  static struct cpufreq_frequency_table powernv_freqs[POWERNV_MAX_PSTATES+1];
>  static int powernv_pstate_ids[POWERNV_MAX_PSTATES+1];
>  
> +struct powernv_pstate_info {
> +	int pstate_min_id;
> +	int pstate_max_id;
> +	int pstate_nominal_id;
> +	int nr_pstates;
> +};
> +static struct powernv_pstate_info powernv_pstate_info;
> +
>  /*
>   * Initialize the freq table based on data obtained
>   * from the firmware passed via device-tree
> @@ -112,9 +120,28 @@ static int init_powernv_pstates(void)
>  	for (i = 0; powernv_freqs[i].frequency != CPUFREQ_TABLE_END; i++)
>  		pr_debug("%d: %d\n", i, powernv_freqs[i].frequency);
>  
> +	powernv_pstate_info.pstate_min_id = pstate_min;
> +	powernv_pstate_info.pstate_max_id = pstate_max;
> +	powernv_pstate_info.pstate_nominal_id = pstate_nominal;
> +	powernv_pstate_info.nr_pstates = nr_pstates;
> +
>  	return 0;
>  }
>  
> +/**
> + * Returns the cpu frequency corresponding to the pstate_id.
> + */
> +static unsigned int pstate_id_to_freq(int pstate_id)
> +{
> +	int i;
> +
> +	i = powernv_pstate_info.pstate_max_id - pstate_id;
> +
> +	BUG_ON(i >= powernv_pstate_info.nr_pstates || i < 0);
> +	WARN_ON(powernv_pstate_ids[i] != pstate_id);
> +	return powernv_freqs[i].frequency;
> +}
> +
>  static struct freq_attr *powernv_cpu_freq_attr[] = {
>  	&cpufreq_freq_attr_scaling_available_freqs,
>  	NULL,
> 
Reviewed-by: Preeti U Murthy <preeti@linux.vnet.ibm.com>

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH v2 6/6] powernv:cpufreq: Implement the driver->get() method
  2014-03-10 11:11 ` [PATCH v2 6/6] powernv:cpufreq: Implement the driver->get() method Gautham R. Shenoy
@ 2014-03-17  9:11   ` Preeti U Murthy
  0 siblings, 0 replies; 13+ messages in thread
From: Preeti U Murthy @ 2014-03-17  9:11 UTC (permalink / raw)
  To: Gautham R. Shenoy; +Cc: linuxppc-dev, srivatsa.bhat

On 03/10/2014 04:41 PM, Gautham R. Shenoy wrote:
> From: "Gautham R. Shenoy" <ego@linux.vnet.ibm.com>
> 
> The current frequency of a cpu is reported through the sysfs file
> cpuinfo_cur_freq. This requires the driver to implement a
> "->get(unsigned int cpu)" method which will return the current
> operating frequency.
> 
> Implement a function named powernv_cpufreq_get() which reads the local
> pstate from the PMSR and returns the corresponding frequency.
> 
> Set the powernv_cpufreq_driver.get hook to powernv_cpufreq_get().
> 
> Signed-off-by: Gautham R. Shenoy <ego@linux.vnet.ibm.com>
> ---
>  drivers/cpufreq/powernv-cpufreq.c | 48 +++++++++++++++++++++++++++++++++++++++
>  1 file changed, 48 insertions(+)
> 
> diff --git a/drivers/cpufreq/powernv-cpufreq.c b/drivers/cpufreq/powernv-cpufreq.c
> index 183bbc4..6f3b6e1 100644
> --- a/drivers/cpufreq/powernv-cpufreq.c
> +++ b/drivers/cpufreq/powernv-cpufreq.c
> @@ -223,6 +223,53 @@ static inline void set_pmspr(unsigned long sprn, unsigned long val)
>  	BUG();
>  }
>  
> +/*
> + * Computes the current frequency on this cpu
> + * and stores the result in *ret_freq.
> + */
> +static void powernv_read_cpu_freq(void *ret_freq)
> +{
> +	unsigned long pmspr_val;
> +	s8 local_pstate_id;
> +	int *cur_freq, freq, pstate_id;
> +
> +	cur_freq = (int *)ret_freq;
> +	pmspr_val = get_pmspr(SPRN_PMSR);
> +
> +	/* The local pstate id corresponds bits 48..55 in the PMSR.
> +         * Note: Watch out for the sign! */
> +	local_pstate_id = (pmspr_val >> 48) & 0xFF;
> +	pstate_id = local_pstate_id;
> +
> +	freq = pstate_id_to_freq(pstate_id);
> +	pr_debug("cpu %d pmsr %lx pstate_id %d frequency %d \n",
> +		smp_processor_id(), pmspr_val, pstate_id, freq);
> +	*cur_freq = freq;
> +}
> +
> +/*
> + * Returns the cpu frequency as reported by the firmware for 'cpu'.
> + * This value is reported through the sysfs file cpuinfo_cur_freq.
> + */
> +unsigned int powernv_cpufreq_get(unsigned int cpu)
> +{
> +	int ret_freq;
> +	cpumask_var_t sibling_mask;
> +
> +	if (unlikely(!zalloc_cpumask_var(&sibling_mask, GFP_KERNEL))) {
> +		smp_call_function_single(cpu, powernv_read_cpu_freq,
> +					&ret_freq, 1);
> +		return ret_freq;
> +	}
> +
> +	powernv_cpu_to_core_mask(cpu, sibling_mask);
> +	smp_call_function_any(sibling_mask, powernv_read_cpu_freq,
> +			&ret_freq, 1);
> +
> +	free_cpumask_var(sibling_mask);
> +	return ret_freq;
> +}
> +
>  static void set_pstate(void *pstate)
>  {
>  	unsigned long val;
> @@ -309,6 +356,7 @@ static int powernv_cpufreq_target(struct cpufreq_policy *policy,
>  static struct cpufreq_driver powernv_cpufreq_driver = {
>  	.verify		= powernv_cpufreq_verify,
>  	.target		= powernv_cpufreq_target,
> +	.get		= powernv_cpufreq_get,
>  	.init		= powernv_cpufreq_cpu_init,
>  	.exit		= powernv_cpufreq_cpu_exit,
>  	.name		= "powernv-cpufreq",
> 
Reviewed-by: Preeti U Murthy <preeti@linux.vnet.ibm.com>

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH v2 2/6] powernv:cpufreq: Create a powernv_cpu_to_core_mask() helper.
  2014-03-10 11:10 ` [PATCH v2 2/6] powernv:cpufreq: Create a powernv_cpu_to_core_mask() helper Gautham R. Shenoy
@ 2014-03-18 23:37   ` Benjamin Herrenschmidt
  2014-03-19  6:35     ` Srivatsa S. Bhat
  0 siblings, 1 reply; 13+ messages in thread
From: Benjamin Herrenschmidt @ 2014-03-18 23:37 UTC (permalink / raw)
  To: Gautham R. Shenoy; +Cc: linuxppc-dev, srivatsa.bhat

On Mon, 2014-03-10 at 16:40 +0530, Gautham R. Shenoy wrote:
> From: "Srivatsa S. Bhat" <srivatsa.bhat@linux.vnet.ibm.com>
> 
> Create a helper method that computes the cpumask corresponding to the
> thread-siblings of a cpu. Use this for initializing the policy->cpus
> mask for a given cpu.
> 
> (Original code written by Srivatsa S. Bhat. Gautham moved this to a
> helper function!)

How does that differ from the existing entry in the sibling map  which
you can obtain with the helper cpu_sibling_mask() ? (You probably want
to *copy* the mask of course but I don't see the need of re-creating
one from scratch).

Also, this should have been CCed to the cpufreq mailing list and maybe
the relevant maintainer too.

Cheers,
Ben.

> Signed-off-by: Srivatsa S. Bhat <srivatsa.bhat@linux.vnet.ibm.com>
> Signed-off-by: Gautham R. Shenoy <ego@linux.vnet.ibm.com>
> ---
>  drivers/cpufreq/powernv-cpufreq.c | 24 ++++++++++++++++++------
>  1 file changed, 18 insertions(+), 6 deletions(-)
> 
> diff --git a/drivers/cpufreq/powernv-cpufreq.c b/drivers/cpufreq/powernv-cpufreq.c
> index ab1551f..4cad727 100644
> --- a/drivers/cpufreq/powernv-cpufreq.c
> +++ b/drivers/cpufreq/powernv-cpufreq.c
> @@ -115,6 +115,23 @@ static struct freq_attr *powernv_cpu_freq_attr[] = {
>  
>  /* Helper routines */
>  
> +/**
> + * Sets the bits corresponding to the thread-siblings of cpu in its core
> + * in 'cpus'.
> + */
> +static void powernv_cpu_to_core_mask(unsigned int cpu, cpumask_var_t cpus)
> +{
> +	int base, i;
> +
> +	base = cpu_first_thread_sibling(cpu);
> +
> +	for (i = 0; i < threads_per_core; i++) {
> +		cpumask_set_cpu(base + i, cpus);
> +	}
> +
> +	return;
> +}
> +
>  /* Access helpers to power mgt SPR */
>  
>  static inline unsigned long get_pmspr(unsigned long sprn)
> @@ -180,13 +197,8 @@ static int powernv_set_freq(cpumask_var_t cpus, unsigned int new_index)
>  
>  static int powernv_cpufreq_cpu_init(struct cpufreq_policy *policy)
>  {
> -	int base, i;
> -
>  #ifdef CONFIG_SMP
> -	base = cpu_first_thread_sibling(policy->cpu);
> -
> -	for (i = 0; i < threads_per_core; i++)
> -		cpumask_set_cpu(base + i, policy->cpus);
> +	powernv_cpu_to_core_mask(policy->cpu, policy->cpus);
>  #endif
>  	policy->cpuinfo.transition_latency = 25000;
>  

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH v2 2/6] powernv:cpufreq: Create a powernv_cpu_to_core_mask() helper.
  2014-03-18 23:37   ` Benjamin Herrenschmidt
@ 2014-03-19  6:35     ` Srivatsa S. Bhat
  2014-03-19 11:02       ` Gautham R Shenoy
  0 siblings, 1 reply; 13+ messages in thread
From: Srivatsa S. Bhat @ 2014-03-19  6:35 UTC (permalink / raw)
  To: Benjamin Herrenschmidt; +Cc: Gautham R. Shenoy, linuxppc-dev

On 03/19/2014 05:07 AM, Benjamin Herrenschmidt wrote:
> On Mon, 2014-03-10 at 16:40 +0530, Gautham R. Shenoy wrote:
>> From: "Srivatsa S. Bhat" <srivatsa.bhat@linux.vnet.ibm.com>
>>
>> Create a helper method that computes the cpumask corresponding to the
>> thread-siblings of a cpu. Use this for initializing the policy->cpus
>> mask for a given cpu.
>>
>> (Original code written by Srivatsa S. Bhat. Gautham moved this to a
>> helper function!)
> 
> How does that differ from the existing entry in the sibling map  which
> you can obtain with the helper cpu_sibling_mask() ? (You probably want
> to *copy* the mask of course but I don't see the need of re-creating
> one from scratch).
>

The intent here was to have a sibling mask that is invariant of CPU
hotplug. cpu_sibling_mask is updated on every hotplug operation, and this
would break cpufreq, since policy->cpus has to be hotplug invariant.

This should have been noted in the changelog of patch 1 as well as this
patch. (The earlier (internal) versions of this patchset had the
description in the changelogs, but somehow it got dropped accidentally).
I'll work with Gautham and ensure that we have this important info
included in the changelog. Thanks for pointing it out!

Regards,
Srivatsa S. Bhat

 
> Also, this should have been CCed to the cpufreq mailing list and maybe
> the relevant maintainer too.
> 
> Cheers,
> Ben.
> 
>> Signed-off-by: Srivatsa S. Bhat <srivatsa.bhat@linux.vnet.ibm.com>
>> Signed-off-by: Gautham R. Shenoy <ego@linux.vnet.ibm.com>
>> ---
>>  drivers/cpufreq/powernv-cpufreq.c | 24 ++++++++++++++++++------
>>  1 file changed, 18 insertions(+), 6 deletions(-)
>>
>> diff --git a/drivers/cpufreq/powernv-cpufreq.c b/drivers/cpufreq/powernv-cpufreq.c
>> index ab1551f..4cad727 100644
>> --- a/drivers/cpufreq/powernv-cpufreq.c
>> +++ b/drivers/cpufreq/powernv-cpufreq.c
>> @@ -115,6 +115,23 @@ static struct freq_attr *powernv_cpu_freq_attr[] = {
>>  
>>  /* Helper routines */
>>  
>> +/**
>> + * Sets the bits corresponding to the thread-siblings of cpu in its core
>> + * in 'cpus'.
>> + */
>> +static void powernv_cpu_to_core_mask(unsigned int cpu, cpumask_var_t cpus)
>> +{
>> +	int base, i;
>> +
>> +	base = cpu_first_thread_sibling(cpu);
>> +
>> +	for (i = 0; i < threads_per_core; i++) {
>> +		cpumask_set_cpu(base + i, cpus);
>> +	}
>> +
>> +	return;
>> +}
>> +
>>  /* Access helpers to power mgt SPR */
>>  
>>  static inline unsigned long get_pmspr(unsigned long sprn)
>> @@ -180,13 +197,8 @@ static int powernv_set_freq(cpumask_var_t cpus, unsigned int new_index)
>>  
>>  static int powernv_cpufreq_cpu_init(struct cpufreq_policy *policy)
>>  {
>> -	int base, i;
>> -
>>  #ifdef CONFIG_SMP
>> -	base = cpu_first_thread_sibling(policy->cpu);
>> -
>> -	for (i = 0; i < threads_per_core; i++)
>> -		cpumask_set_cpu(base + i, policy->cpus);
>> +	powernv_cpu_to_core_mask(policy->cpu, policy->cpus);
>>  #endif
>>  	policy->cpuinfo.transition_latency = 25000;
>>  
> 
> 

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH v2 2/6] powernv:cpufreq: Create a powernv_cpu_to_core_mask() helper.
  2014-03-19  6:35     ` Srivatsa S. Bhat
@ 2014-03-19 11:02       ` Gautham R Shenoy
  0 siblings, 0 replies; 13+ messages in thread
From: Gautham R Shenoy @ 2014-03-19 11:02 UTC (permalink / raw)
  To: Srivatsa S. Bhat; +Cc: linuxppc-dev, Gautham R. Shenoy

On Wed, Mar 19, 2014 at 12:05:20PM +0530, Srivatsa S. Bhat wrote:
> On 03/19/2014 05:07 AM, Benjamin Herrenschmidt wrote:
> > On Mon, 2014-03-10 at 16:40 +0530, Gautham R. Shenoy wrote:
> >> From: "Srivatsa S. Bhat" <srivatsa.bhat@linux.vnet.ibm.com>
> >>
> >> Create a helper method that computes the cpumask corresponding to the
> >> thread-siblings of a cpu. Use this for initializing the policy->cpus
> >> mask for a given cpu.
> >>
> >> (Original code written by Srivatsa S. Bhat. Gautham moved this to a
> >> helper function!)
> > 
> > How does that differ from the existing entry in the sibling map  which
> > you can obtain with the helper cpu_sibling_mask() ? (You probably want
> > to *copy* the mask of course but I don't see the need of re-creating
> > one from scratch).
> >
> 
> The intent here was to have a sibling mask that is invariant of CPU
> hotplug. cpu_sibling_mask is updated on every hotplug operation, and this
> would break cpufreq, since policy->cpus has to be hotplug invariant.
> 
> This should have been noted in the changelog of patch 1 as well as this
> patch. (The earlier (internal) versions of this patchset had the
> description in the changelogs, but somehow it got dropped accidentally).
> I'll work with Gautham and ensure that we have this important info
> included in the changelog. Thanks for pointing it out!

I reused that part of the code because I was unaware of
cpu_sibling_mask()!

For implementing powernv_cpufreq_get(), cpu_sibling_mask() suffices
since we are using this mask as a parameter to
smp_call_function_any(), and hence are concerned only about the online
siblings. 

I shall fix the changelog to reflect Srivatsa's reason for having a
mask that does not vary with cpu-hotplug.

> 
> Regards,
> Srivatsa S. Bhat
> 
> 
> > Also, this should have been CCed to the cpufreq mailing list and maybe
> > the relevant maintainer too.

Ok. Will do it in the subsequent version. 

Thanks for the review!

> > 
> > Cheers,
> > Ben.
> > 
> >> Signed-off-by: Srivatsa S. Bhat <srivatsa.bhat@linux.vnet.ibm.com>
> >> Signed-off-by: Gautham R. Shenoy <ego@linux.vnet.ibm.com>
> >> ---
> >>  drivers/cpufreq/powernv-cpufreq.c | 24 ++++++++++++++++++------
> >>  1 file changed, 18 insertions(+), 6 deletions(-)
> >>
> >> diff --git a/drivers/cpufreq/powernv-cpufreq.c b/drivers/cpufreq/powernv-cpufreq.c
> >> index ab1551f..4cad727 100644
> >> --- a/drivers/cpufreq/powernv-cpufreq.c
> >> +++ b/drivers/cpufreq/powernv-cpufreq.c
> >> @@ -115,6 +115,23 @@ static struct freq_attr *powernv_cpu_freq_attr[] = {
> >>  
> >>  /* Helper routines */
> >>  
> >> +/**
> >> + * Sets the bits corresponding to the thread-siblings of cpu in its core
> >> + * in 'cpus'.
> >> + */
> >> +static void powernv_cpu_to_core_mask(unsigned int cpu, cpumask_var_t cpus)
> >> +{
> >> +	int base, i;
> >> +
> >> +	base = cpu_first_thread_sibling(cpu);
> >> +
> >> +	for (i = 0; i < threads_per_core; i++) {
> >> +		cpumask_set_cpu(base + i, cpus);
> >> +	}
> >> +
> >> +	return;
> >> +}
> >> +
> >>  /* Access helpers to power mgt SPR */
> >>  
> >>  static inline unsigned long get_pmspr(unsigned long sprn)
> >> @@ -180,13 +197,8 @@ static int powernv_set_freq(cpumask_var_t cpus, unsigned int new_index)
> >>  
> >>  static int powernv_cpufreq_cpu_init(struct cpufreq_policy *policy)
> >>  {
> >> -	int base, i;
> >> -
> >>  #ifdef CONFIG_SMP
> >> -	base = cpu_first_thread_sibling(policy->cpu);
> >> -
> >> -	for (i = 0; i < threads_per_core; i++)
> >> -		cpumask_set_cpu(base + i, policy->cpus);
> >> +	powernv_cpu_to_core_mask(policy->cpu, policy->cpus);
> >>  #endif
> >>  	policy->cpuinfo.transition_latency = 25000;
> >>  
> > 
> > 
> 

^ permalink raw reply	[flat|nested] 13+ messages in thread

end of thread, other threads:[~2014-03-19 11:02 UTC | newest]

Thread overview: 13+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2014-03-10 11:10 [PATCH v2 0/6] powernv:cpufreq: Dynamic cpu-frequency scaling Gautham R. Shenoy
2014-03-10 11:10 ` [PATCH v2 1/6] powernv: cpufreq driver for powernv platform Gautham R. Shenoy
2014-03-10 11:10 ` [PATCH v2 2/6] powernv:cpufreq: Create a powernv_cpu_to_core_mask() helper Gautham R. Shenoy
2014-03-18 23:37   ` Benjamin Herrenschmidt
2014-03-19  6:35     ` Srivatsa S. Bhat
2014-03-19 11:02       ` Gautham R Shenoy
2014-03-10 11:10 ` [PATCH v2 3/6] powernv, cpufreq:Add per-core locking to serialize frequency transitions Gautham R. Shenoy
2014-03-17  9:04   ` Preeti U Murthy
2014-03-10 11:10 ` [PATCH v2 4/6] powernv:cpufreq: Create pstate_id_to_freq() helper Gautham R. Shenoy
2014-03-17  9:06   ` Preeti U Murthy
2014-03-10 11:11 ` [PATCH v2 5/6] powernv:cpufreq: Export nominal frequency via sysfs Gautham R. Shenoy
2014-03-10 11:11 ` [PATCH v2 6/6] powernv:cpufreq: Implement the driver->get() method Gautham R. Shenoy
2014-03-17  9:11   ` Preeti U Murthy

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).