All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH v3 0/5] powernv: Enable Dynamic Frequency
@ 2014-03-20 12:10 ` Gautham R. Shenoy
  0 siblings, 0 replies; 80+ messages in thread
From: Gautham R. Shenoy @ 2014-03-20 12:10 UTC (permalink / raw)
  To: linuxppc-dev, linux-pm; +Cc: Gautham R. Shenoy

From: "Gautham R. Shenoy" <ego@linux.vnet.ibm.com>

Hi,

This is the v3 of the consolidated patchset consisting
patches for enabling cpufreq on IBM POWERNV platforms
along with some enhancements. I have incorporated the review 
for v2 (which can be found at [1]).

- This patchset contains code for the platform driver to support CPU
  frequency scaling on IBM POWERNV platforms.

- In addition to the standard control and status files exposed by the
  cpufreq core, the patchset exposes the nominal frequency through the
  file named "cpuinfo_nominal_freq".


The changes from from v2:
  - Updated the changelog for "powernv: cpufreq driver for powernv platform"
    to record the fact that policy->cpus must be populated with a sibling mask
    that includes even the offlined siblings.

  - Dropped the patch named 
    "powernv:cpufreq: Create a powernv_cpu_to_core_mask() helper"
     since the driver->get() method can use cpu_sibling_mask().
  
  -  Updated "powernv:cpufreq: Implement the driver->get() method" to 
     use cpu_sibling_mask() in powernv_cpufreq_get()

The patchset is based against commit 4907cdca7210c5895311bddcf05a4c85b67d8566
of the mainline tree.

[1]: https://lists.ozlabs.org/pipermail/linuxppc-dev/2014-March/115861.html


Gautham R. Shenoy (3):
  powernv:cpufreq: Create pstate_id_to_freq() helper
  powernv:cpufreq: Export nominal frequency via sysfs.
  powernv:cpufreq: Implement the driver->get() method

Srivatsa S. Bhat (1):
  powernv,cpufreq:Add per-core locking to serialize frequency
    transitions

Vaidyanathan Srinivasan (1):
  powernv: cpufreq driver for powernv platform

 arch/powerpc/include/asm/reg.h         |   4 +
 arch/powerpc/platforms/powernv/Kconfig |   1 +
 drivers/cpufreq/Kconfig                |   1 +
 drivers/cpufreq/Kconfig.powerpc        |  13 ++
 drivers/cpufreq/Makefile               |   1 +
 drivers/cpufreq/powernv-cpufreq.c      | 375 +++++++++++++++++++++++++++++++++
 6 files changed, 395 insertions(+)
 create mode 100644 drivers/cpufreq/powernv-cpufreq.c

-- 
1.8.3.1

_______________________________________________
Linuxppc-dev mailing list
Linuxppc-dev@lists.ozlabs.org
https://lists.ozlabs.org/listinfo/linuxppc-dev

^ permalink raw reply	[flat|nested] 80+ messages in thread

* [PATCH v3 0/5] powernv: Enable Dynamic Frequency
@ 2014-03-20 12:10 ` Gautham R. Shenoy
  0 siblings, 0 replies; 80+ messages in thread
From: Gautham R. Shenoy @ 2014-03-20 12:10 UTC (permalink / raw)
  To: linuxppc-dev, linux-pm; +Cc: Gautham R. Shenoy

From: "Gautham R. Shenoy" <ego@linux.vnet.ibm.com>

Hi,

This is the v3 of the consolidated patchset consisting
patches for enabling cpufreq on IBM POWERNV platforms
along with some enhancements. I have incorporated the review 
for v2 (which can be found at [1]).

- This patchset contains code for the platform driver to support CPU
  frequency scaling on IBM POWERNV platforms.

- In addition to the standard control and status files exposed by the
  cpufreq core, the patchset exposes the nominal frequency through the
  file named "cpuinfo_nominal_freq".


The changes from from v2:
  - Updated the changelog for "powernv: cpufreq driver for powernv platform"
    to record the fact that policy->cpus must be populated with a sibling mask
    that includes even the offlined siblings.

  - Dropped the patch named 
    "powernv:cpufreq: Create a powernv_cpu_to_core_mask() helper"
     since the driver->get() method can use cpu_sibling_mask().
  
  -  Updated "powernv:cpufreq: Implement the driver->get() method" to 
     use cpu_sibling_mask() in powernv_cpufreq_get()

The patchset is based against commit 4907cdca7210c5895311bddcf05a4c85b67d8566
of the mainline tree.

[1]: https://lists.ozlabs.org/pipermail/linuxppc-dev/2014-March/115861.html


Gautham R. Shenoy (3):
  powernv:cpufreq: Create pstate_id_to_freq() helper
  powernv:cpufreq: Export nominal frequency via sysfs.
  powernv:cpufreq: Implement the driver->get() method

Srivatsa S. Bhat (1):
  powernv,cpufreq:Add per-core locking to serialize frequency
    transitions

Vaidyanathan Srinivasan (1):
  powernv: cpufreq driver for powernv platform

 arch/powerpc/include/asm/reg.h         |   4 +
 arch/powerpc/platforms/powernv/Kconfig |   1 +
 drivers/cpufreq/Kconfig                |   1 +
 drivers/cpufreq/Kconfig.powerpc        |  13 ++
 drivers/cpufreq/Makefile               |   1 +
 drivers/cpufreq/powernv-cpufreq.c      | 375 +++++++++++++++++++++++++++++++++
 6 files changed, 395 insertions(+)
 create mode 100644 drivers/cpufreq/powernv-cpufreq.c

-- 
1.8.3.1

^ permalink raw reply	[flat|nested] 80+ messages in thread

* [PATCH v3 1/5] powernv: cpufreq driver for powernv platform
  2014-03-20 12:10 ` Gautham R. Shenoy
@ 2014-03-20 12:10   ` Gautham R. Shenoy
  -1 siblings, 0 replies; 80+ messages in thread
From: Gautham R. Shenoy @ 2014-03-20 12:10 UTC (permalink / raw)
  To: linuxppc-dev, linux-pm
  Cc: benh, svaidy, Srivatsa S. Bhat, Anton Blanchard, Gautham R. Shenoy

From: Vaidyanathan Srinivasan <svaidy@linux.vnet.ibm.com>

Backend driver to dynamically set voltage and frequency on
IBM POWER non-virtualized platforms.  Power management SPRs
are used to set the required PState.

This driver works in conjunction with cpufreq governors
like 'ondemand' to provide a demand based frequency and
voltage setting on IBM POWER non-virtualized platforms.

PState table is obtained from OPAL v3 firmware through device
tree.

powernv_cpufreq back-end driver would parse the relevant device-tree
nodes and initialise the cpufreq subsystem on powernv platform.

The policy->cpus needs to be populated in a hotplug-invariant manner
instead of using cpu_sibling_mask() which varies with cpu-hotplug. This
is because the cpufreq core code copies this content into policy->related_cpus
mask which is should not vary on cpu-hotplug.

[Change log updated by ego@linux.vnet.ibm.com]

Signed-off-by: Vaidyanathan Srinivasan <svaidy@linux.vnet.ibm.com>
Signed-off-by: Srivatsa S. Bhat <srivatsa.bhat@linux.vnet.ibm.com>
Signed-off-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Gautham R. Shenoy <ego@linux.vnet.ibm.com>
---
 arch/powerpc/include/asm/reg.h         |   4 +
 arch/powerpc/platforms/powernv/Kconfig |   1 +
 drivers/cpufreq/Kconfig                |   1 +
 drivers/cpufreq/Kconfig.powerpc        |  13 ++
 drivers/cpufreq/Makefile               |   1 +
 drivers/cpufreq/powernv-cpufreq.c      | 277 +++++++++++++++++++++++++++++++++
 6 files changed, 297 insertions(+)
 create mode 100644 drivers/cpufreq/powernv-cpufreq.c

diff --git a/arch/powerpc/include/asm/reg.h b/arch/powerpc/include/asm/reg.h
index 90c06ec..84f92ca 100644
--- a/arch/powerpc/include/asm/reg.h
+++ b/arch/powerpc/include/asm/reg.h
@@ -271,6 +271,10 @@
 #define SPRN_HSRR1	0x13B	/* Hypervisor Save/Restore 1 */
 #define SPRN_IC		0x350	/* Virtual Instruction Count */
 #define SPRN_VTB	0x351	/* Virtual Time Base */
+#define SPRN_PMICR	0x354   /* Power Management Idle Control Reg */
+#define SPRN_PMSR	0x355   /* Power Management Status Reg */
+#define SPRN_PMCR	0x374	/* Power Management Control Register */
+
 /* HFSCR and FSCR bit numbers are the same */
 #define FSCR_TAR_LG	8	/* Enable Target Address Register */
 #define FSCR_EBB_LG	7	/* Enable Event Based Branching */
diff --git a/arch/powerpc/platforms/powernv/Kconfig b/arch/powerpc/platforms/powernv/Kconfig
index 895e8a2..1fe12b1 100644
--- a/arch/powerpc/platforms/powernv/Kconfig
+++ b/arch/powerpc/platforms/powernv/Kconfig
@@ -11,6 +11,7 @@ config PPC_POWERNV
 	select PPC_UDBG_16550
 	select PPC_SCOM
 	select ARCH_RANDOM
+	select CPU_FREQ
 	default y
 
 config PPC_POWERNV_RTAS
diff --git a/drivers/cpufreq/Kconfig b/drivers/cpufreq/Kconfig
index 4b029c0..4ba1632 100644
--- a/drivers/cpufreq/Kconfig
+++ b/drivers/cpufreq/Kconfig
@@ -48,6 +48,7 @@ config CPU_FREQ_STAT_DETAILS
 choice
 	prompt "Default CPUFreq governor"
 	default CPU_FREQ_DEFAULT_GOV_USERSPACE if ARM_SA1100_CPUFREQ || ARM_SA1110_CPUFREQ
+	default CPU_FREQ_DEFAULT_GOV_ONDEMAND if POWERNV_CPUFREQ
 	default CPU_FREQ_DEFAULT_GOV_PERFORMANCE
 	help
 	  This option sets which CPUFreq governor shall be loaded at
diff --git a/drivers/cpufreq/Kconfig.powerpc b/drivers/cpufreq/Kconfig.powerpc
index ca0021a..93f8689 100644
--- a/drivers/cpufreq/Kconfig.powerpc
+++ b/drivers/cpufreq/Kconfig.powerpc
@@ -54,3 +54,16 @@ config PPC_PASEMI_CPUFREQ
 	help
 	  This adds the support for frequency switching on PA Semi
 	  PWRficient processors.
+
+config POWERNV_CPUFREQ
+       tristate "CPU frequency scaling for IBM POWERNV platform"
+       depends on PPC_POWERNV
+       select CPU_FREQ_GOV_PERFORMANCE
+       select CPU_FREQ_GOV_POWERSAVE
+       select CPU_FREQ_GOV_USERSPACE
+       select CPU_FREQ_GOV_ONDEMAND
+       select CPU_FREQ_GOV_CONSERVATIVE
+       default y
+       help
+	 This adds support for CPU frequency switching on IBM POWERNV
+	 platform
diff --git a/drivers/cpufreq/Makefile b/drivers/cpufreq/Makefile
index 7494565..0dbb963 100644
--- a/drivers/cpufreq/Makefile
+++ b/drivers/cpufreq/Makefile
@@ -86,6 +86,7 @@ obj-$(CONFIG_PPC_CORENET_CPUFREQ)   += ppc-corenet-cpufreq.o
 obj-$(CONFIG_CPU_FREQ_PMAC)		+= pmac32-cpufreq.o
 obj-$(CONFIG_CPU_FREQ_PMAC64)		+= pmac64-cpufreq.o
 obj-$(CONFIG_PPC_PASEMI_CPUFREQ)	+= pasemi-cpufreq.o
+obj-$(CONFIG_POWERNV_CPUFREQ)		+= powernv-cpufreq.o
 
 ##################################################################################
 # Other platform drivers
diff --git a/drivers/cpufreq/powernv-cpufreq.c b/drivers/cpufreq/powernv-cpufreq.c
new file mode 100644
index 0000000..ab1551f
--- /dev/null
+++ b/drivers/cpufreq/powernv-cpufreq.c
@@ -0,0 +1,277 @@
+/*
+ * POWERNV cpufreq driver for the IBM POWER processors
+ *
+ * (C) Copyright IBM 2014
+ *
+ * Author: Vaidyanathan Srinivasan <svaidy at linux.vnet.ibm.com>
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License as published by
+ * the Free Software Foundation; either version 2, or (at your option)
+ * any later version.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ *
+ */
+
+#define pr_fmt(fmt)	"powernv-cpufreq: " fmt
+
+#include <linux/module.h>
+#include <linux/cpufreq.h>
+#include <linux/of.h>
+#include <asm/cputhreads.h>
+
+/* FIXME: Make this per-core */
+static DEFINE_MUTEX(freq_switch_mutex);
+
+#define POWERNV_MAX_PSTATES	256
+
+static struct cpufreq_frequency_table powernv_freqs[POWERNV_MAX_PSTATES+1];
+static int powernv_pstate_ids[POWERNV_MAX_PSTATES+1];
+
+/*
+ * Initialize the freq table based on data obtained
+ * from the firmware passed via device-tree
+ */
+
+static int init_powernv_pstates(void)
+{
+	struct device_node *power_mgt;
+	int nr_pstates = 0;
+	int pstate_min, pstate_max, pstate_nominal;
+	const __be32 *pstate_ids, *pstate_freqs;
+	int i;
+	u32 len_ids, len_freqs;
+
+	power_mgt = of_find_node_by_path("/ibm,opal/power-mgt");
+	if (!power_mgt) {
+		pr_warn("power-mgt node not found\n");
+		return -ENODEV;
+	}
+
+	if (of_property_read_u32(power_mgt, "ibm,pstate-min", &pstate_min)) {
+		pr_warn("ibm,pstate-min node not found\n");
+		return -ENODEV;
+	}
+
+	if (of_property_read_u32(power_mgt, "ibm,pstate-max", &pstate_max)) {
+		pr_warn("ibm,pstate-max node not found\n");
+		return -ENODEV;
+	}
+
+	if (of_property_read_u32(power_mgt, "ibm,pstate-nominal",
+				 &pstate_nominal)) {
+		pr_warn("ibm,pstate-nominal not found\n");
+		return -ENODEV;
+	}
+	pr_info("cpufreq pstate min %d nominal %d max %d\n", pstate_min,
+		pstate_nominal, pstate_max);
+
+	pstate_ids = of_get_property(power_mgt, "ibm,pstate-ids", &len_ids);
+	if (!pstate_ids) {
+		pr_warn("ibm,pstate-ids not found\n");
+		return -ENODEV;
+	}
+
+	pstate_freqs = of_get_property(power_mgt, "ibm,pstate-frequencies-mhz",
+				      &len_freqs);
+	if (!pstate_freqs) {
+		pr_warn("ibm,pstate-frequencies-mhz not found\n");
+		return -ENODEV;
+	}
+
+	WARN_ON(len_ids != len_freqs);
+	nr_pstates = min(len_ids, len_freqs) / sizeof(u32);
+	WARN_ON(!nr_pstates);
+
+	pr_debug("NR PStates %d\n", nr_pstates);
+	for (i = 0; i < nr_pstates; i++) {
+		u32 id = be32_to_cpu(pstate_ids[i]);
+		u32 freq = be32_to_cpu(pstate_freqs[i]);
+
+		pr_debug("PState id %d freq %d MHz\n", id, freq);
+		powernv_freqs[i].driver_data = i;
+		powernv_freqs[i].frequency = freq * 1000; /* kHz */
+		powernv_pstate_ids[i] = id;
+	}
+	/* End of list marker entry */
+	powernv_freqs[i].driver_data = 0;
+	powernv_freqs[i].frequency = CPUFREQ_TABLE_END;
+
+	/* Print frequency table */
+	for (i = 0; powernv_freqs[i].frequency != CPUFREQ_TABLE_END; i++)
+		pr_debug("%d: %d\n", i, powernv_freqs[i].frequency);
+
+	return 0;
+}
+
+static struct freq_attr *powernv_cpu_freq_attr[] = {
+	&cpufreq_freq_attr_scaling_available_freqs,
+	NULL,
+};
+
+/* Helper routines */
+
+/* Access helpers to power mgt SPR */
+
+static inline unsigned long get_pmspr(unsigned long sprn)
+{
+	switch (sprn) {
+	case SPRN_PMCR:
+		return mfspr(SPRN_PMCR);
+
+	case SPRN_PMICR:
+		return mfspr(SPRN_PMICR);
+
+	case SPRN_PMSR:
+		return mfspr(SPRN_PMSR);
+	}
+	BUG();
+}
+
+static inline void set_pmspr(unsigned long sprn, unsigned long val)
+{
+	switch (sprn) {
+	case SPRN_PMCR:
+		mtspr(SPRN_PMCR, val);
+		return;
+
+	case SPRN_PMICR:
+		mtspr(SPRN_PMICR, val);
+		return;
+
+	case SPRN_PMSR:
+		mtspr(SPRN_PMSR, val);
+		return;
+	}
+	BUG();
+}
+
+static void set_pstate(void *pstate)
+{
+	unsigned long val;
+	unsigned long pstate_ul = *(unsigned long *) pstate;
+
+	val = get_pmspr(SPRN_PMCR);
+	val = val & 0x0000ffffffffffffULL;
+	/* Set both global(bits 56..63) and local(bits 48..55) PStates */
+	val = val | (pstate_ul << 56) | (pstate_ul << 48);
+	pr_debug("Setting cpu %d pmcr to %016lX\n", smp_processor_id(), val);
+	set_pmspr(SPRN_PMCR, val);
+}
+
+static int powernv_set_freq(cpumask_var_t cpus, unsigned int new_index)
+{
+	unsigned long val = (unsigned long) powernv_pstate_ids[new_index];
+
+	/*
+	 * Use smp_call_function to send IPI and execute the
+	 * mtspr on target cpu.  We could do that without IPI
+	 * if current CPU is within policy->cpus (core)
+	 */
+
+	val = val & 0xFF;
+	smp_call_function_any(cpus, set_pstate, &val, 1);
+	return 0;
+}
+
+static int powernv_cpufreq_cpu_init(struct cpufreq_policy *policy)
+{
+	int base, i;
+
+#ifdef CONFIG_SMP
+	base = cpu_first_thread_sibling(policy->cpu);
+
+	for (i = 0; i < threads_per_core; i++)
+		cpumask_set_cpu(base + i, policy->cpus);
+#endif
+	policy->cpuinfo.transition_latency = 25000;
+
+	policy->cur = powernv_freqs[0].frequency;
+	cpufreq_frequency_table_get_attr(powernv_freqs, policy->cpu);
+	return cpufreq_frequency_table_cpuinfo(policy, powernv_freqs);
+}
+
+static int powernv_cpufreq_cpu_exit(struct cpufreq_policy *policy)
+{
+	cpufreq_frequency_table_put_attr(policy->cpu);
+	return 0;
+}
+
+static int powernv_cpufreq_verify(struct cpufreq_policy *policy)
+{
+	return cpufreq_frequency_table_verify(policy, powernv_freqs);
+}
+
+static int powernv_cpufreq_target(struct cpufreq_policy *policy,
+			      unsigned int target_freq,
+			      unsigned int relation)
+{
+	int rc;
+	struct cpufreq_freqs freqs;
+	unsigned int new_index;
+
+	cpufreq_frequency_table_target(policy, powernv_freqs, target_freq,
+				       relation, &new_index);
+
+	freqs.old = policy->cur;
+	freqs.new = powernv_freqs[new_index].frequency;
+	freqs.cpu = policy->cpu;
+
+	mutex_lock(&freq_switch_mutex);
+	cpufreq_notify_transition(policy, &freqs, CPUFREQ_PRECHANGE);
+
+	pr_debug("setting frequency for cpu %d to %d kHz index %d pstate %d",
+		 policy->cpu,
+		 powernv_freqs[new_index].frequency,
+		 new_index,
+		 powernv_pstate_ids[new_index]);
+
+	rc = powernv_set_freq(policy->cpus, new_index);
+
+	cpufreq_notify_transition(policy, &freqs, CPUFREQ_POSTCHANGE);
+	mutex_unlock(&freq_switch_mutex);
+
+	return rc;
+}
+
+static struct cpufreq_driver powernv_cpufreq_driver = {
+	.verify		= powernv_cpufreq_verify,
+	.target		= powernv_cpufreq_target,
+	.init		= powernv_cpufreq_cpu_init,
+	.exit		= powernv_cpufreq_cpu_exit,
+	.name		= "powernv-cpufreq",
+	.flags		= CPUFREQ_CONST_LOOPS,
+	.attr		= powernv_cpu_freq_attr,
+};
+
+static int __init powernv_cpufreq_init(void)
+{
+	int rc = 0;
+
+	/* Discover pstates from device tree and init */
+
+	rc = init_powernv_pstates();
+
+	if (rc) {
+		pr_info("powernv-cpufreq disabled\n");
+		return rc;
+	}
+
+	rc = cpufreq_register_driver(&powernv_cpufreq_driver);
+	return rc;
+}
+
+static void __exit powernv_cpufreq_exit(void)
+{
+	cpufreq_unregister_driver(&powernv_cpufreq_driver);
+}
+
+module_init(powernv_cpufreq_init);
+module_exit(powernv_cpufreq_exit);
+
+MODULE_LICENSE("GPL");
+MODULE_AUTHOR("Vaidyanathan Srinivasan <svaidy at linux.vnet.ibm.com>");
-- 
1.8.3.1


^ permalink raw reply related	[flat|nested] 80+ messages in thread

* [PATCH v3 1/5] powernv: cpufreq driver for powernv platform
@ 2014-03-20 12:10   ` Gautham R. Shenoy
  0 siblings, 0 replies; 80+ messages in thread
From: Gautham R. Shenoy @ 2014-03-20 12:10 UTC (permalink / raw)
  To: linuxppc-dev, linux-pm
  Cc: Anton Blanchard, Srivatsa S. Bhat, Gautham R. Shenoy

From: Vaidyanathan Srinivasan <svaidy@linux.vnet.ibm.com>

Backend driver to dynamically set voltage and frequency on
IBM POWER non-virtualized platforms.  Power management SPRs
are used to set the required PState.

This driver works in conjunction with cpufreq governors
like 'ondemand' to provide a demand based frequency and
voltage setting on IBM POWER non-virtualized platforms.

PState table is obtained from OPAL v3 firmware through device
tree.

powernv_cpufreq back-end driver would parse the relevant device-tree
nodes and initialise the cpufreq subsystem on powernv platform.

The policy->cpus needs to be populated in a hotplug-invariant manner
instead of using cpu_sibling_mask() which varies with cpu-hotplug. This
is because the cpufreq core code copies this content into policy->related_cpus
mask which is should not vary on cpu-hotplug.

[Change log updated by ego@linux.vnet.ibm.com]

Signed-off-by: Vaidyanathan Srinivasan <svaidy@linux.vnet.ibm.com>
Signed-off-by: Srivatsa S. Bhat <srivatsa.bhat@linux.vnet.ibm.com>
Signed-off-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Gautham R. Shenoy <ego@linux.vnet.ibm.com>
---
 arch/powerpc/include/asm/reg.h         |   4 +
 arch/powerpc/platforms/powernv/Kconfig |   1 +
 drivers/cpufreq/Kconfig                |   1 +
 drivers/cpufreq/Kconfig.powerpc        |  13 ++
 drivers/cpufreq/Makefile               |   1 +
 drivers/cpufreq/powernv-cpufreq.c      | 277 +++++++++++++++++++++++++++++++++
 6 files changed, 297 insertions(+)
 create mode 100644 drivers/cpufreq/powernv-cpufreq.c

diff --git a/arch/powerpc/include/asm/reg.h b/arch/powerpc/include/asm/reg.h
index 90c06ec..84f92ca 100644
--- a/arch/powerpc/include/asm/reg.h
+++ b/arch/powerpc/include/asm/reg.h
@@ -271,6 +271,10 @@
 #define SPRN_HSRR1	0x13B	/* Hypervisor Save/Restore 1 */
 #define SPRN_IC		0x350	/* Virtual Instruction Count */
 #define SPRN_VTB	0x351	/* Virtual Time Base */
+#define SPRN_PMICR	0x354   /* Power Management Idle Control Reg */
+#define SPRN_PMSR	0x355   /* Power Management Status Reg */
+#define SPRN_PMCR	0x374	/* Power Management Control Register */
+
 /* HFSCR and FSCR bit numbers are the same */
 #define FSCR_TAR_LG	8	/* Enable Target Address Register */
 #define FSCR_EBB_LG	7	/* Enable Event Based Branching */
diff --git a/arch/powerpc/platforms/powernv/Kconfig b/arch/powerpc/platforms/powernv/Kconfig
index 895e8a2..1fe12b1 100644
--- a/arch/powerpc/platforms/powernv/Kconfig
+++ b/arch/powerpc/platforms/powernv/Kconfig
@@ -11,6 +11,7 @@ config PPC_POWERNV
 	select PPC_UDBG_16550
 	select PPC_SCOM
 	select ARCH_RANDOM
+	select CPU_FREQ
 	default y
 
 config PPC_POWERNV_RTAS
diff --git a/drivers/cpufreq/Kconfig b/drivers/cpufreq/Kconfig
index 4b029c0..4ba1632 100644
--- a/drivers/cpufreq/Kconfig
+++ b/drivers/cpufreq/Kconfig
@@ -48,6 +48,7 @@ config CPU_FREQ_STAT_DETAILS
 choice
 	prompt "Default CPUFreq governor"
 	default CPU_FREQ_DEFAULT_GOV_USERSPACE if ARM_SA1100_CPUFREQ || ARM_SA1110_CPUFREQ
+	default CPU_FREQ_DEFAULT_GOV_ONDEMAND if POWERNV_CPUFREQ
 	default CPU_FREQ_DEFAULT_GOV_PERFORMANCE
 	help
 	  This option sets which CPUFreq governor shall be loaded at
diff --git a/drivers/cpufreq/Kconfig.powerpc b/drivers/cpufreq/Kconfig.powerpc
index ca0021a..93f8689 100644
--- a/drivers/cpufreq/Kconfig.powerpc
+++ b/drivers/cpufreq/Kconfig.powerpc
@@ -54,3 +54,16 @@ config PPC_PASEMI_CPUFREQ
 	help
 	  This adds the support for frequency switching on PA Semi
 	  PWRficient processors.
+
+config POWERNV_CPUFREQ
+       tristate "CPU frequency scaling for IBM POWERNV platform"
+       depends on PPC_POWERNV
+       select CPU_FREQ_GOV_PERFORMANCE
+       select CPU_FREQ_GOV_POWERSAVE
+       select CPU_FREQ_GOV_USERSPACE
+       select CPU_FREQ_GOV_ONDEMAND
+       select CPU_FREQ_GOV_CONSERVATIVE
+       default y
+       help
+	 This adds support for CPU frequency switching on IBM POWERNV
+	 platform
diff --git a/drivers/cpufreq/Makefile b/drivers/cpufreq/Makefile
index 7494565..0dbb963 100644
--- a/drivers/cpufreq/Makefile
+++ b/drivers/cpufreq/Makefile
@@ -86,6 +86,7 @@ obj-$(CONFIG_PPC_CORENET_CPUFREQ)   += ppc-corenet-cpufreq.o
 obj-$(CONFIG_CPU_FREQ_PMAC)		+= pmac32-cpufreq.o
 obj-$(CONFIG_CPU_FREQ_PMAC64)		+= pmac64-cpufreq.o
 obj-$(CONFIG_PPC_PASEMI_CPUFREQ)	+= pasemi-cpufreq.o
+obj-$(CONFIG_POWERNV_CPUFREQ)		+= powernv-cpufreq.o
 
 ##################################################################################
 # Other platform drivers
diff --git a/drivers/cpufreq/powernv-cpufreq.c b/drivers/cpufreq/powernv-cpufreq.c
new file mode 100644
index 0000000..ab1551f
--- /dev/null
+++ b/drivers/cpufreq/powernv-cpufreq.c
@@ -0,0 +1,277 @@
+/*
+ * POWERNV cpufreq driver for the IBM POWER processors
+ *
+ * (C) Copyright IBM 2014
+ *
+ * Author: Vaidyanathan Srinivasan <svaidy at linux.vnet.ibm.com>
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License as published by
+ * the Free Software Foundation; either version 2, or (at your option)
+ * any later version.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ *
+ */
+
+#define pr_fmt(fmt)	"powernv-cpufreq: " fmt
+
+#include <linux/module.h>
+#include <linux/cpufreq.h>
+#include <linux/of.h>
+#include <asm/cputhreads.h>
+
+/* FIXME: Make this per-core */
+static DEFINE_MUTEX(freq_switch_mutex);
+
+#define POWERNV_MAX_PSTATES	256
+
+static struct cpufreq_frequency_table powernv_freqs[POWERNV_MAX_PSTATES+1];
+static int powernv_pstate_ids[POWERNV_MAX_PSTATES+1];
+
+/*
+ * Initialize the freq table based on data obtained
+ * from the firmware passed via device-tree
+ */
+
+static int init_powernv_pstates(void)
+{
+	struct device_node *power_mgt;
+	int nr_pstates = 0;
+	int pstate_min, pstate_max, pstate_nominal;
+	const __be32 *pstate_ids, *pstate_freqs;
+	int i;
+	u32 len_ids, len_freqs;
+
+	power_mgt = of_find_node_by_path("/ibm,opal/power-mgt");
+	if (!power_mgt) {
+		pr_warn("power-mgt node not found\n");
+		return -ENODEV;
+	}
+
+	if (of_property_read_u32(power_mgt, "ibm,pstate-min", &pstate_min)) {
+		pr_warn("ibm,pstate-min node not found\n");
+		return -ENODEV;
+	}
+
+	if (of_property_read_u32(power_mgt, "ibm,pstate-max", &pstate_max)) {
+		pr_warn("ibm,pstate-max node not found\n");
+		return -ENODEV;
+	}
+
+	if (of_property_read_u32(power_mgt, "ibm,pstate-nominal",
+				 &pstate_nominal)) {
+		pr_warn("ibm,pstate-nominal not found\n");
+		return -ENODEV;
+	}
+	pr_info("cpufreq pstate min %d nominal %d max %d\n", pstate_min,
+		pstate_nominal, pstate_max);
+
+	pstate_ids = of_get_property(power_mgt, "ibm,pstate-ids", &len_ids);
+	if (!pstate_ids) {
+		pr_warn("ibm,pstate-ids not found\n");
+		return -ENODEV;
+	}
+
+	pstate_freqs = of_get_property(power_mgt, "ibm,pstate-frequencies-mhz",
+				      &len_freqs);
+	if (!pstate_freqs) {
+		pr_warn("ibm,pstate-frequencies-mhz not found\n");
+		return -ENODEV;
+	}
+
+	WARN_ON(len_ids != len_freqs);
+	nr_pstates = min(len_ids, len_freqs) / sizeof(u32);
+	WARN_ON(!nr_pstates);
+
+	pr_debug("NR PStates %d\n", nr_pstates);
+	for (i = 0; i < nr_pstates; i++) {
+		u32 id = be32_to_cpu(pstate_ids[i]);
+		u32 freq = be32_to_cpu(pstate_freqs[i]);
+
+		pr_debug("PState id %d freq %d MHz\n", id, freq);
+		powernv_freqs[i].driver_data = i;
+		powernv_freqs[i].frequency = freq * 1000; /* kHz */
+		powernv_pstate_ids[i] = id;
+	}
+	/* End of list marker entry */
+	powernv_freqs[i].driver_data = 0;
+	powernv_freqs[i].frequency = CPUFREQ_TABLE_END;
+
+	/* Print frequency table */
+	for (i = 0; powernv_freqs[i].frequency != CPUFREQ_TABLE_END; i++)
+		pr_debug("%d: %d\n", i, powernv_freqs[i].frequency);
+
+	return 0;
+}
+
+static struct freq_attr *powernv_cpu_freq_attr[] = {
+	&cpufreq_freq_attr_scaling_available_freqs,
+	NULL,
+};
+
+/* Helper routines */
+
+/* Access helpers to power mgt SPR */
+
+static inline unsigned long get_pmspr(unsigned long sprn)
+{
+	switch (sprn) {
+	case SPRN_PMCR:
+		return mfspr(SPRN_PMCR);
+
+	case SPRN_PMICR:
+		return mfspr(SPRN_PMICR);
+
+	case SPRN_PMSR:
+		return mfspr(SPRN_PMSR);
+	}
+	BUG();
+}
+
+static inline void set_pmspr(unsigned long sprn, unsigned long val)
+{
+	switch (sprn) {
+	case SPRN_PMCR:
+		mtspr(SPRN_PMCR, val);
+		return;
+
+	case SPRN_PMICR:
+		mtspr(SPRN_PMICR, val);
+		return;
+
+	case SPRN_PMSR:
+		mtspr(SPRN_PMSR, val);
+		return;
+	}
+	BUG();
+}
+
+static void set_pstate(void *pstate)
+{
+	unsigned long val;
+	unsigned long pstate_ul = *(unsigned long *) pstate;
+
+	val = get_pmspr(SPRN_PMCR);
+	val = val & 0x0000ffffffffffffULL;
+	/* Set both global(bits 56..63) and local(bits 48..55) PStates */
+	val = val | (pstate_ul << 56) | (pstate_ul << 48);
+	pr_debug("Setting cpu %d pmcr to %016lX\n", smp_processor_id(), val);
+	set_pmspr(SPRN_PMCR, val);
+}
+
+static int powernv_set_freq(cpumask_var_t cpus, unsigned int new_index)
+{
+	unsigned long val = (unsigned long) powernv_pstate_ids[new_index];
+
+	/*
+	 * Use smp_call_function to send IPI and execute the
+	 * mtspr on target cpu.  We could do that without IPI
+	 * if current CPU is within policy->cpus (core)
+	 */
+
+	val = val & 0xFF;
+	smp_call_function_any(cpus, set_pstate, &val, 1);
+	return 0;
+}
+
+static int powernv_cpufreq_cpu_init(struct cpufreq_policy *policy)
+{
+	int base, i;
+
+#ifdef CONFIG_SMP
+	base = cpu_first_thread_sibling(policy->cpu);
+
+	for (i = 0; i < threads_per_core; i++)
+		cpumask_set_cpu(base + i, policy->cpus);
+#endif
+	policy->cpuinfo.transition_latency = 25000;
+
+	policy->cur = powernv_freqs[0].frequency;
+	cpufreq_frequency_table_get_attr(powernv_freqs, policy->cpu);
+	return cpufreq_frequency_table_cpuinfo(policy, powernv_freqs);
+}
+
+static int powernv_cpufreq_cpu_exit(struct cpufreq_policy *policy)
+{
+	cpufreq_frequency_table_put_attr(policy->cpu);
+	return 0;
+}
+
+static int powernv_cpufreq_verify(struct cpufreq_policy *policy)
+{
+	return cpufreq_frequency_table_verify(policy, powernv_freqs);
+}
+
+static int powernv_cpufreq_target(struct cpufreq_policy *policy,
+			      unsigned int target_freq,
+			      unsigned int relation)
+{
+	int rc;
+	struct cpufreq_freqs freqs;
+	unsigned int new_index;
+
+	cpufreq_frequency_table_target(policy, powernv_freqs, target_freq,
+				       relation, &new_index);
+
+	freqs.old = policy->cur;
+	freqs.new = powernv_freqs[new_index].frequency;
+	freqs.cpu = policy->cpu;
+
+	mutex_lock(&freq_switch_mutex);
+	cpufreq_notify_transition(policy, &freqs, CPUFREQ_PRECHANGE);
+
+	pr_debug("setting frequency for cpu %d to %d kHz index %d pstate %d",
+		 policy->cpu,
+		 powernv_freqs[new_index].frequency,
+		 new_index,
+		 powernv_pstate_ids[new_index]);
+
+	rc = powernv_set_freq(policy->cpus, new_index);
+
+	cpufreq_notify_transition(policy, &freqs, CPUFREQ_POSTCHANGE);
+	mutex_unlock(&freq_switch_mutex);
+
+	return rc;
+}
+
+static struct cpufreq_driver powernv_cpufreq_driver = {
+	.verify		= powernv_cpufreq_verify,
+	.target		= powernv_cpufreq_target,
+	.init		= powernv_cpufreq_cpu_init,
+	.exit		= powernv_cpufreq_cpu_exit,
+	.name		= "powernv-cpufreq",
+	.flags		= CPUFREQ_CONST_LOOPS,
+	.attr		= powernv_cpu_freq_attr,
+};
+
+static int __init powernv_cpufreq_init(void)
+{
+	int rc = 0;
+
+	/* Discover pstates from device tree and init */
+
+	rc = init_powernv_pstates();
+
+	if (rc) {
+		pr_info("powernv-cpufreq disabled\n");
+		return rc;
+	}
+
+	rc = cpufreq_register_driver(&powernv_cpufreq_driver);
+	return rc;
+}
+
+static void __exit powernv_cpufreq_exit(void)
+{
+	cpufreq_unregister_driver(&powernv_cpufreq_driver);
+}
+
+module_init(powernv_cpufreq_init);
+module_exit(powernv_cpufreq_exit);
+
+MODULE_LICENSE("GPL");
+MODULE_AUTHOR("Vaidyanathan Srinivasan <svaidy at linux.vnet.ibm.com>");
-- 
1.8.3.1

^ permalink raw reply related	[flat|nested] 80+ messages in thread

* [PATCH v3 2/5] powernv,cpufreq:Add per-core locking to serialize frequency transitions
  2014-03-20 12:10 ` Gautham R. Shenoy
@ 2014-03-20 12:10   ` Gautham R. Shenoy
  -1 siblings, 0 replies; 80+ messages in thread
From: Gautham R. Shenoy @ 2014-03-20 12:10 UTC (permalink / raw)
  To: linuxppc-dev, linux-pm; +Cc: benh, svaidy, Srivatsa S. Bhat, Gautham R. Shenoy

From: "Srivatsa S. Bhat" <srivatsa.bhat@linux.vnet.ibm.com>

On POWER systems, the CPU frequency is controlled at a core-level and
hence we need to serialize so that only one of the threads in the core
switches the core's frequency at a time.

Using a global mutex lock would needlessly serialize _all_ frequency
transitions in the system (across all cores). So introduce per-core
locking to enable finer-grained synchronization and thereby enhance
the speed and responsiveness of the cpufreq driver to varying workload
demands.

The design of per-core locking is very simple and straight-forward: we
first define a Per-CPU lock and use the ones that belongs to the first
thread sibling of the core.

cpu_first_thread_sibling() macro is used to find the *common* lock for
all thread siblings belonging to a core.

Signed-off-by: Srivatsa S. Bhat <srivatsa.bhat@linux.vnet.ibm.com>
Signed-off-by: Vaidyanathan Srinivasan <svaidy@linux.vnet.ibm.com>
Signed-off-by: Gautham R. Shenoy <ego@linux.vnet.ibm.com>
---
 drivers/cpufreq/powernv-cpufreq.c | 21 ++++++++++++++++-----
 1 file changed, 16 insertions(+), 5 deletions(-)

diff --git a/drivers/cpufreq/powernv-cpufreq.c b/drivers/cpufreq/powernv-cpufreq.c
index ab1551f..66dae0d 100644
--- a/drivers/cpufreq/powernv-cpufreq.c
+++ b/drivers/cpufreq/powernv-cpufreq.c
@@ -24,8 +24,15 @@
 #include <linux/of.h>
 #include <asm/cputhreads.h>
 
-/* FIXME: Make this per-core */
-static DEFINE_MUTEX(freq_switch_mutex);
+/* Per-Core locking for frequency transitions */
+static DEFINE_PER_CPU(struct mutex, freq_switch_lock);
+
+#define lock_core_freq(cpu)				\
+			mutex_lock(&per_cpu(freq_switch_lock,\
+				cpu_first_thread_sibling(cpu)));
+#define unlock_core_freq(cpu)				\
+			mutex_unlock(&per_cpu(freq_switch_lock,\
+				cpu_first_thread_sibling(cpu)));
 
 #define POWERNV_MAX_PSTATES	256
 
@@ -221,7 +228,7 @@ static int powernv_cpufreq_target(struct cpufreq_policy *policy,
 	freqs.new = powernv_freqs[new_index].frequency;
 	freqs.cpu = policy->cpu;
 
-	mutex_lock(&freq_switch_mutex);
+	lock_core_freq(policy->cpu);
 	cpufreq_notify_transition(policy, &freqs, CPUFREQ_PRECHANGE);
 
 	pr_debug("setting frequency for cpu %d to %d kHz index %d pstate %d",
@@ -233,7 +240,7 @@ static int powernv_cpufreq_target(struct cpufreq_policy *policy,
 	rc = powernv_set_freq(policy->cpus, new_index);
 
 	cpufreq_notify_transition(policy, &freqs, CPUFREQ_POSTCHANGE);
-	mutex_unlock(&freq_switch_mutex);
+	unlock_core_freq(policy->cpu);
 
 	return rc;
 }
@@ -250,7 +257,7 @@ static struct cpufreq_driver powernv_cpufreq_driver = {
 
 static int __init powernv_cpufreq_init(void)
 {
-	int rc = 0;
+	int cpu, rc = 0;
 
 	/* Discover pstates from device tree and init */
 
@@ -260,6 +267,10 @@ static int __init powernv_cpufreq_init(void)
 		pr_info("powernv-cpufreq disabled\n");
 		return rc;
 	}
+	/* Init per-core mutex */
+	for_each_possible_cpu(cpu) {
+		mutex_init(&per_cpu(freq_switch_lock, cpu));
+	}
 
 	rc = cpufreq_register_driver(&powernv_cpufreq_driver);
 	return rc;
-- 
1.8.3.1


^ permalink raw reply related	[flat|nested] 80+ messages in thread

* [PATCH v3 2/5] powernv, cpufreq:Add per-core locking to serialize frequency transitions
@ 2014-03-20 12:10   ` Gautham R. Shenoy
  0 siblings, 0 replies; 80+ messages in thread
From: Gautham R. Shenoy @ 2014-03-20 12:10 UTC (permalink / raw)
  To: linuxppc-dev, linux-pm; +Cc: Srivatsa S. Bhat, Gautham R. Shenoy

From: "Srivatsa S. Bhat" <srivatsa.bhat@linux.vnet.ibm.com>

On POWER systems, the CPU frequency is controlled at a core-level and
hence we need to serialize so that only one of the threads in the core
switches the core's frequency at a time.

Using a global mutex lock would needlessly serialize _all_ frequency
transitions in the system (across all cores). So introduce per-core
locking to enable finer-grained synchronization and thereby enhance
the speed and responsiveness of the cpufreq driver to varying workload
demands.

The design of per-core locking is very simple and straight-forward: we
first define a Per-CPU lock and use the ones that belongs to the first
thread sibling of the core.

cpu_first_thread_sibling() macro is used to find the *common* lock for
all thread siblings belonging to a core.

Signed-off-by: Srivatsa S. Bhat <srivatsa.bhat@linux.vnet.ibm.com>
Signed-off-by: Vaidyanathan Srinivasan <svaidy@linux.vnet.ibm.com>
Signed-off-by: Gautham R. Shenoy <ego@linux.vnet.ibm.com>
---
 drivers/cpufreq/powernv-cpufreq.c | 21 ++++++++++++++++-----
 1 file changed, 16 insertions(+), 5 deletions(-)

diff --git a/drivers/cpufreq/powernv-cpufreq.c b/drivers/cpufreq/powernv-cpufreq.c
index ab1551f..66dae0d 100644
--- a/drivers/cpufreq/powernv-cpufreq.c
+++ b/drivers/cpufreq/powernv-cpufreq.c
@@ -24,8 +24,15 @@
 #include <linux/of.h>
 #include <asm/cputhreads.h>
 
-/* FIXME: Make this per-core */
-static DEFINE_MUTEX(freq_switch_mutex);
+/* Per-Core locking for frequency transitions */
+static DEFINE_PER_CPU(struct mutex, freq_switch_lock);
+
+#define lock_core_freq(cpu)				\
+			mutex_lock(&per_cpu(freq_switch_lock,\
+				cpu_first_thread_sibling(cpu)));
+#define unlock_core_freq(cpu)				\
+			mutex_unlock(&per_cpu(freq_switch_lock,\
+				cpu_first_thread_sibling(cpu)));
 
 #define POWERNV_MAX_PSTATES	256
 
@@ -221,7 +228,7 @@ static int powernv_cpufreq_target(struct cpufreq_policy *policy,
 	freqs.new = powernv_freqs[new_index].frequency;
 	freqs.cpu = policy->cpu;
 
-	mutex_lock(&freq_switch_mutex);
+	lock_core_freq(policy->cpu);
 	cpufreq_notify_transition(policy, &freqs, CPUFREQ_PRECHANGE);
 
 	pr_debug("setting frequency for cpu %d to %d kHz index %d pstate %d",
@@ -233,7 +240,7 @@ static int powernv_cpufreq_target(struct cpufreq_policy *policy,
 	rc = powernv_set_freq(policy->cpus, new_index);
 
 	cpufreq_notify_transition(policy, &freqs, CPUFREQ_POSTCHANGE);
-	mutex_unlock(&freq_switch_mutex);
+	unlock_core_freq(policy->cpu);
 
 	return rc;
 }
@@ -250,7 +257,7 @@ static struct cpufreq_driver powernv_cpufreq_driver = {
 
 static int __init powernv_cpufreq_init(void)
 {
-	int rc = 0;
+	int cpu, rc = 0;
 
 	/* Discover pstates from device tree and init */
 
@@ -260,6 +267,10 @@ static int __init powernv_cpufreq_init(void)
 		pr_info("powernv-cpufreq disabled\n");
 		return rc;
 	}
+	/* Init per-core mutex */
+	for_each_possible_cpu(cpu) {
+		mutex_init(&per_cpu(freq_switch_lock, cpu));
+	}
 
 	rc = cpufreq_register_driver(&powernv_cpufreq_driver);
 	return rc;
-- 
1.8.3.1

^ permalink raw reply related	[flat|nested] 80+ messages in thread

* [PATCH v3 3/5] powernv:cpufreq: Create pstate_id_to_freq() helper
  2014-03-20 12:10 ` Gautham R. Shenoy
@ 2014-03-20 12:10   ` Gautham R. Shenoy
  -1 siblings, 0 replies; 80+ messages in thread
From: Gautham R. Shenoy @ 2014-03-20 12:10 UTC (permalink / raw)
  To: linuxppc-dev, linux-pm; +Cc: benh, svaidy, Gautham R. Shenoy

From: "Gautham R. Shenoy" <ego@linux.vnet.ibm.com>

Create a helper routine that can return the cpu-frequency for the
corresponding pstate_id.

Also, cache the values of the pstate_max, pstate_min and
pstate_nominal and nr_pstates in a static structure so that they can
be reused in the future to perform any validations.

Signed-off-by: Gautham R. Shenoy <ego@linux.vnet.ibm.com>
---
 drivers/cpufreq/powernv-cpufreq.c | 27 +++++++++++++++++++++++++++
 1 file changed, 27 insertions(+)

diff --git a/drivers/cpufreq/powernv-cpufreq.c b/drivers/cpufreq/powernv-cpufreq.c
index 66dae0d..e7b0292 100644
--- a/drivers/cpufreq/powernv-cpufreq.c
+++ b/drivers/cpufreq/powernv-cpufreq.c
@@ -39,6 +39,14 @@ static DEFINE_PER_CPU(struct mutex, freq_switch_lock);
 static struct cpufreq_frequency_table powernv_freqs[POWERNV_MAX_PSTATES+1];
 static int powernv_pstate_ids[POWERNV_MAX_PSTATES+1];
 
+struct powernv_pstate_info {
+	int pstate_min_id;
+	int pstate_max_id;
+	int pstate_nominal_id;
+	int nr_pstates;
+};
+static struct powernv_pstate_info powernv_pstate_info;
+
 /*
  * Initialize the freq table based on data obtained
  * from the firmware passed via device-tree
@@ -112,9 +120,28 @@ static int init_powernv_pstates(void)
 	for (i = 0; powernv_freqs[i].frequency != CPUFREQ_TABLE_END; i++)
 		pr_debug("%d: %d\n", i, powernv_freqs[i].frequency);
 
+	powernv_pstate_info.pstate_min_id = pstate_min;
+	powernv_pstate_info.pstate_max_id = pstate_max;
+	powernv_pstate_info.pstate_nominal_id = pstate_nominal;
+	powernv_pstate_info.nr_pstates = nr_pstates;
+
 	return 0;
 }
 
+/**
+ * Returns the cpu frequency corresponding to the pstate_id.
+ */
+static unsigned int pstate_id_to_freq(int pstate_id)
+{
+	int i;
+
+	i = powernv_pstate_info.pstate_max_id - pstate_id;
+
+	BUG_ON(i >= powernv_pstate_info.nr_pstates || i < 0);
+	WARN_ON(powernv_pstate_ids[i] != pstate_id);
+	return powernv_freqs[i].frequency;
+}
+
 static struct freq_attr *powernv_cpu_freq_attr[] = {
 	&cpufreq_freq_attr_scaling_available_freqs,
 	NULL,
-- 
1.8.3.1


^ permalink raw reply related	[flat|nested] 80+ messages in thread

* [PATCH v3 3/5] powernv:cpufreq: Create pstate_id_to_freq() helper
@ 2014-03-20 12:10   ` Gautham R. Shenoy
  0 siblings, 0 replies; 80+ messages in thread
From: Gautham R. Shenoy @ 2014-03-20 12:10 UTC (permalink / raw)
  To: linuxppc-dev, linux-pm; +Cc: Gautham R. Shenoy

From: "Gautham R. Shenoy" <ego@linux.vnet.ibm.com>

Create a helper routine that can return the cpu-frequency for the
corresponding pstate_id.

Also, cache the values of the pstate_max, pstate_min and
pstate_nominal and nr_pstates in a static structure so that they can
be reused in the future to perform any validations.

Signed-off-by: Gautham R. Shenoy <ego@linux.vnet.ibm.com>
---
 drivers/cpufreq/powernv-cpufreq.c | 27 +++++++++++++++++++++++++++
 1 file changed, 27 insertions(+)

diff --git a/drivers/cpufreq/powernv-cpufreq.c b/drivers/cpufreq/powernv-cpufreq.c
index 66dae0d..e7b0292 100644
--- a/drivers/cpufreq/powernv-cpufreq.c
+++ b/drivers/cpufreq/powernv-cpufreq.c
@@ -39,6 +39,14 @@ static DEFINE_PER_CPU(struct mutex, freq_switch_lock);
 static struct cpufreq_frequency_table powernv_freqs[POWERNV_MAX_PSTATES+1];
 static int powernv_pstate_ids[POWERNV_MAX_PSTATES+1];
 
+struct powernv_pstate_info {
+	int pstate_min_id;
+	int pstate_max_id;
+	int pstate_nominal_id;
+	int nr_pstates;
+};
+static struct powernv_pstate_info powernv_pstate_info;
+
 /*
  * Initialize the freq table based on data obtained
  * from the firmware passed via device-tree
@@ -112,9 +120,28 @@ static int init_powernv_pstates(void)
 	for (i = 0; powernv_freqs[i].frequency != CPUFREQ_TABLE_END; i++)
 		pr_debug("%d: %d\n", i, powernv_freqs[i].frequency);
 
+	powernv_pstate_info.pstate_min_id = pstate_min;
+	powernv_pstate_info.pstate_max_id = pstate_max;
+	powernv_pstate_info.pstate_nominal_id = pstate_nominal;
+	powernv_pstate_info.nr_pstates = nr_pstates;
+
 	return 0;
 }
 
+/**
+ * Returns the cpu frequency corresponding to the pstate_id.
+ */
+static unsigned int pstate_id_to_freq(int pstate_id)
+{
+	int i;
+
+	i = powernv_pstate_info.pstate_max_id - pstate_id;
+
+	BUG_ON(i >= powernv_pstate_info.nr_pstates || i < 0);
+	WARN_ON(powernv_pstate_ids[i] != pstate_id);
+	return powernv_freqs[i].frequency;
+}
+
 static struct freq_attr *powernv_cpu_freq_attr[] = {
 	&cpufreq_freq_attr_scaling_available_freqs,
 	NULL,
-- 
1.8.3.1

^ permalink raw reply related	[flat|nested] 80+ messages in thread

* [PATCH v3 4/5] powernv:cpufreq: Export nominal frequency via sysfs.
  2014-03-20 12:10 ` Gautham R. Shenoy
@ 2014-03-20 12:10   ` Gautham R. Shenoy
  -1 siblings, 0 replies; 80+ messages in thread
From: Gautham R. Shenoy @ 2014-03-20 12:10 UTC (permalink / raw)
  To: linuxppc-dev, linux-pm; +Cc: benh, svaidy, Gautham R. Shenoy

From: "Gautham R. Shenoy" <ego@linux.vnet.ibm.com>

Create a driver attribute named cpuinfo_nominal_freq which
creates a sysfs read-only file named cpuinfo_nominal_freq. Export
the frequency corresponding to the nominal_pstate through this
interface.

Nominal frequency is the highest non-turbo frequency for the
platform.  This is generally used for setting governor policies from
user space for optimal energy efficiency.

Signed-off-by: Gautham R. Shenoy <ego@linux.vnet.ibm.com>
---
 drivers/cpufreq/powernv-cpufreq.c | 22 ++++++++++++++++++++++
 1 file changed, 22 insertions(+)

diff --git a/drivers/cpufreq/powernv-cpufreq.c b/drivers/cpufreq/powernv-cpufreq.c
index e7b0292..46bee8a 100644
--- a/drivers/cpufreq/powernv-cpufreq.c
+++ b/drivers/cpufreq/powernv-cpufreq.c
@@ -142,8 +142,30 @@ static unsigned int pstate_id_to_freq(int pstate_id)
 	return powernv_freqs[i].frequency;
 }
 
+/**
+ * show_cpuinfo_nominal_freq - Show the nominal CPU frequency as indicated by
+ * the firmware
+ */
+static ssize_t show_cpuinfo_nominal_freq(struct cpufreq_policy *policy,
+					char *buf)
+{
+	int nominal_freq;
+	nominal_freq = pstate_id_to_freq(powernv_pstate_info.pstate_nominal_id);
+	return sprintf(buf, "%u\n", nominal_freq);
+}
+
+
+struct freq_attr cpufreq_freq_attr_cpuinfo_nominal_freq = {
+	.attr = { .name = "cpuinfo_nominal_freq",
+		  .mode = 0444,
+		},
+	.show = show_cpuinfo_nominal_freq,
+};
+
+
 static struct freq_attr *powernv_cpu_freq_attr[] = {
 	&cpufreq_freq_attr_scaling_available_freqs,
+	&cpufreq_freq_attr_cpuinfo_nominal_freq,
 	NULL,
 };
 
-- 
1.8.3.1


^ permalink raw reply related	[flat|nested] 80+ messages in thread

* [PATCH v3 4/5] powernv:cpufreq: Export nominal frequency via sysfs.
@ 2014-03-20 12:10   ` Gautham R. Shenoy
  0 siblings, 0 replies; 80+ messages in thread
From: Gautham R. Shenoy @ 2014-03-20 12:10 UTC (permalink / raw)
  To: linuxppc-dev, linux-pm; +Cc: Gautham R. Shenoy

From: "Gautham R. Shenoy" <ego@linux.vnet.ibm.com>

Create a driver attribute named cpuinfo_nominal_freq which
creates a sysfs read-only file named cpuinfo_nominal_freq. Export
the frequency corresponding to the nominal_pstate through this
interface.

Nominal frequency is the highest non-turbo frequency for the
platform.  This is generally used for setting governor policies from
user space for optimal energy efficiency.

Signed-off-by: Gautham R. Shenoy <ego@linux.vnet.ibm.com>
---
 drivers/cpufreq/powernv-cpufreq.c | 22 ++++++++++++++++++++++
 1 file changed, 22 insertions(+)

diff --git a/drivers/cpufreq/powernv-cpufreq.c b/drivers/cpufreq/powernv-cpufreq.c
index e7b0292..46bee8a 100644
--- a/drivers/cpufreq/powernv-cpufreq.c
+++ b/drivers/cpufreq/powernv-cpufreq.c
@@ -142,8 +142,30 @@ static unsigned int pstate_id_to_freq(int pstate_id)
 	return powernv_freqs[i].frequency;
 }
 
+/**
+ * show_cpuinfo_nominal_freq - Show the nominal CPU frequency as indicated by
+ * the firmware
+ */
+static ssize_t show_cpuinfo_nominal_freq(struct cpufreq_policy *policy,
+					char *buf)
+{
+	int nominal_freq;
+	nominal_freq = pstate_id_to_freq(powernv_pstate_info.pstate_nominal_id);
+	return sprintf(buf, "%u\n", nominal_freq);
+}
+
+
+struct freq_attr cpufreq_freq_attr_cpuinfo_nominal_freq = {
+	.attr = { .name = "cpuinfo_nominal_freq",
+		  .mode = 0444,
+		},
+	.show = show_cpuinfo_nominal_freq,
+};
+
+
 static struct freq_attr *powernv_cpu_freq_attr[] = {
 	&cpufreq_freq_attr_scaling_available_freqs,
+	&cpufreq_freq_attr_cpuinfo_nominal_freq,
 	NULL,
 };
 
-- 
1.8.3.1

^ permalink raw reply related	[flat|nested] 80+ messages in thread

* [PATCH v3 5/5] powernv:cpufreq: Implement the driver->get() method
  2014-03-20 12:10 ` Gautham R. Shenoy
@ 2014-03-20 12:11   ` Gautham R. Shenoy
  -1 siblings, 0 replies; 80+ messages in thread
From: Gautham R. Shenoy @ 2014-03-20 12:11 UTC (permalink / raw)
  To: linuxppc-dev, linux-pm; +Cc: benh, svaidy, Gautham R. Shenoy

From: "Gautham R. Shenoy" <ego@linux.vnet.ibm.com>

The current frequency of a cpu is reported through the sysfs file
cpuinfo_cur_freq. This requires the driver to implement a
"->get(unsigned int cpu)" method which will return the current
operating frequency.

Implement a function named powernv_cpufreq_get() which reads the local
pstate from the PMSR and returns the corresponding frequency.

Set the powernv_cpufreq_driver.get hook to powernv_cpufreq_get().

Signed-off-by: Gautham R. Shenoy <ego@linux.vnet.ibm.com>
---
 drivers/cpufreq/powernv-cpufreq.c | 38 ++++++++++++++++++++++++++++++++++++++
 1 file changed, 38 insertions(+)

diff --git a/drivers/cpufreq/powernv-cpufreq.c b/drivers/cpufreq/powernv-cpufreq.c
index 46bee8a..ef6ed8c 100644
--- a/drivers/cpufreq/powernv-cpufreq.c
+++ b/drivers/cpufreq/powernv-cpufreq.c
@@ -206,6 +206,43 @@ static inline void set_pmspr(unsigned long sprn, unsigned long val)
 	BUG();
 }
 
+/*
+ * Computes the current frequency on this cpu
+ * and stores the result in *ret_freq.
+ */
+static void powernv_read_cpu_freq(void *ret_freq)
+{
+	unsigned long pmspr_val;
+	s8 local_pstate_id;
+	int *cur_freq, freq, pstate_id;
+
+	cur_freq = (int *)ret_freq;
+	pmspr_val = get_pmspr(SPRN_PMSR);
+
+	/* The local pstate id corresponds bits 48..55 in the PMSR.
+         * Note: Watch out for the sign! */
+	local_pstate_id = (pmspr_val >> 48) & 0xFF;
+	pstate_id = local_pstate_id;
+
+	freq = pstate_id_to_freq(pstate_id);
+	pr_debug("cpu %d pmsr %lx pstate_id %d frequency %d \n",
+		smp_processor_id(), pmspr_val, pstate_id, freq);
+	*cur_freq = freq;
+}
+
+/*
+ * Returns the cpu frequency as reported by the firmware for 'cpu'.
+ * This value is reported through the sysfs file cpuinfo_cur_freq.
+ */
+unsigned int powernv_cpufreq_get(unsigned int cpu)
+{
+	int ret_freq;
+
+	smp_call_function_any(cpu_sibling_mask(cpu), powernv_read_cpu_freq,
+			&ret_freq, 1);
+	return ret_freq;
+}
+
 static void set_pstate(void *pstate)
 {
 	unsigned long val;
@@ -297,6 +334,7 @@ static int powernv_cpufreq_target(struct cpufreq_policy *policy,
 static struct cpufreq_driver powernv_cpufreq_driver = {
 	.verify		= powernv_cpufreq_verify,
 	.target		= powernv_cpufreq_target,
+	.get		= powernv_cpufreq_get,
 	.init		= powernv_cpufreq_cpu_init,
 	.exit		= powernv_cpufreq_cpu_exit,
 	.name		= "powernv-cpufreq",
-- 
1.8.3.1


^ permalink raw reply related	[flat|nested] 80+ messages in thread

* [PATCH v3 5/5] powernv:cpufreq: Implement the driver->get() method
@ 2014-03-20 12:11   ` Gautham R. Shenoy
  0 siblings, 0 replies; 80+ messages in thread
From: Gautham R. Shenoy @ 2014-03-20 12:11 UTC (permalink / raw)
  To: linuxppc-dev, linux-pm; +Cc: Gautham R. Shenoy

From: "Gautham R. Shenoy" <ego@linux.vnet.ibm.com>

The current frequency of a cpu is reported through the sysfs file
cpuinfo_cur_freq. This requires the driver to implement a
"->get(unsigned int cpu)" method which will return the current
operating frequency.

Implement a function named powernv_cpufreq_get() which reads the local
pstate from the PMSR and returns the corresponding frequency.

Set the powernv_cpufreq_driver.get hook to powernv_cpufreq_get().

Signed-off-by: Gautham R. Shenoy <ego@linux.vnet.ibm.com>
---
 drivers/cpufreq/powernv-cpufreq.c | 38 ++++++++++++++++++++++++++++++++++++++
 1 file changed, 38 insertions(+)

diff --git a/drivers/cpufreq/powernv-cpufreq.c b/drivers/cpufreq/powernv-cpufreq.c
index 46bee8a..ef6ed8c 100644
--- a/drivers/cpufreq/powernv-cpufreq.c
+++ b/drivers/cpufreq/powernv-cpufreq.c
@@ -206,6 +206,43 @@ static inline void set_pmspr(unsigned long sprn, unsigned long val)
 	BUG();
 }
 
+/*
+ * Computes the current frequency on this cpu
+ * and stores the result in *ret_freq.
+ */
+static void powernv_read_cpu_freq(void *ret_freq)
+{
+	unsigned long pmspr_val;
+	s8 local_pstate_id;
+	int *cur_freq, freq, pstate_id;
+
+	cur_freq = (int *)ret_freq;
+	pmspr_val = get_pmspr(SPRN_PMSR);
+
+	/* The local pstate id corresponds bits 48..55 in the PMSR.
+         * Note: Watch out for the sign! */
+	local_pstate_id = (pmspr_val >> 48) & 0xFF;
+	pstate_id = local_pstate_id;
+
+	freq = pstate_id_to_freq(pstate_id);
+	pr_debug("cpu %d pmsr %lx pstate_id %d frequency %d \n",
+		smp_processor_id(), pmspr_val, pstate_id, freq);
+	*cur_freq = freq;
+}
+
+/*
+ * Returns the cpu frequency as reported by the firmware for 'cpu'.
+ * This value is reported through the sysfs file cpuinfo_cur_freq.
+ */
+unsigned int powernv_cpufreq_get(unsigned int cpu)
+{
+	int ret_freq;
+
+	smp_call_function_any(cpu_sibling_mask(cpu), powernv_read_cpu_freq,
+			&ret_freq, 1);
+	return ret_freq;
+}
+
 static void set_pstate(void *pstate)
 {
 	unsigned long val;
@@ -297,6 +334,7 @@ static int powernv_cpufreq_target(struct cpufreq_policy *policy,
 static struct cpufreq_driver powernv_cpufreq_driver = {
 	.verify		= powernv_cpufreq_verify,
 	.target		= powernv_cpufreq_target,
+	.get		= powernv_cpufreq_get,
 	.init		= powernv_cpufreq_cpu_init,
 	.exit		= powernv_cpufreq_cpu_exit,
 	.name		= "powernv-cpufreq",
-- 
1.8.3.1

^ permalink raw reply related	[flat|nested] 80+ messages in thread

* Re: [PATCH v3 2/5] powernv,cpufreq:Add per-core locking to serialize frequency transitions
  2014-03-20 12:10   ` [PATCH v3 2/5] powernv, cpufreq:Add " Gautham R. Shenoy
@ 2014-03-21  6:24     ` Gautham R Shenoy
  -1 siblings, 0 replies; 80+ messages in thread
From: Gautham R Shenoy @ 2014-03-21  6:24 UTC (permalink / raw)
  To: Gautham R. Shenoy
  Cc: linuxppc-dev, linux-pm, benh, svaidy, Srivatsa S. Bhat, Preeti U Murthy

On Thu, Mar 20, 2014 at 05:40:57PM +0530, Gautham R. Shenoy wrote:
> From: "Srivatsa S. Bhat" <srivatsa.bhat@linux.vnet.ibm.com>
> 
> On POWER systems, the CPU frequency is controlled at a core-level and
> hence we need to serialize so that only one of the threads in the core
> switches the core's frequency at a time.
> 
> Using a global mutex lock would needlessly serialize _all_ frequency
> transitions in the system (across all cores). So introduce per-core
> locking to enable finer-grained synchronization and thereby enhance
> the speed and responsiveness of the cpufreq driver to varying workload
> demands.
> 
> The design of per-core locking is very simple and straight-forward: we
> first define a Per-CPU lock and use the ones that belongs to the first
> thread sibling of the core.
> 
> cpu_first_thread_sibling() macro is used to find the *common* lock for
> all thread siblings belonging to a core.
> 

Forgot to add the following line.

Reviewed-by: Preeti U Murthy <preeti@linux.vnet.ibm.com>
> Signed-off-by: Srivatsa S. Bhat <srivatsa.bhat@linux.vnet.ibm.com>
> Signed-off-by: Vaidyanathan Srinivasan <svaidy@linux.vnet.ibm.com>
> Signed-off-by: Gautham R. Shenoy <ego@linux.vnet.ibm.com>
> ---
>  drivers/cpufreq/powernv-cpufreq.c | 21 ++++++++++++++++-----
>  1 file changed, 16 insertions(+), 5 deletions(-)
> 
> diff --git a/drivers/cpufreq/powernv-cpufreq.c b/drivers/cpufreq/powernv-cpufreq.c
> index ab1551f..66dae0d 100644
> --- a/drivers/cpufreq/powernv-cpufreq.c
> +++ b/drivers/cpufreq/powernv-cpufreq.c
> @@ -24,8 +24,15 @@
>  #include <linux/of.h>
>  #include <asm/cputhreads.h>
> 
> -/* FIXME: Make this per-core */
> -static DEFINE_MUTEX(freq_switch_mutex);
> +/* Per-Core locking for frequency transitions */
> +static DEFINE_PER_CPU(struct mutex, freq_switch_lock);
> +
> +#define lock_core_freq(cpu)				\
> +			mutex_lock(&per_cpu(freq_switch_lock,\
> +				cpu_first_thread_sibling(cpu)));
> +#define unlock_core_freq(cpu)				\
> +			mutex_unlock(&per_cpu(freq_switch_lock,\
> +				cpu_first_thread_sibling(cpu)));
> 
>  #define POWERNV_MAX_PSTATES	256
> 
> @@ -221,7 +228,7 @@ static int powernv_cpufreq_target(struct cpufreq_policy *policy,
>  	freqs.new = powernv_freqs[new_index].frequency;
>  	freqs.cpu = policy->cpu;
> 
> -	mutex_lock(&freq_switch_mutex);
> +	lock_core_freq(policy->cpu);
>  	cpufreq_notify_transition(policy, &freqs, CPUFREQ_PRECHANGE);
> 
>  	pr_debug("setting frequency for cpu %d to %d kHz index %d pstate %d",
> @@ -233,7 +240,7 @@ static int powernv_cpufreq_target(struct cpufreq_policy *policy,
>  	rc = powernv_set_freq(policy->cpus, new_index);
> 
>  	cpufreq_notify_transition(policy, &freqs, CPUFREQ_POSTCHANGE);
> -	mutex_unlock(&freq_switch_mutex);
> +	unlock_core_freq(policy->cpu);
> 
>  	return rc;
>  }
> @@ -250,7 +257,7 @@ static struct cpufreq_driver powernv_cpufreq_driver = {
> 
>  static int __init powernv_cpufreq_init(void)
>  {
> -	int rc = 0;
> +	int cpu, rc = 0;
> 
>  	/* Discover pstates from device tree and init */
> 
> @@ -260,6 +267,10 @@ static int __init powernv_cpufreq_init(void)
>  		pr_info("powernv-cpufreq disabled\n");
>  		return rc;
>  	}
> +	/* Init per-core mutex */
> +	for_each_possible_cpu(cpu) {
> +		mutex_init(&per_cpu(freq_switch_lock, cpu));
> +	}
> 
>  	rc = cpufreq_register_driver(&powernv_cpufreq_driver);
>  	return rc;
> -- 
> 1.8.3.1
> 


^ permalink raw reply	[flat|nested] 80+ messages in thread

* Re: [PATCH v3 2/5] powernv, cpufreq:Add per-core locking to serialize frequency transitions
@ 2014-03-21  6:24     ` Gautham R Shenoy
  0 siblings, 0 replies; 80+ messages in thread
From: Gautham R Shenoy @ 2014-03-21  6:24 UTC (permalink / raw)
  To: Gautham R. Shenoy
  Cc: linux-pm, linuxppc-dev, Srivatsa S. Bhat, Preeti U Murthy

On Thu, Mar 20, 2014 at 05:40:57PM +0530, Gautham R. Shenoy wrote:
> From: "Srivatsa S. Bhat" <srivatsa.bhat@linux.vnet.ibm.com>
> 
> On POWER systems, the CPU frequency is controlled at a core-level and
> hence we need to serialize so that only one of the threads in the core
> switches the core's frequency at a time.
> 
> Using a global mutex lock would needlessly serialize _all_ frequency
> transitions in the system (across all cores). So introduce per-core
> locking to enable finer-grained synchronization and thereby enhance
> the speed and responsiveness of the cpufreq driver to varying workload
> demands.
> 
> The design of per-core locking is very simple and straight-forward: we
> first define a Per-CPU lock and use the ones that belongs to the first
> thread sibling of the core.
> 
> cpu_first_thread_sibling() macro is used to find the *common* lock for
> all thread siblings belonging to a core.
> 

Forgot to add the following line.

Reviewed-by: Preeti U Murthy <preeti@linux.vnet.ibm.com>
> Signed-off-by: Srivatsa S. Bhat <srivatsa.bhat@linux.vnet.ibm.com>
> Signed-off-by: Vaidyanathan Srinivasan <svaidy@linux.vnet.ibm.com>
> Signed-off-by: Gautham R. Shenoy <ego@linux.vnet.ibm.com>
> ---
>  drivers/cpufreq/powernv-cpufreq.c | 21 ++++++++++++++++-----
>  1 file changed, 16 insertions(+), 5 deletions(-)
> 
> diff --git a/drivers/cpufreq/powernv-cpufreq.c b/drivers/cpufreq/powernv-cpufreq.c
> index ab1551f..66dae0d 100644
> --- a/drivers/cpufreq/powernv-cpufreq.c
> +++ b/drivers/cpufreq/powernv-cpufreq.c
> @@ -24,8 +24,15 @@
>  #include <linux/of.h>
>  #include <asm/cputhreads.h>
> 
> -/* FIXME: Make this per-core */
> -static DEFINE_MUTEX(freq_switch_mutex);
> +/* Per-Core locking for frequency transitions */
> +static DEFINE_PER_CPU(struct mutex, freq_switch_lock);
> +
> +#define lock_core_freq(cpu)				\
> +			mutex_lock(&per_cpu(freq_switch_lock,\
> +				cpu_first_thread_sibling(cpu)));
> +#define unlock_core_freq(cpu)				\
> +			mutex_unlock(&per_cpu(freq_switch_lock,\
> +				cpu_first_thread_sibling(cpu)));
> 
>  #define POWERNV_MAX_PSTATES	256
> 
> @@ -221,7 +228,7 @@ static int powernv_cpufreq_target(struct cpufreq_policy *policy,
>  	freqs.new = powernv_freqs[new_index].frequency;
>  	freqs.cpu = policy->cpu;
> 
> -	mutex_lock(&freq_switch_mutex);
> +	lock_core_freq(policy->cpu);
>  	cpufreq_notify_transition(policy, &freqs, CPUFREQ_PRECHANGE);
> 
>  	pr_debug("setting frequency for cpu %d to %d kHz index %d pstate %d",
> @@ -233,7 +240,7 @@ static int powernv_cpufreq_target(struct cpufreq_policy *policy,
>  	rc = powernv_set_freq(policy->cpus, new_index);
> 
>  	cpufreq_notify_transition(policy, &freqs, CPUFREQ_POSTCHANGE);
> -	mutex_unlock(&freq_switch_mutex);
> +	unlock_core_freq(policy->cpu);
> 
>  	return rc;
>  }
> @@ -250,7 +257,7 @@ static struct cpufreq_driver powernv_cpufreq_driver = {
> 
>  static int __init powernv_cpufreq_init(void)
>  {
> -	int rc = 0;
> +	int cpu, rc = 0;
> 
>  	/* Discover pstates from device tree and init */
> 
> @@ -260,6 +267,10 @@ static int __init powernv_cpufreq_init(void)
>  		pr_info("powernv-cpufreq disabled\n");
>  		return rc;
>  	}
> +	/* Init per-core mutex */
> +	for_each_possible_cpu(cpu) {
> +		mutex_init(&per_cpu(freq_switch_lock, cpu));
> +	}
> 
>  	rc = cpufreq_register_driver(&powernv_cpufreq_driver);
>  	return rc;
> -- 
> 1.8.3.1
> 

^ permalink raw reply	[flat|nested] 80+ messages in thread

* Re: [PATCH v3 3/5] powernv:cpufreq: Create pstate_id_to_freq() helper
  2014-03-20 12:10   ` Gautham R. Shenoy
@ 2014-03-21  6:24     ` Gautham R Shenoy
  -1 siblings, 0 replies; 80+ messages in thread
From: Gautham R Shenoy @ 2014-03-21  6:24 UTC (permalink / raw)
  To: Gautham R. Shenoy; +Cc: linuxppc-dev, linux-pm, benh, svaidy, Preeti U Murthy

On Thu, Mar 20, 2014 at 05:40:58PM +0530, Gautham R. Shenoy wrote:
> From: "Gautham R. Shenoy" <ego@linux.vnet.ibm.com>
> 
> Create a helper routine that can return the cpu-frequency for the
> corresponding pstate_id.
> 
> Also, cache the values of the pstate_max, pstate_min and
> pstate_nominal and nr_pstates in a static structure so that they can
> be reused in the future to perform any validations.
> 

Forgot to add the following line:

  Reviewed-by: Preeti U Murthy <preeti@linux.vnet.ibm.com>
> Signed-off-by: Gautham R. Shenoy <ego@linux.vnet.ibm.com>
> ---
>  drivers/cpufreq/powernv-cpufreq.c | 27 +++++++++++++++++++++++++++
>  1 file changed, 27 insertions(+)
> 
> diff --git a/drivers/cpufreq/powernv-cpufreq.c b/drivers/cpufreq/powernv-cpufreq.c
> index 66dae0d..e7b0292 100644
> --- a/drivers/cpufreq/powernv-cpufreq.c
> +++ b/drivers/cpufreq/powernv-cpufreq.c
> @@ -39,6 +39,14 @@ static DEFINE_PER_CPU(struct mutex, freq_switch_lock);
>  static struct cpufreq_frequency_table powernv_freqs[POWERNV_MAX_PSTATES+1];
>  static int powernv_pstate_ids[POWERNV_MAX_PSTATES+1];
> 
> +struct powernv_pstate_info {
> +	int pstate_min_id;
> +	int pstate_max_id;
> +	int pstate_nominal_id;
> +	int nr_pstates;
> +};
> +static struct powernv_pstate_info powernv_pstate_info;
> +
>  /*
>   * Initialize the freq table based on data obtained
>   * from the firmware passed via device-tree
> @@ -112,9 +120,28 @@ static int init_powernv_pstates(void)
>  	for (i = 0; powernv_freqs[i].frequency != CPUFREQ_TABLE_END; i++)
>  		pr_debug("%d: %d\n", i, powernv_freqs[i].frequency);
> 
> +	powernv_pstate_info.pstate_min_id = pstate_min;
> +	powernv_pstate_info.pstate_max_id = pstate_max;
> +	powernv_pstate_info.pstate_nominal_id = pstate_nominal;
> +	powernv_pstate_info.nr_pstates = nr_pstates;
> +
>  	return 0;
>  }
> 
> +/**
> + * Returns the cpu frequency corresponding to the pstate_id.
> + */
> +static unsigned int pstate_id_to_freq(int pstate_id)
> +{
> +	int i;
> +
> +	i = powernv_pstate_info.pstate_max_id - pstate_id;
> +
> +	BUG_ON(i >= powernv_pstate_info.nr_pstates || i < 0);
> +	WARN_ON(powernv_pstate_ids[i] != pstate_id);
> +	return powernv_freqs[i].frequency;
> +}
> +
>  static struct freq_attr *powernv_cpu_freq_attr[] = {
>  	&cpufreq_freq_attr_scaling_available_freqs,
>  	NULL,
> -- 
> 1.8.3.1
> 


^ permalink raw reply	[flat|nested] 80+ messages in thread

* Re: [PATCH v3 3/5] powernv:cpufreq: Create pstate_id_to_freq() helper
@ 2014-03-21  6:24     ` Gautham R Shenoy
  0 siblings, 0 replies; 80+ messages in thread
From: Gautham R Shenoy @ 2014-03-21  6:24 UTC (permalink / raw)
  To: Gautham R. Shenoy; +Cc: Preeti U Murthy, linuxppc-dev, linux-pm

On Thu, Mar 20, 2014 at 05:40:58PM +0530, Gautham R. Shenoy wrote:
> From: "Gautham R. Shenoy" <ego@linux.vnet.ibm.com>
> 
> Create a helper routine that can return the cpu-frequency for the
> corresponding pstate_id.
> 
> Also, cache the values of the pstate_max, pstate_min and
> pstate_nominal and nr_pstates in a static structure so that they can
> be reused in the future to perform any validations.
> 

Forgot to add the following line:

  Reviewed-by: Preeti U Murthy <preeti@linux.vnet.ibm.com>
> Signed-off-by: Gautham R. Shenoy <ego@linux.vnet.ibm.com>
> ---
>  drivers/cpufreq/powernv-cpufreq.c | 27 +++++++++++++++++++++++++++
>  1 file changed, 27 insertions(+)
> 
> diff --git a/drivers/cpufreq/powernv-cpufreq.c b/drivers/cpufreq/powernv-cpufreq.c
> index 66dae0d..e7b0292 100644
> --- a/drivers/cpufreq/powernv-cpufreq.c
> +++ b/drivers/cpufreq/powernv-cpufreq.c
> @@ -39,6 +39,14 @@ static DEFINE_PER_CPU(struct mutex, freq_switch_lock);
>  static struct cpufreq_frequency_table powernv_freqs[POWERNV_MAX_PSTATES+1];
>  static int powernv_pstate_ids[POWERNV_MAX_PSTATES+1];
> 
> +struct powernv_pstate_info {
> +	int pstate_min_id;
> +	int pstate_max_id;
> +	int pstate_nominal_id;
> +	int nr_pstates;
> +};
> +static struct powernv_pstate_info powernv_pstate_info;
> +
>  /*
>   * Initialize the freq table based on data obtained
>   * from the firmware passed via device-tree
> @@ -112,9 +120,28 @@ static int init_powernv_pstates(void)
>  	for (i = 0; powernv_freqs[i].frequency != CPUFREQ_TABLE_END; i++)
>  		pr_debug("%d: %d\n", i, powernv_freqs[i].frequency);
> 
> +	powernv_pstate_info.pstate_min_id = pstate_min;
> +	powernv_pstate_info.pstate_max_id = pstate_max;
> +	powernv_pstate_info.pstate_nominal_id = pstate_nominal;
> +	powernv_pstate_info.nr_pstates = nr_pstates;
> +
>  	return 0;
>  }
> 
> +/**
> + * Returns the cpu frequency corresponding to the pstate_id.
> + */
> +static unsigned int pstate_id_to_freq(int pstate_id)
> +{
> +	int i;
> +
> +	i = powernv_pstate_info.pstate_max_id - pstate_id;
> +
> +	BUG_ON(i >= powernv_pstate_info.nr_pstates || i < 0);
> +	WARN_ON(powernv_pstate_ids[i] != pstate_id);
> +	return powernv_freqs[i].frequency;
> +}
> +
>  static struct freq_attr *powernv_cpu_freq_attr[] = {
>  	&cpufreq_freq_attr_scaling_available_freqs,
>  	NULL,
> -- 
> 1.8.3.1
> 

^ permalink raw reply	[flat|nested] 80+ messages in thread

* Re: [PATCH v3 5/5] powernv:cpufreq: Implement the driver->get() method
  2014-03-20 12:11   ` Gautham R. Shenoy
@ 2014-03-21  6:25     ` Gautham R Shenoy
  -1 siblings, 0 replies; 80+ messages in thread
From: Gautham R Shenoy @ 2014-03-21  6:25 UTC (permalink / raw)
  To: Gautham R. Shenoy; +Cc: linuxppc-dev, linux-pm, benh, svaidy, Preeti U Murthy

On Thu, Mar 20, 2014 at 05:41:00PM +0530, Gautham R. Shenoy wrote:
> From: "Gautham R. Shenoy" <ego@linux.vnet.ibm.com>
> 
> The current frequency of a cpu is reported through the sysfs file
> cpuinfo_cur_freq. This requires the driver to implement a
> "->get(unsigned int cpu)" method which will return the current
> operating frequency.
> 
> Implement a function named powernv_cpufreq_get() which reads the local
> pstate from the PMSR and returns the corresponding frequency.
> 
> Set the powernv_cpufreq_driver.get hook to powernv_cpufreq_get().

Forgot to add 

  Reviewed-by: Preeti U Murthy <preeti@linux.vnet.ibm.com>
> 
> Signed-off-by: Gautham R. Shenoy <ego@linux.vnet.ibm.com>
> ---
>  drivers/cpufreq/powernv-cpufreq.c | 38 ++++++++++++++++++++++++++++++++++++++
>  1 file changed, 38 insertions(+)
> 
> diff --git a/drivers/cpufreq/powernv-cpufreq.c b/drivers/cpufreq/powernv-cpufreq.c
> index 46bee8a..ef6ed8c 100644
> --- a/drivers/cpufreq/powernv-cpufreq.c
> +++ b/drivers/cpufreq/powernv-cpufreq.c
> @@ -206,6 +206,43 @@ static inline void set_pmspr(unsigned long sprn, unsigned long val)
>  	BUG();
>  }
> 
> +/*
> + * Computes the current frequency on this cpu
> + * and stores the result in *ret_freq.
> + */
> +static void powernv_read_cpu_freq(void *ret_freq)
> +{
> +	unsigned long pmspr_val;
> +	s8 local_pstate_id;
> +	int *cur_freq, freq, pstate_id;
> +
> +	cur_freq = (int *)ret_freq;
> +	pmspr_val = get_pmspr(SPRN_PMSR);
> +
> +	/* The local pstate id corresponds bits 48..55 in the PMSR.
> +         * Note: Watch out for the sign! */
> +	local_pstate_id = (pmspr_val >> 48) & 0xFF;
> +	pstate_id = local_pstate_id;
> +
> +	freq = pstate_id_to_freq(pstate_id);
> +	pr_debug("cpu %d pmsr %lx pstate_id %d frequency %d \n",
> +		smp_processor_id(), pmspr_val, pstate_id, freq);
> +	*cur_freq = freq;
> +}
> +
> +/*
> + * Returns the cpu frequency as reported by the firmware for 'cpu'.
> + * This value is reported through the sysfs file cpuinfo_cur_freq.
> + */
> +unsigned int powernv_cpufreq_get(unsigned int cpu)
> +{
> +	int ret_freq;
> +
> +	smp_call_function_any(cpu_sibling_mask(cpu), powernv_read_cpu_freq,
> +			&ret_freq, 1);
> +	return ret_freq;
> +}
> +
>  static void set_pstate(void *pstate)
>  {
>  	unsigned long val;
> @@ -297,6 +334,7 @@ static int powernv_cpufreq_target(struct cpufreq_policy *policy,
>  static struct cpufreq_driver powernv_cpufreq_driver = {
>  	.verify		= powernv_cpufreq_verify,
>  	.target		= powernv_cpufreq_target,
> +	.get		= powernv_cpufreq_get,
>  	.init		= powernv_cpufreq_cpu_init,
>  	.exit		= powernv_cpufreq_cpu_exit,
>  	.name		= "powernv-cpufreq",
> -- 
> 1.8.3.1
> 


^ permalink raw reply	[flat|nested] 80+ messages in thread

* Re: [PATCH v3 5/5] powernv:cpufreq: Implement the driver->get() method
@ 2014-03-21  6:25     ` Gautham R Shenoy
  0 siblings, 0 replies; 80+ messages in thread
From: Gautham R Shenoy @ 2014-03-21  6:25 UTC (permalink / raw)
  To: Gautham R. Shenoy; +Cc: Preeti U Murthy, linuxppc-dev, linux-pm

On Thu, Mar 20, 2014 at 05:41:00PM +0530, Gautham R. Shenoy wrote:
> From: "Gautham R. Shenoy" <ego@linux.vnet.ibm.com>
> 
> The current frequency of a cpu is reported through the sysfs file
> cpuinfo_cur_freq. This requires the driver to implement a
> "->get(unsigned int cpu)" method which will return the current
> operating frequency.
> 
> Implement a function named powernv_cpufreq_get() which reads the local
> pstate from the PMSR and returns the corresponding frequency.
> 
> Set the powernv_cpufreq_driver.get hook to powernv_cpufreq_get().

Forgot to add 

  Reviewed-by: Preeti U Murthy <preeti@linux.vnet.ibm.com>
> 
> Signed-off-by: Gautham R. Shenoy <ego@linux.vnet.ibm.com>
> ---
>  drivers/cpufreq/powernv-cpufreq.c | 38 ++++++++++++++++++++++++++++++++++++++
>  1 file changed, 38 insertions(+)
> 
> diff --git a/drivers/cpufreq/powernv-cpufreq.c b/drivers/cpufreq/powernv-cpufreq.c
> index 46bee8a..ef6ed8c 100644
> --- a/drivers/cpufreq/powernv-cpufreq.c
> +++ b/drivers/cpufreq/powernv-cpufreq.c
> @@ -206,6 +206,43 @@ static inline void set_pmspr(unsigned long sprn, unsigned long val)
>  	BUG();
>  }
> 
> +/*
> + * Computes the current frequency on this cpu
> + * and stores the result in *ret_freq.
> + */
> +static void powernv_read_cpu_freq(void *ret_freq)
> +{
> +	unsigned long pmspr_val;
> +	s8 local_pstate_id;
> +	int *cur_freq, freq, pstate_id;
> +
> +	cur_freq = (int *)ret_freq;
> +	pmspr_val = get_pmspr(SPRN_PMSR);
> +
> +	/* The local pstate id corresponds bits 48..55 in the PMSR.
> +         * Note: Watch out for the sign! */
> +	local_pstate_id = (pmspr_val >> 48) & 0xFF;
> +	pstate_id = local_pstate_id;
> +
> +	freq = pstate_id_to_freq(pstate_id);
> +	pr_debug("cpu %d pmsr %lx pstate_id %d frequency %d \n",
> +		smp_processor_id(), pmspr_val, pstate_id, freq);
> +	*cur_freq = freq;
> +}
> +
> +/*
> + * Returns the cpu frequency as reported by the firmware for 'cpu'.
> + * This value is reported through the sysfs file cpuinfo_cur_freq.
> + */
> +unsigned int powernv_cpufreq_get(unsigned int cpu)
> +{
> +	int ret_freq;
> +
> +	smp_call_function_any(cpu_sibling_mask(cpu), powernv_read_cpu_freq,
> +			&ret_freq, 1);
> +	return ret_freq;
> +}
> +
>  static void set_pstate(void *pstate)
>  {
>  	unsigned long val;
> @@ -297,6 +334,7 @@ static int powernv_cpufreq_target(struct cpufreq_policy *policy,
>  static struct cpufreq_driver powernv_cpufreq_driver = {
>  	.verify		= powernv_cpufreq_verify,
>  	.target		= powernv_cpufreq_target,
> +	.get		= powernv_cpufreq_get,
>  	.init		= powernv_cpufreq_cpu_init,
>  	.exit		= powernv_cpufreq_cpu_exit,
>  	.name		= "powernv-cpufreq",
> -- 
> 1.8.3.1
> 

^ permalink raw reply	[flat|nested] 80+ messages in thread

* Re: [PATCH v3 0/5] powernv: Enable Dynamic Frequency
  2014-03-20 12:10 ` Gautham R. Shenoy
@ 2014-03-21  7:46   ` Viresh Kumar
  -1 siblings, 0 replies; 80+ messages in thread
From: Viresh Kumar @ 2014-03-21  7:46 UTC (permalink / raw)
  To: Gautham R. Shenoy; +Cc: linuxppc-dev, Linux PM list, benh, svaidy

On Thu, Mar 20, 2014 at 5:40 PM, Gautham R. Shenoy
<ego@linux.vnet.ibm.com> wrote:
> This is the v3 of the consolidated patchset consisting
> patches for enabling cpufreq on IBM POWERNV platforms
> along with some enhancements.

This is the first time I saw them. Looks like you never Cc'd linux-pm list.
Also, would be better to keep Maintainers in --to field so that they can
review them as soon as possible. Otherwise they might stay on lists for
a long time..

^ permalink raw reply	[flat|nested] 80+ messages in thread

* Re: [PATCH v3 0/5] powernv: Enable Dynamic Frequency
@ 2014-03-21  7:46   ` Viresh Kumar
  0 siblings, 0 replies; 80+ messages in thread
From: Viresh Kumar @ 2014-03-21  7:46 UTC (permalink / raw)
  To: Gautham R. Shenoy; +Cc: linuxppc-dev, Linux PM list

On Thu, Mar 20, 2014 at 5:40 PM, Gautham R. Shenoy
<ego@linux.vnet.ibm.com> wrote:
> This is the v3 of the consolidated patchset consisting
> patches for enabling cpufreq on IBM POWERNV platforms
> along with some enhancements.

This is the first time I saw them. Looks like you never Cc'd linux-pm list.
Also, would be better to keep Maintainers in --to field so that they can
review them as soon as possible. Otherwise they might stay on lists for
a long time..

^ permalink raw reply	[flat|nested] 80+ messages in thread

* Re: [PATCH v3 0/5] powernv: Enable Dynamic Frequency
  2014-03-21  7:46   ` Viresh Kumar
@ 2014-03-21  8:19     ` Gautham R Shenoy
  -1 siblings, 0 replies; 80+ messages in thread
From: Gautham R Shenoy @ 2014-03-21  8:19 UTC (permalink / raw)
  To: Viresh Kumar; +Cc: Gautham R. Shenoy, linuxppc-dev, Linux PM list, benh, svaidy

Hi Viresh,

On Fri, Mar 21, 2014 at 01:16:03PM +0530, Viresh Kumar wrote:
> On Thu, Mar 20, 2014 at 5:40 PM, Gautham R. Shenoy
> <ego@linux.vnet.ibm.com> wrote:
> > This is the v3 of the consolidated patchset consisting
> > patches for enabling cpufreq on IBM POWERNV platforms
> > along with some enhancements.
> 
> This is the first time I saw them. Looks like you never Cc'd linux-pm list.
> Also, would be better to keep Maintainers in --to field so that they can
> review them as soon as possible. Otherwise they might stay on lists for
> a long time..

Yes, I hadn't Cc'ed linux-pm list last time around. Sorry about
that. Will keep this in mind the next time around. Also, your
suggestion regarding keeping the Maintainers in --to field is well
taken.

--
Thanks and Regards
gautham.
> 


^ permalink raw reply	[flat|nested] 80+ messages in thread

* Re: [PATCH v3 0/5] powernv: Enable Dynamic Frequency
@ 2014-03-21  8:19     ` Gautham R Shenoy
  0 siblings, 0 replies; 80+ messages in thread
From: Gautham R Shenoy @ 2014-03-21  8:19 UTC (permalink / raw)
  To: Viresh Kumar; +Cc: Gautham R. Shenoy, Linux PM list, linuxppc-dev

Hi Viresh,

On Fri, Mar 21, 2014 at 01:16:03PM +0530, Viresh Kumar wrote:
> On Thu, Mar 20, 2014 at 5:40 PM, Gautham R. Shenoy
> <ego@linux.vnet.ibm.com> wrote:
> > This is the v3 of the consolidated patchset consisting
> > patches for enabling cpufreq on IBM POWERNV platforms
> > along with some enhancements.
> 
> This is the first time I saw them. Looks like you never Cc'd linux-pm list.
> Also, would be better to keep Maintainers in --to field so that they can
> review them as soon as possible. Otherwise they might stay on lists for
> a long time..

Yes, I hadn't Cc'ed linux-pm list last time around. Sorry about
that. Will keep this in mind the next time around. Also, your
suggestion regarding keeping the Maintainers in --to field is well
taken.

--
Thanks and Regards
gautham.
> 

^ permalink raw reply	[flat|nested] 80+ messages in thread

* Re: [PATCH v3 1/5] powernv: cpufreq driver for powernv platform
  2014-03-20 12:10   ` Gautham R. Shenoy
@ 2014-03-21  8:41     ` Viresh Kumar
  -1 siblings, 0 replies; 80+ messages in thread
From: Viresh Kumar @ 2014-03-21  8:41 UTC (permalink / raw)
  To: Gautham R. Shenoy
  Cc: linuxppc-dev, Linux PM list, benh, svaidy, Srivatsa S. Bhat,
	Anton Blanchard

On Thu, Mar 20, 2014 at 5:40 PM, Gautham R. Shenoy
<ego@linux.vnet.ibm.com> wrote:
> From: Vaidyanathan Srinivasan <svaidy@linux.vnet.ibm.com>

Hi Vaidy,

> diff --git a/drivers/cpufreq/Kconfig b/drivers/cpufreq/Kconfig
> index 4b029c0..4ba1632 100644
> --- a/drivers/cpufreq/Kconfig
> +++ b/drivers/cpufreq/Kconfig
> @@ -48,6 +48,7 @@ config CPU_FREQ_STAT_DETAILS
>  choice
>         prompt "Default CPUFreq governor"
>         default CPU_FREQ_DEFAULT_GOV_USERSPACE if ARM_SA1100_CPUFREQ || ARM_SA1110_CPUFREQ
> +       default CPU_FREQ_DEFAULT_GOV_ONDEMAND if POWERNV_CPUFREQ

Probably we should remove SA1100's entry as well from here. This is
not the right way of doing it. Imagine 100 platforms having entries here.
If you want it, then select it from your platforms Kconfig.

> diff --git a/drivers/cpufreq/Kconfig.powerpc b/drivers/cpufreq/Kconfig.powerpc
> index ca0021a..93f8689 100644
> --- a/drivers/cpufreq/Kconfig.powerpc
> +++ b/drivers/cpufreq/Kconfig.powerpc
> @@ -54,3 +54,16 @@ config PPC_PASEMI_CPUFREQ
>         help
>           This adds the support for frequency switching on PA Semi
>           PWRficient processors.
> +
> +config POWERNV_CPUFREQ
> +       tristate "CPU frequency scaling for IBM POWERNV platform"
> +       depends on PPC_POWERNV
> +       select CPU_FREQ_GOV_PERFORMANCE
> +       select CPU_FREQ_GOV_POWERSAVE
> +       select CPU_FREQ_GOV_USERSPACE
> +       select CPU_FREQ_GOV_ONDEMAND
> +       select CPU_FREQ_GOV_CONSERVATIVE

Nice :)

People normally do this from the defconfigs instead. And logically speaking
we might better have only dependencies here, isn't it? And obviously
governors aren't a dependency for this driver. :)

> +       default y
> +       help
> +        This adds support for CPU frequency switching on IBM POWERNV
> +        platform

> diff --git a/drivers/cpufreq/powernv-cpufreq.c b/drivers/cpufreq/powernv-cpufreq.c
> new file mode 100644
> index 0000000..ab1551f
> --- /dev/null
> +++ b/drivers/cpufreq/powernv-cpufreq.c
> @@ -0,0 +1,277 @@
> +/*
> + * POWERNV cpufreq driver for the IBM POWER processors
> + *
> + * (C) Copyright IBM 2014
> + *
> + * Author: Vaidyanathan Srinivasan <svaidy at linux.vnet.ibm.com>
> + *
> + * This program is free software; you can redistribute it and/or modify
> + * it under the terms of the GNU General Public License as published by
> + * the Free Software Foundation; either version 2, or (at your option)
> + * any later version.
> + *
> + * This program is distributed in the hope that it will be useful,
> + * but WITHOUT ANY WARRANTY; without even the implied warranty of
> + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
> + * GNU General Public License for more details.
> + *
> + */
> +
> +#define pr_fmt(fmt)    "powernv-cpufreq: " fmt
> +
> +#include <linux/module.h>
> +#include <linux/cpufreq.h>
> +#include <linux/of.h>
> +#include <asm/cputhreads.h>

That's it? Sure?

Even if things compile for you, you must explicitly include all the
files on which
you depend.

> +/* FIXME: Make this per-core */
> +static DEFINE_MUTEX(freq_switch_mutex);
> +
> +#define POWERNV_MAX_PSTATES    256
> +
> +static struct cpufreq_frequency_table powernv_freqs[POWERNV_MAX_PSTATES+1];
> +static int powernv_pstate_ids[POWERNV_MAX_PSTATES+1];
> +
> +/*
> + * Initialize the freq table based on data obtained
> + * from the firmware passed via device-tree
> + */
> +
> +static int init_powernv_pstates(void)
> +{
> +       struct device_node *power_mgt;
> +       int nr_pstates = 0;
> +       int pstate_min, pstate_max, pstate_nominal;
> +       const __be32 *pstate_ids, *pstate_freqs;
> +       int i;
> +       u32 len_ids, len_freqs;
> +
> +       power_mgt = of_find_node_by_path("/ibm,opal/power-mgt");
> +       if (!power_mgt) {
> +               pr_warn("power-mgt node not found\n");
> +               return -ENODEV;
> +       }
> +
> +       if (of_property_read_u32(power_mgt, "ibm,pstate-min", &pstate_min)) {
> +               pr_warn("ibm,pstate-min node not found\n");
> +               return -ENODEV;
> +       }
> +
> +       if (of_property_read_u32(power_mgt, "ibm,pstate-max", &pstate_max)) {
> +               pr_warn("ibm,pstate-max node not found\n");
> +               return -ENODEV;
> +       }
> +
> +       if (of_property_read_u32(power_mgt, "ibm,pstate-nominal",
> +                                &pstate_nominal)) {
> +               pr_warn("ibm,pstate-nominal not found\n");
> +               return -ENODEV;
> +       }
> +       pr_info("cpufreq pstate min %d nominal %d max %d\n", pstate_min,
> +               pstate_nominal, pstate_max);
> +
> +       pstate_ids = of_get_property(power_mgt, "ibm,pstate-ids", &len_ids);
> +       if (!pstate_ids) {
> +               pr_warn("ibm,pstate-ids not found\n");
> +               return -ENODEV;
> +       }
> +
> +       pstate_freqs = of_get_property(power_mgt, "ibm,pstate-frequencies-mhz",
> +                                     &len_freqs);
> +       if (!pstate_freqs) {
> +               pr_warn("ibm,pstate-frequencies-mhz not found\n");
> +               return -ENODEV;
> +       }
> +
> +       WARN_ON(len_ids != len_freqs);
> +       nr_pstates = min(len_ids, len_freqs) / sizeof(u32);
> +       WARN_ON(!nr_pstates);

Why do you want to continue here?

> +       pr_debug("NR PStates %d\n", nr_pstates);
> +       for (i = 0; i < nr_pstates; i++) {
> +               u32 id = be32_to_cpu(pstate_ids[i]);
> +               u32 freq = be32_to_cpu(pstate_freqs[i]);
> +
> +               pr_debug("PState id %d freq %d MHz\n", id, freq);
> +               powernv_freqs[i].driver_data = i;

I don't think you are using this field at all and this is the field you can
use for driver_data and so you can get rid of powernv_pstate_ids[ ].

> +               powernv_freqs[i].frequency = freq * 1000; /* kHz */
> +               powernv_pstate_ids[i] = id;
> +       }
> +       /* End of list marker entry */
> +       powernv_freqs[i].driver_data = 0;

Not required.

> +       powernv_freqs[i].frequency = CPUFREQ_TABLE_END;
> +
> +       /* Print frequency table */
> +       for (i = 0; powernv_freqs[i].frequency != CPUFREQ_TABLE_END; i++)
> +               pr_debug("%d: %d\n", i, powernv_freqs[i].frequency);

You have already printed this table earlier..

> +
> +       return 0;
> +}
> +
> +static struct freq_attr *powernv_cpu_freq_attr[] = {
> +       &cpufreq_freq_attr_scaling_available_freqs,
> +       NULL,
> +};

Can use this instead: cpufreq_generic_attr?

> +/* Helper routines */
> +
> +/* Access helpers to power mgt SPR */
> +
> +static inline unsigned long get_pmspr(unsigned long sprn)

Looks big enough not be inlined?

> +{
> +       switch (sprn) {
> +       case SPRN_PMCR:
> +               return mfspr(SPRN_PMCR);
> +
> +       case SPRN_PMICR:
> +               return mfspr(SPRN_PMICR);
> +
> +       case SPRN_PMSR:
> +               return mfspr(SPRN_PMSR);
> +       }
> +       BUG();
> +}
> +
> +static inline void set_pmspr(unsigned long sprn, unsigned long val)
> +{

Same here..

> +       switch (sprn) {
> +       case SPRN_PMCR:
> +               mtspr(SPRN_PMCR, val);
> +               return;
> +
> +       case SPRN_PMICR:
> +               mtspr(SPRN_PMICR, val);
> +               return;
> +
> +       case SPRN_PMSR:
> +               mtspr(SPRN_PMSR, val);
> +               return;
> +       }
> +       BUG();
> +}
> +
> +static void set_pstate(void *pstate)
> +{
> +       unsigned long val;
> +       unsigned long pstate_ul = *(unsigned long *) pstate;

Why not sending value only to this routine instead of pointer?

> +
> +       val = get_pmspr(SPRN_PMCR);
> +       val = val & 0x0000ffffffffffffULL;

Maybe a blank line here?

> +       /* Set both global(bits 56..63) and local(bits 48..55) PStates */
> +       val = val | (pstate_ul << 56) | (pstate_ul << 48);

here as well?

> +       pr_debug("Setting cpu %d pmcr to %016lX\n", smp_processor_id(), val);
> +       set_pmspr(SPRN_PMCR, val);
> +}
> +
> +static int powernv_set_freq(cpumask_var_t cpus, unsigned int new_index)
> +{
> +       unsigned long val = (unsigned long) powernv_pstate_ids[new_index];

I think automatic type conversion will happen here.

> +
> +       /*
> +        * Use smp_call_function to send IPI and execute the
> +        * mtspr on target cpu.  We could do that without IPI
> +        * if current CPU is within policy->cpus (core)
> +        */

Hmm, interesting I also feel there are cases where this routine can
get called from other CPUs. Can you list those use cases where it can
happen? Governors will mostly call this from one of the CPUs present
in policy->cpus.

> +       val = val & 0xFF;
> +       smp_call_function_any(cpus, set_pstate, &val, 1);
> +       return 0;
> +}
> +
> +static int powernv_cpufreq_cpu_init(struct cpufreq_policy *policy)
> +{
> +       int base, i;
> +
> +#ifdef CONFIG_SMP

What will break if you don't have this ifdef here? Without that as well
below code should work?

> +       base = cpu_first_thread_sibling(policy->cpu);
> +
> +       for (i = 0; i < threads_per_core; i++)
> +               cpumask_set_cpu(base + i, policy->cpus);
> +#endif
> +       policy->cpuinfo.transition_latency = 25000;
> +
> +       policy->cur = powernv_freqs[0].frequency;
> +       cpufreq_frequency_table_get_attr(powernv_freqs, policy->cpu);

This doesn't exist anymore.

> +       return cpufreq_frequency_table_cpuinfo(policy, powernv_freqs);

Have you written this driver long time back? CPUFreq core has been
cleaned up heavily since last few kernel releases and I think there are
better helper routines available now.

> +}
> +
> +static int powernv_cpufreq_cpu_exit(struct cpufreq_policy *policy)
> +{
> +       cpufreq_frequency_table_put_attr(policy->cpu);
> +       return 0;
> +}

You don't need this..

> +static int powernv_cpufreq_verify(struct cpufreq_policy *policy)
> +{
> +       return cpufreq_frequency_table_verify(policy, powernv_freqs);
> +}

use generic verify function pointer instead..

> +static int powernv_cpufreq_target(struct cpufreq_policy *policy,
> +                             unsigned int target_freq,
> +                             unsigned int relation)

use target_index() instead..

> +{
> +       int rc;
> +       struct cpufreq_freqs freqs;
> +       unsigned int new_index;
> +
> +       cpufreq_frequency_table_target(policy, powernv_freqs, target_freq,
> +                                      relation, &new_index);
> +
> +       freqs.old = policy->cur;
> +       freqs.new = powernv_freqs[new_index].frequency;
> +       freqs.cpu = policy->cpu;
> +
> +       mutex_lock(&freq_switch_mutex);

Why do you need this lock for?

> +       cpufreq_notify_transition(policy, &freqs, CPUFREQ_PRECHANGE);
> +
> +       pr_debug("setting frequency for cpu %d to %d kHz index %d pstate %d",
> +                policy->cpu,
> +                powernv_freqs[new_index].frequency,
> +                new_index,
> +                powernv_pstate_ids[new_index]);
> +
> +       rc = powernv_set_freq(policy->cpus, new_index);
> +
> +       cpufreq_notify_transition(policy, &freqs, CPUFREQ_POSTCHANGE);
> +       mutex_unlock(&freq_switch_mutex);
> +
> +       return rc;
> +}
> +
> +static struct cpufreq_driver powernv_cpufreq_driver = {
> +       .verify         = powernv_cpufreq_verify,
> +       .target         = powernv_cpufreq_target,

I think you have Srivatsa there who has seen lots of cpufreq code and
could have helped you a lot :)

> +       .init           = powernv_cpufreq_cpu_init,
> +       .exit           = powernv_cpufreq_cpu_exit,
> +       .name           = "powernv-cpufreq",
> +       .flags          = CPUFREQ_CONST_LOOPS,
> +       .attr           = powernv_cpu_freq_attr,

Would be better if you keep these callbacks in the order they are declared
in cpufreq.h..

> +};
> +
> +static int __init powernv_cpufreq_init(void)
> +{
> +       int rc = 0;
> +
> +       /* Discover pstates from device tree and init */
> +

Remove blank line.

> +       rc = init_powernv_pstates();
> +

same here..

> +       if (rc) {
> +               pr_info("powernv-cpufreq disabled\n");
> +               return rc;
> +       }
> +
> +       rc = cpufreq_register_driver(&powernv_cpufreq_driver);
> +       return rc;

Merge above two lines.

> +}
> +
> +static void __exit powernv_cpufreq_exit(void)
> +{
> +       cpufreq_unregister_driver(&powernv_cpufreq_driver);
> +}
> +
> +module_init(powernv_cpufreq_init);
> +module_exit(powernv_cpufreq_exit);

Place these right after the routines without any blank lines in between.

> +MODULE_LICENSE("GPL");
> +MODULE_AUTHOR("Vaidyanathan Srinivasan <svaidy at linux.vnet.ibm.com>");

^ permalink raw reply	[flat|nested] 80+ messages in thread

* Re: [PATCH v3 1/5] powernv: cpufreq driver for powernv platform
@ 2014-03-21  8:41     ` Viresh Kumar
  0 siblings, 0 replies; 80+ messages in thread
From: Viresh Kumar @ 2014-03-21  8:41 UTC (permalink / raw)
  To: Gautham R. Shenoy
  Cc: Linux PM list, linuxppc-dev, Anton Blanchard, Srivatsa S. Bhat

On Thu, Mar 20, 2014 at 5:40 PM, Gautham R. Shenoy
<ego@linux.vnet.ibm.com> wrote:
> From: Vaidyanathan Srinivasan <svaidy@linux.vnet.ibm.com>

Hi Vaidy,

> diff --git a/drivers/cpufreq/Kconfig b/drivers/cpufreq/Kconfig
> index 4b029c0..4ba1632 100644
> --- a/drivers/cpufreq/Kconfig
> +++ b/drivers/cpufreq/Kconfig
> @@ -48,6 +48,7 @@ config CPU_FREQ_STAT_DETAILS
>  choice
>         prompt "Default CPUFreq governor"
>         default CPU_FREQ_DEFAULT_GOV_USERSPACE if ARM_SA1100_CPUFREQ || ARM_SA1110_CPUFREQ
> +       default CPU_FREQ_DEFAULT_GOV_ONDEMAND if POWERNV_CPUFREQ

Probably we should remove SA1100's entry as well from here. This is
not the right way of doing it. Imagine 100 platforms having entries here.
If you want it, then select it from your platforms Kconfig.

> diff --git a/drivers/cpufreq/Kconfig.powerpc b/drivers/cpufreq/Kconfig.powerpc
> index ca0021a..93f8689 100644
> --- a/drivers/cpufreq/Kconfig.powerpc
> +++ b/drivers/cpufreq/Kconfig.powerpc
> @@ -54,3 +54,16 @@ config PPC_PASEMI_CPUFREQ
>         help
>           This adds the support for frequency switching on PA Semi
>           PWRficient processors.
> +
> +config POWERNV_CPUFREQ
> +       tristate "CPU frequency scaling for IBM POWERNV platform"
> +       depends on PPC_POWERNV
> +       select CPU_FREQ_GOV_PERFORMANCE
> +       select CPU_FREQ_GOV_POWERSAVE
> +       select CPU_FREQ_GOV_USERSPACE
> +       select CPU_FREQ_GOV_ONDEMAND
> +       select CPU_FREQ_GOV_CONSERVATIVE

Nice :)

People normally do this from the defconfigs instead. And logically speaking
we might better have only dependencies here, isn't it? And obviously
governors aren't a dependency for this driver. :)

> +       default y
> +       help
> +        This adds support for CPU frequency switching on IBM POWERNV
> +        platform

> diff --git a/drivers/cpufreq/powernv-cpufreq.c b/drivers/cpufreq/powernv-cpufreq.c
> new file mode 100644
> index 0000000..ab1551f
> --- /dev/null
> +++ b/drivers/cpufreq/powernv-cpufreq.c
> @@ -0,0 +1,277 @@
> +/*
> + * POWERNV cpufreq driver for the IBM POWER processors
> + *
> + * (C) Copyright IBM 2014
> + *
> + * Author: Vaidyanathan Srinivasan <svaidy at linux.vnet.ibm.com>
> + *
> + * This program is free software; you can redistribute it and/or modify
> + * it under the terms of the GNU General Public License as published by
> + * the Free Software Foundation; either version 2, or (at your option)
> + * any later version.
> + *
> + * This program is distributed in the hope that it will be useful,
> + * but WITHOUT ANY WARRANTY; without even the implied warranty of
> + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
> + * GNU General Public License for more details.
> + *
> + */
> +
> +#define pr_fmt(fmt)    "powernv-cpufreq: " fmt
> +
> +#include <linux/module.h>
> +#include <linux/cpufreq.h>
> +#include <linux/of.h>
> +#include <asm/cputhreads.h>

That's it? Sure?

Even if things compile for you, you must explicitly include all the
files on which
you depend.

> +/* FIXME: Make this per-core */
> +static DEFINE_MUTEX(freq_switch_mutex);
> +
> +#define POWERNV_MAX_PSTATES    256
> +
> +static struct cpufreq_frequency_table powernv_freqs[POWERNV_MAX_PSTATES+1];
> +static int powernv_pstate_ids[POWERNV_MAX_PSTATES+1];
> +
> +/*
> + * Initialize the freq table based on data obtained
> + * from the firmware passed via device-tree
> + */
> +
> +static int init_powernv_pstates(void)
> +{
> +       struct device_node *power_mgt;
> +       int nr_pstates = 0;
> +       int pstate_min, pstate_max, pstate_nominal;
> +       const __be32 *pstate_ids, *pstate_freqs;
> +       int i;
> +       u32 len_ids, len_freqs;
> +
> +       power_mgt = of_find_node_by_path("/ibm,opal/power-mgt");
> +       if (!power_mgt) {
> +               pr_warn("power-mgt node not found\n");
> +               return -ENODEV;
> +       }
> +
> +       if (of_property_read_u32(power_mgt, "ibm,pstate-min", &pstate_min)) {
> +               pr_warn("ibm,pstate-min node not found\n");
> +               return -ENODEV;
> +       }
> +
> +       if (of_property_read_u32(power_mgt, "ibm,pstate-max", &pstate_max)) {
> +               pr_warn("ibm,pstate-max node not found\n");
> +               return -ENODEV;
> +       }
> +
> +       if (of_property_read_u32(power_mgt, "ibm,pstate-nominal",
> +                                &pstate_nominal)) {
> +               pr_warn("ibm,pstate-nominal not found\n");
> +               return -ENODEV;
> +       }
> +       pr_info("cpufreq pstate min %d nominal %d max %d\n", pstate_min,
> +               pstate_nominal, pstate_max);
> +
> +       pstate_ids = of_get_property(power_mgt, "ibm,pstate-ids", &len_ids);
> +       if (!pstate_ids) {
> +               pr_warn("ibm,pstate-ids not found\n");
> +               return -ENODEV;
> +       }
> +
> +       pstate_freqs = of_get_property(power_mgt, "ibm,pstate-frequencies-mhz",
> +                                     &len_freqs);
> +       if (!pstate_freqs) {
> +               pr_warn("ibm,pstate-frequencies-mhz not found\n");
> +               return -ENODEV;
> +       }
> +
> +       WARN_ON(len_ids != len_freqs);
> +       nr_pstates = min(len_ids, len_freqs) / sizeof(u32);
> +       WARN_ON(!nr_pstates);

Why do you want to continue here?

> +       pr_debug("NR PStates %d\n", nr_pstates);
> +       for (i = 0; i < nr_pstates; i++) {
> +               u32 id = be32_to_cpu(pstate_ids[i]);
> +               u32 freq = be32_to_cpu(pstate_freqs[i]);
> +
> +               pr_debug("PState id %d freq %d MHz\n", id, freq);
> +               powernv_freqs[i].driver_data = i;

I don't think you are using this field at all and this is the field you can
use for driver_data and so you can get rid of powernv_pstate_ids[ ].

> +               powernv_freqs[i].frequency = freq * 1000; /* kHz */
> +               powernv_pstate_ids[i] = id;
> +       }
> +       /* End of list marker entry */
> +       powernv_freqs[i].driver_data = 0;

Not required.

> +       powernv_freqs[i].frequency = CPUFREQ_TABLE_END;
> +
> +       /* Print frequency table */
> +       for (i = 0; powernv_freqs[i].frequency != CPUFREQ_TABLE_END; i++)
> +               pr_debug("%d: %d\n", i, powernv_freqs[i].frequency);

You have already printed this table earlier..

> +
> +       return 0;
> +}
> +
> +static struct freq_attr *powernv_cpu_freq_attr[] = {
> +       &cpufreq_freq_attr_scaling_available_freqs,
> +       NULL,
> +};

Can use this instead: cpufreq_generic_attr?

> +/* Helper routines */
> +
> +/* Access helpers to power mgt SPR */
> +
> +static inline unsigned long get_pmspr(unsigned long sprn)

Looks big enough not be inlined?

> +{
> +       switch (sprn) {
> +       case SPRN_PMCR:
> +               return mfspr(SPRN_PMCR);
> +
> +       case SPRN_PMICR:
> +               return mfspr(SPRN_PMICR);
> +
> +       case SPRN_PMSR:
> +               return mfspr(SPRN_PMSR);
> +       }
> +       BUG();
> +}
> +
> +static inline void set_pmspr(unsigned long sprn, unsigned long val)
> +{

Same here..

> +       switch (sprn) {
> +       case SPRN_PMCR:
> +               mtspr(SPRN_PMCR, val);
> +               return;
> +
> +       case SPRN_PMICR:
> +               mtspr(SPRN_PMICR, val);
> +               return;
> +
> +       case SPRN_PMSR:
> +               mtspr(SPRN_PMSR, val);
> +               return;
> +       }
> +       BUG();
> +}
> +
> +static void set_pstate(void *pstate)
> +{
> +       unsigned long val;
> +       unsigned long pstate_ul = *(unsigned long *) pstate;

Why not sending value only to this routine instead of pointer?

> +
> +       val = get_pmspr(SPRN_PMCR);
> +       val = val & 0x0000ffffffffffffULL;

Maybe a blank line here?

> +       /* Set both global(bits 56..63) and local(bits 48..55) PStates */
> +       val = val | (pstate_ul << 56) | (pstate_ul << 48);

here as well?

> +       pr_debug("Setting cpu %d pmcr to %016lX\n", smp_processor_id(), val);
> +       set_pmspr(SPRN_PMCR, val);
> +}
> +
> +static int powernv_set_freq(cpumask_var_t cpus, unsigned int new_index)
> +{
> +       unsigned long val = (unsigned long) powernv_pstate_ids[new_index];

I think automatic type conversion will happen here.

> +
> +       /*
> +        * Use smp_call_function to send IPI and execute the
> +        * mtspr on target cpu.  We could do that without IPI
> +        * if current CPU is within policy->cpus (core)
> +        */

Hmm, interesting I also feel there are cases where this routine can
get called from other CPUs. Can you list those use cases where it can
happen? Governors will mostly call this from one of the CPUs present
in policy->cpus.

> +       val = val & 0xFF;
> +       smp_call_function_any(cpus, set_pstate, &val, 1);
> +       return 0;
> +}
> +
> +static int powernv_cpufreq_cpu_init(struct cpufreq_policy *policy)
> +{
> +       int base, i;
> +
> +#ifdef CONFIG_SMP

What will break if you don't have this ifdef here? Without that as well
below code should work?

> +       base = cpu_first_thread_sibling(policy->cpu);
> +
> +       for (i = 0; i < threads_per_core; i++)
> +               cpumask_set_cpu(base + i, policy->cpus);
> +#endif
> +       policy->cpuinfo.transition_latency = 25000;
> +
> +       policy->cur = powernv_freqs[0].frequency;
> +       cpufreq_frequency_table_get_attr(powernv_freqs, policy->cpu);

This doesn't exist anymore.

> +       return cpufreq_frequency_table_cpuinfo(policy, powernv_freqs);

Have you written this driver long time back? CPUFreq core has been
cleaned up heavily since last few kernel releases and I think there are
better helper routines available now.

> +}
> +
> +static int powernv_cpufreq_cpu_exit(struct cpufreq_policy *policy)
> +{
> +       cpufreq_frequency_table_put_attr(policy->cpu);
> +       return 0;
> +}

You don't need this..

> +static int powernv_cpufreq_verify(struct cpufreq_policy *policy)
> +{
> +       return cpufreq_frequency_table_verify(policy, powernv_freqs);
> +}

use generic verify function pointer instead..

> +static int powernv_cpufreq_target(struct cpufreq_policy *policy,
> +                             unsigned int target_freq,
> +                             unsigned int relation)

use target_index() instead..

> +{
> +       int rc;
> +       struct cpufreq_freqs freqs;
> +       unsigned int new_index;
> +
> +       cpufreq_frequency_table_target(policy, powernv_freqs, target_freq,
> +                                      relation, &new_index);
> +
> +       freqs.old = policy->cur;
> +       freqs.new = powernv_freqs[new_index].frequency;
> +       freqs.cpu = policy->cpu;
> +
> +       mutex_lock(&freq_switch_mutex);

Why do you need this lock for?

> +       cpufreq_notify_transition(policy, &freqs, CPUFREQ_PRECHANGE);
> +
> +       pr_debug("setting frequency for cpu %d to %d kHz index %d pstate %d",
> +                policy->cpu,
> +                powernv_freqs[new_index].frequency,
> +                new_index,
> +                powernv_pstate_ids[new_index]);
> +
> +       rc = powernv_set_freq(policy->cpus, new_index);
> +
> +       cpufreq_notify_transition(policy, &freqs, CPUFREQ_POSTCHANGE);
> +       mutex_unlock(&freq_switch_mutex);
> +
> +       return rc;
> +}
> +
> +static struct cpufreq_driver powernv_cpufreq_driver = {
> +       .verify         = powernv_cpufreq_verify,
> +       .target         = powernv_cpufreq_target,

I think you have Srivatsa there who has seen lots of cpufreq code and
could have helped you a lot :)

> +       .init           = powernv_cpufreq_cpu_init,
> +       .exit           = powernv_cpufreq_cpu_exit,
> +       .name           = "powernv-cpufreq",
> +       .flags          = CPUFREQ_CONST_LOOPS,
> +       .attr           = powernv_cpu_freq_attr,

Would be better if you keep these callbacks in the order they are declared
in cpufreq.h..

> +};
> +
> +static int __init powernv_cpufreq_init(void)
> +{
> +       int rc = 0;
> +
> +       /* Discover pstates from device tree and init */
> +

Remove blank line.

> +       rc = init_powernv_pstates();
> +

same here..

> +       if (rc) {
> +               pr_info("powernv-cpufreq disabled\n");
> +               return rc;
> +       }
> +
> +       rc = cpufreq_register_driver(&powernv_cpufreq_driver);
> +       return rc;

Merge above two lines.

> +}
> +
> +static void __exit powernv_cpufreq_exit(void)
> +{
> +       cpufreq_unregister_driver(&powernv_cpufreq_driver);
> +}
> +
> +module_init(powernv_cpufreq_init);
> +module_exit(powernv_cpufreq_exit);

Place these right after the routines without any blank lines in between.

> +MODULE_LICENSE("GPL");
> +MODULE_AUTHOR("Vaidyanathan Srinivasan <svaidy at linux.vnet.ibm.com>");

^ permalink raw reply	[flat|nested] 80+ messages in thread

* Re: [PATCH v3 2/5] powernv, cpufreq:Add per-core locking to serialize frequency transitions
  2014-03-20 12:10   ` [PATCH v3 2/5] powernv, cpufreq:Add " Gautham R. Shenoy
@ 2014-03-21  8:42     ` Viresh Kumar
  -1 siblings, 0 replies; 80+ messages in thread
From: Viresh Kumar @ 2014-03-21  8:42 UTC (permalink / raw)
  To: Gautham R. Shenoy; +Cc: linuxppc-dev, Srivatsa S. Bhat, Linux PM list

On Thu, Mar 20, 2014 at 5:40 PM, Gautham R. Shenoy
<ego@linux.vnet.ibm.com> wrote:
> From: "Srivatsa S. Bhat" <srivatsa.bhat@linux.vnet.ibm.com>
>
> On POWER systems, the CPU frequency is controlled at a core-level and
> hence we need to serialize so that only one of the threads in the core
> switches the core's frequency at a time.

Probably you don't need this anymore.

https://lkml.org/lkml/2014/3/21/23
_______________________________________________
Linuxppc-dev mailing list
Linuxppc-dev@lists.ozlabs.org
https://lists.ozlabs.org/listinfo/linuxppc-dev

^ permalink raw reply	[flat|nested] 80+ messages in thread

* Re: [PATCH v3 2/5] powernv, cpufreq:Add per-core locking to serialize frequency transitions
@ 2014-03-21  8:42     ` Viresh Kumar
  0 siblings, 0 replies; 80+ messages in thread
From: Viresh Kumar @ 2014-03-21  8:42 UTC (permalink / raw)
  To: Gautham R. Shenoy; +Cc: linuxppc-dev, Srivatsa S. Bhat, Linux PM list

On Thu, Mar 20, 2014 at 5:40 PM, Gautham R. Shenoy
<ego@linux.vnet.ibm.com> wrote:
> From: "Srivatsa S. Bhat" <srivatsa.bhat@linux.vnet.ibm.com>
>
> On POWER systems, the CPU frequency is controlled at a core-level and
> hence we need to serialize so that only one of the threads in the core
> switches the core's frequency at a time.

Probably you don't need this anymore.

https://lkml.org/lkml/2014/3/21/23

^ permalink raw reply	[flat|nested] 80+ messages in thread

* Re: [PATCH v3 4/5] powernv:cpufreq: Export nominal frequency via sysfs.
  2014-03-20 12:10   ` Gautham R. Shenoy
@ 2014-03-21  8:47     ` Viresh Kumar
  -1 siblings, 0 replies; 80+ messages in thread
From: Viresh Kumar @ 2014-03-21  8:47 UTC (permalink / raw)
  To: Gautham R. Shenoy; +Cc: linuxppc-dev, Linux PM list, benh, svaidy

On Thu, Mar 20, 2014 at 5:40 PM, Gautham R. Shenoy
<ego@linux.vnet.ibm.com> wrote:
> From: "Gautham R. Shenoy" <ego@linux.vnet.ibm.com>
>
> Create a driver attribute named cpuinfo_nominal_freq which
> creates a sysfs read-only file named cpuinfo_nominal_freq. Export
> the frequency corresponding to the nominal_pstate through this
> interface.
>
> Nominal frequency is the highest non-turbo frequency for the
> platform.  This is generally used for setting governor policies from
> user space for optimal energy efficiency.
>
> Signed-off-by: Gautham R. Shenoy <ego@linux.vnet.ibm.com>
> ---
>  drivers/cpufreq/powernv-cpufreq.c | 22 ++++++++++++++++++++++
>  1 file changed, 22 insertions(+)
>
> diff --git a/drivers/cpufreq/powernv-cpufreq.c b/drivers/cpufreq/powernv-cpufreq.c
> index e7b0292..46bee8a 100644
> --- a/drivers/cpufreq/powernv-cpufreq.c
> +++ b/drivers/cpufreq/powernv-cpufreq.c
> @@ -142,8 +142,30 @@ static unsigned int pstate_id_to_freq(int pstate_id)
>         return powernv_freqs[i].frequency;
>  }
>
> +/**
> + * show_cpuinfo_nominal_freq - Show the nominal CPU frequency as indicated by
> + * the firmware
> + */
> +static ssize_t show_cpuinfo_nominal_freq(struct cpufreq_policy *policy,
> +                                       char *buf)
> +{
> +       int nominal_freq;
> +       nominal_freq = pstate_id_to_freq(powernv_pstate_info.pstate_nominal_id);
> +       return sprintf(buf, "%u\n", nominal_freq);

return sprintf(buf, "%u\n",
pstate_id_to_freq(powernv_pstate_info.pstate_nominal_id));

??

> +}
> +
> +

remove extra blank line.

> +struct freq_attr cpufreq_freq_attr_cpuinfo_nominal_freq = {
> +       .attr = { .name = "cpuinfo_nominal_freq",
> +                 .mode = 0444,
> +               },

Align {}

> +       .show = show_cpuinfo_nominal_freq,
> +};
> +
> +
>  static struct freq_attr *powernv_cpu_freq_attr[] = {
>         &cpufreq_freq_attr_scaling_available_freqs,
> +       &cpufreq_freq_attr_cpuinfo_nominal_freq,
>         NULL,
>  };

^ permalink raw reply	[flat|nested] 80+ messages in thread

* Re: [PATCH v3 4/5] powernv:cpufreq: Export nominal frequency via sysfs.
@ 2014-03-21  8:47     ` Viresh Kumar
  0 siblings, 0 replies; 80+ messages in thread
From: Viresh Kumar @ 2014-03-21  8:47 UTC (permalink / raw)
  To: Gautham R. Shenoy; +Cc: linuxppc-dev, Linux PM list

On Thu, Mar 20, 2014 at 5:40 PM, Gautham R. Shenoy
<ego@linux.vnet.ibm.com> wrote:
> From: "Gautham R. Shenoy" <ego@linux.vnet.ibm.com>
>
> Create a driver attribute named cpuinfo_nominal_freq which
> creates a sysfs read-only file named cpuinfo_nominal_freq. Export
> the frequency corresponding to the nominal_pstate through this
> interface.
>
> Nominal frequency is the highest non-turbo frequency for the
> platform.  This is generally used for setting governor policies from
> user space for optimal energy efficiency.
>
> Signed-off-by: Gautham R. Shenoy <ego@linux.vnet.ibm.com>
> ---
>  drivers/cpufreq/powernv-cpufreq.c | 22 ++++++++++++++++++++++
>  1 file changed, 22 insertions(+)
>
> diff --git a/drivers/cpufreq/powernv-cpufreq.c b/drivers/cpufreq/powernv-cpufreq.c
> index e7b0292..46bee8a 100644
> --- a/drivers/cpufreq/powernv-cpufreq.c
> +++ b/drivers/cpufreq/powernv-cpufreq.c
> @@ -142,8 +142,30 @@ static unsigned int pstate_id_to_freq(int pstate_id)
>         return powernv_freqs[i].frequency;
>  }
>
> +/**
> + * show_cpuinfo_nominal_freq - Show the nominal CPU frequency as indicated by
> + * the firmware
> + */
> +static ssize_t show_cpuinfo_nominal_freq(struct cpufreq_policy *policy,
> +                                       char *buf)
> +{
> +       int nominal_freq;
> +       nominal_freq = pstate_id_to_freq(powernv_pstate_info.pstate_nominal_id);
> +       return sprintf(buf, "%u\n", nominal_freq);

return sprintf(buf, "%u\n",
pstate_id_to_freq(powernv_pstate_info.pstate_nominal_id));

??

> +}
> +
> +

remove extra blank line.

> +struct freq_attr cpufreq_freq_attr_cpuinfo_nominal_freq = {
> +       .attr = { .name = "cpuinfo_nominal_freq",
> +                 .mode = 0444,
> +               },

Align {}

> +       .show = show_cpuinfo_nominal_freq,
> +};
> +
> +
>  static struct freq_attr *powernv_cpu_freq_attr[] = {
>         &cpufreq_freq_attr_scaling_available_freqs,
> +       &cpufreq_freq_attr_cpuinfo_nominal_freq,
>         NULL,
>  };

^ permalink raw reply	[flat|nested] 80+ messages in thread

* Re: [PATCH v3 5/5] powernv:cpufreq: Implement the driver->get() method
  2014-03-20 12:11   ` Gautham R. Shenoy
@ 2014-03-21  9:31     ` Viresh Kumar
  -1 siblings, 0 replies; 80+ messages in thread
From: Viresh Kumar @ 2014-03-21  9:31 UTC (permalink / raw)
  To: Gautham R. Shenoy; +Cc: linuxppc-dev, Linux PM list, benh, svaidy

On Thu, Mar 20, 2014 at 5:41 PM, Gautham R. Shenoy
<ego@linux.vnet.ibm.com> wrote:
> From: "Gautham R. Shenoy" <ego@linux.vnet.ibm.com>
>
> The current frequency of a cpu is reported through the sysfs file
> cpuinfo_cur_freq. This requires the driver to implement a
> "->get(unsigned int cpu)" method which will return the current
> operating frequency.
>
> Implement a function named powernv_cpufreq_get() which reads the local
> pstate from the PMSR and returns the corresponding frequency.
>
> Set the powernv_cpufreq_driver.get hook to powernv_cpufreq_get().
>
> Signed-off-by: Gautham R. Shenoy <ego@linux.vnet.ibm.com>
> ---
>  drivers/cpufreq/powernv-cpufreq.c | 38 ++++++++++++++++++++++++++++++++++++++
>  1 file changed, 38 insertions(+)

Please merge these fixups with the first patch which is creating the driver.
I understand that a different guy has created this patch and so wanted
to have a patch on his name but its really difficult to review this way.
Better add your signed-off in the first patch instead. Sending such changes
once the driver is mainlined looks fine.

> diff --git a/drivers/cpufreq/powernv-cpufreq.c b/drivers/cpufreq/powernv-cpufreq.c
> index 46bee8a..ef6ed8c 100644
> --- a/drivers/cpufreq/powernv-cpufreq.c
> +++ b/drivers/cpufreq/powernv-cpufreq.c
> @@ -206,6 +206,43 @@ static inline void set_pmspr(unsigned long sprn, unsigned long val)
>         BUG();
>  }
>
> +/*
> + * Computes the current frequency on this cpu
> + * and stores the result in *ret_freq.
> + */
> +static void powernv_read_cpu_freq(void *ret_freq)
> +{
> +       unsigned long pmspr_val;
> +       s8 local_pstate_id;
> +       int *cur_freq, freq, pstate_id;
> +
> +       cur_freq = (int *)ret_freq;

You don't need cur_freq variable at all..

> +       pmspr_val = get_pmspr(SPRN_PMSR);
> +
> +       /* The local pstate id corresponds bits 48..55 in the PMSR.
> +         * Note: Watch out for the sign! */
> +       local_pstate_id = (pmspr_val >> 48) & 0xFF;
> +       pstate_id = local_pstate_id;

similarly local_pstate_id

> +
> +       freq = pstate_id_to_freq(pstate_id);
> +       pr_debug("cpu %d pmsr %lx pstate_id %d frequency %d \n",
> +               smp_processor_id(), pmspr_val, pstate_id, freq);
> +       *cur_freq = freq;

Move above print here after a blank line. Also remove freq variable as
well and use *cur_freq directly.. And then you can rename it to freq as well.

> +}
> +
> +/*
> + * Returns the cpu frequency as reported by the firmware for 'cpu'.
> + * This value is reported through the sysfs file cpuinfo_cur_freq.
> + */
> +unsigned int powernv_cpufreq_get(unsigned int cpu)
> +{
> +       int ret_freq;
> +
> +       smp_call_function_any(cpu_sibling_mask(cpu), powernv_read_cpu_freq,
> +                       &ret_freq, 1);
> +       return ret_freq;
> +}
> +
>  static void set_pstate(void *pstate)
>  {
>         unsigned long val;
> @@ -297,6 +334,7 @@ static int powernv_cpufreq_target(struct cpufreq_policy *policy,
>  static struct cpufreq_driver powernv_cpufreq_driver = {
>         .verify         = powernv_cpufreq_verify,
>         .target         = powernv_cpufreq_target,
> +       .get            = powernv_cpufreq_get,
>         .init           = powernv_cpufreq_cpu_init,
>         .exit           = powernv_cpufreq_cpu_exit,
>         .name           = "powernv-cpufreq",
> --
> 1.8.3.1
>
> --
> To unsubscribe from this list: send the line "unsubscribe linux-pm" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 80+ messages in thread

* Re: [PATCH v3 5/5] powernv:cpufreq: Implement the driver->get() method
@ 2014-03-21  9:31     ` Viresh Kumar
  0 siblings, 0 replies; 80+ messages in thread
From: Viresh Kumar @ 2014-03-21  9:31 UTC (permalink / raw)
  To: Gautham R. Shenoy; +Cc: linuxppc-dev, Linux PM list

On Thu, Mar 20, 2014 at 5:41 PM, Gautham R. Shenoy
<ego@linux.vnet.ibm.com> wrote:
> From: "Gautham R. Shenoy" <ego@linux.vnet.ibm.com>
>
> The current frequency of a cpu is reported through the sysfs file
> cpuinfo_cur_freq. This requires the driver to implement a
> "->get(unsigned int cpu)" method which will return the current
> operating frequency.
>
> Implement a function named powernv_cpufreq_get() which reads the local
> pstate from the PMSR and returns the corresponding frequency.
>
> Set the powernv_cpufreq_driver.get hook to powernv_cpufreq_get().
>
> Signed-off-by: Gautham R. Shenoy <ego@linux.vnet.ibm.com>
> ---
>  drivers/cpufreq/powernv-cpufreq.c | 38 ++++++++++++++++++++++++++++++++++++++
>  1 file changed, 38 insertions(+)

Please merge these fixups with the first patch which is creating the driver.
I understand that a different guy has created this patch and so wanted
to have a patch on his name but its really difficult to review this way.
Better add your signed-off in the first patch instead. Sending such changes
once the driver is mainlined looks fine.

> diff --git a/drivers/cpufreq/powernv-cpufreq.c b/drivers/cpufreq/powernv-cpufreq.c
> index 46bee8a..ef6ed8c 100644
> --- a/drivers/cpufreq/powernv-cpufreq.c
> +++ b/drivers/cpufreq/powernv-cpufreq.c
> @@ -206,6 +206,43 @@ static inline void set_pmspr(unsigned long sprn, unsigned long val)
>         BUG();
>  }
>
> +/*
> + * Computes the current frequency on this cpu
> + * and stores the result in *ret_freq.
> + */
> +static void powernv_read_cpu_freq(void *ret_freq)
> +{
> +       unsigned long pmspr_val;
> +       s8 local_pstate_id;
> +       int *cur_freq, freq, pstate_id;
> +
> +       cur_freq = (int *)ret_freq;

You don't need cur_freq variable at all..

> +       pmspr_val = get_pmspr(SPRN_PMSR);
> +
> +       /* The local pstate id corresponds bits 48..55 in the PMSR.
> +         * Note: Watch out for the sign! */
> +       local_pstate_id = (pmspr_val >> 48) & 0xFF;
> +       pstate_id = local_pstate_id;

similarly local_pstate_id

> +
> +       freq = pstate_id_to_freq(pstate_id);
> +       pr_debug("cpu %d pmsr %lx pstate_id %d frequency %d \n",
> +               smp_processor_id(), pmspr_val, pstate_id, freq);
> +       *cur_freq = freq;

Move above print here after a blank line. Also remove freq variable as
well and use *cur_freq directly.. And then you can rename it to freq as well.

> +}
> +
> +/*
> + * Returns the cpu frequency as reported by the firmware for 'cpu'.
> + * This value is reported through the sysfs file cpuinfo_cur_freq.
> + */
> +unsigned int powernv_cpufreq_get(unsigned int cpu)
> +{
> +       int ret_freq;
> +
> +       smp_call_function_any(cpu_sibling_mask(cpu), powernv_read_cpu_freq,
> +                       &ret_freq, 1);
> +       return ret_freq;
> +}
> +
>  static void set_pstate(void *pstate)
>  {
>         unsigned long val;
> @@ -297,6 +334,7 @@ static int powernv_cpufreq_target(struct cpufreq_policy *policy,
>  static struct cpufreq_driver powernv_cpufreq_driver = {
>         .verify         = powernv_cpufreq_verify,
>         .target         = powernv_cpufreq_target,
> +       .get            = powernv_cpufreq_get,
>         .init           = powernv_cpufreq_cpu_init,
>         .exit           = powernv_cpufreq_cpu_exit,
>         .name           = "powernv-cpufreq",
> --
> 1.8.3.1
>
> --
> To unsubscribe from this list: send the line "unsubscribe linux-pm" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 80+ messages in thread

* Re: [PATCH v3 4/5] powernv:cpufreq: Export nominal frequency via sysfs.
  2014-03-21  8:47     ` Viresh Kumar
@ 2014-03-21  9:55       ` Gautham R Shenoy
  -1 siblings, 0 replies; 80+ messages in thread
From: Gautham R Shenoy @ 2014-03-21  9:55 UTC (permalink / raw)
  To: Viresh Kumar; +Cc: Gautham R. Shenoy, linuxppc-dev, Linux PM list, benh, svaidy

On Fri, Mar 21, 2014 at 02:17:28PM +0530, Viresh Kumar wrote:
> > +static ssize_t show_cpuinfo_nominal_freq(struct cpufreq_policy *policy,
> > +                                       char *buf)
> > +{
> > +       int nominal_freq;
> > +       nominal_freq = pstate_id_to_freq(powernv_pstate_info.pstate_nominal_id);
> > +       return sprintf(buf, "%u\n", nominal_freq);
> 
> return sprintf(buf, "%u\n",
> pstate_id_to_freq(powernv_pstate_info.pstate_nominal_id));

Sure. Will fix this. 
> 
> ??
> 
> > +}
> > +
> > +
> 
> remove extra blank line.
> 
> > +struct freq_attr cpufreq_freq_attr_cpuinfo_nominal_freq = {
> > +       .attr = { .name = "cpuinfo_nominal_freq",
> > +                 .mode = 0444,
> > +               },
> 
> Align {}

Probably the use of ATTR_RO(cpuinfo_nominal_freq) and renaming
show_cpuinfo_nominal_freq to cpuinfo_nominal_freq_show() would be even
better. What do you think ?

> 
> > +       .show = show_cpuinfo_nominal_freq,
> > +};
> > +
> > +
> >  static struct freq_attr *powernv_cpu_freq_attr[] = {
> >         &cpufreq_freq_attr_scaling_available_freqs,
> > +       &cpufreq_freq_attr_cpuinfo_nominal_freq,
> >         NULL,
> >  };

This needs to be rewritten to include the entries of cpufreq_generic_attr.
>

Thanks for the review.

--
Regards
gautham. 


^ permalink raw reply	[flat|nested] 80+ messages in thread

* Re: [PATCH v3 4/5] powernv:cpufreq: Export nominal frequency via sysfs.
@ 2014-03-21  9:55       ` Gautham R Shenoy
  0 siblings, 0 replies; 80+ messages in thread
From: Gautham R Shenoy @ 2014-03-21  9:55 UTC (permalink / raw)
  To: Viresh Kumar; +Cc: Gautham R. Shenoy, Linux PM list, linuxppc-dev

On Fri, Mar 21, 2014 at 02:17:28PM +0530, Viresh Kumar wrote:
> > +static ssize_t show_cpuinfo_nominal_freq(struct cpufreq_policy *policy,
> > +                                       char *buf)
> > +{
> > +       int nominal_freq;
> > +       nominal_freq = pstate_id_to_freq(powernv_pstate_info.pstate_nominal_id);
> > +       return sprintf(buf, "%u\n", nominal_freq);
> 
> return sprintf(buf, "%u\n",
> pstate_id_to_freq(powernv_pstate_info.pstate_nominal_id));

Sure. Will fix this. 
> 
> ??
> 
> > +}
> > +
> > +
> 
> remove extra blank line.
> 
> > +struct freq_attr cpufreq_freq_attr_cpuinfo_nominal_freq = {
> > +       .attr = { .name = "cpuinfo_nominal_freq",
> > +                 .mode = 0444,
> > +               },
> 
> Align {}

Probably the use of ATTR_RO(cpuinfo_nominal_freq) and renaming
show_cpuinfo_nominal_freq to cpuinfo_nominal_freq_show() would be even
better. What do you think ?

> 
> > +       .show = show_cpuinfo_nominal_freq,
> > +};
> > +
> > +
> >  static struct freq_attr *powernv_cpu_freq_attr[] = {
> >         &cpufreq_freq_attr_scaling_available_freqs,
> > +       &cpufreq_freq_attr_cpuinfo_nominal_freq,
> >         NULL,
> >  };

This needs to be rewritten to include the entries of cpufreq_generic_attr.
>

Thanks for the review.

--
Regards
gautham. 

^ permalink raw reply	[flat|nested] 80+ messages in thread

* Re: [PATCH v3 2/5] powernv,cpufreq:Add per-core locking to serialize frequency transitions
  2014-03-21  8:42     ` Viresh Kumar
@ 2014-03-21  9:56       ` Srivatsa S. Bhat
  -1 siblings, 0 replies; 80+ messages in thread
From: Srivatsa S. Bhat @ 2014-03-21  9:56 UTC (permalink / raw)
  To: Viresh Kumar; +Cc: Gautham R. Shenoy, linuxppc-dev, Linux PM list, benh, svaidy

On 03/21/2014 02:12 PM, Viresh Kumar wrote:
> On Thu, Mar 20, 2014 at 5:40 PM, Gautham R. Shenoy
> <ego@linux.vnet.ibm.com> wrote:
>> From: "Srivatsa S. Bhat" <srivatsa.bhat@linux.vnet.ibm.com>
>>
>> On POWER systems, the CPU frequency is controlled at a core-level and
>> hence we need to serialize so that only one of the threads in the core
>> switches the core's frequency at a time.
> 
> Probably you don't need this anymore.
> 
> https://lkml.org/lkml/2014/3/21/23
> 

Agreed.

Regards,
Srivatsa S. Bhat


^ permalink raw reply	[flat|nested] 80+ messages in thread

* Re: [PATCH v3 2/5] powernv, cpufreq:Add per-core locking to serialize frequency transitions
@ 2014-03-21  9:56       ` Srivatsa S. Bhat
  0 siblings, 0 replies; 80+ messages in thread
From: Srivatsa S. Bhat @ 2014-03-21  9:56 UTC (permalink / raw)
  To: Viresh Kumar; +Cc: Gautham R. Shenoy, Linux PM list, linuxppc-dev

On 03/21/2014 02:12 PM, Viresh Kumar wrote:
> On Thu, Mar 20, 2014 at 5:40 PM, Gautham R. Shenoy
> <ego@linux.vnet.ibm.com> wrote:
>> From: "Srivatsa S. Bhat" <srivatsa.bhat@linux.vnet.ibm.com>
>>
>> On POWER systems, the CPU frequency is controlled at a core-level and
>> hence we need to serialize so that only one of the threads in the core
>> switches the core's frequency at a time.
> 
> Probably you don't need this anymore.
> 
> https://lkml.org/lkml/2014/3/21/23
> 

Agreed.

Regards,
Srivatsa S. Bhat

^ permalink raw reply	[flat|nested] 80+ messages in thread

* Re: [PATCH v3 4/5] powernv:cpufreq: Export nominal frequency via sysfs.
  2014-03-21  9:55       ` Gautham R Shenoy
@ 2014-03-21  9:57         ` Viresh Kumar
  -1 siblings, 0 replies; 80+ messages in thread
From: Viresh Kumar @ 2014-03-21  9:57 UTC (permalink / raw)
  To: ego
  Cc: linuxppc-dev, Linux PM list, Benjamin Herrenschmidt,
	Vaidyanathan Srinivasan

On 21 March 2014 15:25, Gautham R Shenoy <ego@linux.vnet.ibm.com> wrote:
> Probably the use of ATTR_RO(cpuinfo_nominal_freq) and renaming
> show_cpuinfo_nominal_freq to cpuinfo_nominal_freq_show() would be even
> better. What do you think ?

+1

^ permalink raw reply	[flat|nested] 80+ messages in thread

* Re: [PATCH v3 4/5] powernv:cpufreq: Export nominal frequency via sysfs.
@ 2014-03-21  9:57         ` Viresh Kumar
  0 siblings, 0 replies; 80+ messages in thread
From: Viresh Kumar @ 2014-03-21  9:57 UTC (permalink / raw)
  To: ego; +Cc: linuxppc-dev, Linux PM list

On 21 March 2014 15:25, Gautham R Shenoy <ego@linux.vnet.ibm.com> wrote:
> Probably the use of ATTR_RO(cpuinfo_nominal_freq) and renaming
> show_cpuinfo_nominal_freq to cpuinfo_nominal_freq_show() would be even
> better. What do you think ?

+1

^ permalink raw reply	[flat|nested] 80+ messages in thread

* Re: [PATCH v3 1/5] powernv: cpufreq driver for powernv platform
  2014-03-21  8:41     ` Viresh Kumar
@ 2014-03-21 10:43       ` Gautham R Shenoy
  -1 siblings, 0 replies; 80+ messages in thread
From: Gautham R Shenoy @ 2014-03-21 10:43 UTC (permalink / raw)
  To: Viresh Kumar
  Cc: Gautham R. Shenoy, linuxppc-dev, Linux PM list, benh, svaidy,
	Srivatsa S. Bhat, Anton Blanchard

Hi Viresh,

On Fri, Mar 21, 2014 at 02:11:32PM +0530, Viresh Kumar wrote:
> On Thu, Mar 20, 2014 at 5:40 PM, Gautham R. Shenoy
> <ego@linux.vnet.ibm.com> wrote:
> > From: Vaidyanathan Srinivasan <svaidy@linux.vnet.ibm.com>
> 
> Hi Vaidy,
> 
> > diff --git a/drivers/cpufreq/Kconfig b/drivers/cpufreq/Kconfig
> > index 4b029c0..4ba1632 100644
> > --- a/drivers/cpufreq/Kconfig
> > +++ b/drivers/cpufreq/Kconfig
> > @@ -48,6 +48,7 @@ config CPU_FREQ_STAT_DETAILS
> >  choice
> >         prompt "Default CPUFreq governor"
> >         default CPU_FREQ_DEFAULT_GOV_USERSPACE if ARM_SA1100_CPUFREQ || ARM_SA1110_CPUFREQ
> > +       default CPU_FREQ_DEFAULT_GOV_ONDEMAND if POWERNV_CPUFREQ
> 
> Probably we should remove SA1100's entry as well from here. This is
> not the right way of doing it. Imagine 100 platforms having entries here.
> If you want it, then select it from your platforms Kconfig.

Sure. Will move these bits and the other governor related bits to the
Powernv Kconfig.

> > diff --git a/drivers/cpufreq/powernv-cpufreq.c b/drivers/cpufreq/powernv-cpufreq.c
> > new file mode 100644
> > index 0000000..ab1551f
> > --- /dev/null
> > +
> > +#define pr_fmt(fmt)    "powernv-cpufreq: " fmt
> > +
> > +#include <linux/module.h>
> > +#include <linux/cpufreq.h>
> > +#include <linux/of.h>
> > +#include <asm/cputhreads.h>
> 
> That's it? Sure?
> 
> Even if things compile for you, you must explicitly include all the
> files on which
> you depend.

Ok. 

> 
> > +
> > +       WARN_ON(len_ids != len_freqs);
> > +       nr_pstates = min(len_ids, len_freqs) / sizeof(u32);
> > +       WARN_ON(!nr_pstates);
> 
> Why do you want to continue here?

Good point. We might be better off exiting at this point. 

> 
> > +       pr_debug("NR PStates %d\n", nr_pstates);
> > +       for (i = 0; i < nr_pstates; i++) {
> > +               u32 id = be32_to_cpu(pstate_ids[i]);
> > +               u32 freq = be32_to_cpu(pstate_freqs[i]);
> > +
> > +               pr_debug("PState id %d freq %d MHz\n", id, freq);
> > +               powernv_freqs[i].driver_data = i;
> 
> I don't think you are using this field at all and this is the field you can
> use for driver_data and so you can get rid of powernv_pstate_ids[ ].

Using driver_data to record powernv_pstate_ids won't work since
powernv_pstate_ids can be negative. So a pstate_id -3 can be confused
with CPUFREQ_BOOST_FREQ thereby not displaying the frequency
corresponding to pstate id -3. So for now I think we will be retaining
powernv_pstate_ids.

> 
> > +               powernv_freqs[i].frequency = freq * 1000; /* kHz */
> > +               powernv_pstate_ids[i] = id;
> > +       }
> > +       /* End of list marker entry */
> > +       powernv_freqs[i].driver_data = 0;
> 
> Not required.

Ok.
> 
> > +       powernv_freqs[i].frequency = CPUFREQ_TABLE_END;
> > +
> > +       /* Print frequency table */
> > +       for (i = 0; powernv_freqs[i].frequency != CPUFREQ_TABLE_END; i++)
> > +               pr_debug("%d: %d\n", i, powernv_freqs[i].frequency);
> 
> You have already printed this table earlier..

Fair enough.

> 
> > +
> > +       return 0;
> > +}
> > +
> > +static struct freq_attr *powernv_cpu_freq_attr[] = {
> > +       &cpufreq_freq_attr_scaling_available_freqs,
> > +       NULL,
> > +};
> 
> Can use this instead: cpufreq_generic_attr?

In this patch yes. But later patch introduces an additional attribute
for displaying the nominal frequency. Will handle that part in a clean
way in the next version.

> 
> > +/* Helper routines */
> > +
> > +/* Access helpers to power mgt SPR */
> > +
> > +static inline unsigned long get_pmspr(unsigned long sprn)
> 
> Looks big enough not be inlined?

It is called from one function. It has been defined separately for
readability. 

> 
> > +{
> > +       switch (sprn) {
> > +       case SPRN_PMCR:
> > +               return mfspr(SPRN_PMCR);
> > +
> > +       case SPRN_PMICR:
> > +               return mfspr(SPRN_PMICR);
> > +
> > +       case SPRN_PMSR:
> > +               return mfspr(SPRN_PMSR);
> > +       }
> > +       BUG();
> > +}
> > +
> > +static inline void set_pmspr(unsigned long sprn, unsigned long val)
> > +{
> 
> Same here..

Same reason as above.

> 
> > +       switch (sprn) {
> > +       case SPRN_PMCR:
> > +               mtspr(SPRN_PMCR, val);
> > +               return;
> > +
> > +       case SPRN_PMICR:
> > +               mtspr(SPRN_PMICR, val);
> > +               return;
> > +
> > +       case SPRN_PMSR:
> > +               mtspr(SPRN_PMSR, val);
> > +               return;
> > +       }
> > +       BUG();
> > +}
> > +
> > +static void set_pstate(void *pstate)
> > +{
> > +       unsigned long val;
> > +       unsigned long pstate_ul = *(unsigned long *) pstate;
> 
> Why not sending value only to this routine instead of pointer?

Well this function is called via an smp_call_function. so, cannot send
a value :(

> 
> > +
> > +       val = get_pmspr(SPRN_PMCR);
> > +       val = val & 0x0000ffffffffffffULL;
> 
> Maybe a blank line here?

Ok.

> 
> > +       /* Set both global(bits 56..63) and local(bits 48..55) PStates */
> > +       val = val | (pstate_ul << 56) | (pstate_ul << 48);
> 
> here as well?

Ok.
> 

> > +       pr_debug("Setting cpu %d pmcr to %016lX\n", smp_processor_id(), val);
> > +       set_pmspr(SPRN_PMCR, val);
> > +}
> > +
> > +static int powernv_set_freq(cpumask_var_t cpus, unsigned int new_index)
> > +{
> > +       unsigned long val = (unsigned long) powernv_pstate_ids[new_index];
> 
> I think automatic type conversion will happen here.

Ok. Will fix this.

> 
> > +
> > +       /*
> > +        * Use smp_call_function to send IPI and execute the
> > +        * mtspr on target cpu.  We could do that without IPI
> > +        * if current CPU is within policy->cpus (core)
> > +        */
> 
> Hmm, interesting I also feel there are cases where this routine can
> get called from other CPUs. Can you list those use cases where it can
> happen? Governors will mostly call this from one of the CPUs present
> in policy->cpus.

Consider the case when the governor is userspace and we are executing 

    # echo freqvalue  > 
         /sys/devices/system/cpu/cpu<i>/cpufreq/scaling_setspeed 

from a cpu <j> which is not in policy->cpus of cpu i. 


> > +static int powernv_cpufreq_cpu_init(struct cpufreq_policy *policy)
> > +{
> > +       int base, i;
> > +
> > +#ifdef CONFIG_SMP
> 
> What will break if you don't have this ifdef here? Without that as well
> below code should work?
> 
> > +       base = cpu_first_thread_sibling(policy->cpu);
> > +
> > +       for (i = 0; i < threads_per_core; i++)
> > +               cpumask_set_cpu(base + i, policy->cpus);
> > +#endif
> > +       policy->cpuinfo.transition_latency = 25000;
> > +
> > +       policy->cur = powernv_freqs[0].frequency;
> > +       cpufreq_frequency_table_get_attr(powernv_freqs, policy->cpu);
> 
> This doesn't exist anymore.

Didn't get this comment!

> 
> > +       return cpufreq_frequency_table_cpuinfo(policy, powernv_freqs);
> 
> Have you written this driver long time back? CPUFreq core has been
> cleaned up heavily since last few kernel releases and I think there are
> better helper routines available now.

Yup it was written quite a while ago. And yeah, CPUFreq has changed
quite a bit since the last time I saw it :-)

> 
> > +}
> > +
> > +static int powernv_cpufreq_cpu_exit(struct cpufreq_policy *policy)
> > +{
> > +       cpufreq_frequency_table_put_attr(policy->cpu);
> > +       return 0;
> > +}
> 
> You don't need this..

Why not ?

> 
> > +static int powernv_cpufreq_verify(struct cpufreq_policy *policy)
> > +{
> > +       return cpufreq_frequency_table_verify(policy, powernv_freqs);
> > +}
> 
> use generic verify function pointer instead..
> 
> > +static int powernv_cpufreq_target(struct cpufreq_policy *policy,
> > +                             unsigned int target_freq,
> > +                             unsigned int relation)
> 
> use target_index() instead..
> 
> > +{
> > +       int rc;
> > +       struct cpufreq_freqs freqs;
> > +       unsigned int new_index;
> > +
> > +       cpufreq_frequency_table_target(policy, powernv_freqs, target_freq,
> > +                                      relation, &new_index);
> > +
> > +       freqs.old = policy->cur;
> > +       freqs.new = powernv_freqs[new_index].frequency;
> > +       freqs.cpu = policy->cpu;
> > +
> > +       mutex_lock(&freq_switch_mutex);
> 
> Why do you need this lock for?

I guess it was to serialize accesses to PMCR. But Srivatsa's patch
converts this to a per-core lock which probably is no longer required
after your cpufreq_freq_transition_begin/end() patch series.

> 
> > +       cpufreq_notify_transition(policy, &freqs, CPUFREQ_PRECHANGE);
> > +
> > +       pr_debug("setting frequency for cpu %d to %d kHz index %d pstate %d",
> > +                policy->cpu,
> > +                powernv_freqs[new_index].frequency,
> > +                new_index,
> > +                powernv_pstate_ids[new_index]);
> > +
> > +       rc = powernv_set_freq(policy->cpus, new_index);
> > +
> > +       cpufreq_notify_transition(policy, &freqs, CPUFREQ_POSTCHANGE);
> > +       mutex_unlock(&freq_switch_mutex);
> > +
> > +       return rc;
> > +}
> > +
> > +static struct cpufreq_driver powernv_cpufreq_driver = {
> > +       .verify         = powernv_cpufreq_verify,
> > +       .target         = powernv_cpufreq_target,
> 
> I think you have Srivatsa there who has seen lots of cpufreq code and
> could have helped you a lot :)
> 
> > +       .init           = powernv_cpufreq_cpu_init,
> > +       .exit           = powernv_cpufreq_cpu_exit,
> > +       .name           = "powernv-cpufreq",
> > +       .flags          = CPUFREQ_CONST_LOOPS,
> > +       .attr           = powernv_cpu_freq_attr,
> 
> Would be better if you keep these callbacks in the order they are declared
> in cpufreq.h..

Sure.

> 
> > +module_init(powernv_cpufreq_init);
> > +module_exit(powernv_cpufreq_exit);
> 
> Place these right after the routines without any blank lines in
between.

Is this the new convention ?

> 
> > +MODULE_LICENSE("GPL");
> > +MODULE_AUTHOR("Vaidyanathan Srinivasan <svaidy at linux.vnet.ibm.com>");
> 

Thanks for the detailed review.

--
Regards
gautham.


^ permalink raw reply	[flat|nested] 80+ messages in thread

* Re: [PATCH v3 1/5] powernv: cpufreq driver for powernv platform
@ 2014-03-21 10:43       ` Gautham R Shenoy
  0 siblings, 0 replies; 80+ messages in thread
From: Gautham R Shenoy @ 2014-03-21 10:43 UTC (permalink / raw)
  To: Viresh Kumar
  Cc: Gautham R. Shenoy, Linux PM list, linuxppc-dev, Anton Blanchard,
	Srivatsa S. Bhat

Hi Viresh,

On Fri, Mar 21, 2014 at 02:11:32PM +0530, Viresh Kumar wrote:
> On Thu, Mar 20, 2014 at 5:40 PM, Gautham R. Shenoy
> <ego@linux.vnet.ibm.com> wrote:
> > From: Vaidyanathan Srinivasan <svaidy@linux.vnet.ibm.com>
> 
> Hi Vaidy,
> 
> > diff --git a/drivers/cpufreq/Kconfig b/drivers/cpufreq/Kconfig
> > index 4b029c0..4ba1632 100644
> > --- a/drivers/cpufreq/Kconfig
> > +++ b/drivers/cpufreq/Kconfig
> > @@ -48,6 +48,7 @@ config CPU_FREQ_STAT_DETAILS
> >  choice
> >         prompt "Default CPUFreq governor"
> >         default CPU_FREQ_DEFAULT_GOV_USERSPACE if ARM_SA1100_CPUFREQ || ARM_SA1110_CPUFREQ
> > +       default CPU_FREQ_DEFAULT_GOV_ONDEMAND if POWERNV_CPUFREQ
> 
> Probably we should remove SA1100's entry as well from here. This is
> not the right way of doing it. Imagine 100 platforms having entries here.
> If you want it, then select it from your platforms Kconfig.

Sure. Will move these bits and the other governor related bits to the
Powernv Kconfig.

> > diff --git a/drivers/cpufreq/powernv-cpufreq.c b/drivers/cpufreq/powernv-cpufreq.c
> > new file mode 100644
> > index 0000000..ab1551f
> > --- /dev/null
> > +
> > +#define pr_fmt(fmt)    "powernv-cpufreq: " fmt
> > +
> > +#include <linux/module.h>
> > +#include <linux/cpufreq.h>
> > +#include <linux/of.h>
> > +#include <asm/cputhreads.h>
> 
> That's it? Sure?
> 
> Even if things compile for you, you must explicitly include all the
> files on which
> you depend.

Ok. 

> 
> > +
> > +       WARN_ON(len_ids != len_freqs);
> > +       nr_pstates = min(len_ids, len_freqs) / sizeof(u32);
> > +       WARN_ON(!nr_pstates);
> 
> Why do you want to continue here?

Good point. We might be better off exiting at this point. 

> 
> > +       pr_debug("NR PStates %d\n", nr_pstates);
> > +       for (i = 0; i < nr_pstates; i++) {
> > +               u32 id = be32_to_cpu(pstate_ids[i]);
> > +               u32 freq = be32_to_cpu(pstate_freqs[i]);
> > +
> > +               pr_debug("PState id %d freq %d MHz\n", id, freq);
> > +               powernv_freqs[i].driver_data = i;
> 
> I don't think you are using this field at all and this is the field you can
> use for driver_data and so you can get rid of powernv_pstate_ids[ ].

Using driver_data to record powernv_pstate_ids won't work since
powernv_pstate_ids can be negative. So a pstate_id -3 can be confused
with CPUFREQ_BOOST_FREQ thereby not displaying the frequency
corresponding to pstate id -3. So for now I think we will be retaining
powernv_pstate_ids.

> 
> > +               powernv_freqs[i].frequency = freq * 1000; /* kHz */
> > +               powernv_pstate_ids[i] = id;
> > +       }
> > +       /* End of list marker entry */
> > +       powernv_freqs[i].driver_data = 0;
> 
> Not required.

Ok.
> 
> > +       powernv_freqs[i].frequency = CPUFREQ_TABLE_END;
> > +
> > +       /* Print frequency table */
> > +       for (i = 0; powernv_freqs[i].frequency != CPUFREQ_TABLE_END; i++)
> > +               pr_debug("%d: %d\n", i, powernv_freqs[i].frequency);
> 
> You have already printed this table earlier..

Fair enough.

> 
> > +
> > +       return 0;
> > +}
> > +
> > +static struct freq_attr *powernv_cpu_freq_attr[] = {
> > +       &cpufreq_freq_attr_scaling_available_freqs,
> > +       NULL,
> > +};
> 
> Can use this instead: cpufreq_generic_attr?

In this patch yes. But later patch introduces an additional attribute
for displaying the nominal frequency. Will handle that part in a clean
way in the next version.

> 
> > +/* Helper routines */
> > +
> > +/* Access helpers to power mgt SPR */
> > +
> > +static inline unsigned long get_pmspr(unsigned long sprn)
> 
> Looks big enough not be inlined?

It is called from one function. It has been defined separately for
readability. 

> 
> > +{
> > +       switch (sprn) {
> > +       case SPRN_PMCR:
> > +               return mfspr(SPRN_PMCR);
> > +
> > +       case SPRN_PMICR:
> > +               return mfspr(SPRN_PMICR);
> > +
> > +       case SPRN_PMSR:
> > +               return mfspr(SPRN_PMSR);
> > +       }
> > +       BUG();
> > +}
> > +
> > +static inline void set_pmspr(unsigned long sprn, unsigned long val)
> > +{
> 
> Same here..

Same reason as above.

> 
> > +       switch (sprn) {
> > +       case SPRN_PMCR:
> > +               mtspr(SPRN_PMCR, val);
> > +               return;
> > +
> > +       case SPRN_PMICR:
> > +               mtspr(SPRN_PMICR, val);
> > +               return;
> > +
> > +       case SPRN_PMSR:
> > +               mtspr(SPRN_PMSR, val);
> > +               return;
> > +       }
> > +       BUG();
> > +}
> > +
> > +static void set_pstate(void *pstate)
> > +{
> > +       unsigned long val;
> > +       unsigned long pstate_ul = *(unsigned long *) pstate;
> 
> Why not sending value only to this routine instead of pointer?

Well this function is called via an smp_call_function. so, cannot send
a value :(

> 
> > +
> > +       val = get_pmspr(SPRN_PMCR);
> > +       val = val & 0x0000ffffffffffffULL;
> 
> Maybe a blank line here?

Ok.

> 
> > +       /* Set both global(bits 56..63) and local(bits 48..55) PStates */
> > +       val = val | (pstate_ul << 56) | (pstate_ul << 48);
> 
> here as well?

Ok.
> 

> > +       pr_debug("Setting cpu %d pmcr to %016lX\n", smp_processor_id(), val);
> > +       set_pmspr(SPRN_PMCR, val);
> > +}
> > +
> > +static int powernv_set_freq(cpumask_var_t cpus, unsigned int new_index)
> > +{
> > +       unsigned long val = (unsigned long) powernv_pstate_ids[new_index];
> 
> I think automatic type conversion will happen here.

Ok. Will fix this.

> 
> > +
> > +       /*
> > +        * Use smp_call_function to send IPI and execute the
> > +        * mtspr on target cpu.  We could do that without IPI
> > +        * if current CPU is within policy->cpus (core)
> > +        */
> 
> Hmm, interesting I also feel there are cases where this routine can
> get called from other CPUs. Can you list those use cases where it can
> happen? Governors will mostly call this from one of the CPUs present
> in policy->cpus.

Consider the case when the governor is userspace and we are executing 

    # echo freqvalue  > 
         /sys/devices/system/cpu/cpu<i>/cpufreq/scaling_setspeed 

from a cpu <j> which is not in policy->cpus of cpu i. 


> > +static int powernv_cpufreq_cpu_init(struct cpufreq_policy *policy)
> > +{
> > +       int base, i;
> > +
> > +#ifdef CONFIG_SMP
> 
> What will break if you don't have this ifdef here? Without that as well
> below code should work?
> 
> > +       base = cpu_first_thread_sibling(policy->cpu);
> > +
> > +       for (i = 0; i < threads_per_core; i++)
> > +               cpumask_set_cpu(base + i, policy->cpus);
> > +#endif
> > +       policy->cpuinfo.transition_latency = 25000;
> > +
> > +       policy->cur = powernv_freqs[0].frequency;
> > +       cpufreq_frequency_table_get_attr(powernv_freqs, policy->cpu);
> 
> This doesn't exist anymore.

Didn't get this comment!

> 
> > +       return cpufreq_frequency_table_cpuinfo(policy, powernv_freqs);
> 
> Have you written this driver long time back? CPUFreq core has been
> cleaned up heavily since last few kernel releases and I think there are
> better helper routines available now.

Yup it was written quite a while ago. And yeah, CPUFreq has changed
quite a bit since the last time I saw it :-)

> 
> > +}
> > +
> > +static int powernv_cpufreq_cpu_exit(struct cpufreq_policy *policy)
> > +{
> > +       cpufreq_frequency_table_put_attr(policy->cpu);
> > +       return 0;
> > +}
> 
> You don't need this..

Why not ?

> 
> > +static int powernv_cpufreq_verify(struct cpufreq_policy *policy)
> > +{
> > +       return cpufreq_frequency_table_verify(policy, powernv_freqs);
> > +}
> 
> use generic verify function pointer instead..
> 
> > +static int powernv_cpufreq_target(struct cpufreq_policy *policy,
> > +                             unsigned int target_freq,
> > +                             unsigned int relation)
> 
> use target_index() instead..
> 
> > +{
> > +       int rc;
> > +       struct cpufreq_freqs freqs;
> > +       unsigned int new_index;
> > +
> > +       cpufreq_frequency_table_target(policy, powernv_freqs, target_freq,
> > +                                      relation, &new_index);
> > +
> > +       freqs.old = policy->cur;
> > +       freqs.new = powernv_freqs[new_index].frequency;
> > +       freqs.cpu = policy->cpu;
> > +
> > +       mutex_lock(&freq_switch_mutex);
> 
> Why do you need this lock for?

I guess it was to serialize accesses to PMCR. But Srivatsa's patch
converts this to a per-core lock which probably is no longer required
after your cpufreq_freq_transition_begin/end() patch series.

> 
> > +       cpufreq_notify_transition(policy, &freqs, CPUFREQ_PRECHANGE);
> > +
> > +       pr_debug("setting frequency for cpu %d to %d kHz index %d pstate %d",
> > +                policy->cpu,
> > +                powernv_freqs[new_index].frequency,
> > +                new_index,
> > +                powernv_pstate_ids[new_index]);
> > +
> > +       rc = powernv_set_freq(policy->cpus, new_index);
> > +
> > +       cpufreq_notify_transition(policy, &freqs, CPUFREQ_POSTCHANGE);
> > +       mutex_unlock(&freq_switch_mutex);
> > +
> > +       return rc;
> > +}
> > +
> > +static struct cpufreq_driver powernv_cpufreq_driver = {
> > +       .verify         = powernv_cpufreq_verify,
> > +       .target         = powernv_cpufreq_target,
> 
> I think you have Srivatsa there who has seen lots of cpufreq code and
> could have helped you a lot :)
> 
> > +       .init           = powernv_cpufreq_cpu_init,
> > +       .exit           = powernv_cpufreq_cpu_exit,
> > +       .name           = "powernv-cpufreq",
> > +       .flags          = CPUFREQ_CONST_LOOPS,
> > +       .attr           = powernv_cpu_freq_attr,
> 
> Would be better if you keep these callbacks in the order they are declared
> in cpufreq.h..

Sure.

> 
> > +module_init(powernv_cpufreq_init);
> > +module_exit(powernv_cpufreq_exit);
> 
> Place these right after the routines without any blank lines in
between.

Is this the new convention ?

> 
> > +MODULE_LICENSE("GPL");
> > +MODULE_AUTHOR("Vaidyanathan Srinivasan <svaidy at linux.vnet.ibm.com>");
> 

Thanks for the detailed review.

--
Regards
gautham.

^ permalink raw reply	[flat|nested] 80+ messages in thread

* Re: [PATCH v3 1/5] powernv: cpufreq driver for powernv platform
  2014-03-21 10:43       ` Gautham R Shenoy
@ 2014-03-21 10:54         ` Viresh Kumar
  -1 siblings, 0 replies; 80+ messages in thread
From: Viresh Kumar @ 2014-03-21 10:54 UTC (permalink / raw)
  To: ego
  Cc: linuxppc-dev, Linux PM list, Benjamin Herrenschmidt,
	Vaidyanathan Srinivasan, Srivatsa S. Bhat, Anton Blanchard

On 21 March 2014 16:13, Gautham R Shenoy <ego@linux.vnet.ibm.com> wrote:
> On Fri, Mar 21, 2014 at 02:11:32PM +0530, Viresh Kumar wrote:

>> > +               pr_debug("PState id %d freq %d MHz\n", id, freq);
>> > +               powernv_freqs[i].driver_data = i;
>>
>> I don't think you are using this field at all and this is the field you can
>> use for driver_data and so you can get rid of powernv_pstate_ids[ ].
>
> Using driver_data to record powernv_pstate_ids won't work since
> powernv_pstate_ids can be negative. So a pstate_id -3 can be confused
> with CPUFREQ_BOOST_FREQ thereby not displaying the frequency
> corresponding to pstate id -3. So for now I think we will be retaining
> powernv_pstate_ids.

As I said earlier, this field is only used by cpufreq drivers and cpufreq core
isn't supposed to touch it. CPUFREQ_BOOST_FREQ and other macros
are there for .frequency field and not this one.


>> > +static int powernv_cpufreq_cpu_init(struct cpufreq_policy *policy)
>> > +{
>> > +       int base, i;
>> > +
>> > +#ifdef CONFIG_SMP
>>
>> What will break if you don't have this ifdef here? Without that as well
>> below code should work?

Missed this one?

>> > +       base = cpu_first_thread_sibling(policy->cpu);
>> > +
>> > +       for (i = 0; i < threads_per_core; i++)
>> > +               cpumask_set_cpu(base + i, policy->cpus);
>> > +#endif
>> > +       policy->cpuinfo.transition_latency = 25000;
>> > +
>> > +       policy->cur = powernv_freqs[0].frequency;
>> > +       cpufreq_frequency_table_get_attr(powernv_freqs, policy->cpu);
>>
>> This doesn't exist anymore.
>
> Didn't get this comment!

cpufreq_frequency_table_get_attr() routine doesn't exist anymore.

>> > +static int powernv_cpufreq_cpu_exit(struct cpufreq_policy *policy)
>> > +{
>> > +       cpufreq_frequency_table_put_attr(policy->cpu);
>> > +       return 0;
>> > +}
>>
>> You don't need this..
>
> Why not ?

Because cpufreq_frequency_table_get_attr() and
cpufreq_frequency_table_put_attr() don't exist anymore :)

>> > +module_init(powernv_cpufreq_init);
>> > +module_exit(powernv_cpufreq_exit);
>>
>> Place these right after the routines without any blank lines in
> between.
>
> Is this the new convention ?

Don't know I have been following this since long time, probably from the
time I started with Mainline stuff.. I have seen many maintainers advising this
as it make code more readable, specially if these routines are quite big..

Probably it isn't mentioned in coding guidelines but people follow it most of
the times.

^ permalink raw reply	[flat|nested] 80+ messages in thread

* Re: [PATCH v3 1/5] powernv: cpufreq driver for powernv platform
@ 2014-03-21 10:54         ` Viresh Kumar
  0 siblings, 0 replies; 80+ messages in thread
From: Viresh Kumar @ 2014-03-21 10:54 UTC (permalink / raw)
  To: ego; +Cc: Linux PM list, linuxppc-dev, Anton Blanchard, Srivatsa S. Bhat

On 21 March 2014 16:13, Gautham R Shenoy <ego@linux.vnet.ibm.com> wrote:
> On Fri, Mar 21, 2014 at 02:11:32PM +0530, Viresh Kumar wrote:

>> > +               pr_debug("PState id %d freq %d MHz\n", id, freq);
>> > +               powernv_freqs[i].driver_data = i;
>>
>> I don't think you are using this field at all and this is the field you can
>> use for driver_data and so you can get rid of powernv_pstate_ids[ ].
>
> Using driver_data to record powernv_pstate_ids won't work since
> powernv_pstate_ids can be negative. So a pstate_id -3 can be confused
> with CPUFREQ_BOOST_FREQ thereby not displaying the frequency
> corresponding to pstate id -3. So for now I think we will be retaining
> powernv_pstate_ids.

As I said earlier, this field is only used by cpufreq drivers and cpufreq core
isn't supposed to touch it. CPUFREQ_BOOST_FREQ and other macros
are there for .frequency field and not this one.


>> > +static int powernv_cpufreq_cpu_init(struct cpufreq_policy *policy)
>> > +{
>> > +       int base, i;
>> > +
>> > +#ifdef CONFIG_SMP
>>
>> What will break if you don't have this ifdef here? Without that as well
>> below code should work?

Missed this one?

>> > +       base = cpu_first_thread_sibling(policy->cpu);
>> > +
>> > +       for (i = 0; i < threads_per_core; i++)
>> > +               cpumask_set_cpu(base + i, policy->cpus);
>> > +#endif
>> > +       policy->cpuinfo.transition_latency = 25000;
>> > +
>> > +       policy->cur = powernv_freqs[0].frequency;
>> > +       cpufreq_frequency_table_get_attr(powernv_freqs, policy->cpu);
>>
>> This doesn't exist anymore.
>
> Didn't get this comment!

cpufreq_frequency_table_get_attr() routine doesn't exist anymore.

>> > +static int powernv_cpufreq_cpu_exit(struct cpufreq_policy *policy)
>> > +{
>> > +       cpufreq_frequency_table_put_attr(policy->cpu);
>> > +       return 0;
>> > +}
>>
>> You don't need this..
>
> Why not ?

Because cpufreq_frequency_table_get_attr() and
cpufreq_frequency_table_put_attr() don't exist anymore :)

>> > +module_init(powernv_cpufreq_init);
>> > +module_exit(powernv_cpufreq_exit);
>>
>> Place these right after the routines without any blank lines in
> between.
>
> Is this the new convention ?

Don't know I have been following this since long time, probably from the
time I started with Mainline stuff.. I have seen many maintainers advising this
as it make code more readable, specially if these routines are quite big..

Probably it isn't mentioned in coding guidelines but people follow it most of
the times.

^ permalink raw reply	[flat|nested] 80+ messages in thread

* Re: [PATCH v3 5/5] powernv:cpufreq: Implement the driver->get() method
  2014-03-21  9:31     ` Viresh Kumar
@ 2014-03-21 11:04       ` Gautham R Shenoy
  -1 siblings, 0 replies; 80+ messages in thread
From: Gautham R Shenoy @ 2014-03-21 11:04 UTC (permalink / raw)
  To: Viresh Kumar; +Cc: Gautham R. Shenoy, linuxppc-dev, Linux PM list, benh, svaidy

On Fri, Mar 21, 2014 at 03:01:29PM +0530, Viresh Kumar wrote:
> On Thu, Mar 20, 2014 at 5:41 PM, Gautham R. Shenoy
> <ego@linux.vnet.ibm.com> wrote:
> > From: "Gautham R. Shenoy" <ego@linux.vnet.ibm.com>
> >
> > The current frequency of a cpu is reported through the sysfs file
> > cpuinfo_cur_freq. This requires the driver to implement a
> > "->get(unsigned int cpu)" method which will return the current
> > operating frequency.
> >
> > Implement a function named powernv_cpufreq_get() which reads the local
> > pstate from the PMSR and returns the corresponding frequency.
> >
> > Set the powernv_cpufreq_driver.get hook to powernv_cpufreq_get().
> >
> > Signed-off-by: Gautham R. Shenoy <ego@linux.vnet.ibm.com>
> > ---
> >  drivers/cpufreq/powernv-cpufreq.c | 38 ++++++++++++++++++++++++++++++++++++++
> >  1 file changed, 38 insertions(+)
> 
> Please merge these fixups with the first patch which is creating the driver.
> I understand that a different guy has created this patch and so wanted
> to have a patch on his name but its really difficult to review this
way.

Heh! Well, that wasn't the reason why this was sent out as a separate
patch, but never mind. Though I don't understand why it would be
difficult to review the patch though.

> Better add your signed-off in the first patch instead. Sending such changes
> once the driver is mainlined looks fine.

Sure, this makes sense.

> 
> > diff --git a/drivers/cpufreq/powernv-cpufreq.c b/drivers/cpufreq/powernv-cpufreq.c
> > index 46bee8a..ef6ed8c 100644
> > --- a/drivers/cpufreq/powernv-cpufreq.c
> > +++ b/drivers/cpufreq/powernv-cpufreq.c
> > @@ -206,6 +206,43 @@ static inline void set_pmspr(unsigned long sprn, unsigned long val)
> >         BUG();
> >  }
> >
> > +/*
> > + * Computes the current frequency on this cpu
> > + * and stores the result in *ret_freq.
> > + */
> > +static void powernv_read_cpu_freq(void *ret_freq)
> > +{
> > +       unsigned long pmspr_val;
> > +       s8 local_pstate_id;
> > +       int *cur_freq, freq, pstate_id;
> > +
> > +       cur_freq = (int *)ret_freq;
> 
> You don't need cur_freq variable at all..

I don't like it either. But the compiler complains without this hack
:-(

> 
> > +       pmspr_val = get_pmspr(SPRN_PMSR);
> > +
> > +       /* The local pstate id corresponds bits 48..55 in the PMSR.
> > +         * Note: Watch out for the sign! */
> > +       local_pstate_id = (pmspr_val >> 48) & 0xFF;
> > +       pstate_id = local_pstate_id;
> 
> similarly local_pstate_id

well, I am interested in the bits 48..55 of pmspr_val. That's the
pstate_id which can be negative. So I'll like to keep
local_pstate_id. 
> 
> > +
> > +       freq = pstate_id_to_freq(pstate_id);
> > +       pr_debug("cpu %d pmsr %lx pstate_id %d frequency %d \n",
> > +               smp_processor_id(), pmspr_val, pstate_id, freq);
> > +       *cur_freq = freq;
> 
> Move above print here after a blank line. Also remove freq variable as
> well and use *cur_freq directly.. And then you can rename it to freq
as well.

We can get rid of freq and use *cur_freq in its place.


Thanks for the detailed review.

--
Regards
gautham.


^ permalink raw reply	[flat|nested] 80+ messages in thread

* Re: [PATCH v3 5/5] powernv:cpufreq: Implement the driver->get() method
@ 2014-03-21 11:04       ` Gautham R Shenoy
  0 siblings, 0 replies; 80+ messages in thread
From: Gautham R Shenoy @ 2014-03-21 11:04 UTC (permalink / raw)
  To: Viresh Kumar; +Cc: Gautham R. Shenoy, Linux PM list, linuxppc-dev

On Fri, Mar 21, 2014 at 03:01:29PM +0530, Viresh Kumar wrote:
> On Thu, Mar 20, 2014 at 5:41 PM, Gautham R. Shenoy
> <ego@linux.vnet.ibm.com> wrote:
> > From: "Gautham R. Shenoy" <ego@linux.vnet.ibm.com>
> >
> > The current frequency of a cpu is reported through the sysfs file
> > cpuinfo_cur_freq. This requires the driver to implement a
> > "->get(unsigned int cpu)" method which will return the current
> > operating frequency.
> >
> > Implement a function named powernv_cpufreq_get() which reads the local
> > pstate from the PMSR and returns the corresponding frequency.
> >
> > Set the powernv_cpufreq_driver.get hook to powernv_cpufreq_get().
> >
> > Signed-off-by: Gautham R. Shenoy <ego@linux.vnet.ibm.com>
> > ---
> >  drivers/cpufreq/powernv-cpufreq.c | 38 ++++++++++++++++++++++++++++++++++++++
> >  1 file changed, 38 insertions(+)
> 
> Please merge these fixups with the first patch which is creating the driver.
> I understand that a different guy has created this patch and so wanted
> to have a patch on his name but its really difficult to review this
way.

Heh! Well, that wasn't the reason why this was sent out as a separate
patch, but never mind. Though I don't understand why it would be
difficult to review the patch though.

> Better add your signed-off in the first patch instead. Sending such changes
> once the driver is mainlined looks fine.

Sure, this makes sense.

> 
> > diff --git a/drivers/cpufreq/powernv-cpufreq.c b/drivers/cpufreq/powernv-cpufreq.c
> > index 46bee8a..ef6ed8c 100644
> > --- a/drivers/cpufreq/powernv-cpufreq.c
> > +++ b/drivers/cpufreq/powernv-cpufreq.c
> > @@ -206,6 +206,43 @@ static inline void set_pmspr(unsigned long sprn, unsigned long val)
> >         BUG();
> >  }
> >
> > +/*
> > + * Computes the current frequency on this cpu
> > + * and stores the result in *ret_freq.
> > + */
> > +static void powernv_read_cpu_freq(void *ret_freq)
> > +{
> > +       unsigned long pmspr_val;
> > +       s8 local_pstate_id;
> > +       int *cur_freq, freq, pstate_id;
> > +
> > +       cur_freq = (int *)ret_freq;
> 
> You don't need cur_freq variable at all..

I don't like it either. But the compiler complains without this hack
:-(

> 
> > +       pmspr_val = get_pmspr(SPRN_PMSR);
> > +
> > +       /* The local pstate id corresponds bits 48..55 in the PMSR.
> > +         * Note: Watch out for the sign! */
> > +       local_pstate_id = (pmspr_val >> 48) & 0xFF;
> > +       pstate_id = local_pstate_id;
> 
> similarly local_pstate_id

well, I am interested in the bits 48..55 of pmspr_val. That's the
pstate_id which can be negative. So I'll like to keep
local_pstate_id. 
> 
> > +
> > +       freq = pstate_id_to_freq(pstate_id);
> > +       pr_debug("cpu %d pmsr %lx pstate_id %d frequency %d \n",
> > +               smp_processor_id(), pmspr_val, pstate_id, freq);
> > +       *cur_freq = freq;
> 
> Move above print here after a blank line. Also remove freq variable as
> well and use *cur_freq directly.. And then you can rename it to freq
as well.

We can get rid of freq and use *cur_freq in its place.


Thanks for the detailed review.

--
Regards
gautham.

^ permalink raw reply	[flat|nested] 80+ messages in thread

* Re: [PATCH v3 1/5] powernv: cpufreq driver for powernv platform
  2014-03-21 10:54         ` Viresh Kumar
@ 2014-03-21 11:40           ` Srivatsa S. Bhat
  -1 siblings, 0 replies; 80+ messages in thread
From: Srivatsa S. Bhat @ 2014-03-21 11:40 UTC (permalink / raw)
  To: Viresh Kumar; +Cc: ego, Linux PM list, linuxppc-dev, Anton Blanchard

On 03/21/2014 04:24 PM, Viresh Kumar wrote:
> On 21 March 2014 16:13, Gautham R Shenoy <ego@linux.vnet.ibm.com> wrote:
>> On Fri, Mar 21, 2014 at 02:11:32PM +0530, Viresh Kumar wrote:
> 
>>>> +               pr_debug("PState id %d freq %d MHz\n", id, freq);
>>>> +               powernv_freqs[i].driver_data = i;
>>>
[...] 
>>>> +static int powernv_cpufreq_cpu_init(struct cpufreq_policy *policy)
>>>> +{
>>>> +       int base, i;
>>>> +
>>>> +#ifdef CONFIG_SMP
>>>
>>> What will break if you don't have this ifdef here? Without that as well
>>> below code should work?
> 
> Missed this one?
> 

Nothing will break, its just that the code size will be a tiny bit
lesser on UP configs, that's all :-) Anyway, I think removing the ifdef
improves the readability (and doesn't add any noticeable overhead on UP
kernels), so let's get rid of it.

Regards,
Srivatsa S. Bhat

>>>> +       base = cpu_first_thread_sibling(policy->cpu);
>>>> +
>>>> +       for (i = 0; i < threads_per_core; i++)
>>>> +               cpumask_set_cpu(base + i, policy->cpus);
>>>> +#endif
>>>> +       policy->cpuinfo.transition_latency = 25000;
>>>> +
>>>> +       policy->cur = powernv_freqs[0].frequency;
>>>> +       cpufreq_frequency_table_get_attr(powernv_freqs, policy->cpu);
>>>


^ permalink raw reply	[flat|nested] 80+ messages in thread

* Re: [PATCH v3 1/5] powernv: cpufreq driver for powernv platform
@ 2014-03-21 11:40           ` Srivatsa S. Bhat
  0 siblings, 0 replies; 80+ messages in thread
From: Srivatsa S. Bhat @ 2014-03-21 11:40 UTC (permalink / raw)
  To: Viresh Kumar; +Cc: linuxppc-dev, ego, Anton Blanchard, Linux PM list

On 03/21/2014 04:24 PM, Viresh Kumar wrote:
> On 21 March 2014 16:13, Gautham R Shenoy <ego@linux.vnet.ibm.com> wrote:
>> On Fri, Mar 21, 2014 at 02:11:32PM +0530, Viresh Kumar wrote:
> 
>>>> +               pr_debug("PState id %d freq %d MHz\n", id, freq);
>>>> +               powernv_freqs[i].driver_data = i;
>>>
[...] 
>>>> +static int powernv_cpufreq_cpu_init(struct cpufreq_policy *policy)
>>>> +{
>>>> +       int base, i;
>>>> +
>>>> +#ifdef CONFIG_SMP
>>>
>>> What will break if you don't have this ifdef here? Without that as well
>>> below code should work?
> 
> Missed this one?
> 

Nothing will break, its just that the code size will be a tiny bit
lesser on UP configs, that's all :-) Anyway, I think removing the ifdef
improves the readability (and doesn't add any noticeable overhead on UP
kernels), so let's get rid of it.

Regards,
Srivatsa S. Bhat

>>>> +       base = cpu_first_thread_sibling(policy->cpu);
>>>> +
>>>> +       for (i = 0; i < threads_per_core; i++)
>>>> +               cpumask_set_cpu(base + i, policy->cpus);
>>>> +#endif
>>>> +       policy->cpuinfo.transition_latency = 25000;
>>>> +
>>>> +       policy->cur = powernv_freqs[0].frequency;
>>>> +       cpufreq_frequency_table_get_attr(powernv_freqs, policy->cpu);
>>>

^ permalink raw reply	[flat|nested] 80+ messages in thread

* Re: [PATCH v3 1/5] powernv: cpufreq driver for powernv platform
  2014-03-21  8:41     ` Viresh Kumar
@ 2014-03-21 11:45       ` Srivatsa S. Bhat
  -1 siblings, 0 replies; 80+ messages in thread
From: Srivatsa S. Bhat @ 2014-03-21 11:45 UTC (permalink / raw)
  To: Viresh Kumar
  Cc: Gautham R. Shenoy, linuxppc-dev, Linux PM list, benh, svaidy,
	Anton Blanchard

On 03/21/2014 02:11 PM, Viresh Kumar wrote:
> On Thu, Mar 20, 2014 at 5:40 PM, Gautham R. Shenoy
> <ego@linux.vnet.ibm.com> wrote:
>> From: Vaidyanathan Srinivasan <svaidy@linux.vnet.ibm.com>
> 
> Hi Vaidy,
> 
>> diff --git a/drivers/cpufreq/Kconfig b/drivers/cpufreq/Kconfig
>> index 4b029c0..4ba1632 100644
>> --- a/drivers/cpufreq/Kconfig
>> +++ b/drivers/cpufreq/Kconfig
>> @@ -48,6 +48,7 @@ config CPU_FREQ_STAT_DETAILS
>>  choice
[...]
>> +static struct cpufreq_driver powernv_cpufreq_driver = {
>> +       .verify         = powernv_cpufreq_verify,
>> +       .target         = powernv_cpufreq_target,
> 
> I think you have Srivatsa there who has seen lots of cpufreq code and
> could have helped you a lot :)
>

:-)

I have followed the locking and synchronization aspects of cpufreq
closely, but unfortunately, I haven't been able to keep up with some of
the other details that well :-( Besides, its hard to keep up with the
rate of your changes, you are way too fast! ;-)

Regards,
Srivatsa S. Bhat


^ permalink raw reply	[flat|nested] 80+ messages in thread

* Re: [PATCH v3 1/5] powernv: cpufreq driver for powernv platform
@ 2014-03-21 11:45       ` Srivatsa S. Bhat
  0 siblings, 0 replies; 80+ messages in thread
From: Srivatsa S. Bhat @ 2014-03-21 11:45 UTC (permalink / raw)
  To: Viresh Kumar
  Cc: Gautham R. Shenoy, Linux PM list, linuxppc-dev, Anton Blanchard

On 03/21/2014 02:11 PM, Viresh Kumar wrote:
> On Thu, Mar 20, 2014 at 5:40 PM, Gautham R. Shenoy
> <ego@linux.vnet.ibm.com> wrote:
>> From: Vaidyanathan Srinivasan <svaidy@linux.vnet.ibm.com>
> 
> Hi Vaidy,
> 
>> diff --git a/drivers/cpufreq/Kconfig b/drivers/cpufreq/Kconfig
>> index 4b029c0..4ba1632 100644
>> --- a/drivers/cpufreq/Kconfig
>> +++ b/drivers/cpufreq/Kconfig
>> @@ -48,6 +48,7 @@ config CPU_FREQ_STAT_DETAILS
>>  choice
[...]
>> +static struct cpufreq_driver powernv_cpufreq_driver = {
>> +       .verify         = powernv_cpufreq_verify,
>> +       .target         = powernv_cpufreq_target,
> 
> I think you have Srivatsa there who has seen lots of cpufreq code and
> could have helped you a lot :)
>

:-)

I have followed the locking and synchronization aspects of cpufreq
closely, but unfortunately, I haven't been able to keep up with some of
the other details that well :-( Besides, its hard to keep up with the
rate of your changes, you are way too fast! ;-)

Regards,
Srivatsa S. Bhat

^ permalink raw reply	[flat|nested] 80+ messages in thread

* Re: [PATCH v3 5/5] powernv:cpufreq: Implement the driver->get() method
  2014-03-21 11:04       ` Gautham R Shenoy
@ 2014-03-21 11:45         ` Viresh Kumar
  -1 siblings, 0 replies; 80+ messages in thread
From: Viresh Kumar @ 2014-03-21 11:45 UTC (permalink / raw)
  To: ego
  Cc: linuxppc-dev, Linux PM list, Benjamin Herrenschmidt,
	Vaidyanathan Srinivasan

On 21 March 2014 16:34, Gautham R Shenoy <ego@linux.vnet.ibm.com> wrote:
> Heh! Well, that wasn't the reason why this was sent out as a separate
> patch, but never mind. Though I don't understand why it would be
> difficult to review the patch though.

Because the initial driver wasn't complete earlier. There were 2-3 patches
after the first one which are changing what the first patch has added.
Nothing else :)

>> > +static void powernv_read_cpu_freq(void *ret_freq)
>> > +{
>> > +       unsigned long pmspr_val;
>> > +       s8 local_pstate_id;
>> > +       int *cur_freq, freq, pstate_id;
>> > +
>> > +       cur_freq = (int *)ret_freq;
>>
>> You don't need cur_freq variable at all..
>
> I don't like it either. But the compiler complains without this hack
> :-(

Why would the compiler warn for doing this?:

*(int *)ret_freq = freq;

>> > +       pmspr_val = get_pmspr(SPRN_PMSR);
>> > +
>> > +       /* The local pstate id corresponds bits 48..55 in the PMSR.
>> > +         * Note: Watch out for the sign! */
>> > +       local_pstate_id = (pmspr_val >> 48) & 0xFF;
>> > +       pstate_id = local_pstate_id;
>>
>> similarly local_pstate_id
>
> well, I am interested in the bits 48..55 of pmspr_val. That's the
> pstate_id which can be negative. So I'll like to keep
> local_pstate_id.

Can you please explain at bit level how getting the value in a s8
first and then into a s32 will help you here? Instead of getting it
directly into s32.

^ permalink raw reply	[flat|nested] 80+ messages in thread

* Re: [PATCH v3 5/5] powernv:cpufreq: Implement the driver->get() method
@ 2014-03-21 11:45         ` Viresh Kumar
  0 siblings, 0 replies; 80+ messages in thread
From: Viresh Kumar @ 2014-03-21 11:45 UTC (permalink / raw)
  To: ego; +Cc: linuxppc-dev, Linux PM list

On 21 March 2014 16:34, Gautham R Shenoy <ego@linux.vnet.ibm.com> wrote:
> Heh! Well, that wasn't the reason why this was sent out as a separate
> patch, but never mind. Though I don't understand why it would be
> difficult to review the patch though.

Because the initial driver wasn't complete earlier. There were 2-3 patches
after the first one which are changing what the first patch has added.
Nothing else :)

>> > +static void powernv_read_cpu_freq(void *ret_freq)
>> > +{
>> > +       unsigned long pmspr_val;
>> > +       s8 local_pstate_id;
>> > +       int *cur_freq, freq, pstate_id;
>> > +
>> > +       cur_freq = (int *)ret_freq;
>>
>> You don't need cur_freq variable at all..
>
> I don't like it either. But the compiler complains without this hack
> :-(

Why would the compiler warn for doing this?:

*(int *)ret_freq = freq;

>> > +       pmspr_val = get_pmspr(SPRN_PMSR);
>> > +
>> > +       /* The local pstate id corresponds bits 48..55 in the PMSR.
>> > +         * Note: Watch out for the sign! */
>> > +       local_pstate_id = (pmspr_val >> 48) & 0xFF;
>> > +       pstate_id = local_pstate_id;
>>
>> similarly local_pstate_id
>
> well, I am interested in the bits 48..55 of pmspr_val. That's the
> pstate_id which can be negative. So I'll like to keep
> local_pstate_id.

Can you please explain at bit level how getting the value in a s8
first and then into a s32 will help you here? Instead of getting it
directly into s32.

^ permalink raw reply	[flat|nested] 80+ messages in thread

* Re: [PATCH v3 1/5] powernv: cpufreq driver for powernv platform
  2014-03-21 11:45       ` Srivatsa S. Bhat
@ 2014-03-21 11:47         ` Viresh Kumar
  -1 siblings, 0 replies; 80+ messages in thread
From: Viresh Kumar @ 2014-03-21 11:47 UTC (permalink / raw)
  To: Srivatsa S. Bhat
  Cc: Gautham R. Shenoy, linuxppc-dev, Linux PM list,
	Benjamin Herrenschmidt, Vaidyanathan Srinivasan, Anton Blanchard

On 21 March 2014 17:15, Srivatsa S. Bhat
<srivatsa.bhat@linux.vnet.ibm.com> wrote:
>> I think you have Srivatsa there who has seen lots of cpufreq code and
>> could have helped you a lot :)
>>
>
> :-)

I was waiting for your reply here :)

> I have followed the locking and synchronization aspects of cpufreq
> closely, but unfortunately, I haven't been able to keep up with some of
> the other details that well :-( Besides, its hard to keep up with the
> rate of your changes, you are way too fast! ;-)

Haha.. I am a very slow guy, really :)

^ permalink raw reply	[flat|nested] 80+ messages in thread

* Re: [PATCH v3 1/5] powernv: cpufreq driver for powernv platform
@ 2014-03-21 11:47         ` Viresh Kumar
  0 siblings, 0 replies; 80+ messages in thread
From: Viresh Kumar @ 2014-03-21 11:47 UTC (permalink / raw)
  To: Srivatsa S. Bhat
  Cc: Gautham R. Shenoy, Linux PM list, linuxppc-dev, Anton Blanchard

On 21 March 2014 17:15, Srivatsa S. Bhat
<srivatsa.bhat@linux.vnet.ibm.com> wrote:
>> I think you have Srivatsa there who has seen lots of cpufreq code and
>> could have helped you a lot :)
>>
>
> :-)

I was waiting for your reply here :)

> I have followed the locking and synchronization aspects of cpufreq
> closely, but unfortunately, I haven't been able to keep up with some of
> the other details that well :-( Besides, its hard to keep up with the
> rate of your changes, you are way too fast! ;-)

Haha.. I am a very slow guy, really :)

^ permalink raw reply	[flat|nested] 80+ messages in thread

* RE: [PATCH v3 5/5] powernv:cpufreq: Implement the driver->get() method
  2014-03-21 11:45         ` Viresh Kumar
@ 2014-03-21 12:01           ` David Laight
  -1 siblings, 0 replies; 80+ messages in thread
From: David Laight @ 2014-03-21 12:01 UTC (permalink / raw)
  To: 'Viresh Kumar', ego; +Cc: linuxppc-dev, Linux PM list

From: Viresh Kumar
 
> On 21 March 2014 16:34, Gautham R Shenoy <ego@linux.vnet.ibm.com> wrote:
> > Heh! Well, that wasn't the reason why this was sent out as a separate
> > patch, but never mind. Though I don't understand why it would be
> > difficult to review the patch though.
> 
> Because the initial driver wasn't complete earlier. There were 2-3 patches
> after the first one which are changing what the first patch has added.
> Nothing else :)
> 
> >> > +static void powernv_read_cpu_freq(void *ret_freq)
> >> > +{
> >> > +       unsigned long pmspr_val;
> >> > +       s8 local_pstate_id;
> >> > +       int *cur_freq, freq, pstate_id;
> >> > +
> >> > +       cur_freq = (int *)ret_freq;
> >>
> >> You don't need cur_freq variable at all..
> >
> > I don't like it either. But the compiler complains without this hack
> > :-(
> 
> Why would the compiler warn for doing this?:
> 
> *(int *)ret_freq = freq;

Because it is very likely to be wrong.
In general casts of pointers to integer types are dangerous.
In this case why not make the function return the value?

	David



^ permalink raw reply	[flat|nested] 80+ messages in thread

* RE: [PATCH v3 5/5] powernv:cpufreq: Implement the driver->get() method
@ 2014-03-21 12:01           ` David Laight
  0 siblings, 0 replies; 80+ messages in thread
From: David Laight @ 2014-03-21 12:01 UTC (permalink / raw)
  To: 'Viresh Kumar', ego; +Cc: linuxppc-dev, Linux PM list

RnJvbTogVmlyZXNoIEt1bWFyDQogDQo+IE9uIDIxIE1hcmNoIDIwMTQgMTY6MzQsIEdhdXRoYW0g
UiBTaGVub3kgPGVnb0BsaW51eC52bmV0LmlibS5jb20+IHdyb3RlOg0KPiA+IEhlaCEgV2VsbCwg
dGhhdCB3YXNuJ3QgdGhlIHJlYXNvbiB3aHkgdGhpcyB3YXMgc2VudCBvdXQgYXMgYSBzZXBhcmF0
ZQ0KPiA+IHBhdGNoLCBidXQgbmV2ZXIgbWluZC4gVGhvdWdoIEkgZG9uJ3QgdW5kZXJzdGFuZCB3
aHkgaXQgd291bGQgYmUNCj4gPiBkaWZmaWN1bHQgdG8gcmV2aWV3IHRoZSBwYXRjaCB0aG91Z2gu
DQo+IA0KPiBCZWNhdXNlIHRoZSBpbml0aWFsIGRyaXZlciB3YXNuJ3QgY29tcGxldGUgZWFybGll
ci4gVGhlcmUgd2VyZSAyLTMgcGF0Y2hlcw0KPiBhZnRlciB0aGUgZmlyc3Qgb25lIHdoaWNoIGFy
ZSBjaGFuZ2luZyB3aGF0IHRoZSBmaXJzdCBwYXRjaCBoYXMgYWRkZWQuDQo+IE5vdGhpbmcgZWxz
ZSA6KQ0KPiANCj4gPj4gPiArc3RhdGljIHZvaWQgcG93ZXJudl9yZWFkX2NwdV9mcmVxKHZvaWQg
KnJldF9mcmVxKQ0KPiA+PiA+ICt7DQo+ID4+ID4gKyAgICAgICB1bnNpZ25lZCBsb25nIHBtc3By
X3ZhbDsNCj4gPj4gPiArICAgICAgIHM4IGxvY2FsX3BzdGF0ZV9pZDsNCj4gPj4gPiArICAgICAg
IGludCAqY3VyX2ZyZXEsIGZyZXEsIHBzdGF0ZV9pZDsNCj4gPj4gPiArDQo+ID4+ID4gKyAgICAg
ICBjdXJfZnJlcSA9IChpbnQgKilyZXRfZnJlcTsNCj4gPj4NCj4gPj4gWW91IGRvbid0IG5lZWQg
Y3VyX2ZyZXEgdmFyaWFibGUgYXQgYWxsLi4NCj4gPg0KPiA+IEkgZG9uJ3QgbGlrZSBpdCBlaXRo
ZXIuIEJ1dCB0aGUgY29tcGlsZXIgY29tcGxhaW5zIHdpdGhvdXQgdGhpcyBoYWNrDQo+ID4gOi0o
DQo+IA0KPiBXaHkgd291bGQgdGhlIGNvbXBpbGVyIHdhcm4gZm9yIGRvaW5nIHRoaXM/Og0KPiAN
Cj4gKihpbnQgKilyZXRfZnJlcSA9IGZyZXE7DQoNCkJlY2F1c2UgaXQgaXMgdmVyeSBsaWtlbHkg
dG8gYmUgd3JvbmcuDQpJbiBnZW5lcmFsIGNhc3RzIG9mIHBvaW50ZXJzIHRvIGludGVnZXIgdHlw
ZXMgYXJlIGRhbmdlcm91cy4NCkluIHRoaXMgY2FzZSB3aHkgbm90IG1ha2UgdGhlIGZ1bmN0aW9u
IHJldHVybiB0aGUgdmFsdWU/DQoNCglEYXZpZA0KDQoNCg==

^ permalink raw reply	[flat|nested] 80+ messages in thread

* Re: [PATCH v3 5/5] powernv:cpufreq: Implement the driver->get() method
  2014-03-21 12:01           ` David Laight
@ 2014-03-21 12:31             ` Gautham R Shenoy
  -1 siblings, 0 replies; 80+ messages in thread
From: Gautham R Shenoy @ 2014-03-21 12:31 UTC (permalink / raw)
  To: David Laight; +Cc: 'Viresh Kumar', ego, linuxppc-dev, Linux PM list

On Fri, Mar 21, 2014 at 12:01:07PM +0000, David Laight wrote:
> From: Viresh Kumar
>  
> > On 21 March 2014 16:34, Gautham R Shenoy <ego@linux.vnet.ibm.com> wrote:
> > > Heh! Well, that wasn't the reason why this was sent out as a separate
> > > patch, but never mind. Though I don't understand why it would be
> > > difficult to review the patch though.
> > 
> > Because the initial driver wasn't complete earlier. There were 2-3 patches
> > after the first one which are changing what the first patch has added.
> > Nothing else :)
> > 
> > >> > +static void powernv_read_cpu_freq(void *ret_freq)
> > >> > +{
> > >> > +       unsigned long pmspr_val;
> > >> > +       s8 local_pstate_id;
> > >> > +       int *cur_freq, freq, pstate_id;
> > >> > +
> > >> > +       cur_freq = (int *)ret_freq;
> > >>
> > >> You don't need cur_freq variable at all..
> > >
> > > I don't like it either. But the compiler complains without this hack
> > > :-(
> > 
> > Why would the compiler warn for doing this?:
> > 
> > *(int *)ret_freq = freq;
> 
> Because it is very likely to be wrong.
> In general casts of pointers to integer types are dangerous.
> In this case why not make the function return the value?

Because this function is called via an smp_call_function(). And we
need a way of returning the value to the caller.

> 
> 	David
> 
> 


^ permalink raw reply	[flat|nested] 80+ messages in thread

* Re: [PATCH v3 5/5] powernv:cpufreq: Implement the driver->get() method
@ 2014-03-21 12:31             ` Gautham R Shenoy
  0 siblings, 0 replies; 80+ messages in thread
From: Gautham R Shenoy @ 2014-03-21 12:31 UTC (permalink / raw)
  To: David Laight; +Cc: linuxppc-dev, 'Viresh Kumar', Linux PM list, ego

On Fri, Mar 21, 2014 at 12:01:07PM +0000, David Laight wrote:
> From: Viresh Kumar
>  
> > On 21 March 2014 16:34, Gautham R Shenoy <ego@linux.vnet.ibm.com> wrote:
> > > Heh! Well, that wasn't the reason why this was sent out as a separate
> > > patch, but never mind. Though I don't understand why it would be
> > > difficult to review the patch though.
> > 
> > Because the initial driver wasn't complete earlier. There were 2-3 patches
> > after the first one which are changing what the first patch has added.
> > Nothing else :)
> > 
> > >> > +static void powernv_read_cpu_freq(void *ret_freq)
> > >> > +{
> > >> > +       unsigned long pmspr_val;
> > >> > +       s8 local_pstate_id;
> > >> > +       int *cur_freq, freq, pstate_id;
> > >> > +
> > >> > +       cur_freq = (int *)ret_freq;
> > >>
> > >> You don't need cur_freq variable at all..
> > >
> > > I don't like it either. But the compiler complains without this hack
> > > :-(
> > 
> > Why would the compiler warn for doing this?:
> > 
> > *(int *)ret_freq = freq;
> 
> Because it is very likely to be wrong.
> In general casts of pointers to integer types are dangerous.
> In this case why not make the function return the value?

Because this function is called via an smp_call_function(). And we
need a way of returning the value to the caller.

> 
> 	David
> 
> 

^ permalink raw reply	[flat|nested] 80+ messages in thread

* Re: [PATCH v3 5/5] powernv:cpufreq: Implement the driver->get() method
  2014-03-21 12:01           ` David Laight
@ 2014-03-21 12:34             ` Viresh Kumar
  -1 siblings, 0 replies; 80+ messages in thread
From: Viresh Kumar @ 2014-03-21 12:34 UTC (permalink / raw)
  To: David Laight; +Cc: ego, linuxppc-dev, Linux PM list

On 21 March 2014 17:31, David Laight <David.Laight@aculab.com> wrote:
>> *(int *)ret_freq = freq;
>
> Because it is very likely to be wrong.
> In general casts of pointers to integer types are dangerous.

Where are we converting pointers to integers? We are doing a
cast from 'void * ' to 'int *' and then using indirection operator
to set its value.

^ permalink raw reply	[flat|nested] 80+ messages in thread

* Re: [PATCH v3 5/5] powernv:cpufreq: Implement the driver->get() method
@ 2014-03-21 12:34             ` Viresh Kumar
  0 siblings, 0 replies; 80+ messages in thread
From: Viresh Kumar @ 2014-03-21 12:34 UTC (permalink / raw)
  To: David Laight; +Cc: ego, Linux PM list, linuxppc-dev

On 21 March 2014 17:31, David Laight <David.Laight@aculab.com> wrote:
>> *(int *)ret_freq = freq;
>
> Because it is very likely to be wrong.
> In general casts of pointers to integer types are dangerous.

Where are we converting pointers to integers? We are doing a
cast from 'void * ' to 'int *' and then using indirection operator
to set its value.

^ permalink raw reply	[flat|nested] 80+ messages in thread

* Re: [PATCH v3 5/5] powernv:cpufreq: Implement the driver->get() method
  2014-03-21 11:45         ` Viresh Kumar
@ 2014-03-21 13:04           ` Gautham R Shenoy
  -1 siblings, 0 replies; 80+ messages in thread
From: Gautham R Shenoy @ 2014-03-21 13:04 UTC (permalink / raw)
  To: Viresh Kumar
  Cc: ego, linuxppc-dev, Linux PM list, Benjamin Herrenschmidt,
	Vaidyanathan Srinivasan

On Fri, Mar 21, 2014 at 05:15:34PM +0530, Viresh Kumar wrote:
> On 21 March 2014 16:34, Gautham R Shenoy <ego@linux.vnet.ibm.com> wrote:
> > Heh! Well, that wasn't the reason why this was sent out as a separate
> > patch, but never mind. Though I don't understand why it would be
> > difficult to review the patch though.
> 
> Because the initial driver wasn't complete earlier. There were 2-3 patches
> after the first one which are changing what the first patch has added.
> Nothing else :)
> 
> >> > +static void powernv_read_cpu_freq(void *ret_freq)
> >> > +{
> >> > +       unsigned long pmspr_val;
> >> > +       s8 local_pstate_id;
> >> > +       int *cur_freq, freq, pstate_id;
> >> > +
> >> > +       cur_freq = (int *)ret_freq;
> >>
> >> You don't need cur_freq variable at all..
> >
> > I don't like it either. But the compiler complains without this hack
> > :-(
> 
> Why would the compiler warn for doing this?:
> 
> *(int *)ret_freq = freq;

The compiler doesn't complain if I do this.
> 
> >> > +       pmspr_val = get_pmspr(SPRN_PMSR);
> >> > +
> >> > +       /* The local pstate id corresponds bits 48..55 in the PMSR.
> >> > +         * Note: Watch out for the sign! */
> >> > +       local_pstate_id = (pmspr_val >> 48) & 0xFF;
> >> > +       pstate_id = local_pstate_id;
> >>
> >> similarly local_pstate_id
> >
> > well, I am interested in the bits 48..55 of pmspr_val. That's the
> > pstate_id which can be negative. So I'll like to keep
> > local_pstate_id.
> 
> Can you please explain at bit level how getting the value in a s8
> first and then into a s32 will help you here? Instead of getting it
> directly into s32.

Consider the case when pmspr = 0x00feffffffffffff;

We are interested in extracting the value 'fe'. And ensure that when
we store this value into an int, we get the sign extension right.

So the following doesn't work: 

   pstate_id = (pmspr_val >> 48) & 0xFFFFFFFF;

Since we get pstate_id = 0x000000fe, which is not the same as
0xfffffffe.

Of course, we could do the following, but I wasn't sure if
two levels of type casting helped on the readability front.

   pstate_id = (int) ((s8)((pmspr_val >> 48) & 0xFF));

--
Thanks and Regards
gautham.
> 


^ permalink raw reply	[flat|nested] 80+ messages in thread

* Re: [PATCH v3 5/5] powernv:cpufreq: Implement the driver->get() method
@ 2014-03-21 13:04           ` Gautham R Shenoy
  0 siblings, 0 replies; 80+ messages in thread
From: Gautham R Shenoy @ 2014-03-21 13:04 UTC (permalink / raw)
  To: Viresh Kumar; +Cc: ego, Linux PM list, linuxppc-dev

On Fri, Mar 21, 2014 at 05:15:34PM +0530, Viresh Kumar wrote:
> On 21 March 2014 16:34, Gautham R Shenoy <ego@linux.vnet.ibm.com> wrote:
> > Heh! Well, that wasn't the reason why this was sent out as a separate
> > patch, but never mind. Though I don't understand why it would be
> > difficult to review the patch though.
> 
> Because the initial driver wasn't complete earlier. There were 2-3 patches
> after the first one which are changing what the first patch has added.
> Nothing else :)
> 
> >> > +static void powernv_read_cpu_freq(void *ret_freq)
> >> > +{
> >> > +       unsigned long pmspr_val;
> >> > +       s8 local_pstate_id;
> >> > +       int *cur_freq, freq, pstate_id;
> >> > +
> >> > +       cur_freq = (int *)ret_freq;
> >>
> >> You don't need cur_freq variable at all..
> >
> > I don't like it either. But the compiler complains without this hack
> > :-(
> 
> Why would the compiler warn for doing this?:
> 
> *(int *)ret_freq = freq;

The compiler doesn't complain if I do this.
> 
> >> > +       pmspr_val = get_pmspr(SPRN_PMSR);
> >> > +
> >> > +       /* The local pstate id corresponds bits 48..55 in the PMSR.
> >> > +         * Note: Watch out for the sign! */
> >> > +       local_pstate_id = (pmspr_val >> 48) & 0xFF;
> >> > +       pstate_id = local_pstate_id;
> >>
> >> similarly local_pstate_id
> >
> > well, I am interested in the bits 48..55 of pmspr_val. That's the
> > pstate_id which can be negative. So I'll like to keep
> > local_pstate_id.
> 
> Can you please explain at bit level how getting the value in a s8
> first and then into a s32 will help you here? Instead of getting it
> directly into s32.

Consider the case when pmspr = 0x00feffffffffffff;

We are interested in extracting the value 'fe'. And ensure that when
we store this value into an int, we get the sign extension right.

So the following doesn't work: 

   pstate_id = (pmspr_val >> 48) & 0xFFFFFFFF;

Since we get pstate_id = 0x000000fe, which is not the same as
0xfffffffe.

Of course, we could do the following, but I wasn't sure if
two levels of type casting helped on the readability front.

   pstate_id = (int) ((s8)((pmspr_val >> 48) & 0xFF));

--
Thanks and Regards
gautham.
> 

^ permalink raw reply	[flat|nested] 80+ messages in thread

* Re: [PATCH v3 5/5] powernv:cpufreq: Implement the driver->get() method
  2014-03-21 13:04           ` Gautham R Shenoy
@ 2014-03-21 13:12             ` Viresh Kumar
  -1 siblings, 0 replies; 80+ messages in thread
From: Viresh Kumar @ 2014-03-21 13:12 UTC (permalink / raw)
  To: ego
  Cc: linuxppc-dev, Linux PM list, Benjamin Herrenschmidt,
	Vaidyanathan Srinivasan

On 21 March 2014 18:34, Gautham R Shenoy <ego@linux.vnet.ibm.com> wrote:
> Consider the case when pmspr = 0x00feffffffffffff;
>
> We are interested in extracting the value 'fe'. And ensure that when
> we store this value into an int, we get the sign extension right.
>
> So the following doesn't work:
>
>    pstate_id = (pmspr_val >> 48) & 0xFFFFFFFF;

What about pstate_id = (pmspr_val >> 48) & 0xFF; ??

^ permalink raw reply	[flat|nested] 80+ messages in thread

* Re: [PATCH v3 5/5] powernv:cpufreq: Implement the driver->get() method
@ 2014-03-21 13:12             ` Viresh Kumar
  0 siblings, 0 replies; 80+ messages in thread
From: Viresh Kumar @ 2014-03-21 13:12 UTC (permalink / raw)
  To: ego; +Cc: linuxppc-dev, Linux PM list

On 21 March 2014 18:34, Gautham R Shenoy <ego@linux.vnet.ibm.com> wrote:
> Consider the case when pmspr = 0x00feffffffffffff;
>
> We are interested in extracting the value 'fe'. And ensure that when
> we store this value into an int, we get the sign extension right.
>
> So the following doesn't work:
>
>    pstate_id = (pmspr_val >> 48) & 0xFFFFFFFF;

What about pstate_id = (pmspr_val >> 48) & 0xFF; ??

^ permalink raw reply	[flat|nested] 80+ messages in thread

* Re: [PATCH v3 1/5] powernv: cpufreq driver for powernv platform
  2014-03-21 10:54         ` Viresh Kumar
@ 2014-03-21 13:23           ` Gautham R Shenoy
  -1 siblings, 0 replies; 80+ messages in thread
From: Gautham R Shenoy @ 2014-03-21 13:23 UTC (permalink / raw)
  To: Viresh Kumar
  Cc: ego, linuxppc-dev, Linux PM list, Benjamin Herrenschmidt,
	Vaidyanathan Srinivasan, Srivatsa S. Bhat, Anton Blanchard

Hi Viresh,

On Fri, Mar 21, 2014 at 04:24:27PM +0530, Viresh Kumar wrote:
> On 21 March 2014 16:13, Gautham R Shenoy <ego@linux.vnet.ibm.com> wrote:
> > On Fri, Mar 21, 2014 at 02:11:32PM +0530, Viresh Kumar wrote:
> 
> >> > +               pr_debug("PState id %d freq %d MHz\n", id, freq);
> >> > +               powernv_freqs[i].driver_data = i;
> >>
> >> I don't think you are using this field at all and this is the field you can
> >> use for driver_data and so you can get rid of powernv_pstate_ids[ ].
> >
> > Using driver_data to record powernv_pstate_ids won't work since
> > powernv_pstate_ids can be negative. So a pstate_id -3 can be confused
> > with CPUFREQ_BOOST_FREQ thereby not displaying the frequency
> > corresponding to pstate id -3. So for now I think we will be retaining
> > powernv_pstate_ids.
> 
> As I said earlier, this field is only used by cpufreq drivers and cpufreq core
> isn't supposed to touch it. CPUFREQ_BOOST_FREQ and other macros
> are there for .frequency field and not this one.
> 

Ok, I had based my code on linus's git tree. I checked the 'pm-cpufreq'
branch of Rafael's 'linux-pm' tree and freq_table.c contains the
following code snippet in show_available_frequencies:

	if (show_boost ^ (table[i].driver_data == CPUFREQ_BOOST_FREQ))
			continue;

I suspect we're talking about different code bases. So could you
please tell me which one should I be looking at ?

> 
> >> > +static int powernv_cpufreq_cpu_init(struct cpufreq_policy *policy)
> >> > +{
> >> > +       int base, i;
> >> > +
> >> > +#ifdef CONFIG_SMP
> >>
> >> What will break if you don't have this ifdef here? Without that as well
> >> below code should work?
> 
> Missed this one?
> 
> >> > +       base = cpu_first_thread_sibling(policy->cpu);
> >> > +
> >> > +       for (i = 0; i < threads_per_core; i++)
> >> > +               cpumask_set_cpu(base + i, policy->cpus);
> >> > +#endif
> >> > +       policy->cpuinfo.transition_latency = 25000;
> >> > +
> >> > +       policy->cur = powernv_freqs[0].frequency;
> >> > +       cpufreq_frequency_table_get_attr(powernv_freqs, policy->cpu);
> >>
> >> This doesn't exist anymore.
> >
> > Didn't get this comment!
> 
> cpufreq_frequency_table_get_attr() routine doesn't exist anymore.
> 
> >> > +static int powernv_cpufreq_cpu_exit(struct cpufreq_policy *policy)
> >> > +{
> >> > +       cpufreq_frequency_table_put_attr(policy->cpu);
> >> > +       return 0;
> >> > +}
> >>
> >> You don't need this..
> >
> > Why not ?
> 
> Because cpufreq_frequency_table_get_attr() and
> cpufreq_frequency_table_put_attr() don't exist anymore :)

Well, it does in the codebases that I was looking at. May be I've been
looking at the wrong place.

> 
> >> > +module_init(powernv_cpufreq_init);
> >> > +module_exit(powernv_cpufreq_exit);
> >>
> >> Place these right after the routines without any blank lines in
> > between.
> >
> > Is this the new convention ?
> 
> Don't know I have been following this since long time, probably from the
> time I started with Mainline stuff.. I have seen many maintainers advising this
> as it make code more readable, specially if these routines are quite big..
> 
> Probably it isn't mentioned in coding guidelines but people follow it most of
> the times.

Ok. I was not aware of this. Good to know :-)
> 

--
Thanks and Regards
gautham.


^ permalink raw reply	[flat|nested] 80+ messages in thread

* Re: [PATCH v3 1/5] powernv: cpufreq driver for powernv platform
@ 2014-03-21 13:23           ` Gautham R Shenoy
  0 siblings, 0 replies; 80+ messages in thread
From: Gautham R Shenoy @ 2014-03-21 13:23 UTC (permalink / raw)
  To: Viresh Kumar
  Cc: ego, Linux PM list, linuxppc-dev, Anton Blanchard, Srivatsa S. Bhat

Hi Viresh,

On Fri, Mar 21, 2014 at 04:24:27PM +0530, Viresh Kumar wrote:
> On 21 March 2014 16:13, Gautham R Shenoy <ego@linux.vnet.ibm.com> wrote:
> > On Fri, Mar 21, 2014 at 02:11:32PM +0530, Viresh Kumar wrote:
> 
> >> > +               pr_debug("PState id %d freq %d MHz\n", id, freq);
> >> > +               powernv_freqs[i].driver_data = i;
> >>
> >> I don't think you are using this field at all and this is the field you can
> >> use for driver_data and so you can get rid of powernv_pstate_ids[ ].
> >
> > Using driver_data to record powernv_pstate_ids won't work since
> > powernv_pstate_ids can be negative. So a pstate_id -3 can be confused
> > with CPUFREQ_BOOST_FREQ thereby not displaying the frequency
> > corresponding to pstate id -3. So for now I think we will be retaining
> > powernv_pstate_ids.
> 
> As I said earlier, this field is only used by cpufreq drivers and cpufreq core
> isn't supposed to touch it. CPUFREQ_BOOST_FREQ and other macros
> are there for .frequency field and not this one.
> 

Ok, I had based my code on linus's git tree. I checked the 'pm-cpufreq'
branch of Rafael's 'linux-pm' tree and freq_table.c contains the
following code snippet in show_available_frequencies:

	if (show_boost ^ (table[i].driver_data == CPUFREQ_BOOST_FREQ))
			continue;

I suspect we're talking about different code bases. So could you
please tell me which one should I be looking at ?

> 
> >> > +static int powernv_cpufreq_cpu_init(struct cpufreq_policy *policy)
> >> > +{
> >> > +       int base, i;
> >> > +
> >> > +#ifdef CONFIG_SMP
> >>
> >> What will break if you don't have this ifdef here? Without that as well
> >> below code should work?
> 
> Missed this one?
> 
> >> > +       base = cpu_first_thread_sibling(policy->cpu);
> >> > +
> >> > +       for (i = 0; i < threads_per_core; i++)
> >> > +               cpumask_set_cpu(base + i, policy->cpus);
> >> > +#endif
> >> > +       policy->cpuinfo.transition_latency = 25000;
> >> > +
> >> > +       policy->cur = powernv_freqs[0].frequency;
> >> > +       cpufreq_frequency_table_get_attr(powernv_freqs, policy->cpu);
> >>
> >> This doesn't exist anymore.
> >
> > Didn't get this comment!
> 
> cpufreq_frequency_table_get_attr() routine doesn't exist anymore.
> 
> >> > +static int powernv_cpufreq_cpu_exit(struct cpufreq_policy *policy)
> >> > +{
> >> > +       cpufreq_frequency_table_put_attr(policy->cpu);
> >> > +       return 0;
> >> > +}
> >>
> >> You don't need this..
> >
> > Why not ?
> 
> Because cpufreq_frequency_table_get_attr() and
> cpufreq_frequency_table_put_attr() don't exist anymore :)

Well, it does in the codebases that I was looking at. May be I've been
looking at the wrong place.

> 
> >> > +module_init(powernv_cpufreq_init);
> >> > +module_exit(powernv_cpufreq_exit);
> >>
> >> Place these right after the routines without any blank lines in
> > between.
> >
> > Is this the new convention ?
> 
> Don't know I have been following this since long time, probably from the
> time I started with Mainline stuff.. I have seen many maintainers advising this
> as it make code more readable, specially if these routines are quite big..
> 
> Probably it isn't mentioned in coding guidelines but people follow it most of
> the times.

Ok. I was not aware of this. Good to know :-)
> 

--
Thanks and Regards
gautham.

^ permalink raw reply	[flat|nested] 80+ messages in thread

* Re: [PATCH v3 1/5] powernv: cpufreq driver for powernv platform
  2014-03-21 13:23           ` Gautham R Shenoy
@ 2014-03-21 13:34             ` Viresh Kumar
  -1 siblings, 0 replies; 80+ messages in thread
From: Viresh Kumar @ 2014-03-21 13:34 UTC (permalink / raw)
  To: ego
  Cc: linuxppc-dev, Linux PM list, Benjamin Herrenschmidt,
	Vaidyanathan Srinivasan, Srivatsa S. Bhat, Anton Blanchard

On 21 March 2014 18:53, Gautham R Shenoy <ego@linux.vnet.ibm.com> wrote:
> Ok, I had based my code on linus's git tree. I checked the 'pm-cpufreq'

Always check his bleeding-edge or linux-next branch.

> branch of Rafael's 'linux-pm' tree and freq_table.c contains the
> following code snippet in show_available_frequencies:
>
>         if (show_boost ^ (table[i].driver_data == CPUFREQ_BOOST_FREQ))
>                         continue;
>
> I suspect we're talking about different code bases. So could you
> please tell me which one should I be looking at ?

Okay, there is some problem here. I will check how can I simplify this..
And yes, you were correct.

>> Because cpufreq_frequency_table_get_attr() and
>> cpufreq_frequency_table_put_attr() don't exist anymore :)
>
> Well, it does in the codebases that I was looking at. May be I've been
> looking at the wrong place.

Check the one I mentioned.

^ permalink raw reply	[flat|nested] 80+ messages in thread

* Re: [PATCH v3 1/5] powernv: cpufreq driver for powernv platform
@ 2014-03-21 13:34             ` Viresh Kumar
  0 siblings, 0 replies; 80+ messages in thread
From: Viresh Kumar @ 2014-03-21 13:34 UTC (permalink / raw)
  To: ego; +Cc: Linux PM list, linuxppc-dev, Anton Blanchard, Srivatsa S. Bhat

On 21 March 2014 18:53, Gautham R Shenoy <ego@linux.vnet.ibm.com> wrote:
> Ok, I had based my code on linus's git tree. I checked the 'pm-cpufreq'

Always check his bleeding-edge or linux-next branch.

> branch of Rafael's 'linux-pm' tree and freq_table.c contains the
> following code snippet in show_available_frequencies:
>
>         if (show_boost ^ (table[i].driver_data == CPUFREQ_BOOST_FREQ))
>                         continue;
>
> I suspect we're talking about different code bases. So could you
> please tell me which one should I be looking at ?

Okay, there is some problem here. I will check how can I simplify this..
And yes, you were correct.

>> Because cpufreq_frequency_table_get_attr() and
>> cpufreq_frequency_table_put_attr() don't exist anymore :)
>
> Well, it does in the codebases that I was looking at. May be I've been
> looking at the wrong place.

Check the one I mentioned.

^ permalink raw reply	[flat|nested] 80+ messages in thread

* Re: [PATCH v3 5/5] powernv:cpufreq: Implement the driver->get() method
  2014-03-21 13:12             ` Viresh Kumar
@ 2014-03-21 14:25               ` Gautham R Shenoy
  -1 siblings, 0 replies; 80+ messages in thread
From: Gautham R Shenoy @ 2014-03-21 14:25 UTC (permalink / raw)
  To: Viresh Kumar
  Cc: ego, linuxppc-dev, Linux PM list, Benjamin Herrenschmidt,
	Vaidyanathan Srinivasan

On Fri, Mar 21, 2014 at 06:42:50PM +0530, Viresh Kumar wrote:
> On 21 March 2014 18:34, Gautham R Shenoy <ego@linux.vnet.ibm.com> wrote:
> > Consider the case when pmspr = 0x00feffffffffffff;
> >
> > We are interested in extracting the value 'fe'. And ensure that when
> > we store this value into an int, we get the sign extension right.
> >
> > So the following doesn't work:
> >
> >    pstate_id = (pmspr_val >> 48) & 0xFFFFFFFF;
> 
> What about pstate_id = (pmspr_val >> 48) & 0xFF; ??

Nope. 

Suppose the pmspr_val is contained in the register 
%rax, and pstate_id corresponds to the address -20(%rbp) then:

	pstate_id = (pmspr_val >> 48) & 0xFF;

would corrspond to 

        shrq    $48, %rax        // Left shift by 48 bits
        andl    $255, %eax       // Mask the lower 32 bits of %rax with 0x000000FF
        movl    %eax, -20(%rbp)  // Store the lower 32 bits of %rax
                                    into pstate_id


On the other hand, 

   pstate_id = (int)((s8)((pmspr_val >> 48) & 0xFF));

would correspond to:

        shrq    $48, %rax  // Left shift by 48 bits.
        movsbl  %al, %eax  // Move the lower 8 bits of %rax to %eax
                              with sign-extension. 
        movl    %eax, -20(%rbp) // store the result in pstate_id;

So with this, we are getting the sign extension due to the use of movsbl.

And if local_pstate_id corresponds to the address -1(%rbp) then:

       local_pstate_id = (pmspr_val >> 48) & 0xFF;
       pstate_id = local_pstate_id;

would correspond to:

        shrq    $48, %rax        // Left shift by 48 bits
        movb    %al, -1(%rbp)    //copy the lower 8 bits to local_pstate_id
        movsbl  -1(%rbp), %eax   //move the contents of
                                   local_pstate_id to %eax with sign extension.
        movl    %eax, -20(%rbp)  // Store the result in pstate_id


Hope this helps :-)

--
Thanks and Regards
gautham.


^ permalink raw reply	[flat|nested] 80+ messages in thread

* Re: [PATCH v3 5/5] powernv:cpufreq: Implement the driver->get() method
@ 2014-03-21 14:25               ` Gautham R Shenoy
  0 siblings, 0 replies; 80+ messages in thread
From: Gautham R Shenoy @ 2014-03-21 14:25 UTC (permalink / raw)
  To: Viresh Kumar; +Cc: ego, Linux PM list, linuxppc-dev

On Fri, Mar 21, 2014 at 06:42:50PM +0530, Viresh Kumar wrote:
> On 21 March 2014 18:34, Gautham R Shenoy <ego@linux.vnet.ibm.com> wrote:
> > Consider the case when pmspr = 0x00feffffffffffff;
> >
> > We are interested in extracting the value 'fe'. And ensure that when
> > we store this value into an int, we get the sign extension right.
> >
> > So the following doesn't work:
> >
> >    pstate_id = (pmspr_val >> 48) & 0xFFFFFFFF;
> 
> What about pstate_id = (pmspr_val >> 48) & 0xFF; ??

Nope. 

Suppose the pmspr_val is contained in the register 
%rax, and pstate_id corresponds to the address -20(%rbp) then:

	pstate_id = (pmspr_val >> 48) & 0xFF;

would corrspond to 

        shrq    $48, %rax        // Left shift by 48 bits
        andl    $255, %eax       // Mask the lower 32 bits of %rax with 0x000000FF
        movl    %eax, -20(%rbp)  // Store the lower 32 bits of %rax
                                    into pstate_id


On the other hand, 

   pstate_id = (int)((s8)((pmspr_val >> 48) & 0xFF));

would correspond to:

        shrq    $48, %rax  // Left shift by 48 bits.
        movsbl  %al, %eax  // Move the lower 8 bits of %rax to %eax
                              with sign-extension. 
        movl    %eax, -20(%rbp) // store the result in pstate_id;

So with this, we are getting the sign extension due to the use of movsbl.

And if local_pstate_id corresponds to the address -1(%rbp) then:

       local_pstate_id = (pmspr_val >> 48) & 0xFF;
       pstate_id = local_pstate_id;

would correspond to:

        shrq    $48, %rax        // Left shift by 48 bits
        movb    %al, -1(%rbp)    //copy the lower 8 bits to local_pstate_id
        movsbl  -1(%rbp), %eax   //move the contents of
                                   local_pstate_id to %eax with sign extension.
        movl    %eax, -20(%rbp)  // Store the result in pstate_id


Hope this helps :-)

--
Thanks and Regards
gautham.

^ permalink raw reply	[flat|nested] 80+ messages in thread

* Re: [PATCH v3 1/5] powernv: cpufreq driver for powernv platform
  2014-03-21 10:43       ` Gautham R Shenoy
@ 2014-03-21 14:48         ` Vaidyanathan Srinivasan
  -1 siblings, 0 replies; 80+ messages in thread
From: Vaidyanathan Srinivasan @ 2014-03-21 14:48 UTC (permalink / raw)
  To: Gautham R Shenoy
  Cc: Linux PM list, Viresh Kumar, linuxppc-dev, Anton Blanchard,
	Srivatsa S. Bhat

* Gautham R Shenoy <ego@linux.vnet.ibm.com> [2014-03-21 16:13:17]:

> Hi Viresh,
> 
> On Fri, Mar 21, 2014 at 02:11:32PM +0530, Viresh Kumar wrote:
> > On Thu, Mar 20, 2014 at 5:40 PM, Gautham R. Shenoy
> > <ego@linux.vnet.ibm.com> wrote:
> > > From: Vaidyanathan Srinivasan <svaidy@linux.vnet.ibm.com>
> > 
> > Hi Vaidy,
> > 

Hi Viresh,

Thanks for the detailed review.

> > > diff --git a/drivers/cpufreq/Kconfig b/drivers/cpufreq/Kconfig
> > > index 4b029c0..4ba1632 100644
> > > --- a/drivers/cpufreq/Kconfig
> > > +++ b/drivers/cpufreq/Kconfig
> > > @@ -48,6 +48,7 @@ config CPU_FREQ_STAT_DETAILS
> > >  choice
> > >         prompt "Default CPUFreq governor"
> > >         default CPU_FREQ_DEFAULT_GOV_USERSPACE if ARM_SA1100_CPUFREQ || ARM_SA1110_CPUFREQ
> > > +       default CPU_FREQ_DEFAULT_GOV_ONDEMAND if POWERNV_CPUFREQ
> > 
> > Probably we should remove SA1100's entry as well from here. This is
> > not the right way of doing it. Imagine 100 platforms having entries here.
> > If you want it, then select it from your platforms Kconfig.
> 
> Sure. Will move these bits and the other governor related bits to the
> Powernv Kconfig.
> 
> > > diff --git a/drivers/cpufreq/powernv-cpufreq.c b/drivers/cpufreq/powernv-cpufreq.c
> > > new file mode 100644
> > > index 0000000..ab1551f
> > > --- /dev/null
> > > +
> > > +#define pr_fmt(fmt)    "powernv-cpufreq: " fmt
> > > +
> > > +#include <linux/module.h>
> > > +#include <linux/cpufreq.h>
> > > +#include <linux/of.h>
> > > +#include <asm/cputhreads.h>
> > 
> > That's it? Sure?
> > 
> > Even if things compile for you, you must explicitly include all the
> > files on which
> > you depend.
> 
> Ok. 
> 
> > 
> > > +
> > > +       WARN_ON(len_ids != len_freqs);
> > > +       nr_pstates = min(len_ids, len_freqs) / sizeof(u32);
> > > +       WARN_ON(!nr_pstates);
> > 
> > Why do you want to continue here?
> 
> Good point. We might be better off exiting at this point. 

Yes, we could return here.  We do not generally come till this point
if the platform has a problem discovering PStates from device tree.
But there is no useful debug info after this point if nr_pstates is 0.

> > 
> > > +       pr_debug("NR PStates %d\n", nr_pstates);
> > > +       for (i = 0; i < nr_pstates; i++) {
> > > +               u32 id = be32_to_cpu(pstate_ids[i]);
> > > +               u32 freq = be32_to_cpu(pstate_freqs[i]);
> > > +
> > > +               pr_debug("PState id %d freq %d MHz\n", id, freq);
> > > +               powernv_freqs[i].driver_data = i;
> > 
> > I don't think you are using this field at all and this is the field you can
> > use for driver_data and so you can get rid of powernv_pstate_ids[ ].
> 
> Using driver_data to record powernv_pstate_ids won't work since
> powernv_pstate_ids can be negative. So a pstate_id -3 can be confused
> with CPUFREQ_BOOST_FREQ thereby not displaying the frequency
> corresponding to pstate id -3. So for now I think we will be retaining
> powernv_pstate_ids.

Yeah, I had the driver written using driver_data to store pstates.
Gautham found the bug that we are missing one PState when we match the
ID with CPUFREQ_BOOST_FREQ!

We did not know that you have taken care of those issues.  Ideally
I did expect that driver_data should not be touched by the framework.
Thanks for fixing that and allowing the back-end driver to use
driver_data.
 
> > 
> > > +               powernv_freqs[i].frequency = freq * 1000; /* kHz */
> > > +               powernv_pstate_ids[i] = id;
> > > +       }
> > > +       /* End of list marker entry */
> > > +       powernv_freqs[i].driver_data = 0;
> > 
> > Not required.
> 
> Ok.
> > 
> > > +       powernv_freqs[i].frequency = CPUFREQ_TABLE_END;
> > > +
> > > +       /* Print frequency table */
> > > +       for (i = 0; powernv_freqs[i].frequency != CPUFREQ_TABLE_END; i++)
> > > +               pr_debug("%d: %d\n", i, powernv_freqs[i].frequency);
> > 
> > You have already printed this table earlier..
> 
> Fair enough.
> 
> > 
> > > +
> > > +       return 0;
> > > +}
> > > +
> > > +static struct freq_attr *powernv_cpu_freq_attr[] = {
> > > +       &cpufreq_freq_attr_scaling_available_freqs,
> > > +       NULL,
> > > +};
> > 
> > Can use this instead: cpufreq_generic_attr?
> 
> In this patch yes. But later patch introduces an additional attribute
> for displaying the nominal frequency. Will handle that part in a clean
> way in the next version.
> 
> > 
> > > +/* Helper routines */
> > > +
> > > +/* Access helpers to power mgt SPR */
> > > +
> > > +static inline unsigned long get_pmspr(unsigned long sprn)
> > 
> > Looks big enough not be inlined?
> 
> It is called from one function. It has been defined separately for
> readability. 

Let the compiler decide.  The code is straight forward :)

I wanted to keep this SPR operations in a separate function to
facilitate debug and also introduce any timing/sequence change here if
required.  Keeping this separate is helpful, if inlining is
successful, we get a bonus :)

> > 
> > > +{
> > > +       switch (sprn) {
> > > +       case SPRN_PMCR:
> > > +               return mfspr(SPRN_PMCR);
> > > +
> > > +       case SPRN_PMICR:
> > > +               return mfspr(SPRN_PMICR);
> > > +
> > > +       case SPRN_PMSR:
> > > +               return mfspr(SPRN_PMSR);
> > > +       }
> > > +       BUG();
> > > +}
> > > +
> > > +static inline void set_pmspr(unsigned long sprn, unsigned long val)
> > > +{
> > 
> > Same here..
> 
> Same reason as above.
> 
> > 
> > > +       switch (sprn) {
> > > +       case SPRN_PMCR:
> > > +               mtspr(SPRN_PMCR, val);
> > > +               return;
> > > +
> > > +       case SPRN_PMICR:
> > > +               mtspr(SPRN_PMICR, val);
> > > +               return;
> > > +
> > > +       case SPRN_PMSR:
> > > +               mtspr(SPRN_PMSR, val);
> > > +               return;
> > > +       }
> > > +       BUG();
> > > +}
> > > +
> > > +static void set_pstate(void *pstate)
> > > +{
> > > +       unsigned long val;
> > > +       unsigned long pstate_ul = *(unsigned long *) pstate;
> > 
> > Why not sending value only to this routine instead of pointer?
> 
> Well this function is called via an smp_call_function. so, cannot send
> a value :(
> 
> > 
> > > +
> > > +       val = get_pmspr(SPRN_PMCR);
> > > +       val = val & 0x0000ffffffffffffULL;
> > 
> > Maybe a blank line here?
> 
> Ok.
> 
> > 
> > > +       /* Set both global(bits 56..63) and local(bits 48..55) PStates */
> > > +       val = val | (pstate_ul << 56) | (pstate_ul << 48);
> > 
> > here as well?
> 
> Ok.
> > 
> 
> > > +       pr_debug("Setting cpu %d pmcr to %016lX\n", smp_processor_id(), val);
> > > +       set_pmspr(SPRN_PMCR, val);
> > > +}
> > > +
> > > +static int powernv_set_freq(cpumask_var_t cpus, unsigned int new_index)
> > > +{
> > > +       unsigned long val = (unsigned long) powernv_pstate_ids[new_index];
> > 
> > I think automatic type conversion will happen here.
> 
> Ok. Will fix this.

Most of the conversions are verbose mainly to help readability of the
required sign-extensions.  There is scope to make these concise.

> > 
> > > +
> > > +       /*
> > > +        * Use smp_call_function to send IPI and execute the
> > > +        * mtspr on target cpu.  We could do that without IPI
> > > +        * if current CPU is within policy->cpus (core)
> > > +        */
> > 
> > Hmm, interesting I also feel there are cases where this routine can
> > get called from other CPUs. Can you list those use cases where it can
> > happen? Governors will mostly call this from one of the CPUs present
> > in policy->cpus.
> 
> Consider the case when the governor is userspace and we are executing 
> 
>     # echo freqvalue  > 
>          /sys/devices/system/cpu/cpu<i>/cpufreq/scaling_setspeed 
> 
> from a cpu <j> which is not in policy->cpus of cpu i. 
> 
> 
> > > +static int powernv_cpufreq_cpu_init(struct cpufreq_policy *policy)
> > > +{
> > > +       int base, i;
> > > +
> > > +#ifdef CONFIG_SMP
> > 
> > What will break if you don't have this ifdef here? Without that as well
> > below code should work?
> > 
> > > +       base = cpu_first_thread_sibling(policy->cpu);
> > > +
> > > +       for (i = 0; i < threads_per_core; i++)
> > > +               cpumask_set_cpu(base + i, policy->cpus);
> > > +#endif
> > > +       policy->cpuinfo.transition_latency = 25000;
> > > +
> > > +       policy->cur = powernv_freqs[0].frequency;
> > > +       cpufreq_frequency_table_get_attr(powernv_freqs, policy->cpu);
> > 
> > This doesn't exist anymore.
> 
> Didn't get this comment!
> 
> > 
> > > +       return cpufreq_frequency_table_cpuinfo(policy, powernv_freqs);
> > 
> > Have you written this driver long time back? CPUFreq core has been
> > cleaned up heavily since last few kernel releases and I think there are
> > better helper routines available now.
> 
> Yup it was written quite a while ago. And yeah, CPUFreq has changed
> quite a bit since the last time I saw it :-)

Yes, the driver is more than a year old and based on even older
implementation that I had written.  I kept up until I got a compiler
warning or functional issue.  Definitely could not catch-up with your
cleanups :)

> > 
> > > +}
> > > +
> > > +static int powernv_cpufreq_cpu_exit(struct cpufreq_policy *policy)
> > > +{
> > > +       cpufreq_frequency_table_put_attr(policy->cpu);
> > > +       return 0;
> > > +}
> > 
> > You don't need this..
> 
> Why not ?
> 
> > 
> > > +static int powernv_cpufreq_verify(struct cpufreq_policy *policy)
> > > +{
> > > +       return cpufreq_frequency_table_verify(policy, powernv_freqs);
> > > +}
> > 
> > use generic verify function pointer instead..
> > 
> > > +static int powernv_cpufreq_target(struct cpufreq_policy *policy,
> > > +                             unsigned int target_freq,
> > > +                             unsigned int relation)
> > 
> > use target_index() instead..
> > 
> > > +{
> > > +       int rc;
> > > +       struct cpufreq_freqs freqs;
> > > +       unsigned int new_index;
> > > +
> > > +       cpufreq_frequency_table_target(policy, powernv_freqs, target_freq,
> > > +                                      relation, &new_index);
> > > +
> > > +       freqs.old = policy->cur;
> > > +       freqs.new = powernv_freqs[new_index].frequency;
> > > +       freqs.cpu = policy->cpu;
> > > +
> > > +       mutex_lock(&freq_switch_mutex);
> > 
> > Why do you need this lock for?
> 
> I guess it was to serialize accesses to PMCR. But Srivatsa's patch
> converts this to a per-core lock which probably is no longer required
> after your cpufreq_freq_transition_begin/end() patch series.

Right.  Frequency transitions are per-core, the h/w SPRs are per-core.
We need to prevent threads from colliding on h/w SPR access.  We do
make the lock per-core later in the series as mentioned above.

Gautham has addressed most comments.

Thanks,
Vaidy

_______________________________________________
Linuxppc-dev mailing list
Linuxppc-dev@lists.ozlabs.org
https://lists.ozlabs.org/listinfo/linuxppc-dev

^ permalink raw reply	[flat|nested] 80+ messages in thread

* Re: [PATCH v3 1/5] powernv: cpufreq driver for powernv platform
@ 2014-03-21 14:48         ` Vaidyanathan Srinivasan
  0 siblings, 0 replies; 80+ messages in thread
From: Vaidyanathan Srinivasan @ 2014-03-21 14:48 UTC (permalink / raw)
  To: Gautham R Shenoy
  Cc: Linux PM list, Viresh Kumar, linuxppc-dev, Anton Blanchard,
	Srivatsa S. Bhat

* Gautham R Shenoy <ego@linux.vnet.ibm.com> [2014-03-21 16:13:17]:

> Hi Viresh,
> 
> On Fri, Mar 21, 2014 at 02:11:32PM +0530, Viresh Kumar wrote:
> > On Thu, Mar 20, 2014 at 5:40 PM, Gautham R. Shenoy
> > <ego@linux.vnet.ibm.com> wrote:
> > > From: Vaidyanathan Srinivasan <svaidy@linux.vnet.ibm.com>
> > 
> > Hi Vaidy,
> > 

Hi Viresh,

Thanks for the detailed review.

> > > diff --git a/drivers/cpufreq/Kconfig b/drivers/cpufreq/Kconfig
> > > index 4b029c0..4ba1632 100644
> > > --- a/drivers/cpufreq/Kconfig
> > > +++ b/drivers/cpufreq/Kconfig
> > > @@ -48,6 +48,7 @@ config CPU_FREQ_STAT_DETAILS
> > >  choice
> > >         prompt "Default CPUFreq governor"
> > >         default CPU_FREQ_DEFAULT_GOV_USERSPACE if ARM_SA1100_CPUFREQ || ARM_SA1110_CPUFREQ
> > > +       default CPU_FREQ_DEFAULT_GOV_ONDEMAND if POWERNV_CPUFREQ
> > 
> > Probably we should remove SA1100's entry as well from here. This is
> > not the right way of doing it. Imagine 100 platforms having entries here.
> > If you want it, then select it from your platforms Kconfig.
> 
> Sure. Will move these bits and the other governor related bits to the
> Powernv Kconfig.
> 
> > > diff --git a/drivers/cpufreq/powernv-cpufreq.c b/drivers/cpufreq/powernv-cpufreq.c
> > > new file mode 100644
> > > index 0000000..ab1551f
> > > --- /dev/null
> > > +
> > > +#define pr_fmt(fmt)    "powernv-cpufreq: " fmt
> > > +
> > > +#include <linux/module.h>
> > > +#include <linux/cpufreq.h>
> > > +#include <linux/of.h>
> > > +#include <asm/cputhreads.h>
> > 
> > That's it? Sure?
> > 
> > Even if things compile for you, you must explicitly include all the
> > files on which
> > you depend.
> 
> Ok. 
> 
> > 
> > > +
> > > +       WARN_ON(len_ids != len_freqs);
> > > +       nr_pstates = min(len_ids, len_freqs) / sizeof(u32);
> > > +       WARN_ON(!nr_pstates);
> > 
> > Why do you want to continue here?
> 
> Good point. We might be better off exiting at this point. 

Yes, we could return here.  We do not generally come till this point
if the platform has a problem discovering PStates from device tree.
But there is no useful debug info after this point if nr_pstates is 0.

> > 
> > > +       pr_debug("NR PStates %d\n", nr_pstates);
> > > +       for (i = 0; i < nr_pstates; i++) {
> > > +               u32 id = be32_to_cpu(pstate_ids[i]);
> > > +               u32 freq = be32_to_cpu(pstate_freqs[i]);
> > > +
> > > +               pr_debug("PState id %d freq %d MHz\n", id, freq);
> > > +               powernv_freqs[i].driver_data = i;
> > 
> > I don't think you are using this field at all and this is the field you can
> > use for driver_data and so you can get rid of powernv_pstate_ids[ ].
> 
> Using driver_data to record powernv_pstate_ids won't work since
> powernv_pstate_ids can be negative. So a pstate_id -3 can be confused
> with CPUFREQ_BOOST_FREQ thereby not displaying the frequency
> corresponding to pstate id -3. So for now I think we will be retaining
> powernv_pstate_ids.

Yeah, I had the driver written using driver_data to store pstates.
Gautham found the bug that we are missing one PState when we match the
ID with CPUFREQ_BOOST_FREQ!

We did not know that you have taken care of those issues.  Ideally
I did expect that driver_data should not be touched by the framework.
Thanks for fixing that and allowing the back-end driver to use
driver_data.
 
> > 
> > > +               powernv_freqs[i].frequency = freq * 1000; /* kHz */
> > > +               powernv_pstate_ids[i] = id;
> > > +       }
> > > +       /* End of list marker entry */
> > > +       powernv_freqs[i].driver_data = 0;
> > 
> > Not required.
> 
> Ok.
> > 
> > > +       powernv_freqs[i].frequency = CPUFREQ_TABLE_END;
> > > +
> > > +       /* Print frequency table */
> > > +       for (i = 0; powernv_freqs[i].frequency != CPUFREQ_TABLE_END; i++)
> > > +               pr_debug("%d: %d\n", i, powernv_freqs[i].frequency);
> > 
> > You have already printed this table earlier..
> 
> Fair enough.
> 
> > 
> > > +
> > > +       return 0;
> > > +}
> > > +
> > > +static struct freq_attr *powernv_cpu_freq_attr[] = {
> > > +       &cpufreq_freq_attr_scaling_available_freqs,
> > > +       NULL,
> > > +};
> > 
> > Can use this instead: cpufreq_generic_attr?
> 
> In this patch yes. But later patch introduces an additional attribute
> for displaying the nominal frequency. Will handle that part in a clean
> way in the next version.
> 
> > 
> > > +/* Helper routines */
> > > +
> > > +/* Access helpers to power mgt SPR */
> > > +
> > > +static inline unsigned long get_pmspr(unsigned long sprn)
> > 
> > Looks big enough not be inlined?
> 
> It is called from one function. It has been defined separately for
> readability. 

Let the compiler decide.  The code is straight forward :)

I wanted to keep this SPR operations in a separate function to
facilitate debug and also introduce any timing/sequence change here if
required.  Keeping this separate is helpful, if inlining is
successful, we get a bonus :)

> > 
> > > +{
> > > +       switch (sprn) {
> > > +       case SPRN_PMCR:
> > > +               return mfspr(SPRN_PMCR);
> > > +
> > > +       case SPRN_PMICR:
> > > +               return mfspr(SPRN_PMICR);
> > > +
> > > +       case SPRN_PMSR:
> > > +               return mfspr(SPRN_PMSR);
> > > +       }
> > > +       BUG();
> > > +}
> > > +
> > > +static inline void set_pmspr(unsigned long sprn, unsigned long val)
> > > +{
> > 
> > Same here..
> 
> Same reason as above.
> 
> > 
> > > +       switch (sprn) {
> > > +       case SPRN_PMCR:
> > > +               mtspr(SPRN_PMCR, val);
> > > +               return;
> > > +
> > > +       case SPRN_PMICR:
> > > +               mtspr(SPRN_PMICR, val);
> > > +               return;
> > > +
> > > +       case SPRN_PMSR:
> > > +               mtspr(SPRN_PMSR, val);
> > > +               return;
> > > +       }
> > > +       BUG();
> > > +}
> > > +
> > > +static void set_pstate(void *pstate)
> > > +{
> > > +       unsigned long val;
> > > +       unsigned long pstate_ul = *(unsigned long *) pstate;
> > 
> > Why not sending value only to this routine instead of pointer?
> 
> Well this function is called via an smp_call_function. so, cannot send
> a value :(
> 
> > 
> > > +
> > > +       val = get_pmspr(SPRN_PMCR);
> > > +       val = val & 0x0000ffffffffffffULL;
> > 
> > Maybe a blank line here?
> 
> Ok.
> 
> > 
> > > +       /* Set both global(bits 56..63) and local(bits 48..55) PStates */
> > > +       val = val | (pstate_ul << 56) | (pstate_ul << 48);
> > 
> > here as well?
> 
> Ok.
> > 
> 
> > > +       pr_debug("Setting cpu %d pmcr to %016lX\n", smp_processor_id(), val);
> > > +       set_pmspr(SPRN_PMCR, val);
> > > +}
> > > +
> > > +static int powernv_set_freq(cpumask_var_t cpus, unsigned int new_index)
> > > +{
> > > +       unsigned long val = (unsigned long) powernv_pstate_ids[new_index];
> > 
> > I think automatic type conversion will happen here.
> 
> Ok. Will fix this.

Most of the conversions are verbose mainly to help readability of the
required sign-extensions.  There is scope to make these concise.

> > 
> > > +
> > > +       /*
> > > +        * Use smp_call_function to send IPI and execute the
> > > +        * mtspr on target cpu.  We could do that without IPI
> > > +        * if current CPU is within policy->cpus (core)
> > > +        */
> > 
> > Hmm, interesting I also feel there are cases where this routine can
> > get called from other CPUs. Can you list those use cases where it can
> > happen? Governors will mostly call this from one of the CPUs present
> > in policy->cpus.
> 
> Consider the case when the governor is userspace and we are executing 
> 
>     # echo freqvalue  > 
>          /sys/devices/system/cpu/cpu<i>/cpufreq/scaling_setspeed 
> 
> from a cpu <j> which is not in policy->cpus of cpu i. 
> 
> 
> > > +static int powernv_cpufreq_cpu_init(struct cpufreq_policy *policy)
> > > +{
> > > +       int base, i;
> > > +
> > > +#ifdef CONFIG_SMP
> > 
> > What will break if you don't have this ifdef here? Without that as well
> > below code should work?
> > 
> > > +       base = cpu_first_thread_sibling(policy->cpu);
> > > +
> > > +       for (i = 0; i < threads_per_core; i++)
> > > +               cpumask_set_cpu(base + i, policy->cpus);
> > > +#endif
> > > +       policy->cpuinfo.transition_latency = 25000;
> > > +
> > > +       policy->cur = powernv_freqs[0].frequency;
> > > +       cpufreq_frequency_table_get_attr(powernv_freqs, policy->cpu);
> > 
> > This doesn't exist anymore.
> 
> Didn't get this comment!
> 
> > 
> > > +       return cpufreq_frequency_table_cpuinfo(policy, powernv_freqs);
> > 
> > Have you written this driver long time back? CPUFreq core has been
> > cleaned up heavily since last few kernel releases and I think there are
> > better helper routines available now.
> 
> Yup it was written quite a while ago. And yeah, CPUFreq has changed
> quite a bit since the last time I saw it :-)

Yes, the driver is more than a year old and based on even older
implementation that I had written.  I kept up until I got a compiler
warning or functional issue.  Definitely could not catch-up with your
cleanups :)

> > 
> > > +}
> > > +
> > > +static int powernv_cpufreq_cpu_exit(struct cpufreq_policy *policy)
> > > +{
> > > +       cpufreq_frequency_table_put_attr(policy->cpu);
> > > +       return 0;
> > > +}
> > 
> > You don't need this..
> 
> Why not ?
> 
> > 
> > > +static int powernv_cpufreq_verify(struct cpufreq_policy *policy)
> > > +{
> > > +       return cpufreq_frequency_table_verify(policy, powernv_freqs);
> > > +}
> > 
> > use generic verify function pointer instead..
> > 
> > > +static int powernv_cpufreq_target(struct cpufreq_policy *policy,
> > > +                             unsigned int target_freq,
> > > +                             unsigned int relation)
> > 
> > use target_index() instead..
> > 
> > > +{
> > > +       int rc;
> > > +       struct cpufreq_freqs freqs;
> > > +       unsigned int new_index;
> > > +
> > > +       cpufreq_frequency_table_target(policy, powernv_freqs, target_freq,
> > > +                                      relation, &new_index);
> > > +
> > > +       freqs.old = policy->cur;
> > > +       freqs.new = powernv_freqs[new_index].frequency;
> > > +       freqs.cpu = policy->cpu;
> > > +
> > > +       mutex_lock(&freq_switch_mutex);
> > 
> > Why do you need this lock for?
> 
> I guess it was to serialize accesses to PMCR. But Srivatsa's patch
> converts this to a per-core lock which probably is no longer required
> after your cpufreq_freq_transition_begin/end() patch series.

Right.  Frequency transitions are per-core, the h/w SPRs are per-core.
We need to prevent threads from colliding on h/w SPR access.  We do
make the lock per-core later in the series as mentioned above.

Gautham has addressed most comments.

Thanks,
Vaidy

^ permalink raw reply	[flat|nested] 80+ messages in thread

* Re: [PATCH v3 1/5] powernv: cpufreq driver for powernv platform
  2014-03-21 10:54         ` Viresh Kumar
@ 2014-03-21 14:54           ` Gautham R Shenoy
  -1 siblings, 0 replies; 80+ messages in thread
From: Gautham R Shenoy @ 2014-03-21 14:54 UTC (permalink / raw)
  To: Viresh Kumar
  Cc: ego, linuxppc-dev, Linux PM list, Benjamin Herrenschmidt,
	Vaidyanathan Srinivasan, Srivatsa S. Bhat, Anton Blanchard

On Fri, Mar 21, 2014 at 04:24:27PM +0530, Viresh Kumar wrote:
> >> > +static int powernv_cpufreq_cpu_init(struct cpufreq_policy *policy)
> >> > +{
> >> > +       int base, i;
> >> > +
> >> > +       base = cpu_first_thread_sibling(policy->cpu);
> >> > +
> >> > +       for (i = 0; i < threads_per_core; i++)
> >> > +               cpumask_set_cpu(base + i, policy->cpus);
> >> > +       policy->cpuinfo.transition_latency = 25000;
> >> > +
> >> > +       policy->cur = powernv_freqs[0].frequency;
> >> > +       cpufreq_frequency_table_get_attr(powernv_freqs, policy->cpu);
> >>
> >> This doesn't exist anymore.

Ok, I guess the right thing to do at this point is call

   cpufreq_table_validate_and_show(policy, powernv_freqs);

Will fix the code to take care of this.
--
Thanks and Regards
gautham.


^ permalink raw reply	[flat|nested] 80+ messages in thread

* Re: [PATCH v3 1/5] powernv: cpufreq driver for powernv platform
@ 2014-03-21 14:54           ` Gautham R Shenoy
  0 siblings, 0 replies; 80+ messages in thread
From: Gautham R Shenoy @ 2014-03-21 14:54 UTC (permalink / raw)
  To: Viresh Kumar
  Cc: ego, Linux PM list, linuxppc-dev, Anton Blanchard, Srivatsa S. Bhat

On Fri, Mar 21, 2014 at 04:24:27PM +0530, Viresh Kumar wrote:
> >> > +static int powernv_cpufreq_cpu_init(struct cpufreq_policy *policy)
> >> > +{
> >> > +       int base, i;
> >> > +
> >> > +       base = cpu_first_thread_sibling(policy->cpu);
> >> > +
> >> > +       for (i = 0; i < threads_per_core; i++)
> >> > +               cpumask_set_cpu(base + i, policy->cpus);
> >> > +       policy->cpuinfo.transition_latency = 25000;
> >> > +
> >> > +       policy->cur = powernv_freqs[0].frequency;
> >> > +       cpufreq_frequency_table_get_attr(powernv_freqs, policy->cpu);
> >>
> >> This doesn't exist anymore.

Ok, I guess the right thing to do at this point is call

   cpufreq_table_validate_and_show(policy, powernv_freqs);

Will fix the code to take care of this.
--
Thanks and Regards
gautham.

^ permalink raw reply	[flat|nested] 80+ messages in thread

* RE: [PATCH v3 5/5] powernv:cpufreq: Implement the driver->get() method
  2014-03-21 12:34             ` Viresh Kumar
@ 2014-03-21 15:01               ` David Laight
  -1 siblings, 0 replies; 80+ messages in thread
From: David Laight @ 2014-03-21 15:01 UTC (permalink / raw)
  To: 'Viresh Kumar'; +Cc: ego, linuxppc-dev, Linux PM list

From: Viresh Kumar [mailto:viresh.kumar@linaro.org]
> On 21 March 2014 17:31, David Laight <David.Laight@aculab.com> wrote:
> >> *(int *)ret_freq = freq;
> >
> > Because it is very likely to be wrong.
> > In general casts of pointers to integer types are dangerous.
> 
> Where are we converting pointers to integers? We are doing a
> cast from 'void * ' to 'int *' and then using indirection operator
> to set its value.

You mis-parsed what I wrote, try:
In general casts of 'pointer to integer' types are dangerous.

Somewhere, much higher up the call stack, the address of an integer
variable is being taken and then passed as the 'void *' parameter.

The 'problem' is that it is quite easily to pass the address of
a 'long' instead. On 32bit and LE systems this won't always
be a problem. On 64bit BE it all goes wrong.

	David




^ permalink raw reply	[flat|nested] 80+ messages in thread

* RE: [PATCH v3 5/5] powernv:cpufreq: Implement the driver->get() method
@ 2014-03-21 15:01               ` David Laight
  0 siblings, 0 replies; 80+ messages in thread
From: David Laight @ 2014-03-21 15:01 UTC (permalink / raw)
  To: 'Viresh Kumar'; +Cc: ego, Linux PM list, linuxppc-dev

From: Viresh Kumar [mailto:viresh.kumar@linaro.org]
> On 21 March 2014 17:31, David Laight <David.Laight@aculab.com> wrote:
> >> *(int *)ret_freq =3D freq;
> >
> > Because it is very likely to be wrong.
> > In general casts of pointers to integer types are dangerous.
>=20
> Where are we converting pointers to integers? We are doing a
> cast from 'void * ' to 'int *' and then using indirection operator
> to set its value.

You mis-parsed what I wrote, try:
In general casts of 'pointer to integer' types are dangerous.

Somewhere, much higher up the call stack, the address of an integer
variable is being taken and then passed as the 'void *' parameter.

The 'problem' is that it is quite easily to pass the address of
a 'long' instead. On 32bit and LE systems this won't always
be a problem. On 64bit BE it all goes wrong.

	David

^ permalink raw reply	[flat|nested] 80+ messages in thread

* Re: [PATCH v3 5/5] powernv:cpufreq: Implement the driver->get() method
  2014-03-21 11:04       ` Gautham R Shenoy
@ 2014-03-21 22:56         ` Benjamin Herrenschmidt
  -1 siblings, 0 replies; 80+ messages in thread
From: Benjamin Herrenschmidt @ 2014-03-21 22:56 UTC (permalink / raw)
  To: ego; +Cc: Viresh Kumar, linuxppc-dev, Linux PM list, svaidy

On Fri, 2014-03-21 at 16:34 +0530, Gautham R Shenoy wrote:

> > >
> > > +/*
> > > + * Computes the current frequency on this cpu
> > > + * and stores the result in *ret_freq.
> > > + */
> > > +static void powernv_read_cpu_freq(void *ret_freq)
> > > +{
> > > +       unsigned long pmspr_val;
> > > +       s8 local_pstate_id;
> > > +       int *cur_freq, freq, pstate_id;
> > > +
> > > +       cur_freq = (int *)ret_freq;
> > 
> > You don't need cur_freq variable at all..
> 
> I don't like it either. But the compiler complains without this hack
> :-(

Casting integers into void * is a recipe for disaster... what is that
supposed to be about ? We lose all type checking and get exposed
to endian issues etc... the day somebody uses a different type on both
sides.

Also is "freq" a frequency ? In this case an int isn't big enough.

Cheers,
Ben.



^ permalink raw reply	[flat|nested] 80+ messages in thread

* Re: [PATCH v3 5/5] powernv:cpufreq: Implement the driver->get() method
@ 2014-03-21 22:56         ` Benjamin Herrenschmidt
  0 siblings, 0 replies; 80+ messages in thread
From: Benjamin Herrenschmidt @ 2014-03-21 22:56 UTC (permalink / raw)
  To: ego; +Cc: Viresh Kumar, Linux PM list, linuxppc-dev

On Fri, 2014-03-21 at 16:34 +0530, Gautham R Shenoy wrote:

> > >
> > > +/*
> > > + * Computes the current frequency on this cpu
> > > + * and stores the result in *ret_freq.
> > > + */
> > > +static void powernv_read_cpu_freq(void *ret_freq)
> > > +{
> > > +       unsigned long pmspr_val;
> > > +       s8 local_pstate_id;
> > > +       int *cur_freq, freq, pstate_id;
> > > +
> > > +       cur_freq = (int *)ret_freq;
> > 
> > You don't need cur_freq variable at all..
> 
> I don't like it either. But the compiler complains without this hack
> :-(

Casting integers into void * is a recipe for disaster... what is that
supposed to be about ? We lose all type checking and get exposed
to endian issues etc... the day somebody uses a different type on both
sides.

Also is "freq" a frequency ? In this case an int isn't big enough.

Cheers,
Ben.

^ permalink raw reply	[flat|nested] 80+ messages in thread

* Re: [PATCH v3 1/5] powernv: cpufreq driver for powernv platform
  2014-03-21 14:48         ` Vaidyanathan Srinivasan
@ 2014-03-22  3:43           ` Viresh Kumar
  -1 siblings, 0 replies; 80+ messages in thread
From: Viresh Kumar @ 2014-03-22  3:43 UTC (permalink / raw)
  To: Vaidyanathan Srinivasan
  Cc: Gautham R Shenoy, linuxppc-dev, Linux PM list,
	Benjamin Herrenschmidt, Srivatsa S. Bhat, Anton Blanchard

On 21 March 2014 20:18, Vaidyanathan Srinivasan
<svaidy@linux.vnet.ibm.com> wrote:
> Yeah, I had the driver written using driver_data to store pstates.
> Gautham found the bug that we are missing one PState when we match the
> ID with CPUFREQ_BOOST_FREQ!

I see..

> We did not know that you have taken care of those issues.  Ideally
> I did expect that driver_data should not be touched by the framework.
> Thanks for fixing that and allowing the back-end driver to use
> driver_data.

No, I haven't fixed anything yet. And this piece of code still exists.
I will see if I can get this fixed, by that time you can continue the
way your code is there in this version.

^ permalink raw reply	[flat|nested] 80+ messages in thread

* Re: [PATCH v3 1/5] powernv: cpufreq driver for powernv platform
@ 2014-03-22  3:43           ` Viresh Kumar
  0 siblings, 0 replies; 80+ messages in thread
From: Viresh Kumar @ 2014-03-22  3:43 UTC (permalink / raw)
  To: Vaidyanathan Srinivasan
  Cc: Gautham R Shenoy, Linux PM list, linuxppc-dev, Anton Blanchard,
	Srivatsa S. Bhat

On 21 March 2014 20:18, Vaidyanathan Srinivasan
<svaidy@linux.vnet.ibm.com> wrote:
> Yeah, I had the driver written using driver_data to store pstates.
> Gautham found the bug that we are missing one PState when we match the
> ID with CPUFREQ_BOOST_FREQ!

I see..

> We did not know that you have taken care of those issues.  Ideally
> I did expect that driver_data should not be touched by the framework.
> Thanks for fixing that and allowing the back-end driver to use
> driver_data.

No, I haven't fixed anything yet. And this piece of code still exists.
I will see if I can get this fixed, by that time you can continue the
way your code is there in this version.

^ permalink raw reply	[flat|nested] 80+ messages in thread

* Re: [PATCH v3 1/5] powernv: cpufreq driver for powernv platform
  2014-03-21 14:54           ` Gautham R Shenoy
@ 2014-03-22  3:43             ` Viresh Kumar
  -1 siblings, 0 replies; 80+ messages in thread
From: Viresh Kumar @ 2014-03-22  3:43 UTC (permalink / raw)
  To: ego
  Cc: linuxppc-dev, Linux PM list, Benjamin Herrenschmidt,
	Vaidyanathan Srinivasan, Srivatsa S. Bhat, Anton Blanchard

On 21 March 2014 20:24, Gautham R Shenoy <ego@linux.vnet.ibm.com> wrote:
> Ok, I guess the right thing to do at this point is call
>
>    cpufreq_table_validate_and_show(policy, powernv_freqs);
>
> Will fix the code to take care of this.

Yes.

^ permalink raw reply	[flat|nested] 80+ messages in thread

* Re: [PATCH v3 1/5] powernv: cpufreq driver for powernv platform
@ 2014-03-22  3:43             ` Viresh Kumar
  0 siblings, 0 replies; 80+ messages in thread
From: Viresh Kumar @ 2014-03-22  3:43 UTC (permalink / raw)
  To: ego; +Cc: Linux PM list, linuxppc-dev, Anton Blanchard, Srivatsa S. Bhat

On 21 March 2014 20:24, Gautham R Shenoy <ego@linux.vnet.ibm.com> wrote:
> Ok, I guess the right thing to do at this point is call
>
>    cpufreq_table_validate_and_show(policy, powernv_freqs);
>
> Will fix the code to take care of this.

Yes.

^ permalink raw reply	[flat|nested] 80+ messages in thread

* Re: [PATCH v3 5/5] powernv:cpufreq: Implement the driver->get() method
  2014-03-21 22:56         ` Benjamin Herrenschmidt
@ 2014-03-22  7:53           ` Gautham R Shenoy
  -1 siblings, 0 replies; 80+ messages in thread
From: Gautham R Shenoy @ 2014-03-22  7:53 UTC (permalink / raw)
  To: Benjamin Herrenschmidt
  Cc: ego, Viresh Kumar, linuxppc-dev, Linux PM list, svaidy

Hi Ben,

On Sat, Mar 22, 2014 at 09:56:30AM +1100, Benjamin Herrenschmidt wrote:
> On Fri, 2014-03-21 at 16:34 +0530, Gautham R Shenoy wrote:
> 
> > > >
> > > > +/*
> > > > + * Computes the current frequency on this cpu
> > > > + * and stores the result in *ret_freq.
> > > > + */
> > > > +static void powernv_read_cpu_freq(void *ret_freq)
> > > > +{
> > > > +       unsigned long pmspr_val;
> > > > +       s8 local_pstate_id;
> > > > +       int *cur_freq, freq, pstate_id;
> > > > +
> > > > +       cur_freq = (int *)ret_freq;
> > > 
> > > You don't need cur_freq variable at all..
> > 
> > I don't like it either. But the compiler complains without this hack
> > :-(
> 
> Casting integers into void * is a recipe for disaster... what is that
> supposed to be about ?

Like I mentioned elsewhere on this thread, we're calling
powernv_read_cpu_freq via an smp_call_function(). We use this to
obtain the frequency on the cpu where powernv_read_cpu_freq
executes and return it to the caller of smp_call_function.

> We lose all type checking and get exposed
> to endian issues etc... the day somebody uses a different type on both
> sides.
> 
Yes, I understand the problem now. I'll think of a safer way to pass
the return value.

> Also is "freq" a frequency ? In this case an int isn't big enough.

freq is the frequency stored in the cpufreq_table. The value is in
kHz. So, int should be big enough.

> Cheers,
> Ben.
> 
>

--
Thanks and Regards
gautham. 


^ permalink raw reply	[flat|nested] 80+ messages in thread

* Re: [PATCH v3 5/5] powernv:cpufreq: Implement the driver->get() method
@ 2014-03-22  7:53           ` Gautham R Shenoy
  0 siblings, 0 replies; 80+ messages in thread
From: Gautham R Shenoy @ 2014-03-22  7:53 UTC (permalink / raw)
  To: Benjamin Herrenschmidt; +Cc: linuxppc-dev, ego, Linux PM list, Viresh Kumar

Hi Ben,

On Sat, Mar 22, 2014 at 09:56:30AM +1100, Benjamin Herrenschmidt wrote:
> On Fri, 2014-03-21 at 16:34 +0530, Gautham R Shenoy wrote:
> 
> > > >
> > > > +/*
> > > > + * Computes the current frequency on this cpu
> > > > + * and stores the result in *ret_freq.
> > > > + */
> > > > +static void powernv_read_cpu_freq(void *ret_freq)
> > > > +{
> > > > +       unsigned long pmspr_val;
> > > > +       s8 local_pstate_id;
> > > > +       int *cur_freq, freq, pstate_id;
> > > > +
> > > > +       cur_freq = (int *)ret_freq;
> > > 
> > > You don't need cur_freq variable at all..
> > 
> > I don't like it either. But the compiler complains without this hack
> > :-(
> 
> Casting integers into void * is a recipe for disaster... what is that
> supposed to be about ?

Like I mentioned elsewhere on this thread, we're calling
powernv_read_cpu_freq via an smp_call_function(). We use this to
obtain the frequency on the cpu where powernv_read_cpu_freq
executes and return it to the caller of smp_call_function.

> We lose all type checking and get exposed
> to endian issues etc... the day somebody uses a different type on both
> sides.
> 
Yes, I understand the problem now. I'll think of a safer way to pass
the return value.

> Also is "freq" a frequency ? In this case an int isn't big enough.

freq is the frequency stored in the cpufreq_table. The value is in
kHz. So, int should be big enough.

> Cheers,
> Ben.
> 
>

--
Thanks and Regards
gautham. 

^ permalink raw reply	[flat|nested] 80+ messages in thread

end of thread, other threads:[~2014-03-22  7:54 UTC | newest]

Thread overview: 80+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2014-03-20 12:10 [PATCH v3 0/5] powernv: Enable Dynamic Frequency Gautham R. Shenoy
2014-03-20 12:10 ` Gautham R. Shenoy
2014-03-20 12:10 ` [PATCH v3 1/5] powernv: cpufreq driver for powernv platform Gautham R. Shenoy
2014-03-20 12:10   ` Gautham R. Shenoy
2014-03-21  8:41   ` Viresh Kumar
2014-03-21  8:41     ` Viresh Kumar
2014-03-21 10:43     ` Gautham R Shenoy
2014-03-21 10:43       ` Gautham R Shenoy
2014-03-21 10:54       ` Viresh Kumar
2014-03-21 10:54         ` Viresh Kumar
2014-03-21 11:40         ` Srivatsa S. Bhat
2014-03-21 11:40           ` Srivatsa S. Bhat
2014-03-21 13:23         ` Gautham R Shenoy
2014-03-21 13:23           ` Gautham R Shenoy
2014-03-21 13:34           ` Viresh Kumar
2014-03-21 13:34             ` Viresh Kumar
2014-03-21 14:54         ` Gautham R Shenoy
2014-03-21 14:54           ` Gautham R Shenoy
2014-03-22  3:43           ` Viresh Kumar
2014-03-22  3:43             ` Viresh Kumar
2014-03-21 14:48       ` Vaidyanathan Srinivasan
2014-03-21 14:48         ` Vaidyanathan Srinivasan
2014-03-22  3:43         ` Viresh Kumar
2014-03-22  3:43           ` Viresh Kumar
2014-03-21 11:45     ` Srivatsa S. Bhat
2014-03-21 11:45       ` Srivatsa S. Bhat
2014-03-21 11:47       ` Viresh Kumar
2014-03-21 11:47         ` Viresh Kumar
2014-03-20 12:10 ` [PATCH v3 2/5] powernv,cpufreq:Add per-core locking to serialize frequency transitions Gautham R. Shenoy
2014-03-20 12:10   ` [PATCH v3 2/5] powernv, cpufreq:Add " Gautham R. Shenoy
2014-03-21  6:24   ` [PATCH v3 2/5] powernv,cpufreq:Add " Gautham R Shenoy
2014-03-21  6:24     ` [PATCH v3 2/5] powernv, cpufreq:Add " Gautham R Shenoy
2014-03-21  8:42   ` Viresh Kumar
2014-03-21  8:42     ` Viresh Kumar
2014-03-21  9:56     ` [PATCH v3 2/5] powernv,cpufreq:Add " Srivatsa S. Bhat
2014-03-21  9:56       ` [PATCH v3 2/5] powernv, cpufreq:Add " Srivatsa S. Bhat
2014-03-20 12:10 ` [PATCH v3 3/5] powernv:cpufreq: Create pstate_id_to_freq() helper Gautham R. Shenoy
2014-03-20 12:10   ` Gautham R. Shenoy
2014-03-21  6:24   ` Gautham R Shenoy
2014-03-21  6:24     ` Gautham R Shenoy
2014-03-20 12:10 ` [PATCH v3 4/5] powernv:cpufreq: Export nominal frequency via sysfs Gautham R. Shenoy
2014-03-20 12:10   ` Gautham R. Shenoy
2014-03-21  8:47   ` Viresh Kumar
2014-03-21  8:47     ` Viresh Kumar
2014-03-21  9:55     ` Gautham R Shenoy
2014-03-21  9:55       ` Gautham R Shenoy
2014-03-21  9:57       ` Viresh Kumar
2014-03-21  9:57         ` Viresh Kumar
2014-03-20 12:11 ` [PATCH v3 5/5] powernv:cpufreq: Implement the driver->get() method Gautham R. Shenoy
2014-03-20 12:11   ` Gautham R. Shenoy
2014-03-21  6:25   ` Gautham R Shenoy
2014-03-21  6:25     ` Gautham R Shenoy
2014-03-21  9:31   ` Viresh Kumar
2014-03-21  9:31     ` Viresh Kumar
2014-03-21 11:04     ` Gautham R Shenoy
2014-03-21 11:04       ` Gautham R Shenoy
2014-03-21 11:45       ` Viresh Kumar
2014-03-21 11:45         ` Viresh Kumar
2014-03-21 12:01         ` David Laight
2014-03-21 12:01           ` David Laight
2014-03-21 12:31           ` Gautham R Shenoy
2014-03-21 12:31             ` Gautham R Shenoy
2014-03-21 12:34           ` Viresh Kumar
2014-03-21 12:34             ` Viresh Kumar
2014-03-21 15:01             ` David Laight
2014-03-21 15:01               ` David Laight
2014-03-21 13:04         ` Gautham R Shenoy
2014-03-21 13:04           ` Gautham R Shenoy
2014-03-21 13:12           ` Viresh Kumar
2014-03-21 13:12             ` Viresh Kumar
2014-03-21 14:25             ` Gautham R Shenoy
2014-03-21 14:25               ` Gautham R Shenoy
2014-03-21 22:56       ` Benjamin Herrenschmidt
2014-03-21 22:56         ` Benjamin Herrenschmidt
2014-03-22  7:53         ` Gautham R Shenoy
2014-03-22  7:53           ` Gautham R Shenoy
2014-03-21  7:46 ` [PATCH v3 0/5] powernv: Enable Dynamic Frequency Viresh Kumar
2014-03-21  7:46   ` Viresh Kumar
2014-03-21  8:19   ` Gautham R Shenoy
2014-03-21  8:19     ` Gautham R Shenoy

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.