linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [v3 PATCH 0/3] powernv-cpufreq: Multiple pstate related fixes.
@ 2017-12-13  6:57 Gautham R. Shenoy
  2017-12-13  6:57 ` [v3 PATCH 1/3] powernv-cpufreq: Add helper to extract pstate from PMSR Gautham R. Shenoy
                   ` (3 more replies)
  0 siblings, 4 replies; 15+ messages in thread
From: Gautham R. Shenoy @ 2017-12-13  6:57 UTC (permalink / raw)
  To: Shilpasri G Bhat, viresh.kumar, rjw, huntbag, akshay.adiga,
	Michael Ellerman, Vaidyanathan Srinivasan, Balbir Singh
  Cc: linux-pm, linux-kernel, linuxppc-dev, Gautham R. Shenoy

From: "Gautham R. Shenoy" <ego@linux.vnet.ibm.com>

 This is a third version of the patch to fix pstate related issues in
 the powernv-cpufreq driver.

 The previous versions can be found here:
 [v2]: https://lkml.org/lkml/2017/12/7/1562
 [v1]: https://lkml.org/lkml/2017/11/29/1338

 On POWERNV platform, Pstates are 8-bit values. On POWER8 they are
 negatively numbered while on POWER9 they are positively
 numbered. Thus, on POWER9, the maximum number of pstates could be as
 high as 256.
    
 In multiple places, the current code interprets pstates as a signed
 8-bit value which subsequently gets assigned to a signed integer
 variable. This causes a problem on POWER9 platforms which have more
 than 128 pstates.  On such systems, on a CPU that is in a lower
 pstate whose number is greater than 128, querying the current pstate
 via the pstate_to_idx() returns a "pstate X is out of bound" error
 message and the current pstate is reported as the nominal
 pstate. This is due to the manner in which the bounds are checked in
 pstate_to_idx which again depends on the sign of pstates and whether
 the pstates max to min are monotonically increasing or decreasing.
    
 Further the current code makes a couple of assumptions which is not
 guaranteed by the device-tree bindings:

      1) Pstate ids are continguous.
    
      2) Every Pstate should always lie between the max and the min
         pstates that are explicitly reported in the device tree.
    
Both these assumptions are unwarranted and can change on future
platforms.
    
In this patch-series, we fix the implementation via the following
changes:

PATCH 1: Define a helper function to correctly extract the pstates
         from the PMCR and take care of any sign extentions. This is
         an immediate fix to add the ability to handle more than 128
         pstates on POWER9 systems.
    
PATCH 2: Define a hash-map which will return the index into the
         cpufreq frequency table for a given pstate. Use this hashmap
         in the implementation of pstate_to_idx(). This does away with
         the assumptions (1) mentioned above, and will work with non
         continguous pstate ids. If no entry exists for a particular
         pstate, then such a pstate is treated as being out of
         bounds. This gets rid of assumption (2).

PATCH 3: Treat pstates as opaque 8-bit values consistent with the
         definitions in the PMSR and PMCR. We no longer need any
         sign-extentions nor do we require to interpret the sign of
         the pstates anywhere in the code.
    

Gautham R. Shenoy (3):
  powernv-cpufreq: Add helper to extract pstate from PMSR
  powernv-cpufreq: Fix pstate_to_idx() to handle non-continguous pstates
  powernv-cpufreq: Treat pstates as opaque 8-bit values

 drivers/cpufreq/powernv-cpufreq.c | 139 ++++++++++++++++++++++++--------------
 1 file changed, 90 insertions(+), 49 deletions(-)

-- 
1.9.4

^ permalink raw reply	[flat|nested] 15+ messages in thread

* [v3 PATCH 1/3] powernv-cpufreq: Add helper to extract pstate from PMSR
  2017-12-13  6:57 [v3 PATCH 0/3] powernv-cpufreq: Multiple pstate related fixes Gautham R. Shenoy
@ 2017-12-13  6:57 ` Gautham R. Shenoy
  2017-12-17  3:04   ` Balbir Singh
  2017-12-13  6:57 ` [v3 PATCH 2/3] powernv-cpufreq: Fix pstate_to_idx() to handle non-continguous pstates Gautham R. Shenoy
                   ` (2 subsequent siblings)
  3 siblings, 1 reply; 15+ messages in thread
From: Gautham R. Shenoy @ 2017-12-13  6:57 UTC (permalink / raw)
  To: Shilpasri G Bhat, viresh.kumar, rjw, huntbag, akshay.adiga,
	Michael Ellerman, Vaidyanathan Srinivasan, Balbir Singh
  Cc: linux-pm, linux-kernel, linuxppc-dev, Gautham R. Shenoy

From: "Gautham R. Shenoy" <ego@linux.vnet.ibm.com>

On POWERNV platform, the fields for pstates in the Power Management
Status Register (PMSR) and the Power Management Control Register
(PMCR) are 8-bits wide. On POWER8 the pstates are negatively numbered
while on POWER9 they are positively numbered.

The device-tree exports pstates as 32-bit entries. The device-tree
implementation sign-extends the 8-bit pstate values to obtain the
corresponding 32-bit entry.

Eg: On POWER8, a pstate value 0x82 [-126] is represented in the
device-tree as 0xfffffff82 while on POWER9, the same value 0x82 [130]
is represented in the device-tree as 0x00000082.

The powernv-cpufreq driver implementation represents pstates using the
integer type. In multiple places in the driver, the code interprets
the pstates extracted from the PMSR as a signed byte and assigns it to
a integer variable to get the sign-extention.

On POWER9 platforms which have greater than 128 pstates, this results
in the driver performing incorrect sign-extention, and thereby
treating a legitimate pstate (say 130) as an invalid pstates (since it
is interpreted as -126).

This patch fixes the issue by implementing a helper function to
extract Pstates from PMSR register, and correctly sign-extend it to be
consistent with the values provided by the device-tree.

Signed-off-by: Gautham R. Shenoy <ego@linux.vnet.ibm.com>
---
 drivers/cpufreq/powernv-cpufreq.c | 37 +++++++++++++++++++++++--------------
 1 file changed, 23 insertions(+), 14 deletions(-)

diff --git a/drivers/cpufreq/powernv-cpufreq.c b/drivers/cpufreq/powernv-cpufreq.c
index b6d7c4c..f46b60f 100644
--- a/drivers/cpufreq/powernv-cpufreq.c
+++ b/drivers/cpufreq/powernv-cpufreq.c
@@ -41,11 +41,9 @@
 #define POWERNV_MAX_PSTATES	256
 #define PMSR_PSAFE_ENABLE	(1UL << 30)
 #define PMSR_SPR_EM_DISABLE	(1UL << 31)
-#define PMSR_MAX(x)		((x >> 32) & 0xFF)
+#define MAX_PSTATE_SHIFT	32
 #define LPSTATE_SHIFT		48
 #define GPSTATE_SHIFT		56
-#define GET_LPSTATE(x)		(((x) >> LPSTATE_SHIFT) & 0xFF)
-#define GET_GPSTATE(x)		(((x) >> GPSTATE_SHIFT) & 0xFF)
 
 #define MAX_RAMP_DOWN_TIME				5120
 /*
@@ -94,6 +92,7 @@ struct global_pstate_info {
 };
 
 static struct cpufreq_frequency_table powernv_freqs[POWERNV_MAX_PSTATES+1];
+u32 pstate_sign_prefix;
 static bool rebooting, throttled, occ_reset;
 
 static const char * const throttle_reason[] = {
@@ -148,6 +147,20 @@ enum throttle_reason_type {
 	bool wof_enabled;
 } powernv_pstate_info;
 
+static inline int extract_pstate(u64 pmsr_val, unsigned int shift)
+{
+	int ret = ((pmsr_val >> shift) & 0xFF);
+
+	if (!ret)
+		return ret;
+
+	return (pstate_sign_prefix | ret);
+}
+
+#define extract_local_pstate(x) extract_pstate(x, LPSTATE_SHIFT)
+#define extract_global_pstate(x) extract_pstate(x, GPSTATE_SHIFT)
+#define extract_max_pstate(x)  extract_pstate(x, MAX_PSTATE_SHIFT)
+
 /* Use following macros for conversions between pstate_id and index */
 static inline int idx_to_pstate(unsigned int i)
 {
@@ -278,6 +291,9 @@ static int init_powernv_pstates(void)
 
 	powernv_pstate_info.nr_pstates = nr_pstates;
 	pr_debug("NR PStates %d\n", nr_pstates);
+
+	pstate_sign_prefix = pstate_min & ~0xFF;
+
 	for (i = 0; i < nr_pstates; i++) {
 		u32 id = be32_to_cpu(pstate_ids[i]);
 		u32 freq = be32_to_cpu(pstate_freqs[i]);
@@ -438,17 +454,10 @@ struct powernv_smp_call_data {
 static void powernv_read_cpu_freq(void *arg)
 {
 	unsigned long pmspr_val;
-	s8 local_pstate_id;
 	struct powernv_smp_call_data *freq_data = arg;
 
 	pmspr_val = get_pmspr(SPRN_PMSR);
-
-	/*
-	 * The local pstate id corresponds bits 48..55 in the PMSR.
-	 * Note: Watch out for the sign!
-	 */
-	local_pstate_id = (pmspr_val >> 48) & 0xFF;
-	freq_data->pstate_id = local_pstate_id;
+	freq_data->pstate_id = extract_local_pstate(pmspr_val);
 	freq_data->freq = pstate_id_to_freq(freq_data->pstate_id);
 
 	pr_debug("cpu %d pmsr %016lX pstate_id %d frequency %d kHz\n",
@@ -522,7 +531,7 @@ static void powernv_cpufreq_throttle_check(void *data)
 	chip = this_cpu_read(chip_info);
 
 	/* Check for Pmax Capping */
-	pmsr_pmax = (s8)PMSR_MAX(pmsr);
+	pmsr_pmax = extract_max_pstate(pmsr);
 	pmsr_pmax_idx = pstate_to_idx(pmsr_pmax);
 	if (pmsr_pmax_idx != powernv_pstate_info.max) {
 		if (chip->throttled)
@@ -645,8 +654,8 @@ void gpstate_timer_handler(struct timer_list *t)
 	 * value. Hence, read from PMCR to get correct data.
 	 */
 	val = get_pmspr(SPRN_PMCR);
-	freq_data.gpstate_id = (s8)GET_GPSTATE(val);
-	freq_data.pstate_id = (s8)GET_LPSTATE(val);
+	freq_data.gpstate_id = extract_global_pstate(val);
+	freq_data.pstate_id = extract_local_pstate(val);
 	if (freq_data.gpstate_id  == freq_data.pstate_id) {
 		reset_gpstates(policy);
 		spin_unlock(&gpstates->gpstate_lock);
-- 
1.9.4

^ permalink raw reply related	[flat|nested] 15+ messages in thread

* [v3 PATCH 2/3] powernv-cpufreq: Fix pstate_to_idx() to handle non-continguous pstates
  2017-12-13  6:57 [v3 PATCH 0/3] powernv-cpufreq: Multiple pstate related fixes Gautham R. Shenoy
  2017-12-13  6:57 ` [v3 PATCH 1/3] powernv-cpufreq: Add helper to extract pstate from PMSR Gautham R. Shenoy
@ 2017-12-13  6:57 ` Gautham R. Shenoy
  2017-12-17  3:15   ` Balbir Singh
  2017-12-13  6:57 ` [v3 PATCH 3/3] powernv-cpufreq: Treat pstates as opaque 8-bit values Gautham R. Shenoy
  2018-01-10  9:43 ` [v3 PATCH 0/3] powernv-cpufreq: Multiple pstate related fixes Viresh Kumar
  3 siblings, 1 reply; 15+ messages in thread
From: Gautham R. Shenoy @ 2017-12-13  6:57 UTC (permalink / raw)
  To: Shilpasri G Bhat, viresh.kumar, rjw, huntbag, akshay.adiga,
	Michael Ellerman, Vaidyanathan Srinivasan, Balbir Singh
  Cc: linux-pm, linux-kernel, linuxppc-dev, Gautham R. Shenoy

From: "Gautham R. Shenoy" <ego@linux.vnet.ibm.com>

The code in powernv-cpufreq, makes the following two assumptions which
are not guaranteed by the device-tree bindings:

    1) Pstate ids are continguous: This is used in pstate_to_idx() to
       obtain the reverse map from a pstate to it's corresponding
       entry into the cpufreq frequency table.

    2) Every Pstate should always lie between the max and the min
       pstates that are explicitly reported in the device tree: This
       is used to determine whether a pstate reported by the PMSR is
       out of bounds.

Both these assumptions are unwarranted and can change on future
platforms.

In this patch, we maintain the reverse map from a pstate to it's index
in the cpufreq frequency table and use this in pstate_to_idx(). This
does away with the assumptions (1) mentioned above, and will work with
non continguous pstate ids. If no entry exists for a particular
pstate, then such a pstate is treated as being out of bounds. This
gets rid of assumption (2).

On all the existing platforms, where the pstates are 8-bit long
values, the new implementation of pstate_to_idx() takes constant time.

Signed-off-by: Gautham R. Shenoy <ego@linux.vnet.ibm.com>
---
 drivers/cpufreq/powernv-cpufreq.c | 85 +++++++++++++++++++++++++++++----------
 1 file changed, 63 insertions(+), 22 deletions(-)

diff --git a/drivers/cpufreq/powernv-cpufreq.c b/drivers/cpufreq/powernv-cpufreq.c
index f46b60f..8e3dbca 100644
--- a/drivers/cpufreq/powernv-cpufreq.c
+++ b/drivers/cpufreq/powernv-cpufreq.c
@@ -29,6 +29,7 @@
 #include <linux/reboot.h>
 #include <linux/slab.h>
 #include <linux/cpu.h>
+#include <linux/hashtable.h>
 #include <trace/events/power.h>
 
 #include <asm/cputhreads.h>
@@ -38,7 +39,8 @@
 #include <asm/opal.h>
 #include <linux/timer.h>
 
-#define POWERNV_MAX_PSTATES	256
+#define POWERNV_MAX_PSTATES_ORDER  8
+#define POWERNV_MAX_PSTATES	(1UL << (POWERNV_MAX_PSTATES_ORDER))
 #define PMSR_PSAFE_ENABLE	(1UL << 30)
 #define PMSR_SPR_EM_DISABLE	(1UL << 31)
 #define MAX_PSTATE_SHIFT	32
@@ -92,6 +94,27 @@ struct global_pstate_info {
 };
 
 static struct cpufreq_frequency_table powernv_freqs[POWERNV_MAX_PSTATES+1];
+
+DEFINE_HASHTABLE(pstate_revmap, POWERNV_MAX_PSTATES_ORDER);
+/**
+ * struct pstate_idx_revmap_data: Entry in the hashmap pstate_revmap
+ *				  indexed by a function of pstate id.
+ *
+ * @pstate_id: pstate id for this entry.
+ *
+ * @cpufreq_table_idx: Index into the powernv_freqs
+ *		       cpufreq_frequency_table for frequency
+ *		       corresponding to pstate_id.
+ *
+ * @hentry: hlist_node that hooks this entry into the pstate_revmap
+ *	    hashtable
+ */
+struct pstate_idx_revmap_data {
+	int pstate_id;
+	unsigned int cpufreq_table_idx;
+	struct hlist_node hentry;
+};
+
 u32 pstate_sign_prefix;
 static bool rebooting, throttled, occ_reset;
 
@@ -161,39 +184,47 @@ static inline int extract_pstate(u64 pmsr_val, unsigned int shift)
 #define extract_global_pstate(x) extract_pstate(x, GPSTATE_SHIFT)
 #define extract_max_pstate(x)  extract_pstate(x, MAX_PSTATE_SHIFT)
 
-/* Use following macros for conversions between pstate_id and index */
+/* Use following functions for conversions between pstate_id and index */
+
+/**
+ * idx_to_pstate : Returns the pstate id corresponding to the
+ *		   frequency in the cpufreq frequency table
+ *		   powernv_freqs indexed by @i.
+ *
+ *		   If @i is out of bound, this will return the pstate
+ *		   corresponding to the nominal frequency.
+ */
 static inline int idx_to_pstate(unsigned int i)
 {
 	if (unlikely(i >= powernv_pstate_info.nr_pstates)) {
-		pr_warn_once("index %u is out of bound\n", i);
+		pr_warn_once("idx_to_pstate: index %u is out of bound\n", i);
 		return powernv_freqs[powernv_pstate_info.nominal].driver_data;
 	}
 
 	return powernv_freqs[i].driver_data;
 }
 
-static inline unsigned int pstate_to_idx(int pstate)
+/**
+ * pstate_to_idx : Returns the index in the cpufreq frequencytable
+ *		   powernv_freqs for the frequency whose corresponding
+ *		   pstate id is @pstate.
+ *
+ *		   If no frequency corresponding to @pstate is found,
+ *		   this will return the index of the nominal
+ *		   frequency.
+ */
+static unsigned int pstate_to_idx(int pstate)
 {
-	int min = powernv_freqs[powernv_pstate_info.min].driver_data;
-	int max = powernv_freqs[powernv_pstate_info.max].driver_data;
+	unsigned int key = pstate % POWERNV_MAX_PSTATES;
+	struct pstate_idx_revmap_data *revmap_data;
 
-	if (min > 0) {
-		if (unlikely((pstate < max) || (pstate > min))) {
-			pr_warn_once("pstate %d is out of bound\n", pstate);
-			return powernv_pstate_info.nominal;
-		}
-	} else {
-		if (unlikely((pstate > max) || (pstate < min))) {
-			pr_warn_once("pstate %d is out of bound\n", pstate);
-			return powernv_pstate_info.nominal;
-		}
+	hash_for_each_possible(pstate_revmap, revmap_data, hentry, key) {
+		if (revmap_data->pstate_id == pstate)
+			return revmap_data->cpufreq_table_idx;
 	}
-	/*
-	 * abs() is deliberately used so that is works with
-	 * both monotonically increasing and decreasing
-	 * pstate values
-	 */
-	return abs(pstate - idx_to_pstate(powernv_pstate_info.max));
+
+	pr_warn_once("pstate_to_idx: pstate %d not found\n", pstate);
+	return powernv_pstate_info.nominal;
 }
 
 static inline void reset_gpstates(struct cpufreq_policy *policy)
@@ -297,11 +328,21 @@ static int init_powernv_pstates(void)
 	for (i = 0; i < nr_pstates; i++) {
 		u32 id = be32_to_cpu(pstate_ids[i]);
 		u32 freq = be32_to_cpu(pstate_freqs[i]);
+		struct pstate_idx_revmap_data *revmap_data;
+		unsigned int key;
 
 		pr_debug("PState id %d freq %d MHz\n", id, freq);
 		powernv_freqs[i].frequency = freq * 1000; /* kHz */
 		powernv_freqs[i].driver_data = id;
 
+		revmap_data = (struct pstate_idx_revmap_data *)
+			      kmalloc(sizeof(*revmap_data), GFP_KERNEL);
+
+		revmap_data->pstate_id = id;
+		revmap_data->cpufreq_table_idx = i;
+		key = id % POWERNV_MAX_PSTATES;
+		hash_add(pstate_revmap, &revmap_data->hentry, key);
+
 		if (id == pstate_max)
 			powernv_pstate_info.max = i;
 		else if (id == pstate_nominal)
-- 
1.9.4

^ permalink raw reply related	[flat|nested] 15+ messages in thread

* [v3 PATCH 3/3] powernv-cpufreq: Treat pstates as opaque 8-bit values
  2017-12-13  6:57 [v3 PATCH 0/3] powernv-cpufreq: Multiple pstate related fixes Gautham R. Shenoy
  2017-12-13  6:57 ` [v3 PATCH 1/3] powernv-cpufreq: Add helper to extract pstate from PMSR Gautham R. Shenoy
  2017-12-13  6:57 ` [v3 PATCH 2/3] powernv-cpufreq: Fix pstate_to_idx() to handle non-continguous pstates Gautham R. Shenoy
@ 2017-12-13  6:57 ` Gautham R. Shenoy
  2017-12-17  3:17   ` Balbir Singh
  2018-01-10  9:43 ` [v3 PATCH 0/3] powernv-cpufreq: Multiple pstate related fixes Viresh Kumar
  3 siblings, 1 reply; 15+ messages in thread
From: Gautham R. Shenoy @ 2017-12-13  6:57 UTC (permalink / raw)
  To: Shilpasri G Bhat, viresh.kumar, rjw, huntbag, akshay.adiga,
	Michael Ellerman, Vaidyanathan Srinivasan, Balbir Singh
  Cc: linux-pm, linux-kernel, linuxppc-dev, Gautham R. Shenoy

From: "Gautham R. Shenoy" <ego@linux.vnet.ibm.com>

On POWER8 and POWER9, the PMSR and the PMCR registers define pstates
to be 8-bit wide values. The device-tree exports pstates as 32-bit
wide values of which the lower byte is the actual pstate.

The current implementation in the kernel treats pstates as integer
type, since it used to use the sign of the pstate for performing some
boundary-checks. This is no longer required after the patch
"powernv-cpufreq: Fix pstate_to_idx() to handle non-continguous
pstates".

So, in this patch, we modify the powernv-cpufreq driver to uniformly
treat pstates as opaque 8-bit values obtained from the device-tree or
the PMCR. This simplifies the extract_pstate() helper function since
we no longer no longer require to worry about the sign-extentions.

Signed-off-by: Gautham R. Shenoy <ego@linux.vnet.ibm.com>
---
 drivers/cpufreq/powernv-cpufreq.c | 47 ++++++++++++++++-----------------------
 1 file changed, 19 insertions(+), 28 deletions(-)

diff --git a/drivers/cpufreq/powernv-cpufreq.c b/drivers/cpufreq/powernv-cpufreq.c
index 8e3dbca..8a4e2ce 100644
--- a/drivers/cpufreq/powernv-cpufreq.c
+++ b/drivers/cpufreq/powernv-cpufreq.c
@@ -110,12 +110,11 @@ struct global_pstate_info {
  *	    hashtable
  */
 struct pstate_idx_revmap_data {
-	int pstate_id;
+	u8 pstate_id;
 	unsigned int cpufreq_table_idx;
 	struct hlist_node hentry;
 };
 
-u32 pstate_sign_prefix;
 static bool rebooting, throttled, occ_reset;
 
 static const char * const throttle_reason[] = {
@@ -170,14 +169,9 @@ enum throttle_reason_type {
 	bool wof_enabled;
 } powernv_pstate_info;
 
-static inline int extract_pstate(u64 pmsr_val, unsigned int shift)
+static inline u8 extract_pstate(u64 pmsr_val, unsigned int shift)
 {
-	int ret = ((pmsr_val >> shift) & 0xFF);
-
-	if (!ret)
-		return ret;
-
-	return (pstate_sign_prefix | ret);
+	return ((pmsr_val >> shift) & 0xFF);
 }
 
 #define extract_local_pstate(x) extract_pstate(x, LPSTATE_SHIFT)
@@ -194,7 +188,7 @@ static inline int extract_pstate(u64 pmsr_val, unsigned int shift)
  *		   If @i is out of bound, this will return the pstate
  *		   corresponding to the nominal frequency.
  */
-static inline int idx_to_pstate(unsigned int i)
+static inline u8 idx_to_pstate(unsigned int i)
 {
 	if (unlikely(i >= powernv_pstate_info.nr_pstates)) {
 		pr_warn_once("idx_to_pstate: index %u is out of bound\n", i);
@@ -213,7 +207,7 @@ static inline int idx_to_pstate(unsigned int i)
  *		   this will return the index of the nominal
  *		   frequency.
  */
-static unsigned int pstate_to_idx(int pstate)
+static unsigned int pstate_to_idx(u8 pstate)
 {
 	unsigned int key = pstate % POWERNV_MAX_PSTATES;
 	struct pstate_idx_revmap_data *revmap_data;
@@ -223,7 +217,7 @@ static unsigned int pstate_to_idx(int pstate)
 			return revmap_data->cpufreq_table_idx;
 	}
 
-	pr_warn_once("pstate_to_idx: pstate %d not found\n", pstate);
+	pr_warn_once("pstate_to_idx: pstate 0x%x not found\n", pstate);
 	return powernv_pstate_info.nominal;
 }
 
@@ -291,7 +285,7 @@ static int init_powernv_pstates(void)
 		powernv_pstate_info.wof_enabled = true;
 
 next:
-	pr_info("cpufreq pstate min %d nominal %d max %d\n", pstate_min,
+	pr_info("cpufreq pstate min 0x%x nominal 0x%x max 0x%x\n", pstate_min,
 		pstate_nominal, pstate_max);
 	pr_info("Workload Optimized Frequency is %s in the platform\n",
 		(powernv_pstate_info.wof_enabled) ? "enabled" : "disabled");
@@ -323,8 +317,6 @@ static int init_powernv_pstates(void)
 	powernv_pstate_info.nr_pstates = nr_pstates;
 	pr_debug("NR PStates %d\n", nr_pstates);
 
-	pstate_sign_prefix = pstate_min & ~0xFF;
-
 	for (i = 0; i < nr_pstates; i++) {
 		u32 id = be32_to_cpu(pstate_ids[i]);
 		u32 freq = be32_to_cpu(pstate_freqs[i]);
@@ -333,14 +325,14 @@ static int init_powernv_pstates(void)
 
 		pr_debug("PState id %d freq %d MHz\n", id, freq);
 		powernv_freqs[i].frequency = freq * 1000; /* kHz */
-		powernv_freqs[i].driver_data = id;
+		powernv_freqs[i].driver_data = id & 0xFF;
 
 		revmap_data = (struct pstate_idx_revmap_data *)
 			      kmalloc(sizeof(*revmap_data), GFP_KERNEL);
 
-		revmap_data->pstate_id = id;
+		revmap_data->pstate_id = id & 0xFF;
 		revmap_data->cpufreq_table_idx = i;
-		key = id % POWERNV_MAX_PSTATES;
+		key = (revmap_data->pstate_id) % POWERNV_MAX_PSTATES;
 		hash_add(pstate_revmap, &revmap_data->hentry, key);
 
 		if (id == pstate_max)
@@ -364,14 +356,13 @@ static int init_powernv_pstates(void)
 }
 
 /* Returns the CPU frequency corresponding to the pstate_id. */
-static unsigned int pstate_id_to_freq(int pstate_id)
+static unsigned int pstate_id_to_freq(u8 pstate_id)
 {
 	int i;
 
 	i = pstate_to_idx(pstate_id);
 	if (i >= powernv_pstate_info.nr_pstates || i < 0) {
-		pr_warn("PState id %d outside of PState table, "
-			"reporting nominal id %d instead\n",
+		pr_warn("PState id 0x%x outside of PState table, reporting nominal id 0x%x instead\n",
 			pstate_id, idx_to_pstate(powernv_pstate_info.nominal));
 		i = powernv_pstate_info.nominal;
 	}
@@ -477,8 +468,8 @@ static inline void set_pmspr(unsigned long sprn, unsigned long val)
  */
 struct powernv_smp_call_data {
 	unsigned int freq;
-	int pstate_id;
-	int gpstate_id;
+	u8 pstate_id;
+	u8 gpstate_id;
 };
 
 /*
@@ -501,9 +492,9 @@ static void powernv_read_cpu_freq(void *arg)
 	freq_data->pstate_id = extract_local_pstate(pmspr_val);
 	freq_data->freq = pstate_id_to_freq(freq_data->pstate_id);
 
-	pr_debug("cpu %d pmsr %016lX pstate_id %d frequency %d kHz\n",
-		raw_smp_processor_id(), pmspr_val, freq_data->pstate_id,
-		freq_data->freq);
+	pr_debug("cpu %d pmsr %016lX pstate_id 0x%x frequency %d kHz\n",
+		 raw_smp_processor_id(), pmspr_val, freq_data->pstate_id,
+		 freq_data->freq);
 }
 
 /*
@@ -565,7 +556,7 @@ static void powernv_cpufreq_throttle_check(void *data)
 	struct chip *chip;
 	unsigned int cpu = smp_processor_id();
 	unsigned long pmsr;
-	int pmsr_pmax;
+	u8 pmsr_pmax;
 	unsigned int pmsr_pmax_idx;
 
 	pmsr = get_pmspr(SPRN_PMSR);
@@ -579,7 +570,7 @@ static void powernv_cpufreq_throttle_check(void *data)
 			goto next;
 		chip->throttled = true;
 		if (pmsr_pmax_idx > powernv_pstate_info.nominal) {
-			pr_warn_once("CPU %d on Chip %u has Pmax(%d) reduced below nominal frequency(%d)\n",
+			pr_warn_once("CPU %d on Chip %u has Pmax(0x%x) reduced below that of nominal frequency(0x%x)\n",
 				     cpu, chip->id, pmsr_pmax,
 				     idx_to_pstate(powernv_pstate_info.nominal));
 			chip->throttle_sub_turbo++;
-- 
1.9.4

^ permalink raw reply related	[flat|nested] 15+ messages in thread

* Re: [v3 PATCH 1/3] powernv-cpufreq: Add helper to extract pstate from PMSR
  2017-12-13  6:57 ` [v3 PATCH 1/3] powernv-cpufreq: Add helper to extract pstate from PMSR Gautham R. Shenoy
@ 2017-12-17  3:04   ` Balbir Singh
  2017-12-18  9:03     ` Gautham R Shenoy
  0 siblings, 1 reply; 15+ messages in thread
From: Balbir Singh @ 2017-12-17  3:04 UTC (permalink / raw)
  To: Gautham R. Shenoy
  Cc: Shilpasri G Bhat, Viresh Kumar, Rafael J. Wysocki, huntbag,
	Akshay Adiga, Michael Ellerman, Vaidyanathan Srinivasan,
	linux-pm, linux-kernel,
	open list:LINUX FOR POWERPC (32-BIT AND 64-BIT)

On Wed, Dec 13, 2017 at 5:57 PM, Gautham R. Shenoy
<ego@linux.vnet.ibm.com> wrote:
> From: "Gautham R. Shenoy" <ego@linux.vnet.ibm.com>
>
> On POWERNV platform, the fields for pstates in the Power Management
> Status Register (PMSR) and the Power Management Control Register
> (PMCR) are 8-bits wide. On POWER8 the pstates are negatively numbered
> while on POWER9 they are positively numbered.
>
> The device-tree exports pstates as 32-bit entries. The device-tree
> implementation sign-extends the 8-bit pstate values to obtain the
> corresponding 32-bit entry.
>
> Eg: On POWER8, a pstate value 0x82 [-126] is represented in the
> device-tree as 0xfffffff82 while on POWER9, the same value 0x82 [130]
> is represented in the device-tree as 0x00000082.
>
> The powernv-cpufreq driver implementation represents pstates using the
> integer type. In multiple places in the driver, the code interprets
> the pstates extracted from the PMSR as a signed byte and assigns it to
> a integer variable to get the sign-extention.
>
> On POWER9 platforms which have greater than 128 pstates, this results
> in the driver performing incorrect sign-extention, and thereby
> treating a legitimate pstate (say 130) as an invalid pstates (since it
> is interpreted as -126).
>
> This patch fixes the issue by implementing a helper function to
> extract Pstates from PMSR register, and correctly sign-extend it to be
> consistent with the values provided by the device-tree.
>
> Signed-off-by: Gautham R. Shenoy <ego@linux.vnet.ibm.com>
> ---

This looks better

Acked-by: Balbir Singh <bsingharora@gmail.com>

Balbir Singh

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [v3 PATCH 2/3] powernv-cpufreq: Fix pstate_to_idx() to handle non-continguous pstates
  2017-12-13  6:57 ` [v3 PATCH 2/3] powernv-cpufreq: Fix pstate_to_idx() to handle non-continguous pstates Gautham R. Shenoy
@ 2017-12-17  3:15   ` Balbir Singh
  2017-12-18  8:38     ` Gautham R Shenoy
  0 siblings, 1 reply; 15+ messages in thread
From: Balbir Singh @ 2017-12-17  3:15 UTC (permalink / raw)
  To: Gautham R. Shenoy
  Cc: Shilpasri G Bhat, Viresh Kumar, Rafael J. Wysocki, huntbag,
	Akshay Adiga, Michael Ellerman, Vaidyanathan Srinivasan,
	linux-pm, linux-kernel,
	open list:LINUX FOR POWERPC (32-BIT AND 64-BIT)

On Wed, Dec 13, 2017 at 5:57 PM, Gautham R. Shenoy
<ego@linux.vnet.ibm.com> wrote:
> From: "Gautham R. Shenoy" <ego@linux.vnet.ibm.com>
>
> The code in powernv-cpufreq, makes the following two assumptions which
> are not guaranteed by the device-tree bindings:
>
>     1) Pstate ids are continguous: This is used in pstate_to_idx() to
>        obtain the reverse map from a pstate to it's corresponding
>        entry into the cpufreq frequency table.
>
>     2) Every Pstate should always lie between the max and the min
>        pstates that are explicitly reported in the device tree: This
>        is used to determine whether a pstate reported by the PMSR is
>        out of bounds.
>
> Both these assumptions are unwarranted and can change on future
> platforms.

While this is a good thing, I wonder if it is worth the complexity. Pstates
are contiguous because they define transitions in incremental value
of change in frequency and I can't see how this can be broken in the
future?

Balbir Singh.

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [v3 PATCH 3/3] powernv-cpufreq: Treat pstates as opaque 8-bit values
  2017-12-13  6:57 ` [v3 PATCH 3/3] powernv-cpufreq: Treat pstates as opaque 8-bit values Gautham R. Shenoy
@ 2017-12-17  3:17   ` Balbir Singh
  2017-12-18  8:43     ` Gautham R Shenoy
  0 siblings, 1 reply; 15+ messages in thread
From: Balbir Singh @ 2017-12-17  3:17 UTC (permalink / raw)
  To: Gautham R. Shenoy
  Cc: Shilpasri G Bhat, Viresh Kumar, Rafael J. Wysocki, huntbag,
	Akshay Adiga, Michael Ellerman, Vaidyanathan Srinivasan,
	linux-pm, linux-kernel,
	open list:LINUX FOR POWERPC (32-BIT AND 64-BIT)

On Wed, Dec 13, 2017 at 5:57 PM, Gautham R. Shenoy
<ego@linux.vnet.ibm.com> wrote:
> From: "Gautham R. Shenoy" <ego@linux.vnet.ibm.com>
>
> On POWER8 and POWER9, the PMSR and the PMCR registers define pstates
> to be 8-bit wide values. The device-tree exports pstates as 32-bit
> wide values of which the lower byte is the actual pstate.
>
> The current implementation in the kernel treats pstates as integer
> type, since it used to use the sign of the pstate for performing some
> boundary-checks. This is no longer required after the patch
> "powernv-cpufreq: Fix pstate_to_idx() to handle non-continguous
> pstates".
>
> So, in this patch, we modify the powernv-cpufreq driver to uniformly
> treat pstates as opaque 8-bit values obtained from the device-tree or
> the PMCR. This simplifies the extract_pstate() helper function since
> we no longer no longer require to worry about the sign-extentions.
>
> Signed-off-by: Gautham R. Shenoy <ego@linux.vnet.ibm.com>
> ---
>  drivers/cpufreq/powernv-cpufreq.c | 47 ++++++++++++++++-----------------------
>  1 file changed, 19 insertions(+), 28 deletions(-)
>
> diff --git a/drivers/cpufreq/powernv-cpufreq.c b/drivers/cpufreq/powernv-cpufreq.c
> index 8e3dbca..8a4e2ce 100644
> --- a/drivers/cpufreq/powernv-cpufreq.c
> +++ b/drivers/cpufreq/powernv-cpufreq.c
> @@ -110,12 +110,11 @@ struct global_pstate_info {
>   *         hashtable
>   */
>  struct pstate_idx_revmap_data {
> -       int pstate_id;
> +       u8 pstate_id;
>         unsigned int cpufreq_table_idx;
>         struct hlist_node hentry;
>  };
>
> -u32 pstate_sign_prefix;
>  static bool rebooting, throttled, occ_reset;
>
>  static const char * const throttle_reason[] = {
> @@ -170,14 +169,9 @@ enum throttle_reason_type {
>         bool wof_enabled;
>  } powernv_pstate_info;
>
> -static inline int extract_pstate(u64 pmsr_val, unsigned int shift)
> +static inline u8 extract_pstate(u64 pmsr_val, unsigned int shift)
>  {
> -       int ret = ((pmsr_val >> shift) & 0xFF);
> -
> -       if (!ret)
> -               return ret;
> -
> -       return (pstate_sign_prefix | ret);
> +       return ((pmsr_val >> shift) & 0xFF);
>  }

So we just added this and moved from an int to u8. I was going to ask
if we still need an int in patch1, but I thought the driver dealt with
just integers because of the larger framework.

Balbir Singh.

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [v3 PATCH 2/3] powernv-cpufreq: Fix pstate_to_idx() to handle non-continguous pstates
  2017-12-17  3:15   ` Balbir Singh
@ 2017-12-18  8:38     ` Gautham R Shenoy
  2018-01-03 12:07       ` Rafael J. Wysocki
  0 siblings, 1 reply; 15+ messages in thread
From: Gautham R Shenoy @ 2017-12-18  8:38 UTC (permalink / raw)
  To: Balbir Singh
  Cc: Gautham R. Shenoy, Shilpasri G Bhat, Viresh Kumar,
	Rafael J. Wysocki, huntbag, Akshay Adiga, Michael Ellerman,
	Vaidyanathan Srinivasan, linux-pm, linux-kernel,
	open list:LINUX FOR POWERPC (32-BIT AND 64-BIT)

Hi Balbir,

On Sun, Dec 17, 2017 at 02:15:25PM +1100, Balbir Singh wrote:
> On Wed, Dec 13, 2017 at 5:57 PM, Gautham R. Shenoy
> <ego@linux.vnet.ibm.com> wrote:
> > From: "Gautham R. Shenoy" <ego@linux.vnet.ibm.com>
> >
> > The code in powernv-cpufreq, makes the following two assumptions which
> > are not guaranteed by the device-tree bindings:
> >
> >     1) Pstate ids are continguous: This is used in pstate_to_idx() to
> >        obtain the reverse map from a pstate to it's corresponding
> >        entry into the cpufreq frequency table.
> >
> >     2) Every Pstate should always lie between the max and the min
> >        pstates that are explicitly reported in the device tree: This
> >        is used to determine whether a pstate reported by the PMSR is
> >        out of bounds.
> >
> > Both these assumptions are unwarranted and can change on future
> > platforms.
> 
> While this is a good thing, I wonder if it is worth the complexity. Pstates
> are contiguous because they define transitions in incremental value
> of change in frequency and I can't see how this can be broken in the
> future?

In the future, we can have the OPAL firmware give us a smaller set of
pstates instead of expose every one of them. As it stands today, for
most of the workloads, we will need at best 20-30 pstates and not
beyond that.


> 
> Balbir Singh.
> 

--
Thanks and Regards
gautham.

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [v3 PATCH 3/3] powernv-cpufreq: Treat pstates as opaque 8-bit values
  2017-12-17  3:17   ` Balbir Singh
@ 2017-12-18  8:43     ` Gautham R Shenoy
  0 siblings, 0 replies; 15+ messages in thread
From: Gautham R Shenoy @ 2017-12-18  8:43 UTC (permalink / raw)
  To: Balbir Singh
  Cc: Gautham R. Shenoy, Shilpasri G Bhat, Viresh Kumar,
	Rafael J. Wysocki, huntbag, Akshay Adiga, Michael Ellerman,
	Vaidyanathan Srinivasan, linux-pm, linux-kernel,
	open list:LINUX FOR POWERPC (32-BIT AND 64-BIT)

Hi Balbir,

On Sun, Dec 17, 2017 at 02:17:02PM +1100, Balbir Singh wrote:
> On Wed, Dec 13, 2017 at 5:57 PM, Gautham R. Shenoy
[..snip..]
> >
> > -static inline int extract_pstate(u64 pmsr_val, unsigned int shift)
> > +static inline u8 extract_pstate(u64 pmsr_val, unsigned int shift)
> >  {
> > -       int ret = ((pmsr_val >> shift) & 0xFF);
> > -
> > -       if (!ret)
> > -               return ret;
> > -
> > -       return (pstate_sign_prefix | ret);
> > +       return ((pmsr_val >> shift) & 0xFF);
> >  }
> 
> So we just added this and moved from an int to u8. I was going to ask
> if we still need an int in patch1, but I thought the driver dealt with
> just integers because of the larger framework.

The larger framework is with respect to the device tree which defines
pstates as 32-bit integers (I am not aware of the reasons for this
choice, but perhaps device-tree doesn't have a s8/u8 type?!). The
driver still knows that pstates are only 8-bits wide because of the
PMSR,PMCR definitions. Before this patch, the driver was still doing
the conversions from 8-bit to int and vice-versa every time. With this
patch, we do it only once, i.e at the initialization. After that we
treat pstates as 8-bit integers. 


> 
> Balbir Singh.
> 

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [v3 PATCH 1/3] powernv-cpufreq: Add helper to extract pstate from PMSR
  2017-12-17  3:04   ` Balbir Singh
@ 2017-12-18  9:03     ` Gautham R Shenoy
  0 siblings, 0 replies; 15+ messages in thread
From: Gautham R Shenoy @ 2017-12-18  9:03 UTC (permalink / raw)
  To: Balbir Singh
  Cc: Gautham R. Shenoy, Shilpasri G Bhat, Viresh Kumar,
	Rafael J. Wysocki, huntbag, Akshay Adiga, Michael Ellerman,
	Vaidyanathan Srinivasan, linux-pm, linux-kernel,
	open list:LINUX FOR POWERPC (32-BIT AND 64-BIT)

Hi Balbir,

On Sun, Dec 17, 2017 at 02:04:03PM +1100, Balbir Singh wrote:
> On Wed, Dec 13, 2017 at 5:57 PM, Gautham R. Shenoy
> <ego@linux.vnet.ibm.com> wrote:
> > From: "Gautham R. Shenoy" <ego@linux.vnet.ibm.com>
> >
> > On POWERNV platform, the fields for pstates in the Power Management
> > Status Register (PMSR) and the Power Management Control Register
> > (PMCR) are 8-bits wide. On POWER8 the pstates are negatively numbered
> > while on POWER9 they are positively numbered.
> >
> > The device-tree exports pstates as 32-bit entries. The device-tree
> > implementation sign-extends the 8-bit pstate values to obtain the
> > corresponding 32-bit entry.
> >
> > Eg: On POWER8, a pstate value 0x82 [-126] is represented in the
> > device-tree as 0xfffffff82 while on POWER9, the same value 0x82 [130]
> > is represented in the device-tree as 0x00000082.
> >
> > The powernv-cpufreq driver implementation represents pstates using the
> > integer type. In multiple places in the driver, the code interprets
> > the pstates extracted from the PMSR as a signed byte and assigns it to
> > a integer variable to get the sign-extention.
> >
> > On POWER9 platforms which have greater than 128 pstates, this results
> > in the driver performing incorrect sign-extention, and thereby
> > treating a legitimate pstate (say 130) as an invalid pstates (since it
> > is interpreted as -126).
> >
> > This patch fixes the issue by implementing a helper function to
> > extract Pstates from PMSR register, and correctly sign-extend it to be
> > consistent with the values provided by the device-tree.
> >
> > Signed-off-by: Gautham R. Shenoy <ego@linux.vnet.ibm.com>
> > ---
> 
> This looks better

Thanks for the Review.

> 
> Acked-by: Balbir Singh <bsingharora@gmail.com>
> 
> Balbir Singh
> 

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [v3 PATCH 2/3] powernv-cpufreq: Fix pstate_to_idx() to handle non-continguous pstates
  2017-12-18  8:38     ` Gautham R Shenoy
@ 2018-01-03 12:07       ` Rafael J. Wysocki
  2018-01-03 12:47         ` Balbir Singh
  0 siblings, 1 reply; 15+ messages in thread
From: Rafael J. Wysocki @ 2018-01-03 12:07 UTC (permalink / raw)
  To: ego, Balbir Singh
  Cc: Shilpasri G Bhat, Viresh Kumar, huntbag, Akshay Adiga,
	Michael Ellerman, Vaidyanathan Srinivasan, linux-pm,
	linux-kernel, open list:LINUX FOR POWERPC (32-BIT AND 64-BIT)

On Monday, December 18, 2017 9:38:20 AM CET Gautham R Shenoy wrote:
> Hi Balbir,
> 
> On Sun, Dec 17, 2017 at 02:15:25PM +1100, Balbir Singh wrote:
> > On Wed, Dec 13, 2017 at 5:57 PM, Gautham R. Shenoy
> > <ego@linux.vnet.ibm.com> wrote:
> > > From: "Gautham R. Shenoy" <ego@linux.vnet.ibm.com>
> > >
> > > The code in powernv-cpufreq, makes the following two assumptions which
> > > are not guaranteed by the device-tree bindings:
> > >
> > >     1) Pstate ids are continguous: This is used in pstate_to_idx() to
> > >        obtain the reverse map from a pstate to it's corresponding
> > >        entry into the cpufreq frequency table.
> > >
> > >     2) Every Pstate should always lie between the max and the min
> > >        pstates that are explicitly reported in the device tree: This
> > >        is used to determine whether a pstate reported by the PMSR is
> > >        out of bounds.
> > >
> > > Both these assumptions are unwarranted and can change on future
> > > platforms.
> > 
> > While this is a good thing, I wonder if it is worth the complexity. Pstates
> > are contiguous because they define transitions in incremental value
> > of change in frequency and I can't see how this can be broken in the
> > future?
> 
> In the future, we can have the OPAL firmware give us a smaller set of
> pstates instead of expose every one of them. As it stands today, for
> most of the workloads, we will need at best 20-30 pstates and not
> beyond that.

I'm not sure about the status here.

Is this good to go as is or is it going to be updated?

Thanks,
Rafael

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [v3 PATCH 2/3] powernv-cpufreq: Fix pstate_to_idx() to handle non-continguous pstates
  2018-01-03 12:07       ` Rafael J. Wysocki
@ 2018-01-03 12:47         ` Balbir Singh
  2018-01-10  8:55           ` Gautham R Shenoy
  0 siblings, 1 reply; 15+ messages in thread
From: Balbir Singh @ 2018-01-03 12:47 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: Gautham R Shenoy, Shilpasri G Bhat, Viresh Kumar, Abhishek,
	Akshay Adiga, Michael Ellerman, Vaidyanathan Srinivasan,
	linux-pm, linux-kernel,
	open list:LINUX FOR POWERPC (32-BIT AND 64-BIT)

On Wed, Jan 3, 2018 at 11:07 PM, Rafael J. Wysocki <rjw@rjwysocki.net> wrote:
> On Monday, December 18, 2017 9:38:20 AM CET Gautham R Shenoy wrote:
>> Hi Balbir,
>>
>> On Sun, Dec 17, 2017 at 02:15:25PM +1100, Balbir Singh wrote:
>> > On Wed, Dec 13, 2017 at 5:57 PM, Gautham R. Shenoy
>> > <ego@linux.vnet.ibm.com> wrote:
>> > > From: "Gautham R. Shenoy" <ego@linux.vnet.ibm.com>
>> > >
>> > > The code in powernv-cpufreq, makes the following two assumptions which
>> > > are not guaranteed by the device-tree bindings:
>> > >
>> > >     1) Pstate ids are continguous: This is used in pstate_to_idx() to
>> > >        obtain the reverse map from a pstate to it's corresponding
>> > >        entry into the cpufreq frequency table.
>> > >
>> > >     2) Every Pstate should always lie between the max and the min
>> > >        pstates that are explicitly reported in the device tree: This
>> > >        is used to determine whether a pstate reported by the PMSR is
>> > >        out of bounds.
>> > >
>> > > Both these assumptions are unwarranted and can change on future
>> > > platforms.
>> >
>> > While this is a good thing, I wonder if it is worth the complexity. Pstates
>> > are contiguous because they define transitions in incremental value
>> > of change in frequency and I can't see how this can be broken in the
>> > future?
>>
>> In the future, we can have the OPAL firmware give us a smaller set of
>> pstates instead of expose every one of them. As it stands today, for
>> most of the workloads, we will need at best 20-30 pstates and not
>> beyond that.
>
> I'm not sure about the status here.
>
> Is this good to go as is or is it going to be updated?
>

I have no major objections, except some of the added complexity, but
Gautham makes a point that this is refactoring for the future

Balbir Singh.

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [v3 PATCH 2/3] powernv-cpufreq: Fix pstate_to_idx() to handle non-continguous pstates
  2018-01-03 12:47         ` Balbir Singh
@ 2018-01-10  8:55           ` Gautham R Shenoy
  2018-01-10 12:00             ` Rafael J. Wysocki
  0 siblings, 1 reply; 15+ messages in thread
From: Gautham R Shenoy @ 2018-01-10  8:55 UTC (permalink / raw)
  To: Balbir Singh
  Cc: Rafael J. Wysocki, Gautham R Shenoy, Shilpasri G Bhat,
	Viresh Kumar, Abhishek, Akshay Adiga, Michael Ellerman,
	Vaidyanathan Srinivasan, linux-pm, linux-kernel,
	open list:LINUX FOR POWERPC (32-BIT AND 64-BIT)

Hi Rafael,

On Wed, Jan 03, 2018 at 11:47:58PM +1100, Balbir Singh wrote:
> On Wed, Jan 3, 2018 at 11:07 PM, Rafael J. Wysocki <rjw@rjwysocki.net> wrote:
> > On Monday, December 18, 2017 9:38:20 AM CET Gautham R Shenoy wrote:
> >> Hi Balbir,
> >>
> >> On Sun, Dec 17, 2017 at 02:15:25PM +1100, Balbir Singh wrote:
> >> > On Wed, Dec 13, 2017 at 5:57 PM, Gautham R. Shenoy
> >> > <ego@linux.vnet.ibm.com> wrote:
> >> > > From: "Gautham R. Shenoy" <ego@linux.vnet.ibm.com>
> >> > >
> >> > > The code in powernv-cpufreq, makes the following two assumptions which
> >> > > are not guaranteed by the device-tree bindings:
> >> > >
> >> > >     1) Pstate ids are continguous: This is used in pstate_to_idx() to
> >> > >        obtain the reverse map from a pstate to it's corresponding
> >> > >        entry into the cpufreq frequency table.
> >> > >
> >> > >     2) Every Pstate should always lie between the max and the min
> >> > >        pstates that are explicitly reported in the device tree: This
> >> > >        is used to determine whether a pstate reported by the PMSR is
> >> > >        out of bounds.
> >> > >
> >> > > Both these assumptions are unwarranted and can change on future
> >> > > platforms.
> >> >
> >> > While this is a good thing, I wonder if it is worth the complexity. Pstates
> >> > are contiguous because they define transitions in incremental value
> >> > of change in frequency and I can't see how this can be broken in the
> >> > future?
> >>
> >> In the future, we can have the OPAL firmware give us a smaller set of
> >> pstates instead of expose every one of them. As it stands today, for
> >> most of the workloads, we will need at best 20-30 pstates and not
> >> beyond that.
> >
> > I'm not sure about the status here.
> >
> > Is this good to go as is or is it going to be updated?
> >
> 
> I have no major objections, except some of the added complexity, but
> Gautham makes a point that this is refactoring for the future

I have tested this across POWER8 and POWER9. The additional complexity
introduced by the second patch is required for the future when we are
going to reduce the number of pstates.


> 
> Balbir Singh.
> 
--
Thanks and Regards
gautham.

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [v3 PATCH 0/3] powernv-cpufreq: Multiple pstate related fixes.
  2017-12-13  6:57 [v3 PATCH 0/3] powernv-cpufreq: Multiple pstate related fixes Gautham R. Shenoy
                   ` (2 preceding siblings ...)
  2017-12-13  6:57 ` [v3 PATCH 3/3] powernv-cpufreq: Treat pstates as opaque 8-bit values Gautham R. Shenoy
@ 2018-01-10  9:43 ` Viresh Kumar
  3 siblings, 0 replies; 15+ messages in thread
From: Viresh Kumar @ 2018-01-10  9:43 UTC (permalink / raw)
  To: Gautham R. Shenoy
  Cc: Shilpasri G Bhat, rjw, huntbag, akshay.adiga, Michael Ellerman,
	Vaidyanathan Srinivasan, Balbir Singh, linux-pm, linux-kernel,
	linuxppc-dev

On 13-12-17, 12:27, Gautham R. Shenoy wrote:
> From: "Gautham R. Shenoy" <ego@linux.vnet.ibm.com>
> 
>  This is a third version of the patch to fix pstate related issues in
>  the powernv-cpufreq driver.

Acked-by: Viresh Kumar <viresh.kumar@linaro.org>

-- 
viresh

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [v3 PATCH 2/3] powernv-cpufreq: Fix pstate_to_idx() to handle non-continguous pstates
  2018-01-10  8:55           ` Gautham R Shenoy
@ 2018-01-10 12:00             ` Rafael J. Wysocki
  0 siblings, 0 replies; 15+ messages in thread
From: Rafael J. Wysocki @ 2018-01-10 12:00 UTC (permalink / raw)
  To: ego
  Cc: Balbir Singh, Shilpasri G Bhat, Viresh Kumar, Abhishek,
	Akshay Adiga, Michael Ellerman, Vaidyanathan Srinivasan,
	linux-pm, linux-kernel,
	open list:LINUX FOR POWERPC (32-BIT AND 64-BIT)

On Wednesday, January 10, 2018 9:55:45 AM CET Gautham R Shenoy wrote:
> Hi Rafael,
> 
> On Wed, Jan 03, 2018 at 11:47:58PM +1100, Balbir Singh wrote:
> > On Wed, Jan 3, 2018 at 11:07 PM, Rafael J. Wysocki <rjw@rjwysocki.net> wrote:
> > > On Monday, December 18, 2017 9:38:20 AM CET Gautham R Shenoy wrote:
> > >> Hi Balbir,
> > >>
> > >> On Sun, Dec 17, 2017 at 02:15:25PM +1100, Balbir Singh wrote:
> > >> > On Wed, Dec 13, 2017 at 5:57 PM, Gautham R. Shenoy
> > >> > <ego@linux.vnet.ibm.com> wrote:
> > >> > > From: "Gautham R. Shenoy" <ego@linux.vnet.ibm.com>
> > >> > >
> > >> > > The code in powernv-cpufreq, makes the following two assumptions which
> > >> > > are not guaranteed by the device-tree bindings:
> > >> > >
> > >> > >     1) Pstate ids are continguous: This is used in pstate_to_idx() to
> > >> > >        obtain the reverse map from a pstate to it's corresponding
> > >> > >        entry into the cpufreq frequency table.
> > >> > >
> > >> > >     2) Every Pstate should always lie between the max and the min
> > >> > >        pstates that are explicitly reported in the device tree: This
> > >> > >        is used to determine whether a pstate reported by the PMSR is
> > >> > >        out of bounds.
> > >> > >
> > >> > > Both these assumptions are unwarranted and can change on future
> > >> > > platforms.
> > >> >
> > >> > While this is a good thing, I wonder if it is worth the complexity. Pstates
> > >> > are contiguous because they define transitions in incremental value
> > >> > of change in frequency and I can't see how this can be broken in the
> > >> > future?
> > >>
> > >> In the future, we can have the OPAL firmware give us a smaller set of
> > >> pstates instead of expose every one of them. As it stands today, for
> > >> most of the workloads, we will need at best 20-30 pstates and not
> > >> beyond that.
> > >
> > > I'm not sure about the status here.
> > >
> > > Is this good to go as is or is it going to be updated?
> > >
> > 
> > I have no major objections, except some of the added complexity, but
> > Gautham makes a point that this is refactoring for the future
> 
> I have tested this across POWER8 and POWER9. The additional complexity
> introduced by the second patch is required for the future when we are
> going to reduce the number of pstates.

I have applied the series, thanks!

^ permalink raw reply	[flat|nested] 15+ messages in thread

end of thread, other threads:[~2018-01-10 12:01 UTC | newest]

Thread overview: 15+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2017-12-13  6:57 [v3 PATCH 0/3] powernv-cpufreq: Multiple pstate related fixes Gautham R. Shenoy
2017-12-13  6:57 ` [v3 PATCH 1/3] powernv-cpufreq: Add helper to extract pstate from PMSR Gautham R. Shenoy
2017-12-17  3:04   ` Balbir Singh
2017-12-18  9:03     ` Gautham R Shenoy
2017-12-13  6:57 ` [v3 PATCH 2/3] powernv-cpufreq: Fix pstate_to_idx() to handle non-continguous pstates Gautham R. Shenoy
2017-12-17  3:15   ` Balbir Singh
2017-12-18  8:38     ` Gautham R Shenoy
2018-01-03 12:07       ` Rafael J. Wysocki
2018-01-03 12:47         ` Balbir Singh
2018-01-10  8:55           ` Gautham R Shenoy
2018-01-10 12:00             ` Rafael J. Wysocki
2017-12-13  6:57 ` [v3 PATCH 3/3] powernv-cpufreq: Treat pstates as opaque 8-bit values Gautham R. Shenoy
2017-12-17  3:17   ` Balbir Singh
2017-12-18  8:43     ` Gautham R Shenoy
2018-01-10  9:43 ` [v3 PATCH 0/3] powernv-cpufreq: Multiple pstate related fixes Viresh Kumar

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).