All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH v7 0/6] cpufreq: powernv: Redesign the presentation of throttle notification and solve bug-fixes in the driver
@ 2016-01-28  7:25 Shilpasri G Bhat
  2016-01-28  7:25 ` [PATCH v7 1/6] cpufreq: powernv: Free 'chips' on module exit Shilpasri G Bhat
                   ` (7 more replies)
  0 siblings, 8 replies; 25+ messages in thread
From: Shilpasri G Bhat @ 2016-01-28  7:25 UTC (permalink / raw)
  To: linuxppc-dev, linux-kernel
  Cc: rjw, viresh.kumar, linux-pm, pc, anton, ego, shreyas,
	bsingharora, mpe, Shilpasri G Bhat

In POWER8, OCC(On-Chip-Controller) can throttle the frequency of the
CPU when the chip crosses its thermal and power limits. Currently,
powernv-cpufreq driver detects and reports this event as a console
message. Some machines may not sustain the max turbo frequency in all
conditions and can be throttled frequently. This can lead to the
flooding of console with throttle messages. So this patchset aims to
redesign the presentation of this event via sysfs counters and
tracepoints. And it also fixes couple of bugs reported in the driver.

- Patch [1] fixes a memory leak bug
- Patch [2] fixes the cpu hot-plug bug in powernv_cpufreq_work_fn().
- Patch [3] solves a bug in powernv_cpufreq_throttle_check(), which
  calls in to cpu_to_chip_id() in hot path which reads DT every time
  to find the chip id.
- Patches [4] to [6] will add a perf trace point
  "power:powernv_throttle" and sysfs throttle counter stats in
  /sys/devices/system/cpu/cpufreq/chipN.

Changes from v6:
- Changes wrt comments from Balbir Singh and Viresh Kumar.

Changes from v5:
- Fix kbuild error:
drivers/cpufreq/powernv-cpufreq.c:428:2: error: implicit declaration of
function 'get_online_cpus' [-Werror=implicit-function-declaration]

Changes from v4:
- Fix a hot-plug bug in powernv_cpufreq_work_fn()
- Changes wrt Gautham's and Shreyas's comments 

Changes from v3:
- Add a fix to replace cpu_to_chip_id() with simpler PIR shift to 
  obtain the chip id.
- Break patch2 in to two patches separating the tracepoint and sysfs
  attribute changes.

Changes from v2:
- Fixed kbuild test warning.
drivers/cpufreq/powernv-cpufreq.c:609:2: warning: ignoring return
value of 'kstrtoint', declared with attribute warn_unused_result
[-Wunused-result]

Shilpasri G Bhat (6):
  cpufre: powernv: Free 'chips' on module exit
  cpufreq: powernv: Hot-plug safe the kworker thread
  cpufreq: powernv: Remove cpu_to_chip_id() from hot-path
  cpufreq: powernv/tracing: Add powernv_throttle tracepoint
  cpufreq: powernv: Replace pr_info with trace print for throttle event
  cpufreq: powernv: Add sysfs attributes to show throttle stats

 Documentation/ABI/testing/sysfs-devices-system-cpu |  45 +++
 drivers/cpufreq/powernv-cpufreq.c                  | 313 +++++++++++++++++----
 include/trace/events/power.h                       |  22 ++
 kernel/trace/power-traces.c                        |   1 +
 4 files changed, 326 insertions(+), 55 deletions(-)

-- 
1.9.3

^ permalink raw reply	[flat|nested] 25+ messages in thread

* [PATCH v7 1/6] cpufreq: powernv: Free 'chips' on module exit
  2016-01-28  7:25 [PATCH v7 0/6] cpufreq: powernv: Redesign the presentation of throttle notification and solve bug-fixes in the driver Shilpasri G Bhat
@ 2016-01-28  7:25 ` Shilpasri G Bhat
  2016-01-28  8:00   ` Gautham R Shenoy
  2016-01-28  8:27   ` Viresh Kumar
  2016-01-28  7:25 ` [PATCH v7 2/6] cpufreq: powernv: Hot-plug safe the kworker thread Shilpasri G Bhat
                   ` (6 subsequent siblings)
  7 siblings, 2 replies; 25+ messages in thread
From: Shilpasri G Bhat @ 2016-01-28  7:25 UTC (permalink / raw)
  To: linuxppc-dev, linux-kernel
  Cc: rjw, viresh.kumar, linux-pm, pc, anton, ego, shreyas,
	bsingharora, mpe, Shilpasri G Bhat

This will free the dynamically allocated memory of'chips' on
module exit.

Signed-off-by: Shilpasri G Bhat <shilpa.bhat@linux.vnet.ibm.com>
---
 drivers/cpufreq/powernv-cpufreq.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/drivers/cpufreq/powernv-cpufreq.c b/drivers/cpufreq/powernv-cpufreq.c
index 547890f..53f980b 100644
--- a/drivers/cpufreq/powernv-cpufreq.c
+++ b/drivers/cpufreq/powernv-cpufreq.c
@@ -612,6 +612,7 @@ static void __exit powernv_cpufreq_exit(void)
 	unregister_reboot_notifier(&powernv_cpufreq_reboot_nb);
 	opal_message_notifier_unregister(OPAL_MSG_OCC,
 					 &powernv_cpufreq_opal_nb);
+	kfree(chips);
 	cpufreq_unregister_driver(&powernv_cpufreq_driver);
 }
 module_exit(powernv_cpufreq_exit);
-- 
1.9.3

^ permalink raw reply related	[flat|nested] 25+ messages in thread

* [PATCH v7 2/6] cpufreq: powernv: Hot-plug safe the kworker thread
  2016-01-28  7:25 [PATCH v7 0/6] cpufreq: powernv: Redesign the presentation of throttle notification and solve bug-fixes in the driver Shilpasri G Bhat
  2016-01-28  7:25 ` [PATCH v7 1/6] cpufreq: powernv: Free 'chips' on module exit Shilpasri G Bhat
@ 2016-01-28  7:25 ` Shilpasri G Bhat
  2016-01-28  7:25 ` [PATCH v7 3/6] cpufreq: powernv: Remove cpu_to_chip_id() from hot-path Shilpasri G Bhat
                   ` (5 subsequent siblings)
  7 siblings, 0 replies; 25+ messages in thread
From: Shilpasri G Bhat @ 2016-01-28  7:25 UTC (permalink / raw)
  To: linuxppc-dev, linux-kernel
  Cc: rjw, viresh.kumar, linux-pm, pc, anton, ego, shreyas,
	bsingharora, mpe, Shilpasri G Bhat

In the kworker_thread powernv_cpufreq_work_fn(), we can end up
sending an IPI to a cpu going offline. This is a rare corner case
which is fixed using {get/put}_online_cpus(). Along with this fix,
this patch adds changes to do oneshot cpumask_{clear/and} operation.

Suggested-by: Shreyas B Prabhu <shreyas@linux.vnet.ibm.com>
Suggested-by: Gautham R Shenoy <ego@linux.vnet.ibm.com>
Signed-off-by: Shilpasri G Bhat <shilpa.bhat@linux.vnet.ibm.com>
Reviewed-by: Gautham R. Shenoy <ego@linux.vnet.ibm.com>
Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
---
No changes from v6.

Changes form v5:
- Fix the kbuild-error:
drivers/cpufreq/powernv-cpufreq.c:428:2: error: implicit declaration of
function 'get_online_cpus' [-Werror=implicit-function-declaration

 drivers/cpufreq/powernv-cpufreq.c | 19 +++++++++++--------
 1 file changed, 11 insertions(+), 8 deletions(-)

diff --git a/drivers/cpufreq/powernv-cpufreq.c b/drivers/cpufreq/powernv-cpufreq.c
index 53f980b..a271b0f 100644
--- a/drivers/cpufreq/powernv-cpufreq.c
+++ b/drivers/cpufreq/powernv-cpufreq.c
@@ -28,6 +28,7 @@
 #include <linux/of.h>
 #include <linux/reboot.h>
 #include <linux/slab.h>
+#include <linux/cpu.h>
 
 #include <asm/cputhreads.h>
 #include <asm/firmware.h>
@@ -423,18 +424,19 @@ void powernv_cpufreq_work_fn(struct work_struct *work)
 {
 	struct chip *chip = container_of(work, struct chip, throttle);
 	unsigned int cpu;
-	cpumask_var_t mask;
+	cpumask_t mask;
 
-	smp_call_function_any(&chip->mask,
+	get_online_cpus();
+	cpumask_and(&mask, &chip->mask, cpu_online_mask);
+	smp_call_function_any(&mask,
 			      powernv_cpufreq_throttle_check, NULL, 0);
 
 	if (!chip->restore)
-		return;
+		goto out;
 
 	chip->restore = false;
-	cpumask_copy(mask, &chip->mask);
-	for_each_cpu_and(cpu, mask, cpu_online_mask) {
-		int index, tcpu;
+	for_each_cpu(cpu, &mask) {
+		int index;
 		struct cpufreq_policy policy;
 
 		cpufreq_get_policy(&policy, cpu);
@@ -442,9 +444,10 @@ void powernv_cpufreq_work_fn(struct work_struct *work)
 					       policy.cur,
 					       CPUFREQ_RELATION_C, &index);
 		powernv_cpufreq_target_index(&policy, index);
-		for_each_cpu(tcpu, policy.cpus)
-			cpumask_clear_cpu(tcpu, mask);
+		cpumask_andnot(&mask, &mask, policy.cpus);
 	}
+out:
+	put_online_cpus();
 }
 
 static char throttle_reason[][30] = {
-- 
1.9.3

^ permalink raw reply related	[flat|nested] 25+ messages in thread

* [PATCH v7 3/6] cpufreq: powernv: Remove cpu_to_chip_id() from hot-path
  2016-01-28  7:25 [PATCH v7 0/6] cpufreq: powernv: Redesign the presentation of throttle notification and solve bug-fixes in the driver Shilpasri G Bhat
  2016-01-28  7:25 ` [PATCH v7 1/6] cpufreq: powernv: Free 'chips' on module exit Shilpasri G Bhat
  2016-01-28  7:25 ` [PATCH v7 2/6] cpufreq: powernv: Hot-plug safe the kworker thread Shilpasri G Bhat
@ 2016-01-28  7:25 ` Shilpasri G Bhat
  2016-01-28  8:11     ` Gautham R Shenoy
  2016-01-28  8:28   ` Viresh Kumar
  2016-01-28  7:25 ` [PATCH v7 4/6] cpufreq: powernv/tracing: Add powernv_throttle tracepoint Shilpasri G Bhat
                   ` (4 subsequent siblings)
  7 siblings, 2 replies; 25+ messages in thread
From: Shilpasri G Bhat @ 2016-01-28  7:25 UTC (permalink / raw)
  To: linuxppc-dev, linux-kernel
  Cc: rjw, viresh.kumar, linux-pm, pc, anton, ego, shreyas,
	bsingharora, mpe, Shilpasri G Bhat

cpu_to_chip_id() does a DT walk through to find out the chip id by
taking a contended device tree lock. This adds an unnecessary overhead
in a hot path. So instead of calling cpu_to_chip_id() everytime cache
the chip ids for all cores in the array 'core_to_chip_map' and use it
in the hotpath.

Reported-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Shilpasri G Bhat <shilpa.bhat@linux.vnet.ibm.com>
---
Changes from v6:
- Minor changes to move the code 'cpumask_copy()' after 'core_to_chip_map'
  is allocated.
- Move 'kfree(chips)' to a separate patch.

No changes from v5.

Changes from v4:
- Taken care of Shreyas's comments to add a core_to_chip_map array to
  store the chip id.

 drivers/cpufreq/powernv-cpufreq.c | 23 ++++++++++++++++++++---
 1 file changed, 20 insertions(+), 3 deletions(-)

diff --git a/drivers/cpufreq/powernv-cpufreq.c b/drivers/cpufreq/powernv-cpufreq.c
index a271b0f..c670314 100644
--- a/drivers/cpufreq/powernv-cpufreq.c
+++ b/drivers/cpufreq/powernv-cpufreq.c
@@ -43,6 +43,7 @@
 
 static struct cpufreq_frequency_table powernv_freqs[POWERNV_MAX_PSTATES+1];
 static bool rebooting, throttled, occ_reset;
+static unsigned int *core_to_chip_map;
 
 static struct chip {
 	unsigned int id;
@@ -313,13 +314,14 @@ static inline unsigned int get_nominal_index(void)
 static void powernv_cpufreq_throttle_check(void *data)
 {
 	unsigned int cpu = smp_processor_id();
+	unsigned int chip_id = core_to_chip_map[cpu_core_index_of_thread(cpu)];
 	unsigned long pmsr;
 	int pmsr_pmax, i;
 
 	pmsr = get_pmspr(SPRN_PMSR);
 
 	for (i = 0; i < nr_chips; i++)
-		if (chips[i].id == cpu_to_chip_id(cpu))
+		if (chips[i].id == chip_id)
 			break;
 
 	/* Check for Pmax Capping */
@@ -559,19 +561,29 @@ static int init_chip_info(void)
 	unsigned int chip[256];
 	unsigned int cpu, i;
 	unsigned int prev_chip_id = UINT_MAX;
+	cpumask_t cpu_mask;
+	int ret = -ENOMEM;
 
-	for_each_possible_cpu(cpu) {
+	core_to_chip_map = kcalloc(cpu_nr_cores(), sizeof(unsigned int),
+				   GFP_KERNEL);
+	if (!core_to_chip_map)
+		goto out;
+
+	cpumask_copy(&cpu_mask, cpu_possible_mask);
+	for_each_cpu(cpu, &cpu_mask) {
 		unsigned int id = cpu_to_chip_id(cpu);
 
 		if (prev_chip_id != id) {
 			prev_chip_id = id;
 			chip[nr_chips++] = id;
 		}
+		core_to_chip_map[cpu_core_index_of_thread(cpu)] = id;
+		cpumask_andnot(&cpu_mask, &cpu_mask, cpu_sibling_mask(cpu));
 	}
 
 	chips = kmalloc_array(nr_chips, sizeof(struct chip), GFP_KERNEL);
 	if (!chips)
-		return -ENOMEM;
+		goto free_chip_map;
 
 	for (i = 0; i < nr_chips; i++) {
 		chips[i].id = chip[i];
@@ -582,6 +594,10 @@ static int init_chip_info(void)
 	}
 
 	return 0;
+free_chip_map:
+	kfree(core_to_chip_map);
+out:
+	return ret;
 }
 
 static int __init powernv_cpufreq_init(void)
@@ -616,6 +632,7 @@ static void __exit powernv_cpufreq_exit(void)
 	opal_message_notifier_unregister(OPAL_MSG_OCC,
 					 &powernv_cpufreq_opal_nb);
 	kfree(chips);
+	kfree(core_to_chip_map);
 	cpufreq_unregister_driver(&powernv_cpufreq_driver);
 }
 module_exit(powernv_cpufreq_exit);
-- 
1.9.3

^ permalink raw reply related	[flat|nested] 25+ messages in thread

* [PATCH v7 4/6] cpufreq: powernv/tracing: Add powernv_throttle tracepoint
  2016-01-28  7:25 [PATCH v7 0/6] cpufreq: powernv: Redesign the presentation of throttle notification and solve bug-fixes in the driver Shilpasri G Bhat
                   ` (2 preceding siblings ...)
  2016-01-28  7:25 ` [PATCH v7 3/6] cpufreq: powernv: Remove cpu_to_chip_id() from hot-path Shilpasri G Bhat
@ 2016-01-28  7:25 ` Shilpasri G Bhat
  2016-01-28  7:25 ` [PATCH v7 5/6] cpufreq: powernv: Replace pr_info with trace print for throttle event Shilpasri G Bhat
                   ` (3 subsequent siblings)
  7 siblings, 0 replies; 25+ messages in thread
From: Shilpasri G Bhat @ 2016-01-28  7:25 UTC (permalink / raw)
  To: linuxppc-dev, linux-kernel
  Cc: rjw, viresh.kumar, linux-pm, pc, anton, ego, shreyas,
	bsingharora, mpe, Shilpasri G Bhat, Ingo Molnar, Steven Rostedt

This patch adds the powernv_throttle tracepoint to trace the CPU
frequency throttling event, which is used by the powernv-cpufreq
driver in POWER8.

Signed-off-by: Shilpasri G Bhat <shilpa.bhat@linux.vnet.ibm.com>
Reviewed-by: Gautham R. Shenoy <ego@linux.vnet.ibm.com>
CC: Ingo Molnar <mingo@redhat.com>
CC: Steven Rostedt <rostedt@goodmis.org>
---
No changes since v2.

 include/trace/events/power.h | 22 ++++++++++++++++++++++
 kernel/trace/power-traces.c  |  1 +
 2 files changed, 23 insertions(+)

diff --git a/include/trace/events/power.h b/include/trace/events/power.h
index 284244e..19e5030 100644
--- a/include/trace/events/power.h
+++ b/include/trace/events/power.h
@@ -38,6 +38,28 @@ DEFINE_EVENT(cpu, cpu_idle,
 	TP_ARGS(state, cpu_id)
 );
 
+TRACE_EVENT(powernv_throttle,
+
+	TP_PROTO(int chip_id, const char *reason, int pmax),
+
+	TP_ARGS(chip_id, reason, pmax),
+
+	TP_STRUCT__entry(
+		__field(int, chip_id)
+		__string(reason, reason)
+		__field(int, pmax)
+	),
+
+	TP_fast_assign(
+		__entry->chip_id = chip_id;
+		__assign_str(reason, reason);
+		__entry->pmax = pmax;
+	),
+
+	TP_printk("Chip %d Pmax %d %s", __entry->chip_id,
+		  __entry->pmax, __get_str(reason))
+);
+
 TRACE_EVENT(pstate_sample,
 
 	TP_PROTO(u32 core_busy,
diff --git a/kernel/trace/power-traces.c b/kernel/trace/power-traces.c
index eb4220a..81b8745 100644
--- a/kernel/trace/power-traces.c
+++ b/kernel/trace/power-traces.c
@@ -15,4 +15,5 @@
 
 EXPORT_TRACEPOINT_SYMBOL_GPL(suspend_resume);
 EXPORT_TRACEPOINT_SYMBOL_GPL(cpu_idle);
+EXPORT_TRACEPOINT_SYMBOL_GPL(powernv_throttle);
 
-- 
1.9.3

^ permalink raw reply related	[flat|nested] 25+ messages in thread

* [PATCH v7 5/6] cpufreq: powernv: Replace pr_info with trace print for throttle event
  2016-01-28  7:25 [PATCH v7 0/6] cpufreq: powernv: Redesign the presentation of throttle notification and solve bug-fixes in the driver Shilpasri G Bhat
                   ` (3 preceding siblings ...)
  2016-01-28  7:25 ` [PATCH v7 4/6] cpufreq: powernv/tracing: Add powernv_throttle tracepoint Shilpasri G Bhat
@ 2016-01-28  7:25 ` Shilpasri G Bhat
  2016-01-28  8:18   ` Gautham R Shenoy
  2016-01-28  8:29   ` Viresh Kumar
  2016-01-28  7:25 ` [PATCH v7 6/6] cpufreq: powernv: Add sysfs attributes to show throttle stats Shilpasri G Bhat
                   ` (2 subsequent siblings)
  7 siblings, 2 replies; 25+ messages in thread
From: Shilpasri G Bhat @ 2016-01-28  7:25 UTC (permalink / raw)
  To: linuxppc-dev, linux-kernel
  Cc: rjw, viresh.kumar, linux-pm, pc, anton, ego, shreyas,
	bsingharora, mpe, Shilpasri G Bhat

Currently we use printk message to notify the throttle event. But this
can flood the console if the cpu is throttled frequently. So replace the
printk with the tracepoint to notify the throttle event. And also events
like throttle below nominal frequency and OCC_RESET are reduced to
pr_warn/pr_warn_once as pointed by MFG to not mark them as critical
messages. This patch adds 'throttle_reason' to struct chip to store the
throttle reason.

Signed-off-by: Shilpasri G Bhat <shilpa.bhat@linux.vnet.ibm.com>
---
Changes from v6:
- Rename struct chip member 'throt_reason' to 'throttle_reason'

No changes from v5.

Changes from v4:
- Taken care of Gautham's comments to remove the new function
  powernv_cpufreq_check_pmax()
- Modified commit message

Changes from v3:
- Separate this patch to contain trace_point changes
- Move struct chip member 'restore' of type bool above 'mask' to reduce
  structure padding.

No changes from v2.

Changes from v1:
- As suggested by Paul Clarke replaced char * throttle_reason[][30] by 
  const char * const throttle_reason[].

 drivers/cpufreq/powernv-cpufreq.c | 73 ++++++++++++++++++---------------------
 1 file changed, 34 insertions(+), 39 deletions(-)

diff --git a/drivers/cpufreq/powernv-cpufreq.c b/drivers/cpufreq/powernv-cpufreq.c
index c670314..1bbc10a 100644
--- a/drivers/cpufreq/powernv-cpufreq.c
+++ b/drivers/cpufreq/powernv-cpufreq.c
@@ -29,6 +29,7 @@
 #include <linux/reboot.h>
 #include <linux/slab.h>
 #include <linux/cpu.h>
+#include <trace/events/power.h>
 
 #include <asm/cputhreads.h>
 #include <asm/firmware.h>
@@ -45,12 +46,22 @@ static struct cpufreq_frequency_table powernv_freqs[POWERNV_MAX_PSTATES+1];
 static bool rebooting, throttled, occ_reset;
 static unsigned int *core_to_chip_map;
 
+static const char * const throttle_reason[] = {
+	"No throttling",
+	"Power Cap",
+	"Processor Over Temperature",
+	"Power Supply Failure",
+	"Over Current",
+	"OCC Reset"
+};
+
 static struct chip {
 	unsigned int id;
 	bool throttled;
+	bool restore;
+	u8 throttle_reason;
 	cpumask_t mask;
 	struct work_struct throttle;
-	bool restore;
 } *chips;
 
 static int nr_chips;
@@ -331,17 +342,17 @@ static void powernv_cpufreq_throttle_check(void *data)
 			goto next;
 		chips[i].throttled = true;
 		if (pmsr_pmax < powernv_pstate_info.nominal)
-			pr_crit("CPU %d on Chip %u has Pmax reduced below nominal frequency (%d < %d)\n",
-				cpu, chips[i].id, pmsr_pmax,
-				powernv_pstate_info.nominal);
-		else
-			pr_info("CPU %d on Chip %u has Pmax reduced below turbo frequency (%d < %d)\n",
-				cpu, chips[i].id, pmsr_pmax,
-				powernv_pstate_info.max);
+			pr_warn_once("CPU %d on Chip %u has Pmax reduced below nominal frequency (%d < %d)\n",
+				     cpu, chips[i].id, pmsr_pmax,
+				     powernv_pstate_info.nominal);
+		trace_powernv_throttle(chips[i].id,
+				      throttle_reason[chips[i].throttle_reason],
+				      pmsr_pmax);
 	} else if (chips[i].throttled) {
 		chips[i].throttled = false;
-		pr_info("CPU %d on Chip %u has Pmax restored to %d\n", cpu,
-			chips[i].id, pmsr_pmax);
+		trace_powernv_throttle(chips[i].id,
+				      throttle_reason[chips[i].throttle_reason],
+				      pmsr_pmax);
 	}
 
 	/* Check if Psafe_mode_active is set in PMSR. */
@@ -359,7 +370,7 @@ next:
 
 	if (throttled) {
 		pr_info("PMSR = %16lx\n", pmsr);
-		pr_crit("CPU Frequency could be throttled\n");
+		pr_warn("CPU Frequency could be throttled\n");
 	}
 }
 
@@ -452,15 +463,6 @@ out:
 	put_online_cpus();
 }
 
-static char throttle_reason[][30] = {
-					"No throttling",
-					"Power Cap",
-					"Processor Over Temperature",
-					"Power Supply Failure",
-					"Over Current",
-					"OCC Reset"
-				     };
-
 static int powernv_cpufreq_occ_msg(struct notifier_block *nb,
 				   unsigned long msg_type, void *_msg)
 {
@@ -486,7 +488,7 @@ static int powernv_cpufreq_occ_msg(struct notifier_block *nb,
 		 */
 		if (!throttled) {
 			throttled = true;
-			pr_crit("CPU frequency is throttled for duration\n");
+			pr_warn("CPU frequency is throttled for duration\n");
 		}
 
 		break;
@@ -510,23 +512,18 @@ static int powernv_cpufreq_occ_msg(struct notifier_block *nb,
 			return 0;
 		}
 
-		if (omsg.throttle_status &&
+		for (i = 0; i < nr_chips; i++)
+			if (chips[i].id == omsg.chip)
+				break;
+
+		if (omsg.throttle_status >= 0 &&
 		    omsg.throttle_status <= OCC_MAX_THROTTLE_STATUS)
-			pr_info("OCC: Chip %u Pmax reduced due to %s\n",
-				(unsigned int)omsg.chip,
-				throttle_reason[omsg.throttle_status]);
-		else if (!omsg.throttle_status)
-			pr_info("OCC: Chip %u %s\n", (unsigned int)omsg.chip,
-				throttle_reason[omsg.throttle_status]);
-		else
-			return 0;
+			chips[i].throttle_reason = omsg.throttle_status;
 
-		for (i = 0; i < nr_chips; i++)
-			if (chips[i].id == omsg.chip) {
-				if (!omsg.throttle_status)
-					chips[i].restore = true;
-				schedule_work(&chips[i].throttle);
-			}
+		if (!omsg.throttle_status)
+			chips[i].restore = true;
+
+		schedule_work(&chips[i].throttle);
 	}
 	return 0;
 }
@@ -581,16 +578,14 @@ static int init_chip_info(void)
 		cpumask_andnot(&cpu_mask, &cpu_mask, cpu_sibling_mask(cpu));
 	}
 
-	chips = kmalloc_array(nr_chips, sizeof(struct chip), GFP_KERNEL);
+	chips = kcalloc(nr_chips, sizeof(struct chip), GFP_KERNEL);
 	if (!chips)
 		goto free_chip_map;
 
 	for (i = 0; i < nr_chips; i++) {
 		chips[i].id = chip[i];
-		chips[i].throttled = false;
 		cpumask_copy(&chips[i].mask, cpumask_of_node(chip[i]));
 		INIT_WORK(&chips[i].throttle, powernv_cpufreq_work_fn);
-		chips[i].restore = false;
 	}
 
 	return 0;
-- 
1.9.3

^ permalink raw reply related	[flat|nested] 25+ messages in thread

* [PATCH v7 6/6] cpufreq: powernv: Add sysfs attributes to show throttle stats
  2016-01-28  7:25 [PATCH v7 0/6] cpufreq: powernv: Redesign the presentation of throttle notification and solve bug-fixes in the driver Shilpasri G Bhat
                   ` (4 preceding siblings ...)
  2016-01-28  7:25 ` [PATCH v7 5/6] cpufreq: powernv: Replace pr_info with trace print for throttle event Shilpasri G Bhat
@ 2016-01-28  7:25 ` Shilpasri G Bhat
  2016-01-28  8:34     ` Gautham R Shenoy
                     ` (2 more replies)
  2016-01-28  8:25 ` [PATCH v7 0/6] cpufreq: powernv: Redesign the presentation of throttle notification and solve bug-fixes in the driver Viresh Kumar
  2016-01-28 11:10 ` Balbir Singh
  7 siblings, 3 replies; 25+ messages in thread
From: Shilpasri G Bhat @ 2016-01-28  7:25 UTC (permalink / raw)
  To: linuxppc-dev, linux-kernel
  Cc: rjw, viresh.kumar, linux-pm, pc, anton, ego, shreyas,
	bsingharora, mpe, Shilpasri G Bhat, linux-api

Create sysfs attributes to export throttle information in
/sys/devices/system/cpu/cpufreq/chipN. The newly added sysfs files are as
follows:

1)/sys/devices/system/cpu/cpufreq/chip0/throttle_frequencies
  This gives the throttle stats for each of the available frequencies.
  The throttle stat of a frequency is the total number of times the max
  frequency is reduced to that frequency.
  # cat /sys/devices/system/cpu/cpufreq/chip0/throttle_frequencies
  4023000 0
  3990000 0
  3956000 1
  3923000 0
  3890000 0
  3857000 2
  3823000 0
  3790000 0
  3757000 2
  3724000 1
  3690000 1
  ...

2)/sys/devices/system/cpu/cpufreq/chip0/throttle_reasons
  This directory contains throttle reason files. Each file gives the
  total number of times the max frequency is throttled, except for
  'unthrottle_count', which gives the total number of times the max
  frequency is unthrottled after being throttled.
  # cd /sys/devices/system/cpu/cpufreq/chip0/throttle_reasons
  # cat cpu_over_temperature
  7
  # cat occ_reset
  0
  # cat over_current
  0
  # cat power_cap
  0
  # cat power_supply_failure
  0
  # cat unthrottle_count
  7

3)/sys/devices/system/cpu/cpufreq/chip0/throttle_stat
  This gives the total number of events of max frequency throttling to
  lower frequencies in the turbo range of frequencies and the sub-turbo(at
  and below nominal) range of frequencies.
  # cat /sys/devices/system/cpu/cpufreq/chip0/throttle_stat
  turbo 7
  sub-turbo 0

Signed-off-by: Shilpasri G Bhat <shilpa.bhat@linux.vnet.ibm.com>
Cc: linux-api@vger.kernel.org
---
Changes from v6:
- Rename struct chip members 'throt_{nominal/turbo}' to throttle_*
- Rename sysfs throttle_reason attribute 'throttle_reset' to
  'unthrottle_count'
- Add sysfs attribute details in
  Documentation/ABI/testing/sysfs-devices-system-cpu
- Add helper routine get_chip_index_from_kobj() for throttle sysfs
  attribute show() to get chip index from kobject.
- Add the chip id in the pr_warn_once

No changes from v5.

Changes from v4:
- Taken care of Gautham's comments to use inline get_chip_index()

Changes from v3:
- Seperate the patch to contain only the throttle sysfs attribute changes.
- Add helper inline function get_chip_index()

Changes from v2:
- Fixed kbuild test warning.
drivers/cpufreq/powernv-cpufreq.c:609:2: warning: ignoring return
value of 'kstrtoint', declared with attribute warn_unused_result
[-Wunused-result]

Changes from v1:
- Added a kobject to struct chip
- Grouped the throttle reasons under a separate attribute_group and
  exported each reason as individual file.
- Moved the sysfs files from /sys/devices/system/node/nodeN to
  /sys/devices/system/cpu/cpufreq/chipN
- As suggested by Paul Clarke replaced 'Nominal' with 'sub-turbo'.

 Documentation/ABI/testing/sysfs-devices-system-cpu |  45 +++++
 drivers/cpufreq/powernv-cpufreq.c                  | 205 ++++++++++++++++++++-
 2 files changed, 241 insertions(+), 9 deletions(-)

diff --git a/Documentation/ABI/testing/sysfs-devices-system-cpu b/Documentation/ABI/testing/sysfs-devices-system-cpu
index b683e8e..dea4620 100644
--- a/Documentation/ABI/testing/sysfs-devices-system-cpu
+++ b/Documentation/ABI/testing/sysfs-devices-system-cpu
@@ -271,3 +271,48 @@ Description:	Parameters for the CPU cache attributes
 			- WriteBack: data is written only to the cache line and
 				     the modified cache line is written to main
 				     memory only when it is replaced
+
+What:		/sys/devices/system/cpu/cpufreq/chip*/throttle_stats
+Date:		Jan 2016
+Contact:	Linux kernel mailing list <linux-kernel@vger.kernel.org>
+		Linux for PowerPC mailing list <linuxppc-dev@ozlabs.org>
+Description:	CPU Frequency throttle stat for the chip
+
+		This attribute gives the total number of events of max-frequency
+		throttling to any lower frequency in the turbo(above nominal) and
+		the sub-turbo (at and below nominal) range of frequencies.
+
+
+What:		/sys/devices/system/cpu/cpufreq/chip*/throttle_frequencies
+Date:		Jan 2016
+Contact:	Linux kernel mailing list <linux-kernel@vger.kernel.org>
+		Linux for PowerPC mailing list <linuxppc-dev@ozlabs.org>
+Description:	CPU Frequency throttle stat for all available frequencies in the chip
+
+		This attribute gives the throttle stats for each of the available
+		frequencies. The throttle stat of a frequency is the total
+		number of times the max frequency is reduced to that frequency.
+
+What:		/sys/devices/system/cpu/cpufreq/chip*/throttle_reasons/<reason_attributes>
+Date:		Jan 2016
+Contact:	Linux kernel mailing list <linux-kernel@vger.kernel.org>
+		Linux for PowerPC mailing list <linuxppc-dev@ozlabs.org>
+Description:	CPU Frequency throttle reason stat for the chip
+
+		This directory contains throttle reason files. Each file gives
+		the total number of times the max frequency is throttled, except
+		for 'unthrottle_count', which gives the total number of times
+		the max frequency is unthrottled after being throttled. Below
+		are the reason attributes.
+
+		cpu_over_temperature: Throttled due to cpu over temperature
+
+		occ_reset: Throttled due to reset of OCC
+
+		over_current: Throttled due to over current
+
+		power_cap: Throttled due to power capping
+
+		power_supply_failure: Throttled due to power supply failure
+
+		unthrottle_count: Unthrottled from any of the above reasons.
diff --git a/drivers/cpufreq/powernv-cpufreq.c b/drivers/cpufreq/powernv-cpufreq.c
index 1bbc10a..b647941 100644
--- a/drivers/cpufreq/powernv-cpufreq.c
+++ b/drivers/cpufreq/powernv-cpufreq.c
@@ -55,6 +55,16 @@ static const char * const throttle_reason[] = {
 	"OCC Reset"
 };
 
+enum throttle_reason_type {
+	NO_THROTTLE = 0,
+	POWERCAP,
+	CPU_OVERTEMP,
+	POWER_SUPPLY_FAILURE,
+	OVERCURRENT,
+	OCC_RESET_THROTTLE,
+	OCC_MAX_REASON
+};
+
 static struct chip {
 	unsigned int id;
 	bool throttled;
@@ -62,6 +72,11 @@ static struct chip {
 	u8 throttle_reason;
 	cpumask_t mask;
 	struct work_struct throttle;
+	int throttle_turbo;
+	int throttle_nominal;
+	int reason[OCC_MAX_REASON];
+	int *pstate_stat;
+	struct kobject *kobj;
 } *chips;
 
 static int nr_chips;
@@ -196,6 +211,126 @@ static struct freq_attr *powernv_cpu_freq_attr[] = {
 	NULL,
 };
 
+static inline int get_chip_index(unsigned int id)
+{
+	int i;
+
+	for (i = 0; i < nr_chips; i++)
+		if (chips[i].id == id)
+			return i;
+
+	return -EINVAL;
+}
+
+static inline int get_chip_index_from_kobj(struct kobject *kobj)
+{
+	int ret, id;
+	int len = strlen("chip");
+
+	ret = kstrtoint(kobj->name + len, 0, &id);
+	if (ret)
+		return ret;
+
+	ret = get_chip_index(id);
+	if (ret < 0)
+		pr_warn_once("%s Matching chip-id not found %d\n", __func__,
+			     id);
+	return ret;
+}
+
+static ssize_t throttle_freq_show(struct kobject *kobj,
+				  struct kobj_attribute *attr, char *buf)
+{
+	int i, count = 0, id;
+
+	id = get_chip_index_from_kobj(kobj);
+	if (id < 0)
+		return id;
+
+	for (i = 0; i < powernv_pstate_info.nr_pstates; i++)
+		count += sprintf(&buf[count], "%d %d\n",
+			       powernv_freqs[i].frequency,
+			       chips[id].pstate_stat[i]);
+
+	return count;
+}
+
+static struct kobj_attribute attr_throttle_frequencies =
+__ATTR(throttle_frequencies, 0444, throttle_freq_show, NULL);
+
+static ssize_t throttle_stat_show(struct kobject *kobj,
+				  struct kobj_attribute *attr, char *buf)
+{
+	int id, count = 0;
+
+	id = get_chip_index_from_kobj(kobj);
+	if (id < 0)
+		return id;
+
+	count += sprintf(&buf[count], "turbo %d\n", chips[id].throttle_turbo);
+	count += sprintf(&buf[count], "sub-turbo %d\n",
+					chips[id].throttle_nominal);
+
+	return count;
+}
+
+static struct kobj_attribute attr_throttle_stat =
+__ATTR(throttle_stat, 0444, throttle_stat_show, NULL);
+
+#define define_throttle_reason_attr(attr_name, val)			  \
+static ssize_t attr_name##_show(struct kobject *kobj,			  \
+				  struct kobj_attribute *attr, char *buf) \
+{									  \
+	int id;								  \
+									  \
+	id = get_chip_index_from_kobj(kobj);				  \
+	if (id < 0)							  \
+		return id;						  \
+									  \
+	return sprintf(buf, "%d\n", chips[id].reason[val]);		  \
+}									  \
+									  \
+static struct kobj_attribute attr_##attr_name =				  \
+__ATTR(attr_name, 0444, attr_name##_show, NULL)
+
+define_throttle_reason_attr(unthrottle_count, NO_THROTTLE);
+define_throttle_reason_attr(power_cap, POWERCAP);
+define_throttle_reason_attr(cpu_over_temperature, CPU_OVERTEMP);
+define_throttle_reason_attr(power_supply_failure, POWER_SUPPLY_FAILURE);
+define_throttle_reason_attr(over_current, OVERCURRENT);
+define_throttle_reason_attr(occ_reset, OCC_RESET_THROTTLE);
+
+static struct attribute *throttle_reason_attrs[] = {
+	&attr_unthrottle_count.attr,
+	&attr_power_cap.attr,
+	&attr_cpu_over_temperature.attr,
+	&attr_power_supply_failure.attr,
+	&attr_over_current.attr,
+	&attr_occ_reset.attr,
+	NULL
+};
+
+static struct attribute *throttle_stat_attrs[] = {
+	&attr_throttle_frequencies.attr,
+	&attr_throttle_stat.attr,
+	NULL
+};
+
+static const struct attribute_group throttle_reason_group = {
+	.name   = "throttle_reasons",
+	.attrs  = throttle_reason_attrs,
+};
+
+static const struct attribute_group throttle_stat_group = {
+	.attrs = throttle_stat_attrs,
+};
+
+static const struct attribute_group *throttle_attr_groups[] = {
+	&throttle_stat_group,
+	&throttle_reason_group,
+	NULL
+};
+
 /* Helper routines */
 
 /* Access helpers to power mgt SPR */
@@ -327,13 +462,16 @@ static void powernv_cpufreq_throttle_check(void *data)
 	unsigned int cpu = smp_processor_id();
 	unsigned int chip_id = core_to_chip_map[cpu_core_index_of_thread(cpu)];
 	unsigned long pmsr;
-	int pmsr_pmax, i;
+	int pmsr_pmax, i, index;
 
 	pmsr = get_pmspr(SPRN_PMSR);
 
-	for (i = 0; i < nr_chips; i++)
-		if (chips[i].id == chip_id)
-			break;
+	i = get_chip_index(chip_id);
+	if (unlikely(i < 0)) {
+		pr_warn_once("%s Matching chip-id not found %d\n", __func__,
+			     chip_id);
+		return;
+	}
 
 	/* Check for Pmax Capping */
 	pmsr_pmax = (s8)PMSR_MAX(pmsr);
@@ -341,10 +479,19 @@ static void powernv_cpufreq_throttle_check(void *data)
 		if (chips[i].throttled)
 			goto next;
 		chips[i].throttled = true;
-		if (pmsr_pmax < powernv_pstate_info.nominal)
+		if (pmsr_pmax < powernv_pstate_info.nominal) {
 			pr_warn_once("CPU %d on Chip %u has Pmax reduced below nominal frequency (%d < %d)\n",
 				     cpu, chips[i].id, pmsr_pmax,
 				     powernv_pstate_info.nominal);
+			chips[i].throttle_nominal++;
+		} else {
+			chips[i].throttle_turbo++;
+		}
+
+		index  = powernv_pstate_info.max - pmsr_pmax;
+		if (index >= 0 && index < powernv_pstate_info.nr_pstates)
+			chips[i].pstate_stat[index]++;
+
 		trace_powernv_throttle(chips[i].id,
 				      throttle_reason[chips[i].throttle_reason],
 				      pmsr_pmax);
@@ -512,13 +659,19 @@ static int powernv_cpufreq_occ_msg(struct notifier_block *nb,
 			return 0;
 		}
 
-		for (i = 0; i < nr_chips; i++)
-			if (chips[i].id == omsg.chip)
-				break;
+
+		i = get_chip_index(omsg.chip);
+		if (i < 0) {
+			pr_warn_once("%s Matching chip-id not found %d\n",
+				     __func__, (int)omsg.chip);
+			return i;
+		}
 
 		if (omsg.throttle_status >= 0 &&
-		    omsg.throttle_status <= OCC_MAX_THROTTLE_STATUS)
+		    omsg.throttle_status <= OCC_MAX_THROTTLE_STATUS) {
 			chips[i].throttle_reason = omsg.throttle_status;
+			chips[i].reason[omsg.throttle_status]++;
+		}
 
 		if (!omsg.throttle_status)
 			chips[i].restore = true;
@@ -583,12 +736,38 @@ static int init_chip_info(void)
 		goto free_chip_map;
 
 	for (i = 0; i < nr_chips; i++) {
+		char name[10];
+
 		chips[i].id = chip[i];
 		cpumask_copy(&chips[i].mask, cpumask_of_node(chip[i]));
 		INIT_WORK(&chips[i].throttle, powernv_cpufreq_work_fn);
+		chips[i].pstate_stat = kcalloc(powernv_pstate_info.nr_pstates,
+						sizeof(int), GFP_KERNEL);
+		if (!chips[i].pstate_stat)
+			goto free;
+
+		sprintf(name, "chip%d", chips[i].id);
+		chips[i].kobj = kobject_create_and_add(name,
+						       cpufreq_global_kobject);
+		if (!chips[i].kobj)
+			goto free;
+
+		ret = sysfs_create_groups(chips[i].kobj, throttle_attr_groups);
+		if (ret) {
+			pr_info("Chip %d failed to create throttle sysfs group\n",
+				chips[i].id);
+			goto free;
+		}
 	}
 
 	return 0;
+free:
+	nr_chips = i;
+	for (i = 0; i <= nr_chips; i++) {
+		kobject_put(chips[i].kobj);
+		kfree(chips[i].pstate_stat);
+	}
+	kfree(chips);
 free_chip_map:
 	kfree(core_to_chip_map);
 out:
@@ -623,9 +802,17 @@ module_init(powernv_cpufreq_init);
 
 static void __exit powernv_cpufreq_exit(void)
 {
+	int i;
+
 	unregister_reboot_notifier(&powernv_cpufreq_reboot_nb);
 	opal_message_notifier_unregister(OPAL_MSG_OCC,
 					 &powernv_cpufreq_opal_nb);
+
+	for (i = 0; i < nr_chips; i++) {
+		kobject_put(chips[i].kobj);
+		kfree(chips[i].pstate_stat);
+	}
+
 	kfree(chips);
 	kfree(core_to_chip_map);
 	cpufreq_unregister_driver(&powernv_cpufreq_driver);
-- 
1.9.3

^ permalink raw reply related	[flat|nested] 25+ messages in thread

* Re: [PATCH v7 1/6] cpufreq: powernv: Free 'chips' on module exit
  2016-01-28  7:25 ` [PATCH v7 1/6] cpufreq: powernv: Free 'chips' on module exit Shilpasri G Bhat
@ 2016-01-28  8:00   ` Gautham R Shenoy
  2016-01-28  8:27   ` Viresh Kumar
  1 sibling, 0 replies; 25+ messages in thread
From: Gautham R Shenoy @ 2016-01-28  8:00 UTC (permalink / raw)
  To: Shilpasri G Bhat
  Cc: linuxppc-dev, linux-kernel, rjw, viresh.kumar, linux-pm, pc,
	anton, ego, shreyas, bsingharora, mpe

On Thu, Jan 28, 2016 at 12:55:36PM +0530, Shilpasri G Bhat wrote:
> This will free the dynamically allocated memory of'chips' on
> module exit.
> 
> Signed-off-by: Shilpasri G Bhat <shilpa.bhat@linux.vnet.ibm.com>

Reviewed-by: Gautham R. Shenoy <ego@linux.vnet.ibm.com>

--
Thanks and Regards
gautham.

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH v7 3/6] cpufreq: powernv: Remove cpu_to_chip_id() from hot-path
  2016-01-28  7:25 ` [PATCH v7 3/6] cpufreq: powernv: Remove cpu_to_chip_id() from hot-path Shilpasri G Bhat
@ 2016-01-28  8:11     ` Gautham R Shenoy
  2016-01-28  8:28   ` Viresh Kumar
  1 sibling, 0 replies; 25+ messages in thread
From: Gautham R Shenoy @ 2016-01-28  8:11 UTC (permalink / raw)
  To: Shilpasri G Bhat
  Cc: linuxppc-dev, linux-kernel, rjw, viresh.kumar, linux-pm, pc,
	anton, ego, shreyas, bsingharora, mpe

On Thu, Jan 28, 2016 at 12:55:38PM +0530, Shilpasri G Bhat wrote:
> cpu_to_chip_id() does a DT walk through to find out the chip id by
> taking a contended device tree lock. This adds an unnecessary overhead
> in a hot path. So instead of calling cpu_to_chip_id() everytime cache
> the chip ids for all cores in the array 'core_to_chip_map' and use it
> in the hotpath.
> 
> Reported-by: Anton Blanchard <anton@samba.org>
> Signed-off-by: Shilpasri G Bhat <shilpa.bhat@linux.vnet.ibm.com>

Reviewed-by: Gautham R. Shenoy <ego@linux.vnet.ibm.com>

--
Thanks and Regards
gautham.

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH v7 3/6] cpufreq: powernv: Remove cpu_to_chip_id() from hot-path
@ 2016-01-28  8:11     ` Gautham R Shenoy
  0 siblings, 0 replies; 25+ messages in thread
From: Gautham R Shenoy @ 2016-01-28  8:11 UTC (permalink / raw)
  To: Shilpasri G Bhat
  Cc: linuxppc-dev, linux-kernel, rjw, viresh.kumar, linux-pm, pc,
	anton, ego, shreyas, bsingharora, mpe

On Thu, Jan 28, 2016 at 12:55:38PM +0530, Shilpasri G Bhat wrote:
> cpu_to_chip_id() does a DT walk through to find out the chip id by
> taking a contended device tree lock. This adds an unnecessary overhead
> in a hot path. So instead of calling cpu_to_chip_id() everytime cache
> the chip ids for all cores in the array 'core_to_chip_map' and use it
> in the hotpath.
> 
> Reported-by: Anton Blanchard <anton@samba.org>
> Signed-off-by: Shilpasri G Bhat <shilpa.bhat@linux.vnet.ibm.com>

Reviewed-by: Gautham R. Shenoy <ego@linux.vnet.ibm.com>

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH v7 5/6] cpufreq: powernv: Replace pr_info with trace print for throttle event
  2016-01-28  7:25 ` [PATCH v7 5/6] cpufreq: powernv: Replace pr_info with trace print for throttle event Shilpasri G Bhat
@ 2016-01-28  8:18   ` Gautham R Shenoy
  2016-01-28  8:29   ` Viresh Kumar
  1 sibling, 0 replies; 25+ messages in thread
From: Gautham R Shenoy @ 2016-01-28  8:18 UTC (permalink / raw)
  To: Shilpasri G Bhat
  Cc: linuxppc-dev, linux-kernel, rjw, viresh.kumar, linux-pm, pc,
	anton, ego, shreyas, bsingharora, mpe

On Thu, Jan 28, 2016 at 12:55:40PM +0530, Shilpasri G Bhat wrote:
> Currently we use printk message to notify the throttle event. But this
> can flood the console if the cpu is throttled frequently. So replace the
> printk with the tracepoint to notify the throttle event. And also events
> like throttle below nominal frequency and OCC_RESET are reduced to
> pr_warn/pr_warn_once as pointed by MFG to not mark them as critical
> messages. This patch adds 'throttle_reason' to struct chip to store the
> throttle reason.
> 
> Signed-off-by: Shilpasri G Bhat <shilpa.bhat@linux.vnet.ibm.com>

Reviewed-by: Gautham R. Shenoy <ego@linux.vnet.ibm.com>

--
Thanks and Regards
gautham.

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH v7 0/6] cpufreq: powernv: Redesign the presentation of throttle notification and solve bug-fixes in the driver
  2016-01-28  7:25 [PATCH v7 0/6] cpufreq: powernv: Redesign the presentation of throttle notification and solve bug-fixes in the driver Shilpasri G Bhat
                   ` (5 preceding siblings ...)
  2016-01-28  7:25 ` [PATCH v7 6/6] cpufreq: powernv: Add sysfs attributes to show throttle stats Shilpasri G Bhat
@ 2016-01-28  8:25 ` Viresh Kumar
  2016-01-28 11:10 ` Balbir Singh
  7 siblings, 0 replies; 25+ messages in thread
From: Viresh Kumar @ 2016-01-28  8:25 UTC (permalink / raw)
  To: Shilpasri G Bhat
  Cc: linuxppc-dev, linux-kernel, rjw, linux-pm, pc, anton, ego,
	shreyas, bsingharora, mpe

On 28-01-16, 12:55, Shilpasri G Bhat wrote:
> In POWER8, OCC(On-Chip-Controller) can throttle the frequency of the
> CPU when the chip crosses its thermal and power limits. Currently,
> powernv-cpufreq driver detects and reports this event as a console
> message. Some machines may not sustain the max turbo frequency in all
> conditions and can be throttled frequently. This can lead to the
> flooding of console with throttle messages. So this patchset aims to
> redesign the presentation of this event via sysfs counters and
> tracepoints. And it also fixes couple of bugs reported in the driver.
> 
> - Patch [1] fixes a memory leak bug
> - Patch [2] fixes the cpu hot-plug bug in powernv_cpufreq_work_fn().
> - Patch [3] solves a bug in powernv_cpufreq_throttle_check(), which
>   calls in to cpu_to_chip_id() in hot path which reads DT every time
>   to find the chip id.
> - Patches [4] to [6] will add a perf trace point
>   "power:powernv_throttle" and sysfs throttle counter stats in
>   /sys/devices/system/cpu/cpufreq/chipN.
> 
> Changes from v6:
> - Changes wrt comments from Balbir Singh and Viresh Kumar.

Who cares about these names in version-log ?? You have completely
missed what should have been present here. This is version log and
that's what should be present here :)

And because of that, I have to
- search for your earlier version in my mailbox
- Read all my comments
- Haven't read what Balbir have said

See ..

-- 
viresh

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH v7 1/6] cpufreq: powernv: Free 'chips' on module exit
  2016-01-28  7:25 ` [PATCH v7 1/6] cpufreq: powernv: Free 'chips' on module exit Shilpasri G Bhat
  2016-01-28  8:00   ` Gautham R Shenoy
@ 2016-01-28  8:27   ` Viresh Kumar
  1 sibling, 0 replies; 25+ messages in thread
From: Viresh Kumar @ 2016-01-28  8:27 UTC (permalink / raw)
  To: Shilpasri G Bhat
  Cc: linuxppc-dev, linux-kernel, rjw, linux-pm, pc, anton, ego,
	shreyas, bsingharora, mpe

On 28-01-16, 12:55, Shilpasri G Bhat wrote:
> This will free the dynamically allocated memory of'chips' on
> module exit.

Though it has a 'space' issues before 'chips', but I don't really care
much about that and so you aren't required to resend, unless you have
to send a v8 for something else.

> Signed-off-by: Shilpasri G Bhat <shilpa.bhat@linux.vnet.ibm.com>
> ---
>  drivers/cpufreq/powernv-cpufreq.c | 1 +
>  1 file changed, 1 insertion(+)
> 
> diff --git a/drivers/cpufreq/powernv-cpufreq.c b/drivers/cpufreq/powernv-cpufreq.c
> index 547890f..53f980b 100644
> --- a/drivers/cpufreq/powernv-cpufreq.c
> +++ b/drivers/cpufreq/powernv-cpufreq.c
> @@ -612,6 +612,7 @@ static void __exit powernv_cpufreq_exit(void)
>  	unregister_reboot_notifier(&powernv_cpufreq_reboot_nb);
>  	opal_message_notifier_unregister(OPAL_MSG_OCC,
>  					 &powernv_cpufreq_opal_nb);
> +	kfree(chips);
>  	cpufreq_unregister_driver(&powernv_cpufreq_driver);
>  }
>  module_exit(powernv_cpufreq_exit);

Acked-by: Viresh Kumar <viresh.kumar@linaro.org>

-- 
viresh

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH v7 3/6] cpufreq: powernv: Remove cpu_to_chip_id() from hot-path
  2016-01-28  7:25 ` [PATCH v7 3/6] cpufreq: powernv: Remove cpu_to_chip_id() from hot-path Shilpasri G Bhat
  2016-01-28  8:11     ` Gautham R Shenoy
@ 2016-01-28  8:28   ` Viresh Kumar
  1 sibling, 0 replies; 25+ messages in thread
From: Viresh Kumar @ 2016-01-28  8:28 UTC (permalink / raw)
  To: Shilpasri G Bhat
  Cc: linuxppc-dev, linux-kernel, rjw, linux-pm, pc, anton, ego,
	shreyas, bsingharora, mpe

On 28-01-16, 12:55, Shilpasri G Bhat wrote:
> cpu_to_chip_id() does a DT walk through to find out the chip id by
> taking a contended device tree lock. This adds an unnecessary overhead
> in a hot path. So instead of calling cpu_to_chip_id() everytime cache
> the chip ids for all cores in the array 'core_to_chip_map' and use it
> in the hotpath.
> 
> Reported-by: Anton Blanchard <anton@samba.org>
> Signed-off-by: Shilpasri G Bhat <shilpa.bhat@linux.vnet.ibm.com>
> ---
> Changes from v6:
> - Minor changes to move the code 'cpumask_copy()' after 'core_to_chip_map'
>   is allocated.
> - Move 'kfree(chips)' to a separate patch.

See, you weren't that bad :)

Just that you missed saying that individual patches contain
version-log in cover-letter :)

Acked-by: Viresh Kumar <viresh.kumar@linaro.org>

-- 
viresh

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH v7 5/6] cpufreq: powernv: Replace pr_info with trace print for throttle event
  2016-01-28  7:25 ` [PATCH v7 5/6] cpufreq: powernv: Replace pr_info with trace print for throttle event Shilpasri G Bhat
  2016-01-28  8:18   ` Gautham R Shenoy
@ 2016-01-28  8:29   ` Viresh Kumar
  1 sibling, 0 replies; 25+ messages in thread
From: Viresh Kumar @ 2016-01-28  8:29 UTC (permalink / raw)
  To: Shilpasri G Bhat
  Cc: linuxppc-dev, linux-kernel, rjw, linux-pm, pc, anton, ego,
	shreyas, bsingharora, mpe

On 28-01-16, 12:55, Shilpasri G Bhat wrote:
> Currently we use printk message to notify the throttle event. But this
> can flood the console if the cpu is throttled frequently. So replace the
> printk with the tracepoint to notify the throttle event. And also events
> like throttle below nominal frequency and OCC_RESET are reduced to
> pr_warn/pr_warn_once as pointed by MFG to not mark them as critical
> messages. This patch adds 'throttle_reason' to struct chip to store the
> throttle reason.
> 
> Signed-off-by: Shilpasri G Bhat <shilpa.bhat@linux.vnet.ibm.com>
> ---
> Changes from v6:
> - Rename struct chip member 'throt_reason' to 'throttle_reason'

Acked-by: Viresh Kumar <viresh.kumar@linaro.org>

-- 
viresh

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH v7 6/6] cpufreq: powernv: Add sysfs attributes to show throttle stats
@ 2016-01-28  8:34     ` Gautham R Shenoy
  0 siblings, 0 replies; 25+ messages in thread
From: Gautham R Shenoy @ 2016-01-28  8:34 UTC (permalink / raw)
  To: Shilpasri G Bhat
  Cc: linuxppc-dev, linux-kernel, rjw, viresh.kumar, linux-pm, pc,
	anton, ego, shreyas, bsingharora, mpe, linux-api

Hi Shilpa,

A minor nit.
On Thu, Jan 28, 2016 at 12:55:41PM +0530, Shilpasri G Bhat wrote:

[..snip..]
> +
> +What:		/sys/devices/system/cpu/cpufreq/chip*/throttle_reasons/<reason_attributes>
> +Date:		Jan 2016
> +Contact:	Linux kernel mailing list <linux-kernel@vger.kernel.org>
> +		Linux for PowerPC mailing list <linuxppc-dev@ozlabs.org>
> +Description:	CPU Frequency throttle reason stat for the chip
> +
> +		This directory contains throttle reason files. Each file gives
> +		the total number of times the max frequency is throttled, except
> +		for 'unthrottle_count', which gives the total number of times
> +		the max frequency is unthrottled after being throttled. Below
> +		are the reason attributes.
> +
> +		cpu_over_temperature: Throttled due to cpu over temperature
> +
> +		occ_reset: Throttled due to reset of OCC
> +
> +		over_current: Throttled due to over current

Overcurrent is a single word. No need of the extra space. You could
fix that and add my Reviewed-by.

--
Thanks and Regards
gautham.

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH v7 6/6] cpufreq: powernv: Add sysfs attributes to show throttle stats
@ 2016-01-28  8:34     ` Gautham R Shenoy
  0 siblings, 0 replies; 25+ messages in thread
From: Gautham R Shenoy @ 2016-01-28  8:34 UTC (permalink / raw)
  To: Shilpasri G Bhat
  Cc: linuxppc-dev-mnsaURCQ41sdnm+yROfE0A,
	linux-kernel-u79uwXL29TY76Z2rM5mHXA, rjw-LthD3rsA81gm4RdzfppkhA,
	viresh.kumar-QSEj5FYQhm4dnm+yROfE0A,
	linux-pm-u79uwXL29TY76Z2rM5mHXA, pc-r/Jw6+rmf7HQT0dZR+AlfA,
	anton-eUNUBHrolfbYtjvyW6yDsg,
	ego-23VcF4HTsmIX0ybBhKVfKdBPR1lH4CV8,
	shreyas-23VcF4HTsmIX0ybBhKVfKdBPR1lH4CV8,
	bsingharora-Re5JQEeQqe8AvxtiuMwx3w, mpe-Gsx/Oe8HsFggBc27wqDAHg,
	linux-api-u79uwXL29TY76Z2rM5mHXA

Hi Shilpa,

A minor nit.
On Thu, Jan 28, 2016 at 12:55:41PM +0530, Shilpasri G Bhat wrote:

[..snip..]
> +
> +What:		/sys/devices/system/cpu/cpufreq/chip*/throttle_reasons/<reason_attributes>
> +Date:		Jan 2016
> +Contact:	Linux kernel mailing list <linux-kernel-u79uwXL29TY76Z2rM5mHXA@public.gmane.org>
> +		Linux for PowerPC mailing list <linuxppc-dev-mnsaURCQ41sdnm+yROfE0A@public.gmane.org>
> +Description:	CPU Frequency throttle reason stat for the chip
> +
> +		This directory contains throttle reason files. Each file gives
> +		the total number of times the max frequency is throttled, except
> +		for 'unthrottle_count', which gives the total number of times
> +		the max frequency is unthrottled after being throttled. Below
> +		are the reason attributes.
> +
> +		cpu_over_temperature: Throttled due to cpu over temperature
> +
> +		occ_reset: Throttled due to reset of OCC
> +
> +		over_current: Throttled due to over current

Overcurrent is a single word. No need of the extra space. You could
fix that and add my Reviewed-by.

--
Thanks and Regards
gautham.

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH v7 6/6] cpufreq: powernv: Add sysfs attributes to show throttle stats
  2016-01-28  7:25 ` [PATCH v7 6/6] cpufreq: powernv: Add sysfs attributes to show throttle stats Shilpasri G Bhat
  2016-01-28  8:34     ` Gautham R Shenoy
@ 2016-01-28  8:40   ` Viresh Kumar
  2016-01-28  9:36     ` Shilpasri G Bhat
  2016-01-28 11:40     ` Viresh Kumar
  2 siblings, 1 reply; 25+ messages in thread
From: Viresh Kumar @ 2016-01-28  8:40 UTC (permalink / raw)
  To: Shilpasri G Bhat
  Cc: linuxppc-dev, linux-kernel, rjw, linux-pm, pc, anton, ego,
	shreyas, bsingharora, mpe, linux-api

On 28-01-16, 12:55, Shilpasri G Bhat wrote:
> diff --git a/Documentation/ABI/testing/sysfs-devices-system-cpu b/Documentation/ABI/testing/sysfs-devices-system-cpu
> index b683e8e..dea4620 100644
> --- a/Documentation/ABI/testing/sysfs-devices-system-cpu
> +++ b/Documentation/ABI/testing/sysfs-devices-system-cpu
> @@ -271,3 +271,48 @@ Description:	Parameters for the CPU cache attributes
>  			- WriteBack: data is written only to the cache line and
>  				     the modified cache line is written to main
>  				     memory only when it is replaced
> +
> +What:		/sys/devices/system/cpu/cpufreq/chip*/throttle_stats

What about the chip directory ? Shouldn't that be documented? And
shouldn't that mention that this is just for powerpc ?

And before that, I don't think that you are doing this properly. I am
sorry that I never came to a point where I could review it, and you
continued with it, version after version.

But, I really have strong objections to the way this is done. And you
are making things more complex then they are.

So, these stats are per-policy, right ?

Then why aren't they added on the policy->kobj instead, just like
cpufreq-stats? And maybe inside cpufreq-stats folder only?

That will solve many complexities you have in place here and will look
sane as well.

Right now, you have stats as two places, cpu/cpufreq/chip/ and
cpu/cpuX/cpufreq/stats/, which doesn't look wise and adds to
confusion.

What do you say?

-- 
viresh

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH v7 6/6] cpufreq: powernv: Add sysfs attributes to show throttle stats
  2016-01-28  8:40   ` Viresh Kumar
@ 2016-01-28  9:36     ` Shilpasri G Bhat
  2016-01-28  9:41         ` Viresh Kumar
  0 siblings, 1 reply; 25+ messages in thread
From: Shilpasri G Bhat @ 2016-01-28  9:36 UTC (permalink / raw)
  To: Viresh Kumar
  Cc: linuxppc-dev, linux-kernel, rjw, linux-pm, pc, anton, ego,
	shreyas, bsingharora, mpe, linux-api

Hi Viresh,

On 01/28/2016 02:10 PM, Viresh Kumar wrote:
> On 28-01-16, 12:55, Shilpasri G Bhat wrote:
>> diff --git a/Documentation/ABI/testing/sysfs-devices-system-cpu b/Documentation/ABI/testing/sysfs-devices-system-cpu
>> index b683e8e..dea4620 100644
>> --- a/Documentation/ABI/testing/sysfs-devices-system-cpu
>> +++ b/Documentation/ABI/testing/sysfs-devices-system-cpu
>> @@ -271,3 +271,48 @@ Description:	Parameters for the CPU cache attributes
>>  			- WriteBack: data is written only to the cache line and
>>  				     the modified cache line is written to main
>>  				     memory only when it is replaced
>> +
>> +What:		/sys/devices/system/cpu/cpufreq/chip*/throttle_stats
> 
> What about the chip directory ? Shouldn't that be documented? And
> shouldn't that mention that this is just for powerpc ?
> 
> And before that, I don't think that you are doing this properly. I am
> sorry that I never came to a point where I could review it, and you
> continued with it, version after version.
> 
> But, I really have strong objections to the way this is done. And you
> are making things more complex then they are.
> 
> So, these stats are per-policy, right ?

First of all sorry about the version log.
No these stats are not per-policy. They are per-chip. The throttle event is
common for all cores in the chip.

> 
> Then why aren't they added on the policy->kobj instead, just like
> cpufreq-stats? And maybe inside cpufreq-stats folder only?
> 
> That will solve many complexities you have in place here and will look
> sane as well.
> 
> Right now, you have stats as two places, cpu/cpufreq/chip/ and
> cpu/cpuX/cpufreq/stats/, which doesn't look wise and adds to
> confusion.
> 
> What do you say?
> 

Yes agree that it will be much cleaner with policy->kobj. But using policy->kobj
will result in multiple copies of the throttle-chip stats exported for each
policy in the chip. And moving it to cpu/cpuX/cpufreq/stats/
will add a dependency on CONFIG_CPU_FREQ_STAT

We want throttle attributes to be either in cpu/cpufreq or cpu/cpuX/cpufreq. If
multiple copies is not an issue, then I will move it to cpu/cpuX/cpufreq.

Thanks and Regards,
Shilpa

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH v7 6/6] cpufreq: powernv: Add sysfs attributes to show throttle stats
@ 2016-01-28  9:41         ` Viresh Kumar
  0 siblings, 0 replies; 25+ messages in thread
From: Viresh Kumar @ 2016-01-28  9:41 UTC (permalink / raw)
  To: Shilpasri G Bhat
  Cc: linuxppc-dev, linux-kernel, rjw, linux-pm, pc, anton, ego,
	shreyas, bsingharora, mpe, linux-api

On 28-01-16, 15:06, Shilpasri G Bhat wrote:
> No these stats are not per-policy. They are per-chip. The throttle event is
> common for all cores in the chip.

How do you define a chip? And how is it different then the group of
CPUs represented by the policy ?

-- 
viresh

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH v7 6/6] cpufreq: powernv: Add sysfs attributes to show throttle stats
@ 2016-01-28  9:41         ` Viresh Kumar
  0 siblings, 0 replies; 25+ messages in thread
From: Viresh Kumar @ 2016-01-28  9:41 UTC (permalink / raw)
  To: Shilpasri G Bhat
  Cc: linuxppc-dev-mnsaURCQ41sdnm+yROfE0A,
	linux-kernel-u79uwXL29TY76Z2rM5mHXA, rjw-LthD3rsA81gm4RdzfppkhA,
	linux-pm-u79uwXL29TY76Z2rM5mHXA, pc-r/Jw6+rmf7HQT0dZR+AlfA,
	anton-eUNUBHrolfbYtjvyW6yDsg,
	ego-23VcF4HTsmIX0ybBhKVfKdBPR1lH4CV8,
	shreyas-23VcF4HTsmIX0ybBhKVfKdBPR1lH4CV8,
	bsingharora-Re5JQEeQqe8AvxtiuMwx3w, mpe-Gsx/Oe8HsFggBc27wqDAHg,
	linux-api-u79uwXL29TY76Z2rM5mHXA

On 28-01-16, 15:06, Shilpasri G Bhat wrote:
> No these stats are not per-policy. They are per-chip. The throttle event is
> common for all cores in the chip.

How do you define a chip? And how is it different then the group of
CPUs represented by the policy ?

-- 
viresh

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH v7 6/6] cpufreq: powernv: Add sysfs attributes to show throttle stats
  2016-01-28  9:41         ` Viresh Kumar
  (?)
@ 2016-01-28  9:54         ` Shilpasri G Bhat
  -1 siblings, 0 replies; 25+ messages in thread
From: Shilpasri G Bhat @ 2016-01-28  9:54 UTC (permalink / raw)
  To: Viresh Kumar
  Cc: linuxppc-dev, linux-kernel, rjw, linux-pm, pc, anton, ego,
	shreyas, bsingharora, mpe, linux-api



On 01/28/2016 03:11 PM, Viresh Kumar wrote:
> On 28-01-16, 15:06, Shilpasri G Bhat wrote:
>> No these stats are not per-policy. They are per-chip. The throttle event is
>> common for all cores in the chip.
> 
> How do you define a chip? And how is it different then the group of
> CPUs represented by the policy ?
> 

Chip is a group of policies.
Hmm yes I see your point. We anyways maintain frequency stats which is
per-policy. We might as well have throttle stats exported per-policy which
points to per-chip data.

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH v7 0/6] cpufreq: powernv: Redesign the presentation of throttle notification and solve bug-fixes in the driver
  2016-01-28  7:25 [PATCH v7 0/6] cpufreq: powernv: Redesign the presentation of throttle notification and solve bug-fixes in the driver Shilpasri G Bhat
                   ` (6 preceding siblings ...)
  2016-01-28  8:25 ` [PATCH v7 0/6] cpufreq: powernv: Redesign the presentation of throttle notification and solve bug-fixes in the driver Viresh Kumar
@ 2016-01-28 11:10 ` Balbir Singh
  7 siblings, 0 replies; 25+ messages in thread
From: Balbir Singh @ 2016-01-28 11:10 UTC (permalink / raw)
  To: Shilpasri G Bhat
  Cc: linuxppc-dev, linux-kernel, rjw, viresh.kumar, linux-pm, pc,
	Anton Blanchard, Gautham R Shenoy, shreyas, mpe

On Thu, Jan 28, 2016 at 6:25 PM, Shilpasri G Bhat
<shilpa.bhat@linux.vnet.ibm.com> wrote:
> In POWER8, OCC(On-Chip-Controller) can throttle the frequency of the
> CPU when the chip crosses its thermal and power limits. Currently,
> powernv-cpufreq driver detects and reports this event as a console
> message. Some machines may not sustain the max turbo frequency in all
> conditions and can be throttled frequently. This can lead to the
> flooding of console with throttle messages. So this patchset aims to
> redesign the presentation of this event via sysfs counters and
> tracepoints. And it also fixes couple of bugs reported in the driver.
>
> - Patch [1] fixes a memory leak bug
> - Patch [2] fixes the cpu hot-plug bug in powernv_cpufreq_work_fn().
> - Patch [3] solves a bug in powernv_cpufreq_throttle_check(), which
>   calls in to cpu_to_chip_id() in hot path which reads DT every time
>   to find the chip id.
> - Patches [4] to [6] will add a perf trace point
>   "power:powernv_throttle" and sysfs throttle counter stats in
>   /sys/devices/system/cpu/cpufreq/chipN.
>

Looks good to me. You've got the reviews and acks you need.

Balbir Singh

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH v7 6/6] cpufreq: powernv: Add sysfs attributes to show throttle stats
@ 2016-01-28 11:40     ` Viresh Kumar
  0 siblings, 0 replies; 25+ messages in thread
From: Viresh Kumar @ 2016-01-28 11:40 UTC (permalink / raw)
  To: Shilpasri G Bhat
  Cc: linuxppc-dev, linux-kernel, rjw, linux-pm, pc, anton, ego,
	shreyas, bsingharora, mpe, linux-api

On 28-01-16, 12:55, Shilpasri G Bhat wrote:
> Create sysfs attributes to export throttle information in
> /sys/devices/system/cpu/cpufreq/chipN. The newly added sysfs files are as
> follows:
> 
> 1)/sys/devices/system/cpu/cpufreq/chip0/throttle_frequencies
>   This gives the throttle stats for each of the available frequencies.
>   The throttle stat of a frequency is the total number of times the max
>   frequency is reduced to that frequency.
>   # cat /sys/devices/system/cpu/cpufreq/chip0/throttle_frequencies
>   4023000 0
>   3990000 0
>   3956000 1
>   3923000 0
>   3890000 0
>   3857000 2
>   3823000 0
>   3790000 0
>   3757000 2
>   3724000 1
>   3690000 1
>   ...
> 
> 2)/sys/devices/system/cpu/cpufreq/chip0/throttle_reasons
>   This directory contains throttle reason files. Each file gives the
>   total number of times the max frequency is throttled, except for
>   'unthrottle_count', which gives the total number of times the max
>   frequency is unthrottled after being throttled.
>   # cd /sys/devices/system/cpu/cpufreq/chip0/throttle_reasons
>   # cat cpu_over_temperature
>   7
>   # cat occ_reset
>   0
>   # cat over_current
>   0
>   # cat power_cap
>   0
>   # cat power_supply_failure
>   0
>   # cat unthrottle_count
>   7

Wouldn't it be better to keep a two dimensional table here, something
like: trans_table ?

You can have "reasons" in the vertical dimension and frequencies in
the horizontal one? So, that user can see which frequencies got
throttled and why?

> diff --git a/Documentation/ABI/testing/sysfs-devices-system-cpu b/Documentation/ABI/testing/sysfs-devices-system-cpu
> index b683e8e..dea4620 100644
> --- a/Documentation/ABI/testing/sysfs-devices-system-cpu
> +++ b/Documentation/ABI/testing/sysfs-devices-system-cpu
> @@ -271,3 +271,48 @@ Description:	Parameters for the CPU cache attributes
>  			- WriteBack: data is written only to the cache line and
>  				     the modified cache line is written to main
>  				     memory only when it is replaced
> +
> +What:		/sys/devices/system/cpu/cpufreq/chip*/throttle_stats

You need documentation for chip*/ as well..

And how can a user know which policies or CPUs belong to a chip?

> diff --git a/drivers/cpufreq/powernv-cpufreq.c b/drivers/cpufreq/powernv-cpufreq.c
>  static struct chip {
>  	unsigned int id;
>  	bool throttled;
> @@ -62,6 +72,11 @@ static struct chip {
>  	u8 throttle_reason;
>  	cpumask_t mask;
>  	struct work_struct throttle;
> +	int throttle_turbo;
> +	int throttle_nominal;
> +	int reason[OCC_MAX_REASON];
> +	int *pstate_stat;
> +	struct kobject *kobj;
>  } *chips;
>  
>  static int nr_chips;
> @@ -196,6 +211,126 @@ static struct freq_attr *powernv_cpu_freq_attr[] = {
>  	NULL,
>  };
>  
> +static inline int get_chip_index(unsigned int id)
> +{
> +	int i;
> +
> +	for (i = 0; i < nr_chips; i++)
> +		if (chips[i].id == id)
> +			return i;
> +
> +	return -EINVAL;
> +}
> +
> +static inline int get_chip_index_from_kobj(struct kobject *kobj)
> +{
> +	int ret, id;
> +	int len = strlen("chip");
> +
> +	ret = kstrtoint(kobj->name + len, 0, &id);

That doesn't look nice. What about keeping the kobject in the 'struct
chip' and using container of here?

You can register the kobject with: kobject_init_and_add().

> +	if (ret)
> +		return ret;
> +
> +	ret = get_chip_index(id);
> +	if (ret < 0)
> +		pr_warn_once("%s Matching chip-id not found %d\n", __func__,
> +			     id);
> +	return ret;
> +}
> +
> +static ssize_t throttle_freq_show(struct kobject *kobj,
> +				  struct kobj_attribute *attr, char *buf)
> +{
> +	int i, count = 0, id;
> +
> +	id = get_chip_index_from_kobj(kobj);
> +	if (id < 0)
> +		return id;
> +
> +	for (i = 0; i < powernv_pstate_info.nr_pstates; i++)
> +		count += sprintf(&buf[count], "%d %d\n",
> +			       powernv_freqs[i].frequency,
> +			       chips[id].pstate_stat[i]);
> +
> +	return count;
> +}
> +
> +static struct kobj_attribute attr_throttle_frequencies =
> +__ATTR(throttle_frequencies, 0444, throttle_freq_show, NULL);

Use DEVICE_ATTR_RO macro ?

> @@ -583,12 +736,38 @@ static int init_chip_info(void)
>  		goto free_chip_map;
>  
>  	for (i = 0; i < nr_chips; i++) {
> +		char name[10];
> +
>  		chips[i].id = chip[i];
>  		cpumask_copy(&chips[i].mask, cpumask_of_node(chip[i]));
>  		INIT_WORK(&chips[i].throttle, powernv_cpufreq_work_fn);
> +		chips[i].pstate_stat = kcalloc(powernv_pstate_info.nr_pstates,
> +						sizeof(int), GFP_KERNEL);
> +		if (!chips[i].pstate_stat)
> +			goto free;
> +
> +		sprintf(name, "chip%d", chips[i].id);

Always use snprintf ..

> +		chips[i].kobj = kobject_create_and_add(name,
> +						       cpufreq_global_kobject);
> +		if (!chips[i].kobj)
> +			goto free;
> +
> +		ret = sysfs_create_groups(chips[i].kobj, throttle_attr_groups);
> +		if (ret) {
> +			pr_info("Chip %d failed to create throttle sysfs group\n",
> +				chips[i].id);
> +			goto free;
> +		}
>  	}
>  
>  	return 0;
> +free:
> +	nr_chips = i;
> +	for (i = 0; i <= nr_chips; i++) {

Rather do:

        while (--i >= 0) {
                kobject_put(chips[i].kobj);
                kfree(chips[i].pstate_stat);
        }

> +		kobject_put(chips[i].kobj);
> +		kfree(chips[i].pstate_stat);

You haven't removed the sysfs-group. Also make sure you aren't trying
to free/remove resources you haven't allocated, as the earlier part
can fail after 3 different allocations and you are doing same thing
here for all of them.

> +	}
> +	kfree(chips);
>  free_chip_map:
>  	kfree(core_to_chip_map);
>  out:
> @@ -623,9 +802,17 @@ module_init(powernv_cpufreq_init);
>  
>  static void __exit powernv_cpufreq_exit(void)
>  {
> +	int i;
> +
>  	unregister_reboot_notifier(&powernv_cpufreq_reboot_nb);
>  	opal_message_notifier_unregister(OPAL_MSG_OCC,
>  					 &powernv_cpufreq_opal_nb);
> +
> +	for (i = 0; i < nr_chips; i++) {
> +		kobject_put(chips[i].kobj);

remove groups ?

> +		kfree(chips[i].pstate_stat);
> +	}
> +
>  	kfree(chips);
>  	kfree(core_to_chip_map);
>  	cpufreq_unregister_driver(&powernv_cpufreq_driver);
> -- 
> 1.9.3

-- 
viresh

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH v7 6/6] cpufreq: powernv: Add sysfs attributes to show throttle stats
@ 2016-01-28 11:40     ` Viresh Kumar
  0 siblings, 0 replies; 25+ messages in thread
From: Viresh Kumar @ 2016-01-28 11:40 UTC (permalink / raw)
  To: Shilpasri G Bhat
  Cc: linuxppc-dev-mnsaURCQ41sdnm+yROfE0A,
	linux-kernel-u79uwXL29TY76Z2rM5mHXA, rjw-LthD3rsA81gm4RdzfppkhA,
	linux-pm-u79uwXL29TY76Z2rM5mHXA, pc-r/Jw6+rmf7HQT0dZR+AlfA,
	anton-eUNUBHrolfbYtjvyW6yDsg,
	ego-23VcF4HTsmIX0ybBhKVfKdBPR1lH4CV8,
	shreyas-23VcF4HTsmIX0ybBhKVfKdBPR1lH4CV8,
	bsingharora-Re5JQEeQqe8AvxtiuMwx3w, mpe-Gsx/Oe8HsFggBc27wqDAHg,
	linux-api-u79uwXL29TY76Z2rM5mHXA

On 28-01-16, 12:55, Shilpasri G Bhat wrote:
> Create sysfs attributes to export throttle information in
> /sys/devices/system/cpu/cpufreq/chipN. The newly added sysfs files are as
> follows:
> 
> 1)/sys/devices/system/cpu/cpufreq/chip0/throttle_frequencies
>   This gives the throttle stats for each of the available frequencies.
>   The throttle stat of a frequency is the total number of times the max
>   frequency is reduced to that frequency.
>   # cat /sys/devices/system/cpu/cpufreq/chip0/throttle_frequencies
>   4023000 0
>   3990000 0
>   3956000 1
>   3923000 0
>   3890000 0
>   3857000 2
>   3823000 0
>   3790000 0
>   3757000 2
>   3724000 1
>   3690000 1
>   ...
> 
> 2)/sys/devices/system/cpu/cpufreq/chip0/throttle_reasons
>   This directory contains throttle reason files. Each file gives the
>   total number of times the max frequency is throttled, except for
>   'unthrottle_count', which gives the total number of times the max
>   frequency is unthrottled after being throttled.
>   # cd /sys/devices/system/cpu/cpufreq/chip0/throttle_reasons
>   # cat cpu_over_temperature
>   7
>   # cat occ_reset
>   0
>   # cat over_current
>   0
>   # cat power_cap
>   0
>   # cat power_supply_failure
>   0
>   # cat unthrottle_count
>   7

Wouldn't it be better to keep a two dimensional table here, something
like: trans_table ?

You can have "reasons" in the vertical dimension and frequencies in
the horizontal one? So, that user can see which frequencies got
throttled and why?

> diff --git a/Documentation/ABI/testing/sysfs-devices-system-cpu b/Documentation/ABI/testing/sysfs-devices-system-cpu
> index b683e8e..dea4620 100644
> --- a/Documentation/ABI/testing/sysfs-devices-system-cpu
> +++ b/Documentation/ABI/testing/sysfs-devices-system-cpu
> @@ -271,3 +271,48 @@ Description:	Parameters for the CPU cache attributes
>  			- WriteBack: data is written only to the cache line and
>  				     the modified cache line is written to main
>  				     memory only when it is replaced
> +
> +What:		/sys/devices/system/cpu/cpufreq/chip*/throttle_stats

You need documentation for chip*/ as well..

And how can a user know which policies or CPUs belong to a chip?

> diff --git a/drivers/cpufreq/powernv-cpufreq.c b/drivers/cpufreq/powernv-cpufreq.c
>  static struct chip {
>  	unsigned int id;
>  	bool throttled;
> @@ -62,6 +72,11 @@ static struct chip {
>  	u8 throttle_reason;
>  	cpumask_t mask;
>  	struct work_struct throttle;
> +	int throttle_turbo;
> +	int throttle_nominal;
> +	int reason[OCC_MAX_REASON];
> +	int *pstate_stat;
> +	struct kobject *kobj;
>  } *chips;
>  
>  static int nr_chips;
> @@ -196,6 +211,126 @@ static struct freq_attr *powernv_cpu_freq_attr[] = {
>  	NULL,
>  };
>  
> +static inline int get_chip_index(unsigned int id)
> +{
> +	int i;
> +
> +	for (i = 0; i < nr_chips; i++)
> +		if (chips[i].id == id)
> +			return i;
> +
> +	return -EINVAL;
> +}
> +
> +static inline int get_chip_index_from_kobj(struct kobject *kobj)
> +{
> +	int ret, id;
> +	int len = strlen("chip");
> +
> +	ret = kstrtoint(kobj->name + len, 0, &id);

That doesn't look nice. What about keeping the kobject in the 'struct
chip' and using container of here?

You can register the kobject with: kobject_init_and_add().

> +	if (ret)
> +		return ret;
> +
> +	ret = get_chip_index(id);
> +	if (ret < 0)
> +		pr_warn_once("%s Matching chip-id not found %d\n", __func__,
> +			     id);
> +	return ret;
> +}
> +
> +static ssize_t throttle_freq_show(struct kobject *kobj,
> +				  struct kobj_attribute *attr, char *buf)
> +{
> +	int i, count = 0, id;
> +
> +	id = get_chip_index_from_kobj(kobj);
> +	if (id < 0)
> +		return id;
> +
> +	for (i = 0; i < powernv_pstate_info.nr_pstates; i++)
> +		count += sprintf(&buf[count], "%d %d\n",
> +			       powernv_freqs[i].frequency,
> +			       chips[id].pstate_stat[i]);
> +
> +	return count;
> +}
> +
> +static struct kobj_attribute attr_throttle_frequencies =
> +__ATTR(throttle_frequencies, 0444, throttle_freq_show, NULL);

Use DEVICE_ATTR_RO macro ?

> @@ -583,12 +736,38 @@ static int init_chip_info(void)
>  		goto free_chip_map;
>  
>  	for (i = 0; i < nr_chips; i++) {
> +		char name[10];
> +
>  		chips[i].id = chip[i];
>  		cpumask_copy(&chips[i].mask, cpumask_of_node(chip[i]));
>  		INIT_WORK(&chips[i].throttle, powernv_cpufreq_work_fn);
> +		chips[i].pstate_stat = kcalloc(powernv_pstate_info.nr_pstates,
> +						sizeof(int), GFP_KERNEL);
> +		if (!chips[i].pstate_stat)
> +			goto free;
> +
> +		sprintf(name, "chip%d", chips[i].id);

Always use snprintf ..

> +		chips[i].kobj = kobject_create_and_add(name,
> +						       cpufreq_global_kobject);
> +		if (!chips[i].kobj)
> +			goto free;
> +
> +		ret = sysfs_create_groups(chips[i].kobj, throttle_attr_groups);
> +		if (ret) {
> +			pr_info("Chip %d failed to create throttle sysfs group\n",
> +				chips[i].id);
> +			goto free;
> +		}
>  	}
>  
>  	return 0;
> +free:
> +	nr_chips = i;
> +	for (i = 0; i <= nr_chips; i++) {

Rather do:

        while (--i >= 0) {
                kobject_put(chips[i].kobj);
                kfree(chips[i].pstate_stat);
        }

> +		kobject_put(chips[i].kobj);
> +		kfree(chips[i].pstate_stat);

You haven't removed the sysfs-group. Also make sure you aren't trying
to free/remove resources you haven't allocated, as the earlier part
can fail after 3 different allocations and you are doing same thing
here for all of them.

> +	}
> +	kfree(chips);
>  free_chip_map:
>  	kfree(core_to_chip_map);
>  out:
> @@ -623,9 +802,17 @@ module_init(powernv_cpufreq_init);
>  
>  static void __exit powernv_cpufreq_exit(void)
>  {
> +	int i;
> +
>  	unregister_reboot_notifier(&powernv_cpufreq_reboot_nb);
>  	opal_message_notifier_unregister(OPAL_MSG_OCC,
>  					 &powernv_cpufreq_opal_nb);
> +
> +	for (i = 0; i < nr_chips; i++) {
> +		kobject_put(chips[i].kobj);

remove groups ?

> +		kfree(chips[i].pstate_stat);
> +	}
> +
>  	kfree(chips);
>  	kfree(core_to_chip_map);
>  	cpufreq_unregister_driver(&powernv_cpufreq_driver);
> -- 
> 1.9.3

-- 
viresh

^ permalink raw reply	[flat|nested] 25+ messages in thread

end of thread, other threads:[~2016-01-28 11:41 UTC | newest]

Thread overview: 25+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2016-01-28  7:25 [PATCH v7 0/6] cpufreq: powernv: Redesign the presentation of throttle notification and solve bug-fixes in the driver Shilpasri G Bhat
2016-01-28  7:25 ` [PATCH v7 1/6] cpufreq: powernv: Free 'chips' on module exit Shilpasri G Bhat
2016-01-28  8:00   ` Gautham R Shenoy
2016-01-28  8:27   ` Viresh Kumar
2016-01-28  7:25 ` [PATCH v7 2/6] cpufreq: powernv: Hot-plug safe the kworker thread Shilpasri G Bhat
2016-01-28  7:25 ` [PATCH v7 3/6] cpufreq: powernv: Remove cpu_to_chip_id() from hot-path Shilpasri G Bhat
2016-01-28  8:11   ` Gautham R Shenoy
2016-01-28  8:11     ` Gautham R Shenoy
2016-01-28  8:28   ` Viresh Kumar
2016-01-28  7:25 ` [PATCH v7 4/6] cpufreq: powernv/tracing: Add powernv_throttle tracepoint Shilpasri G Bhat
2016-01-28  7:25 ` [PATCH v7 5/6] cpufreq: powernv: Replace pr_info with trace print for throttle event Shilpasri G Bhat
2016-01-28  8:18   ` Gautham R Shenoy
2016-01-28  8:29   ` Viresh Kumar
2016-01-28  7:25 ` [PATCH v7 6/6] cpufreq: powernv: Add sysfs attributes to show throttle stats Shilpasri G Bhat
2016-01-28  8:34   ` Gautham R Shenoy
2016-01-28  8:34     ` Gautham R Shenoy
2016-01-28  8:40   ` Viresh Kumar
2016-01-28  9:36     ` Shilpasri G Bhat
2016-01-28  9:41       ` Viresh Kumar
2016-01-28  9:41         ` Viresh Kumar
2016-01-28  9:54         ` Shilpasri G Bhat
2016-01-28 11:40   ` Viresh Kumar
2016-01-28 11:40     ` Viresh Kumar
2016-01-28  8:25 ` [PATCH v7 0/6] cpufreq: powernv: Redesign the presentation of throttle notification and solve bug-fixes in the driver Viresh Kumar
2016-01-28 11:10 ` Balbir Singh

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.