linux-pm.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [RESEND][PATCH] cpufreq: stats: Add 'load_table' sysfs file to show accumulated data of CPU
@ 2013-06-05  8:11 Chanwoo Choi
  2013-06-07 10:23 ` Viresh Kumar
  2013-06-11 22:14 ` Rafael J. Wysocki
  0 siblings, 2 replies; 15+ messages in thread
From: Chanwoo Choi @ 2013-06-05  8:11 UTC (permalink / raw)
  To: rjw, viresh.kumar
  Cc: linux-kernel, linux-pm, cpufreq, kyungmin.park, myungjoo.ham, cw00.choi

This patch add new sysfs file to show previous accumulated data of CPU load
as following path. This sysfs file is used to judge the correct system state
or determine suitable system resource on user-space.
- /sys/devices/system/cpu/cpu0/cpufreq/stats/load_table

This sysfs file include following data:
- Measurement point of time
- CPU frequency
- Per-CPU load

Signed-off-by: Chanwoo Choi <cw00.choi@samsung.com>
Signed-off-by: Myungjoo Ham <myungjoo.ham@samsung.com>
Signed-off-by: Kyungmin Park <kyungmin.park@samsung.com>
---

Additionally, I explain an example using 'load_table' sysfs entry.

Exynos4412 series has Quad-core and all cores share the power-line.
I cann't set diffent voltage/frequency to each CPU. To reduce power-
consumption, I certainly have to turn on/off CPU online state
according to CPU load on runtime. As a result, I peridically need to
monitor current cpu state to determine a proper amount of system
resource(necessary number of online cpu) and to delete wasted power.
So, I need 'load_table' sysfs file to monitor current cpu state.

I add a table which show power consumption of target based on
Exynos4412 SoC. This table indicate the difference power-consumption
according to a number of online core and with same number of running task.

[Environment of power estimation]
- cpufreq governor used performance mode to estimate power-consumption on each frequency step.
- Use infinite-loop test program which execute while statement infinitely.
- Always measure the power consumption under same temperature during 1 minutes.
- Unit is mA.
------------------------------------------------------------------------------------------------------------------------------------
A number of Online core	| Core 1	| Core 2		| Core 3			| Core 4
A number of nr_running	| 0	1	| 0	1	2	| 0	1	2	3	| 0	1	2	3	4
------------------------------------------------------------------------------------------------------------------------------------
 CPU Frequency		| 
 800  MHz		| 293 	330 	| 295 	338 	379 	| 300 	339 	386 	433 	| 303 	341 	390 	412 	482 
 1000 MHz		| 312 	371 	| 316 	377 	435 	| 318 	383 	454 	522 	| 322 	391 	462 	526 	596 
 1200 MHz		| 323 	404 	| 328 	418 	504 	| 336 	423 	521 	615 	| 343 	433 	499 	639 	748 
 1600 MHz 		| 380 	525 	| 388 	556 	771 	| 399 	575 	771 	1011 	| 412 	597 	822 	1172 	1684 
------------------------------------------------------------------------------------------------------------------------------------

For example,
The case A/B/C have the same condition except for a number of online core.
- case A: Online core is 2, 1000MHz and nr_running is 1	: 377mA
- case B: Online core is 3, 1000MHz and nr_running is 1	: 383mA
- case C: Online core is 4, 1000Mz and nr_running is 1	: 391mA

If system has only one running task, cpu hotplug policy, by monitoring
cpu state through 'load_table' sysfs file on user-space,
will determine 'case A' state for reducing power-consumption.

Show the result of reading 'load_table sysfs file as following:
- cpufreq governor is ondemand_org governor.

$ cat /sys/devices/system/cpu/cpu0/cpufreq/stats/load_table
       Time  Frequency cpu0 cpu1 cpu2 cpu3
1300500043122   1600000   32    6    0   26
1300600079080    800000   63   10    2   45
1300700065288    800000   51    9    1   42
1300800228747    800000   51    9    1   43
1300900182997    800000   78   11    3   47
1301000106163    800000   96   26    6   48
1301100056247   1600000   45    7    1   27
1301200071373   1000000   55    9    1   37
1301300096082    800000   54   10    0   45
1301400132832    800000   70   11    2   46
1301500082290    800000   61   11    1   43
1301600236415    800000   61    9    2   43
1301700071498    800000   59   10    2   43
1301800159290    800000   55    9    1   42
1301900076332    800000   66   10    2   43
1302000102165    800000   47    9    0   43
1302100086623    800000   75   11    2   50
1302200101082    800000   66   10    4   46
1302300108873    800000   53   10    1   44
1302400373373    600000   63   12    1   54

Please let me know some opinion about this patch.

Best regards and Thanks,
Chanwoo Choi

---
 drivers/cpufreq/cpufreq.c          |  4 +++
 drivers/cpufreq/cpufreq_governor.c | 21 ++++++++++--
 drivers/cpufreq/cpufreq_stats.c    | 70 ++++++++++++++++++++++++++++++++++++++
 include/linux/cpufreq.h            |  6 ++++
 4 files changed, 99 insertions(+), 2 deletions(-)

diff --git a/drivers/cpufreq/cpufreq.c b/drivers/cpufreq/cpufreq.c
index 2d53f47..cbaaff0 100644
--- a/drivers/cpufreq/cpufreq.c
+++ b/drivers/cpufreq/cpufreq.c
@@ -292,6 +292,10 @@ void __cpufreq_notify_transition(struct cpufreq_policy *policy,
 		if (likely(policy) && likely(policy->cpu == freqs->cpu))
 			policy->cur = freqs->new;
 		break;
+	case CPUFREQ_LOADCHECK:
+		srcu_notifier_call_chain(&cpufreq_transition_notifier_list,
+				CPUFREQ_LOADCHECK, freqs);
+		break;
 	}
 }
 /**
diff --git a/drivers/cpufreq/cpufreq_governor.c b/drivers/cpufreq/cpufreq_governor.c
index 5af40ad..bc50624 100644
--- a/drivers/cpufreq/cpufreq_governor.c
+++ b/drivers/cpufreq/cpufreq_governor.c
@@ -23,12 +23,17 @@
 #include <linux/kernel_stat.h>
 #include <linux/mutex.h>
 #include <linux/slab.h>
+#include <linux/sched.h>
 #include <linux/tick.h>
 #include <linux/types.h>
 #include <linux/workqueue.h>
 
 #include "cpufreq_governor.h"
 
+#ifdef CONFIG_CPU_FREQ_STAT_DETAILS
+	struct cpufreq_freqs freqs;
+#endif
+
 static struct kobject *get_governor_parent_kobj(struct cpufreq_policy *policy)
 {
 	if (have_governor_per_policy())
@@ -143,11 +148,17 @@ void dbs_check_cpu(struct dbs_data *dbs_data, int cpu)
 			idle_time += jiffies_to_usecs(cur_nice_jiffies);
 		}
 
-		if (unlikely(!wall_time || wall_time < idle_time))
+		if (unlikely(!wall_time || wall_time < idle_time)) {
+#ifdef CONFIG_CPU_FREQ_STAT_DETAILS
+			freqs.load[j] = 0;
+#endif
 			continue;
+		}
 
 		load = 100 * (wall_time - idle_time) / wall_time;
-
+#ifdef CONFIG_CPU_FREQ_STAT_DETAILS
+		freqs.load[j] = load;
+#endif
 		if (dbs_data->cdata->governor == GOV_ONDEMAND) {
 			int freq_avg = __cpufreq_driver_getavg(policy, j);
 			if (freq_avg <= 0)
@@ -160,6 +171,12 @@ void dbs_check_cpu(struct dbs_data *dbs_data, int cpu)
 			max_load = load;
 	}
 
+#ifdef CONFIG_CPU_FREQ_STAT_DETAILS
+	freqs.time = ktime_to_ns(ktime_get());
+	freqs.old = freqs.new = policy->cur;
+
+	cpufreq_notify_transition(policy, &freqs, CPUFREQ_LOADCHECK);
+#endif
 	dbs_data->cdata->gov_check_cpu(cpu, max_load);
 }
 EXPORT_SYMBOL_GPL(dbs_check_cpu);
diff --git a/drivers/cpufreq/cpufreq_stats.c b/drivers/cpufreq/cpufreq_stats.c
index fb65dec..2379b1d 100644
--- a/drivers/cpufreq/cpufreq_stats.c
+++ b/drivers/cpufreq/cpufreq_stats.c
@@ -22,6 +22,8 @@
 #include <linux/notifier.h>
 #include <asm/cputime.h>
 
+#define CPUFREQ_LOAD_TABLE_MAX		20
+
 static spinlock_t cpufreq_stats_lock;
 
 struct cpufreq_stats {
@@ -35,6 +37,10 @@ struct cpufreq_stats {
 	unsigned int *freq_table;
 #ifdef CONFIG_CPU_FREQ_STAT_DETAILS
 	unsigned int *trans_table;
+
+	struct cpufreq_freqs *load_table;
+	unsigned int load_last_index;
+	unsigned int load_max_index;
 #endif
 };
 
@@ -131,6 +137,38 @@ static ssize_t show_trans_table(struct cpufreq_policy *policy, char *buf)
 	return len;
 }
 cpufreq_freq_attr_ro(trans_table);
+
+static ssize_t show_load_table(struct cpufreq_policy *policy, char *buf)
+{
+	struct cpufreq_stats *stat = per_cpu(cpufreq_stats_table, policy->cpu);
+	struct cpufreq_freqs *load_table = stat->load_table;
+	ssize_t len = 0;
+	int i;
+	int j;
+
+	len += sprintf(buf + len, "%11s %10s", "Time", "Frequency");
+	for (j = 0; j < NR_CPUS; j++)
+		len += sprintf(buf + len, " %3s%d", "cpu", j);
+	len += sprintf(buf + len, "\n");
+
+	i = stat->load_last_index;
+	do {
+		len += sprintf(buf + len, "%lld %9d",
+				load_table[i].time,
+				load_table[i].old);
+
+		for (j = 0; j < NR_CPUS; j++)
+			len += sprintf(buf + len, " %4d",
+					load_table[i].load[j]);
+		len += sprintf(buf + len, "\n");
+
+		if (++i == stat->load_max_index)
+			i = 0;
+	} while (i != stat->load_last_index);
+
+	return len;
+}
+cpufreq_freq_attr_ro(load_table);
 #endif
 
 cpufreq_freq_attr_ro(total_trans);
@@ -141,6 +179,7 @@ static struct attribute *default_attrs[] = {
 	&time_in_state.attr,
 #ifdef CONFIG_CPU_FREQ_STAT_DETAILS
 	&trans_table.attr,
+	&load_table.attr,
 #endif
 	NULL
 };
@@ -167,6 +206,9 @@ static void cpufreq_stats_free_table(unsigned int cpu)
 
 	if (stat) {
 		pr_debug("%s: Free stat table\n", __func__);
+#ifdef CONFIG_CPU_FREQ_STAT_DETAILS
+		kfree(stat->load_table);
+#endif
 		kfree(stat->time_in_state);
 		kfree(stat);
 		per_cpu(cpufreq_stats_table, cpu) = NULL;
@@ -244,6 +286,16 @@ static int cpufreq_stats_create_table(struct cpufreq_policy *policy,
 
 #ifdef CONFIG_CPU_FREQ_STAT_DETAILS
 	stat->trans_table = stat->freq_table + count;
+
+	stat->load_max_index = CPUFREQ_LOAD_TABLE_MAX;
+	stat->load_last_index = 0;
+
+	alloc_size = sizeof(struct cpufreq_freqs) * stat->load_max_index;
+	stat->load_table = kzalloc(alloc_size, GFP_KERNEL);
+	if (!stat->load_table) {
+		ret = -ENOMEM;
+		goto error_out;
+	}
 #endif
 	j = 0;
 	for (i = 0; table[i].frequency != CPUFREQ_TABLE_END; i++) {
@@ -312,13 +364,31 @@ static int cpufreq_stat_notifier_trans(struct notifier_block *nb,
 	struct cpufreq_stats *stat;
 	int old_index, new_index;
 
+#ifdef CONFIG_CPU_FREQ_STAT_DETAILS
+	if (val != CPUFREQ_POSTCHANGE && val != CPUFREQ_LOADCHECK)
+#else
 	if (val != CPUFREQ_POSTCHANGE)
+#endif
 		return 0;
 
 	stat = per_cpu(cpufreq_stats_table, freq->cpu);
 	if (!stat)
 		return 0;
 
+#ifdef CONFIG_CPU_FREQ_STAT_DETAILS
+	if (val == CPUFREQ_LOADCHECK) {
+		struct cpufreq_freqs *dest_freq;
+
+		dest_freq = &stat->load_table[stat->load_last_index];
+		memcpy(dest_freq, freq, sizeof(struct cpufreq_freqs));
+
+		if (++stat->load_last_index == stat->load_max_index)
+			stat->load_last_index = 0;
+
+		return 0;
+	}
+#endif
+
 	old_index = stat->last_index;
 	new_index = freq_table_get_index(stat, freq->new);
 
diff --git a/include/linux/cpufreq.h b/include/linux/cpufreq.h
index 037d36a..f780645 100644
--- a/include/linux/cpufreq.h
+++ b/include/linux/cpufreq.h
@@ -140,12 +140,18 @@ static inline bool policy_is_shared(struct cpufreq_policy *policy)
 #define CPUFREQ_POSTCHANGE	(1)
 #define CPUFREQ_RESUMECHANGE	(8)
 #define CPUFREQ_SUSPENDCHANGE	(9)
+#define CPUFREQ_LOADCHECK	(10)
 
 struct cpufreq_freqs {
 	unsigned int cpu;	/* cpu nr */
 	unsigned int old;
 	unsigned int new;
 	u8 flags;		/* flags of cpufreq_driver, see below. */
+
+#ifdef CONFIG_CPU_FREQ_STAT_DETAILS
+	int64_t time;
+	unsigned int load[NR_CPUS];
+#endif
 };
 
 
-- 
1.8.0


^ permalink raw reply related	[flat|nested] 15+ messages in thread

* Re: [RESEND][PATCH] cpufreq: stats: Add 'load_table' sysfs file to show accumulated data of CPU
  2013-06-05  8:11 [RESEND][PATCH] cpufreq: stats: Add 'load_table' sysfs file to show accumulated data of CPU Chanwoo Choi
@ 2013-06-07 10:23 ` Viresh Kumar
  2013-06-10 12:13   ` Chanwoo Choi
  2013-06-11 22:14 ` Rafael J. Wysocki
  1 sibling, 1 reply; 15+ messages in thread
From: Viresh Kumar @ 2013-06-07 10:23 UTC (permalink / raw)
  To: Chanwoo Choi
  Cc: rjw, linux-kernel, linux-pm, cpufreq, kyungmin.park,
	myungjoo.ham, Lists linaro-kernel

Hi Chanwoo,

On 5 June 2013 13:41, Chanwoo Choi <cw00.choi@samsung.com> wrote:
> This patch add new sysfs file to show previous accumulated data of CPU load

Please mention we are only accumulating latest 20 values.

> as following path. This sysfs file is used to judge the correct system state
> or determine suitable system resource on user-space.

Please write it as:

load_table will be used to judge how many cpus would be sufficient for
managing current load.

> - /sys/devices/system/cpu/cpu0/cpufreq/stats/load_table
>
> This sysfs file include following data:
> - Measurement point of time
> - CPU frequency
> - Per-CPU load
>
> Signed-off-by: Chanwoo Choi <cw00.choi@samsung.com>
> Signed-off-by: Myungjoo Ham <myungjoo.ham@samsung.com>
> Signed-off-by: Kyungmin Park <kyungmin.park@samsung.com>
> ---
>
> Additionally, I explain an example using 'load_table' sysfs entry.
>
> Exynos4412 series has Quad-core and all cores share the power-line.
> I cann't set diffent voltage/frequency to each CPU. To reduce power-
> consumption, I certainly have to turn on/off CPU online state
> according to CPU load on runtime. As a result, I peridically need to
> monitor current cpu state to determine a proper amount of system
> resource(necessary number of online cpu) and to delete wasted power.
> So, I need 'load_table' sysfs file to monitor current cpu state.
>
> I add a table which show power consumption of target based on
> Exynos4412 SoC. This table indicate the difference power-consumption
> according to a number of online core and with same number of running task.
>
> [Environment of power estimation]
> - cpufreq governor used performance mode to estimate power-consumption on each frequency step.
> - Use infinite-loop test program which execute while statement infinitely.
> - Always measure the power consumption under same temperature during 1 minutes.
> - Unit is mA.
> ------------------------------------------------------------------------------------------------------------------------------------
> A number of Online core | Core 1        | Core 2                | Core 3                        | Core 4
> A number of nr_running  | 0     1       | 0     1       2       | 0     1       2       3       | 0     1       2       3       4
> ------------------------------------------------------------------------------------------------------------------------------------
>  CPU Frequency          |
>  800  MHz               | 293   330     | 295   338     379     | 300   339     386     433     | 303   341     390     412     482
>  1000 MHz               | 312   371     | 316   377     435     | 318   383     454     522     | 322   391     462     526     596
>  1200 MHz               | 323   404     | 328   418     504     | 336   423     521     615     | 343   433     499     639     748
>  1600 MHz               | 380   525     | 388   556     771     | 399   575     771     1011    | 412   597     822     1172    1684
> ------------------------------------------------------------------------------------------------------------------------------------
>
> For example,
> The case A/B/C have the same condition except for a number of online core.
> - case A: Online core is 2, 1000MHz and nr_running is 1 : 377mA
> - case B: Online core is 3, 1000MHz and nr_running is 1 : 383mA
> - case C: Online core is 4, 1000Mz and nr_running is 1  : 391mA
>
> If system has only one running task, cpu hotplug policy, by monitoring
> cpu state through 'load_table' sysfs file on user-space,
> will determine 'case A' state for reducing power-consumption.
>
> Show the result of reading 'load_table sysfs file as following:
> - cpufreq governor is ondemand_org governor.
>
> $ cat /sys/devices/system/cpu/cpu0/cpufreq/stats/load_table
>        Time  Frequency cpu0 cpu1 cpu2 cpu3
> 1300500043122   1600000   32    6    0   26
> 1300600079080    800000   63   10    2   45
> 1300700065288    800000   51    9    1   42
> 1300800228747    800000   51    9    1   43
> 1300900182997    800000   78   11    3   47
> 1301000106163    800000   96   26    6   48
> 1301100056247   1600000   45    7    1   27
> 1301200071373   1000000   55    9    1   37
> 1301300096082    800000   54   10    0   45
> 1301400132832    800000   70   11    2   46
> 1301500082290    800000   61   11    1   43
> 1301600236415    800000   61    9    2   43
> 1301700071498    800000   59   10    2   43
> 1301800159290    800000   55    9    1   42
> 1301900076332    800000   66   10    2   43
> 1302000102165    800000   47    9    0   43
> 1302100086623    800000   75   11    2   50
> 1302200101082    800000   66   10    4   46
> 1302300108873    800000   53   10    1   44
> 1302400373373    600000   63   12    1   54

How are you getting loads different for all your cpus? I believe you
are just recording these values for policy->cpu and all cpus share
same policy on your platform.

> Please let me know some opinion about this patch.
>
> Best regards and Thanks,
> Chanwoo Choi
>
> ---
>  drivers/cpufreq/cpufreq.c          |  4 +++
>  drivers/cpufreq/cpufreq_governor.c | 21 ++++++++++--
>  drivers/cpufreq/cpufreq_stats.c    | 70 ++++++++++++++++++++++++++++++++++++++
>  include/linux/cpufreq.h            |  6 ++++
>  4 files changed, 99 insertions(+), 2 deletions(-)
>
> diff --git a/drivers/cpufreq/cpufreq.c b/drivers/cpufreq/cpufreq.c
> index 2d53f47..cbaaff0 100644
> --- a/drivers/cpufreq/cpufreq.c
> +++ b/drivers/cpufreq/cpufreq.c
> @@ -292,6 +292,10 @@ void __cpufreq_notify_transition(struct cpufreq_policy *policy,
>                 if (likely(policy) && likely(policy->cpu == freqs->cpu))
>                         policy->cur = freqs->new;
>                 break;
> +       case CPUFREQ_LOADCHECK:
> +               srcu_notifier_call_chain(&cpufreq_transition_notifier_list,
> +                               CPUFREQ_LOADCHECK, freqs);
> +               break;
>         }
>  }
>  /**
> diff --git a/drivers/cpufreq/cpufreq_governor.c b/drivers/cpufreq/cpufreq_governor.c
> index 5af40ad..bc50624 100644
> --- a/drivers/cpufreq/cpufreq_governor.c
> +++ b/drivers/cpufreq/cpufreq_governor.c
> @@ -23,12 +23,17 @@
>  #include <linux/kernel_stat.h>
>  #include <linux/mutex.h>
>  #include <linux/slab.h>
> +#include <linux/sched.h>
>  #include <linux/tick.h>
>  #include <linux/types.h>
>  #include <linux/workqueue.h>
>
>  #include "cpufreq_governor.h"
>
> +#ifdef CONFIG_CPU_FREQ_STAT_DETAILS
> +       struct cpufreq_freqs freqs;
> +#endif

Why do you need this to be global?

>  static struct kobject *get_governor_parent_kobj(struct cpufreq_policy *policy)
>  {
>         if (have_governor_per_policy())
> @@ -143,11 +148,17 @@ void dbs_check_cpu(struct dbs_data *dbs_data, int cpu)
>                         idle_time += jiffies_to_usecs(cur_nice_jiffies);
>                 }
>
> -               if (unlikely(!wall_time || wall_time < idle_time))
> +               if (unlikely(!wall_time || wall_time < idle_time)) {
> +#ifdef CONFIG_CPU_FREQ_STAT_DETAILS
> +                       freqs.load[j] = 0;
> +#endif
>                         continue;
> +               }
>
>                 load = 100 * (wall_time - idle_time) / wall_time;
> -
> +#ifdef CONFIG_CPU_FREQ_STAT_DETAILS
> +               freqs.load[j] = load;
> +#endif
>                 if (dbs_data->cdata->governor == GOV_ONDEMAND) {
>                         int freq_avg = __cpufreq_driver_getavg(policy, j);
>                         if (freq_avg <= 0)
> @@ -160,6 +171,12 @@ void dbs_check_cpu(struct dbs_data *dbs_data, int cpu)
>                         max_load = load;
>         }
>
> +#ifdef CONFIG_CPU_FREQ_STAT_DETAILS
> +       freqs.time = ktime_to_ns(ktime_get());
> +       freqs.old = freqs.new = policy->cur;
> +
> +       cpufreq_notify_transition(policy, &freqs, CPUFREQ_LOADCHECK);
> +#endif
>         dbs_data->cdata->gov_check_cpu(cpu, max_load);
>  }
>  EXPORT_SYMBOL_GPL(dbs_check_cpu);
> diff --git a/drivers/cpufreq/cpufreq_stats.c b/drivers/cpufreq/cpufreq_stats.c
> index fb65dec..2379b1d 100644
> --- a/drivers/cpufreq/cpufreq_stats.c
> +++ b/drivers/cpufreq/cpufreq_stats.c
> @@ -22,6 +22,8 @@
>  #include <linux/notifier.h>
>  #include <asm/cputime.h>
>
> +#define CPUFREQ_LOAD_TABLE_MAX         20
> +
>  static spinlock_t cpufreq_stats_lock;
>
>  struct cpufreq_stats {
> @@ -35,6 +37,10 @@ struct cpufreq_stats {
>         unsigned int *freq_table;
>  #ifdef CONFIG_CPU_FREQ_STAT_DETAILS
>         unsigned int *trans_table;
> +
> +       struct cpufreq_freqs *load_table;
> +       unsigned int load_last_index;
> +       unsigned int load_max_index;
>  #endif
>  };
>
> @@ -131,6 +137,38 @@ static ssize_t show_trans_table(struct cpufreq_policy *policy, char *buf)
>         return len;
>  }
>  cpufreq_freq_attr_ro(trans_table);
> +
> +static ssize_t show_load_table(struct cpufreq_policy *policy, char *buf)
> +{
> +       struct cpufreq_stats *stat = per_cpu(cpufreq_stats_table, policy->cpu);
> +       struct cpufreq_freqs *load_table = stat->load_table;
> +       ssize_t len = 0;
> +       int i;
> +       int j;

merge above two lines.

> +
> +       len += sprintf(buf + len, "%11s %10s", "Time", "Frequency");
> +       for (j = 0; j < NR_CPUS; j++)
> +               len += sprintf(buf + len, " %3s%d", "cpu", j);
> +       len += sprintf(buf + len, "\n");
> +
> +       i = stat->load_last_index;
> +       do {
> +               len += sprintf(buf + len, "%lld %9d",
> +                               load_table[i].time,
> +                               load_table[i].old);
> +
> +               for (j = 0; j < NR_CPUS; j++)
> +                       len += sprintf(buf + len, " %4d",
> +                                       load_table[i].load[j]);
> +               len += sprintf(buf + len, "\n");
> +
> +               if (++i == stat->load_max_index)
> +                       i = 0;
> +       } while (i != stat->load_last_index);

You want/need some locking to protect addition to this list while
we are reading from it?

> +       return len;
> +}
> +cpufreq_freq_attr_ro(load_table);
>  #endif
>
>  cpufreq_freq_attr_ro(total_trans);
> @@ -141,6 +179,7 @@ static struct attribute *default_attrs[] = {
>         &time_in_state.attr,
>  #ifdef CONFIG_CPU_FREQ_STAT_DETAILS
>         &trans_table.attr,
> +       &load_table.attr,
>  #endif
>         NULL
>  };
> @@ -167,6 +206,9 @@ static void cpufreq_stats_free_table(unsigned int cpu)
>
>         if (stat) {
>                 pr_debug("%s: Free stat table\n", __func__);
> +#ifdef CONFIG_CPU_FREQ_STAT_DETAILS
> +               kfree(stat->load_table);
> +#endif
>                 kfree(stat->time_in_state);
>                 kfree(stat);
>                 per_cpu(cpufreq_stats_table, cpu) = NULL;
> @@ -244,6 +286,16 @@ static int cpufreq_stats_create_table(struct cpufreq_policy *policy,
>
>  #ifdef CONFIG_CPU_FREQ_STAT_DETAILS
>         stat->trans_table = stat->freq_table + count;
> +
> +       stat->load_max_index = CPUFREQ_LOAD_TABLE_MAX;
> +       stat->load_last_index = 0;
> +
> +       alloc_size = sizeof(struct cpufreq_freqs) * stat->load_max_index;

We aren't using this variable multiple times so get rid of it and also you need
to do: sizeof(*stat->load_table).

> +       stat->load_table = kzalloc(alloc_size, GFP_KERNEL);
> +       if (!stat->load_table) {
> +               ret = -ENOMEM;
> +               goto error_out;
> +       }
>  #endif
>         j = 0;
>         for (i = 0; table[i].frequency != CPUFREQ_TABLE_END; i++) {
> @@ -312,13 +364,31 @@ static int cpufreq_stat_notifier_trans(struct notifier_block *nb,
>         struct cpufreq_stats *stat;
>         int old_index, new_index;
>
> +#ifdef CONFIG_CPU_FREQ_STAT_DETAILS
> +       if (val != CPUFREQ_POSTCHANGE && val != CPUFREQ_LOADCHECK)
> +#else
>         if (val != CPUFREQ_POSTCHANGE)
> +#endif
>                 return 0;
>
>         stat = per_cpu(cpufreq_stats_table, freq->cpu);
>         if (!stat)
>                 return 0;
>
> +#ifdef CONFIG_CPU_FREQ_STAT_DETAILS
> +       if (val == CPUFREQ_LOADCHECK) {
> +               struct cpufreq_freqs *dest_freq;
> +
> +               dest_freq = &stat->load_table[stat->load_last_index];
> +               memcpy(dest_freq, freq, sizeof(struct cpufreq_freqs));

again sizeof()...

You don't need to copy full structure probably.

> +
> +               if (++stat->load_last_index == stat->load_max_index)
> +                       stat->load_last_index = 0;
> +
> +               return 0;
> +       }
> +#endif
> +
>         old_index = stat->last_index;
>         new_index = freq_table_get_index(stat, freq->new);
>
> diff --git a/include/linux/cpufreq.h b/include/linux/cpufreq.h
> index 037d36a..f780645 100644
> --- a/include/linux/cpufreq.h
> +++ b/include/linux/cpufreq.h
> @@ -140,12 +140,18 @@ static inline bool policy_is_shared(struct cpufreq_policy *policy)
>  #define CPUFREQ_POSTCHANGE     (1)
>  #define CPUFREQ_RESUMECHANGE   (8)
>  #define CPUFREQ_SUSPENDCHANGE  (9)
> +#define CPUFREQ_LOADCHECK      (10)
>
>  struct cpufreq_freqs {
>         unsigned int cpu;       /* cpu nr */
>         unsigned int old;
>         unsigned int new;
>         u8 flags;               /* flags of cpufreq_driver, see below. */
> +
> +#ifdef CONFIG_CPU_FREQ_STAT_DETAILS
> +       int64_t time;
> +       unsigned int load[NR_CPUS];
> +#endif
>  };

Other wise it looks good mostly.

PS: I have cc'd you for a patch of mine which will get rid of most
of the CONFIG_*** you used in your code.. But wait for it to be
applied to change your code accordingly..

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [RESEND][PATCH] cpufreq: stats: Add 'load_table' sysfs file to show accumulated data of CPU
  2013-06-07 10:23 ` Viresh Kumar
@ 2013-06-10 12:13   ` Chanwoo Choi
  2013-06-11  5:06     ` Viresh Kumar
  0 siblings, 1 reply; 15+ messages in thread
From: Chanwoo Choi @ 2013-06-10 12:13 UTC (permalink / raw)
  To: Viresh Kumar
  Cc: rjw, linux-kernel, linux-pm, cpufreq, kyungmin.park,
	myungjoo.ham, Lists linaro-kernel

Hi Viresh,

On 06/07/2013 07:23 PM, Viresh Kumar wrote:
> Hi Chanwoo,
>
> On 5 June 2013 13:41, Chanwoo Choi <cw00.choi@samsung.com> wrote:
>> This patch add new sysfs file to show previous accumulated data of CPU load
> Please mention we are only accumulating latest 20 values.
It should be modified so that user could enter a proper number of data
for accumulating data. I'll fix it by using Kconfig.

>
>> as following path. This sysfs file is used to judge the correct system state
>> or determine suitable system resource on user-space.
> Please write it as:
>
> load_table will be used to judge how many cpus would be sufficient for
> managing current load.

I define 'busy_cpu_threshold' as following way:                                                                         
- busy_cpu_threshold = (current cpu load * current cpu frequency ) / maximum cpu frequency
and then users can define busy_cpu_threshold according to
various cpu hotplug policy or environment dependent on h/w.

If busy_cpu_threshold is 30%, show each busy_cpu_threshold with cpu frequency:

Frequency    | busy_cpu_threshold
1600MHz      | 30%
1400MHz      | 34%
1200MHz      | 40%
1000MHz      | 48%
800MHz        | 60%
500MHz        | 96%

So, I use 'busy_cpu_threshold' value to judge whether cpu is busy or not.
If specific cpu exceeds defined busy_cpu_threshold, I determine to need
additional cpu resource. and then turn on additional CPU.

For example, I expalin how to get a numer of busy cpu
on current cpu load.

$ cat /sys/devices/system/cpu/cpu0/cpufreq/stats/load_table
...

1301500082290 800000 61 11 1 43

...

When 1301500082290 ns:
cpu0's busy_cpu_threshold : 32 = 64 * (800000/1600000)
cpu1's busy_cpu_threshold : 5.5 = 11 * (800000/1600000)
cpu2's busy_cpu_threshold : 0 = 1 * (800000/1600000)
cpu3's busy_cpu_threshold : 16 = 43 * (800000/1600000)

A number of busy cpu is only 1 because cpu0's busy_cpu_threshold(32)
is larger than defined busy_cpu_threshold(30).

I can get a number of busy cpu through load_table sysfs.


>> - /sys/devices/system/cpu/cpu0/cpufreq/stats/load_table
>>
>> This sysfs file include following data:
>> - Measurement point of time
>> - CPU frequency
>> - Per-CPU load
>>
>> Signed-off-by: Chanwoo Choi <cw00.choi@samsung.com>
>> Signed-off-by: Myungjoo Ham <myungjoo.ham@samsung.com>
>> Signed-off-by: Kyungmin Park <kyungmin.park@samsung.com>
>> ---
>>
>> Additionally, I explain an example using 'load_table' sysfs entry.
>>
>> Exynos4412 series has Quad-core and all cores share the power-line.
>> I cann't set diffent voltage/frequency to each CPU. To reduce power-
>> consumption, I certainly have to turn on/off CPU online state
>> according to CPU load on runtime. As a result, I peridically need to
>> monitor current cpu state to determine a proper amount of system
>> resource(necessary number of online cpu) and to delete wasted power.
>> So, I need 'load_table' sysfs file to monitor current cpu state.
>>
>> I add a table which show power consumption of target based on
>> Exynos4412 SoC. This table indicate the difference power-consumption
>> according to a number of online core and with same number of running task.
>>
>> [Environment of power estimation]
>> - cpufreq governor used performance mode to estimate power-consumption on each frequency step.
>> - Use infinite-loop test program which execute while statement infinitely.
>> - Always measure the power consumption under same temperature during 1 minutes.
>> - Unit is mA.
>> ------------------------------------------------------------------------------------------------------------------------------------
>> A number of Online core | Core 1        | Core 2                | Core 3                        | Core 4
>> A number of nr_running  | 0     1       | 0     1       2       | 0     1       2       3       | 0     1       2       3       4
>> ------------------------------------------------------------------------------------------------------------------------------------
>>  CPU Frequency          |
>>  800  MHz               | 293   330     | 295   338     379     | 300   339     386     433     | 303   341     390     412     482
>>  1000 MHz               | 312   371     | 316   377     435     | 318   383     454     522     | 322   391     462     526     596
>>  1200 MHz               | 323   404     | 328   418     504     | 336   423     521     615     | 343   433     499     639     748
>>  1600 MHz               | 380   525     | 388   556     771     | 399   575     771     1011    | 412   597     822     1172    1684
>> ------------------------------------------------------------------------------------------------------------------------------------
>>
>> For example,
>> The case A/B/C have the same condition except for a number of online core.
>> - case A: Online core is 2, 1000MHz and nr_running is 1 : 377mA
>> - case B: Online core is 3, 1000MHz and nr_running is 1 : 383mA
>> - case C: Online core is 4, 1000Mz and nr_running is 1  : 391mA
>>
>> If system has only one running task, cpu hotplug policy, by monitoring
>> cpu state through 'load_table' sysfs file on user-space,
>> will determine 'case A' state for reducing power-consumption.
>>
>> Show the result of reading 'load_table sysfs file as following:
>> - cpufreq governor is ondemand_org governor.
>>
>> $ cat /sys/devices/system/cpu/cpu0/cpufreq/stats/load_table
>>        Time  Frequency cpu0 cpu1 cpu2 cpu3
>> 1300500043122   1600000   32    6    0   26
>> 1300600079080    800000   63   10    2   45
>> 1300700065288    800000   51    9    1   42
>> 1300800228747    800000   51    9    1   43
>> 1300900182997    800000   78   11    3   47
>> 1301000106163    800000   96   26    6   48
>> 1301100056247   1600000   45    7    1   27
>> 1301200071373   1000000   55    9    1   37
>> 1301300096082    800000   54   10    0   45
>> 1301400132832    800000   70   11    2   46
>> 1301500082290    800000   61   11    1   43
>> 1301600236415    800000   61    9    2   43
>> 1301700071498    800000   59   10    2   43
>> 1301800159290    800000   55    9    1   42
>> 1301900076332    800000   66   10    2   43
>> 1302000102165    800000   47    9    0   43
>> 1302100086623    800000   75   11    2   50
>> 1302200101082    800000   66   10    4   46
>> 1302300108873    800000   53   10    1   44
>> 1302400373373    600000   63   12    1   54
> How are you getting loads different for all your cpus? I believe you
> are just recording these values for policy->cpu and all cpus share
> same policy on your platform.
>
I got the Per-CPU load by using cpufreq_notify_transition().
when cpufreq governor call dbs_check_cpu().
>> Please let me know some opinion about this patch.
>>
>> Best regards and Thanks,
>> Chanwoo Choi
>>
>> ---
>>  drivers/cpufreq/cpufreq.c          |  4 +++
>>  drivers/cpufreq/cpufreq_governor.c | 21 ++++++++++--
>>  drivers/cpufreq/cpufreq_stats.c    | 70 ++++++++++++++++++++++++++++++++++++++
>>  include/linux/cpufreq.h            |  6 ++++
>>  4 files changed, 99 insertions(+), 2 deletions(-)
>>
>> diff --git a/drivers/cpufreq/cpufreq.c b/drivers/cpufreq/cpufreq.c
>> index 2d53f47..cbaaff0 100644
>> --- a/drivers/cpufreq/cpufreq.c
>> +++ b/drivers/cpufreq/cpufreq.c
>> @@ -292,6 +292,10 @@ void __cpufreq_notify_transition(struct cpufreq_policy *policy,
>>                 if (likely(policy) && likely(policy->cpu == freqs->cpu))
>>                         policy->cur = freqs->new;
>>                 break;
>> +       case CPUFREQ_LOADCHECK:
>> +               srcu_notifier_call_chain(&cpufreq_transition_notifier_list,
>> +                               CPUFREQ_LOADCHECK, freqs);
>> +               break;
>>         }
>>  }
>>  /**
>> diff --git a/drivers/cpufreq/cpufreq_governor.c b/drivers/cpufreq/cpufreq_governor.c
>> index 5af40ad..bc50624 100644
>> --- a/drivers/cpufreq/cpufreq_governor.c
>> +++ b/drivers/cpufreq/cpufreq_governor.c
>> @@ -23,12 +23,17 @@
>>  #include <linux/kernel_stat.h>
>>  #include <linux/mutex.h>
>>  #include <linux/slab.h>
>> +#include <linux/sched.h>
>>  #include <linux/tick.h>
>>  #include <linux/types.h>
>>  #include <linux/workqueue.h>
>>
>>  #include "cpufreq_governor.h"
>>
>> +#ifdef CONFIG_CPU_FREQ_STAT_DETAILS
>> +       struct cpufreq_freqs freqs;
>> +#endif
> Why do you need this to be global?
I'll remove global variable and move 'freqs' in some structure.

>>  static struct kobject *get_governor_parent_kobj(struct cpufreq_policy *policy)
>>  {
>>         if (have_governor_per_policy())
>> @@ -143,11 +148,17 @@ void dbs_check_cpu(struct dbs_data *dbs_data, int cpu)
>>                         idle_time += jiffies_to_usecs(cur_nice_jiffies);
>>                 }
>>
>> -               if (unlikely(!wall_time || wall_time < idle_time))
>> +               if (unlikely(!wall_time || wall_time < idle_time)) {
>> +#ifdef CONFIG_CPU_FREQ_STAT_DETAILS
>> +                       freqs.load[j] = 0;
>> +#endif
>>                         continue;
>> +               }
>>
>>                 load = 100 * (wall_time - idle_time) / wall_time;
>> -
>> +#ifdef CONFIG_CPU_FREQ_STAT_DETAILS
>> +               freqs.load[j] = load;
>> +#endif
>>                 if (dbs_data->cdata->governor == GOV_ONDEMAND) {
>>                         int freq_avg = __cpufreq_driver_getavg(policy, j);
>>                         if (freq_avg <= 0)
>> @@ -160,6 +171,12 @@ void dbs_check_cpu(struct dbs_data *dbs_data, int cpu)
>>                         max_load = load;
>>         }
>>
>> +#ifdef CONFIG_CPU_FREQ_STAT_DETAILS
>> +       freqs.time = ktime_to_ns(ktime_get());
>> +       freqs.old = freqs.new = policy->cur;
>> +
>> +       cpufreq_notify_transition(policy, &freqs, CPUFREQ_LOADCHECK);
>> +#endif
>>         dbs_data->cdata->gov_check_cpu(cpu, max_load);
>>  }
>>  EXPORT_SYMBOL_GPL(dbs_check_cpu);
>> diff --git a/drivers/cpufreq/cpufreq_stats.c b/drivers/cpufreq/cpufreq_stats.c
>> index fb65dec..2379b1d 100644
>> --- a/drivers/cpufreq/cpufreq_stats.c
>> +++ b/drivers/cpufreq/cpufreq_stats.c
>> @@ -22,6 +22,8 @@
>>  #include <linux/notifier.h>
>>  #include <asm/cputime.h>
>>
>> +#define CPUFREQ_LOAD_TABLE_MAX         20
>> +
>>  static spinlock_t cpufreq_stats_lock;
>>
>>  struct cpufreq_stats {
>> @@ -35,6 +37,10 @@ struct cpufreq_stats {
>>         unsigned int *freq_table;
>>  #ifdef CONFIG_CPU_FREQ_STAT_DETAILS
>>         unsigned int *trans_table;
>> +
>> +       struct cpufreq_freqs *load_table;
>> +       unsigned int load_last_index;
>> +       unsigned int load_max_index;
>>  #endif
>>  };
>>
>> @@ -131,6 +137,38 @@ static ssize_t show_trans_table(struct cpufreq_policy *policy, char *buf)
>>         return len;
>>  }
>>  cpufreq_freq_attr_ro(trans_table);
>> +
>> +static ssize_t show_load_table(struct cpufreq_policy *policy, char *buf)
>> +{
>> +       struct cpufreq_stats *stat = per_cpu(cpufreq_stats_table, policy->cpu);
>> +       struct cpufreq_freqs *load_table = stat->load_table;
>> +       ssize_t len = 0;
>> +       int i;
>> +       int j;
> merge above two lines.
OK.
>
>> +
>> +       len += sprintf(buf + len, "%11s %10s", "Time", "Frequency");
>> +       for (j = 0; j < NR_CPUS; j++)
>> +               len += sprintf(buf + len, " %3s%d", "cpu", j);
>> +       len += sprintf(buf + len, "\n");
>> +
>> +       i = stat->load_last_index;
>> +       do {
>> +               len += sprintf(buf + len, "%lld %9d",
>> +                               load_table[i].time,
>> +                               load_table[i].old);
>> +
>> +               for (j = 0; j < NR_CPUS; j++)
>> +                       len += sprintf(buf + len, " %4d",
>> +                                       load_table[i].load[j]);
>> +               len += sprintf(buf + len, "\n");
>> +
>> +               if (++i == stat->load_max_index)
>> +                       i = 0;
>> +       } while (i != stat->load_last_index);
> You want/need some locking to protect addition to this list while
> we are reading from it?
OK, I'll consider locking method and implement it on next version patch.
>> +       return len;
>> +}
>> +cpufreq_freq_attr_ro(load_table);
>>  #endif
>>
>>  cpufreq_freq_attr_ro(total_trans);
>> @@ -141,6 +179,7 @@ static struct attribute *default_attrs[] = {
>>         &time_in_state.attr,
>>  #ifdef CONFIG_CPU_FREQ_STAT_DETAILS
>>         &trans_table.attr,
>> +       &load_table.attr,
>>  #endif
>>         NULL
>>  };
>> @@ -167,6 +206,9 @@ static void cpufreq_stats_free_table(unsigned int cpu)
>>
>>         if (stat) {
>>                 pr_debug("%s: Free stat table\n", __func__);
>> +#ifdef CONFIG_CPU_FREQ_STAT_DETAILS
>> +               kfree(stat->load_table);
>> +#endif
>>                 kfree(stat->time_in_state);
>>                 kfree(stat);
>>                 per_cpu(cpufreq_stats_table, cpu) = NULL;
>> @@ -244,6 +286,16 @@ static int cpufreq_stats_create_table(struct cpufreq_policy *policy,
>>
>>  #ifdef CONFIG_CPU_FREQ_STAT_DETAILS
>>         stat->trans_table = stat->freq_table + count;
>> +
>> +       stat->load_max_index = CPUFREQ_LOAD_TABLE_MAX;
>> +       stat->load_last_index = 0;
>> +
>> +       alloc_size = sizeof(struct cpufreq_freqs) * stat->load_max_index;
> We aren't using this variable multiple times so get rid of it and also you need
> to do: sizeof(*stat->load_table).
>
OK, I'll fix it.
>> +       stat->load_table = kzalloc(alloc_size, GFP_KERNEL);
>> +       if (!stat->load_table) {
>> +               ret = -ENOMEM;
>> +               goto error_out;
>> +       }
>>  #endif
>>         j = 0;
>>         for (i = 0; table[i].frequency != CPUFREQ_TABLE_END; i++) {
>> @@ -312,13 +364,31 @@ static int cpufreq_stat_notifier_trans(struct notifier_block *nb,
>>         struct cpufreq_stats *stat;
>>         int old_index, new_index;
>>
>> +#ifdef CONFIG_CPU_FREQ_STAT_DETAILS
>> +       if (val != CPUFREQ_POSTCHANGE && val != CPUFREQ_LOADCHECK)
>> +#else
>>         if (val != CPUFREQ_POSTCHANGE)
>> +#endif
>>                 return 0;
>>
>>         stat = per_cpu(cpufreq_stats_table, freq->cpu);
>>         if (!stat)
>>                 return 0;
>>
>> +#ifdef CONFIG_CPU_FREQ_STAT_DETAILS
>> +       if (val == CPUFREQ_LOADCHECK) {
>> +               struct cpufreq_freqs *dest_freq;
>> +
>> +               dest_freq = &stat->load_table[stat->load_last_index];
>> +               memcpy(dest_freq, freq, sizeof(struct cpufreq_freqs));
> again sizeof()...
>
> You don't need to copy full structure probably.
OK, I'll fix it.
>
>> +
>> +               if (++stat->load_last_index == stat->load_max_index)
>> +                       stat->load_last_index = 0;
>> +
>> +               return 0;
>> +       }
>> +#endif
>> +
>>         old_index = stat->last_index;
>>         new_index = freq_table_get_index(stat, freq->new);
>>
>> diff --git a/include/linux/cpufreq.h b/include/linux/cpufreq.h
>> index 037d36a..f780645 100644
>> --- a/include/linux/cpufreq.h
>> +++ b/include/linux/cpufreq.h
>> @@ -140,12 +140,18 @@ static inline bool policy_is_shared(struct cpufreq_policy *policy)
>>  #define CPUFREQ_POSTCHANGE     (1)
>>  #define CPUFREQ_RESUMECHANGE   (8)
>>  #define CPUFREQ_SUSPENDCHANGE  (9)
>> +#define CPUFREQ_LOADCHECK      (10)
>>
>>  struct cpufreq_freqs {
>>         unsigned int cpu;       /* cpu nr */
>>         unsigned int old;
>>         unsigned int new;
>>         u8 flags;               /* flags of cpufreq_driver, see below. */
>> +
>> +#ifdef CONFIG_CPU_FREQ_STAT_DETAILS
>> +       int64_t time;
>> +       unsigned int load[NR_CPUS];
>> +#endif
>>  };
> Other wise it looks good mostly.
Thanks for your comment.
> PS: I have cc'd you for a patch of mine which will get rid of most
> of the CONFIG_*** you used in your code.. But wait for it to be
> applied to change your code accordingly..
>
OK, I will resend this patch after applied below patch.
- [PATCH] cpufreq: stats: Remove CONFIG_CPU_FREQ_STAT_DETAILS

Best Regards,
Chanwoo Choi

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [RESEND][PATCH] cpufreq: stats: Add 'load_table' sysfs file to show accumulated data of CPU
  2013-06-10 12:13   ` Chanwoo Choi
@ 2013-06-11  5:06     ` Viresh Kumar
  2013-06-11  5:19       ` Chanwoo Choi
  0 siblings, 1 reply; 15+ messages in thread
From: Viresh Kumar @ 2013-06-11  5:06 UTC (permalink / raw)
  To: Chanwoo Choi
  Cc: rjw, linux-kernel, linux-pm, cpufreq, kyungmin.park,
	myungjoo.ham, Lists linaro-kernel

On 10 June 2013 17:43, Chanwoo Choi <cw00.choi@samsung.com> wrote:
> On 06/07/2013 07:23 PM, Viresh Kumar wrote:
>> On 5 June 2013 13:41, Chanwoo Choi <cw00.choi@samsung.com> wrote:

> 1301500082290 800000 61 11 1 43
>
> ...
>
> When 1301500082290 ns:
> cpu0's busy_cpu_threshold : 32 = 64 * (800000/1600000)

s/64/61 :)

>> How are you getting loads different for all your cpus? I believe you
>> are just recording these values for policy->cpu and all cpus share
>> same policy on your platform.
>>
> I got the Per-CPU load by using cpufreq_notify_transition().
> when cpufreq governor call dbs_check_cpu().

I forgot to remove this line in my earlier reply. I understood this towards
the end of patch.

>>> +#ifdef CONFIG_CPU_FREQ_STAT_DETAILS
>>> +       struct cpufreq_freqs freqs;
>>> +#endif
>> Why do you need this to be global?
> I'll remove global variable and move 'freqs' in some structure.

??

You can just make it a local variable in the only function it is used.

TIP: Always place a blank line before and after your reply to kernel
mails, this makes it much more readable.

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [RESEND][PATCH] cpufreq: stats: Add 'load_table' sysfs file to show accumulated data of CPU
  2013-06-11  5:06     ` Viresh Kumar
@ 2013-06-11  5:19       ` Chanwoo Choi
  2013-06-11  5:22         ` Viresh Kumar
  0 siblings, 1 reply; 15+ messages in thread
From: Chanwoo Choi @ 2013-06-11  5:19 UTC (permalink / raw)
  To: Viresh Kumar
  Cc: rjw, linux-kernel, linux-pm, cpufreq, kyungmin.park,
	myungjoo.ham, Lists linaro-kernel

On 06/11/2013 02:06 PM, Viresh Kumar wrote:
> On 10 June 2013 17:43, Chanwoo Choi <cw00.choi@samsung.com> wrote:
>> On 06/07/2013 07:23 PM, Viresh Kumar wrote:
>>> On 5 June 2013 13:41, Chanwoo Choi <cw00.choi@samsung.com> wrote:
>> 1301500082290 800000 61 11 1 43
>>
>> ...
>>
>> When 1301500082290 ns:
>> cpu0's busy_cpu_threshold : 32 = 64 * (800000/1600000)
> s/64/61 :)

Sorry, my mistake.

>
>>> How are you getting loads different for all your cpus? I believe you
>>> are just recording these values for policy->cpu and all cpus share
>>> same policy on your platform.
>>>
>> I got the Per-CPU load by using cpufreq_notify_transition().
>> when cpufreq governor call dbs_check_cpu().
> I forgot to remove this line in my earlier reply. I understood this towards
> the end of patch.

OK.

>
>>>> +#ifdef CONFIG_CPU_FREQ_STAT_DETAILS
>>>> +       struct cpufreq_freqs freqs;
>>>> +#endif
>>> Why do you need this to be global?
>> I'll remove global variable and move 'freqs' in some structure.
> ??
>
> You can just make it a local variable in the only function it is used.

You are right. I'll fix it by using local variable.

>
> TIP: Always place a blank line before and after your reply to kernel
> mails, this makes it much more readable.

I will modify this patch according to your comment
and then resend it after merged below patch.
- [PATCH] cpufreq: stats: Remove CONFIG_CPU_FREQ_STAT_DETAILS

Thanks your comment and tip.

Best Regards,
Chanwoo Choi




^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [RESEND][PATCH] cpufreq: stats: Add 'load_table' sysfs file to show accumulated data of CPU
  2013-06-11  5:19       ` Chanwoo Choi
@ 2013-06-11  5:22         ` Viresh Kumar
  2013-06-11  6:10           ` Chanwoo Choi
  0 siblings, 1 reply; 15+ messages in thread
From: Viresh Kumar @ 2013-06-11  5:22 UTC (permalink / raw)
  To: Chanwoo Choi
  Cc: rjw, linux-kernel, linux-pm, cpufreq, kyungmin.park,
	myungjoo.ham, Lists linaro-kernel

On 11 June 2013 10:49, Chanwoo Choi <cw00.choi@samsung.com> wrote:
> I will modify this patch according to your comment
> and then resend it after merged below patch.
> - [PATCH] cpufreq: stats: Remove CONFIG_CPU_FREQ_STAT_DETAILS

Hmm.. Maybe you can sent it without rebasing your patch over mine and
we can rebase it over my patch later. So, that you don't end up waiting for
my patches to be included.

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [RESEND][PATCH] cpufreq: stats: Add 'load_table' sysfs file to show accumulated data of CPU
  2013-06-11  5:22         ` Viresh Kumar
@ 2013-06-11  6:10           ` Chanwoo Choi
  0 siblings, 0 replies; 15+ messages in thread
From: Chanwoo Choi @ 2013-06-11  6:10 UTC (permalink / raw)
  To: Viresh Kumar
  Cc: rjw, linux-kernel, linux-pm, cpufreq, kyungmin.park,
	myungjoo.ham, Lists linaro-kernel

On 06/11/2013 02:22 PM, Viresh Kumar wrote:
> On 11 June 2013 10:49, Chanwoo Choi <cw00.choi@samsung.com> wrote:
>> I will modify this patch according to your comment
>> and then resend it after merged below patch.
>> - [PATCH] cpufreq: stats: Remove CONFIG_CPU_FREQ_STAT_DETAILS
> Hmm.. Maybe you can sent it without rebasing your patch over mine and
> we can rebase it over my patch later. So, that you don't end up waiting for
> my patches to be included.

OK, I understand.

Thanks,
Chanwoo Choi


^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [RESEND][PATCH] cpufreq: stats: Add 'load_table' sysfs file to show accumulated data of CPU
  2013-06-05  8:11 [RESEND][PATCH] cpufreq: stats: Add 'load_table' sysfs file to show accumulated data of CPU Chanwoo Choi
  2013-06-07 10:23 ` Viresh Kumar
@ 2013-06-11 22:14 ` Rafael J. Wysocki
  2013-06-12  0:51   ` Chanwoo Choi
  2013-06-12  4:02   ` Viresh Kumar
  1 sibling, 2 replies; 15+ messages in thread
From: Rafael J. Wysocki @ 2013-06-11 22:14 UTC (permalink / raw)
  To: Chanwoo Choi
  Cc: viresh.kumar, linux-kernel, linux-pm, cpufreq, kyungmin.park,
	myungjoo.ham

On Wednesday, June 05, 2013 05:11:22 PM Chanwoo Choi wrote:
> This patch add new sysfs file to show previous accumulated data of CPU load
> as following path. This sysfs file is used to judge the correct system state
> or determine suitable system resource on user-space.
> - /sys/devices/system/cpu/cpu0/cpufreq/stats/load_table
> 
> This sysfs file include following data:
> - Measurement point of time
> - CPU frequency
> - Per-CPU load
> 
> Signed-off-by: Chanwoo Choi <cw00.choi@samsung.com>
> Signed-off-by: Myungjoo Ham <myungjoo.ham@samsung.com>
> Signed-off-by: Kyungmin Park <kyungmin.park@samsung.com>

Well, first of all, there is the "one value per file" rule for sysfs attributes
which is evidently violated by this code.

Second, this looks like a feature needed to handle one particular platform, so
why do you want to add it to the cpufreq core?

Rafael


> ---
> 
> Additionally, I explain an example using 'load_table' sysfs entry.
> 
> Exynos4412 series has Quad-core and all cores share the power-line.
> I cann't set diffent voltage/frequency to each CPU. To reduce power-
> consumption, I certainly have to turn on/off CPU online state
> according to CPU load on runtime. As a result, I peridically need to
> monitor current cpu state to determine a proper amount of system
> resource(necessary number of online cpu) and to delete wasted power.
> So, I need 'load_table' sysfs file to monitor current cpu state.
> 
> I add a table which show power consumption of target based on
> Exynos4412 SoC. This table indicate the difference power-consumption
> according to a number of online core and with same number of running task.
> 
> [Environment of power estimation]
> - cpufreq governor used performance mode to estimate power-consumption on each frequency step.
> - Use infinite-loop test program which execute while statement infinitely.
> - Always measure the power consumption under same temperature during 1 minutes.
> - Unit is mA.
> ------------------------------------------------------------------------------------------------------------------------------------
> A number of Online core	| Core 1	| Core 2		| Core 3			| Core 4
> A number of nr_running	| 0	1	| 0	1	2	| 0	1	2	3	| 0	1	2	3	4
> ------------------------------------------------------------------------------------------------------------------------------------
>  CPU Frequency		| 
>  800  MHz		| 293 	330 	| 295 	338 	379 	| 300 	339 	386 	433 	| 303 	341 	390 	412 	482 
>  1000 MHz		| 312 	371 	| 316 	377 	435 	| 318 	383 	454 	522 	| 322 	391 	462 	526 	596 
>  1200 MHz		| 323 	404 	| 328 	418 	504 	| 336 	423 	521 	615 	| 343 	433 	499 	639 	748 
>  1600 MHz 		| 380 	525 	| 388 	556 	771 	| 399 	575 	771 	1011 	| 412 	597 	822 	1172 	1684 
> ------------------------------------------------------------------------------------------------------------------------------------
> 
> For example,
> The case A/B/C have the same condition except for a number of online core.
> - case A: Online core is 2, 1000MHz and nr_running is 1	: 377mA
> - case B: Online core is 3, 1000MHz and nr_running is 1	: 383mA
> - case C: Online core is 4, 1000Mz and nr_running is 1	: 391mA
> 
> If system has only one running task, cpu hotplug policy, by monitoring
> cpu state through 'load_table' sysfs file on user-space,
> will determine 'case A' state for reducing power-consumption.
> 
> Show the result of reading 'load_table sysfs file as following:
> - cpufreq governor is ondemand_org governor.
> 
> $ cat /sys/devices/system/cpu/cpu0/cpufreq/stats/load_table
>        Time  Frequency cpu0 cpu1 cpu2 cpu3
> 1300500043122   1600000   32    6    0   26
> 1300600079080    800000   63   10    2   45
> 1300700065288    800000   51    9    1   42
> 1300800228747    800000   51    9    1   43
> 1300900182997    800000   78   11    3   47
> 1301000106163    800000   96   26    6   48
> 1301100056247   1600000   45    7    1   27
> 1301200071373   1000000   55    9    1   37
> 1301300096082    800000   54   10    0   45
> 1301400132832    800000   70   11    2   46
> 1301500082290    800000   61   11    1   43
> 1301600236415    800000   61    9    2   43
> 1301700071498    800000   59   10    2   43
> 1301800159290    800000   55    9    1   42
> 1301900076332    800000   66   10    2   43
> 1302000102165    800000   47    9    0   43
> 1302100086623    800000   75   11    2   50
> 1302200101082    800000   66   10    4   46
> 1302300108873    800000   53   10    1   44
> 1302400373373    600000   63   12    1   54
> 
> Please let me know some opinion about this patch.
> 
> Best regards and Thanks,
> Chanwoo Choi
> 
> ---
>  drivers/cpufreq/cpufreq.c          |  4 +++
>  drivers/cpufreq/cpufreq_governor.c | 21 ++++++++++--
>  drivers/cpufreq/cpufreq_stats.c    | 70 ++++++++++++++++++++++++++++++++++++++
>  include/linux/cpufreq.h            |  6 ++++
>  4 files changed, 99 insertions(+), 2 deletions(-)
> 
> diff --git a/drivers/cpufreq/cpufreq.c b/drivers/cpufreq/cpufreq.c
> index 2d53f47..cbaaff0 100644
> --- a/drivers/cpufreq/cpufreq.c
> +++ b/drivers/cpufreq/cpufreq.c
> @@ -292,6 +292,10 @@ void __cpufreq_notify_transition(struct cpufreq_policy *policy,
>  		if (likely(policy) && likely(policy->cpu == freqs->cpu))
>  			policy->cur = freqs->new;
>  		break;
> +	case CPUFREQ_LOADCHECK:
> +		srcu_notifier_call_chain(&cpufreq_transition_notifier_list,
> +				CPUFREQ_LOADCHECK, freqs);
> +		break;
>  	}
>  }
>  /**
> diff --git a/drivers/cpufreq/cpufreq_governor.c b/drivers/cpufreq/cpufreq_governor.c
> index 5af40ad..bc50624 100644
> --- a/drivers/cpufreq/cpufreq_governor.c
> +++ b/drivers/cpufreq/cpufreq_governor.c
> @@ -23,12 +23,17 @@
>  #include <linux/kernel_stat.h>
>  #include <linux/mutex.h>
>  #include <linux/slab.h>
> +#include <linux/sched.h>
>  #include <linux/tick.h>
>  #include <linux/types.h>
>  #include <linux/workqueue.h>
>  
>  #include "cpufreq_governor.h"
>  
> +#ifdef CONFIG_CPU_FREQ_STAT_DETAILS
> +	struct cpufreq_freqs freqs;
> +#endif
> +
>  static struct kobject *get_governor_parent_kobj(struct cpufreq_policy *policy)
>  {
>  	if (have_governor_per_policy())
> @@ -143,11 +148,17 @@ void dbs_check_cpu(struct dbs_data *dbs_data, int cpu)
>  			idle_time += jiffies_to_usecs(cur_nice_jiffies);
>  		}
>  
> -		if (unlikely(!wall_time || wall_time < idle_time))
> +		if (unlikely(!wall_time || wall_time < idle_time)) {
> +#ifdef CONFIG_CPU_FREQ_STAT_DETAILS
> +			freqs.load[j] = 0;
> +#endif
>  			continue;
> +		}
>  
>  		load = 100 * (wall_time - idle_time) / wall_time;
> -
> +#ifdef CONFIG_CPU_FREQ_STAT_DETAILS
> +		freqs.load[j] = load;
> +#endif
>  		if (dbs_data->cdata->governor == GOV_ONDEMAND) {
>  			int freq_avg = __cpufreq_driver_getavg(policy, j);
>  			if (freq_avg <= 0)
> @@ -160,6 +171,12 @@ void dbs_check_cpu(struct dbs_data *dbs_data, int cpu)
>  			max_load = load;
>  	}
>  
> +#ifdef CONFIG_CPU_FREQ_STAT_DETAILS
> +	freqs.time = ktime_to_ns(ktime_get());
> +	freqs.old = freqs.new = policy->cur;
> +
> +	cpufreq_notify_transition(policy, &freqs, CPUFREQ_LOADCHECK);
> +#endif
>  	dbs_data->cdata->gov_check_cpu(cpu, max_load);
>  }
>  EXPORT_SYMBOL_GPL(dbs_check_cpu);
> diff --git a/drivers/cpufreq/cpufreq_stats.c b/drivers/cpufreq/cpufreq_stats.c
> index fb65dec..2379b1d 100644
> --- a/drivers/cpufreq/cpufreq_stats.c
> +++ b/drivers/cpufreq/cpufreq_stats.c
> @@ -22,6 +22,8 @@
>  #include <linux/notifier.h>
>  #include <asm/cputime.h>
>  
> +#define CPUFREQ_LOAD_TABLE_MAX		20
> +
>  static spinlock_t cpufreq_stats_lock;
>  
>  struct cpufreq_stats {
> @@ -35,6 +37,10 @@ struct cpufreq_stats {
>  	unsigned int *freq_table;
>  #ifdef CONFIG_CPU_FREQ_STAT_DETAILS
>  	unsigned int *trans_table;
> +
> +	struct cpufreq_freqs *load_table;
> +	unsigned int load_last_index;
> +	unsigned int load_max_index;
>  #endif
>  };
>  
> @@ -131,6 +137,38 @@ static ssize_t show_trans_table(struct cpufreq_policy *policy, char *buf)
>  	return len;
>  }
>  cpufreq_freq_attr_ro(trans_table);
> +
> +static ssize_t show_load_table(struct cpufreq_policy *policy, char *buf)
> +{
> +	struct cpufreq_stats *stat = per_cpu(cpufreq_stats_table, policy->cpu);
> +	struct cpufreq_freqs *load_table = stat->load_table;
> +	ssize_t len = 0;
> +	int i;
> +	int j;
> +
> +	len += sprintf(buf + len, "%11s %10s", "Time", "Frequency");
> +	for (j = 0; j < NR_CPUS; j++)
> +		len += sprintf(buf + len, " %3s%d", "cpu", j);
> +	len += sprintf(buf + len, "\n");
> +
> +	i = stat->load_last_index;
> +	do {
> +		len += sprintf(buf + len, "%lld %9d",
> +				load_table[i].time,
> +				load_table[i].old);
> +
> +		for (j = 0; j < NR_CPUS; j++)
> +			len += sprintf(buf + len, " %4d",
> +					load_table[i].load[j]);
> +		len += sprintf(buf + len, "\n");
> +
> +		if (++i == stat->load_max_index)
> +			i = 0;
> +	} while (i != stat->load_last_index);
> +
> +	return len;
> +}
> +cpufreq_freq_attr_ro(load_table);
>  #endif
>  
>  cpufreq_freq_attr_ro(total_trans);
> @@ -141,6 +179,7 @@ static struct attribute *default_attrs[] = {
>  	&time_in_state.attr,
>  #ifdef CONFIG_CPU_FREQ_STAT_DETAILS
>  	&trans_table.attr,
> +	&load_table.attr,
>  #endif
>  	NULL
>  };
> @@ -167,6 +206,9 @@ static void cpufreq_stats_free_table(unsigned int cpu)
>  
>  	if (stat) {
>  		pr_debug("%s: Free stat table\n", __func__);
> +#ifdef CONFIG_CPU_FREQ_STAT_DETAILS
> +		kfree(stat->load_table);
> +#endif
>  		kfree(stat->time_in_state);
>  		kfree(stat);
>  		per_cpu(cpufreq_stats_table, cpu) = NULL;
> @@ -244,6 +286,16 @@ static int cpufreq_stats_create_table(struct cpufreq_policy *policy,
>  
>  #ifdef CONFIG_CPU_FREQ_STAT_DETAILS
>  	stat->trans_table = stat->freq_table + count;
> +
> +	stat->load_max_index = CPUFREQ_LOAD_TABLE_MAX;
> +	stat->load_last_index = 0;
> +
> +	alloc_size = sizeof(struct cpufreq_freqs) * stat->load_max_index;
> +	stat->load_table = kzalloc(alloc_size, GFP_KERNEL);
> +	if (!stat->load_table) {
> +		ret = -ENOMEM;
> +		goto error_out;
> +	}
>  #endif
>  	j = 0;
>  	for (i = 0; table[i].frequency != CPUFREQ_TABLE_END; i++) {
> @@ -312,13 +364,31 @@ static int cpufreq_stat_notifier_trans(struct notifier_block *nb,
>  	struct cpufreq_stats *stat;
>  	int old_index, new_index;
>  
> +#ifdef CONFIG_CPU_FREQ_STAT_DETAILS
> +	if (val != CPUFREQ_POSTCHANGE && val != CPUFREQ_LOADCHECK)
> +#else
>  	if (val != CPUFREQ_POSTCHANGE)
> +#endif
>  		return 0;
>  
>  	stat = per_cpu(cpufreq_stats_table, freq->cpu);
>  	if (!stat)
>  		return 0;
>  
> +#ifdef CONFIG_CPU_FREQ_STAT_DETAILS
> +	if (val == CPUFREQ_LOADCHECK) {
> +		struct cpufreq_freqs *dest_freq;
> +
> +		dest_freq = &stat->load_table[stat->load_last_index];
> +		memcpy(dest_freq, freq, sizeof(struct cpufreq_freqs));
> +
> +		if (++stat->load_last_index == stat->load_max_index)
> +			stat->load_last_index = 0;
> +
> +		return 0;
> +	}
> +#endif
> +
>  	old_index = stat->last_index;
>  	new_index = freq_table_get_index(stat, freq->new);
>  
> diff --git a/include/linux/cpufreq.h b/include/linux/cpufreq.h
> index 037d36a..f780645 100644
> --- a/include/linux/cpufreq.h
> +++ b/include/linux/cpufreq.h
> @@ -140,12 +140,18 @@ static inline bool policy_is_shared(struct cpufreq_policy *policy)
>  #define CPUFREQ_POSTCHANGE	(1)
>  #define CPUFREQ_RESUMECHANGE	(8)
>  #define CPUFREQ_SUSPENDCHANGE	(9)
> +#define CPUFREQ_LOADCHECK	(10)
>  
>  struct cpufreq_freqs {
>  	unsigned int cpu;	/* cpu nr */
>  	unsigned int old;
>  	unsigned int new;
>  	u8 flags;		/* flags of cpufreq_driver, see below. */
> +
> +#ifdef CONFIG_CPU_FREQ_STAT_DETAILS
> +	int64_t time;
> +	unsigned int load[NR_CPUS];
> +#endif
>  };
>  
>  
> 
-- 
I speak only for myself.
Rafael J. Wysocki, Intel Open Source Technology Center.

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [RESEND][PATCH] cpufreq: stats: Add 'load_table' sysfs file to show accumulated data of CPU
  2013-06-11 22:14 ` Rafael J. Wysocki
@ 2013-06-12  0:51   ` Chanwoo Choi
  2013-06-12  4:02   ` Viresh Kumar
  1 sibling, 0 replies; 15+ messages in thread
From: Chanwoo Choi @ 2013-06-12  0:51 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: viresh.kumar, linux-kernel, linux-pm, cpufreq, kyungmin.park,
	myungjoo.ham

Hi Rafael,

On 06/12/2013 07:14 AM, Rafael J. Wysocki wrote:
> On Wednesday, June 05, 2013 05:11:22 PM Chanwoo Choi wrote:
>> This patch add new sysfs file to show previous accumulated data of CPU load
>> as following path. This sysfs file is used to judge the correct system state
>> or determine suitable system resource on user-space.
>> - /sys/devices/system/cpu/cpu0/cpufreq/stats/load_table
>>
>> This sysfs file include following data:
>> - Measurement point of time
>> - CPU frequency
>> - Per-CPU load
>>
>> Signed-off-by: Chanwoo Choi <cw00.choi@samsung.com>
>> Signed-off-by: Myungjoo Ham <myungjoo.ham@samsung.com>
>> Signed-off-by: Kyungmin Park <kyungmin.park@samsung.com>
> Well, first of all, there is the "one value per file" rule for sysfs attributes
> which is evidently violated by this code.

OK, then I will use debugfs for tacking cpu load instead of sysfs.
What is your opinion using debugfs?

> Second, this looks like a feature needed to handle one particular platform, so
> why do you want to add it to the cpufreq core?

I have cpu hotplug feature in user-space to reduce power-consumption
because of estimating difference power-consumption according to a number
of online cpu. The cpu hotplug feature in user-space must need accumulated data of
per-cpu load and current cpu frequency to monitor current system state(cpu load).

My goal is that cpu hotplug feature must use only standard kernel node and
be always operated on mainline kernel. So, I want to add load_table sysfs as standard kernel node.

Thanks,

Best Regards,
Chanwoo Choi
>> ---
>>
>> Additionally, I explain an example using 'load_table' sysfs entry.
>>
>> Exynos4412 series has Quad-core and all cores share the power-line.
>> I cann't set diffent voltage/frequency to each CPU. To reduce power-
>> consumption, I certainly have to turn on/off CPU online state
>> according to CPU load on runtime. As a result, I peridically need to
>> monitor current cpu state to determine a proper amount of system
>> resource(necessary number of online cpu) and to delete wasted power.
>> So, I need 'load_table' sysfs file to monitor current cpu state.
>>
>> I add a table which show power consumption of target based on
>> Exynos4412 SoC. This table indicate the difference power-consumption
>> according to a number of online core and with same number of running task.
>>
>> [Environment of power estimation]
>> - cpufreq governor used performance mode to estimate power-consumption on each frequency step.
>> - Use infinite-loop test program which execute while statement infinitely.
>> - Always measure the power consumption under same temperature during 1 minutes.
>> - Unit is mA.
>> ------------------------------------------------------------------------------------------------------------------------------------
>> A number of Online core	| Core 1	| Core 2		| Core 3			| Core 4
>> A number of nr_running	| 0	1	| 0	1	2	| 0	1	2	3	| 0	1	2	3	4
>> ------------------------------------------------------------------------------------------------------------------------------------
>>  CPU Frequency		| 
>>  800  MHz		| 293 	330 	| 295 	338 	379 	| 300 	339 	386 	433 	| 303 	341 	390 	412 	482 
>>  1000 MHz		| 312 	371 	| 316 	377 	435 	| 318 	383 	454 	522 	| 322 	391 	462 	526 	596 
>>  1200 MHz		| 323 	404 	| 328 	418 	504 	| 336 	423 	521 	615 	| 343 	433 	499 	639 	748 
>>  1600 MHz 		| 380 	525 	| 388 	556 	771 	| 399 	575 	771 	1011 	| 412 	597 	822 	1172 	1684 
>> ------------------------------------------------------------------------------------------------------------------------------------
>>
>> For example,
>> The case A/B/C have the same condition except for a number of online core.
>> - case A: Online core is 2, 1000MHz and nr_running is 1	: 377mA
>> - case B: Online core is 3, 1000MHz and nr_running is 1	: 383mA
>> - case C: Online core is 4, 1000Mz and nr_running is 1	: 391mA
>>
>> If system has only one running task, cpu hotplug policy, by monitoring
>> cpu state through 'load_table' sysfs file on user-space,
>> will determine 'case A' state for reducing power-consumption.
>>
>> Show the result of reading 'load_table sysfs file as following:
>> - cpufreq governor is ondemand_org governor.
>>
>> $ cat /sys/devices/system/cpu/cpu0/cpufreq/stats/load_table
>>        Time  Frequency cpu0 cpu1 cpu2 cpu3
>> 1300500043122   1600000   32    6    0   26
>> 1300600079080    800000   63   10    2   45
>> 1300700065288    800000   51    9    1   42
>> 1300800228747    800000   51    9    1   43
>> 1300900182997    800000   78   11    3   47
>> 1301000106163    800000   96   26    6   48
>> 1301100056247   1600000   45    7    1   27
>> 1301200071373   1000000   55    9    1   37
>> 1301300096082    800000   54   10    0   45
>> 1301400132832    800000   70   11    2   46
>> 1301500082290    800000   61   11    1   43
>> 1301600236415    800000   61    9    2   43
>> 1301700071498    800000   59   10    2   43
>> 1301800159290    800000   55    9    1   42
>> 1301900076332    800000   66   10    2   43
>> 1302000102165    800000   47    9    0   43
>> 1302100086623    800000   75   11    2   50
>> 1302200101082    800000   66   10    4   46
>> 1302300108873    800000   53   10    1   44
>> 1302400373373    600000   63   12    1   54

>> Please let me know some opinion about this patch.
>>
>> Best regards and Thanks,
>> Chanwoo Choi
>>
>> ---
>>  drivers/cpufreq/cpufreq.c          |  4 +++
>>  drivers/cpufreq/cpufreq_governor.c | 21 ++++++++++--
>>  drivers/cpufreq/cpufreq_stats.c    | 70 ++++++++++++++++++++++++++++++++++++++
>>  include/linux/cpufreq.h            |  6 ++++
>>  4 files changed, 99 insertions(+), 2 deletions(-)
>>
>> diff --git a/drivers/cpufreq/cpufreq.c b/drivers/cpufreq/cpufreq.c
>> index 2d53f47..cbaaff0 100644
>> --- a/drivers/cpufreq/cpufreq.c
>> +++ b/drivers/cpufreq/cpufreq.c
>> @@ -292,6 +292,10 @@ void __cpufreq_notify_transition(struct cpufreq_policy *policy,
>>  		if (likely(policy) && likely(policy->cpu == freqs->cpu))
>>  			policy->cur = freqs->new;
>>  		break;
>> +	case CPUFREQ_LOADCHECK:
>> +		srcu_notifier_call_chain(&cpufreq_transition_notifier_list,
>> +				CPUFREQ_LOADCHECK, freqs);
>> +		break;
>>  	}
>>  }
>>  /**
>> diff --git a/drivers/cpufreq/cpufreq_governor.c b/drivers/cpufreq/cpufreq_governor.c
>> index 5af40ad..bc50624 100644
>> --- a/drivers/cpufreq/cpufreq_governor.c
>> +++ b/drivers/cpufreq/cpufreq_governor.c
>> @@ -23,12 +23,17 @@
>>  #include <linux/kernel_stat.h>
>>  #include <linux/mutex.h>
>>  #include <linux/slab.h>
>> +#include <linux/sched.h>
>>  #include <linux/tick.h>
>>  #include <linux/types.h>
>>  #include <linux/workqueue.h>
>>  
>>  #include "cpufreq_governor.h"
>>  
>> +#ifdef CONFIG_CPU_FREQ_STAT_DETAILS
>> +	struct cpufreq_freqs freqs;
>> +#endif
>> +
>>  static struct kobject *get_governor_parent_kobj(struct cpufreq_policy *policy)
>>  {
>>  	if (have_governor_per_policy())
>> @@ -143,11 +148,17 @@ void dbs_check_cpu(struct dbs_data *dbs_data, int cpu)
>>  			idle_time += jiffies_to_usecs(cur_nice_jiffies);
>>  		}
>>  
>> -		if (unlikely(!wall_time || wall_time < idle_time))
>> +		if (unlikely(!wall_time || wall_time < idle_time)) {
>> +#ifdef CONFIG_CPU_FREQ_STAT_DETAILS
>> +			freqs.load[j] = 0;
>> +#endif
>>  			continue;
>> +		}
>>  
>>  		load = 100 * (wall_time - idle_time) / wall_time;
>> -
>> +#ifdef CONFIG_CPU_FREQ_STAT_DETAILS
>> +		freqs.load[j] = load;
>> +#endif
>>  		if (dbs_data->cdata->governor == GOV_ONDEMAND) {
>>  			int freq_avg = __cpufreq_driver_getavg(policy, j);
>>  			if (freq_avg <= 0)
>> @@ -160,6 +171,12 @@ void dbs_check_cpu(struct dbs_data *dbs_data, int cpu)
>>  			max_load = load;
>>  	}
>>  
>> +#ifdef CONFIG_CPU_FREQ_STAT_DETAILS
>> +	freqs.time = ktime_to_ns(ktime_get());
>> +	freqs.old = freqs.new = policy->cur;
>> +
>> +	cpufreq_notify_transition(policy, &freqs, CPUFREQ_LOADCHECK);
>> +#endif
>>  	dbs_data->cdata->gov_check_cpu(cpu, max_load);
>>  }
>>  EXPORT_SYMBOL_GPL(dbs_check_cpu);
>> diff --git a/drivers/cpufreq/cpufreq_stats.c b/drivers/cpufreq/cpufreq_stats.c
>> index fb65dec..2379b1d 100644
>> --- a/drivers/cpufreq/cpufreq_stats.c
>> +++ b/drivers/cpufreq/cpufreq_stats.c
>> @@ -22,6 +22,8 @@
>>  #include <linux/notifier.h>
>>  #include <asm/cputime.h>
>>  
>> +#define CPUFREQ_LOAD_TABLE_MAX		20
>> +
>>  static spinlock_t cpufreq_stats_lock;
>>  
>>  struct cpufreq_stats {
>> @@ -35,6 +37,10 @@ struct cpufreq_stats {
>>  	unsigned int *freq_table;
>>  #ifdef CONFIG_CPU_FREQ_STAT_DETAILS
>>  	unsigned int *trans_table;
>> +
>> +	struct cpufreq_freqs *load_table;
>> +	unsigned int load_last_index;
>> +	unsigned int load_max_index;
>>  #endif
>>  };
>>  
>> @@ -131,6 +137,38 @@ static ssize_t show_trans_table(struct cpufreq_policy *policy, char *buf)
>>  	return len;
>>  }
>>  cpufreq_freq_attr_ro(trans_table);
>> +
>> +static ssize_t show_load_table(struct cpufreq_policy *policy, char *buf)
>> +{
>> +	struct cpufreq_stats *stat = per_cpu(cpufreq_stats_table, policy->cpu);
>> +	struct cpufreq_freqs *load_table = stat->load_table;
>> +	ssize_t len = 0;
>> +	int i;
>> +	int j;
>> +
>> +	len += sprintf(buf + len, "%11s %10s", "Time", "Frequency");
>> +	for (j = 0; j < NR_CPUS; j++)
>> +		len += sprintf(buf + len, " %3s%d", "cpu", j);
>> +	len += sprintf(buf + len, "\n");
>> +
>> +	i = stat->load_last_index;
>> +	do {
>> +		len += sprintf(buf + len, "%lld %9d",
>> +				load_table[i].time,
>> +				load_table[i].old);
>> +
>> +		for (j = 0; j < NR_CPUS; j++)
>> +			len += sprintf(buf + len, " %4d",
>> +					load_table[i].load[j]);
>> +		len += sprintf(buf + len, "\n");
>> +
>> +		if (++i == stat->load_max_index)
>> +			i = 0;
>> +	} while (i != stat->load_last_index);
>> +
>> +	return len;
>> +}
>> +cpufreq_freq_attr_ro(load_table);
>>  #endif
>>  
>>  cpufreq_freq_attr_ro(total_trans);
>> @@ -141,6 +179,7 @@ static struct attribute *default_attrs[] = {
>>  	&time_in_state.attr,
>>  #ifdef CONFIG_CPU_FREQ_STAT_DETAILS
>>  	&trans_table.attr,
>> +	&load_table.attr,
>>  #endif
>>  	NULL
>>  };
>> @@ -167,6 +206,9 @@ static void cpufreq_stats_free_table(unsigned int cpu)
>>  
>>  	if (stat) {
>>  		pr_debug("%s: Free stat table\n", __func__);
>> +#ifdef CONFIG_CPU_FREQ_STAT_DETAILS
>> +		kfree(stat->load_table);
>> +#endif
>>  		kfree(stat->time_in_state);
>>  		kfree(stat);
>>  		per_cpu(cpufreq_stats_table, cpu) = NULL;
>> @@ -244,6 +286,16 @@ static int cpufreq_stats_create_table(struct cpufreq_policy *policy,
>>  
>>  #ifdef CONFIG_CPU_FREQ_STAT_DETAILS
>>  	stat->trans_table = stat->freq_table + count;
>> +
>> +	stat->load_max_index = CPUFREQ_LOAD_TABLE_MAX;
>> +	stat->load_last_index = 0;
>> +
>> +	alloc_size = sizeof(struct cpufreq_freqs) * stat->load_max_index;
>> +	stat->load_table = kzalloc(alloc_size, GFP_KERNEL);
>> +	if (!stat->load_table) {
>> +		ret = -ENOMEM;
>> +		goto error_out;
>> +	}
>>  #endif
>>  	j = 0;
>>  	for (i = 0; table[i].frequency != CPUFREQ_TABLE_END; i++) {
>> @@ -312,13 +364,31 @@ static int cpufreq_stat_notifier_trans(struct notifier_block *nb,
>>  	struct cpufreq_stats *stat;
>>  	int old_index, new_index;
>>  
>> +#ifdef CONFIG_CPU_FREQ_STAT_DETAILS
>> +	if (val != CPUFREQ_POSTCHANGE && val != CPUFREQ_LOADCHECK)
>> +#else
>>  	if (val != CPUFREQ_POSTCHANGE)
>> +#endif
>>  		return 0;
>>  
>>  	stat = per_cpu(cpufreq_stats_table, freq->cpu);
>>  	if (!stat)
>>  		return 0;
>>  
>> +#ifdef CONFIG_CPU_FREQ_STAT_DETAILS
>> +	if (val == CPUFREQ_LOADCHECK) {
>> +		struct cpufreq_freqs *dest_freq;
>> +
>> +		dest_freq = &stat->load_table[stat->load_last_index];
>> +		memcpy(dest_freq, freq, sizeof(struct cpufreq_freqs));
>> +
>> +		if (++stat->load_last_index == stat->load_max_index)
>> +			stat->load_last_index = 0;
>> +
>> +		return 0;
>> +	}
>> +#endif
>> +
>>  	old_index = stat->last_index;
>>  	new_index = freq_table_get_index(stat, freq->new);
>>  
>> diff --git a/include/linux/cpufreq.h b/include/linux/cpufreq.h
>> index 037d36a..f780645 100644
>> --- a/include/linux/cpufreq.h
>> +++ b/include/linux/cpufreq.h
>> @@ -140,12 +140,18 @@ static inline bool policy_is_shared(struct cpufreq_policy *policy)
>>  #define CPUFREQ_POSTCHANGE	(1)
>>  #define CPUFREQ_RESUMECHANGE	(8)
>>  #define CPUFREQ_SUSPENDCHANGE	(9)
>> +#define CPUFREQ_LOADCHECK	(10)
>>  
>>  struct cpufreq_freqs {
>>  	unsigned int cpu;	/* cpu nr */
>>  	unsigned int old;
>>  	unsigned int new;
>>  	u8 flags;		/* flags of cpufreq_driver, see below. */
>> +
>> +#ifdef CONFIG_CPU_FREQ_STAT_DETAILS
>> +	int64_t time;
>> +	unsigned int load[NR_CPUS];
>> +#endif
>>  };
>>  
>>  
>>


^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [RESEND][PATCH] cpufreq: stats: Add 'load_table' sysfs file to show accumulated data of CPU
  2013-06-11 22:14 ` Rafael J. Wysocki
  2013-06-12  0:51   ` Chanwoo Choi
@ 2013-06-12  4:02   ` Viresh Kumar
  2013-06-12 11:05     ` Rafael J. Wysocki
  1 sibling, 1 reply; 15+ messages in thread
From: Viresh Kumar @ 2013-06-12  4:02 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: Chanwoo Choi, linux-kernel, linux-pm, cpufreq, kyungmin.park,
	myungjoo.ham

On 12 June 2013 03:44, Rafael J. Wysocki <rjw@sisk.pl> wrote:
> On Wednesday, June 05, 2013 05:11:22 PM Chanwoo Choi wrote:
>> This patch add new sysfs file to show previous accumulated data of CPU load
>> as following path. This sysfs file is used to judge the correct system state
>> or determine suitable system resource on user-space.
>> - /sys/devices/system/cpu/cpu0/cpufreq/stats/load_table
>>
>> This sysfs file include following data:
>> - Measurement point of time
>> - CPU frequency
>> - Per-CPU load
>>
>> Signed-off-by: Chanwoo Choi <cw00.choi@samsung.com>
>> Signed-off-by: Myungjoo Ham <myungjoo.ham@samsung.com>
>> Signed-off-by: Kyungmin Park <kyungmin.park@samsung.com>
>
> Well, first of all, there is the "one value per file" rule for sysfs attributes
> which is evidently violated by this code.

Even this was enclosed in CONFIG_CPU_FREQ_STAT_DETAILS,
so even sysfs isn't that bad as we already had something similar here.

> Second, this looks like a feature needed to handle one particular platform, so
> why do you want to add it to the cpufreq core?

I really felt this would be useful to others. They can track the load on
all cores for some time and that will really be useful. People can
understand their loads and system more easily with this patch in.

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [RESEND][PATCH] cpufreq: stats: Add 'load_table' sysfs file to show accumulated data of CPU
  2013-06-12  4:02   ` Viresh Kumar
@ 2013-06-12 11:05     ` Rafael J. Wysocki
  2013-06-14  2:11       ` Chanwoo Choi
  0 siblings, 1 reply; 15+ messages in thread
From: Rafael J. Wysocki @ 2013-06-12 11:05 UTC (permalink / raw)
  To: Viresh Kumar
  Cc: Chanwoo Choi, linux-kernel, linux-pm, cpufreq, kyungmin.park,
	myungjoo.ham

On Wednesday, June 12, 2013 09:32:16 AM Viresh Kumar wrote:
> On 12 June 2013 03:44, Rafael J. Wysocki <rjw@sisk.pl> wrote:
> > On Wednesday, June 05, 2013 05:11:22 PM Chanwoo Choi wrote:
> >> This patch add new sysfs file to show previous accumulated data of CPU load
> >> as following path. This sysfs file is used to judge the correct system state
> >> or determine suitable system resource on user-space.
> >> - /sys/devices/system/cpu/cpu0/cpufreq/stats/load_table
> >>
> >> This sysfs file include following data:
> >> - Measurement point of time
> >> - CPU frequency
> >> - Per-CPU load
> >>
> >> Signed-off-by: Chanwoo Choi <cw00.choi@samsung.com>
> >> Signed-off-by: Myungjoo Ham <myungjoo.ham@samsung.com>
> >> Signed-off-by: Kyungmin Park <kyungmin.park@samsung.com>
> >
> > Well, first of all, there is the "one value per file" rule for sysfs attributes
> > which is evidently violated by this code.
> 
> Even this was enclosed in CONFIG_CPU_FREQ_STAT_DETAILS,
> so even sysfs isn't that bad as we already had something similar here.

Yes, we did, and yes, it was a mistake.  It should have been in debugfs from
the very beginning.

> > Second, this looks like a feature needed to handle one particular platform, so
> > why do you want to add it to the cpufreq core?
> 
> I really felt this would be useful to others. They can track the load on
> all cores for some time and that will really be useful. People can
> understand their loads and system more easily with this patch in.

If it were in debugfs, I'd have no objections.

Thanks,
Rafael


-- 
I speak only for myself.
Rafael J. Wysocki, Intel Open Source Technology Center.

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [RESEND][PATCH] cpufreq: stats: Add 'load_table' sysfs file to show accumulated data of CPU
  2013-06-12 11:05     ` Rafael J. Wysocki
@ 2013-06-14  2:11       ` Chanwoo Choi
  2013-06-14  4:11         ` Viresh Kumar
  0 siblings, 1 reply; 15+ messages in thread
From: Chanwoo Choi @ 2013-06-14  2:11 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: Viresh Kumar, linux-kernel, linux-pm, cpufreq, kyungmin.park,
	myungjoo.ham

On 06/12/2013 08:05 PM, Rafael J. Wysocki wrote:
> On Wednesday, June 12, 2013 09:32:16 AM Viresh Kumar wrote:
>> On 12 June 2013 03:44, Rafael J. Wysocki <rjw@sisk.pl> wrote:
>>> On Wednesday, June 05, 2013 05:11:22 PM Chanwoo Choi wrote:
>>>> This patch add new sysfs file to show previous accumulated data of CPU load
>>>> as following path. This sysfs file is used to judge the correct system state
>>>> or determine suitable system resource on user-space.
>>>> - /sys/devices/system/cpu/cpu0/cpufreq/stats/load_table
>>>>
>>>> This sysfs file include following data:
>>>> - Measurement point of time
>>>> - CPU frequency
>>>> - Per-CPU load
>>>>
>>>> Signed-off-by: Chanwoo Choi <cw00.choi@samsung.com>
>>>> Signed-off-by: Myungjoo Ham <myungjoo.ham@samsung.com>
>>>> Signed-off-by: Kyungmin Park <kyungmin.park@samsung.com>
>>>
>>> Well, first of all, there is the "one value per file" rule for sysfs attributes
>>> which is evidently violated by this code.
>>
>> Even this was enclosed in CONFIG_CPU_FREQ_STAT_DETAILS,
>> so even sysfs isn't that bad as we already had something similar here.
> 
> Yes, we did, and yes, it was a mistake.  It should have been in debugfs from
> the very beginning.
> 
>>> Second, this looks like a feature needed to handle one particular platform, so
>>> why do you want to add it to the cpufreq core?
>>
>> I really felt this would be useful to others. They can track the load on
>> all cores for some time and that will really be useful. People can
>> understand their loads and system more easily with this patch in.
> 
> If it were in debugfs, I'd have no objections.
> 

OK, I will reimplement load_table node by using debugfs instead of sysfs.

Thanks.

Best Regards,
Chanwoo Choi

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [RESEND][PATCH] cpufreq: stats: Add 'load_table' sysfs file to show accumulated data of CPU
  2013-06-14  2:11       ` Chanwoo Choi
@ 2013-06-14  4:11         ` Viresh Kumar
  2013-06-14 12:48           ` Rafael J. Wysocki
  0 siblings, 1 reply; 15+ messages in thread
From: Viresh Kumar @ 2013-06-14  4:11 UTC (permalink / raw)
  To: Chanwoo Choi
  Cc: Rafael J. Wysocki, linux-kernel, linux-pm, cpufreq,
	kyungmin.park, myungjoo.ham

On 14 June 2013 07:41, Chanwoo Choi <cw00.choi@samsung.com> wrote:
> OK, I will reimplement load_table node by using debugfs instead of sysfs.

@Rafael: Would it be move trans_table to debugfs??

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [RESEND][PATCH] cpufreq: stats: Add 'load_table' sysfs file to show accumulated data of CPU
  2013-06-14  4:11         ` Viresh Kumar
@ 2013-06-14 12:48           ` Rafael J. Wysocki
  2013-06-14 14:49             ` Viresh Kumar
  0 siblings, 1 reply; 15+ messages in thread
From: Rafael J. Wysocki @ 2013-06-14 12:48 UTC (permalink / raw)
  To: Viresh Kumar
  Cc: Chanwoo Choi, linux-kernel, linux-pm, cpufreq, kyungmin.park,
	myungjoo.ham

On Friday, June 14, 2013 09:41:13 AM Viresh Kumar wrote:
> On 14 June 2013 07:41, Chanwoo Choi <cw00.choi@samsung.com> wrote:
> > OK, I will reimplement load_table node by using debugfs instead of sysfs.
> 
> @Rafael: Would it be move trans_table to debugfs??

I'm not sure who uses it now.  If there are any user space tools depending on
it in use, I'm afraid we can't just move it around.

Thanks,
Rafael


-- 
I speak only for myself.
Rafael J. Wysocki, Intel Open Source Technology Center.

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [RESEND][PATCH] cpufreq: stats: Add 'load_table' sysfs file to show accumulated data of CPU
  2013-06-14 12:48           ` Rafael J. Wysocki
@ 2013-06-14 14:49             ` Viresh Kumar
  0 siblings, 0 replies; 15+ messages in thread
From: Viresh Kumar @ 2013-06-14 14:49 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: Chanwoo Choi, linux-kernel, linux-pm, cpufreq, kyungmin.park,
	myungjoo.ham

On 14 June 2013 18:18, Rafael J. Wysocki <rjw@sisk.pl> wrote:
> On Friday, June 14, 2013 09:41:13 AM Viresh Kumar wrote:
>> @Rafael: Would it be move trans_table to debugfs??

Oops!! What stupid english.

s/be/be sensible to/

> I'm not sure who uses it now.  If there are any user space tools depending on
> it in use, I'm afraid we can't just move it around.

Even I am also unsure about it.

^ permalink raw reply	[flat|nested] 15+ messages in thread

end of thread, other threads:[~2013-06-14 14:49 UTC | newest]

Thread overview: 15+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2013-06-05  8:11 [RESEND][PATCH] cpufreq: stats: Add 'load_table' sysfs file to show accumulated data of CPU Chanwoo Choi
2013-06-07 10:23 ` Viresh Kumar
2013-06-10 12:13   ` Chanwoo Choi
2013-06-11  5:06     ` Viresh Kumar
2013-06-11  5:19       ` Chanwoo Choi
2013-06-11  5:22         ` Viresh Kumar
2013-06-11  6:10           ` Chanwoo Choi
2013-06-11 22:14 ` Rafael J. Wysocki
2013-06-12  0:51   ` Chanwoo Choi
2013-06-12  4:02   ` Viresh Kumar
2013-06-12 11:05     ` Rafael J. Wysocki
2013-06-14  2:11       ` Chanwoo Choi
2013-06-14  4:11         ` Viresh Kumar
2013-06-14 12:48           ` Rafael J. Wysocki
2013-06-14 14:49             ` Viresh Kumar

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).