* [PATCH v6 1/6] hwmon: (fam15h_power) Add CPU_SUP_AMD as the dependence
2016-04-06 7:44 [PATCH v6 0/6] hwmon: (fam15h_power) Introduce an accumulated power reporting algorithm Huang Rui
@ 2016-04-06 7:44 ` Huang Rui
2016-04-19 13:35 ` [v6,1/6] " Guenter Roeck
2016-04-06 7:44 ` [PATCH v6 2/6] hwmon: (fam15h_power) Add compute unit accumulated power Huang Rui
` (4 subsequent siblings)
5 siblings, 1 reply; 11+ messages in thread
From: Huang Rui @ 2016-04-06 7:44 UTC (permalink / raw)
To: Guenter Roeck, Jean Delvare
Cc: linux-hwmon, linux-kernel, Borislav Petkov, Sherry Hurwitz, Huang Rui
This patch adds CONFIG_CPU_SUP_AMD as the dependence of fam15h_power
driver. Because the following patch will use the interface from
x86/kernel/cpu/amd.c.
Otherwise, the below error might be encountered:
All errors (new ones prefixed by >>):
drivers/built-in.o: In function `fam15h_power_probe':
>> fam15h_power.c:(.text+0x26e3a3): undefined reference to
>> `amd_get_cores_per_cu'
fam15h_power.c:(.text+0x26e41e): undefined reference to
`amd_get_cores_per_cu'
Reported-by: build test robot <lkp@intel.com>
Acked-by: Borislav Petkov <bp@suse.de>
Signed-off-by: Huang Rui <ray.huang@amd.com>
---
drivers/hwmon/Kconfig | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/hwmon/Kconfig b/drivers/hwmon/Kconfig
index 5c2d13a..4be3792 100644
--- a/drivers/hwmon/Kconfig
+++ b/drivers/hwmon/Kconfig
@@ -288,7 +288,7 @@ config SENSORS_K10TEMP
config SENSORS_FAM15H_POWER
tristate "AMD Family 15h processor power"
- depends on X86 && PCI
+ depends on X86 && PCI && CPU_SUP_AMD
help
If you say yes here you get support for processor power
information of your AMD family 15h CPU.
--
1.9.1
^ permalink raw reply related [flat|nested] 11+ messages in thread
* Re: [v6,1/6] hwmon: (fam15h_power) Add CPU_SUP_AMD as the dependence
2016-04-06 7:44 ` [PATCH v6 1/6] hwmon: (fam15h_power) Add CPU_SUP_AMD as the dependence Huang Rui
@ 2016-04-19 13:35 ` Guenter Roeck
0 siblings, 0 replies; 11+ messages in thread
From: Guenter Roeck @ 2016-04-19 13:35 UTC (permalink / raw)
To: Huang Rui
Cc: Jean Delvare, linux-hwmon, linux-kernel, Borislav Petkov, Sherry Hurwitz
On Wed, Apr 06, 2016 at 03:44:10PM +0800, Huang Rui wrote:
> This patch adds CONFIG_CPU_SUP_AMD as the dependence of fam15h_power
> driver. Because the following patch will use the interface from
> x86/kernel/cpu/amd.c.
>
> Otherwise, the below error might be encountered:
>
> All errors (new ones prefixed by >>):
>
> drivers/built-in.o: In function `fam15h_power_probe':
> >> fam15h_power.c:(.text+0x26e3a3): undefined reference to
> >> `amd_get_cores_per_cu'
> fam15h_power.c:(.text+0x26e41e): undefined reference to
> `amd_get_cores_per_cu'
>
> Reported-by: build test robot <lkp@intel.com>
> Acked-by: Borislav Petkov <bp@suse.de>
> Signed-off-by: Huang Rui <ray.huang@amd.com>
>
Series applied.
Thanks,
Guenter
> ---
> drivers/hwmon/Kconfig | 2 +-
> 1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/drivers/hwmon/Kconfig b/drivers/hwmon/Kconfig
> index 5c2d13a..4be3792 100644
> --- a/drivers/hwmon/Kconfig
> +++ b/drivers/hwmon/Kconfig
> @@ -288,7 +288,7 @@ config SENSORS_K10TEMP
>
> config SENSORS_FAM15H_POWER
> tristate "AMD Family 15h processor power"
> - depends on X86 && PCI
> + depends on X86 && PCI && CPU_SUP_AMD
> help
> If you say yes here you get support for processor power
> information of your AMD family 15h CPU.
^ permalink raw reply [flat|nested] 11+ messages in thread
* [PATCH v6 2/6] hwmon: (fam15h_power) Add compute unit accumulated power
2016-04-06 7:44 [PATCH v6 0/6] hwmon: (fam15h_power) Introduce an accumulated power reporting algorithm Huang Rui
2016-04-06 7:44 ` [PATCH v6 1/6] hwmon: (fam15h_power) Add CPU_SUP_AMD as the dependence Huang Rui
@ 2016-04-06 7:44 ` Huang Rui
2016-04-06 15:30 ` Guenter Roeck
2016-04-06 7:44 ` [PATCH v6 3/6] hwmon: (fam15h_power) Add ptsc counter value for " Huang Rui
` (3 subsequent siblings)
5 siblings, 1 reply; 11+ messages in thread
From: Huang Rui @ 2016-04-06 7:44 UTC (permalink / raw)
To: Guenter Roeck, Jean Delvare
Cc: linux-hwmon, linux-kernel, Borislav Petkov, Sherry Hurwitz, Huang Rui
This patch adds a member in fam15h_power_data which specifies the
compute unit accumulated power. It adds do_read_registers_on_cu to do
all the read to all MSRs and run it on one of the online cores on each
compute unit with smp_call_function_many(). This behavior can decrease
IPI numbers.
Suggested-by: Borislav Petkov <bp@alien8.de>
Signed-off-by: Huang Rui <ray.huang@amd.com>
---
drivers/hwmon/fam15h_power.c | 72 +++++++++++++++++++++++++++++++++++++++++++-
1 file changed, 71 insertions(+), 1 deletion(-)
diff --git a/drivers/hwmon/fam15h_power.c b/drivers/hwmon/fam15h_power.c
index 4f695d8..4edbaf0 100644
--- a/drivers/hwmon/fam15h_power.c
+++ b/drivers/hwmon/fam15h_power.c
@@ -25,6 +25,8 @@
#include <linux/module.h>
#include <linux/pci.h>
#include <linux/bitops.h>
+#include <linux/cpu.h>
+#include <linux/cpumask.h>
#include <asm/processor.h>
#include <asm/msr.h>
@@ -44,7 +46,9 @@ MODULE_LICENSE("GPL");
#define FAM15H_MIN_NUM_ATTRS 2
#define FAM15H_NUM_GROUPS 2
+#define MAX_CUS 8
+#define MSR_F15H_CU_PWR_ACCUMULATOR 0xc001007a
#define MSR_F15H_CU_MAX_PWR_ACCUMULATOR 0xc001007b
#define PCI_DEVICE_ID_AMD_15H_M70H_NB_F4 0x15b4
@@ -59,6 +63,8 @@ struct fam15h_power_data {
struct attribute_group group;
/* maximum accumulated power of a compute unit */
u64 max_cu_acc_power;
+ /* accumulated power of the compute units */
+ u64 cu_acc_power[MAX_CUS];
};
static ssize_t show_power(struct device *dev,
@@ -125,6 +131,70 @@ static ssize_t show_power_crit(struct device *dev,
}
static DEVICE_ATTR(power1_crit, S_IRUGO, show_power_crit, NULL);
+static void do_read_registers_on_cu(void *_data)
+{
+ struct fam15h_power_data *data = _data;
+ int cpu, cu;
+
+ cpu = smp_processor_id();
+
+ /*
+ * With the new x86 topology modelling, cpu core id actually
+ * is compute unit id.
+ */
+ cu = cpu_data(cpu).cpu_core_id;
+
+ rdmsrl_safe(MSR_F15H_CU_PWR_ACCUMULATOR, &data->cu_acc_power[cu]);
+}
+
+/*
+ * This function is only able to be called when CPUID
+ * Fn8000_0007:EDX[12] is set.
+ */
+static int read_registers(struct fam15h_power_data *data)
+{
+ int this_cpu, ret, cpu;
+ int core, this_core;
+ cpumask_var_t mask;
+
+ ret = zalloc_cpumask_var(&mask, GFP_KERNEL);
+ if (!ret)
+ return -ENOMEM;
+
+ get_online_cpus();
+ this_cpu = smp_processor_id();
+
+ /*
+ * Choose the first online core of each compute unit, and then
+ * read their MSR value of power and ptsc in a single IPI,
+ * because the MSR value of CPU core represent the compute
+ * unit's.
+ */
+ core = -1;
+
+ for_each_online_cpu(cpu) {
+ this_core = topology_core_id(cpu);
+
+ if (this_core == core)
+ continue;
+
+ core = this_core;
+
+ /* get any CPU on this compute unit */
+ cpumask_set_cpu(cpumask_any(topology_sibling_cpumask(cpu)), mask);
+ }
+
+ if (cpumask_test_cpu(this_cpu, mask))
+ do_read_registers_on_cu(data);
+
+ smp_call_function_many(mask, do_read_registers_on_cu, data, true);
+ put_online_cpus();
+
+ free_cpumask_var(mask);
+
+ return 0;
+}
+
static int fam15h_power_init_attrs(struct pci_dev *pdev,
struct fam15h_power_data *data)
{
@@ -263,7 +333,7 @@ static int fam15h_power_init_data(struct pci_dev *f4,
data->max_cu_acc_power = tmp;
- return 0;
+ return read_registers(data);
}
static int fam15h_power_probe(struct pci_dev *pdev,
--
1.9.1
^ permalink raw reply related [flat|nested] 11+ messages in thread
* Re: [PATCH v6 2/6] hwmon: (fam15h_power) Add compute unit accumulated power
2016-04-06 7:44 ` [PATCH v6 2/6] hwmon: (fam15h_power) Add compute unit accumulated power Huang Rui
@ 2016-04-06 15:30 ` Guenter Roeck
2016-04-07 5:05 ` Huang Rui
0 siblings, 1 reply; 11+ messages in thread
From: Guenter Roeck @ 2016-04-06 15:30 UTC (permalink / raw)
To: Huang Rui
Cc: Jean Delvare, linux-hwmon, linux-kernel, Borislav Petkov, Sherry Hurwitz
On Wed, Apr 06, 2016 at 03:44:11PM +0800, Huang Rui wrote:
> This patch adds a member in fam15h_power_data which specifies the
> compute unit accumulated power. It adds do_read_registers_on_cu to do
> all the read to all MSRs and run it on one of the online cores on each
> compute unit with smp_call_function_many(). This behavior can decrease
> IPI numbers.
>
> Suggested-by: Borislav Petkov <bp@alien8.de>
> Signed-off-by: Huang Rui <ray.huang@amd.com>
> ---
> drivers/hwmon/fam15h_power.c | 72 +++++++++++++++++++++++++++++++++++++++++++-
> 1 file changed, 71 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/hwmon/fam15h_power.c b/drivers/hwmon/fam15h_power.c
> index 4f695d8..4edbaf0 100644
> --- a/drivers/hwmon/fam15h_power.c
> +++ b/drivers/hwmon/fam15h_power.c
> @@ -25,6 +25,8 @@
> #include <linux/module.h>
> #include <linux/pci.h>
> #include <linux/bitops.h>
> +#include <linux/cpu.h>
> +#include <linux/cpumask.h>
> #include <asm/processor.h>
> #include <asm/msr.h>
>
> @@ -44,7 +46,9 @@ MODULE_LICENSE("GPL");
>
> #define FAM15H_MIN_NUM_ATTRS 2
> #define FAM15H_NUM_GROUPS 2
> +#define MAX_CUS 8
>
> +#define MSR_F15H_CU_PWR_ACCUMULATOR 0xc001007a
> #define MSR_F15H_CU_MAX_PWR_ACCUMULATOR 0xc001007b
>
> #define PCI_DEVICE_ID_AMD_15H_M70H_NB_F4 0x15b4
> @@ -59,6 +63,8 @@ struct fam15h_power_data {
> struct attribute_group group;
> /* maximum accumulated power of a compute unit */
> u64 max_cu_acc_power;
> + /* accumulated power of the compute units */
> + u64 cu_acc_power[MAX_CUS];
> };
>
> static ssize_t show_power(struct device *dev,
> @@ -125,6 +131,70 @@ static ssize_t show_power_crit(struct device *dev,
> }
> static DEVICE_ATTR(power1_crit, S_IRUGO, show_power_crit, NULL);
>
> +static void do_read_registers_on_cu(void *_data)
> +{
> + struct fam15h_power_data *data = _data;
> + int cpu, cu;
> +
> + cpu = smp_processor_id();
> +
Is this function now defined in non-SMP code ? If so, can you point me to the
patch or branch introducing it ? It doesn't seem to be in mainline or in -next
unless I am missing it.
> + /*
> + * With the new x86 topology modelling, cpu core id actually
> + * is compute unit id.
> + */
> + cu = cpu_data(cpu).cpu_core_id;
> +
> + rdmsrl_safe(MSR_F15H_CU_PWR_ACCUMULATOR, &data->cu_acc_power[cu]);
> +}
> +
> +/*
> + * This function is only able to be called when CPUID
> + * Fn8000_0007:EDX[12] is set.
> + */
> +static int read_registers(struct fam15h_power_data *data)
> +{
> + int this_cpu, ret, cpu;
> + int core, this_core;
> + cpumask_var_t mask;
> +
> + ret = zalloc_cpumask_var(&mask, GFP_KERNEL);
> + if (!ret)
> + return -ENOMEM;
> +
> + get_online_cpus();
> + this_cpu = smp_processor_id();
> +
> + /*
> + * Choose the first online core of each compute unit, and then
> + * read their MSR value of power and ptsc in a single IPI,
> + * because the MSR value of CPU core represent the compute
> + * unit's.
> + */
> + core = -1;
> +
> + for_each_online_cpu(cpu) {
> + this_core = topology_core_id(cpu);
> +
> + if (this_core == core)
> + continue;
> +
> + core = this_core;
> +
Sorry if I missed some context - is it guaranteed that all cores in the same
compute unit are returned next to each other from for_each_online_cpu() ?
Thanks,
Guenter
> + /* get any CPU on this compute unit */
> + cpumask_set_cpu(cpumask_any(topology_sibling_cpumask(cpu)), mask);
> + }
> +
> + if (cpumask_test_cpu(this_cpu, mask))
> + do_read_registers_on_cu(data);
> +
> + smp_call_function_many(mask, do_read_registers_on_cu, data, true);
> + put_online_cpus();
> +
> + free_cpumask_var(mask);
> +
> + return 0;
> +}
> +
> static int fam15h_power_init_attrs(struct pci_dev *pdev,
> struct fam15h_power_data *data)
> {
> @@ -263,7 +333,7 @@ static int fam15h_power_init_data(struct pci_dev *f4,
>
> data->max_cu_acc_power = tmp;
>
> - return 0;
> + return read_registers(data);
> }
>
> static int fam15h_power_probe(struct pci_dev *pdev,
> --
> 1.9.1
>
^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [PATCH v6 2/6] hwmon: (fam15h_power) Add compute unit accumulated power
2016-04-06 15:30 ` Guenter Roeck
@ 2016-04-07 5:05 ` Huang Rui
2016-04-07 5:25 ` Guenter Roeck
0 siblings, 1 reply; 11+ messages in thread
From: Huang Rui @ 2016-04-07 5:05 UTC (permalink / raw)
To: Guenter Roeck
Cc: Jean Delvare, linux-hwmon, linux-kernel, Borislav Petkov, Sherry Hurwitz
On Wed, Apr 06, 2016 at 08:30:25AM -0700, Guenter Roeck wrote:
> On Wed, Apr 06, 2016 at 03:44:11PM +0800, Huang Rui wrote:
> >
> > +static void do_read_registers_on_cu(void *_data)
> > +{
> > + struct fam15h_power_data *data = _data;
> > + int cpu, cu;
> > +
> > + cpu = smp_processor_id();
> > +
>
> Is this function now defined in non-SMP code ? If so, can you point me to the
> patch or branch introducing it ? It doesn't seem to be in mainline or in -next
> unless I am missing it.
>
In include/linux/smp.h
#else /* !SMP */
static inline void smp_send_stop(void) { }
/*
* These macros fold the SMP functionality into a single CPU system
*/
#define raw_smp_processor_id() 0
...
/*
* smp_processor_id(): get the current CPU ID.
*
* if DEBUG_PREEMPT is enabled then we check whether it is
* used in a preemption-safe way. (smp_processor_id() is safe
* if it's used in a preemption-off critical section, or in
* a thread that is bound to the current CPU.)
*
* NOTE: raw_smp_processor_id() is for internal use only
* (smp_processor_id() is the preferred variant), but in rare
* instances it might also be used to turn off false positives
* (i.e. smp_processor_id() use that the debugging code reports but
* which use for some reason is legal). Don't use this to hack around
* the warning message, as your code might not work under PREEMPT.
*/
#ifdef CONFIG_DEBUG_PREEMPT
extern unsigned int debug_smp_processor_id(void);
# define smp_processor_id() debug_smp_processor_id()
#else
# define smp_processor_id() raw_smp_processor_id()
#endif
Actually smp_processor_id() should returns 0 if we disable CONFIG_SMP.
> > + /*
> > + * With the new x86 topology modelling, cpu core id actually
> > + * is compute unit id.
> > + */
> > + cu = cpu_data(cpu).cpu_core_id;
> > +
> > + rdmsrl_safe(MSR_F15H_CU_PWR_ACCUMULATOR, &data->cu_acc_power[cu]);
> > +}
> > +
> > +/*
> > + * This function is only able to be called when CPUID
> > + * Fn8000_0007:EDX[12] is set.
> > + */
> > +static int read_registers(struct fam15h_power_data *data)
> > +{
> > + int this_cpu, ret, cpu;
> > + int core, this_core;
> > + cpumask_var_t mask;
> > +
> > + ret = zalloc_cpumask_var(&mask, GFP_KERNEL);
> > + if (!ret)
> > + return -ENOMEM;
> > +
> > + get_online_cpus();
> > + this_cpu = smp_processor_id();
> > +
> > + /*
> > + * Choose the first online core of each compute unit, and then
> > + * read their MSR value of power and ptsc in a single IPI,
> > + * because the MSR value of CPU core represent the compute
> > + * unit's.
> > + */
> > + core = -1;
> > +
> > + for_each_online_cpu(cpu) {
> > + this_core = topology_core_id(cpu);
> > +
> > + if (this_core == core)
> > + continue;
> > +
> > + core = this_core;
> > +
> Sorry if I missed some context - is it guaranteed that all cores in the same
> compute unit are returned next to each other from for_each_online_cpu() ?
>
Yes, there is a documentation which introduced from v4.6-rc2:
https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/commit/?id=f7be8610bca88e59dd2fd5d98fcbc5031ef0e079
- topology_core_id();
The ID of the core to which a thread belongs. It is also printed in /proc/cpuinfo
"core_id."
...
AMD nomenclature for CMT systems:
[node 0] -> [Compute Unit 0] -> [Compute Unit Core 0] -> Linux CPU 0
-> [Compute Unit Core 1] -> Linux CPU 1
-> [Compute Unit 1] -> [Compute Unit Core 0] -> Linux CPU 2
-> [Compute Unit Core 1] -> Linux CPU 3
ray@hr-ub:~/tip$ cat /proc/cpuinfo | grep "core id"
core id : 0
core id : 0
core id : 1
core id : 1
"this_core" here actually means the [Compute Unit] id which current
[Compute Unit Core] belongs to. And "cpu" here means the [Compute Unit Core].
Thanks,
Rui
^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [PATCH v6 2/6] hwmon: (fam15h_power) Add compute unit accumulated power
2016-04-07 5:05 ` Huang Rui
@ 2016-04-07 5:25 ` Guenter Roeck
0 siblings, 0 replies; 11+ messages in thread
From: Guenter Roeck @ 2016-04-07 5:25 UTC (permalink / raw)
To: Huang Rui
Cc: Jean Delvare, linux-hwmon, linux-kernel, Borislav Petkov, Sherry Hurwitz
On 04/06/2016 10:05 PM, Huang Rui wrote:
> On Wed, Apr 06, 2016 at 08:30:25AM -0700, Guenter Roeck wrote:
>> On Wed, Apr 06, 2016 at 03:44:11PM +0800, Huang Rui wrote:
>>>
>>> +static void do_read_registers_on_cu(void *_data)
>>> +{
>>> + struct fam15h_power_data *data = _data;
>>> + int cpu, cu;
>>> +
>>> + cpu = smp_processor_id();
>>> +
>>
>> Is this function now defined in non-SMP code ? If so, can you point me to the
>> patch or branch introducing it ? It doesn't seem to be in mainline or in -next
>> unless I am missing it.
>>
>
> In include/linux/smp.h
>
>
> #else /* !SMP */
>
> static inline void smp_send_stop(void) { }
>
> /*
> * These macros fold the SMP functionality into a single CPU system
> */
> #define raw_smp_processor_id() 0
>
> ...
>
> /*
> * smp_processor_id(): get the current CPU ID.
> *
> * if DEBUG_PREEMPT is enabled then we check whether it is
> * used in a preemption-safe way. (smp_processor_id() is safe
> * if it's used in a preemption-off critical section, or in
> * a thread that is bound to the current CPU.)
> *
> * NOTE: raw_smp_processor_id() is for internal use only
> * (smp_processor_id() is the preferred variant), but in rare
> * instances it might also be used to turn off false positives
> * (i.e. smp_processor_id() use that the debugging code reports but
> * which use for some reason is legal). Don't use this to hack around
> * the warning message, as your code might not work under PREEMPT.
> */
> #ifdef CONFIG_DEBUG_PREEMPT
> extern unsigned int debug_smp_processor_id(void);
> # define smp_processor_id() debug_smp_processor_id()
> #else
> # define smp_processor_id() raw_smp_processor_id()
> #endif
>
>
> Actually smp_processor_id() should returns 0 if we disable CONFIG_SMP.
>
>>> + /*
>>> + * With the new x86 topology modelling, cpu core id actually
>>> + * is compute unit id.
>>> + */
>>> + cu = cpu_data(cpu).cpu_core_id;
>>> +
>>> + rdmsrl_safe(MSR_F15H_CU_PWR_ACCUMULATOR, &data->cu_acc_power[cu]);
>>> +}
>>> +
>>> +/*
>>> + * This function is only able to be called when CPUID
>>> + * Fn8000_0007:EDX[12] is set.
>>> + */
>>> +static int read_registers(struct fam15h_power_data *data)
>>> +{
>>> + int this_cpu, ret, cpu;
>>> + int core, this_core;
>>> + cpumask_var_t mask;
>>> +
>>> + ret = zalloc_cpumask_var(&mask, GFP_KERNEL);
>>> + if (!ret)
>>> + return -ENOMEM;
>>> +
>>> + get_online_cpus();
>>> + this_cpu = smp_processor_id();
>>> +
>>> + /*
>>> + * Choose the first online core of each compute unit, and then
>>> + * read their MSR value of power and ptsc in a single IPI,
>>> + * because the MSR value of CPU core represent the compute
>>> + * unit's.
>>> + */
>>> + core = -1;
>>> +
>>> + for_each_online_cpu(cpu) {
>>> + this_core = topology_core_id(cpu);
>>> +
>>> + if (this_core == core)
>>> + continue;
>>> +
>>> + core = this_core;
>>> +
>> Sorry if I missed some context - is it guaranteed that all cores in the same
>> compute unit are returned next to each other from for_each_online_cpu() ?
>>
>
> Yes, there is a documentation which introduced from v4.6-rc2:
> https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/commit/?id=f7be8610bca88e59dd2fd5d98fcbc5031ef0e079
>
> - topology_core_id();
>
> The ID of the core to which a thread belongs. It is also printed in /proc/cpuinfo
> "core_id."
>
> ...
>
> AMD nomenclature for CMT systems:
>
> [node 0] -> [Compute Unit 0] -> [Compute Unit Core 0] -> Linux CPU 0
> -> [Compute Unit Core 1] -> Linux CPU 1
> -> [Compute Unit 1] -> [Compute Unit Core 0] -> Linux CPU 2
> -> [Compute Unit Core 1] -> Linux CPU 3
>
> ray@hr-ub:~/tip$ cat /proc/cpuinfo | grep "core id"
> core id : 0
> core id : 0
> core id : 1
> core id : 1
>
> "this_core" here actually means the [Compute Unit] id which current
> [Compute Unit Core] belongs to. And "cpu" here means the [Compute Unit Core].
>
Ok, thanks.
Guenter
^ permalink raw reply [flat|nested] 11+ messages in thread
* [PATCH v6 3/6] hwmon: (fam15h_power) Add ptsc counter value for accumulated power
2016-04-06 7:44 [PATCH v6 0/6] hwmon: (fam15h_power) Introduce an accumulated power reporting algorithm Huang Rui
2016-04-06 7:44 ` [PATCH v6 1/6] hwmon: (fam15h_power) Add CPU_SUP_AMD as the dependence Huang Rui
2016-04-06 7:44 ` [PATCH v6 2/6] hwmon: (fam15h_power) Add compute unit accumulated power Huang Rui
@ 2016-04-06 7:44 ` Huang Rui
2016-04-06 7:44 ` [PATCH v6 4/6] hwmon: (fam15h_power) Introduce a cpu accumulated power reporting algorithm Huang Rui
` (2 subsequent siblings)
5 siblings, 0 replies; 11+ messages in thread
From: Huang Rui @ 2016-04-06 7:44 UTC (permalink / raw)
To: Guenter Roeck, Jean Delvare
Cc: linux-hwmon, linux-kernel, Borislav Petkov, Sherry Hurwitz, Huang Rui
PTSC is the performance timestamp counter value in a cpu core and the
cores in one compute unit have the fixed frequency. So it picks up the
performance timestamp counter value of the first core per compute unit
to measure the interval for average power per compute unit.
Signed-off-by: Huang Rui <ray.huang@amd.com>
Cc: Borislav Petkov <bp@alien8.de>
---
drivers/hwmon/fam15h_power.c | 4 ++++
1 file changed, 4 insertions(+)
diff --git a/drivers/hwmon/fam15h_power.c b/drivers/hwmon/fam15h_power.c
index 4edbaf0..336d422 100644
--- a/drivers/hwmon/fam15h_power.c
+++ b/drivers/hwmon/fam15h_power.c
@@ -50,6 +50,7 @@ MODULE_LICENSE("GPL");
#define MSR_F15H_CU_PWR_ACCUMULATOR 0xc001007a
#define MSR_F15H_CU_MAX_PWR_ACCUMULATOR 0xc001007b
+#define MSR_F15H_PTSC 0xc0010280
#define PCI_DEVICE_ID_AMD_15H_M70H_NB_F4 0x15b4
@@ -65,6 +66,8 @@ struct fam15h_power_data {
u64 max_cu_acc_power;
/* accumulated power of the compute units */
u64 cu_acc_power[MAX_CUS];
+ /* performance timestamp counter */
+ u64 cpu_sw_pwr_ptsc[MAX_CUS];
};
static ssize_t show_power(struct device *dev,
@@ -145,6 +148,7 @@ static void do_read_registers_on_cu(void *_data)
cu = cpu_data(cpu).cpu_core_id;
rdmsrl_safe(MSR_F15H_CU_PWR_ACCUMULATOR, &data->cu_acc_power[cu]);
+ rdmsrl_safe(MSR_F15H_PTSC, &data->cpu_sw_pwr_ptsc[cu]);
}
/*
--
1.9.1
^ permalink raw reply related [flat|nested] 11+ messages in thread
* [PATCH v6 4/6] hwmon: (fam15h_power) Introduce a cpu accumulated power reporting algorithm
2016-04-06 7:44 [PATCH v6 0/6] hwmon: (fam15h_power) Introduce an accumulated power reporting algorithm Huang Rui
` (2 preceding siblings ...)
2016-04-06 7:44 ` [PATCH v6 3/6] hwmon: (fam15h_power) Add ptsc counter value for " Huang Rui
@ 2016-04-06 7:44 ` Huang Rui
2016-04-06 7:44 ` [PATCH v6 5/6] hwmon: (fam15h_power) Add documentation for TDP and accumulated power algorithm Huang Rui
2016-04-06 7:44 ` [PATCH v6 6/6] hwmon: (fam15h_power) Add platform check function Huang Rui
5 siblings, 0 replies; 11+ messages in thread
From: Huang Rui @ 2016-04-06 7:44 UTC (permalink / raw)
To: Guenter Roeck, Jean Delvare
Cc: linux-hwmon, linux-kernel, Borislav Petkov, Sherry Hurwitz, Huang Rui
This patch introduces an algorithm that computes the average power by
reading a delta value of “core power accumulator” register during
measurement interval, and then dividing delta value by the length of
the time interval.
User is able to use power1_average entry to measure the processor power
consumption and power1_average_interval entry to set the interval.
A simple example:
ray@hr-ub:~/tip$ sensors
fam15h_power-pci-00c4
Adapter: PCI adapter
power1: 19.58 mW (avg = 2.55 mW, interval = 0.01 s)
(crit = 15.00 W)
...
The result is current average processor power consumption in 10
millisecond. The unit of the result is uWatt.
Suggested-by: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: Huang Rui <ray.huang@amd.com>
Cc: Borislav Petkov <bp@alien8.de>
---
drivers/hwmon/fam15h_power.c | 128 +++++++++++++++++++++++++++++++++++++++++--
1 file changed, 124 insertions(+), 4 deletions(-)
diff --git a/drivers/hwmon/fam15h_power.c b/drivers/hwmon/fam15h_power.c
index 336d422..5abbfa8 100644
--- a/drivers/hwmon/fam15h_power.c
+++ b/drivers/hwmon/fam15h_power.c
@@ -27,6 +27,8 @@
#include <linux/bitops.h>
#include <linux/cpu.h>
#include <linux/cpumask.h>
+#include <linux/time.h>
+#include <linux/sched.h>
#include <asm/processor.h>
#include <asm/msr.h>
@@ -48,6 +50,9 @@ MODULE_LICENSE("GPL");
#define FAM15H_NUM_GROUPS 2
#define MAX_CUS 8
+/* set maximum interval as 1 second */
+#define MAX_INTERVAL 1000
+
#define MSR_F15H_CU_PWR_ACCUMULATOR 0xc001007a
#define MSR_F15H_CU_MAX_PWR_ACCUMULATOR 0xc001007b
#define MSR_F15H_PTSC 0xc0010280
@@ -68,6 +73,9 @@ struct fam15h_power_data {
u64 cu_acc_power[MAX_CUS];
/* performance timestamp counter */
u64 cpu_sw_pwr_ptsc[MAX_CUS];
+ /* online/offline status of current compute unit */
+ int cu_on[MAX_CUS];
+ unsigned long power_period;
};
static ssize_t show_power(struct device *dev,
@@ -149,6 +157,8 @@ static void do_read_registers_on_cu(void *_data)
rdmsrl_safe(MSR_F15H_CU_PWR_ACCUMULATOR, &data->cu_acc_power[cu]);
rdmsrl_safe(MSR_F15H_PTSC, &data->cpu_sw_pwr_ptsc[cu]);
+
+ data->cu_on[cu] = 1;
}
/*
@@ -165,6 +175,8 @@ static int read_registers(struct fam15h_power_data *data)
if (!ret)
return -ENOMEM;
+ memset(data->cu_on, 0, sizeof(int) * MAX_CUS);
+
get_online_cpus();
this_cpu = smp_processor_id();
@@ -199,6 +211,98 @@ static int read_registers(struct fam15h_power_data *data)
return 0;
}
+static ssize_t acc_show_power(struct device *dev,
+ struct device_attribute *attr,
+ char *buf)
+{
+ struct fam15h_power_data *data = dev_get_drvdata(dev);
+ u64 prev_cu_acc_power[MAX_CUS], prev_ptsc[MAX_CUS],
+ jdelta[MAX_CUS];
+ u64 tdelta, avg_acc;
+ int cu, cu_num, ret;
+ signed long leftover;
+
+ /*
+ * With the new x86 topology modelling, x86_max_cores is the
+ * compute unit number.
+ */
+ cu_num = boot_cpu_data.x86_max_cores;
+
+ ret = read_registers(data);
+ if (ret)
+ return 0;
+
+ for (cu = 0; cu < cu_num; cu++) {
+ prev_cu_acc_power[cu] = data->cu_acc_power[cu];
+ prev_ptsc[cu] = data->cpu_sw_pwr_ptsc[cu];
+ }
+
+ leftover = schedule_timeout_interruptible(msecs_to_jiffies(data->power_period));
+ if (leftover)
+ return 0;
+
+ ret = read_registers(data);
+ if (ret)
+ return 0;
+
+ for (cu = 0, avg_acc = 0; cu < cu_num; cu++) {
+ /* check if current compute unit is online */
+ if (data->cu_on[cu] == 0)
+ continue;
+
+ if (data->cu_acc_power[cu] < prev_cu_acc_power[cu]) {
+ jdelta[cu] = data->max_cu_acc_power + data->cu_acc_power[cu];
+ jdelta[cu] -= prev_cu_acc_power[cu];
+ } else {
+ jdelta[cu] = data->cu_acc_power[cu] - prev_cu_acc_power[cu];
+ }
+ tdelta = data->cpu_sw_pwr_ptsc[cu] - prev_ptsc[cu];
+ jdelta[cu] *= data->cpu_pwr_sample_ratio * 1000;
+ do_div(jdelta[cu], tdelta);
+
+ /* the unit is microWatt */
+ avg_acc += jdelta[cu];
+ }
+
+ return sprintf(buf, "%llu\n", (unsigned long long)avg_acc);
+}
+static DEVICE_ATTR(power1_average, S_IRUGO, acc_show_power, NULL);
+
+static ssize_t acc_show_power_period(struct device *dev,
+ struct device_attribute *attr,
+ char *buf)
+{
+ struct fam15h_power_data *data = dev_get_drvdata(dev);
+
+ return sprintf(buf, "%lu\n", data->power_period);
+}
+
+static ssize_t acc_set_power_period(struct device *dev,
+ struct device_attribute *attr,
+ const char *buf, size_t count)
+{
+ struct fam15h_power_data *data = dev_get_drvdata(dev);
+ unsigned long temp;
+ int ret;
+
+ ret = kstrtoul(buf, 10, &temp);
+ if (ret)
+ return ret;
+
+ if (temp > MAX_INTERVAL)
+ return -EINVAL;
+
+ /* the interval value should be greater than 0 */
+ if (temp <= 0)
+ return -EINVAL;
+
+ data->power_period = temp;
+
+ return count;
+}
+static DEVICE_ATTR(power1_average_interval, S_IRUGO | S_IWUSR,
+ acc_show_power_period, acc_set_power_period);
+
static int fam15h_power_init_attrs(struct pci_dev *pdev,
struct fam15h_power_data *data)
{
@@ -211,6 +315,10 @@ static int fam15h_power_init_attrs(struct pci_dev *pdev,
(c->x86_model >= 0x60 && c->x86_model <= 0x7f)))
n += 1;
+ /* check if processor supports accumulated power */
+ if (boot_cpu_has(X86_FEATURE_ACC_POWER))
+ n += 2;
+
fam15h_power_attrs = devm_kcalloc(&pdev->dev, n,
sizeof(*fam15h_power_attrs),
GFP_KERNEL);
@@ -225,6 +333,11 @@ static int fam15h_power_init_attrs(struct pci_dev *pdev,
(c->x86_model >= 0x60 && c->x86_model <= 0x7f)))
fam15h_power_attrs[n++] = &dev_attr_power1_input.attr;
+ if (boot_cpu_has(X86_FEATURE_ACC_POWER)) {
+ fam15h_power_attrs[n++] = &dev_attr_power1_average.attr;
+ fam15h_power_attrs[n++] = &dev_attr_power1_average_interval.attr;
+ }
+
data->group.attrs = fam15h_power_attrs;
return 0;
@@ -290,7 +403,7 @@ static int fam15h_power_resume(struct pci_dev *pdev)
static int fam15h_power_init_data(struct pci_dev *f4,
struct fam15h_power_data *data)
{
- u32 val, eax, ebx, ecx, edx;
+ u32 val;
u64 tmp;
int ret;
@@ -317,10 +430,9 @@ static int fam15h_power_init_data(struct pci_dev *f4,
if (ret)
return ret;
- cpuid(0x80000007, &eax, &ebx, &ecx, &edx);
/* CPUID Fn8000_0007:EDX[12] indicates to support accumulated power */
- if (!(edx & BIT(12)))
+ if (!boot_cpu_has(X86_FEATURE_ACC_POWER))
return 0;
/*
@@ -328,7 +440,7 @@ static int fam15h_power_init_data(struct pci_dev *f4,
* sample period to the PTSC counter period by executing CPUID
* Fn8000_0007:ECX
*/
- data->cpu_pwr_sample_ratio = ecx;
+ data->cpu_pwr_sample_ratio = cpuid_ecx(0x80000007);
if (rdmsrl_safe(MSR_F15H_CU_MAX_PWR_ACCUMULATOR, &tmp)) {
pr_err("Failed to read max compute unit power accumulator MSR\n");
@@ -337,6 +449,14 @@ static int fam15h_power_init_data(struct pci_dev *f4,
data->max_cu_acc_power = tmp;
+ /*
+ * Milliseconds are a reasonable interval for the measurement.
+ * But it shouldn't set too long here, because several seconds
+ * would cause the read function to hang. So set default
+ * interval as 10 ms.
+ */
+ data->power_period = 10;
+
return read_registers(data);
}
--
1.9.1
^ permalink raw reply related [flat|nested] 11+ messages in thread
* [PATCH v6 5/6] hwmon: (fam15h_power) Add documentation for TDP and accumulated power algorithm
2016-04-06 7:44 [PATCH v6 0/6] hwmon: (fam15h_power) Introduce an accumulated power reporting algorithm Huang Rui
` (3 preceding siblings ...)
2016-04-06 7:44 ` [PATCH v6 4/6] hwmon: (fam15h_power) Introduce a cpu accumulated power reporting algorithm Huang Rui
@ 2016-04-06 7:44 ` Huang Rui
2016-04-06 7:44 ` [PATCH v6 6/6] hwmon: (fam15h_power) Add platform check function Huang Rui
5 siblings, 0 replies; 11+ messages in thread
From: Huang Rui @ 2016-04-06 7:44 UTC (permalink / raw)
To: Guenter Roeck, Jean Delvare
Cc: linux-hwmon, linux-kernel, Borislav Petkov, Sherry Hurwitz, Huang Rui
This patch adds the description to explain the TDP reporting mechanism
and accumulated power algorithm.
Signed-off-by: Huang Rui <ray.huang@amd.com>
Cc: Borislav Petkov <bp@alien8.de>
---
Documentation/hwmon/fam15h_power | 65 +++++++++++++++++++++++++++++++++++++++-
drivers/hwmon/fam15h_power.c | 2 +-
2 files changed, 65 insertions(+), 2 deletions(-)
diff --git a/Documentation/hwmon/fam15h_power b/Documentation/hwmon/fam15h_power
index e2b1b69..fb594c2 100644
--- a/Documentation/hwmon/fam15h_power
+++ b/Documentation/hwmon/fam15h_power
@@ -10,14 +10,22 @@ Supported chips:
Datasheets:
BIOS and Kernel Developer's Guide (BKDG) For AMD Family 15h Processors
BIOS and Kernel Developer's Guide (BKDG) For AMD Family 16h Processors
+ AMD64 Architecture Programmer's Manual Volume 2: System Programming
Author: Andreas Herrmann <herrmann.der.user@googlemail.com>
Description
-----------
+1) Processor TDP (Thermal design power)
+
+Given a fixed frequency and voltage, the power consumption of a
+processor varies based on the workload being executed. Derated power
+is the power consumed when running a specific application. Thermal
+design power (TDP) is an example of derated power.
+
This driver permits reading of registers providing power information
-of AMD Family 15h and 16h processors.
+of AMD Family 15h and 16h processors via TDP algorithm.
For AMD Family 15h and 16h processors the following power values can
be calculated using different processor northbridge function
@@ -37,3 +45,58 @@ This driver provides ProcessorPwrWatts and CurrPwrWatts:
On multi-node processors the calculated value is for the entire
package and not for a single node. Thus the driver creates sysfs
attributes only for internal node0 of a multi-node processor.
+
+2) Accumulated Power Mechanism
+
+This driver also introduces an algorithm that should be used to
+calculate the average power consumed by a processor during a
+measurement interval Tm. The feature of accumulated power mechanism is
+indicated by CPUID Fn8000_0007_EDX[12].
+
+* Tsample: compute unit power accumulator sample period
+* Tref: the PTSC counter period
+* PTSC: performance timestamp counter
+* N: the ratio of compute unit power accumulator sample period to the
+ PTSC period
+* Jmax: max compute unit accumulated power which is indicated by
+ MaxCpuSwPwrAcc MSR C001007b
+* Jx/Jy: compute unit accumulated power which is indicated by
+ CpuSwPwrAcc MSR C001007a
+* Tx/Ty: the value of performance timestamp counter which is indicated
+ by CU_PTSC MSR C0010280
+* PwrCPUave: CPU average power
+
+i. Determine the ratio of Tsample to Tref by executing CPUID Fn8000_0007.
+ N = value of CPUID Fn8000_0007_ECX[CpuPwrSampleTimeRatio[15:0]].
+
+ii. Read the full range of the cumulative energy value from the new
+MSR MaxCpuSwPwrAcc.
+ Jmax = value returned.
+iii. At time x, SW reads CpuSwPwrAcc MSR and samples the PTSC.
+ Jx = value read from CpuSwPwrAcc and Tx = value read from
+PTSC.
+
+iv. At time y, SW reads CpuSwPwrAcc MSR and samples the PTSC.
+ Jy = value read from CpuSwPwrAcc and Ty = value read from
+PTSC.
+
+v. Calculate the average power consumption for a compute unit over
+time period (y-x). Unit of result is uWatt.
+ if (Jy < Jx) // Rollover has occurred
+ Jdelta = (Jy + Jmax) - Jx
+ else
+ Jdelta = Jy - Jx
+ PwrCPUave = N * Jdelta * 1000 / (Ty - Tx)
+
+This driver provides PwrCPUave and interval(default is 10 millisecond
+and maximum is 1 second):
+* power1_average (PwrCPUave)
+* power1_average_interval (Interval)
+
+The power1_average_interval can be updated at /etc/sensors3.conf file
+as below:
+
+chip "fam15h_power-*"
+ set power1_average_interval 0.01
+
+Then save it with "sensors -s".
diff --git a/drivers/hwmon/fam15h_power.c b/drivers/hwmon/fam15h_power.c
index 5abbfa8..7d9d697 100644
--- a/drivers/hwmon/fam15h_power.c
+++ b/drivers/hwmon/fam15h_power.c
@@ -1,7 +1,7 @@
/*
* fam15h_power.c - AMD Family 15h processor power monitoring
*
- * Copyright (c) 2011 Advanced Micro Devices, Inc.
+ * Copyright (c) 2011-2016 Advanced Micro Devices, Inc.
* Author: Andreas Herrmann <herrmann.der.user@googlemail.com>
*
*
--
1.9.1
^ permalink raw reply related [flat|nested] 11+ messages in thread
* [PATCH v6 6/6] hwmon: (fam15h_power) Add platform check function
2016-04-06 7:44 [PATCH v6 0/6] hwmon: (fam15h_power) Introduce an accumulated power reporting algorithm Huang Rui
` (4 preceding siblings ...)
2016-04-06 7:44 ` [PATCH v6 5/6] hwmon: (fam15h_power) Add documentation for TDP and accumulated power algorithm Huang Rui
@ 2016-04-06 7:44 ` Huang Rui
5 siblings, 0 replies; 11+ messages in thread
From: Huang Rui @ 2016-04-06 7:44 UTC (permalink / raw)
To: Guenter Roeck, Jean Delvare
Cc: linux-hwmon, linux-kernel, Borislav Petkov, Sherry Hurwitz, Huang Rui
This patch adds a platform check function to make code more readable.
Signed-off-by: Huang Rui <ray.huang@amd.com>
---
drivers/hwmon/fam15h_power.c | 9 +++++++--
1 file changed, 7 insertions(+), 2 deletions(-)
diff --git a/drivers/hwmon/fam15h_power.c b/drivers/hwmon/fam15h_power.c
index 7d9d697..eb97a92 100644
--- a/drivers/hwmon/fam15h_power.c
+++ b/drivers/hwmon/fam15h_power.c
@@ -78,6 +78,11 @@ struct fam15h_power_data {
unsigned long power_period;
};
+static bool is_carrizo_or_later(void)
+{
+ return boot_cpu_data.x86 == 0x15 && boot_cpu_data.x86_model >= 0x60;
+}
+
static ssize_t show_power(struct device *dev,
struct device_attribute *attr, char *buf)
{
@@ -94,7 +99,7 @@ static ssize_t show_power(struct device *dev,
* On Carrizo and later platforms, TdpRunAvgAccCap bit field
* is extended to 4:31 from 4:25.
*/
- if (boot_cpu_data.x86 == 0x15 && boot_cpu_data.x86_model >= 0x60) {
+ if (is_carrizo_or_later()) {
running_avg_capture = val >> 4;
running_avg_capture = sign_extend32(running_avg_capture, 27);
} else {
@@ -111,7 +116,7 @@ static ssize_t show_power(struct device *dev,
* On Carrizo and later platforms, ApmTdpLimit bit field
* is extended to 16:31 from 16:28.
*/
- if (boot_cpu_data.x86 == 0x15 && boot_cpu_data.x86_model >= 0x60)
+ if (is_carrizo_or_later())
tdp_limit = val >> 16;
else
tdp_limit = (val >> 16) & 0x1fff;
--
1.9.1
^ permalink raw reply related [flat|nested] 11+ messages in thread