All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH] hwmon: coretemp: fix oops on cpu unplug
@ 2012-04-30 13:18 ` Kirill A. Shutemov
  0 siblings, 0 replies; 52+ messages in thread
From: Kirill A. Shutemov @ 2012-04-30 13:18 UTC (permalink / raw)
  To: Fenghua Yu
  Cc: Andi Kleen, Jean Delvare, Guenter Roeck, lm-sensors,
	linux-kernel, Kirill A. Shutemov

From: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>

coretemp tries to access core_data array beyond bounds on cpu unplug if
core id of the cpu if more than NUM_REAL_CORES-1.

BUG: unable to handle kernel NULL pointer dereference at 000000000000013c
IP: [<ffffffffa00159af>] coretemp_cpu_callback+0x93/0x1ba [coretemp]
PGD 673e5a067 PUD 66e9b3067 PMD 0
Oops: 0000 [#1] SMP
CPU 79
Modules linked in: sunrpc cpufreq_ondemand acpi_cpufreq freq_table mperf bnep bluetooth rfkill ip6t_REJECT nf_conntrack_ipv6 nf_defrag_ipv6 ip6table_filter nf_conntrack_ipv4 nf_defrag_ipv4 ip6_tables xt_state nf_conntrack coretemp crc32c_intel asix tpm_tis pcspkr usbnet iTCO_wdt i2c_i801 microcode mii joydev tpm i2c_core iTCO_vendor_support tpm_bios i7core_edac igb ioatdma edac_core dca megaraid_sas [last unloaded: oprofile]

Pid: 3315, comm: set-cpus Tainted: G        W    3.4.0-rc5+ #2 QCI QSSC-S4R/QSSC-S4R
RIP: 0010:[<ffffffffa00159af>]  [<ffffffffa00159af>] coretemp_cpu_callback+0x93/0x1ba [coretemp]
RSP: 0018:ffff880472fb3d48  EFLAGS: 00010246
RAX: 0000000000000124 RBX: 0000000000000034 RCX: 00000000ffffffff
RDX: 0000000000000000 RSI: 0000000000000046 RDI: 0000000000000246
RBP: ffff880472fb3d88 R08: ffff88077fcd36c0 R09: 0000000000000001
R10: ffffffff8184bc48 R11: 0000000000000000 R12: ffff880273095800
R13: 0000000000000013 R14: ffff8802730a1810 R15: 0000000000000000
FS:  00007f694a20f720(0000) GS:ffff88077fcc0000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
CR2: 000000000000013c CR3: 000000067209b000 CR4: 00000000000007e0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
Process set-cpus (pid: 3315, threadinfo ffff880472fb2000, task ffff880471fa0000)
Stack:
 ffff880277b4c308 0000000000000003 ffff880472fb3d88 0000000000000005
 0000000000000034 00000000ffffffd1 ffffffff81cadc70 ffff880472fb3e14
 ffff880472fb3dc8 ffffffff8161f48d ffff880471fa0000 0000000000000034
Call Trace:
 [<ffffffff8161f48d>] notifier_call_chain+0x4d/0x70
 [<ffffffff8107f1be>] __raw_notifier_call_chain+0xe/0x10
 [<ffffffff81059d30>] __cpu_notify+0x20/0x40
 [<ffffffff815fa251>] _cpu_down+0x81/0x270
 [<ffffffff815fa477>] cpu_down+0x37/0x50
 [<ffffffff815fd6a3>] store_online+0x63/0xc0
 [<ffffffff813c7078>] dev_attr_store+0x18/0x30
 [<ffffffff811f02cf>] sysfs_write_file+0xef/0x170
 [<ffffffff81180443>] vfs_write+0xb3/0x180
 [<ffffffff8118076a>] sys_write+0x4a/0x90
 [<ffffffff816236a9>] system_call_fastpath+0x16/0x1b
Code: 48 c7 c7 94 60 01 a0 44 0f b7 ac 10 ac 00 00 00 31 c0 e8 41 b7 5f e1 41 83 c5 02 49 63 c5 49 8b 44 c4 10 48 85 c0 74 56 45 31 ff <39> 58 18 75 4e eb 1f 49 63 d7 4c 89 f7 48 89 45 c8 48 6b d2 28
RIP  [<ffffffffa00159af>] coretemp_cpu_callback+0x93/0x1ba [coretemp]
 RSP <ffff880472fb3d48>
CR2: 000000000000013c

Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
---
 drivers/hwmon/coretemp.c |    4 ++++
 1 files changed, 4 insertions(+), 0 deletions(-)

diff --git a/drivers/hwmon/coretemp.c b/drivers/hwmon/coretemp.c
index 0d3141f..54a70fe 100644
--- a/drivers/hwmon/coretemp.c
+++ b/drivers/hwmon/coretemp.c
@@ -709,6 +709,10 @@ static void __cpuinit put_core_offline(unsigned int cpu)
 
 	indx = TO_ATTR_NO(cpu);
 
+	/* The core id is too big, just return */
+	if (indx > MAX_CORE_DATA - 1)
+		return;
+
 	if (pdata->core_data[indx] && pdata->core_data[indx]->cpu == cpu)
 		coretemp_remove_core(pdata, &pdev->dev, indx);
 
-- 
1.7.9.1


^ permalink raw reply related	[flat|nested] 52+ messages in thread

* [lm-sensors] [PATCH] hwmon: coretemp: fix oops on cpu unplug
@ 2012-04-30 13:18 ` Kirill A. Shutemov
  0 siblings, 0 replies; 52+ messages in thread
From: Kirill A. Shutemov @ 2012-04-30 13:18 UTC (permalink / raw)
  To: Fenghua Yu
  Cc: Andi Kleen, Jean Delvare, Guenter Roeck, lm-sensors,
	linux-kernel, Kirill A. Shutemov

From: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>

coretemp tries to access core_data array beyond bounds on cpu unplug if
core id of the cpu if more than NUM_REAL_CORES-1.

BUG: unable to handle kernel NULL pointer dereference at 000000000000013c
IP: [<ffffffffa00159af>] coretemp_cpu_callback+0x93/0x1ba [coretemp]
PGD 673e5a067 PUD 66e9b3067 PMD 0
Oops: 0000 [#1] SMP
CPU 79
Modules linked in: sunrpc cpufreq_ondemand acpi_cpufreq freq_table mperf bnep bluetooth rfkill ip6t_REJECT nf_conntrack_ipv6 nf_defrag_ipv6 ip6table_filter nf_conntrack_ipv4 nf_defrag_ipv4 ip6_tables xt_state nf_conntrack coretemp crc32c_intel asix tpm_tis pcspkr usbnet iTCO_wdt i2c_i801 microcode mii joydev tpm i2c_core iTCO_vendor_support tpm_bios i7core_edac igb ioatdma edac_core dca megaraid_sas [last unloaded: oprofile]

Pid: 3315, comm: set-cpus Tainted: G        W    3.4.0-rc5+ #2 QCI QSSC-S4R/QSSC-S4R
RIP: 0010:[<ffffffffa00159af>]  [<ffffffffa00159af>] coretemp_cpu_callback+0x93/0x1ba [coretemp]
RSP: 0018:ffff880472fb3d48  EFLAGS: 00010246
RAX: 0000000000000124 RBX: 0000000000000034 RCX: 00000000ffffffff
RDX: 0000000000000000 RSI: 0000000000000046 RDI: 0000000000000246
RBP: ffff880472fb3d88 R08: ffff88077fcd36c0 R09: 0000000000000001
R10: ffffffff8184bc48 R11: 0000000000000000 R12: ffff880273095800
R13: 0000000000000013 R14: ffff8802730a1810 R15: 0000000000000000
FS:  00007f694a20f720(0000) GS:ffff88077fcc0000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
CR2: 000000000000013c CR3: 000000067209b000 CR4: 00000000000007e0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
Process set-cpus (pid: 3315, threadinfo ffff880472fb2000, task ffff880471fa0000)
Stack:
 ffff880277b4c308 0000000000000003 ffff880472fb3d88 0000000000000005
 0000000000000034 00000000ffffffd1 ffffffff81cadc70 ffff880472fb3e14
 ffff880472fb3dc8 ffffffff8161f48d ffff880471fa0000 0000000000000034
Call Trace:
 [<ffffffff8161f48d>] notifier_call_chain+0x4d/0x70
 [<ffffffff8107f1be>] __raw_notifier_call_chain+0xe/0x10
 [<ffffffff81059d30>] __cpu_notify+0x20/0x40
 [<ffffffff815fa251>] _cpu_down+0x81/0x270
 [<ffffffff815fa477>] cpu_down+0x37/0x50
 [<ffffffff815fd6a3>] store_online+0x63/0xc0
 [<ffffffff813c7078>] dev_attr_store+0x18/0x30
 [<ffffffff811f02cf>] sysfs_write_file+0xef/0x170
 [<ffffffff81180443>] vfs_write+0xb3/0x180
 [<ffffffff8118076a>] sys_write+0x4a/0x90
 [<ffffffff816236a9>] system_call_fastpath+0x16/0x1b
Code: 48 c7 c7 94 60 01 a0 44 0f b7 ac 10 ac 00 00 00 31 c0 e8 41 b7 5f e1 41 83 c5 02 49 63 c5 49 8b 44 c4 10 48 85 c0 74 56 45 31 ff <39> 58 18 75 4e eb 1f 49 63 d7 4c 89 f7 48 89 45 c8 48 6b d2 28
RIP  [<ffffffffa00159af>] coretemp_cpu_callback+0x93/0x1ba [coretemp]
 RSP <ffff880472fb3d48>
CR2: 000000000000013c

Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
---
 drivers/hwmon/coretemp.c |    4 ++++
 1 files changed, 4 insertions(+), 0 deletions(-)

diff --git a/drivers/hwmon/coretemp.c b/drivers/hwmon/coretemp.c
index 0d3141f..54a70fe 100644
--- a/drivers/hwmon/coretemp.c
+++ b/drivers/hwmon/coretemp.c
@@ -709,6 +709,10 @@ static void __cpuinit put_core_offline(unsigned int cpu)
 
 	indx = TO_ATTR_NO(cpu);
 
+	/* The core id is too big, just return */
+	if (indx > MAX_CORE_DATA - 1)
+		return;
+
 	if (pdata->core_data[indx] && pdata->core_data[indx]->cpu = cpu)
 		coretemp_remove_core(pdata, &pdev->dev, indx);
 
-- 
1.7.9.1


_______________________________________________
lm-sensors mailing list
lm-sensors@lm-sensors.org
http://lists.lm-sensors.org/mailman/listinfo/lm-sensors

^ permalink raw reply related	[flat|nested] 52+ messages in thread

* Re: [PATCH] hwmon: coretemp: fix oops on cpu unplug
  2012-04-30 13:18 ` [lm-sensors] " Kirill A. Shutemov
@ 2012-04-30 15:28   ` Guenter Roeck
  -1 siblings, 0 replies; 52+ messages in thread
From: Guenter Roeck @ 2012-04-30 15:28 UTC (permalink / raw)
  To: Kirill A. Shutemov
  Cc: Fenghua Yu, Andi Kleen, Jean Delvare, lm-sensors, linux-kernel

On Mon, 2012-04-30 at 09:18 -0400, Kirill A. Shutemov wrote:
> From: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>
> 
> coretemp tries to access core_data array beyond bounds on cpu unplug if
> core id of the cpu if more than NUM_REAL_CORES-1.
> 
> BUG: unable to handle kernel NULL pointer dereference at 000000000000013c
> IP: [<ffffffffa00159af>] coretemp_cpu_callback+0x93/0x1ba [coretemp]
> PGD 673e5a067 PUD 66e9b3067 PMD 0
> Oops: 0000 [#1] SMP
> CPU 79
> Modules linked in: sunrpc cpufreq_ondemand acpi_cpufreq freq_table mperf bnep bluetooth rfkill ip6t_REJECT nf_conntrack_ipv6 nf_defrag_ipv6 ip6table_filter nf_conntrack_ipv4 nf_defrag_ipv4 ip6_tables xt_state nf_conntrack coretemp crc32c_intel asix tpm_tis pcspkr usbnet iTCO_wdt i2c_i801 microcode mii joydev tpm i2c_core iTCO_vendor_support tpm_bios i7core_edac igb ioatdma edac_core dca megaraid_sas [last unloaded: oprofile]
> 
> Pid: 3315, comm: set-cpus Tainted: G        W    3.4.0-rc5+ #2 QCI QSSC-S4R/QSSC-S4R
> RIP: 0010:[<ffffffffa00159af>]  [<ffffffffa00159af>] coretemp_cpu_callback+0x93/0x1ba [coretemp]
> RSP: 0018:ffff880472fb3d48  EFLAGS: 00010246
> RAX: 0000000000000124 RBX: 0000000000000034 RCX: 00000000ffffffff
> RDX: 0000000000000000 RSI: 0000000000000046 RDI: 0000000000000246
> RBP: ffff880472fb3d88 R08: ffff88077fcd36c0 R09: 0000000000000001
> R10: ffffffff8184bc48 R11: 0000000000000000 R12: ffff880273095800
> R13: 0000000000000013 R14: ffff8802730a1810 R15: 0000000000000000
> FS:  00007f694a20f720(0000) GS:ffff88077fcc0000(0000) knlGS:0000000000000000
> CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
> CR2: 000000000000013c CR3: 000000067209b000 CR4: 00000000000007e0
> DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
> DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
> Process set-cpus (pid: 3315, threadinfo ffff880472fb2000, task ffff880471fa0000)
> Stack:
>  ffff880277b4c308 0000000000000003 ffff880472fb3d88 0000000000000005
>  0000000000000034 00000000ffffffd1 ffffffff81cadc70 ffff880472fb3e14
>  ffff880472fb3dc8 ffffffff8161f48d ffff880471fa0000 0000000000000034
> Call Trace:
>  [<ffffffff8161f48d>] notifier_call_chain+0x4d/0x70
>  [<ffffffff8107f1be>] __raw_notifier_call_chain+0xe/0x10
>  [<ffffffff81059d30>] __cpu_notify+0x20/0x40
>  [<ffffffff815fa251>] _cpu_down+0x81/0x270
>  [<ffffffff815fa477>] cpu_down+0x37/0x50
>  [<ffffffff815fd6a3>] store_online+0x63/0xc0
>  [<ffffffff813c7078>] dev_attr_store+0x18/0x30
>  [<ffffffff811f02cf>] sysfs_write_file+0xef/0x170
>  [<ffffffff81180443>] vfs_write+0xb3/0x180
>  [<ffffffff8118076a>] sys_write+0x4a/0x90
>  [<ffffffff816236a9>] system_call_fastpath+0x16/0x1b
> Code: 48 c7 c7 94 60 01 a0 44 0f b7 ac 10 ac 00 00 00 31 c0 e8 41 b7 5f e1 41 83 c5 02 49 63 c5 49 8b 44 c4 10 48 85 c0 74 56 45 31 ff <39> 58 18 75 4e eb 1f 49 63 d7 4c 89 f7 48 89 45 c8 48 6b d2 28
> RIP  [<ffffffffa00159af>] coretemp_cpu_callback+0x93/0x1ba [coretemp]
>  RSP <ffff880472fb3d48>
> CR2: 000000000000013c
> 
> Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
> ---
>  drivers/hwmon/coretemp.c |    4 ++++
>  1 files changed, 4 insertions(+), 0 deletions(-)
> 
> diff --git a/drivers/hwmon/coretemp.c b/drivers/hwmon/coretemp.c
> index 0d3141f..54a70fe 100644
> --- a/drivers/hwmon/coretemp.c
> +++ b/drivers/hwmon/coretemp.c
> @@ -709,6 +709,10 @@ static void __cpuinit put_core_offline(unsigned int cpu)
>  
>  	indx = TO_ATTR_NO(cpu);
>  
> +	/* The core id is too big, just return */
> +	if (indx > MAX_CORE_DATA - 1)
> +		return;
> +
>  	if (pdata->core_data[indx] && pdata->core_data[indx]->cpu == cpu)
>  		coretemp_remove_core(pdata, &pdev->dev, indx);
>  
Hi,

good catch. Couple of problems, though.

First, what number of cores are we talking about ? We should probably
increase NUM_REAL_CORES as well. Long term, we should get rid of the
dependency to prevent that problem from happening again, but that is a
different issue.

Second, we'll need the same code in get_core_online(). Otherwise the
platform device can be created for the new core (if the core is
re-enabled) but will never be deleted.

Thanks,
Guenter



^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: [lm-sensors] [PATCH] hwmon: coretemp: fix oops on cpu unplug
@ 2012-04-30 15:28   ` Guenter Roeck
  0 siblings, 0 replies; 52+ messages in thread
From: Guenter Roeck @ 2012-04-30 15:28 UTC (permalink / raw)
  To: Kirill A. Shutemov
  Cc: Fenghua Yu, Andi Kleen, Jean Delvare, lm-sensors, linux-kernel

On Mon, 2012-04-30 at 09:18 -0400, Kirill A. Shutemov wrote:
> From: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>
> 
> coretemp tries to access core_data array beyond bounds on cpu unplug if
> core id of the cpu if more than NUM_REAL_CORES-1.
> 
> BUG: unable to handle kernel NULL pointer dereference at 000000000000013c
> IP: [<ffffffffa00159af>] coretemp_cpu_callback+0x93/0x1ba [coretemp]
> PGD 673e5a067 PUD 66e9b3067 PMD 0
> Oops: 0000 [#1] SMP
> CPU 79
> Modules linked in: sunrpc cpufreq_ondemand acpi_cpufreq freq_table mperf bnep bluetooth rfkill ip6t_REJECT nf_conntrack_ipv6 nf_defrag_ipv6 ip6table_filter nf_conntrack_ipv4 nf_defrag_ipv4 ip6_tables xt_state nf_conntrack coretemp crc32c_intel asix tpm_tis pcspkr usbnet iTCO_wdt i2c_i801 microcode mii joydev tpm i2c_core iTCO_vendor_support tpm_bios i7core_edac igb ioatdma edac_core dca megaraid_sas [last unloaded: oprofile]
> 
> Pid: 3315, comm: set-cpus Tainted: G        W    3.4.0-rc5+ #2 QCI QSSC-S4R/QSSC-S4R
> RIP: 0010:[<ffffffffa00159af>]  [<ffffffffa00159af>] coretemp_cpu_callback+0x93/0x1ba [coretemp]
> RSP: 0018:ffff880472fb3d48  EFLAGS: 00010246
> RAX: 0000000000000124 RBX: 0000000000000034 RCX: 00000000ffffffff
> RDX: 0000000000000000 RSI: 0000000000000046 RDI: 0000000000000246
> RBP: ffff880472fb3d88 R08: ffff88077fcd36c0 R09: 0000000000000001
> R10: ffffffff8184bc48 R11: 0000000000000000 R12: ffff880273095800
> R13: 0000000000000013 R14: ffff8802730a1810 R15: 0000000000000000
> FS:  00007f694a20f720(0000) GS:ffff88077fcc0000(0000) knlGS:0000000000000000
> CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
> CR2: 000000000000013c CR3: 000000067209b000 CR4: 00000000000007e0
> DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
> DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
> Process set-cpus (pid: 3315, threadinfo ffff880472fb2000, task ffff880471fa0000)
> Stack:
>  ffff880277b4c308 0000000000000003 ffff880472fb3d88 0000000000000005
>  0000000000000034 00000000ffffffd1 ffffffff81cadc70 ffff880472fb3e14
>  ffff880472fb3dc8 ffffffff8161f48d ffff880471fa0000 0000000000000034
> Call Trace:
>  [<ffffffff8161f48d>] notifier_call_chain+0x4d/0x70
>  [<ffffffff8107f1be>] __raw_notifier_call_chain+0xe/0x10
>  [<ffffffff81059d30>] __cpu_notify+0x20/0x40
>  [<ffffffff815fa251>] _cpu_down+0x81/0x270
>  [<ffffffff815fa477>] cpu_down+0x37/0x50
>  [<ffffffff815fd6a3>] store_online+0x63/0xc0
>  [<ffffffff813c7078>] dev_attr_store+0x18/0x30
>  [<ffffffff811f02cf>] sysfs_write_file+0xef/0x170
>  [<ffffffff81180443>] vfs_write+0xb3/0x180
>  [<ffffffff8118076a>] sys_write+0x4a/0x90
>  [<ffffffff816236a9>] system_call_fastpath+0x16/0x1b
> Code: 48 c7 c7 94 60 01 a0 44 0f b7 ac 10 ac 00 00 00 31 c0 e8 41 b7 5f e1 41 83 c5 02 49 63 c5 49 8b 44 c4 10 48 85 c0 74 56 45 31 ff <39> 58 18 75 4e eb 1f 49 63 d7 4c 89 f7 48 89 45 c8 48 6b d2 28
> RIP  [<ffffffffa00159af>] coretemp_cpu_callback+0x93/0x1ba [coretemp]
>  RSP <ffff880472fb3d48>
> CR2: 000000000000013c
> 
> Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
> ---
>  drivers/hwmon/coretemp.c |    4 ++++
>  1 files changed, 4 insertions(+), 0 deletions(-)
> 
> diff --git a/drivers/hwmon/coretemp.c b/drivers/hwmon/coretemp.c
> index 0d3141f..54a70fe 100644
> --- a/drivers/hwmon/coretemp.c
> +++ b/drivers/hwmon/coretemp.c
> @@ -709,6 +709,10 @@ static void __cpuinit put_core_offline(unsigned int cpu)
>  
>  	indx = TO_ATTR_NO(cpu);
>  
> +	/* The core id is too big, just return */
> +	if (indx > MAX_CORE_DATA - 1)
> +		return;
> +
>  	if (pdata->core_data[indx] && pdata->core_data[indx]->cpu = cpu)
>  		coretemp_remove_core(pdata, &pdev->dev, indx);
>  
Hi,

good catch. Couple of problems, though.

First, what number of cores are we talking about ? We should probably
increase NUM_REAL_CORES as well. Long term, we should get rid of the
dependency to prevent that problem from happening again, but that is a
different issue.

Second, we'll need the same code in get_core_online(). Otherwise the
platform device can be created for the new core (if the core is
re-enabled) but will never be deleted.

Thanks,
Guenter



_______________________________________________
lm-sensors mailing list
lm-sensors@lm-sensors.org
http://lists.lm-sensors.org/mailman/listinfo/lm-sensors

^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: [PATCH] hwmon: coretemp: fix oops on cpu unplug
  2012-04-30 15:28   ` [lm-sensors] " Guenter Roeck
@ 2012-04-30 16:19     ` Kirill A. Shutemov
  -1 siblings, 0 replies; 52+ messages in thread
From: Kirill A. Shutemov @ 2012-04-30 16:19 UTC (permalink / raw)
  To: Guenter Roeck
  Cc: Fenghua Yu, Andi Kleen, Jean Delvare, lm-sensors, linux-kernel

On Mon, Apr 30, 2012 at 08:28:10AM -0700, Guenter Roeck wrote:
> Hi,
> 
> good catch. Couple of problems, though.
> 
> First, what number of cores are we talking about ? We should probably
> increase NUM_REAL_CORES as well. Long term, we should get rid of the
> dependency to prevent that problem from happening again, but that is a
> different issue.

It triggered by another bug. Cores on my 10-core processor have ids 0, 1,
2, 8, 9, 16, 17, 18, 24, 25. :-/
No need to change NUM_REAL_CORES at the moment.

> Second, we'll need the same code in get_core_online(). Otherwise the
> platform device can be created for the new core (if the core is
> re-enabled) but will never be deleted.

It has already handled in create_core_data().

-- 
 Kirill A. Shutemov

^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: [lm-sensors] [PATCH] hwmon: coretemp: fix oops on cpu unplug
@ 2012-04-30 16:19     ` Kirill A. Shutemov
  0 siblings, 0 replies; 52+ messages in thread
From: Kirill A. Shutemov @ 2012-04-30 16:19 UTC (permalink / raw)
  To: Guenter Roeck
  Cc: Fenghua Yu, Andi Kleen, Jean Delvare, lm-sensors, linux-kernel

On Mon, Apr 30, 2012 at 08:28:10AM -0700, Guenter Roeck wrote:
> Hi,
> 
> good catch. Couple of problems, though.
> 
> First, what number of cores are we talking about ? We should probably
> increase NUM_REAL_CORES as well. Long term, we should get rid of the
> dependency to prevent that problem from happening again, but that is a
> different issue.

It triggered by another bug. Cores on my 10-core processor have ids 0, 1,
2, 8, 9, 16, 17, 18, 24, 25. :-/
No need to change NUM_REAL_CORES at the moment.

> Second, we'll need the same code in get_core_online(). Otherwise the
> platform device can be created for the new core (if the core is
> re-enabled) but will never be deleted.

It has already handled in create_core_data().

-- 
 Kirill A. Shutemov

_______________________________________________
lm-sensors mailing list
lm-sensors@lm-sensors.org
http://lists.lm-sensors.org/mailman/listinfo/lm-sensors

^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: [PATCH] hwmon: coretemp: fix oops on cpu unplug
  2012-04-30 16:19     ` [lm-sensors] " Kirill A. Shutemov
@ 2012-04-30 16:59       ` Guenter Roeck
  -1 siblings, 0 replies; 52+ messages in thread
From: Guenter Roeck @ 2012-04-30 16:59 UTC (permalink / raw)
  To: Kirill A. Shutemov
  Cc: Fenghua Yu, Andi Kleen, Jean Delvare, lm-sensors, linux-kernel

On Mon, 2012-04-30 at 12:19 -0400, Kirill A. Shutemov wrote:
> On Mon, Apr 30, 2012 at 08:28:10AM -0700, Guenter Roeck wrote:
> > Hi,
> > 
> > good catch. Couple of problems, though.
> > 
> > First, what number of cores are we talking about ? We should probably
> > increase NUM_REAL_CORES as well. Long term, we should get rid of the
> > dependency to prevent that problem from happening again, but that is a
> > different issue.
> 
> It triggered by another bug. Cores on my 10-core processor have ids 0, 1,
> 2, 8, 9, 16, 17, 18, 24, 25. :-/
> No need to change NUM_REAL_CORES at the moment.
> 
The problem, though, is that core_data[] is indexed by the core ID. So,
NUM_REAL_CORES is a bit misleading. It should probably be named
NUM_CORE_ID or something similar.

> > Second, we'll need the same code in get_core_online(). Otherwise the
> > platform device can be created for the new core (if the core is
> > re-enabled) but will never be deleted.
> 
> It has already handled in create_core_data().
> 
... which is called from coretemp_add_core(), which does not return an
error. Besides, it is called after the call to coretemp_device_add().

As a result, the platform device for the unsupported core(s) will be
created, even if the core is not supported. I just don't think that is a
good idea.

Besides, if you'd have the check in get_core_online(), you would not
even hit the problem in put_core_offline(), since the platform device
would not exist and the function would bail out because of that.

Thanks,
Guenter



^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: [lm-sensors] [PATCH] hwmon: coretemp: fix oops on cpu unplug
@ 2012-04-30 16:59       ` Guenter Roeck
  0 siblings, 0 replies; 52+ messages in thread
From: Guenter Roeck @ 2012-04-30 16:59 UTC (permalink / raw)
  To: Kirill A. Shutemov
  Cc: Fenghua Yu, Andi Kleen, Jean Delvare, lm-sensors, linux-kernel

On Mon, 2012-04-30 at 12:19 -0400, Kirill A. Shutemov wrote:
> On Mon, Apr 30, 2012 at 08:28:10AM -0700, Guenter Roeck wrote:
> > Hi,
> > 
> > good catch. Couple of problems, though.
> > 
> > First, what number of cores are we talking about ? We should probably
> > increase NUM_REAL_CORES as well. Long term, we should get rid of the
> > dependency to prevent that problem from happening again, but that is a
> > different issue.
> 
> It triggered by another bug. Cores on my 10-core processor have ids 0, 1,
> 2, 8, 9, 16, 17, 18, 24, 25. :-/
> No need to change NUM_REAL_CORES at the moment.
> 
The problem, though, is that core_data[] is indexed by the core ID. So,
NUM_REAL_CORES is a bit misleading. It should probably be named
NUM_CORE_ID or something similar.

> > Second, we'll need the same code in get_core_online(). Otherwise the
> > platform device can be created for the new core (if the core is
> > re-enabled) but will never be deleted.
> 
> It has already handled in create_core_data().
> 
... which is called from coretemp_add_core(), which does not return an
error. Besides, it is called after the call to coretemp_device_add().

As a result, the platform device for the unsupported core(s) will be
created, even if the core is not supported. I just don't think that is a
good idea.

Besides, if you'd have the check in get_core_online(), you would not
even hit the problem in put_core_offline(), since the platform device
would not exist and the function would bail out because of that.

Thanks,
Guenter



_______________________________________________
lm-sensors mailing list
lm-sensors@lm-sensors.org
http://lists.lm-sensors.org/mailman/listinfo/lm-sensors

^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: [PATCH] hwmon: coretemp: fix oops on cpu unplug
  2012-04-30 13:18 ` [lm-sensors] " Kirill A. Shutemov
@ 2012-05-01 15:20   ` Guenter Roeck
  -1 siblings, 0 replies; 52+ messages in thread
From: Guenter Roeck @ 2012-05-01 15:20 UTC (permalink / raw)
  To: Kirill A. Shutemov
  Cc: Fenghua Yu, Andi Kleen, Jean Delvare, lm-sensors, linux-kernel

On Mon, Apr 30, 2012 at 09:18:01AM -0400, Kirill A. Shutemov wrote:
> From: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>
> 
> coretemp tries to access core_data array beyond bounds on cpu unplug if
> core id of the cpu if more than NUM_REAL_CORES-1.
> 
[ ... ]
> 
> Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>

Looking at it again, you were right. Adding the check to get_core_online()
doesn't really help as the platform device is per CPU, not per core.

Applied. I'll submit a separate patch to increase NUM_REAL_CORES.

Guenter

^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: [lm-sensors] [PATCH] hwmon: coretemp: fix oops on cpu unplug
@ 2012-05-01 15:20   ` Guenter Roeck
  0 siblings, 0 replies; 52+ messages in thread
From: Guenter Roeck @ 2012-05-01 15:20 UTC (permalink / raw)
  To: Kirill A. Shutemov
  Cc: Fenghua Yu, Andi Kleen, Jean Delvare, lm-sensors, linux-kernel

On Mon, Apr 30, 2012 at 09:18:01AM -0400, Kirill A. Shutemov wrote:
> From: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>
> 
> coretemp tries to access core_data array beyond bounds on cpu unplug if
> core id of the cpu if more than NUM_REAL_CORES-1.
> 
[ ... ]
> 
> Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>

Looking at it again, you were right. Adding the check to get_core_online()
doesn't really help as the platform device is per CPU, not per core.

Applied. I'll submit a separate patch to increase NUM_REAL_CORES.

Guenter

_______________________________________________
lm-sensors mailing list
lm-sensors@lm-sensors.org
http://lists.lm-sensors.org/mailman/listinfo/lm-sensors

^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: [PATCH] hwmon: coretemp: fix oops on cpu unplug
  2012-05-01 15:20   ` [lm-sensors] " Guenter Roeck
@ 2012-05-01 21:00     ` Kirill A. Shutemov
  -1 siblings, 0 replies; 52+ messages in thread
From: Kirill A. Shutemov @ 2012-05-01 21:00 UTC (permalink / raw)
  To: Guenter Roeck
  Cc: Fenghua Yu, Andi Kleen, Jean Delvare, lm-sensors, linux-kernel

On Tue, May 01, 2012 at 08:20:14AM -0700, Guenter Roeck wrote:
> On Mon, Apr 30, 2012 at 09:18:01AM -0400, Kirill A. Shutemov wrote:
> > From: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>
> > 
> > coretemp tries to access core_data array beyond bounds on cpu unplug if
> > core id of the cpu if more than NUM_REAL_CORES-1.
> > 
> [ ... ]
> > 
> > Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
> 
> Looking at it again, you were right. Adding the check to get_core_online()
> doesn't really help as the platform device is per CPU, not per core.
> 
> Applied. I'll submit a separate patch to increase NUM_REAL_CORES.

Actually, the problem is bigger.

Documentation/cputopology.txt:
====
2) /sys/devices/system/cpu/cpuX/topology/core_id:

        the CPU core ID of cpuX. Typically it is the hardware platform's
        identifier (rather than the kernel's).  The actual value is
        architecture and platform dependent.
====

We should not use core id as an index in an array since it's an arbitrary
number (from kernel POV).

-- 
 Kirill A. Shutemov

^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: [lm-sensors] [PATCH] hwmon: coretemp: fix oops on cpu unplug
@ 2012-05-01 21:00     ` Kirill A. Shutemov
  0 siblings, 0 replies; 52+ messages in thread
From: Kirill A. Shutemov @ 2012-05-01 21:00 UTC (permalink / raw)
  To: Guenter Roeck
  Cc: Fenghua Yu, Andi Kleen, Jean Delvare, lm-sensors, linux-kernel

On Tue, May 01, 2012 at 08:20:14AM -0700, Guenter Roeck wrote:
> On Mon, Apr 30, 2012 at 09:18:01AM -0400, Kirill A. Shutemov wrote:
> > From: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>
> > 
> > coretemp tries to access core_data array beyond bounds on cpu unplug if
> > core id of the cpu if more than NUM_REAL_CORES-1.
> > 
> [ ... ]
> > 
> > Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
> 
> Looking at it again, you were right. Adding the check to get_core_online()
> doesn't really help as the platform device is per CPU, not per core.
> 
> Applied. I'll submit a separate patch to increase NUM_REAL_CORES.

Actually, the problem is bigger.

Documentation/cputopology.txt:
==
2) /sys/devices/system/cpu/cpuX/topology/core_id:

        the CPU core ID of cpuX. Typically it is the hardware platform's
        identifier (rather than the kernel's).  The actual value is
        architecture and platform dependent.
==

We should not use core id as an index in an array since it's an arbitrary
number (from kernel POV).

-- 
 Kirill A. Shutemov

_______________________________________________
lm-sensors mailing list
lm-sensors@lm-sensors.org
http://lists.lm-sensors.org/mailman/listinfo/lm-sensors

^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: [PATCH] hwmon: coretemp: fix oops on cpu unplug
  2012-05-01 21:00     ` [lm-sensors] " Kirill A. Shutemov
@ 2012-05-01 21:25       ` Guenter Roeck
  -1 siblings, 0 replies; 52+ messages in thread
From: Guenter Roeck @ 2012-05-01 21:25 UTC (permalink / raw)
  To: Kirill A. Shutemov
  Cc: Fenghua Yu, Andi Kleen, Jean Delvare, lm-sensors, linux-kernel

On Tue, May 01, 2012 at 05:00:41PM -0400, Kirill A. Shutemov wrote:
> On Tue, May 01, 2012 at 08:20:14AM -0700, Guenter Roeck wrote:
> > On Mon, Apr 30, 2012 at 09:18:01AM -0400, Kirill A. Shutemov wrote:
> > > From: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>
> > > 
> > > coretemp tries to access core_data array beyond bounds on cpu unplug if
> > > core id of the cpu if more than NUM_REAL_CORES-1.
> > > 
> > [ ... ]
> > > 
> > > Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
> > 
> > Looking at it again, you were right. Adding the check to get_core_online()
> > doesn't really help as the platform device is per CPU, not per core.
> > 
> > Applied. I'll submit a separate patch to increase NUM_REAL_CORES.
> 
> Actually, the problem is bigger.
> 
> Documentation/cputopology.txt:
> ====
> 2) /sys/devices/system/cpu/cpuX/topology/core_id:
> 
>         the CPU core ID of cpuX. Typically it is the hardware platform's
>         identifier (rather than the kernel's).  The actual value is
>         architecture and platform dependent.
> ====
> 
> We should not use core id as an index in an array since it's an arbitrary
> number (from kernel POV).
> 
Yes, we know we'll need a better fix going forward. Using a fixed size array is bad,
no matter what context.

Thanks,
Guenter

^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: [lm-sensors] [PATCH] hwmon: coretemp: fix oops on cpu unplug
@ 2012-05-01 21:25       ` Guenter Roeck
  0 siblings, 0 replies; 52+ messages in thread
From: Guenter Roeck @ 2012-05-01 21:25 UTC (permalink / raw)
  To: Kirill A. Shutemov
  Cc: Fenghua Yu, Andi Kleen, Jean Delvare, lm-sensors, linux-kernel

On Tue, May 01, 2012 at 05:00:41PM -0400, Kirill A. Shutemov wrote:
> On Tue, May 01, 2012 at 08:20:14AM -0700, Guenter Roeck wrote:
> > On Mon, Apr 30, 2012 at 09:18:01AM -0400, Kirill A. Shutemov wrote:
> > > From: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>
> > > 
> > > coretemp tries to access core_data array beyond bounds on cpu unplug if
> > > core id of the cpu if more than NUM_REAL_CORES-1.
> > > 
> > [ ... ]
> > > 
> > > Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
> > 
> > Looking at it again, you were right. Adding the check to get_core_online()
> > doesn't really help as the platform device is per CPU, not per core.
> > 
> > Applied. I'll submit a separate patch to increase NUM_REAL_CORES.
> 
> Actually, the problem is bigger.
> 
> Documentation/cputopology.txt:
> ==
> 2) /sys/devices/system/cpu/cpuX/topology/core_id:
> 
>         the CPU core ID of cpuX. Typically it is the hardware platform's
>         identifier (rather than the kernel's).  The actual value is
>         architecture and platform dependent.
> ==
> 
> We should not use core id as an index in an array since it's an arbitrary
> number (from kernel POV).
> 
Yes, we know we'll need a better fix going forward. Using a fixed size array is bad,
no matter what context.

Thanks,
Guenter

_______________________________________________
lm-sensors mailing list
lm-sensors@lm-sensors.org
http://lists.lm-sensors.org/mailman/listinfo/lm-sensors

^ permalink raw reply	[flat|nested] 52+ messages in thread

* [PATCH] hwmon: coretemp: use list instead of fixed size array for temp data
  2012-05-01 21:25       ` [lm-sensors] " Guenter Roeck
@ 2012-05-02 15:10         ` Kirill A. Shutemov
  -1 siblings, 0 replies; 52+ messages in thread
From: Kirill A. Shutemov @ 2012-05-02 15:10 UTC (permalink / raw)
  To: Fenghua Yu, Guenter Roeck
  Cc: Andi Kleen, Jean Delvare, lm-sensors, linux-kernel, Kirill A. Shutemov

From: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>

Let's rework code to allow arbitrary number of cores on a CPU, not
limited by hardcoded array size.

Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
---
 drivers/hwmon/coretemp.c |  176 +++++++++++++++++++++++++++++++++-------------
 1 files changed, 127 insertions(+), 49 deletions(-)

diff --git a/drivers/hwmon/coretemp.c b/drivers/hwmon/coretemp.c
index 54a70fe..f5f9108 100644
--- a/drivers/hwmon/coretemp.c
+++ b/drivers/hwmon/coretemp.c
@@ -36,6 +36,8 @@
 #include <linux/cpu.h>
 #include <linux/pci.h>
 #include <linux/smp.h>
+#include <linux/list.h>
+#include <linux/kref.h>
 #include <linux/moduleparam.h>
 #include <asm/msr.h>
 #include <asm/processor.h>
@@ -52,11 +54,9 @@ module_param_named(tjmax, force_tjmax, int, 0444);
 MODULE_PARM_DESC(tjmax, "TjMax value in degrees Celsius");
 
 #define BASE_SYSFS_ATTR_NO	2	/* Sysfs Base attr no for coretemp */
-#define NUM_REAL_CORES		16	/* Number of Real cores per cpu */
 #define CORETEMP_NAME_LENGTH	17	/* String Length of attrs */
 #define MAX_CORE_ATTRS		4	/* Maximum no of basic attrs */
 #define TOTAL_ATTRS		(MAX_CORE_ATTRS + 1)
-#define MAX_CORE_DATA		(NUM_REAL_CORES + BASE_SYSFS_ATTR_NO)
 
 #define TO_PHYS_ID(cpu)		(cpu_data(cpu).phys_proc_id)
 #define TO_CORE_ID(cpu)		(cpu_data(cpu).cpu_core_id)
@@ -82,6 +82,9 @@ MODULE_PARM_DESC(tjmax, "TjMax value in degrees Celsius");
  * @valid: If this is 1, the current temperature is valid.
  */
 struct temp_data {
+	struct list_head list;
+	struct kref refcount;
+	int id;
 	int temp;
 	int ttarget;
 	int tjmax;
@@ -101,7 +104,8 @@ struct temp_data {
 struct platform_data {
 	struct device *hwmon_dev;
 	u16 phys_proc_id;
-	struct temp_data *core_data[MAX_CORE_DATA];
+	struct list_head temp_data_list;
+	spinlock_t temp_data_lock;
 	struct device_attribute name_attr;
 };
 
@@ -114,6 +118,29 @@ struct pdev_entry {
 static LIST_HEAD(pdev_list);
 static DEFINE_MUTEX(pdev_list_mutex);
 
+static void temp_data_release(struct kref *ref)
+{
+	struct temp_data *tdata = container_of(ref, struct temp_data, refcount);
+
+	kfree(tdata);
+}
+
+static struct temp_data *get_temp_data(struct platform_data *pdata, int id)
+{
+	struct temp_data *tdata;
+
+	spin_lock(&pdata->temp_data_lock);
+	list_for_each_entry(tdata, &pdata->temp_data_list, list)
+		if (tdata->id == id) {
+			kref_get(&tdata->refcount);
+			goto out;
+		}
+	tdata = NULL;
+out:
+	spin_unlock(&pdata->temp_data_lock);
+	return tdata;
+}
+
 static ssize_t show_name(struct device *dev,
 			struct device_attribute *devattr, char *buf)
 {
@@ -125,12 +152,19 @@ static ssize_t show_label(struct device *dev,
 {
 	struct sensor_device_attribute *attr = to_sensor_dev_attr(devattr);
 	struct platform_data *pdata = dev_get_drvdata(dev);
-	struct temp_data *tdata = pdata->core_data[attr->index];
+	struct temp_data *tdata = get_temp_data(pdata, attr->index);
+	ssize_t ret;
+
+	if (!tdata)
+		return -ENOENT;
 
 	if (tdata->is_pkg_data)
-		return sprintf(buf, "Physical id %u\n", pdata->phys_proc_id);
+		ret = sprintf(buf, "Physical id %u\n", pdata->phys_proc_id);
+	else
+		ret = sprintf(buf, "Core %u\n", tdata->cpu_core_id);
 
-	return sprintf(buf, "Core %u\n", tdata->cpu_core_id);
+	kref_put(&tdata->refcount, temp_data_release);
+	return ret;
 }
 
 static ssize_t show_crit_alarm(struct device *dev,
@@ -139,11 +173,18 @@ static ssize_t show_crit_alarm(struct device *dev,
 	u32 eax, edx;
 	struct sensor_device_attribute *attr = to_sensor_dev_attr(devattr);
 	struct platform_data *pdata = dev_get_drvdata(dev);
-	struct temp_data *tdata = pdata->core_data[attr->index];
+	struct temp_data *tdata = get_temp_data(pdata, attr->index);
+	ssize_t ret;
+
+	if (!tdata)
+		return -ENOENT;
 
 	rdmsr_on_cpu(tdata->cpu, tdata->status_reg, &eax, &edx);
 
-	return sprintf(buf, "%d\n", (eax >> 5) & 1);
+	ret = sprintf(buf, "%d\n", (eax >> 5) & 1);
+
+	kref_put(&tdata->refcount, temp_data_release);
+	return ret;
 }
 
 static ssize_t show_tjmax(struct device *dev,
@@ -151,8 +192,16 @@ static ssize_t show_tjmax(struct device *dev,
 {
 	struct sensor_device_attribute *attr = to_sensor_dev_attr(devattr);
 	struct platform_data *pdata = dev_get_drvdata(dev);
+	struct temp_data *tdata = get_temp_data(pdata, attr->index);
+	ssize_t ret;
+
+	if (!tdata)
+		return -ENOENT;
+
+	ret = sprintf(buf, "%d\n", tdata->tjmax);
 
-	return sprintf(buf, "%d\n", pdata->core_data[attr->index]->tjmax);
+	kref_put(&tdata->refcount, temp_data_release);
+	return ret;
 }
 
 static ssize_t show_ttarget(struct device *dev,
@@ -160,8 +209,16 @@ static ssize_t show_ttarget(struct device *dev,
 {
 	struct sensor_device_attribute *attr = to_sensor_dev_attr(devattr);
 	struct platform_data *pdata = dev_get_drvdata(dev);
+	struct temp_data *tdata = get_temp_data(pdata, attr->index);
+	ssize_t ret;
 
-	return sprintf(buf, "%d\n", pdata->core_data[attr->index]->ttarget);
+	if (!tdata)
+		return -ENOENT;
+
+	ret = sprintf(buf, "%d\n", tdata->ttarget);
+
+	kref_put(&tdata->refcount, temp_data_release);
+	return ret;
 }
 
 static ssize_t show_temp(struct device *dev,
@@ -170,7 +227,11 @@ static ssize_t show_temp(struct device *dev,
 	u32 eax, edx;
 	struct sensor_device_attribute *attr = to_sensor_dev_attr(devattr);
 	struct platform_data *pdata = dev_get_drvdata(dev);
-	struct temp_data *tdata = pdata->core_data[attr->index];
+	struct temp_data *tdata = get_temp_data(pdata, attr->index);
+	ssize_t ret = -EAGAIN;
+
+	if (!tdata)
+		return -ENOENT;
 
 	mutex_lock(&tdata->update_lock);
 
@@ -188,7 +249,12 @@ static ssize_t show_temp(struct device *dev,
 	}
 
 	mutex_unlock(&tdata->update_lock);
-	return tdata->valid ? sprintf(buf, "%d\n", tdata->temp) : -EAGAIN;
+
+	if (tdata->valid)
+		ret = sprintf(buf, "%d\n", tdata->temp);
+
+	kref_put(&tdata->refcount, temp_data_release);
+	return ret;
 }
 
 static int __cpuinit adjust_tjmax(struct cpuinfo_x86 *c, u32 id,
@@ -412,6 +478,7 @@ static struct temp_data __cpuinit *init_temp_data(unsigned int cpu,
 	tdata = kzalloc(sizeof(struct temp_data), GFP_KERNEL);
 	if (!tdata)
 		return NULL;
+	kref_init(&tdata->refcount);
 
 	tdata->status_reg = pkg_flag ? MSR_IA32_PACKAGE_THERM_STATUS :
 							MSR_IA32_THERM_STATUS;
@@ -423,7 +490,7 @@ static struct temp_data __cpuinit *init_temp_data(unsigned int cpu,
 	return tdata;
 }
 
-static int __cpuinit create_core_data(struct platform_device *pdev,
+static int __cpuinit create_temp_data(struct platform_device *pdev,
 				unsigned int cpu, int pkg_flag)
 {
 	struct temp_data *tdata;
@@ -440,9 +507,6 @@ static int __cpuinit create_core_data(struct platform_device *pdev,
 	 */
 	attr_no = pkg_flag ? 1 : TO_ATTR_NO(cpu);
 
-	if (attr_no > MAX_CORE_DATA - 1)
-		return -ERANGE;
-
 	/*
 	 * Provide a single set of attributes for all HT siblings of a core
 	 * to avoid duplicate sensors (the processor ID and core ID of all
@@ -450,8 +514,14 @@ static int __cpuinit create_core_data(struct platform_device *pdev,
 	 * Skip if a HT sibling of this core is already registered.
 	 * This is not an error.
 	 */
-	if (pdata->core_data[attr_no] != NULL)
-		return 0;
+	spin_lock(&pdata->temp_data_lock);
+	list_for_each_entry(tdata, &pdata->temp_data_list, list) {
+		if (tdata->id == attr_no) {
+			spin_unlock(&pdata->temp_data_lock);
+			return 0;
+		}
+	}
+	spin_unlock(&pdata->temp_data_lock);
 
 	tdata = init_temp_data(cpu, pkg_flag);
 	if (!tdata)
@@ -480,16 +550,23 @@ static int __cpuinit create_core_data(struct platform_device *pdev,
 		}
 	}
 
-	pdata->core_data[attr_no] = tdata;
+	tdata->id = attr_no;
+
+	spin_lock(&pdata->temp_data_lock);
+	list_add(&tdata->list, &pdata->temp_data_list);
+	spin_unlock(&pdata->temp_data_lock);
 
 	/* Create sysfs interfaces */
 	err = create_core_attrs(tdata, &pdev->dev, attr_no);
 	if (err)
-		goto exit_free;
+		goto exit_list_del;
 
 	return 0;
+exit_list_del:
+	spin_lock(&pdata->temp_data_lock);
+	list_del(&tdata->list);
+	spin_unlock(&pdata->temp_data_lock);
 exit_free:
-	pdata->core_data[attr_no] = NULL;
 	kfree(tdata);
 	return err;
 }
@@ -502,23 +579,21 @@ static void __cpuinit coretemp_add_core(unsigned int cpu, int pkg_flag)
 	if (!pdev)
 		return;
 
-	err = create_core_data(pdev, cpu, pkg_flag);
+	err = create_temp_data(pdev, cpu, pkg_flag);
 	if (err)
 		dev_err(&pdev->dev, "Adding Core %u failed\n", cpu);
 }
 
-static void coretemp_remove_core(struct platform_data *pdata,
-				struct device *dev, int indx)
+static void coretemp_remove_core(struct temp_data *tdata, struct device *dev)
 {
 	int i;
-	struct temp_data *tdata = pdata->core_data[indx];
 
 	/* Remove the sysfs attributes */
 	for (i = 0; i < tdata->attr_size; i++)
 		device_remove_file(dev, &tdata->sd_attrs[i].dev_attr);
 
-	kfree(pdata->core_data[indx]);
-	pdata->core_data[indx] = NULL;
+	list_del(&tdata->list);
+	kref_put(&tdata->refcount, temp_data_release);
 }
 
 static int __devinit coretemp_probe(struct platform_device *pdev)
@@ -536,6 +611,8 @@ static int __devinit coretemp_probe(struct platform_device *pdev)
 		goto exit_free;
 
 	pdata->phys_proc_id = pdev->id;
+	INIT_LIST_HEAD(&pdata->temp_data_list);
+	spin_lock_init(&pdata->temp_data_lock);
 	platform_set_drvdata(pdev, pdata);
 
 	pdata->hwmon_dev = hwmon_device_register(&pdev->dev);
@@ -557,11 +634,12 @@ exit_free:
 static int __devexit coretemp_remove(struct platform_device *pdev)
 {
 	struct platform_data *pdata = platform_get_drvdata(pdev);
-	int i;
+	struct temp_data *cur, *tmp;
 
-	for (i = MAX_CORE_DATA - 1; i >= 0; --i)
-		if (pdata->core_data[i])
-			coretemp_remove_core(pdata, &pdev->dev, i);
+	spin_lock(&pdata->temp_data_lock);
+	list_for_each_entry_safe(cur, tmp, &pdata->temp_data_list, list)
+		coretemp_remove_core(cur, &pdev->dev);
+	spin_unlock(&pdata->temp_data_lock);
 
 	device_remove_file(&pdev->dev, &pdata->name_attr);
 	hwmon_device_unregister(pdata->hwmon_dev);
@@ -641,16 +719,17 @@ static void __cpuinit coretemp_device_remove(unsigned int cpu)
 
 static bool __cpuinit is_any_core_online(struct platform_data *pdata)
 {
-	int i;
-
-	/* Find online cores, except pkgtemp data */
-	for (i = MAX_CORE_DATA - 1; i >= 0; --i) {
-		if (pdata->core_data[i] &&
-			!pdata->core_data[i]->is_pkg_data) {
-			return true;
-		}
-	}
-	return false;
+	struct temp_data *tdata;
+	bool ret = true;
+
+	spin_lock(&pdata->temp_data_lock);
+	list_for_each_entry(tdata, &pdata->temp_data_list, list)
+		if (!tdata->is_pkg_data)
+			 goto out;
+	ret = false;
+out:
+	spin_unlock(&pdata->temp_data_lock);
+	return ret;
 }
 
 static void __cpuinit get_core_online(unsigned int cpu)
@@ -697,9 +776,10 @@ static void __cpuinit get_core_online(unsigned int cpu)
 
 static void __cpuinit put_core_offline(unsigned int cpu)
 {
-	int i, indx;
+	int i, attr_no;
 	struct platform_data *pdata;
 	struct platform_device *pdev = coretemp_get_pdev(cpu);
+	struct temp_data *tdata;
 
 	/* If the physical CPU device does not exist, just return */
 	if (!pdev)
@@ -707,14 +787,12 @@ static void __cpuinit put_core_offline(unsigned int cpu)
 
 	pdata = platform_get_drvdata(pdev);
 
-	indx = TO_ATTR_NO(cpu);
-
-	/* The core id is too big, just return */
-	if (indx > MAX_CORE_DATA - 1)
-		return;
+	attr_no = TO_ATTR_NO(cpu);
 
-	if (pdata->core_data[indx] && pdata->core_data[indx]->cpu == cpu)
-		coretemp_remove_core(pdata, &pdev->dev, indx);
+	tdata = get_temp_data(pdata, attr_no);
+	if (tdata->cpu == cpu)
+		coretemp_remove_core(tdata, &pdev->dev);
+	kref_put(&tdata->refcount, temp_data_release);
 
 	/*
 	 * If a HT sibling of a core is taken offline, but another HT sibling
-- 
1.7.9.1


^ permalink raw reply related	[flat|nested] 52+ messages in thread

* [lm-sensors] [PATCH] hwmon: coretemp: use list instead of fixed size array for temp data
@ 2012-05-02 15:10         ` Kirill A. Shutemov
  0 siblings, 0 replies; 52+ messages in thread
From: Kirill A. Shutemov @ 2012-05-02 15:10 UTC (permalink / raw)
  To: Fenghua Yu, Guenter Roeck
  Cc: Andi Kleen, Jean Delvare, lm-sensors, linux-kernel, Kirill A. Shutemov

From: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>

Let's rework code to allow arbitrary number of cores on a CPU, not
limited by hardcoded array size.

Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
---
 drivers/hwmon/coretemp.c |  176 +++++++++++++++++++++++++++++++++-------------
 1 files changed, 127 insertions(+), 49 deletions(-)

diff --git a/drivers/hwmon/coretemp.c b/drivers/hwmon/coretemp.c
index 54a70fe..f5f9108 100644
--- a/drivers/hwmon/coretemp.c
+++ b/drivers/hwmon/coretemp.c
@@ -36,6 +36,8 @@
 #include <linux/cpu.h>
 #include <linux/pci.h>
 #include <linux/smp.h>
+#include <linux/list.h>
+#include <linux/kref.h>
 #include <linux/moduleparam.h>
 #include <asm/msr.h>
 #include <asm/processor.h>
@@ -52,11 +54,9 @@ module_param_named(tjmax, force_tjmax, int, 0444);
 MODULE_PARM_DESC(tjmax, "TjMax value in degrees Celsius");
 
 #define BASE_SYSFS_ATTR_NO	2	/* Sysfs Base attr no for coretemp */
-#define NUM_REAL_CORES		16	/* Number of Real cores per cpu */
 #define CORETEMP_NAME_LENGTH	17	/* String Length of attrs */
 #define MAX_CORE_ATTRS		4	/* Maximum no of basic attrs */
 #define TOTAL_ATTRS		(MAX_CORE_ATTRS + 1)
-#define MAX_CORE_DATA		(NUM_REAL_CORES + BASE_SYSFS_ATTR_NO)
 
 #define TO_PHYS_ID(cpu)		(cpu_data(cpu).phys_proc_id)
 #define TO_CORE_ID(cpu)		(cpu_data(cpu).cpu_core_id)
@@ -82,6 +82,9 @@ MODULE_PARM_DESC(tjmax, "TjMax value in degrees Celsius");
  * @valid: If this is 1, the current temperature is valid.
  */
 struct temp_data {
+	struct list_head list;
+	struct kref refcount;
+	int id;
 	int temp;
 	int ttarget;
 	int tjmax;
@@ -101,7 +104,8 @@ struct temp_data {
 struct platform_data {
 	struct device *hwmon_dev;
 	u16 phys_proc_id;
-	struct temp_data *core_data[MAX_CORE_DATA];
+	struct list_head temp_data_list;
+	spinlock_t temp_data_lock;
 	struct device_attribute name_attr;
 };
 
@@ -114,6 +118,29 @@ struct pdev_entry {
 static LIST_HEAD(pdev_list);
 static DEFINE_MUTEX(pdev_list_mutex);
 
+static void temp_data_release(struct kref *ref)
+{
+	struct temp_data *tdata = container_of(ref, struct temp_data, refcount);
+
+	kfree(tdata);
+}
+
+static struct temp_data *get_temp_data(struct platform_data *pdata, int id)
+{
+	struct temp_data *tdata;
+
+	spin_lock(&pdata->temp_data_lock);
+	list_for_each_entry(tdata, &pdata->temp_data_list, list)
+		if (tdata->id = id) {
+			kref_get(&tdata->refcount);
+			goto out;
+		}
+	tdata = NULL;
+out:
+	spin_unlock(&pdata->temp_data_lock);
+	return tdata;
+}
+
 static ssize_t show_name(struct device *dev,
 			struct device_attribute *devattr, char *buf)
 {
@@ -125,12 +152,19 @@ static ssize_t show_label(struct device *dev,
 {
 	struct sensor_device_attribute *attr = to_sensor_dev_attr(devattr);
 	struct platform_data *pdata = dev_get_drvdata(dev);
-	struct temp_data *tdata = pdata->core_data[attr->index];
+	struct temp_data *tdata = get_temp_data(pdata, attr->index);
+	ssize_t ret;
+
+	if (!tdata)
+		return -ENOENT;
 
 	if (tdata->is_pkg_data)
-		return sprintf(buf, "Physical id %u\n", pdata->phys_proc_id);
+		ret = sprintf(buf, "Physical id %u\n", pdata->phys_proc_id);
+	else
+		ret = sprintf(buf, "Core %u\n", tdata->cpu_core_id);
 
-	return sprintf(buf, "Core %u\n", tdata->cpu_core_id);
+	kref_put(&tdata->refcount, temp_data_release);
+	return ret;
 }
 
 static ssize_t show_crit_alarm(struct device *dev,
@@ -139,11 +173,18 @@ static ssize_t show_crit_alarm(struct device *dev,
 	u32 eax, edx;
 	struct sensor_device_attribute *attr = to_sensor_dev_attr(devattr);
 	struct platform_data *pdata = dev_get_drvdata(dev);
-	struct temp_data *tdata = pdata->core_data[attr->index];
+	struct temp_data *tdata = get_temp_data(pdata, attr->index);
+	ssize_t ret;
+
+	if (!tdata)
+		return -ENOENT;
 
 	rdmsr_on_cpu(tdata->cpu, tdata->status_reg, &eax, &edx);
 
-	return sprintf(buf, "%d\n", (eax >> 5) & 1);
+	ret = sprintf(buf, "%d\n", (eax >> 5) & 1);
+
+	kref_put(&tdata->refcount, temp_data_release);
+	return ret;
 }
 
 static ssize_t show_tjmax(struct device *dev,
@@ -151,8 +192,16 @@ static ssize_t show_tjmax(struct device *dev,
 {
 	struct sensor_device_attribute *attr = to_sensor_dev_attr(devattr);
 	struct platform_data *pdata = dev_get_drvdata(dev);
+	struct temp_data *tdata = get_temp_data(pdata, attr->index);
+	ssize_t ret;
+
+	if (!tdata)
+		return -ENOENT;
+
+	ret = sprintf(buf, "%d\n", tdata->tjmax);
 
-	return sprintf(buf, "%d\n", pdata->core_data[attr->index]->tjmax);
+	kref_put(&tdata->refcount, temp_data_release);
+	return ret;
 }
 
 static ssize_t show_ttarget(struct device *dev,
@@ -160,8 +209,16 @@ static ssize_t show_ttarget(struct device *dev,
 {
 	struct sensor_device_attribute *attr = to_sensor_dev_attr(devattr);
 	struct platform_data *pdata = dev_get_drvdata(dev);
+	struct temp_data *tdata = get_temp_data(pdata, attr->index);
+	ssize_t ret;
 
-	return sprintf(buf, "%d\n", pdata->core_data[attr->index]->ttarget);
+	if (!tdata)
+		return -ENOENT;
+
+	ret = sprintf(buf, "%d\n", tdata->ttarget);
+
+	kref_put(&tdata->refcount, temp_data_release);
+	return ret;
 }
 
 static ssize_t show_temp(struct device *dev,
@@ -170,7 +227,11 @@ static ssize_t show_temp(struct device *dev,
 	u32 eax, edx;
 	struct sensor_device_attribute *attr = to_sensor_dev_attr(devattr);
 	struct platform_data *pdata = dev_get_drvdata(dev);
-	struct temp_data *tdata = pdata->core_data[attr->index];
+	struct temp_data *tdata = get_temp_data(pdata, attr->index);
+	ssize_t ret = -EAGAIN;
+
+	if (!tdata)
+		return -ENOENT;
 
 	mutex_lock(&tdata->update_lock);
 
@@ -188,7 +249,12 @@ static ssize_t show_temp(struct device *dev,
 	}
 
 	mutex_unlock(&tdata->update_lock);
-	return tdata->valid ? sprintf(buf, "%d\n", tdata->temp) : -EAGAIN;
+
+	if (tdata->valid)
+		ret = sprintf(buf, "%d\n", tdata->temp);
+
+	kref_put(&tdata->refcount, temp_data_release);
+	return ret;
 }
 
 static int __cpuinit adjust_tjmax(struct cpuinfo_x86 *c, u32 id,
@@ -412,6 +478,7 @@ static struct temp_data __cpuinit *init_temp_data(unsigned int cpu,
 	tdata = kzalloc(sizeof(struct temp_data), GFP_KERNEL);
 	if (!tdata)
 		return NULL;
+	kref_init(&tdata->refcount);
 
 	tdata->status_reg = pkg_flag ? MSR_IA32_PACKAGE_THERM_STATUS :
 							MSR_IA32_THERM_STATUS;
@@ -423,7 +490,7 @@ static struct temp_data __cpuinit *init_temp_data(unsigned int cpu,
 	return tdata;
 }
 
-static int __cpuinit create_core_data(struct platform_device *pdev,
+static int __cpuinit create_temp_data(struct platform_device *pdev,
 				unsigned int cpu, int pkg_flag)
 {
 	struct temp_data *tdata;
@@ -440,9 +507,6 @@ static int __cpuinit create_core_data(struct platform_device *pdev,
 	 */
 	attr_no = pkg_flag ? 1 : TO_ATTR_NO(cpu);
 
-	if (attr_no > MAX_CORE_DATA - 1)
-		return -ERANGE;
-
 	/*
 	 * Provide a single set of attributes for all HT siblings of a core
 	 * to avoid duplicate sensors (the processor ID and core ID of all
@@ -450,8 +514,14 @@ static int __cpuinit create_core_data(struct platform_device *pdev,
 	 * Skip if a HT sibling of this core is already registered.
 	 * This is not an error.
 	 */
-	if (pdata->core_data[attr_no] != NULL)
-		return 0;
+	spin_lock(&pdata->temp_data_lock);
+	list_for_each_entry(tdata, &pdata->temp_data_list, list) {
+		if (tdata->id = attr_no) {
+			spin_unlock(&pdata->temp_data_lock);
+			return 0;
+		}
+	}
+	spin_unlock(&pdata->temp_data_lock);
 
 	tdata = init_temp_data(cpu, pkg_flag);
 	if (!tdata)
@@ -480,16 +550,23 @@ static int __cpuinit create_core_data(struct platform_device *pdev,
 		}
 	}
 
-	pdata->core_data[attr_no] = tdata;
+	tdata->id = attr_no;
+
+	spin_lock(&pdata->temp_data_lock);
+	list_add(&tdata->list, &pdata->temp_data_list);
+	spin_unlock(&pdata->temp_data_lock);
 
 	/* Create sysfs interfaces */
 	err = create_core_attrs(tdata, &pdev->dev, attr_no);
 	if (err)
-		goto exit_free;
+		goto exit_list_del;
 
 	return 0;
+exit_list_del:
+	spin_lock(&pdata->temp_data_lock);
+	list_del(&tdata->list);
+	spin_unlock(&pdata->temp_data_lock);
 exit_free:
-	pdata->core_data[attr_no] = NULL;
 	kfree(tdata);
 	return err;
 }
@@ -502,23 +579,21 @@ static void __cpuinit coretemp_add_core(unsigned int cpu, int pkg_flag)
 	if (!pdev)
 		return;
 
-	err = create_core_data(pdev, cpu, pkg_flag);
+	err = create_temp_data(pdev, cpu, pkg_flag);
 	if (err)
 		dev_err(&pdev->dev, "Adding Core %u failed\n", cpu);
 }
 
-static void coretemp_remove_core(struct platform_data *pdata,
-				struct device *dev, int indx)
+static void coretemp_remove_core(struct temp_data *tdata, struct device *dev)
 {
 	int i;
-	struct temp_data *tdata = pdata->core_data[indx];
 
 	/* Remove the sysfs attributes */
 	for (i = 0; i < tdata->attr_size; i++)
 		device_remove_file(dev, &tdata->sd_attrs[i].dev_attr);
 
-	kfree(pdata->core_data[indx]);
-	pdata->core_data[indx] = NULL;
+	list_del(&tdata->list);
+	kref_put(&tdata->refcount, temp_data_release);
 }
 
 static int __devinit coretemp_probe(struct platform_device *pdev)
@@ -536,6 +611,8 @@ static int __devinit coretemp_probe(struct platform_device *pdev)
 		goto exit_free;
 
 	pdata->phys_proc_id = pdev->id;
+	INIT_LIST_HEAD(&pdata->temp_data_list);
+	spin_lock_init(&pdata->temp_data_lock);
 	platform_set_drvdata(pdev, pdata);
 
 	pdata->hwmon_dev = hwmon_device_register(&pdev->dev);
@@ -557,11 +634,12 @@ exit_free:
 static int __devexit coretemp_remove(struct platform_device *pdev)
 {
 	struct platform_data *pdata = platform_get_drvdata(pdev);
-	int i;
+	struct temp_data *cur, *tmp;
 
-	for (i = MAX_CORE_DATA - 1; i >= 0; --i)
-		if (pdata->core_data[i])
-			coretemp_remove_core(pdata, &pdev->dev, i);
+	spin_lock(&pdata->temp_data_lock);
+	list_for_each_entry_safe(cur, tmp, &pdata->temp_data_list, list)
+		coretemp_remove_core(cur, &pdev->dev);
+	spin_unlock(&pdata->temp_data_lock);
 
 	device_remove_file(&pdev->dev, &pdata->name_attr);
 	hwmon_device_unregister(pdata->hwmon_dev);
@@ -641,16 +719,17 @@ static void __cpuinit coretemp_device_remove(unsigned int cpu)
 
 static bool __cpuinit is_any_core_online(struct platform_data *pdata)
 {
-	int i;
-
-	/* Find online cores, except pkgtemp data */
-	for (i = MAX_CORE_DATA - 1; i >= 0; --i) {
-		if (pdata->core_data[i] &&
-			!pdata->core_data[i]->is_pkg_data) {
-			return true;
-		}
-	}
-	return false;
+	struct temp_data *tdata;
+	bool ret = true;
+
+	spin_lock(&pdata->temp_data_lock);
+	list_for_each_entry(tdata, &pdata->temp_data_list, list)
+		if (!tdata->is_pkg_data)
+			 goto out;
+	ret = false;
+out:
+	spin_unlock(&pdata->temp_data_lock);
+	return ret;
 }
 
 static void __cpuinit get_core_online(unsigned int cpu)
@@ -697,9 +776,10 @@ static void __cpuinit get_core_online(unsigned int cpu)
 
 static void __cpuinit put_core_offline(unsigned int cpu)
 {
-	int i, indx;
+	int i, attr_no;
 	struct platform_data *pdata;
 	struct platform_device *pdev = coretemp_get_pdev(cpu);
+	struct temp_data *tdata;
 
 	/* If the physical CPU device does not exist, just return */
 	if (!pdev)
@@ -707,14 +787,12 @@ static void __cpuinit put_core_offline(unsigned int cpu)
 
 	pdata = platform_get_drvdata(pdev);
 
-	indx = TO_ATTR_NO(cpu);
-
-	/* The core id is too big, just return */
-	if (indx > MAX_CORE_DATA - 1)
-		return;
+	attr_no = TO_ATTR_NO(cpu);
 
-	if (pdata->core_data[indx] && pdata->core_data[indx]->cpu = cpu)
-		coretemp_remove_core(pdata, &pdev->dev, indx);
+	tdata = get_temp_data(pdata, attr_no);
+	if (tdata->cpu = cpu)
+		coretemp_remove_core(tdata, &pdev->dev);
+	kref_put(&tdata->refcount, temp_data_release);
 
 	/*
 	 * If a HT sibling of a core is taken offline, but another HT sibling
-- 
1.7.9.1


_______________________________________________
lm-sensors mailing list
lm-sensors@lm-sensors.org
http://lists.lm-sensors.org/mailman/listinfo/lm-sensors

^ permalink raw reply related	[flat|nested] 52+ messages in thread

* RE: [lm-sensors] [PATCH] hwmon: coretemp: use list instead of fixed size array for temp data
  2012-05-02 15:10         ` [lm-sensors] " Kirill A. Shutemov
@ 2012-05-03  5:29           ` R, Durgadoss
  -1 siblings, 0 replies; 52+ messages in thread
From: R, Durgadoss @ 2012-05-03  5:29 UTC (permalink / raw)
  To: Kirill A. Shutemov, Yu, Fenghua, Guenter Roeck
  Cc: Andi Kleen, linux-kernel, lm-sensors

Hi,


> -----Original Message-----
> From: lm-sensors-bounces@lm-sensors.org [mailto:lm-sensors-bounces@lm-
> sensors.org] On Behalf Of Kirill A. Shutemov
> Sent: Wednesday, May 02, 2012 8:41 PM
> To: Yu, Fenghua; Guenter Roeck
> Cc: Andi Kleen; linux-kernel@vger.kernel.org; Kirill A. Shutemov; lm-sensors@lm-
> sensors.org
> Subject: [lm-sensors] [PATCH] hwmon: coretemp: use list instead of fixed size
> array for temp data
> 
> From: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>
> 
> Let's rework code to allow arbitrary number of cores on a CPU, not
> limited by hardcoded array size.
> 
> Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
> ---
>  drivers/hwmon/coretemp.c |  176 +++++++++++++++++++++++++++++++++---
> ----------
>  1 files changed, 127 insertions(+), 49 deletions(-)
> 
> diff --git a/drivers/hwmon/coretemp.c b/drivers/hwmon/coretemp.c
> index 54a70fe..f5f9108 100644
> --- a/drivers/hwmon/coretemp.c
> +++ b/drivers/hwmon/coretemp.c
> @@ -36,6 +36,8 @@
>  #include <linux/cpu.h>
>  #include <linux/pci.h>
>  #include <linux/smp.h>
> +#include <linux/list.h>
> +#include <linux/kref.h>
>  #include <linux/moduleparam.h>
>  #include <asm/msr.h>
>  #include <asm/processor.h>
> @@ -52,11 +54,9 @@ module_param_named(tjmax, force_tjmax, int, 0444);
>  MODULE_PARM_DESC(tjmax, "TjMax value in degrees Celsius");
> 
>  #define BASE_SYSFS_ATTR_NO	2	/* Sysfs Base attr no for coretemp
> */
> -#define NUM_REAL_CORES		16	/* Number of Real cores per cpu
> */
>  #define CORETEMP_NAME_LENGTH	17	/* String Length of attrs */
>  #define MAX_CORE_ATTRS		4	/* Maximum no of basic attrs */
>  #define TOTAL_ATTRS		(MAX_CORE_ATTRS + 1)
> -#define MAX_CORE_DATA		(NUM_REAL_CORES +
> BASE_SYSFS_ATTR_NO)
> 
>  #define TO_PHYS_ID(cpu)		(cpu_data(cpu).phys_proc_id)
>  #define TO_CORE_ID(cpu)		(cpu_data(cpu).cpu_core_id)
> @@ -82,6 +82,9 @@ MODULE_PARM_DESC(tjmax, "TjMax value in degrees
> Celsius");
>   * @valid: If this is 1, the current temperature is valid.
>   */
>  struct temp_data {
> +	struct list_head list;
> +	struct kref refcount;
> +	int id;
>  	int temp;
>  	int ttarget;
>  	int tjmax;
> @@ -101,7 +104,8 @@ struct temp_data {
>  struct platform_data {
>  	struct device *hwmon_dev;
>  	u16 phys_proc_id;
> -	struct temp_data *core_data[MAX_CORE_DATA];
> +	struct list_head temp_data_list;
> +	spinlock_t temp_data_lock;
>  	struct device_attribute name_attr;
>  };
> 
> @@ -114,6 +118,29 @@ struct pdev_entry {
>  static LIST_HEAD(pdev_list);
>  static DEFINE_MUTEX(pdev_list_mutex);
> 
> +static void temp_data_release(struct kref *ref)
> +{
> +	struct temp_data *tdata = container_of(ref, struct temp_data, refcount);
> +
> +	kfree(tdata);
> +}
> +
> +static struct temp_data *get_temp_data(struct platform_data *pdata, int id)
> +{
> +	struct temp_data *tdata;
> +
> +	spin_lock(&pdata->temp_data_lock);
> +	list_for_each_entry(tdata, &pdata->temp_data_list, list)
> +		if (tdata->id == id) {
> +			kref_get(&tdata->refcount);
> +			goto out;
> +		}
> +	tdata = NULL;
> +out:
> +	spin_unlock(&pdata->temp_data_lock);
> +	return tdata;
> +}
> +
>  static ssize_t show_name(struct device *dev,
>  			struct device_attribute *devattr, char *buf)
>  {
> @@ -125,12 +152,19 @@ static ssize_t show_label(struct device *dev,
>  {
>  	struct sensor_device_attribute *attr = to_sensor_dev_attr(devattr);
>  	struct platform_data *pdata = dev_get_drvdata(dev);
> -	struct temp_data *tdata = pdata->core_data[attr->index];
> +	struct temp_data *tdata = get_temp_data(pdata, attr->index);
> +	ssize_t ret;
> +
> +	if (!tdata)
> +		return -ENOENT;
> 
>  	if (tdata->is_pkg_data)
> -		return sprintf(buf, "Physical id %u\n", pdata->phys_proc_id);
> +		ret = sprintf(buf, "Physical id %u\n", pdata->phys_proc_id);
> +	else
> +		ret = sprintf(buf, "Core %u\n", tdata->cpu_core_id);
> 
> -	return sprintf(buf, "Core %u\n", tdata->cpu_core_id);
> +	kref_put(&tdata->refcount, temp_data_release);
> +	return ret;
>  }
> 
>  static ssize_t show_crit_alarm(struct device *dev,
> @@ -139,11 +173,18 @@ static ssize_t show_crit_alarm(struct device *dev,
>  	u32 eax, edx;
>  	struct sensor_device_attribute *attr = to_sensor_dev_attr(devattr);
>  	struct platform_data *pdata = dev_get_drvdata(dev);
> -	struct temp_data *tdata = pdata->core_data[attr->index];
> +	struct temp_data *tdata = get_temp_data(pdata, attr->index);
> +	ssize_t ret;
> +
> +	if (!tdata)
> +		return -ENOENT;
> 
>  	rdmsr_on_cpu(tdata->cpu, tdata->status_reg, &eax, &edx);
> 
> -	return sprintf(buf, "%d\n", (eax >> 5) & 1);
> +	ret = sprintf(buf, "%d\n", (eax >> 5) & 1);
> +
> +	kref_put(&tdata->refcount, temp_data_release);
> +	return ret;
>  }
> 
>  static ssize_t show_tjmax(struct device *dev,
> @@ -151,8 +192,16 @@ static ssize_t show_tjmax(struct device *dev,
>  {
>  	struct sensor_device_attribute *attr = to_sensor_dev_attr(devattr);
>  	struct platform_data *pdata = dev_get_drvdata(dev);
> +	struct temp_data *tdata = get_temp_data(pdata, attr->index);
> +	ssize_t ret;
> +
> +	if (!tdata)
> +		return -ENOENT;
> +
> +	ret = sprintf(buf, "%d\n", tdata->tjmax);
> 
> -	return sprintf(buf, "%d\n", pdata->core_data[attr->index]->tjmax);
> +	kref_put(&tdata->refcount, temp_data_release);
> +	return ret;
>  }
> 
>  static ssize_t show_ttarget(struct device *dev,
> @@ -160,8 +209,16 @@ static ssize_t show_ttarget(struct device *dev,
>  {
>  	struct sensor_device_attribute *attr = to_sensor_dev_attr(devattr);
>  	struct platform_data *pdata = dev_get_drvdata(dev);
> +	struct temp_data *tdata = get_temp_data(pdata, attr->index);
> +	ssize_t ret;
> 
> -	return sprintf(buf, "%d\n", pdata->core_data[attr->index]->ttarget);
> +	if (!tdata)
> +		return -ENOENT;
> +
> +	ret = sprintf(buf, "%d\n", tdata->ttarget);
> +
> +	kref_put(&tdata->refcount, temp_data_release);
> +	return ret;
>  }
> 
>  static ssize_t show_temp(struct device *dev,
> @@ -170,7 +227,11 @@ static ssize_t show_temp(struct device *dev,
>  	u32 eax, edx;
>  	struct sensor_device_attribute *attr = to_sensor_dev_attr(devattr);
>  	struct platform_data *pdata = dev_get_drvdata(dev);
> -	struct temp_data *tdata = pdata->core_data[attr->index];
> +	struct temp_data *tdata = get_temp_data(pdata, attr->index);
> +	ssize_t ret = -EAGAIN;
> +
> +	if (!tdata)
> +		return -ENOENT;
> 
>  	mutex_lock(&tdata->update_lock);
> 
> @@ -188,7 +249,12 @@ static ssize_t show_temp(struct device *dev,
>  	}
> 
>  	mutex_unlock(&tdata->update_lock);
> -	return tdata->valid ? sprintf(buf, "%d\n", tdata->temp) : -EAGAIN;
> +
> +	if (tdata->valid)
> +		ret = sprintf(buf, "%d\n", tdata->temp);
> +
> +	kref_put(&tdata->refcount, temp_data_release);
> +	return ret;
>  }
> 
>  static int __cpuinit adjust_tjmax(struct cpuinfo_x86 *c, u32 id,
> @@ -412,6 +478,7 @@ static struct temp_data __cpuinit
> *init_temp_data(unsigned int cpu,
>  	tdata = kzalloc(sizeof(struct temp_data), GFP_KERNEL);
>  	if (!tdata)
>  		return NULL;
> +	kref_init(&tdata->refcount);
> 
>  	tdata->status_reg = pkg_flag ? MSR_IA32_PACKAGE_THERM_STATUS :
> 
> 	MSR_IA32_THERM_STATUS;
> @@ -423,7 +490,7 @@ static struct temp_data __cpuinit
> *init_temp_data(unsigned int cpu,
>  	return tdata;
>  }
> 
> -static int __cpuinit create_core_data(struct platform_device *pdev,
> +static int __cpuinit create_temp_data(struct platform_device *pdev,
>  				unsigned int cpu, int pkg_flag)
>  {
>  	struct temp_data *tdata;
> @@ -440,9 +507,6 @@ static int __cpuinit create_core_data(struct
> platform_device *pdev,
>  	 */
>  	attr_no = pkg_flag ? 1 : TO_ATTR_NO(cpu);
> 
> -	if (attr_no > MAX_CORE_DATA - 1)
> -		return -ERANGE;
> -
>  	/*
>  	 * Provide a single set of attributes for all HT siblings of a core
>  	 * to avoid duplicate sensors (the processor ID and core ID of all
> @@ -450,8 +514,14 @@ static int __cpuinit create_core_data(struct
> platform_device *pdev,
>  	 * Skip if a HT sibling of this core is already registered.
>  	 * This is not an error.
>  	 */
> -	if (pdata->core_data[attr_no] != NULL)
> -		return 0;
> +	spin_lock(&pdata->temp_data_lock);
> +	list_for_each_entry(tdata, &pdata->temp_data_list, list) {
> +		if (tdata->id == attr_no) {
> +			spin_unlock(&pdata->temp_data_lock);
> +			return 0;
> +		}
> +	}
> +	spin_unlock(&pdata->temp_data_lock);
> 
>  	tdata = init_temp_data(cpu, pkg_flag);
>  	if (!tdata)
> @@ -480,16 +550,23 @@ static int __cpuinit create_core_data(struct
> platform_device *pdev,
>  		}
>  	}
> 
> -	pdata->core_data[attr_no] = tdata;
> +	tdata->id = attr_no;
> +
> +	spin_lock(&pdata->temp_data_lock);
> +	list_add(&tdata->list, &pdata->temp_data_list);
> +	spin_unlock(&pdata->temp_data_lock);
> 
>  	/* Create sysfs interfaces */
>  	err = create_core_attrs(tdata, &pdev->dev, attr_no);
>  	if (err)
> -		goto exit_free;
> +		goto exit_list_del;
> 
>  	return 0;
> +exit_list_del:
> +	spin_lock(&pdata->temp_data_lock);
> +	list_del(&tdata->list);
> +	spin_unlock(&pdata->temp_data_lock);
>  exit_free:
> -	pdata->core_data[attr_no] = NULL;
>  	kfree(tdata);
>  	return err;
>  }
> @@ -502,23 +579,21 @@ static void __cpuinit coretemp_add_core(unsigned int
> cpu, int pkg_flag)
>  	if (!pdev)
>  		return;
> 
> -	err = create_core_data(pdev, cpu, pkg_flag);
> +	err = create_temp_data(pdev, cpu, pkg_flag);
>  	if (err)
>  		dev_err(&pdev->dev, "Adding Core %u failed\n", cpu);
>  }
> 
> -static void coretemp_remove_core(struct platform_data *pdata,
> -				struct device *dev, int indx)
> +static void coretemp_remove_core(struct temp_data *tdata, struct device
> *dev)
>  {
>  	int i;
> -	struct temp_data *tdata = pdata->core_data[indx];
> 
>  	/* Remove the sysfs attributes */
>  	for (i = 0; i < tdata->attr_size; i++)
>  		device_remove_file(dev, &tdata->sd_attrs[i].dev_attr);
> 
> -	kfree(pdata->core_data[indx]);
> -	pdata->core_data[indx] = NULL;
> +	list_del(&tdata->list);
> +	kref_put(&tdata->refcount, temp_data_release);
>  }
> 
>  static int __devinit coretemp_probe(struct platform_device *pdev)
> @@ -536,6 +611,8 @@ static int __devinit coretemp_probe(struct
> platform_device *pdev)
>  		goto exit_free;
> 
>  	pdata->phys_proc_id = pdev->id;
> +	INIT_LIST_HEAD(&pdata->temp_data_list);
> +	spin_lock_init(&pdata->temp_data_lock);
>  	platform_set_drvdata(pdev, pdata);
> 
>  	pdata->hwmon_dev = hwmon_device_register(&pdev->dev);
> @@ -557,11 +634,12 @@ exit_free:
>  static int __devexit coretemp_remove(struct platform_device *pdev)
>  {
>  	struct platform_data *pdata = platform_get_drvdata(pdev);
> -	int i;
> +	struct temp_data *cur, *tmp;
> 
> -	for (i = MAX_CORE_DATA - 1; i >= 0; --i)
> -		if (pdata->core_data[i])
> -			coretemp_remove_core(pdata, &pdev->dev, i);
> +	spin_lock(&pdata->temp_data_lock);
> +	list_for_each_entry_safe(cur, tmp, &pdata->temp_data_list, list)
> +		coretemp_remove_core(cur, &pdev->dev);
> +	spin_unlock(&pdata->temp_data_lock);
> 
>  	device_remove_file(&pdev->dev, &pdata->name_attr);
>  	hwmon_device_unregister(pdata->hwmon_dev);
> @@ -641,16 +719,17 @@ static void __cpuinit
> coretemp_device_remove(unsigned int cpu)
> 
>  static bool __cpuinit is_any_core_online(struct platform_data *pdata)
>  {
> -	int i;
> -
> -	/* Find online cores, except pkgtemp data */
> -	for (i = MAX_CORE_DATA - 1; i >= 0; --i) {
> -		if (pdata->core_data[i] &&
> -			!pdata->core_data[i]->is_pkg_data) {
> -			return true;
> -		}
> -	}
> -	return false;
> +	struct temp_data *tdata;
> +	bool ret = true;
> +
> +	spin_lock(&pdata->temp_data_lock);
> +	list_for_each_entry(tdata, &pdata->temp_data_list, list)
> +		if (!tdata->is_pkg_data)
> +			 goto out;
> +	ret = false;
> +out:
> +	spin_unlock(&pdata->temp_data_lock);
> +	return ret;
>  }
> 
>  static void __cpuinit get_core_online(unsigned int cpu)
> @@ -697,9 +776,10 @@ static void __cpuinit get_core_online(unsigned int cpu)
> 
>  static void __cpuinit put_core_offline(unsigned int cpu)
>  {
> -	int i, indx;
> +	int i, attr_no;
>  	struct platform_data *pdata;
>  	struct platform_device *pdev = coretemp_get_pdev(cpu);
> +	struct temp_data *tdata;
> 
>  	/* If the physical CPU device does not exist, just return */
>  	if (!pdev)
> @@ -707,14 +787,12 @@ static void __cpuinit put_core_offline(unsigned int cpu)
> 
>  	pdata = platform_get_drvdata(pdev);
> 
> -	indx = TO_ATTR_NO(cpu);
> -
> -	/* The core id is too big, just return */
> -	if (indx > MAX_CORE_DATA - 1)
> -		return;
> +	attr_no = TO_ATTR_NO(cpu);
> 
> -	if (pdata->core_data[indx] && pdata->core_data[indx]->cpu == cpu)
> -		coretemp_remove_core(pdata, &pdev->dev, indx);
> +	tdata = get_temp_data(pdata, attr_no);
> +	if (tdata->cpu == cpu)

The get_temp_data can return a NULL. So, you might want to do,
if (tdata && tdata->cpu == cpu) to avoid a potential NULL ptr crash.

In general, why are we using spin_locks instead of mutex_locks,
for list manipulations .. ?

Thanks,
Durga


^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: [lm-sensors] [PATCH] hwmon: coretemp: use list instead of fixed size array for temp data
@ 2012-05-03  5:29           ` R, Durgadoss
  0 siblings, 0 replies; 52+ messages in thread
From: R, Durgadoss @ 2012-05-03  5:29 UTC (permalink / raw)
  To: Kirill A. Shutemov, Yu, Fenghua, Guenter Roeck
  Cc: Andi Kleen, linux-kernel, lm-sensors

Hi,


> -----Original Message-----
> From: lm-sensors-bounces@lm-sensors.org [mailto:lm-sensors-bounces@lm-
> sensors.org] On Behalf Of Kirill A. Shutemov
> Sent: Wednesday, May 02, 2012 8:41 PM
> To: Yu, Fenghua; Guenter Roeck
> Cc: Andi Kleen; linux-kernel@vger.kernel.org; Kirill A. Shutemov; lm-sensors@lm-
> sensors.org
> Subject: [lm-sensors] [PATCH] hwmon: coretemp: use list instead of fixed size
> array for temp data
> 
> From: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>
> 
> Let's rework code to allow arbitrary number of cores on a CPU, not
> limited by hardcoded array size.
> 
> Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
> ---
>  drivers/hwmon/coretemp.c |  176 +++++++++++++++++++++++++++++++++---
> ----------
>  1 files changed, 127 insertions(+), 49 deletions(-)
> 
> diff --git a/drivers/hwmon/coretemp.c b/drivers/hwmon/coretemp.c
> index 54a70fe..f5f9108 100644
> --- a/drivers/hwmon/coretemp.c
> +++ b/drivers/hwmon/coretemp.c
> @@ -36,6 +36,8 @@
>  #include <linux/cpu.h>
>  #include <linux/pci.h>
>  #include <linux/smp.h>
> +#include <linux/list.h>
> +#include <linux/kref.h>
>  #include <linux/moduleparam.h>
>  #include <asm/msr.h>
>  #include <asm/processor.h>
> @@ -52,11 +54,9 @@ module_param_named(tjmax, force_tjmax, int, 0444);
>  MODULE_PARM_DESC(tjmax, "TjMax value in degrees Celsius");
> 
>  #define BASE_SYSFS_ATTR_NO	2	/* Sysfs Base attr no for coretemp
> */
> -#define NUM_REAL_CORES		16	/* Number of Real cores per cpu
> */
>  #define CORETEMP_NAME_LENGTH	17	/* String Length of attrs */
>  #define MAX_CORE_ATTRS		4	/* Maximum no of basic attrs */
>  #define TOTAL_ATTRS		(MAX_CORE_ATTRS + 1)
> -#define MAX_CORE_DATA		(NUM_REAL_CORES +
> BASE_SYSFS_ATTR_NO)
> 
>  #define TO_PHYS_ID(cpu)		(cpu_data(cpu).phys_proc_id)
>  #define TO_CORE_ID(cpu)		(cpu_data(cpu).cpu_core_id)
> @@ -82,6 +82,9 @@ MODULE_PARM_DESC(tjmax, "TjMax value in degrees
> Celsius");
>   * @valid: If this is 1, the current temperature is valid.
>   */
>  struct temp_data {
> +	struct list_head list;
> +	struct kref refcount;
> +	int id;
>  	int temp;
>  	int ttarget;
>  	int tjmax;
> @@ -101,7 +104,8 @@ struct temp_data {
>  struct platform_data {
>  	struct device *hwmon_dev;
>  	u16 phys_proc_id;
> -	struct temp_data *core_data[MAX_CORE_DATA];
> +	struct list_head temp_data_list;
> +	spinlock_t temp_data_lock;
>  	struct device_attribute name_attr;
>  };
> 
> @@ -114,6 +118,29 @@ struct pdev_entry {
>  static LIST_HEAD(pdev_list);
>  static DEFINE_MUTEX(pdev_list_mutex);
> 
> +static void temp_data_release(struct kref *ref)
> +{
> +	struct temp_data *tdata = container_of(ref, struct temp_data, refcount);
> +
> +	kfree(tdata);
> +}
> +
> +static struct temp_data *get_temp_data(struct platform_data *pdata, int id)
> +{
> +	struct temp_data *tdata;
> +
> +	spin_lock(&pdata->temp_data_lock);
> +	list_for_each_entry(tdata, &pdata->temp_data_list, list)
> +		if (tdata->id = id) {
> +			kref_get(&tdata->refcount);
> +			goto out;
> +		}
> +	tdata = NULL;
> +out:
> +	spin_unlock(&pdata->temp_data_lock);
> +	return tdata;
> +}
> +
>  static ssize_t show_name(struct device *dev,
>  			struct device_attribute *devattr, char *buf)
>  {
> @@ -125,12 +152,19 @@ static ssize_t show_label(struct device *dev,
>  {
>  	struct sensor_device_attribute *attr = to_sensor_dev_attr(devattr);
>  	struct platform_data *pdata = dev_get_drvdata(dev);
> -	struct temp_data *tdata = pdata->core_data[attr->index];
> +	struct temp_data *tdata = get_temp_data(pdata, attr->index);
> +	ssize_t ret;
> +
> +	if (!tdata)
> +		return -ENOENT;
> 
>  	if (tdata->is_pkg_data)
> -		return sprintf(buf, "Physical id %u\n", pdata->phys_proc_id);
> +		ret = sprintf(buf, "Physical id %u\n", pdata->phys_proc_id);
> +	else
> +		ret = sprintf(buf, "Core %u\n", tdata->cpu_core_id);
> 
> -	return sprintf(buf, "Core %u\n", tdata->cpu_core_id);
> +	kref_put(&tdata->refcount, temp_data_release);
> +	return ret;
>  }
> 
>  static ssize_t show_crit_alarm(struct device *dev,
> @@ -139,11 +173,18 @@ static ssize_t show_crit_alarm(struct device *dev,
>  	u32 eax, edx;
>  	struct sensor_device_attribute *attr = to_sensor_dev_attr(devattr);
>  	struct platform_data *pdata = dev_get_drvdata(dev);
> -	struct temp_data *tdata = pdata->core_data[attr->index];
> +	struct temp_data *tdata = get_temp_data(pdata, attr->index);
> +	ssize_t ret;
> +
> +	if (!tdata)
> +		return -ENOENT;
> 
>  	rdmsr_on_cpu(tdata->cpu, tdata->status_reg, &eax, &edx);
> 
> -	return sprintf(buf, "%d\n", (eax >> 5) & 1);
> +	ret = sprintf(buf, "%d\n", (eax >> 5) & 1);
> +
> +	kref_put(&tdata->refcount, temp_data_release);
> +	return ret;
>  }
> 
>  static ssize_t show_tjmax(struct device *dev,
> @@ -151,8 +192,16 @@ static ssize_t show_tjmax(struct device *dev,
>  {
>  	struct sensor_device_attribute *attr = to_sensor_dev_attr(devattr);
>  	struct platform_data *pdata = dev_get_drvdata(dev);
> +	struct temp_data *tdata = get_temp_data(pdata, attr->index);
> +	ssize_t ret;
> +
> +	if (!tdata)
> +		return -ENOENT;
> +
> +	ret = sprintf(buf, "%d\n", tdata->tjmax);
> 
> -	return sprintf(buf, "%d\n", pdata->core_data[attr->index]->tjmax);
> +	kref_put(&tdata->refcount, temp_data_release);
> +	return ret;
>  }
> 
>  static ssize_t show_ttarget(struct device *dev,
> @@ -160,8 +209,16 @@ static ssize_t show_ttarget(struct device *dev,
>  {
>  	struct sensor_device_attribute *attr = to_sensor_dev_attr(devattr);
>  	struct platform_data *pdata = dev_get_drvdata(dev);
> +	struct temp_data *tdata = get_temp_data(pdata, attr->index);
> +	ssize_t ret;
> 
> -	return sprintf(buf, "%d\n", pdata->core_data[attr->index]->ttarget);
> +	if (!tdata)
> +		return -ENOENT;
> +
> +	ret = sprintf(buf, "%d\n", tdata->ttarget);
> +
> +	kref_put(&tdata->refcount, temp_data_release);
> +	return ret;
>  }
> 
>  static ssize_t show_temp(struct device *dev,
> @@ -170,7 +227,11 @@ static ssize_t show_temp(struct device *dev,
>  	u32 eax, edx;
>  	struct sensor_device_attribute *attr = to_sensor_dev_attr(devattr);
>  	struct platform_data *pdata = dev_get_drvdata(dev);
> -	struct temp_data *tdata = pdata->core_data[attr->index];
> +	struct temp_data *tdata = get_temp_data(pdata, attr->index);
> +	ssize_t ret = -EAGAIN;
> +
> +	if (!tdata)
> +		return -ENOENT;
> 
>  	mutex_lock(&tdata->update_lock);
> 
> @@ -188,7 +249,12 @@ static ssize_t show_temp(struct device *dev,
>  	}
> 
>  	mutex_unlock(&tdata->update_lock);
> -	return tdata->valid ? sprintf(buf, "%d\n", tdata->temp) : -EAGAIN;
> +
> +	if (tdata->valid)
> +		ret = sprintf(buf, "%d\n", tdata->temp);
> +
> +	kref_put(&tdata->refcount, temp_data_release);
> +	return ret;
>  }
> 
>  static int __cpuinit adjust_tjmax(struct cpuinfo_x86 *c, u32 id,
> @@ -412,6 +478,7 @@ static struct temp_data __cpuinit
> *init_temp_data(unsigned int cpu,
>  	tdata = kzalloc(sizeof(struct temp_data), GFP_KERNEL);
>  	if (!tdata)
>  		return NULL;
> +	kref_init(&tdata->refcount);
> 
>  	tdata->status_reg = pkg_flag ? MSR_IA32_PACKAGE_THERM_STATUS :
> 
> 	MSR_IA32_THERM_STATUS;
> @@ -423,7 +490,7 @@ static struct temp_data __cpuinit
> *init_temp_data(unsigned int cpu,
>  	return tdata;
>  }
> 
> -static int __cpuinit create_core_data(struct platform_device *pdev,
> +static int __cpuinit create_temp_data(struct platform_device *pdev,
>  				unsigned int cpu, int pkg_flag)
>  {
>  	struct temp_data *tdata;
> @@ -440,9 +507,6 @@ static int __cpuinit create_core_data(struct
> platform_device *pdev,
>  	 */
>  	attr_no = pkg_flag ? 1 : TO_ATTR_NO(cpu);
> 
> -	if (attr_no > MAX_CORE_DATA - 1)
> -		return -ERANGE;
> -
>  	/*
>  	 * Provide a single set of attributes for all HT siblings of a core
>  	 * to avoid duplicate sensors (the processor ID and core ID of all
> @@ -450,8 +514,14 @@ static int __cpuinit create_core_data(struct
> platform_device *pdev,
>  	 * Skip if a HT sibling of this core is already registered.
>  	 * This is not an error.
>  	 */
> -	if (pdata->core_data[attr_no] != NULL)
> -		return 0;
> +	spin_lock(&pdata->temp_data_lock);
> +	list_for_each_entry(tdata, &pdata->temp_data_list, list) {
> +		if (tdata->id = attr_no) {
> +			spin_unlock(&pdata->temp_data_lock);
> +			return 0;
> +		}
> +	}
> +	spin_unlock(&pdata->temp_data_lock);
> 
>  	tdata = init_temp_data(cpu, pkg_flag);
>  	if (!tdata)
> @@ -480,16 +550,23 @@ static int __cpuinit create_core_data(struct
> platform_device *pdev,
>  		}
>  	}
> 
> -	pdata->core_data[attr_no] = tdata;
> +	tdata->id = attr_no;
> +
> +	spin_lock(&pdata->temp_data_lock);
> +	list_add(&tdata->list, &pdata->temp_data_list);
> +	spin_unlock(&pdata->temp_data_lock);
> 
>  	/* Create sysfs interfaces */
>  	err = create_core_attrs(tdata, &pdev->dev, attr_no);
>  	if (err)
> -		goto exit_free;
> +		goto exit_list_del;
> 
>  	return 0;
> +exit_list_del:
> +	spin_lock(&pdata->temp_data_lock);
> +	list_del(&tdata->list);
> +	spin_unlock(&pdata->temp_data_lock);
>  exit_free:
> -	pdata->core_data[attr_no] = NULL;
>  	kfree(tdata);
>  	return err;
>  }
> @@ -502,23 +579,21 @@ static void __cpuinit coretemp_add_core(unsigned int
> cpu, int pkg_flag)
>  	if (!pdev)
>  		return;
> 
> -	err = create_core_data(pdev, cpu, pkg_flag);
> +	err = create_temp_data(pdev, cpu, pkg_flag);
>  	if (err)
>  		dev_err(&pdev->dev, "Adding Core %u failed\n", cpu);
>  }
> 
> -static void coretemp_remove_core(struct platform_data *pdata,
> -				struct device *dev, int indx)
> +static void coretemp_remove_core(struct temp_data *tdata, struct device
> *dev)
>  {
>  	int i;
> -	struct temp_data *tdata = pdata->core_data[indx];
> 
>  	/* Remove the sysfs attributes */
>  	for (i = 0; i < tdata->attr_size; i++)
>  		device_remove_file(dev, &tdata->sd_attrs[i].dev_attr);
> 
> -	kfree(pdata->core_data[indx]);
> -	pdata->core_data[indx] = NULL;
> +	list_del(&tdata->list);
> +	kref_put(&tdata->refcount, temp_data_release);
>  }
> 
>  static int __devinit coretemp_probe(struct platform_device *pdev)
> @@ -536,6 +611,8 @@ static int __devinit coretemp_probe(struct
> platform_device *pdev)
>  		goto exit_free;
> 
>  	pdata->phys_proc_id = pdev->id;
> +	INIT_LIST_HEAD(&pdata->temp_data_list);
> +	spin_lock_init(&pdata->temp_data_lock);
>  	platform_set_drvdata(pdev, pdata);
> 
>  	pdata->hwmon_dev = hwmon_device_register(&pdev->dev);
> @@ -557,11 +634,12 @@ exit_free:
>  static int __devexit coretemp_remove(struct platform_device *pdev)
>  {
>  	struct platform_data *pdata = platform_get_drvdata(pdev);
> -	int i;
> +	struct temp_data *cur, *tmp;
> 
> -	for (i = MAX_CORE_DATA - 1; i >= 0; --i)
> -		if (pdata->core_data[i])
> -			coretemp_remove_core(pdata, &pdev->dev, i);
> +	spin_lock(&pdata->temp_data_lock);
> +	list_for_each_entry_safe(cur, tmp, &pdata->temp_data_list, list)
> +		coretemp_remove_core(cur, &pdev->dev);
> +	spin_unlock(&pdata->temp_data_lock);
> 
>  	device_remove_file(&pdev->dev, &pdata->name_attr);
>  	hwmon_device_unregister(pdata->hwmon_dev);
> @@ -641,16 +719,17 @@ static void __cpuinit
> coretemp_device_remove(unsigned int cpu)
> 
>  static bool __cpuinit is_any_core_online(struct platform_data *pdata)
>  {
> -	int i;
> -
> -	/* Find online cores, except pkgtemp data */
> -	for (i = MAX_CORE_DATA - 1; i >= 0; --i) {
> -		if (pdata->core_data[i] &&
> -			!pdata->core_data[i]->is_pkg_data) {
> -			return true;
> -		}
> -	}
> -	return false;
> +	struct temp_data *tdata;
> +	bool ret = true;
> +
> +	spin_lock(&pdata->temp_data_lock);
> +	list_for_each_entry(tdata, &pdata->temp_data_list, list)
> +		if (!tdata->is_pkg_data)
> +			 goto out;
> +	ret = false;
> +out:
> +	spin_unlock(&pdata->temp_data_lock);
> +	return ret;
>  }
> 
>  static void __cpuinit get_core_online(unsigned int cpu)
> @@ -697,9 +776,10 @@ static void __cpuinit get_core_online(unsigned int cpu)
> 
>  static void __cpuinit put_core_offline(unsigned int cpu)
>  {
> -	int i, indx;
> +	int i, attr_no;
>  	struct platform_data *pdata;
>  	struct platform_device *pdev = coretemp_get_pdev(cpu);
> +	struct temp_data *tdata;
> 
>  	/* If the physical CPU device does not exist, just return */
>  	if (!pdev)
> @@ -707,14 +787,12 @@ static void __cpuinit put_core_offline(unsigned int cpu)
> 
>  	pdata = platform_get_drvdata(pdev);
> 
> -	indx = TO_ATTR_NO(cpu);
> -
> -	/* The core id is too big, just return */
> -	if (indx > MAX_CORE_DATA - 1)
> -		return;
> +	attr_no = TO_ATTR_NO(cpu);
> 
> -	if (pdata->core_data[indx] && pdata->core_data[indx]->cpu = cpu)
> -		coretemp_remove_core(pdata, &pdev->dev, indx);
> +	tdata = get_temp_data(pdata, attr_no);
> +	if (tdata->cpu = cpu)

The get_temp_data can return a NULL. So, you might want to do,
if (tdata && tdata->cpu = cpu) to avoid a potential NULL ptr crash.

In general, why are we using spin_locks instead of mutex_locks,
for list manipulations .. ?

Thanks,
Durga


_______________________________________________
lm-sensors mailing list
lm-sensors@lm-sensors.org
http://lists.lm-sensors.org/mailman/listinfo/lm-sensors

^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: [lm-sensors] [PATCH] hwmon: coretemp: use list instead of fixed size array for temp data
  2012-05-03  5:29           ` R, Durgadoss
@ 2012-05-03 10:04             ` Kirill A. Shutemov
  -1 siblings, 0 replies; 52+ messages in thread
From: Kirill A. Shutemov @ 2012-05-03 10:04 UTC (permalink / raw)
  To: R, Durgadoss
  Cc: Yu, Fenghua, Guenter Roeck, Andi Kleen, linux-kernel, lm-sensors

[-- Attachment #1: Type: text/plain, Size: 986 bytes --]

On Thu, May 03, 2012 at 05:29:43AM +0000, R, Durgadoss wrote:
> > @@ -707,14 +787,12 @@ static void __cpuinit put_core_offline(unsigned int cpu)
> > 
> >  	pdata = platform_get_drvdata(pdev);
> > 
> > -	indx = TO_ATTR_NO(cpu);
> > -
> > -	/* The core id is too big, just return */
> > -	if (indx > MAX_CORE_DATA - 1)
> > -		return;
> > +	attr_no = TO_ATTR_NO(cpu);
> > 
> > -	if (pdata->core_data[indx] && pdata->core_data[indx]->cpu == cpu)
> > -		coretemp_remove_core(pdata, &pdev->dev, indx);
> > +	tdata = get_temp_data(pdata, attr_no);
> > +	if (tdata->cpu == cpu)
> 
> The get_temp_data can return a NULL. So, you might want to do,
> if (tdata && tdata->cpu == cpu) to avoid a potential NULL ptr crash.

Good catch, thanks.

> In general, why are we using spin_locks instead of mutex_locks,
> for list manipulations .. ?

I don't think it matters here. Contention is low and nobody sleeps, but
okay, I'll change it to mutex.

-- 
 Kirill A. Shutemov

[-- Attachment #2: Digital signature --]
[-- Type: application/pgp-signature, Size: 836 bytes --]

^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: [lm-sensors] [PATCH] hwmon: coretemp: use list instead of fixed size array for temp data
@ 2012-05-03 10:04             ` Kirill A. Shutemov
  0 siblings, 0 replies; 52+ messages in thread
From: Kirill A. Shutemov @ 2012-05-03 10:04 UTC (permalink / raw)
  To: R, Durgadoss
  Cc: Yu, Fenghua, Guenter Roeck, Andi Kleen, linux-kernel, lm-sensors


[-- Attachment #1.1: Type: text/plain, Size: 986 bytes --]

On Thu, May 03, 2012 at 05:29:43AM +0000, R, Durgadoss wrote:
> > @@ -707,14 +787,12 @@ static void __cpuinit put_core_offline(unsigned int cpu)
> > 
> >  	pdata = platform_get_drvdata(pdev);
> > 
> > -	indx = TO_ATTR_NO(cpu);
> > -
> > -	/* The core id is too big, just return */
> > -	if (indx > MAX_CORE_DATA - 1)
> > -		return;
> > +	attr_no = TO_ATTR_NO(cpu);
> > 
> > -	if (pdata->core_data[indx] && pdata->core_data[indx]->cpu == cpu)
> > -		coretemp_remove_core(pdata, &pdev->dev, indx);
> > +	tdata = get_temp_data(pdata, attr_no);
> > +	if (tdata->cpu == cpu)
> 
> The get_temp_data can return a NULL. So, you might want to do,
> if (tdata && tdata->cpu == cpu) to avoid a potential NULL ptr crash.

Good catch, thanks.

> In general, why are we using spin_locks instead of mutex_locks,
> for list manipulations .. ?

I don't think it matters here. Contention is low and nobody sleeps, but
okay, I'll change it to mutex.

-- 
 Kirill A. Shutemov

[-- Attachment #1.2: Digital signature --]
[-- Type: application/pgp-signature, Size: 836 bytes --]

[-- Attachment #2: Type: text/plain, Size: 153 bytes --]

_______________________________________________
lm-sensors mailing list
lm-sensors@lm-sensors.org
http://lists.lm-sensors.org/mailman/listinfo/lm-sensors

^ permalink raw reply	[flat|nested] 52+ messages in thread

* [PATCH, v2] hwmon: coretemp: use list instead of fixed size array for temp data
  2012-05-02 15:10         ` [lm-sensors] " Kirill A. Shutemov
@ 2012-05-03 11:18           ` Kirill A. Shutemov
  -1 siblings, 0 replies; 52+ messages in thread
From: Kirill A. Shutemov @ 2012-05-03 11:18 UTC (permalink / raw)
  To: Fenghua Yu, Guenter Roeck, Durgadoss R
  Cc: Andi Kleen, Jean Delvare, lm-sensors, linux-kernel, Kirill A. Shutemov

From: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>

Let's rework code to allow arbitrary number of cores on a CPU, not
limited by hardcoded array size.

Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
---
 v2:
   - fix NULL pointer dereference. Thanks to R, Durgadoss;
   - use mutex instead of spinlock for list locking.
---
 drivers/hwmon/coretemp.c |  178 +++++++++++++++++++++++++++++++++-------------
 1 files changed, 129 insertions(+), 49 deletions(-)

diff --git a/drivers/hwmon/coretemp.c b/drivers/hwmon/coretemp.c
index 54a70fe..1c66131 100644
--- a/drivers/hwmon/coretemp.c
+++ b/drivers/hwmon/coretemp.c
@@ -36,6 +36,8 @@
 #include <linux/cpu.h>
 #include <linux/pci.h>
 #include <linux/smp.h>
+#include <linux/list.h>
+#include <linux/kref.h>
 #include <linux/moduleparam.h>
 #include <asm/msr.h>
 #include <asm/processor.h>
@@ -52,11 +54,9 @@ module_param_named(tjmax, force_tjmax, int, 0444);
 MODULE_PARM_DESC(tjmax, "TjMax value in degrees Celsius");
 
 #define BASE_SYSFS_ATTR_NO	2	/* Sysfs Base attr no for coretemp */
-#define NUM_REAL_CORES		16	/* Number of Real cores per cpu */
 #define CORETEMP_NAME_LENGTH	17	/* String Length of attrs */
 #define MAX_CORE_ATTRS		4	/* Maximum no of basic attrs */
 #define TOTAL_ATTRS		(MAX_CORE_ATTRS + 1)
-#define MAX_CORE_DATA		(NUM_REAL_CORES + BASE_SYSFS_ATTR_NO)
 
 #define TO_PHYS_ID(cpu)		(cpu_data(cpu).phys_proc_id)
 #define TO_CORE_ID(cpu)		(cpu_data(cpu).cpu_core_id)
@@ -82,6 +82,9 @@ MODULE_PARM_DESC(tjmax, "TjMax value in degrees Celsius");
  * @valid: If this is 1, the current temperature is valid.
  */
 struct temp_data {
+	struct list_head list;
+	struct kref refcount;
+	int id;
 	int temp;
 	int ttarget;
 	int tjmax;
@@ -101,7 +104,8 @@ struct temp_data {
 struct platform_data {
 	struct device *hwmon_dev;
 	u16 phys_proc_id;
-	struct temp_data *core_data[MAX_CORE_DATA];
+	struct list_head temp_data_list;
+	struct mutex temp_data_lock;
 	struct device_attribute name_attr;
 };
 
@@ -114,6 +118,29 @@ struct pdev_entry {
 static LIST_HEAD(pdev_list);
 static DEFINE_MUTEX(pdev_list_mutex);
 
+static void temp_data_release(struct kref *ref)
+{
+	struct temp_data *tdata = container_of(ref, struct temp_data, refcount);
+
+	kfree(tdata);
+}
+
+static struct temp_data *get_temp_data(struct platform_data *pdata, int id)
+{
+	struct temp_data *tdata;
+
+	mutex_lock(&pdata->temp_data_lock);
+	list_for_each_entry(tdata, &pdata->temp_data_list, list)
+		if (tdata->id == id) {
+			kref_get(&tdata->refcount);
+			goto out;
+		}
+	tdata = NULL;
+out:
+	mutex_unlock(&pdata->temp_data_lock);
+	return tdata;
+}
+
 static ssize_t show_name(struct device *dev,
 			struct device_attribute *devattr, char *buf)
 {
@@ -125,12 +152,19 @@ static ssize_t show_label(struct device *dev,
 {
 	struct sensor_device_attribute *attr = to_sensor_dev_attr(devattr);
 	struct platform_data *pdata = dev_get_drvdata(dev);
-	struct temp_data *tdata = pdata->core_data[attr->index];
+	struct temp_data *tdata = get_temp_data(pdata, attr->index);
+	ssize_t ret;
+
+	if (!tdata)
+		return -ENOENT;
 
 	if (tdata->is_pkg_data)
-		return sprintf(buf, "Physical id %u\n", pdata->phys_proc_id);
+		ret = sprintf(buf, "Physical id %u\n", pdata->phys_proc_id);
+	else
+		ret = sprintf(buf, "Core %u\n", tdata->cpu_core_id);
 
-	return sprintf(buf, "Core %u\n", tdata->cpu_core_id);
+	kref_put(&tdata->refcount, temp_data_release);
+	return ret;
 }
 
 static ssize_t show_crit_alarm(struct device *dev,
@@ -139,11 +173,18 @@ static ssize_t show_crit_alarm(struct device *dev,
 	u32 eax, edx;
 	struct sensor_device_attribute *attr = to_sensor_dev_attr(devattr);
 	struct platform_data *pdata = dev_get_drvdata(dev);
-	struct temp_data *tdata = pdata->core_data[attr->index];
+	struct temp_data *tdata = get_temp_data(pdata, attr->index);
+	ssize_t ret;
+
+	if (!tdata)
+		return -ENOENT;
 
 	rdmsr_on_cpu(tdata->cpu, tdata->status_reg, &eax, &edx);
 
-	return sprintf(buf, "%d\n", (eax >> 5) & 1);
+	ret = sprintf(buf, "%d\n", (eax >> 5) & 1);
+
+	kref_put(&tdata->refcount, temp_data_release);
+	return ret;
 }
 
 static ssize_t show_tjmax(struct device *dev,
@@ -151,8 +192,16 @@ static ssize_t show_tjmax(struct device *dev,
 {
 	struct sensor_device_attribute *attr = to_sensor_dev_attr(devattr);
 	struct platform_data *pdata = dev_get_drvdata(dev);
+	struct temp_data *tdata = get_temp_data(pdata, attr->index);
+	ssize_t ret;
+
+	if (!tdata)
+		return -ENOENT;
+
+	ret = sprintf(buf, "%d\n", tdata->tjmax);
 
-	return sprintf(buf, "%d\n", pdata->core_data[attr->index]->tjmax);
+	kref_put(&tdata->refcount, temp_data_release);
+	return ret;
 }
 
 static ssize_t show_ttarget(struct device *dev,
@@ -160,8 +209,16 @@ static ssize_t show_ttarget(struct device *dev,
 {
 	struct sensor_device_attribute *attr = to_sensor_dev_attr(devattr);
 	struct platform_data *pdata = dev_get_drvdata(dev);
+	struct temp_data *tdata = get_temp_data(pdata, attr->index);
+	ssize_t ret;
 
-	return sprintf(buf, "%d\n", pdata->core_data[attr->index]->ttarget);
+	if (!tdata)
+		return -ENOENT;
+
+	ret = sprintf(buf, "%d\n", tdata->ttarget);
+
+	kref_put(&tdata->refcount, temp_data_release);
+	return ret;
 }
 
 static ssize_t show_temp(struct device *dev,
@@ -170,7 +227,11 @@ static ssize_t show_temp(struct device *dev,
 	u32 eax, edx;
 	struct sensor_device_attribute *attr = to_sensor_dev_attr(devattr);
 	struct platform_data *pdata = dev_get_drvdata(dev);
-	struct temp_data *tdata = pdata->core_data[attr->index];
+	struct temp_data *tdata = get_temp_data(pdata, attr->index);
+	ssize_t ret = -EAGAIN;
+
+	if (!tdata)
+		return -ENOENT;
 
 	mutex_lock(&tdata->update_lock);
 
@@ -188,7 +249,12 @@ static ssize_t show_temp(struct device *dev,
 	}
 
 	mutex_unlock(&tdata->update_lock);
-	return tdata->valid ? sprintf(buf, "%d\n", tdata->temp) : -EAGAIN;
+
+	if (tdata->valid)
+		ret = sprintf(buf, "%d\n", tdata->temp);
+
+	kref_put(&tdata->refcount, temp_data_release);
+	return ret;
 }
 
 static int __cpuinit adjust_tjmax(struct cpuinfo_x86 *c, u32 id,
@@ -412,6 +478,7 @@ static struct temp_data __cpuinit *init_temp_data(unsigned int cpu,
 	tdata = kzalloc(sizeof(struct temp_data), GFP_KERNEL);
 	if (!tdata)
 		return NULL;
+	kref_init(&tdata->refcount);
 
 	tdata->status_reg = pkg_flag ? MSR_IA32_PACKAGE_THERM_STATUS :
 							MSR_IA32_THERM_STATUS;
@@ -423,7 +490,7 @@ static struct temp_data __cpuinit *init_temp_data(unsigned int cpu,
 	return tdata;
 }
 
-static int __cpuinit create_core_data(struct platform_device *pdev,
+static int __cpuinit create_temp_data(struct platform_device *pdev,
 				unsigned int cpu, int pkg_flag)
 {
 	struct temp_data *tdata;
@@ -440,9 +507,6 @@ static int __cpuinit create_core_data(struct platform_device *pdev,
 	 */
 	attr_no = pkg_flag ? 1 : TO_ATTR_NO(cpu);
 
-	if (attr_no > MAX_CORE_DATA - 1)
-		return -ERANGE;
-
 	/*
 	 * Provide a single set of attributes for all HT siblings of a core
 	 * to avoid duplicate sensors (the processor ID and core ID of all
@@ -450,8 +514,14 @@ static int __cpuinit create_core_data(struct platform_device *pdev,
 	 * Skip if a HT sibling of this core is already registered.
 	 * This is not an error.
 	 */
-	if (pdata->core_data[attr_no] != NULL)
-		return 0;
+	mutex_lock(&pdata->temp_data_lock);
+	list_for_each_entry(tdata, &pdata->temp_data_list, list) {
+		if (tdata->id == attr_no) {
+			mutex_unlock(&pdata->temp_data_lock);
+			return 0;
+		}
+	}
+	mutex_unlock(&pdata->temp_data_lock);
 
 	tdata = init_temp_data(cpu, pkg_flag);
 	if (!tdata)
@@ -480,16 +550,23 @@ static int __cpuinit create_core_data(struct platform_device *pdev,
 		}
 	}
 
-	pdata->core_data[attr_no] = tdata;
+	tdata->id = attr_no;
+
+	mutex_lock(&pdata->temp_data_lock);
+	list_add(&tdata->list, &pdata->temp_data_list);
+	mutex_unlock(&pdata->temp_data_lock);
 
 	/* Create sysfs interfaces */
 	err = create_core_attrs(tdata, &pdev->dev, attr_no);
 	if (err)
-		goto exit_free;
+		goto exit_list_del;
 
 	return 0;
+exit_list_del:
+	mutex_lock(&pdata->temp_data_lock);
+	list_del(&tdata->list);
+	mutex_unlock(&pdata->temp_data_lock);
 exit_free:
-	pdata->core_data[attr_no] = NULL;
 	kfree(tdata);
 	return err;
 }
@@ -502,23 +579,21 @@ static void __cpuinit coretemp_add_core(unsigned int cpu, int pkg_flag)
 	if (!pdev)
 		return;
 
-	err = create_core_data(pdev, cpu, pkg_flag);
+	err = create_temp_data(pdev, cpu, pkg_flag);
 	if (err)
 		dev_err(&pdev->dev, "Adding Core %u failed\n", cpu);
 }
 
-static void coretemp_remove_core(struct platform_data *pdata,
-				struct device *dev, int indx)
+static void coretemp_remove_core(struct temp_data *tdata, struct device *dev)
 {
 	int i;
-	struct temp_data *tdata = pdata->core_data[indx];
 
 	/* Remove the sysfs attributes */
 	for (i = 0; i < tdata->attr_size; i++)
 		device_remove_file(dev, &tdata->sd_attrs[i].dev_attr);
 
-	kfree(pdata->core_data[indx]);
-	pdata->core_data[indx] = NULL;
+	list_del(&tdata->list);
+	kref_put(&tdata->refcount, temp_data_release);
 }
 
 static int __devinit coretemp_probe(struct platform_device *pdev)
@@ -536,6 +611,8 @@ static int __devinit coretemp_probe(struct platform_device *pdev)
 		goto exit_free;
 
 	pdata->phys_proc_id = pdev->id;
+	INIT_LIST_HEAD(&pdata->temp_data_list);
+	mutex_init(&pdata->temp_data_lock);
 	platform_set_drvdata(pdev, pdata);
 
 	pdata->hwmon_dev = hwmon_device_register(&pdev->dev);
@@ -557,11 +634,12 @@ exit_free:
 static int __devexit coretemp_remove(struct platform_device *pdev)
 {
 	struct platform_data *pdata = platform_get_drvdata(pdev);
-	int i;
+	struct temp_data *cur, *tmp;
 
-	for (i = MAX_CORE_DATA - 1; i >= 0; --i)
-		if (pdata->core_data[i])
-			coretemp_remove_core(pdata, &pdev->dev, i);
+	mutex_lock(&pdata->temp_data_lock);
+	list_for_each_entry_safe(cur, tmp, &pdata->temp_data_list, list)
+		coretemp_remove_core(cur, &pdev->dev);
+	mutex_unlock(&pdata->temp_data_lock);
 
 	device_remove_file(&pdev->dev, &pdata->name_attr);
 	hwmon_device_unregister(pdata->hwmon_dev);
@@ -641,16 +719,17 @@ static void __cpuinit coretemp_device_remove(unsigned int cpu)
 
 static bool __cpuinit is_any_core_online(struct platform_data *pdata)
 {
-	int i;
-
-	/* Find online cores, except pkgtemp data */
-	for (i = MAX_CORE_DATA - 1; i >= 0; --i) {
-		if (pdata->core_data[i] &&
-			!pdata->core_data[i]->is_pkg_data) {
-			return true;
-		}
-	}
-	return false;
+	struct temp_data *tdata;
+	bool ret = true;
+
+	mutex_lock(&pdata->temp_data_lock);
+	list_for_each_entry(tdata, &pdata->temp_data_list, list)
+		if (!tdata->is_pkg_data)
+			 goto out;
+	ret = false;
+out:
+	mutex_unlock(&pdata->temp_data_lock);
+	return ret;
 }
 
 static void __cpuinit get_core_online(unsigned int cpu)
@@ -697,9 +776,10 @@ static void __cpuinit get_core_online(unsigned int cpu)
 
 static void __cpuinit put_core_offline(unsigned int cpu)
 {
-	int i, indx;
+	int i, attr_no;
 	struct platform_data *pdata;
 	struct platform_device *pdev = coretemp_get_pdev(cpu);
+	struct temp_data *tdata;
 
 	/* If the physical CPU device does not exist, just return */
 	if (!pdev)
@@ -707,14 +787,14 @@ static void __cpuinit put_core_offline(unsigned int cpu)
 
 	pdata = platform_get_drvdata(pdev);
 
-	indx = TO_ATTR_NO(cpu);
+	attr_no = TO_ATTR_NO(cpu);
 
-	/* The core id is too big, just return */
-	if (indx > MAX_CORE_DATA - 1)
-		return;
-
-	if (pdata->core_data[indx] && pdata->core_data[indx]->cpu == cpu)
-		coretemp_remove_core(pdata, &pdev->dev, indx);
+	tdata = get_temp_data(pdata, attr_no);
+	if (tdata) {
+		if (tdata->cpu == cpu)
+			coretemp_remove_core(tdata, &pdev->dev);
+		kref_put(&tdata->refcount, temp_data_release);
+	}
 
 	/*
 	 * If a HT sibling of a core is taken offline, but another HT sibling
-- 
1.7.9.1


^ permalink raw reply related	[flat|nested] 52+ messages in thread

* [lm-sensors] [PATCH, v2] hwmon: coretemp: use list instead of fixed size array for temp
@ 2012-05-03 11:18           ` Kirill A. Shutemov
  0 siblings, 0 replies; 52+ messages in thread
From: Kirill A. Shutemov @ 2012-05-03 11:18 UTC (permalink / raw)
  To: Fenghua Yu, Guenter Roeck, Durgadoss R
  Cc: Andi Kleen, Jean Delvare, lm-sensors, linux-kernel, Kirill A. Shutemov

From: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>

Let's rework code to allow arbitrary number of cores on a CPU, not
limited by hardcoded array size.

Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
---
 v2:
   - fix NULL pointer dereference. Thanks to R, Durgadoss;
   - use mutex instead of spinlock for list locking.
---
 drivers/hwmon/coretemp.c |  178 +++++++++++++++++++++++++++++++++-------------
 1 files changed, 129 insertions(+), 49 deletions(-)

diff --git a/drivers/hwmon/coretemp.c b/drivers/hwmon/coretemp.c
index 54a70fe..1c66131 100644
--- a/drivers/hwmon/coretemp.c
+++ b/drivers/hwmon/coretemp.c
@@ -36,6 +36,8 @@
 #include <linux/cpu.h>
 #include <linux/pci.h>
 #include <linux/smp.h>
+#include <linux/list.h>
+#include <linux/kref.h>
 #include <linux/moduleparam.h>
 #include <asm/msr.h>
 #include <asm/processor.h>
@@ -52,11 +54,9 @@ module_param_named(tjmax, force_tjmax, int, 0444);
 MODULE_PARM_DESC(tjmax, "TjMax value in degrees Celsius");
 
 #define BASE_SYSFS_ATTR_NO	2	/* Sysfs Base attr no for coretemp */
-#define NUM_REAL_CORES		16	/* Number of Real cores per cpu */
 #define CORETEMP_NAME_LENGTH	17	/* String Length of attrs */
 #define MAX_CORE_ATTRS		4	/* Maximum no of basic attrs */
 #define TOTAL_ATTRS		(MAX_CORE_ATTRS + 1)
-#define MAX_CORE_DATA		(NUM_REAL_CORES + BASE_SYSFS_ATTR_NO)
 
 #define TO_PHYS_ID(cpu)		(cpu_data(cpu).phys_proc_id)
 #define TO_CORE_ID(cpu)		(cpu_data(cpu).cpu_core_id)
@@ -82,6 +82,9 @@ MODULE_PARM_DESC(tjmax, "TjMax value in degrees Celsius");
  * @valid: If this is 1, the current temperature is valid.
  */
 struct temp_data {
+	struct list_head list;
+	struct kref refcount;
+	int id;
 	int temp;
 	int ttarget;
 	int tjmax;
@@ -101,7 +104,8 @@ struct temp_data {
 struct platform_data {
 	struct device *hwmon_dev;
 	u16 phys_proc_id;
-	struct temp_data *core_data[MAX_CORE_DATA];
+	struct list_head temp_data_list;
+	struct mutex temp_data_lock;
 	struct device_attribute name_attr;
 };
 
@@ -114,6 +118,29 @@ struct pdev_entry {
 static LIST_HEAD(pdev_list);
 static DEFINE_MUTEX(pdev_list_mutex);
 
+static void temp_data_release(struct kref *ref)
+{
+	struct temp_data *tdata = container_of(ref, struct temp_data, refcount);
+
+	kfree(tdata);
+}
+
+static struct temp_data *get_temp_data(struct platform_data *pdata, int id)
+{
+	struct temp_data *tdata;
+
+	mutex_lock(&pdata->temp_data_lock);
+	list_for_each_entry(tdata, &pdata->temp_data_list, list)
+		if (tdata->id = id) {
+			kref_get(&tdata->refcount);
+			goto out;
+		}
+	tdata = NULL;
+out:
+	mutex_unlock(&pdata->temp_data_lock);
+	return tdata;
+}
+
 static ssize_t show_name(struct device *dev,
 			struct device_attribute *devattr, char *buf)
 {
@@ -125,12 +152,19 @@ static ssize_t show_label(struct device *dev,
 {
 	struct sensor_device_attribute *attr = to_sensor_dev_attr(devattr);
 	struct platform_data *pdata = dev_get_drvdata(dev);
-	struct temp_data *tdata = pdata->core_data[attr->index];
+	struct temp_data *tdata = get_temp_data(pdata, attr->index);
+	ssize_t ret;
+
+	if (!tdata)
+		return -ENOENT;
 
 	if (tdata->is_pkg_data)
-		return sprintf(buf, "Physical id %u\n", pdata->phys_proc_id);
+		ret = sprintf(buf, "Physical id %u\n", pdata->phys_proc_id);
+	else
+		ret = sprintf(buf, "Core %u\n", tdata->cpu_core_id);
 
-	return sprintf(buf, "Core %u\n", tdata->cpu_core_id);
+	kref_put(&tdata->refcount, temp_data_release);
+	return ret;
 }
 
 static ssize_t show_crit_alarm(struct device *dev,
@@ -139,11 +173,18 @@ static ssize_t show_crit_alarm(struct device *dev,
 	u32 eax, edx;
 	struct sensor_device_attribute *attr = to_sensor_dev_attr(devattr);
 	struct platform_data *pdata = dev_get_drvdata(dev);
-	struct temp_data *tdata = pdata->core_data[attr->index];
+	struct temp_data *tdata = get_temp_data(pdata, attr->index);
+	ssize_t ret;
+
+	if (!tdata)
+		return -ENOENT;
 
 	rdmsr_on_cpu(tdata->cpu, tdata->status_reg, &eax, &edx);
 
-	return sprintf(buf, "%d\n", (eax >> 5) & 1);
+	ret = sprintf(buf, "%d\n", (eax >> 5) & 1);
+
+	kref_put(&tdata->refcount, temp_data_release);
+	return ret;
 }
 
 static ssize_t show_tjmax(struct device *dev,
@@ -151,8 +192,16 @@ static ssize_t show_tjmax(struct device *dev,
 {
 	struct sensor_device_attribute *attr = to_sensor_dev_attr(devattr);
 	struct platform_data *pdata = dev_get_drvdata(dev);
+	struct temp_data *tdata = get_temp_data(pdata, attr->index);
+	ssize_t ret;
+
+	if (!tdata)
+		return -ENOENT;
+
+	ret = sprintf(buf, "%d\n", tdata->tjmax);
 
-	return sprintf(buf, "%d\n", pdata->core_data[attr->index]->tjmax);
+	kref_put(&tdata->refcount, temp_data_release);
+	return ret;
 }
 
 static ssize_t show_ttarget(struct device *dev,
@@ -160,8 +209,16 @@ static ssize_t show_ttarget(struct device *dev,
 {
 	struct sensor_device_attribute *attr = to_sensor_dev_attr(devattr);
 	struct platform_data *pdata = dev_get_drvdata(dev);
+	struct temp_data *tdata = get_temp_data(pdata, attr->index);
+	ssize_t ret;
 
-	return sprintf(buf, "%d\n", pdata->core_data[attr->index]->ttarget);
+	if (!tdata)
+		return -ENOENT;
+
+	ret = sprintf(buf, "%d\n", tdata->ttarget);
+
+	kref_put(&tdata->refcount, temp_data_release);
+	return ret;
 }
 
 static ssize_t show_temp(struct device *dev,
@@ -170,7 +227,11 @@ static ssize_t show_temp(struct device *dev,
 	u32 eax, edx;
 	struct sensor_device_attribute *attr = to_sensor_dev_attr(devattr);
 	struct platform_data *pdata = dev_get_drvdata(dev);
-	struct temp_data *tdata = pdata->core_data[attr->index];
+	struct temp_data *tdata = get_temp_data(pdata, attr->index);
+	ssize_t ret = -EAGAIN;
+
+	if (!tdata)
+		return -ENOENT;
 
 	mutex_lock(&tdata->update_lock);
 
@@ -188,7 +249,12 @@ static ssize_t show_temp(struct device *dev,
 	}
 
 	mutex_unlock(&tdata->update_lock);
-	return tdata->valid ? sprintf(buf, "%d\n", tdata->temp) : -EAGAIN;
+
+	if (tdata->valid)
+		ret = sprintf(buf, "%d\n", tdata->temp);
+
+	kref_put(&tdata->refcount, temp_data_release);
+	return ret;
 }
 
 static int __cpuinit adjust_tjmax(struct cpuinfo_x86 *c, u32 id,
@@ -412,6 +478,7 @@ static struct temp_data __cpuinit *init_temp_data(unsigned int cpu,
 	tdata = kzalloc(sizeof(struct temp_data), GFP_KERNEL);
 	if (!tdata)
 		return NULL;
+	kref_init(&tdata->refcount);
 
 	tdata->status_reg = pkg_flag ? MSR_IA32_PACKAGE_THERM_STATUS :
 							MSR_IA32_THERM_STATUS;
@@ -423,7 +490,7 @@ static struct temp_data __cpuinit *init_temp_data(unsigned int cpu,
 	return tdata;
 }
 
-static int __cpuinit create_core_data(struct platform_device *pdev,
+static int __cpuinit create_temp_data(struct platform_device *pdev,
 				unsigned int cpu, int pkg_flag)
 {
 	struct temp_data *tdata;
@@ -440,9 +507,6 @@ static int __cpuinit create_core_data(struct platform_device *pdev,
 	 */
 	attr_no = pkg_flag ? 1 : TO_ATTR_NO(cpu);
 
-	if (attr_no > MAX_CORE_DATA - 1)
-		return -ERANGE;
-
 	/*
 	 * Provide a single set of attributes for all HT siblings of a core
 	 * to avoid duplicate sensors (the processor ID and core ID of all
@@ -450,8 +514,14 @@ static int __cpuinit create_core_data(struct platform_device *pdev,
 	 * Skip if a HT sibling of this core is already registered.
 	 * This is not an error.
 	 */
-	if (pdata->core_data[attr_no] != NULL)
-		return 0;
+	mutex_lock(&pdata->temp_data_lock);
+	list_for_each_entry(tdata, &pdata->temp_data_list, list) {
+		if (tdata->id = attr_no) {
+			mutex_unlock(&pdata->temp_data_lock);
+			return 0;
+		}
+	}
+	mutex_unlock(&pdata->temp_data_lock);
 
 	tdata = init_temp_data(cpu, pkg_flag);
 	if (!tdata)
@@ -480,16 +550,23 @@ static int __cpuinit create_core_data(struct platform_device *pdev,
 		}
 	}
 
-	pdata->core_data[attr_no] = tdata;
+	tdata->id = attr_no;
+
+	mutex_lock(&pdata->temp_data_lock);
+	list_add(&tdata->list, &pdata->temp_data_list);
+	mutex_unlock(&pdata->temp_data_lock);
 
 	/* Create sysfs interfaces */
 	err = create_core_attrs(tdata, &pdev->dev, attr_no);
 	if (err)
-		goto exit_free;
+		goto exit_list_del;
 
 	return 0;
+exit_list_del:
+	mutex_lock(&pdata->temp_data_lock);
+	list_del(&tdata->list);
+	mutex_unlock(&pdata->temp_data_lock);
 exit_free:
-	pdata->core_data[attr_no] = NULL;
 	kfree(tdata);
 	return err;
 }
@@ -502,23 +579,21 @@ static void __cpuinit coretemp_add_core(unsigned int cpu, int pkg_flag)
 	if (!pdev)
 		return;
 
-	err = create_core_data(pdev, cpu, pkg_flag);
+	err = create_temp_data(pdev, cpu, pkg_flag);
 	if (err)
 		dev_err(&pdev->dev, "Adding Core %u failed\n", cpu);
 }
 
-static void coretemp_remove_core(struct platform_data *pdata,
-				struct device *dev, int indx)
+static void coretemp_remove_core(struct temp_data *tdata, struct device *dev)
 {
 	int i;
-	struct temp_data *tdata = pdata->core_data[indx];
 
 	/* Remove the sysfs attributes */
 	for (i = 0; i < tdata->attr_size; i++)
 		device_remove_file(dev, &tdata->sd_attrs[i].dev_attr);
 
-	kfree(pdata->core_data[indx]);
-	pdata->core_data[indx] = NULL;
+	list_del(&tdata->list);
+	kref_put(&tdata->refcount, temp_data_release);
 }
 
 static int __devinit coretemp_probe(struct platform_device *pdev)
@@ -536,6 +611,8 @@ static int __devinit coretemp_probe(struct platform_device *pdev)
 		goto exit_free;
 
 	pdata->phys_proc_id = pdev->id;
+	INIT_LIST_HEAD(&pdata->temp_data_list);
+	mutex_init(&pdata->temp_data_lock);
 	platform_set_drvdata(pdev, pdata);
 
 	pdata->hwmon_dev = hwmon_device_register(&pdev->dev);
@@ -557,11 +634,12 @@ exit_free:
 static int __devexit coretemp_remove(struct platform_device *pdev)
 {
 	struct platform_data *pdata = platform_get_drvdata(pdev);
-	int i;
+	struct temp_data *cur, *tmp;
 
-	for (i = MAX_CORE_DATA - 1; i >= 0; --i)
-		if (pdata->core_data[i])
-			coretemp_remove_core(pdata, &pdev->dev, i);
+	mutex_lock(&pdata->temp_data_lock);
+	list_for_each_entry_safe(cur, tmp, &pdata->temp_data_list, list)
+		coretemp_remove_core(cur, &pdev->dev);
+	mutex_unlock(&pdata->temp_data_lock);
 
 	device_remove_file(&pdev->dev, &pdata->name_attr);
 	hwmon_device_unregister(pdata->hwmon_dev);
@@ -641,16 +719,17 @@ static void __cpuinit coretemp_device_remove(unsigned int cpu)
 
 static bool __cpuinit is_any_core_online(struct platform_data *pdata)
 {
-	int i;
-
-	/* Find online cores, except pkgtemp data */
-	for (i = MAX_CORE_DATA - 1; i >= 0; --i) {
-		if (pdata->core_data[i] &&
-			!pdata->core_data[i]->is_pkg_data) {
-			return true;
-		}
-	}
-	return false;
+	struct temp_data *tdata;
+	bool ret = true;
+
+	mutex_lock(&pdata->temp_data_lock);
+	list_for_each_entry(tdata, &pdata->temp_data_list, list)
+		if (!tdata->is_pkg_data)
+			 goto out;
+	ret = false;
+out:
+	mutex_unlock(&pdata->temp_data_lock);
+	return ret;
 }
 
 static void __cpuinit get_core_online(unsigned int cpu)
@@ -697,9 +776,10 @@ static void __cpuinit get_core_online(unsigned int cpu)
 
 static void __cpuinit put_core_offline(unsigned int cpu)
 {
-	int i, indx;
+	int i, attr_no;
 	struct platform_data *pdata;
 	struct platform_device *pdev = coretemp_get_pdev(cpu);
+	struct temp_data *tdata;
 
 	/* If the physical CPU device does not exist, just return */
 	if (!pdev)
@@ -707,14 +787,14 @@ static void __cpuinit put_core_offline(unsigned int cpu)
 
 	pdata = platform_get_drvdata(pdev);
 
-	indx = TO_ATTR_NO(cpu);
+	attr_no = TO_ATTR_NO(cpu);
 
-	/* The core id is too big, just return */
-	if (indx > MAX_CORE_DATA - 1)
-		return;
-
-	if (pdata->core_data[indx] && pdata->core_data[indx]->cpu = cpu)
-		coretemp_remove_core(pdata, &pdev->dev, indx);
+	tdata = get_temp_data(pdata, attr_no);
+	if (tdata) {
+		if (tdata->cpu = cpu)
+			coretemp_remove_core(tdata, &pdev->dev);
+		kref_put(&tdata->refcount, temp_data_release);
+	}
 
 	/*
 	 * If a HT sibling of a core is taken offline, but another HT sibling
-- 
1.7.9.1


_______________________________________________
lm-sensors mailing list
lm-sensors@lm-sensors.org
http://lists.lm-sensors.org/mailman/listinfo/lm-sensors

^ permalink raw reply related	[flat|nested] 52+ messages in thread

* Re: [PATCH, v2] hwmon: coretemp: use list instead of fixed size array for temp data
  2012-05-03 11:18           ` [lm-sensors] [PATCH, v2] hwmon: coretemp: use list instead of fixed size array for temp Kirill A. Shutemov
@ 2012-05-04  5:41             ` Guenter Roeck
  -1 siblings, 0 replies; 52+ messages in thread
From: Guenter Roeck @ 2012-05-04  5:41 UTC (permalink / raw)
  To: Kirill A. Shutemov
  Cc: Fenghua Yu, Durgadoss R, Andi Kleen, Jean Delvare, lm-sensors,
	linux-kernel

On Thu, May 03, 2012 at 07:18:56AM -0400, Kirill A. Shutemov wrote:
> From: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>
> 
> Let's rework code to allow arbitrary number of cores on a CPU, not
> limited by hardcoded array size.
> 
> Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
> ---
>  v2:
>    - fix NULL pointer dereference. Thanks to R, Durgadoss;
>    - use mutex instead of spinlock for list locking.
> ---
>  drivers/hwmon/coretemp.c |  178 +++++++++++++++++++++++++++++++++-------------
>  1 files changed, 129 insertions(+), 49 deletions(-)
> 
> diff --git a/drivers/hwmon/coretemp.c b/drivers/hwmon/coretemp.c
> index 54a70fe..1c66131 100644
> --- a/drivers/hwmon/coretemp.c
> +++ b/drivers/hwmon/coretemp.c
> @@ -36,6 +36,8 @@
>  #include <linux/cpu.h>
>  #include <linux/pci.h>
>  #include <linux/smp.h>
> +#include <linux/list.h>
> +#include <linux/kref.h>
>  #include <linux/moduleparam.h>
>  #include <asm/msr.h>
>  #include <asm/processor.h>
> @@ -52,11 +54,9 @@ module_param_named(tjmax, force_tjmax, int, 0444);
>  MODULE_PARM_DESC(tjmax, "TjMax value in degrees Celsius");
> 
>  #define BASE_SYSFS_ATTR_NO     2       /* Sysfs Base attr no for coretemp */
> -#define NUM_REAL_CORES         16      /* Number of Real cores per cpu */
>  #define CORETEMP_NAME_LENGTH   17      /* String Length of attrs */
>  #define MAX_CORE_ATTRS         4       /* Maximum no of basic attrs */
>  #define TOTAL_ATTRS            (MAX_CORE_ATTRS + 1)
> -#define MAX_CORE_DATA          (NUM_REAL_CORES + BASE_SYSFS_ATTR_NO)
> 
>  #define TO_PHYS_ID(cpu)                (cpu_data(cpu).phys_proc_id)
>  #define TO_CORE_ID(cpu)                (cpu_data(cpu).cpu_core_id)
> @@ -82,6 +82,9 @@ MODULE_PARM_DESC(tjmax, "TjMax value in degrees Celsius");
>   * @valid: If this is 1, the current temperature is valid.
>   */
>  struct temp_data {
> +       struct list_head list;
> +       struct kref refcount;

Hi,

the kref is not needed. The attribute access functions don't
need to be protected since the attributes for a core are deleted
before the core data itself is deleted. So it is not neccessary 
to hold a lock while accessing/using temp_data in the attribute
access functions. All you need is to hold a mutex while you are
manipulating or walking the list.

Thanks,
Guenter

> +       int id;
>         int temp;
>         int ttarget;
>         int tjmax;
> @@ -101,7 +104,8 @@ struct temp_data {
>  struct platform_data {
>         struct device *hwmon_dev;
>         u16 phys_proc_id;
> -       struct temp_data *core_data[MAX_CORE_DATA];
> +       struct list_head temp_data_list;
> +       struct mutex temp_data_lock;
>         struct device_attribute name_attr;
>  };
> 
> @@ -114,6 +118,29 @@ struct pdev_entry {
>  static LIST_HEAD(pdev_list);
>  static DEFINE_MUTEX(pdev_list_mutex);
> 
> +static void temp_data_release(struct kref *ref)
> +{
> +       struct temp_data *tdata = container_of(ref, struct temp_data, refcount);
> +
> +       kfree(tdata);
> +}
> +
> +static struct temp_data *get_temp_data(struct platform_data *pdata, int id)
> +{
> +       struct temp_data *tdata;
> +
> +       mutex_lock(&pdata->temp_data_lock);
> +       list_for_each_entry(tdata, &pdata->temp_data_list, list)
> +               if (tdata->id == id) {
> +                       kref_get(&tdata->refcount);
> +                       goto out;
> +               }
> +       tdata = NULL;
> +out:
> +       mutex_unlock(&pdata->temp_data_lock);
> +       return tdata;
> +}
> +
>  static ssize_t show_name(struct device *dev,
>                         struct device_attribute *devattr, char *buf)
>  {
> @@ -125,12 +152,19 @@ static ssize_t show_label(struct device *dev,
>  {
>         struct sensor_device_attribute *attr = to_sensor_dev_attr(devattr);
>         struct platform_data *pdata = dev_get_drvdata(dev);
> -       struct temp_data *tdata = pdata->core_data[attr->index];
> +       struct temp_data *tdata = get_temp_data(pdata, attr->index);
> +       ssize_t ret;
> +
> +       if (!tdata)
> +               return -ENOENT;
> 
>         if (tdata->is_pkg_data)
> -               return sprintf(buf, "Physical id %u\n", pdata->phys_proc_id);
> +               ret = sprintf(buf, "Physical id %u\n", pdata->phys_proc_id);
> +       else
> +               ret = sprintf(buf, "Core %u\n", tdata->cpu_core_id);
> 
> -       return sprintf(buf, "Core %u\n", tdata->cpu_core_id);
> +       kref_put(&tdata->refcount, temp_data_release);
> +       return ret;
>  }
> 
>  static ssize_t show_crit_alarm(struct device *dev,
> @@ -139,11 +173,18 @@ static ssize_t show_crit_alarm(struct device *dev,
>         u32 eax, edx;
>         struct sensor_device_attribute *attr = to_sensor_dev_attr(devattr);
>         struct platform_data *pdata = dev_get_drvdata(dev);
> -       struct temp_data *tdata = pdata->core_data[attr->index];
> +       struct temp_data *tdata = get_temp_data(pdata, attr->index);
> +       ssize_t ret;
> +
> +       if (!tdata)
> +               return -ENOENT;
> 
>         rdmsr_on_cpu(tdata->cpu, tdata->status_reg, &eax, &edx);
> 
> -       return sprintf(buf, "%d\n", (eax >> 5) & 1);
> +       ret = sprintf(buf, "%d\n", (eax >> 5) & 1);
> +
> +       kref_put(&tdata->refcount, temp_data_release);
> +       return ret;
>  }
> 
>  static ssize_t show_tjmax(struct device *dev,
> @@ -151,8 +192,16 @@ static ssize_t show_tjmax(struct device *dev,
>  {
>         struct sensor_device_attribute *attr = to_sensor_dev_attr(devattr);
>         struct platform_data *pdata = dev_get_drvdata(dev);
> +       struct temp_data *tdata = get_temp_data(pdata, attr->index);
> +       ssize_t ret;
> +
> +       if (!tdata)
> +               return -ENOENT;
> +
> +       ret = sprintf(buf, "%d\n", tdata->tjmax);
> 
> -       return sprintf(buf, "%d\n", pdata->core_data[attr->index]->tjmax);
> +       kref_put(&tdata->refcount, temp_data_release);
> +       return ret;
>  }
> 
>  static ssize_t show_ttarget(struct device *dev,
> @@ -160,8 +209,16 @@ static ssize_t show_ttarget(struct device *dev,
>  {
>         struct sensor_device_attribute *attr = to_sensor_dev_attr(devattr);
>         struct platform_data *pdata = dev_get_drvdata(dev);
> +       struct temp_data *tdata = get_temp_data(pdata, attr->index);
> +       ssize_t ret;
> 
> -       return sprintf(buf, "%d\n", pdata->core_data[attr->index]->ttarget);
> +       if (!tdata)
> +               return -ENOENT;
> +
> +       ret = sprintf(buf, "%d\n", tdata->ttarget);
> +
> +       kref_put(&tdata->refcount, temp_data_release);
> +       return ret;
>  }
> 
>  static ssize_t show_temp(struct device *dev,
> @@ -170,7 +227,11 @@ static ssize_t show_temp(struct device *dev,
>         u32 eax, edx;
>         struct sensor_device_attribute *attr = to_sensor_dev_attr(devattr);
>         struct platform_data *pdata = dev_get_drvdata(dev);
> -       struct temp_data *tdata = pdata->core_data[attr->index];
> +       struct temp_data *tdata = get_temp_data(pdata, attr->index);
> +       ssize_t ret = -EAGAIN;
> +
> +       if (!tdata)
> +               return -ENOENT;
> 
>         mutex_lock(&tdata->update_lock);
> 
> @@ -188,7 +249,12 @@ static ssize_t show_temp(struct device *dev,
>         }
> 
>         mutex_unlock(&tdata->update_lock);
> -       return tdata->valid ? sprintf(buf, "%d\n", tdata->temp) : -EAGAIN;
> +
> +       if (tdata->valid)
> +               ret = sprintf(buf, "%d\n", tdata->temp);
> +
> +       kref_put(&tdata->refcount, temp_data_release);
> +       return ret;
>  }
> 
>  static int __cpuinit adjust_tjmax(struct cpuinfo_x86 *c, u32 id,
> @@ -412,6 +478,7 @@ static struct temp_data __cpuinit *init_temp_data(unsigned int cpu,
>         tdata = kzalloc(sizeof(struct temp_data), GFP_KERNEL);
>         if (!tdata)
>                 return NULL;
> +       kref_init(&tdata->refcount);
> 
>         tdata->status_reg = pkg_flag ? MSR_IA32_PACKAGE_THERM_STATUS :
>                                                         MSR_IA32_THERM_STATUS;
> @@ -423,7 +490,7 @@ static struct temp_data __cpuinit *init_temp_data(unsigned int cpu,
>         return tdata;
>  }
> 
> -static int __cpuinit create_core_data(struct platform_device *pdev,
> +static int __cpuinit create_temp_data(struct platform_device *pdev,
>                                 unsigned int cpu, int pkg_flag)
>  {
>         struct temp_data *tdata;
> @@ -440,9 +507,6 @@ static int __cpuinit create_core_data(struct platform_device *pdev,
>          */
>         attr_no = pkg_flag ? 1 : TO_ATTR_NO(cpu);
> 
> -       if (attr_no > MAX_CORE_DATA - 1)
> -               return -ERANGE;
> -
>         /*
>          * Provide a single set of attributes for all HT siblings of a core
>          * to avoid duplicate sensors (the processor ID and core ID of all
> @@ -450,8 +514,14 @@ static int __cpuinit create_core_data(struct platform_device *pdev,
>          * Skip if a HT sibling of this core is already registered.
>          * This is not an error.
>          */
> -       if (pdata->core_data[attr_no] != NULL)
> -               return 0;
> +       mutex_lock(&pdata->temp_data_lock);
> +       list_for_each_entry(tdata, &pdata->temp_data_list, list) {
> +               if (tdata->id == attr_no) {
> +                       mutex_unlock(&pdata->temp_data_lock);
> +                       return 0;
> +               }
> +       }
> +       mutex_unlock(&pdata->temp_data_lock);
> 
>         tdata = init_temp_data(cpu, pkg_flag);
>         if (!tdata)
> @@ -480,16 +550,23 @@ static int __cpuinit create_core_data(struct platform_device *pdev,
>                 }
>         }
> 
> -       pdata->core_data[attr_no] = tdata;
> +       tdata->id = attr_no;
> +
> +       mutex_lock(&pdata->temp_data_lock);
> +       list_add(&tdata->list, &pdata->temp_data_list);
> +       mutex_unlock(&pdata->temp_data_lock);
> 
>         /* Create sysfs interfaces */
>         err = create_core_attrs(tdata, &pdev->dev, attr_no);
>         if (err)
> -               goto exit_free;
> +               goto exit_list_del;
> 
>         return 0;
> +exit_list_del:
> +       mutex_lock(&pdata->temp_data_lock);
> +       list_del(&tdata->list);
> +       mutex_unlock(&pdata->temp_data_lock);
>  exit_free:
> -       pdata->core_data[attr_no] = NULL;
>         kfree(tdata);
>         return err;
>  }
> @@ -502,23 +579,21 @@ static void __cpuinit coretemp_add_core(unsigned int cpu, int pkg_flag)
>         if (!pdev)
>                 return;
> 
> -       err = create_core_data(pdev, cpu, pkg_flag);
> +       err = create_temp_data(pdev, cpu, pkg_flag);
>         if (err)
>                 dev_err(&pdev->dev, "Adding Core %u failed\n", cpu);
>  }
> 
> -static void coretemp_remove_core(struct platform_data *pdata,
> -                               struct device *dev, int indx)
> +static void coretemp_remove_core(struct temp_data *tdata, struct device *dev)
>  {
>         int i;
> -       struct temp_data *tdata = pdata->core_data[indx];
> 
>         /* Remove the sysfs attributes */
>         for (i = 0; i < tdata->attr_size; i++)
>                 device_remove_file(dev, &tdata->sd_attrs[i].dev_attr);
> 
> -       kfree(pdata->core_data[indx]);
> -       pdata->core_data[indx] = NULL;
> +       list_del(&tdata->list);
> +       kref_put(&tdata->refcount, temp_data_release);
>  }
> 
>  static int __devinit coretemp_probe(struct platform_device *pdev)
> @@ -536,6 +611,8 @@ static int __devinit coretemp_probe(struct platform_device *pdev)
>                 goto exit_free;
> 
>         pdata->phys_proc_id = pdev->id;
> +       INIT_LIST_HEAD(&pdata->temp_data_list);
> +       mutex_init(&pdata->temp_data_lock);
>         platform_set_drvdata(pdev, pdata);
> 
>         pdata->hwmon_dev = hwmon_device_register(&pdev->dev);
> @@ -557,11 +634,12 @@ exit_free:
>  static int __devexit coretemp_remove(struct platform_device *pdev)
>  {
>         struct platform_data *pdata = platform_get_drvdata(pdev);
> -       int i;
> +       struct temp_data *cur, *tmp;
> 
> -       for (i = MAX_CORE_DATA - 1; i >= 0; --i)
> -               if (pdata->core_data[i])
> -                       coretemp_remove_core(pdata, &pdev->dev, i);
> +       mutex_lock(&pdata->temp_data_lock);
> +       list_for_each_entry_safe(cur, tmp, &pdata->temp_data_list, list)
> +               coretemp_remove_core(cur, &pdev->dev);
> +       mutex_unlock(&pdata->temp_data_lock);
> 
>         device_remove_file(&pdev->dev, &pdata->name_attr);
>         hwmon_device_unregister(pdata->hwmon_dev);
> @@ -641,16 +719,17 @@ static void __cpuinit coretemp_device_remove(unsigned int cpu)
> 
>  static bool __cpuinit is_any_core_online(struct platform_data *pdata)
>  {
> -       int i;
> -
> -       /* Find online cores, except pkgtemp data */
> -       for (i = MAX_CORE_DATA - 1; i >= 0; --i) {
> -               if (pdata->core_data[i] &&
> -                       !pdata->core_data[i]->is_pkg_data) {
> -                       return true;
> -               }
> -       }
> -       return false;
> +       struct temp_data *tdata;
> +       bool ret = true;
> +
> +       mutex_lock(&pdata->temp_data_lock);
> +       list_for_each_entry(tdata, &pdata->temp_data_list, list)
> +               if (!tdata->is_pkg_data)
> +                        goto out;
> +       ret = false;
> +out:
> +       mutex_unlock(&pdata->temp_data_lock);
> +       return ret;
>  }
> 
>  static void __cpuinit get_core_online(unsigned int cpu)
> @@ -697,9 +776,10 @@ static void __cpuinit get_core_online(unsigned int cpu)
> 
>  static void __cpuinit put_core_offline(unsigned int cpu)
>  {
> -       int i, indx;
> +       int i, attr_no;
>         struct platform_data *pdata;
>         struct platform_device *pdev = coretemp_get_pdev(cpu);
> +       struct temp_data *tdata;
> 
>         /* If the physical CPU device does not exist, just return */
>         if (!pdev)
> @@ -707,14 +787,14 @@ static void __cpuinit put_core_offline(unsigned int cpu)
> 
>         pdata = platform_get_drvdata(pdev);
> 
> -       indx = TO_ATTR_NO(cpu);
> +       attr_no = TO_ATTR_NO(cpu);
> 
> -       /* The core id is too big, just return */
> -       if (indx > MAX_CORE_DATA - 1)
> -               return;
> -
> -       if (pdata->core_data[indx] && pdata->core_data[indx]->cpu == cpu)
> -               coretemp_remove_core(pdata, &pdev->dev, indx);
> +       tdata = get_temp_data(pdata, attr_no);
> +       if (tdata) {
> +               if (tdata->cpu == cpu)
> +                       coretemp_remove_core(tdata, &pdev->dev);
> +               kref_put(&tdata->refcount, temp_data_release);
> +       }
> 
>         /*
>          * If a HT sibling of a core is taken offline, but another HT sibling
> --
> 1.7.9.1
> 

^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: [lm-sensors] [PATCH, v2] hwmon: coretemp: use list instead of fixed size array for temp
@ 2012-05-04  5:41             ` Guenter Roeck
  0 siblings, 0 replies; 52+ messages in thread
From: Guenter Roeck @ 2012-05-04  5:41 UTC (permalink / raw)
  To: Kirill A. Shutemov
  Cc: Fenghua Yu, Durgadoss R, Andi Kleen, Jean Delvare, lm-sensors,
	linux-kernel

On Thu, May 03, 2012 at 07:18:56AM -0400, Kirill A. Shutemov wrote:
> From: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>
> 
> Let's rework code to allow arbitrary number of cores on a CPU, not
> limited by hardcoded array size.
> 
> Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
> ---
>  v2:
>    - fix NULL pointer dereference. Thanks to R, Durgadoss;
>    - use mutex instead of spinlock for list locking.
> ---
>  drivers/hwmon/coretemp.c |  178 +++++++++++++++++++++++++++++++++-------------
>  1 files changed, 129 insertions(+), 49 deletions(-)
> 
> diff --git a/drivers/hwmon/coretemp.c b/drivers/hwmon/coretemp.c
> index 54a70fe..1c66131 100644
> --- a/drivers/hwmon/coretemp.c
> +++ b/drivers/hwmon/coretemp.c
> @@ -36,6 +36,8 @@
>  #include <linux/cpu.h>
>  #include <linux/pci.h>
>  #include <linux/smp.h>
> +#include <linux/list.h>
> +#include <linux/kref.h>
>  #include <linux/moduleparam.h>
>  #include <asm/msr.h>
>  #include <asm/processor.h>
> @@ -52,11 +54,9 @@ module_param_named(tjmax, force_tjmax, int, 0444);
>  MODULE_PARM_DESC(tjmax, "TjMax value in degrees Celsius");
> 
>  #define BASE_SYSFS_ATTR_NO     2       /* Sysfs Base attr no for coretemp */
> -#define NUM_REAL_CORES         16      /* Number of Real cores per cpu */
>  #define CORETEMP_NAME_LENGTH   17      /* String Length of attrs */
>  #define MAX_CORE_ATTRS         4       /* Maximum no of basic attrs */
>  #define TOTAL_ATTRS            (MAX_CORE_ATTRS + 1)
> -#define MAX_CORE_DATA          (NUM_REAL_CORES + BASE_SYSFS_ATTR_NO)
> 
>  #define TO_PHYS_ID(cpu)                (cpu_data(cpu).phys_proc_id)
>  #define TO_CORE_ID(cpu)                (cpu_data(cpu).cpu_core_id)
> @@ -82,6 +82,9 @@ MODULE_PARM_DESC(tjmax, "TjMax value in degrees Celsius");
>   * @valid: If this is 1, the current temperature is valid.
>   */
>  struct temp_data {
> +       struct list_head list;
> +       struct kref refcount;

Hi,

the kref is not needed. The attribute access functions don't
need to be protected since the attributes for a core are deleted
before the core data itself is deleted. So it is not neccessary 
to hold a lock while accessing/using temp_data in the attribute
access functions. All you need is to hold a mutex while you are
manipulating or walking the list.

Thanks,
Guenter

> +       int id;
>         int temp;
>         int ttarget;
>         int tjmax;
> @@ -101,7 +104,8 @@ struct temp_data {
>  struct platform_data {
>         struct device *hwmon_dev;
>         u16 phys_proc_id;
> -       struct temp_data *core_data[MAX_CORE_DATA];
> +       struct list_head temp_data_list;
> +       struct mutex temp_data_lock;
>         struct device_attribute name_attr;
>  };
> 
> @@ -114,6 +118,29 @@ struct pdev_entry {
>  static LIST_HEAD(pdev_list);
>  static DEFINE_MUTEX(pdev_list_mutex);
> 
> +static void temp_data_release(struct kref *ref)
> +{
> +       struct temp_data *tdata = container_of(ref, struct temp_data, refcount);
> +
> +       kfree(tdata);
> +}
> +
> +static struct temp_data *get_temp_data(struct platform_data *pdata, int id)
> +{
> +       struct temp_data *tdata;
> +
> +       mutex_lock(&pdata->temp_data_lock);
> +       list_for_each_entry(tdata, &pdata->temp_data_list, list)
> +               if (tdata->id = id) {
> +                       kref_get(&tdata->refcount);
> +                       goto out;
> +               }
> +       tdata = NULL;
> +out:
> +       mutex_unlock(&pdata->temp_data_lock);
> +       return tdata;
> +}
> +
>  static ssize_t show_name(struct device *dev,
>                         struct device_attribute *devattr, char *buf)
>  {
> @@ -125,12 +152,19 @@ static ssize_t show_label(struct device *dev,
>  {
>         struct sensor_device_attribute *attr = to_sensor_dev_attr(devattr);
>         struct platform_data *pdata = dev_get_drvdata(dev);
> -       struct temp_data *tdata = pdata->core_data[attr->index];
> +       struct temp_data *tdata = get_temp_data(pdata, attr->index);
> +       ssize_t ret;
> +
> +       if (!tdata)
> +               return -ENOENT;
> 
>         if (tdata->is_pkg_data)
> -               return sprintf(buf, "Physical id %u\n", pdata->phys_proc_id);
> +               ret = sprintf(buf, "Physical id %u\n", pdata->phys_proc_id);
> +       else
> +               ret = sprintf(buf, "Core %u\n", tdata->cpu_core_id);
> 
> -       return sprintf(buf, "Core %u\n", tdata->cpu_core_id);
> +       kref_put(&tdata->refcount, temp_data_release);
> +       return ret;
>  }
> 
>  static ssize_t show_crit_alarm(struct device *dev,
> @@ -139,11 +173,18 @@ static ssize_t show_crit_alarm(struct device *dev,
>         u32 eax, edx;
>         struct sensor_device_attribute *attr = to_sensor_dev_attr(devattr);
>         struct platform_data *pdata = dev_get_drvdata(dev);
> -       struct temp_data *tdata = pdata->core_data[attr->index];
> +       struct temp_data *tdata = get_temp_data(pdata, attr->index);
> +       ssize_t ret;
> +
> +       if (!tdata)
> +               return -ENOENT;
> 
>         rdmsr_on_cpu(tdata->cpu, tdata->status_reg, &eax, &edx);
> 
> -       return sprintf(buf, "%d\n", (eax >> 5) & 1);
> +       ret = sprintf(buf, "%d\n", (eax >> 5) & 1);
> +
> +       kref_put(&tdata->refcount, temp_data_release);
> +       return ret;
>  }
> 
>  static ssize_t show_tjmax(struct device *dev,
> @@ -151,8 +192,16 @@ static ssize_t show_tjmax(struct device *dev,
>  {
>         struct sensor_device_attribute *attr = to_sensor_dev_attr(devattr);
>         struct platform_data *pdata = dev_get_drvdata(dev);
> +       struct temp_data *tdata = get_temp_data(pdata, attr->index);
> +       ssize_t ret;
> +
> +       if (!tdata)
> +               return -ENOENT;
> +
> +       ret = sprintf(buf, "%d\n", tdata->tjmax);
> 
> -       return sprintf(buf, "%d\n", pdata->core_data[attr->index]->tjmax);
> +       kref_put(&tdata->refcount, temp_data_release);
> +       return ret;
>  }
> 
>  static ssize_t show_ttarget(struct device *dev,
> @@ -160,8 +209,16 @@ static ssize_t show_ttarget(struct device *dev,
>  {
>         struct sensor_device_attribute *attr = to_sensor_dev_attr(devattr);
>         struct platform_data *pdata = dev_get_drvdata(dev);
> +       struct temp_data *tdata = get_temp_data(pdata, attr->index);
> +       ssize_t ret;
> 
> -       return sprintf(buf, "%d\n", pdata->core_data[attr->index]->ttarget);
> +       if (!tdata)
> +               return -ENOENT;
> +
> +       ret = sprintf(buf, "%d\n", tdata->ttarget);
> +
> +       kref_put(&tdata->refcount, temp_data_release);
> +       return ret;
>  }
> 
>  static ssize_t show_temp(struct device *dev,
> @@ -170,7 +227,11 @@ static ssize_t show_temp(struct device *dev,
>         u32 eax, edx;
>         struct sensor_device_attribute *attr = to_sensor_dev_attr(devattr);
>         struct platform_data *pdata = dev_get_drvdata(dev);
> -       struct temp_data *tdata = pdata->core_data[attr->index];
> +       struct temp_data *tdata = get_temp_data(pdata, attr->index);
> +       ssize_t ret = -EAGAIN;
> +
> +       if (!tdata)
> +               return -ENOENT;
> 
>         mutex_lock(&tdata->update_lock);
> 
> @@ -188,7 +249,12 @@ static ssize_t show_temp(struct device *dev,
>         }
> 
>         mutex_unlock(&tdata->update_lock);
> -       return tdata->valid ? sprintf(buf, "%d\n", tdata->temp) : -EAGAIN;
> +
> +       if (tdata->valid)
> +               ret = sprintf(buf, "%d\n", tdata->temp);
> +
> +       kref_put(&tdata->refcount, temp_data_release);
> +       return ret;
>  }
> 
>  static int __cpuinit adjust_tjmax(struct cpuinfo_x86 *c, u32 id,
> @@ -412,6 +478,7 @@ static struct temp_data __cpuinit *init_temp_data(unsigned int cpu,
>         tdata = kzalloc(sizeof(struct temp_data), GFP_KERNEL);
>         if (!tdata)
>                 return NULL;
> +       kref_init(&tdata->refcount);
> 
>         tdata->status_reg = pkg_flag ? MSR_IA32_PACKAGE_THERM_STATUS :
>                                                         MSR_IA32_THERM_STATUS;
> @@ -423,7 +490,7 @@ static struct temp_data __cpuinit *init_temp_data(unsigned int cpu,
>         return tdata;
>  }
> 
> -static int __cpuinit create_core_data(struct platform_device *pdev,
> +static int __cpuinit create_temp_data(struct platform_device *pdev,
>                                 unsigned int cpu, int pkg_flag)
>  {
>         struct temp_data *tdata;
> @@ -440,9 +507,6 @@ static int __cpuinit create_core_data(struct platform_device *pdev,
>          */
>         attr_no = pkg_flag ? 1 : TO_ATTR_NO(cpu);
> 
> -       if (attr_no > MAX_CORE_DATA - 1)
> -               return -ERANGE;
> -
>         /*
>          * Provide a single set of attributes for all HT siblings of a core
>          * to avoid duplicate sensors (the processor ID and core ID of all
> @@ -450,8 +514,14 @@ static int __cpuinit create_core_data(struct platform_device *pdev,
>          * Skip if a HT sibling of this core is already registered.
>          * This is not an error.
>          */
> -       if (pdata->core_data[attr_no] != NULL)
> -               return 0;
> +       mutex_lock(&pdata->temp_data_lock);
> +       list_for_each_entry(tdata, &pdata->temp_data_list, list) {
> +               if (tdata->id = attr_no) {
> +                       mutex_unlock(&pdata->temp_data_lock);
> +                       return 0;
> +               }
> +       }
> +       mutex_unlock(&pdata->temp_data_lock);
> 
>         tdata = init_temp_data(cpu, pkg_flag);
>         if (!tdata)
> @@ -480,16 +550,23 @@ static int __cpuinit create_core_data(struct platform_device *pdev,
>                 }
>         }
> 
> -       pdata->core_data[attr_no] = tdata;
> +       tdata->id = attr_no;
> +
> +       mutex_lock(&pdata->temp_data_lock);
> +       list_add(&tdata->list, &pdata->temp_data_list);
> +       mutex_unlock(&pdata->temp_data_lock);
> 
>         /* Create sysfs interfaces */
>         err = create_core_attrs(tdata, &pdev->dev, attr_no);
>         if (err)
> -               goto exit_free;
> +               goto exit_list_del;
> 
>         return 0;
> +exit_list_del:
> +       mutex_lock(&pdata->temp_data_lock);
> +       list_del(&tdata->list);
> +       mutex_unlock(&pdata->temp_data_lock);
>  exit_free:
> -       pdata->core_data[attr_no] = NULL;
>         kfree(tdata);
>         return err;
>  }
> @@ -502,23 +579,21 @@ static void __cpuinit coretemp_add_core(unsigned int cpu, int pkg_flag)
>         if (!pdev)
>                 return;
> 
> -       err = create_core_data(pdev, cpu, pkg_flag);
> +       err = create_temp_data(pdev, cpu, pkg_flag);
>         if (err)
>                 dev_err(&pdev->dev, "Adding Core %u failed\n", cpu);
>  }
> 
> -static void coretemp_remove_core(struct platform_data *pdata,
> -                               struct device *dev, int indx)
> +static void coretemp_remove_core(struct temp_data *tdata, struct device *dev)
>  {
>         int i;
> -       struct temp_data *tdata = pdata->core_data[indx];
> 
>         /* Remove the sysfs attributes */
>         for (i = 0; i < tdata->attr_size; i++)
>                 device_remove_file(dev, &tdata->sd_attrs[i].dev_attr);
> 
> -       kfree(pdata->core_data[indx]);
> -       pdata->core_data[indx] = NULL;
> +       list_del(&tdata->list);
> +       kref_put(&tdata->refcount, temp_data_release);
>  }
> 
>  static int __devinit coretemp_probe(struct platform_device *pdev)
> @@ -536,6 +611,8 @@ static int __devinit coretemp_probe(struct platform_device *pdev)
>                 goto exit_free;
> 
>         pdata->phys_proc_id = pdev->id;
> +       INIT_LIST_HEAD(&pdata->temp_data_list);
> +       mutex_init(&pdata->temp_data_lock);
>         platform_set_drvdata(pdev, pdata);
> 
>         pdata->hwmon_dev = hwmon_device_register(&pdev->dev);
> @@ -557,11 +634,12 @@ exit_free:
>  static int __devexit coretemp_remove(struct platform_device *pdev)
>  {
>         struct platform_data *pdata = platform_get_drvdata(pdev);
> -       int i;
> +       struct temp_data *cur, *tmp;
> 
> -       for (i = MAX_CORE_DATA - 1; i >= 0; --i)
> -               if (pdata->core_data[i])
> -                       coretemp_remove_core(pdata, &pdev->dev, i);
> +       mutex_lock(&pdata->temp_data_lock);
> +       list_for_each_entry_safe(cur, tmp, &pdata->temp_data_list, list)
> +               coretemp_remove_core(cur, &pdev->dev);
> +       mutex_unlock(&pdata->temp_data_lock);
> 
>         device_remove_file(&pdev->dev, &pdata->name_attr);
>         hwmon_device_unregister(pdata->hwmon_dev);
> @@ -641,16 +719,17 @@ static void __cpuinit coretemp_device_remove(unsigned int cpu)
> 
>  static bool __cpuinit is_any_core_online(struct platform_data *pdata)
>  {
> -       int i;
> -
> -       /* Find online cores, except pkgtemp data */
> -       for (i = MAX_CORE_DATA - 1; i >= 0; --i) {
> -               if (pdata->core_data[i] &&
> -                       !pdata->core_data[i]->is_pkg_data) {
> -                       return true;
> -               }
> -       }
> -       return false;
> +       struct temp_data *tdata;
> +       bool ret = true;
> +
> +       mutex_lock(&pdata->temp_data_lock);
> +       list_for_each_entry(tdata, &pdata->temp_data_list, list)
> +               if (!tdata->is_pkg_data)
> +                        goto out;
> +       ret = false;
> +out:
> +       mutex_unlock(&pdata->temp_data_lock);
> +       return ret;
>  }
> 
>  static void __cpuinit get_core_online(unsigned int cpu)
> @@ -697,9 +776,10 @@ static void __cpuinit get_core_online(unsigned int cpu)
> 
>  static void __cpuinit put_core_offline(unsigned int cpu)
>  {
> -       int i, indx;
> +       int i, attr_no;
>         struct platform_data *pdata;
>         struct platform_device *pdev = coretemp_get_pdev(cpu);
> +       struct temp_data *tdata;
> 
>         /* If the physical CPU device does not exist, just return */
>         if (!pdev)
> @@ -707,14 +787,14 @@ static void __cpuinit put_core_offline(unsigned int cpu)
> 
>         pdata = platform_get_drvdata(pdev);
> 
> -       indx = TO_ATTR_NO(cpu);
> +       attr_no = TO_ATTR_NO(cpu);
> 
> -       /* The core id is too big, just return */
> -       if (indx > MAX_CORE_DATA - 1)
> -               return;
> -
> -       if (pdata->core_data[indx] && pdata->core_data[indx]->cpu = cpu)
> -               coretemp_remove_core(pdata, &pdev->dev, indx);
> +       tdata = get_temp_data(pdata, attr_no);
> +       if (tdata) {
> +               if (tdata->cpu = cpu)
> +                       coretemp_remove_core(tdata, &pdev->dev);
> +               kref_put(&tdata->refcount, temp_data_release);
> +       }
> 
>         /*
>          * If a HT sibling of a core is taken offline, but another HT sibling
> --
> 1.7.9.1
> 

_______________________________________________
lm-sensors mailing list
lm-sensors@lm-sensors.org
http://lists.lm-sensors.org/mailman/listinfo/lm-sensors

^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: [PATCH, v2] hwmon: coretemp: use list instead of fixed size array for temp data
  2012-05-04  5:41             ` [lm-sensors] [PATCH, v2] hwmon: coretemp: use list instead of fixed size array for temp Guenter Roeck
@ 2012-05-04  6:46               ` Kirill A. Shutemov
  -1 siblings, 0 replies; 52+ messages in thread
From: Kirill A. Shutemov @ 2012-05-04  6:46 UTC (permalink / raw)
  To: Guenter Roeck
  Cc: Fenghua Yu, Durgadoss R, Andi Kleen, Jean Delvare, lm-sensors,
	linux-kernel

[-- Attachment #1: Type: text/plain, Size: 2773 bytes --]

On Thu, May 03, 2012 at 10:41:22PM -0700, Guenter Roeck wrote:
> On Thu, May 03, 2012 at 07:18:56AM -0400, Kirill A. Shutemov wrote:
> > From: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>
> > 
> > Let's rework code to allow arbitrary number of cores on a CPU, not
> > limited by hardcoded array size.
> > 
> > Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
> > ---
> >  v2:
> >    - fix NULL pointer dereference. Thanks to R, Durgadoss;
> >    - use mutex instead of spinlock for list locking.
> > ---
> >  drivers/hwmon/coretemp.c |  178 +++++++++++++++++++++++++++++++++-------------
> >  1 files changed, 129 insertions(+), 49 deletions(-)
> > 
> > diff --git a/drivers/hwmon/coretemp.c b/drivers/hwmon/coretemp.c
> > index 54a70fe..1c66131 100644
> > --- a/drivers/hwmon/coretemp.c
> > +++ b/drivers/hwmon/coretemp.c
> > @@ -36,6 +36,8 @@
> >  #include <linux/cpu.h>
> >  #include <linux/pci.h>
> >  #include <linux/smp.h>
> > +#include <linux/list.h>
> > +#include <linux/kref.h>
> >  #include <linux/moduleparam.h>
> >  #include <asm/msr.h>
> >  #include <asm/processor.h>
> > @@ -52,11 +54,9 @@ module_param_named(tjmax, force_tjmax, int, 0444);
> >  MODULE_PARM_DESC(tjmax, "TjMax value in degrees Celsius");
> > 
> >  #define BASE_SYSFS_ATTR_NO     2       /* Sysfs Base attr no for coretemp */
> > -#define NUM_REAL_CORES         16      /* Number of Real cores per cpu */
> >  #define CORETEMP_NAME_LENGTH   17      /* String Length of attrs */
> >  #define MAX_CORE_ATTRS         4       /* Maximum no of basic attrs */
> >  #define TOTAL_ATTRS            (MAX_CORE_ATTRS + 1)
> > -#define MAX_CORE_DATA          (NUM_REAL_CORES + BASE_SYSFS_ATTR_NO)
> > 
> >  #define TO_PHYS_ID(cpu)                (cpu_data(cpu).phys_proc_id)
> >  #define TO_CORE_ID(cpu)                (cpu_data(cpu).cpu_core_id)
> > @@ -82,6 +82,9 @@ MODULE_PARM_DESC(tjmax, "TjMax value in degrees Celsius");
> >   * @valid: If this is 1, the current temperature is valid.
> >   */
> >  struct temp_data {
> > +       struct list_head list;
> > +       struct kref refcount;
> 
> Hi,
> 
> the kref is not needed. The attribute access functions don't
> need to be protected since the attributes for a core are deleted
> before the core data itself is deleted. So it is not neccessary 
> to hold a lock while accessing/using temp_data in the attribute
> access functions. All you need is to hold a mutex while you are
> manipulating or walking the list.

Without kref, what prevents following situation:

		CPU-A				CPU-B
	tdata = get_temp_data();
					coretemp_remove_core() {
					    device_remove_file();
					    kfree(tdata);
					}
	<tdata dereference>

?

-- 
 Kirill A. Shutemov

[-- Attachment #2: Digital signature --]
[-- Type: application/pgp-signature, Size: 836 bytes --]

^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: [lm-sensors] [PATCH, v2] hwmon: coretemp: use list instead of fixed size array for temp
@ 2012-05-04  6:46               ` Kirill A. Shutemov
  0 siblings, 0 replies; 52+ messages in thread
From: Kirill A. Shutemov @ 2012-05-04  6:46 UTC (permalink / raw)
  To: Guenter Roeck
  Cc: Fenghua Yu, Durgadoss R, Andi Kleen, Jean Delvare, lm-sensors,
	linux-kernel


[-- Attachment #1.1: Type: text/plain, Size: 2773 bytes --]

On Thu, May 03, 2012 at 10:41:22PM -0700, Guenter Roeck wrote:
> On Thu, May 03, 2012 at 07:18:56AM -0400, Kirill A. Shutemov wrote:
> > From: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>
> > 
> > Let's rework code to allow arbitrary number of cores on a CPU, not
> > limited by hardcoded array size.
> > 
> > Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
> > ---
> >  v2:
> >    - fix NULL pointer dereference. Thanks to R, Durgadoss;
> >    - use mutex instead of spinlock for list locking.
> > ---
> >  drivers/hwmon/coretemp.c |  178 +++++++++++++++++++++++++++++++++-------------
> >  1 files changed, 129 insertions(+), 49 deletions(-)
> > 
> > diff --git a/drivers/hwmon/coretemp.c b/drivers/hwmon/coretemp.c
> > index 54a70fe..1c66131 100644
> > --- a/drivers/hwmon/coretemp.c
> > +++ b/drivers/hwmon/coretemp.c
> > @@ -36,6 +36,8 @@
> >  #include <linux/cpu.h>
> >  #include <linux/pci.h>
> >  #include <linux/smp.h>
> > +#include <linux/list.h>
> > +#include <linux/kref.h>
> >  #include <linux/moduleparam.h>
> >  #include <asm/msr.h>
> >  #include <asm/processor.h>
> > @@ -52,11 +54,9 @@ module_param_named(tjmax, force_tjmax, int, 0444);
> >  MODULE_PARM_DESC(tjmax, "TjMax value in degrees Celsius");
> > 
> >  #define BASE_SYSFS_ATTR_NO     2       /* Sysfs Base attr no for coretemp */
> > -#define NUM_REAL_CORES         16      /* Number of Real cores per cpu */
> >  #define CORETEMP_NAME_LENGTH   17      /* String Length of attrs */
> >  #define MAX_CORE_ATTRS         4       /* Maximum no of basic attrs */
> >  #define TOTAL_ATTRS            (MAX_CORE_ATTRS + 1)
> > -#define MAX_CORE_DATA          (NUM_REAL_CORES + BASE_SYSFS_ATTR_NO)
> > 
> >  #define TO_PHYS_ID(cpu)                (cpu_data(cpu).phys_proc_id)
> >  #define TO_CORE_ID(cpu)                (cpu_data(cpu).cpu_core_id)
> > @@ -82,6 +82,9 @@ MODULE_PARM_DESC(tjmax, "TjMax value in degrees Celsius");
> >   * @valid: If this is 1, the current temperature is valid.
> >   */
> >  struct temp_data {
> > +       struct list_head list;
> > +       struct kref refcount;
> 
> Hi,
> 
> the kref is not needed. The attribute access functions don't
> need to be protected since the attributes for a core are deleted
> before the core data itself is deleted. So it is not neccessary 
> to hold a lock while accessing/using temp_data in the attribute
> access functions. All you need is to hold a mutex while you are
> manipulating or walking the list.

Without kref, what prevents following situation:

		CPU-A				CPU-B
	tdata = get_temp_data();
					coretemp_remove_core() {
					    device_remove_file();
					    kfree(tdata);
					}
	<tdata dereference>

?

-- 
 Kirill A. Shutemov

[-- Attachment #1.2: Digital signature --]
[-- Type: application/pgp-signature, Size: 836 bytes --]

[-- Attachment #2: Type: text/plain, Size: 153 bytes --]

_______________________________________________
lm-sensors mailing list
lm-sensors@lm-sensors.org
http://lists.lm-sensors.org/mailman/listinfo/lm-sensors

^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: [PATCH, v2] hwmon: coretemp: use list instead of fixed size array for temp data
  2012-05-04  6:46               ` [lm-sensors] [PATCH, v2] hwmon: coretemp: use list instead of fixed size array for temp Kirill A. Shutemov
@ 2012-05-04 13:34                 ` Guenter Roeck
  -1 siblings, 0 replies; 52+ messages in thread
From: Guenter Roeck @ 2012-05-04 13:34 UTC (permalink / raw)
  To: Kirill A. Shutemov
  Cc: Fenghua Yu, Durgadoss R, Andi Kleen, Jean Delvare, lm-sensors,
	linux-kernel

On Fri, May 04, 2012 at 02:46:06AM -0400, Kirill A. Shutemov wrote:
> On Thu, May 03, 2012 at 10:41:22PM -0700, Guenter Roeck wrote:
> > On Thu, May 03, 2012 at 07:18:56AM -0400, Kirill A. Shutemov wrote:
> > > From: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>
> > > 
> > > Let's rework code to allow arbitrary number of cores on a CPU, not
> > > limited by hardcoded array size.
> > > 
> > > Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
> > > ---
> > >  v2:
> > >    - fix NULL pointer dereference. Thanks to R, Durgadoss;
> > >    - use mutex instead of spinlock for list locking.
> > > ---
> > >  drivers/hwmon/coretemp.c |  178 +++++++++++++++++++++++++++++++++-------------
> > >  1 files changed, 129 insertions(+), 49 deletions(-)
> > > 
> > > diff --git a/drivers/hwmon/coretemp.c b/drivers/hwmon/coretemp.c
> > > index 54a70fe..1c66131 100644
> > > --- a/drivers/hwmon/coretemp.c
> > > +++ b/drivers/hwmon/coretemp.c
> > > @@ -36,6 +36,8 @@
> > >  #include <linux/cpu.h>
> > >  #include <linux/pci.h>
> > >  #include <linux/smp.h>
> > > +#include <linux/list.h>
> > > +#include <linux/kref.h>
> > >  #include <linux/moduleparam.h>
> > >  #include <asm/msr.h>
> > >  #include <asm/processor.h>
> > > @@ -52,11 +54,9 @@ module_param_named(tjmax, force_tjmax, int, 0444);
> > >  MODULE_PARM_DESC(tjmax, "TjMax value in degrees Celsius");
> > > 
> > >  #define BASE_SYSFS_ATTR_NO     2       /* Sysfs Base attr no for coretemp */
> > > -#define NUM_REAL_CORES         16      /* Number of Real cores per cpu */
> > >  #define CORETEMP_NAME_LENGTH   17      /* String Length of attrs */
> > >  #define MAX_CORE_ATTRS         4       /* Maximum no of basic attrs */
> > >  #define TOTAL_ATTRS            (MAX_CORE_ATTRS + 1)
> > > -#define MAX_CORE_DATA          (NUM_REAL_CORES + BASE_SYSFS_ATTR_NO)
> > > 
> > >  #define TO_PHYS_ID(cpu)                (cpu_data(cpu).phys_proc_id)
> > >  #define TO_CORE_ID(cpu)                (cpu_data(cpu).cpu_core_id)
> > > @@ -82,6 +82,9 @@ MODULE_PARM_DESC(tjmax, "TjMax value in degrees Celsius");
> > >   * @valid: If this is 1, the current temperature is valid.
> > >   */
> > >  struct temp_data {
> > > +       struct list_head list;
> > > +       struct kref refcount;
> > 
> > Hi,
> > 
> > the kref is not needed. The attribute access functions don't
> > need to be protected since the attributes for a core are deleted
> > before the core data itself is deleted. So it is not neccessary 
> > to hold a lock while accessing/using temp_data in the attribute
> > access functions. All you need is to hold a mutex while you are
> > manipulating or walking the list.
> 
> Without kref, what prevents following situation:
> 
> 		CPU-A				CPU-B
> 	tdata = get_temp_data();
> 					coretemp_remove_core() {
> 					    device_remove_file();
> 					    kfree(tdata);
> 					}
> 	<tdata dereference>
> 
The remove function requires a semaphore which is held by the access function,
so device_remove_file() will only proceed after CPU-A is done with the sysfs access.

Guenter

^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: [lm-sensors] [PATCH, v2] hwmon: coretemp: use list instead of fixed size array for temp
@ 2012-05-04 13:34                 ` Guenter Roeck
  0 siblings, 0 replies; 52+ messages in thread
From: Guenter Roeck @ 2012-05-04 13:34 UTC (permalink / raw)
  To: Kirill A. Shutemov
  Cc: Fenghua Yu, Durgadoss R, Andi Kleen, Jean Delvare, lm-sensors,
	linux-kernel

On Fri, May 04, 2012 at 02:46:06AM -0400, Kirill A. Shutemov wrote:
> On Thu, May 03, 2012 at 10:41:22PM -0700, Guenter Roeck wrote:
> > On Thu, May 03, 2012 at 07:18:56AM -0400, Kirill A. Shutemov wrote:
> > > From: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>
> > > 
> > > Let's rework code to allow arbitrary number of cores on a CPU, not
> > > limited by hardcoded array size.
> > > 
> > > Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
> > > ---
> > >  v2:
> > >    - fix NULL pointer dereference. Thanks to R, Durgadoss;
> > >    - use mutex instead of spinlock for list locking.
> > > ---
> > >  drivers/hwmon/coretemp.c |  178 +++++++++++++++++++++++++++++++++-------------
> > >  1 files changed, 129 insertions(+), 49 deletions(-)
> > > 
> > > diff --git a/drivers/hwmon/coretemp.c b/drivers/hwmon/coretemp.c
> > > index 54a70fe..1c66131 100644
> > > --- a/drivers/hwmon/coretemp.c
> > > +++ b/drivers/hwmon/coretemp.c
> > > @@ -36,6 +36,8 @@
> > >  #include <linux/cpu.h>
> > >  #include <linux/pci.h>
> > >  #include <linux/smp.h>
> > > +#include <linux/list.h>
> > > +#include <linux/kref.h>
> > >  #include <linux/moduleparam.h>
> > >  #include <asm/msr.h>
> > >  #include <asm/processor.h>
> > > @@ -52,11 +54,9 @@ module_param_named(tjmax, force_tjmax, int, 0444);
> > >  MODULE_PARM_DESC(tjmax, "TjMax value in degrees Celsius");
> > > 
> > >  #define BASE_SYSFS_ATTR_NO     2       /* Sysfs Base attr no for coretemp */
> > > -#define NUM_REAL_CORES         16      /* Number of Real cores per cpu */
> > >  #define CORETEMP_NAME_LENGTH   17      /* String Length of attrs */
> > >  #define MAX_CORE_ATTRS         4       /* Maximum no of basic attrs */
> > >  #define TOTAL_ATTRS            (MAX_CORE_ATTRS + 1)
> > > -#define MAX_CORE_DATA          (NUM_REAL_CORES + BASE_SYSFS_ATTR_NO)
> > > 
> > >  #define TO_PHYS_ID(cpu)                (cpu_data(cpu).phys_proc_id)
> > >  #define TO_CORE_ID(cpu)                (cpu_data(cpu).cpu_core_id)
> > > @@ -82,6 +82,9 @@ MODULE_PARM_DESC(tjmax, "TjMax value in degrees Celsius");
> > >   * @valid: If this is 1, the current temperature is valid.
> > >   */
> > >  struct temp_data {
> > > +       struct list_head list;
> > > +       struct kref refcount;
> > 
> > Hi,
> > 
> > the kref is not needed. The attribute access functions don't
> > need to be protected since the attributes for a core are deleted
> > before the core data itself is deleted. So it is not neccessary 
> > to hold a lock while accessing/using temp_data in the attribute
> > access functions. All you need is to hold a mutex while you are
> > manipulating or walking the list.
> 
> Without kref, what prevents following situation:
> 
> 		CPU-A				CPU-B
> 	tdata = get_temp_data();
> 					coretemp_remove_core() {
> 					    device_remove_file();
> 					    kfree(tdata);
> 					}
> 	<tdata dereference>
> 
The remove function requires a semaphore which is held by the access function,
so device_remove_file() will only proceed after CPU-A is done with the sysfs access.

Guenter

_______________________________________________
lm-sensors mailing list
lm-sensors@lm-sensors.org
http://lists.lm-sensors.org/mailman/listinfo/lm-sensors

^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: [PATCH, v2] hwmon: coretemp: use list instead of fixed size array for temp data
  2012-05-04 13:34                 ` [lm-sensors] [PATCH, v2] hwmon: coretemp: use list instead of fixed size array for temp Guenter Roeck
@ 2012-05-04 13:42                   ` Kirill A. Shutemov
  -1 siblings, 0 replies; 52+ messages in thread
From: Kirill A. Shutemov @ 2012-05-04 13:42 UTC (permalink / raw)
  To: Guenter Roeck
  Cc: Fenghua Yu, Durgadoss R, Andi Kleen, Jean Delvare, lm-sensors,
	linux-kernel

[-- Attachment #1: Type: text/plain, Size: 3381 bytes --]

On Fri, May 04, 2012 at 06:34:19AM -0700, Guenter Roeck wrote:
> On Fri, May 04, 2012 at 02:46:06AM -0400, Kirill A. Shutemov wrote:
> > On Thu, May 03, 2012 at 10:41:22PM -0700, Guenter Roeck wrote:
> > > On Thu, May 03, 2012 at 07:18:56AM -0400, Kirill A. Shutemov wrote:
> > > > From: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>
> > > > 
> > > > Let's rework code to allow arbitrary number of cores on a CPU, not
> > > > limited by hardcoded array size.
> > > > 
> > > > Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
> > > > ---
> > > >  v2:
> > > >    - fix NULL pointer dereference. Thanks to R, Durgadoss;
> > > >    - use mutex instead of spinlock for list locking.
> > > > ---
> > > >  drivers/hwmon/coretemp.c |  178 +++++++++++++++++++++++++++++++++-------------
> > > >  1 files changed, 129 insertions(+), 49 deletions(-)
> > > > 
> > > > diff --git a/drivers/hwmon/coretemp.c b/drivers/hwmon/coretemp.c
> > > > index 54a70fe..1c66131 100644
> > > > --- a/drivers/hwmon/coretemp.c
> > > > +++ b/drivers/hwmon/coretemp.c
> > > > @@ -36,6 +36,8 @@
> > > >  #include <linux/cpu.h>
> > > >  #include <linux/pci.h>
> > > >  #include <linux/smp.h>
> > > > +#include <linux/list.h>
> > > > +#include <linux/kref.h>
> > > >  #include <linux/moduleparam.h>
> > > >  #include <asm/msr.h>
> > > >  #include <asm/processor.h>
> > > > @@ -52,11 +54,9 @@ module_param_named(tjmax, force_tjmax, int, 0444);
> > > >  MODULE_PARM_DESC(tjmax, "TjMax value in degrees Celsius");
> > > > 
> > > >  #define BASE_SYSFS_ATTR_NO     2       /* Sysfs Base attr no for coretemp */
> > > > -#define NUM_REAL_CORES         16      /* Number of Real cores per cpu */
> > > >  #define CORETEMP_NAME_LENGTH   17      /* String Length of attrs */
> > > >  #define MAX_CORE_ATTRS         4       /* Maximum no of basic attrs */
> > > >  #define TOTAL_ATTRS            (MAX_CORE_ATTRS + 1)
> > > > -#define MAX_CORE_DATA          (NUM_REAL_CORES + BASE_SYSFS_ATTR_NO)
> > > > 
> > > >  #define TO_PHYS_ID(cpu)                (cpu_data(cpu).phys_proc_id)
> > > >  #define TO_CORE_ID(cpu)                (cpu_data(cpu).cpu_core_id)
> > > > @@ -82,6 +82,9 @@ MODULE_PARM_DESC(tjmax, "TjMax value in degrees Celsius");
> > > >   * @valid: If this is 1, the current temperature is valid.
> > > >   */
> > > >  struct temp_data {
> > > > +       struct list_head list;
> > > > +       struct kref refcount;
> > > 
> > > Hi,
> > > 
> > > the kref is not needed. The attribute access functions don't
> > > need to be protected since the attributes for a core are deleted
> > > before the core data itself is deleted. So it is not neccessary 
> > > to hold a lock while accessing/using temp_data in the attribute
> > > access functions. All you need is to hold a mutex while you are
> > > manipulating or walking the list.
> > 
> > Without kref, what prevents following situation:
> > 
> > 		CPU-A				CPU-B
> > 	tdata = get_temp_data();
> > 					coretemp_remove_core() {
> > 					    device_remove_file();
> > 					    kfree(tdata);
> > 					}
> > 	<tdata dereference>
> > 
> The remove function requires a semaphore which is held by the access function,
> so device_remove_file() will only proceed after CPU-A is done with the sysfs access.

Understood. I'll update the patch.

-- 
 Kirill A. Shutemov

[-- Attachment #2: Digital signature --]
[-- Type: application/pgp-signature, Size: 836 bytes --]

^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: [lm-sensors] [PATCH, v2] hwmon: coretemp: use list instead of fixed size array for temp
@ 2012-05-04 13:42                   ` Kirill A. Shutemov
  0 siblings, 0 replies; 52+ messages in thread
From: Kirill A. Shutemov @ 2012-05-04 13:42 UTC (permalink / raw)
  To: Guenter Roeck
  Cc: Fenghua Yu, Durgadoss R, Andi Kleen, Jean Delvare, lm-sensors,
	linux-kernel


[-- Attachment #1.1: Type: text/plain, Size: 3381 bytes --]

On Fri, May 04, 2012 at 06:34:19AM -0700, Guenter Roeck wrote:
> On Fri, May 04, 2012 at 02:46:06AM -0400, Kirill A. Shutemov wrote:
> > On Thu, May 03, 2012 at 10:41:22PM -0700, Guenter Roeck wrote:
> > > On Thu, May 03, 2012 at 07:18:56AM -0400, Kirill A. Shutemov wrote:
> > > > From: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>
> > > > 
> > > > Let's rework code to allow arbitrary number of cores on a CPU, not
> > > > limited by hardcoded array size.
> > > > 
> > > > Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
> > > > ---
> > > >  v2:
> > > >    - fix NULL pointer dereference. Thanks to R, Durgadoss;
> > > >    - use mutex instead of spinlock for list locking.
> > > > ---
> > > >  drivers/hwmon/coretemp.c |  178 +++++++++++++++++++++++++++++++++-------------
> > > >  1 files changed, 129 insertions(+), 49 deletions(-)
> > > > 
> > > > diff --git a/drivers/hwmon/coretemp.c b/drivers/hwmon/coretemp.c
> > > > index 54a70fe..1c66131 100644
> > > > --- a/drivers/hwmon/coretemp.c
> > > > +++ b/drivers/hwmon/coretemp.c
> > > > @@ -36,6 +36,8 @@
> > > >  #include <linux/cpu.h>
> > > >  #include <linux/pci.h>
> > > >  #include <linux/smp.h>
> > > > +#include <linux/list.h>
> > > > +#include <linux/kref.h>
> > > >  #include <linux/moduleparam.h>
> > > >  #include <asm/msr.h>
> > > >  #include <asm/processor.h>
> > > > @@ -52,11 +54,9 @@ module_param_named(tjmax, force_tjmax, int, 0444);
> > > >  MODULE_PARM_DESC(tjmax, "TjMax value in degrees Celsius");
> > > > 
> > > >  #define BASE_SYSFS_ATTR_NO     2       /* Sysfs Base attr no for coretemp */
> > > > -#define NUM_REAL_CORES         16      /* Number of Real cores per cpu */
> > > >  #define CORETEMP_NAME_LENGTH   17      /* String Length of attrs */
> > > >  #define MAX_CORE_ATTRS         4       /* Maximum no of basic attrs */
> > > >  #define TOTAL_ATTRS            (MAX_CORE_ATTRS + 1)
> > > > -#define MAX_CORE_DATA          (NUM_REAL_CORES + BASE_SYSFS_ATTR_NO)
> > > > 
> > > >  #define TO_PHYS_ID(cpu)                (cpu_data(cpu).phys_proc_id)
> > > >  #define TO_CORE_ID(cpu)                (cpu_data(cpu).cpu_core_id)
> > > > @@ -82,6 +82,9 @@ MODULE_PARM_DESC(tjmax, "TjMax value in degrees Celsius");
> > > >   * @valid: If this is 1, the current temperature is valid.
> > > >   */
> > > >  struct temp_data {
> > > > +       struct list_head list;
> > > > +       struct kref refcount;
> > > 
> > > Hi,
> > > 
> > > the kref is not needed. The attribute access functions don't
> > > need to be protected since the attributes for a core are deleted
> > > before the core data itself is deleted. So it is not neccessary 
> > > to hold a lock while accessing/using temp_data in the attribute
> > > access functions. All you need is to hold a mutex while you are
> > > manipulating or walking the list.
> > 
> > Without kref, what prevents following situation:
> > 
> > 		CPU-A				CPU-B
> > 	tdata = get_temp_data();
> > 					coretemp_remove_core() {
> > 					    device_remove_file();
> > 					    kfree(tdata);
> > 					}
> > 	<tdata dereference>
> > 
> The remove function requires a semaphore which is held by the access function,
> so device_remove_file() will only proceed after CPU-A is done with the sysfs access.

Understood. I'll update the patch.

-- 
 Kirill A. Shutemov

[-- Attachment #1.2: Digital signature --]
[-- Type: application/pgp-signature, Size: 836 bytes --]

[-- Attachment #2: Type: text/plain, Size: 153 bytes --]

_______________________________________________
lm-sensors mailing list
lm-sensors@lm-sensors.org
http://lists.lm-sensors.org/mailman/listinfo/lm-sensors

^ permalink raw reply	[flat|nested] 52+ messages in thread

* [PATCH] hwmon: coretemp: use list instead of fixed size array for temp data
  2012-05-02 15:10         ` [lm-sensors] " Kirill A. Shutemov
@ 2015-07-15 16:04 ` Lukasz Odzioba
  -1 siblings, 0 replies; 52+ messages in thread
From: Lukasz Odzioba @ 2015-07-15 16:04 UTC (permalink / raw)
  To: fenghua.yu; +Cc: jdelvare, linux, lm-sensors, linux-kernel, Lukasz Odzioba

Removes the limits of supported CPU cores and max core ID.


Patch is based on Kirill A. Shutemov's work from 2012.

Signed-off-by: Lukasz Odzioba <lukasz.odzioba@intel.com>
---
 drivers/hwmon/coretemp.c |  120 ++++++++++++++++++++++++++++-----------------
 1 files changed, 75 insertions(+), 45 deletions(-)

diff --git a/drivers/hwmon/coretemp.c b/drivers/hwmon/coretemp.c
index 3e03379..c39ce14 100644
--- a/drivers/hwmon/coretemp.c
+++ b/drivers/hwmon/coretemp.c
@@ -52,11 +52,10 @@ module_param_named(tjmax, force_tjmax, int, 0444);
 MODULE_PARM_DESC(tjmax, "TjMax value in degrees Celsius");
 
 #define BASE_SYSFS_ATTR_NO	2	/* Sysfs Base attr no for coretemp */
-#define NUM_REAL_CORES		32	/* Number of Real cores per cpu */
 #define CORETEMP_NAME_LENGTH	19	/* String Length of attrs */
 #define MAX_CORE_ATTRS		4	/* Maximum no of basic attrs */
 #define TOTAL_ATTRS		(MAX_CORE_ATTRS + 1)
-#define MAX_CORE_DATA		(NUM_REAL_CORES + BASE_SYSFS_ATTR_NO)
+#define PACKAGE_ID		1	/* Magic number of physical cpu */
 
 #define TO_PHYS_ID(cpu)		(cpu_data(cpu).phys_proc_id)
 #define TO_CORE_ID(cpu)		(cpu_data(cpu).cpu_core_id)
@@ -71,18 +70,20 @@ MODULE_PARM_DESC(tjmax, "TjMax value in degrees Celsius");
 
 /*
  * Per-Core Temperature Data
+ * @id: If this is equal PACKAGE_ID, structure contains package temperature
+		data, otherwise it is just TO_ATTR_NO(cpu)
  * @last_updated: The time when the current temperature value was updated
  *		earlier (in jiffies).
  * @cpu_core_id: The CPU Core from which temperature values should be read
  *		This value is passed as "id" field to rdmsr/wrmsr functions.
  * @status_reg: One of IA32_THERM_STATUS or IA32_PACKAGE_THERM_STATUS,
  *		from where the temperature values should be read.
- * @attr_size:  Total number of pre-core attrs displayed in the sysfs.
- * @is_pkg_data: If this is 1, the temp_data holds pkgtemp data.
- *		Otherwise, temp_data holds coretemp data.
+ * @attr_size:  Total number of per-core attrs displayed in the sysfs.
  * @valid: If this is 1, the current temperature is valid.
  */
 struct temp_data {
+	struct list_head list;
+	int id;
 	int temp;
 	int ttarget;
 	int tjmax;
@@ -104,7 +105,8 @@ struct temp_data {
 struct platform_data {
 	struct device *hwmon_dev;
 	u16 phys_proc_id;
-	struct temp_data *core_data[MAX_CORE_DATA];
+	struct list_head temp_data_list;
+	struct mutex temp_data_lock;
 	struct device_attribute name_attr;
 };
 
@@ -117,12 +119,26 @@ struct pdev_entry {
 static LIST_HEAD(pdev_list);
 static DEFINE_MUTEX(pdev_list_mutex);
 
+static struct temp_data *get_temp_data(struct platform_data *pdata, int id)
+{
+	struct temp_data *tdata;
+
+	mutex_lock(&pdata->temp_data_lock);
+	list_for_each_entry(tdata, &pdata->temp_data_list, list)
+		if (tdata->id == id) {
+			mutex_unlock(&pdata->temp_data_lock);
+			return tdata;
+		}
+	mutex_unlock(&pdata->temp_data_lock);
+	return NULL;
+}
+
 static ssize_t show_label(struct device *dev,
 				struct device_attribute *devattr, char *buf)
 {
 	struct sensor_device_attribute *attr = to_sensor_dev_attr(devattr);
 	struct platform_data *pdata = dev_get_drvdata(dev);
-	struct temp_data *tdata = pdata->core_data[attr->index];
+	struct temp_data *tdata = get_temp_data(pdata, attr->index);
 
 	if (tdata->is_pkg_data)
 		return sprintf(buf, "Physical id %u\n", pdata->phys_proc_id);
@@ -136,7 +152,7 @@ static ssize_t show_crit_alarm(struct device *dev,
 	u32 eax, edx;
 	struct sensor_device_attribute *attr = to_sensor_dev_attr(devattr);
 	struct platform_data *pdata = dev_get_drvdata(dev);
-	struct temp_data *tdata = pdata->core_data[attr->index];
+	struct temp_data *tdata = get_temp_data(pdata, attr->index);
 
 	rdmsr_on_cpu(tdata->cpu, tdata->status_reg, &eax, &edx);
 
@@ -148,8 +164,9 @@ static ssize_t show_tjmax(struct device *dev,
 {
 	struct sensor_device_attribute *attr = to_sensor_dev_attr(devattr);
 	struct platform_data *pdata = dev_get_drvdata(dev);
+	struct temp_data *tdata = get_temp_data(pdata, attr->index);
 
-	return sprintf(buf, "%d\n", pdata->core_data[attr->index]->tjmax);
+	return sprintf(buf, "%d\n", tdata->tjmax);
 }
 
 static ssize_t show_ttarget(struct device *dev,
@@ -157,8 +174,9 @@ static ssize_t show_ttarget(struct device *dev,
 {
 	struct sensor_device_attribute *attr = to_sensor_dev_attr(devattr);
 	struct platform_data *pdata = dev_get_drvdata(dev);
+	struct temp_data *tdata = get_temp_data(pdata, attr->index);
 
-	return sprintf(buf, "%d\n", pdata->core_data[attr->index]->ttarget);
+	return sprintf(buf, "%d\n", tdata->ttarget);
 }
 
 static ssize_t show_temp(struct device *dev,
@@ -167,7 +185,7 @@ static ssize_t show_temp(struct device *dev,
 	u32 eax, edx;
 	struct sensor_device_attribute *attr = to_sensor_dev_attr(devattr);
 	struct platform_data *pdata = dev_get_drvdata(dev);
-	struct temp_data *tdata = pdata->core_data[attr->index];
+	struct temp_data *tdata = get_temp_data(pdata, attr->index);
 
 	mutex_lock(&tdata->update_lock);
 
@@ -468,7 +486,7 @@ static struct temp_data *init_temp_data(unsigned int cpu, int pkg_flag)
 	return tdata;
 }
 
-static int create_core_data(struct platform_device *pdev, unsigned int cpu,
+static int create_temp_data(struct platform_device *pdev, unsigned int cpu,
 			    int pkg_flag)
 {
 	struct temp_data *tdata;
@@ -483,10 +501,7 @@ static int create_core_data(struct platform_device *pdev, unsigned int cpu,
 	 * The attr number is always core id + 2
 	 * The Pkgtemp will always show up as temp1_*, if available
 	 */
-	attr_no = pkg_flag ? 1 : TO_ATTR_NO(cpu);
-
-	if (attr_no > MAX_CORE_DATA - 1)
-		return -ERANGE;
+	attr_no = pkg_flag ? PACKAGE_ID : TO_ATTR_NO(cpu);
 
 	/*
 	 * Provide a single set of attributes for all HT siblings of a core
@@ -495,7 +510,8 @@ static int create_core_data(struct platform_device *pdev, unsigned int cpu,
 	 * Skip if a HT sibling of this core is already registered.
 	 * This is not an error.
 	 */
-	if (pdata->core_data[attr_no] != NULL)
+	tdata = get_temp_data(pdata, attr_no);
+	if (tdata)
 		return 0;
 
 	tdata = init_temp_data(cpu, pkg_flag);
@@ -525,16 +541,28 @@ static int create_core_data(struct platform_device *pdev, unsigned int cpu,
 		}
 	}
 
-	pdata->core_data[attr_no] = tdata;
+	tdata->id = attr_no;
+
+	get_online_cpus();
+	mutex_lock(&pdata->temp_data_lock);
+	list_add(&tdata->list, &pdata->temp_data_list);
+	mutex_unlock(&pdata->temp_data_lock);
+	put_online_cpus();
 
 	/* Create sysfs interfaces */
 	err = create_core_attrs(tdata, pdata->hwmon_dev, attr_no);
 	if (err)
-		goto exit_free;
+		goto exit_del;
 
 	return 0;
+
+exit_del:
+	get_online_cpus();
+	mutex_lock(&pdata->temp_data_lock);
+	list_del(&tdata->list);
+	mutex_unlock(&pdata->temp_data_lock);
+	put_online_cpus();
 exit_free:
-	pdata->core_data[attr_no] = NULL;
 	kfree(tdata);
 	return err;
 }
@@ -547,21 +575,21 @@ static void coretemp_add_core(unsigned int cpu, int pkg_flag)
 	if (!pdev)
 		return;
 
-	err = create_core_data(pdev, cpu, pkg_flag);
+	err = create_temp_data(pdev, cpu, pkg_flag);
 	if (err)
 		dev_err(&pdev->dev, "Adding Core %u failed\n", cpu);
 }
 
 static void coretemp_remove_core(struct platform_data *pdata,
-				 int indx)
+					struct temp_data *tdata)
 {
-	struct temp_data *tdata = pdata->core_data[indx];
-
 	/* Remove the sysfs attributes */
 	sysfs_remove_group(&pdata->hwmon_dev->kobj, &tdata->attr_group);
 
-	kfree(pdata->core_data[indx]);
-	pdata->core_data[indx] = NULL;
+	mutex_lock(&pdata->temp_data_lock);
+	list_del(&tdata->list);
+	kfree(tdata);
+	mutex_unlock(&pdata->temp_data_lock);
 }
 
 static int coretemp_probe(struct platform_device *pdev)
@@ -575,6 +603,8 @@ static int coretemp_probe(struct platform_device *pdev)
 		return -ENOMEM;
 
 	pdata->phys_proc_id = pdev->id;
+	INIT_LIST_HEAD(&pdata->temp_data_list);
+	mutex_init(&pdata->temp_data_lock);
 	platform_set_drvdata(pdev, pdata);
 
 	pdata->hwmon_dev = devm_hwmon_device_register_with_groups(dev, DRVNAME,
@@ -585,11 +615,12 @@ static int coretemp_probe(struct platform_device *pdev)
 static int coretemp_remove(struct platform_device *pdev)
 {
 	struct platform_data *pdata = platform_get_drvdata(pdev);
-	int i;
+	struct temp_data *cur, *tmp;
 
-	for (i = MAX_CORE_DATA - 1; i >= 0; --i)
-		if (pdata->core_data[i])
-			coretemp_remove_core(pdata, i);
+	get_online_cpus();
+	list_for_each_entry_safe(cur, tmp, &pdata->temp_data_list, list)
+		coretemp_remove_core(pdata, cur);
+	put_online_cpus();
 
 	return 0;
 }
@@ -664,15 +695,15 @@ static void coretemp_device_remove(unsigned int cpu)
 
 static bool is_any_core_online(struct platform_data *pdata)
 {
-	int i;
+	struct temp_data *tdata;
 
-	/* Find online cores, except pkgtemp data */
-	for (i = MAX_CORE_DATA - 1; i >= 0; --i) {
-		if (pdata->core_data[i] &&
-			!pdata->core_data[i]->is_pkg_data) {
+	mutex_lock(&pdata->temp_data_lock);
+	list_for_each_entry(tdata, &pdata->temp_data_list, list)
+		if (tdata->id != PACKAGE_ID) {
+			mutex_unlock(&pdata->temp_data_lock);
 			return true;
 		}
-	}
+	mutex_unlock(&pdata->temp_data_lock);
 	return false;
 }
 
@@ -720,9 +751,10 @@ static void get_core_online(unsigned int cpu)
 
 static void put_core_offline(unsigned int cpu)
 {
-	int i, indx;
+	int i, attr_no;
 	struct platform_data *pdata;
 	struct platform_device *pdev = coretemp_get_pdev(cpu);
+	struct temp_data *tdata;
 
 	/* If the physical CPU device does not exist, just return */
 	if (!pdev)
@@ -730,15 +762,13 @@ static void put_core_offline(unsigned int cpu)
 
 	pdata = platform_get_drvdata(pdev);
 
-	indx = TO_ATTR_NO(cpu);
-
-	/* The core id is too big, just return */
-	if (indx > MAX_CORE_DATA - 1)
-		return;
-
-	if (pdata->core_data[indx] && pdata->core_data[indx]->cpu == cpu)
-		coretemp_remove_core(pdata, indx);
+	attr_no = TO_ATTR_NO(cpu);
 
+	get_online_cpus();
+	tdata = get_temp_data(pdata, attr_no);
+	if (tdata && tdata->cpu == cpu)
+		coretemp_remove_core(pdata, tdata);
+	put_online_cpus();
 	/*
 	 * If a HT sibling of a core is taken offline, but another HT sibling
 	 * of the same core is still online, register the alternate sibling.
-- 
1.7.1

--------------------------------------------------------------------

Intel Technology Poland sp. z o.o.
ul. Slowackiego 173 | 80-298 Gdansk | Sad Rejonowy Gdansk Polnoc | VII Wydzial Gospodarczy Krajowego Rejestru Sadowego - KRS 101882 | NIP 957-07-52-316 | Kapital zakladowy 200.000 PLN.

Ta wiadomosc wraz z zalacznikami jest przeznaczona dla okreslonego adresata i moze zawierac informacje poufne. W razie przypadkowego otrzymania tej wiadomosci, prosimy o powiadomienie nadawcy oraz trwale jej usuniecie; jakiekolwiek
przegladanie lub rozpowszechnianie jest zabronione.
This e-mail and any attachments may contain confidential material for the sole use of the intended recipient(s). If you are not the intended recipient, please contact the sender and delete all copies; any review or distribution by
others is strictly prohibited.


^ permalink raw reply related	[flat|nested] 52+ messages in thread

* [lm-sensors] [PATCH] hwmon: coretemp: use list instead of fixed size array for temp data
@ 2015-07-15 16:04 ` Lukasz Odzioba
  0 siblings, 0 replies; 52+ messages in thread
From: Lukasz Odzioba @ 2015-07-15 16:04 UTC (permalink / raw)
  To: fenghua.yu; +Cc: jdelvare, linux, lm-sensors, linux-kernel, Lukasz Odzioba

Removes the limits of supported CPU cores and max core ID.


Patch is based on Kirill A. Shutemov's work from 2012.

Signed-off-by: Lukasz Odzioba <lukasz.odzioba@intel.com>
---
 drivers/hwmon/coretemp.c |  120 ++++++++++++++++++++++++++++-----------------
 1 files changed, 75 insertions(+), 45 deletions(-)

diff --git a/drivers/hwmon/coretemp.c b/drivers/hwmon/coretemp.c
index 3e03379..c39ce14 100644
--- a/drivers/hwmon/coretemp.c
+++ b/drivers/hwmon/coretemp.c
@@ -52,11 +52,10 @@ module_param_named(tjmax, force_tjmax, int, 0444);
 MODULE_PARM_DESC(tjmax, "TjMax value in degrees Celsius");
 
 #define BASE_SYSFS_ATTR_NO	2	/* Sysfs Base attr no for coretemp */
-#define NUM_REAL_CORES		32	/* Number of Real cores per cpu */
 #define CORETEMP_NAME_LENGTH	19	/* String Length of attrs */
 #define MAX_CORE_ATTRS		4	/* Maximum no of basic attrs */
 #define TOTAL_ATTRS		(MAX_CORE_ATTRS + 1)
-#define MAX_CORE_DATA		(NUM_REAL_CORES + BASE_SYSFS_ATTR_NO)
+#define PACKAGE_ID		1	/* Magic number of physical cpu */
 
 #define TO_PHYS_ID(cpu)		(cpu_data(cpu).phys_proc_id)
 #define TO_CORE_ID(cpu)		(cpu_data(cpu).cpu_core_id)
@@ -71,18 +70,20 @@ MODULE_PARM_DESC(tjmax, "TjMax value in degrees Celsius");
 
 /*
  * Per-Core Temperature Data
+ * @id: If this is equal PACKAGE_ID, structure contains package temperature
+		data, otherwise it is just TO_ATTR_NO(cpu)
  * @last_updated: The time when the current temperature value was updated
  *		earlier (in jiffies).
  * @cpu_core_id: The CPU Core from which temperature values should be read
  *		This value is passed as "id" field to rdmsr/wrmsr functions.
  * @status_reg: One of IA32_THERM_STATUS or IA32_PACKAGE_THERM_STATUS,
  *		from where the temperature values should be read.
- * @attr_size:  Total number of pre-core attrs displayed in the sysfs.
- * @is_pkg_data: If this is 1, the temp_data holds pkgtemp data.
- *		Otherwise, temp_data holds coretemp data.
+ * @attr_size:  Total number of per-core attrs displayed in the sysfs.
  * @valid: If this is 1, the current temperature is valid.
  */
 struct temp_data {
+	struct list_head list;
+	int id;
 	int temp;
 	int ttarget;
 	int tjmax;
@@ -104,7 +105,8 @@ struct temp_data {
 struct platform_data {
 	struct device *hwmon_dev;
 	u16 phys_proc_id;
-	struct temp_data *core_data[MAX_CORE_DATA];
+	struct list_head temp_data_list;
+	struct mutex temp_data_lock;
 	struct device_attribute name_attr;
 };
 
@@ -117,12 +119,26 @@ struct pdev_entry {
 static LIST_HEAD(pdev_list);
 static DEFINE_MUTEX(pdev_list_mutex);
 
+static struct temp_data *get_temp_data(struct platform_data *pdata, int id)
+{
+	struct temp_data *tdata;
+
+	mutex_lock(&pdata->temp_data_lock);
+	list_for_each_entry(tdata, &pdata->temp_data_list, list)
+		if (tdata->id = id) {
+			mutex_unlock(&pdata->temp_data_lock);
+			return tdata;
+		}
+	mutex_unlock(&pdata->temp_data_lock);
+	return NULL;
+}
+
 static ssize_t show_label(struct device *dev,
 				struct device_attribute *devattr, char *buf)
 {
 	struct sensor_device_attribute *attr = to_sensor_dev_attr(devattr);
 	struct platform_data *pdata = dev_get_drvdata(dev);
-	struct temp_data *tdata = pdata->core_data[attr->index];
+	struct temp_data *tdata = get_temp_data(pdata, attr->index);
 
 	if (tdata->is_pkg_data)
 		return sprintf(buf, "Physical id %u\n", pdata->phys_proc_id);
@@ -136,7 +152,7 @@ static ssize_t show_crit_alarm(struct device *dev,
 	u32 eax, edx;
 	struct sensor_device_attribute *attr = to_sensor_dev_attr(devattr);
 	struct platform_data *pdata = dev_get_drvdata(dev);
-	struct temp_data *tdata = pdata->core_data[attr->index];
+	struct temp_data *tdata = get_temp_data(pdata, attr->index);
 
 	rdmsr_on_cpu(tdata->cpu, tdata->status_reg, &eax, &edx);
 
@@ -148,8 +164,9 @@ static ssize_t show_tjmax(struct device *dev,
 {
 	struct sensor_device_attribute *attr = to_sensor_dev_attr(devattr);
 	struct platform_data *pdata = dev_get_drvdata(dev);
+	struct temp_data *tdata = get_temp_data(pdata, attr->index);
 
-	return sprintf(buf, "%d\n", pdata->core_data[attr->index]->tjmax);
+	return sprintf(buf, "%d\n", tdata->tjmax);
 }
 
 static ssize_t show_ttarget(struct device *dev,
@@ -157,8 +174,9 @@ static ssize_t show_ttarget(struct device *dev,
 {
 	struct sensor_device_attribute *attr = to_sensor_dev_attr(devattr);
 	struct platform_data *pdata = dev_get_drvdata(dev);
+	struct temp_data *tdata = get_temp_data(pdata, attr->index);
 
-	return sprintf(buf, "%d\n", pdata->core_data[attr->index]->ttarget);
+	return sprintf(buf, "%d\n", tdata->ttarget);
 }
 
 static ssize_t show_temp(struct device *dev,
@@ -167,7 +185,7 @@ static ssize_t show_temp(struct device *dev,
 	u32 eax, edx;
 	struct sensor_device_attribute *attr = to_sensor_dev_attr(devattr);
 	struct platform_data *pdata = dev_get_drvdata(dev);
-	struct temp_data *tdata = pdata->core_data[attr->index];
+	struct temp_data *tdata = get_temp_data(pdata, attr->index);
 
 	mutex_lock(&tdata->update_lock);
 
@@ -468,7 +486,7 @@ static struct temp_data *init_temp_data(unsigned int cpu, int pkg_flag)
 	return tdata;
 }
 
-static int create_core_data(struct platform_device *pdev, unsigned int cpu,
+static int create_temp_data(struct platform_device *pdev, unsigned int cpu,
 			    int pkg_flag)
 {
 	struct temp_data *tdata;
@@ -483,10 +501,7 @@ static int create_core_data(struct platform_device *pdev, unsigned int cpu,
 	 * The attr number is always core id + 2
 	 * The Pkgtemp will always show up as temp1_*, if available
 	 */
-	attr_no = pkg_flag ? 1 : TO_ATTR_NO(cpu);
-
-	if (attr_no > MAX_CORE_DATA - 1)
-		return -ERANGE;
+	attr_no = pkg_flag ? PACKAGE_ID : TO_ATTR_NO(cpu);
 
 	/*
 	 * Provide a single set of attributes for all HT siblings of a core
@@ -495,7 +510,8 @@ static int create_core_data(struct platform_device *pdev, unsigned int cpu,
 	 * Skip if a HT sibling of this core is already registered.
 	 * This is not an error.
 	 */
-	if (pdata->core_data[attr_no] != NULL)
+	tdata = get_temp_data(pdata, attr_no);
+	if (tdata)
 		return 0;
 
 	tdata = init_temp_data(cpu, pkg_flag);
@@ -525,16 +541,28 @@ static int create_core_data(struct platform_device *pdev, unsigned int cpu,
 		}
 	}
 
-	pdata->core_data[attr_no] = tdata;
+	tdata->id = attr_no;
+
+	get_online_cpus();
+	mutex_lock(&pdata->temp_data_lock);
+	list_add(&tdata->list, &pdata->temp_data_list);
+	mutex_unlock(&pdata->temp_data_lock);
+	put_online_cpus();
 
 	/* Create sysfs interfaces */
 	err = create_core_attrs(tdata, pdata->hwmon_dev, attr_no);
 	if (err)
-		goto exit_free;
+		goto exit_del;
 
 	return 0;
+
+exit_del:
+	get_online_cpus();
+	mutex_lock(&pdata->temp_data_lock);
+	list_del(&tdata->list);
+	mutex_unlock(&pdata->temp_data_lock);
+	put_online_cpus();
 exit_free:
-	pdata->core_data[attr_no] = NULL;
 	kfree(tdata);
 	return err;
 }
@@ -547,21 +575,21 @@ static void coretemp_add_core(unsigned int cpu, int pkg_flag)
 	if (!pdev)
 		return;
 
-	err = create_core_data(pdev, cpu, pkg_flag);
+	err = create_temp_data(pdev, cpu, pkg_flag);
 	if (err)
 		dev_err(&pdev->dev, "Adding Core %u failed\n", cpu);
 }
 
 static void coretemp_remove_core(struct platform_data *pdata,
-				 int indx)
+					struct temp_data *tdata)
 {
-	struct temp_data *tdata = pdata->core_data[indx];
-
 	/* Remove the sysfs attributes */
 	sysfs_remove_group(&pdata->hwmon_dev->kobj, &tdata->attr_group);
 
-	kfree(pdata->core_data[indx]);
-	pdata->core_data[indx] = NULL;
+	mutex_lock(&pdata->temp_data_lock);
+	list_del(&tdata->list);
+	kfree(tdata);
+	mutex_unlock(&pdata->temp_data_lock);
 }
 
 static int coretemp_probe(struct platform_device *pdev)
@@ -575,6 +603,8 @@ static int coretemp_probe(struct platform_device *pdev)
 		return -ENOMEM;
 
 	pdata->phys_proc_id = pdev->id;
+	INIT_LIST_HEAD(&pdata->temp_data_list);
+	mutex_init(&pdata->temp_data_lock);
 	platform_set_drvdata(pdev, pdata);
 
 	pdata->hwmon_dev = devm_hwmon_device_register_with_groups(dev, DRVNAME,
@@ -585,11 +615,12 @@ static int coretemp_probe(struct platform_device *pdev)
 static int coretemp_remove(struct platform_device *pdev)
 {
 	struct platform_data *pdata = platform_get_drvdata(pdev);
-	int i;
+	struct temp_data *cur, *tmp;
 
-	for (i = MAX_CORE_DATA - 1; i >= 0; --i)
-		if (pdata->core_data[i])
-			coretemp_remove_core(pdata, i);
+	get_online_cpus();
+	list_for_each_entry_safe(cur, tmp, &pdata->temp_data_list, list)
+		coretemp_remove_core(pdata, cur);
+	put_online_cpus();
 
 	return 0;
 }
@@ -664,15 +695,15 @@ static void coretemp_device_remove(unsigned int cpu)
 
 static bool is_any_core_online(struct platform_data *pdata)
 {
-	int i;
+	struct temp_data *tdata;
 
-	/* Find online cores, except pkgtemp data */
-	for (i = MAX_CORE_DATA - 1; i >= 0; --i) {
-		if (pdata->core_data[i] &&
-			!pdata->core_data[i]->is_pkg_data) {
+	mutex_lock(&pdata->temp_data_lock);
+	list_for_each_entry(tdata, &pdata->temp_data_list, list)
+		if (tdata->id != PACKAGE_ID) {
+			mutex_unlock(&pdata->temp_data_lock);
 			return true;
 		}
-	}
+	mutex_unlock(&pdata->temp_data_lock);
 	return false;
 }
 
@@ -720,9 +751,10 @@ static void get_core_online(unsigned int cpu)
 
 static void put_core_offline(unsigned int cpu)
 {
-	int i, indx;
+	int i, attr_no;
 	struct platform_data *pdata;
 	struct platform_device *pdev = coretemp_get_pdev(cpu);
+	struct temp_data *tdata;
 
 	/* If the physical CPU device does not exist, just return */
 	if (!pdev)
@@ -730,15 +762,13 @@ static void put_core_offline(unsigned int cpu)
 
 	pdata = platform_get_drvdata(pdev);
 
-	indx = TO_ATTR_NO(cpu);
-
-	/* The core id is too big, just return */
-	if (indx > MAX_CORE_DATA - 1)
-		return;
-
-	if (pdata->core_data[indx] && pdata->core_data[indx]->cpu = cpu)
-		coretemp_remove_core(pdata, indx);
+	attr_no = TO_ATTR_NO(cpu);
 
+	get_online_cpus();
+	tdata = get_temp_data(pdata, attr_no);
+	if (tdata && tdata->cpu = cpu)
+		coretemp_remove_core(pdata, tdata);
+	put_online_cpus();
 	/*
 	 * If a HT sibling of a core is taken offline, but another HT sibling
 	 * of the same core is still online, register the alternate sibling.
-- 
1.7.1

--------------------------------------------------------------------

Intel Technology Poland sp. z o.o.
ul. Slowackiego 173 | 80-298 Gdansk | Sad Rejonowy Gdansk Polnoc | VII Wydzial Gospodarczy Krajowego Rejestru Sadowego - KRS 101882 | NIP 957-07-52-316 | Kapital zakladowy 200.000 PLN.

Ta wiadomosc wraz z zalacznikami jest przeznaczona dla okreslonego adresata i moze zawierac informacje poufne. W razie przypadkowego otrzymania tej wiadomosci, prosimy o powiadomienie nadawcy oraz trwale jej usuniecie; jakiekolwiek
przegladanie lub rozpowszechnianie jest zabronione.
This e-mail and any attachments may contain confidential material for the sole use of the intended recipient(s). If you are not the intended recipient, please contact the sender and delete all copies; any review or distribution by
others is strictly prohibited.


_______________________________________________
lm-sensors mailing list
lm-sensors@lm-sensors.org
http://lists.lm-sensors.org/mailman/listinfo/lm-sensors

^ permalink raw reply related	[flat|nested] 52+ messages in thread

* Re: [PATCH] hwmon: coretemp: use list instead of fixed size array for temp data
  2015-07-15 16:04 ` [lm-sensors] " Lukasz Odzioba
@ 2015-07-15 21:07   ` Jean Delvare
  -1 siblings, 0 replies; 52+ messages in thread
From: Jean Delvare @ 2015-07-15 21:07 UTC (permalink / raw)
  To: Lukasz Odzioba; +Cc: fenghua.yu, linux, lm-sensors, linux-kernel

On Wed, 15 Jul 2015 18:04:13 +0200, Lukasz Odzioba wrote:
> Removes the limits of supported CPU cores and max core ID.

I see the benefit of removing the arbitrary limit, but why use a list
instead of a dynamically allocated array? This is turning a O(1)
algorithm into a O(n) algorithm. I know n isn't too large in this case
but I still consider it bad practice if it can be avoided.

Do you expect core IDs to become arbitrarily large? Significantly
larger than the core count?

You need a better patch description for sure. Saying what the patch
does isn't sufficient, you need to explain why this is needed and why
this is the right way to do it.

-- 
Jean Delvare
SUSE L3 Support

^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: [lm-sensors] [PATCH] hwmon: coretemp: use list instead of fixed size array for temp data
@ 2015-07-15 21:07   ` Jean Delvare
  0 siblings, 0 replies; 52+ messages in thread
From: Jean Delvare @ 2015-07-15 21:07 UTC (permalink / raw)
  To: Lukasz Odzioba; +Cc: fenghua.yu, linux, lm-sensors, linux-kernel

On Wed, 15 Jul 2015 18:04:13 +0200, Lukasz Odzioba wrote:
> Removes the limits of supported CPU cores and max core ID.

I see the benefit of removing the arbitrary limit, but why use a list
instead of a dynamically allocated array? This is turning a O(1)
algorithm into a O(n) algorithm. I know n isn't too large in this case
but I still consider it bad practice if it can be avoided.

Do you expect core IDs to become arbitrarily large? Significantly
larger than the core count?

You need a better patch description for sure. Saying what the patch
does isn't sufficient, you need to explain why this is needed and why
this is the right way to do it.

-- 
Jean Delvare
SUSE L3 Support

_______________________________________________
lm-sensors mailing list
lm-sensors@lm-sensors.org
http://lists.lm-sensors.org/mailman/listinfo/lm-sensors

^ permalink raw reply	[flat|nested] 52+ messages in thread

* RE: [PATCH] hwmon: coretemp: use list instead of fixed size array for temp data
  2015-07-15 21:07   ` [lm-sensors] " Jean Delvare
@ 2015-07-16 13:17     ` Odzioba, Lukasz
  -1 siblings, 0 replies; 52+ messages in thread
From: Odzioba, Lukasz @ 2015-07-16 13:17 UTC (permalink / raw)
  To: Jean Delvare; +Cc: linux, Yu, Fenghua, lm-sensors, linux-kernel

On  Wednesday, July 15, 2015 11:08 PM Jean Delvare wrote:
> I see the benefit of removing the arbitrary limit, but why use a list
> instead of a dynamically allocated array? This is turning a O(1)
> algorithm into a O(n) algorithm. I know n isn't too large in this case
> but I still consider it bad practice if it can be avoided.

This patch tries to solve two problems which are present on current hardware:
-cpus with more than 32 cores
-core id is greater than NUM_REAL_CORES

In both cases it ends up with the following error in dmesg:
coretemp coretemp.0: Adding Core XXX failed

We could just bump NUM_REAL_CORES to 128 like we did in 2012 and call it 
solved, but the problem will come back eventually and it is waste of 
memory for cpus with handful of cores.

If there is way to obtain maximum core id during module initialization,
then we can allocate array and keep O(1) access. If we can't figure out
maximum core id then we can increase size of the array when new cores are
added. The problem with this is that core id enumeration can be sparse so
again we have waste of memory.

> Do you expect core IDs to become arbitrarily large?
> Significantly larger than the core count?

The question is what does significantly mean.
According to Documentation/cputopology.txt:
---
2) /sys/devices/system/cpu/cpuX/topology/core_id:

	the CPU core ID of cpuX. Typically it is the hardware platform's
	identifier (rather than the kernel's).  The actual value is
	architecture and platform dependent.
---

Even now we can have one core present with id like 60 (think of Xeon Phi).
I haven't seen in the wild insane core ids like thousands, but I don't see
a reason why we shouldn't handle it in a proper manner.

Current patch does not use more memory than it is needed, but the pitfall is
that it increased access complexity from O(1) to O(n). We could slide another
patch on top of this one to reduce access complexity from O(n) to O(logn)
by using i.e. radix tree. I preferred to send functional fix in the first
place, and then work on optimization if there is a concern about it.
Forgive me if it is not appropriate.


> You need a better patch description for sure. Saying what the patch does
> isn't sufficient, you need to explain why this is needed and why this is
> the right way to do it.

Thanks, I'll address that in the next revision, but first we need to figure out
the way to go. If this patch is not appropriate, then it is a follow up to 
start discussion on how to fix those two problems I mentioned. 

Thanks,
Lukas
--------------------------------------------------------------------

Intel Technology Poland sp. z o.o.
ul. Slowackiego 173 | 80-298 Gdansk | Sad Rejonowy Gdansk Polnoc | VII Wydzial Gospodarczy Krajowego Rejestru Sadowego - KRS 101882 | NIP 957-07-52-316 | Kapital zakladowy 200.000 PLN.

Ta wiadomosc wraz z zalacznikami jest przeznaczona dla okreslonego adresata i moze zawierac informacje poufne. W razie przypadkowego otrzymania tej wiadomosci, prosimy o powiadomienie nadawcy oraz trwale jej usuniecie; jakiekolwiek
przegladanie lub rozpowszechnianie jest zabronione.
This e-mail and any attachments may contain confidential material for the sole use of the intended recipient(s). If you are not the intended recipient, please contact the sender and delete all copies; any review or distribution by
others is strictly prohibited.


^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: [lm-sensors] [PATCH] hwmon: coretemp: use list instead of fixed size array for temp data
@ 2015-07-16 13:17     ` Odzioba, Lukasz
  0 siblings, 0 replies; 52+ messages in thread
From: Odzioba, Lukasz @ 2015-07-16 13:17 UTC (permalink / raw)
  To: Jean Delvare; +Cc: linux, Yu, Fenghua, lm-sensors, linux-kernel

On  Wednesday, July 15, 2015 11:08 PM Jean Delvare wrote:
> I see the benefit of removing the arbitrary limit, but why use a list
> instead of a dynamically allocated array? This is turning a O(1)
> algorithm into a O(n) algorithm. I know n isn't too large in this case
> but I still consider it bad practice if it can be avoided.

This patch tries to solve two problems which are present on current hardware:
-cpus with more than 32 cores
-core id is greater than NUM_REAL_CORES

In both cases it ends up with the following error in dmesg:
coretemp coretemp.0: Adding Core XXX failed

We could just bump NUM_REAL_CORES to 128 like we did in 2012 and call it 
solved, but the problem will come back eventually and it is waste of 
memory for cpus with handful of cores.

If there is way to obtain maximum core id during module initialization,
then we can allocate array and keep O(1) access. If we can't figure out
maximum core id then we can increase size of the array when new cores are
added. The problem with this is that core id enumeration can be sparse so
again we have waste of memory.

> Do you expect core IDs to become arbitrarily large?
> Significantly larger than the core count?

The question is what does significantly mean.
According to Documentation/cputopology.txt:
---
2) /sys/devices/system/cpu/cpuX/topology/core_id:

	the CPU core ID of cpuX. Typically it is the hardware platform's
	identifier (rather than the kernel's).  The actual value is
	architecture and platform dependent.
---

Even now we can have one core present with id like 60 (think of Xeon Phi).
I haven't seen in the wild insane core ids like thousands, but I don't see
a reason why we shouldn't handle it in a proper manner.

Current patch does not use more memory than it is needed, but the pitfall is
that it increased access complexity from O(1) to O(n). We could slide another
patch on top of this one to reduce access complexity from O(n) to O(logn)
by using i.e. radix tree. I preferred to send functional fix in the first
place, and then work on optimization if there is a concern about it.
Forgive me if it is not appropriate.


> You need a better patch description for sure. Saying what the patch does
> isn't sufficient, you need to explain why this is needed and why this is
> the right way to do it.

Thanks, I'll address that in the next revision, but first we need to figure out
the way to go. If this patch is not appropriate, then it is a follow up to 
start discussion on how to fix those two problems I mentioned. 

Thanks,
Lukas
--------------------------------------------------------------------

Intel Technology Poland sp. z o.o.
ul. Slowackiego 173 | 80-298 Gdansk | Sad Rejonowy Gdansk Polnoc | VII Wydzial Gospodarczy Krajowego Rejestru Sadowego - KRS 101882 | NIP 957-07-52-316 | Kapital zakladowy 200.000 PLN.

Ta wiadomosc wraz z zalacznikami jest przeznaczona dla okreslonego adresata i moze zawierac informacje poufne. W razie przypadkowego otrzymania tej wiadomosci, prosimy o powiadomienie nadawcy oraz trwale jej usuniecie; jakiekolwiek
przegladanie lub rozpowszechnianie jest zabronione.
This e-mail and any attachments may contain confidential material for the sole use of the intended recipient(s). If you are not the intended recipient, please contact the sender and delete all copies; any review or distribution by
others is strictly prohibited.


_______________________________________________
lm-sensors mailing list
lm-sensors@lm-sensors.org
http://lists.lm-sensors.org/mailman/listinfo/lm-sensors

^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: [PATCH] hwmon: coretemp: use list instead of fixed size array for temp data
  2015-07-16 13:17     ` [lm-sensors] " Odzioba, Lukasz
@ 2015-07-17 16:55       ` Guenter Roeck
  -1 siblings, 0 replies; 52+ messages in thread
From: Guenter Roeck @ 2015-07-17 16:55 UTC (permalink / raw)
  To: Odzioba, Lukasz, Jean Delvare; +Cc: Yu, Fenghua, lm-sensors, linux-kernel

On 07/16/2015 06:17 AM, Odzioba, Lukasz wrote:
> On  Wednesday, July 15, 2015 11:08 PM Jean Delvare wrote:
>> I see the benefit of removing the arbitrary limit, but why use a list
>> instead of a dynamically allocated array? This is turning a O(1)
>> algorithm into a O(n) algorithm. I know n isn't too large in this case
>> but I still consider it bad practice if it can be avoided.
>
> This patch tries to solve two problems which are present on current hardware:
> -cpus with more than 32 cores
> -core id is greater than NUM_REAL_CORES
>
> In both cases it ends up with the following error in dmesg:
> coretemp coretemp.0: Adding Core XXX failed
>
> We could just bump NUM_REAL_CORES to 128 like we did in 2012 and call it
> solved, but the problem will come back eventually and it is waste of
> memory for cpus with handful of cores.
>
> If there is way to obtain maximum core id during module initialization,
> then we can allocate array and keep O(1) access. If we can't figure out
> maximum core id then we can increase size of the array when new cores are
> added. The problem with this is that core id enumeration can be sparse so
> again we have waste of memory.
>
>> Do you expect core IDs to become arbitrarily large?
>> Significantly larger than the core count?
>
> The question is what does significantly mean.
> According to Documentation/cputopology.txt:
> ---
> 2) /sys/devices/system/cpu/cpuX/topology/core_id:
>
> 	the CPU core ID of cpuX. Typically it is the hardware platform's
> 	identifier (rather than the kernel's).  The actual value is
> 	architecture and platform dependent.
> ---
>
> Even now we can have one core present with id like 60 (think of Xeon Phi).
> I haven't seen in the wild insane core ids like thousands, but I don't see
> a reason why we shouldn't handle it in a proper manner.
>
> Current patch does not use more memory than it is needed, but the pitfall is
> that it increased access complexity from O(1) to O(n). We could slide another
> patch on top of this one to reduce access complexity from O(n) to O(logn)
> by using i.e. radix tree. I preferred to send functional fix in the first
> place, and then work on optimization if there is a concern about it.
> Forgive me if it is not appropriate.
>

You don't really explain why your approach would be better than allocating
an array of pointers to struct temp_data and increasing its size using
krealloc if needed.

Guenter


^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: [lm-sensors] [PATCH] hwmon: coretemp: use list instead of fixed size array for temp data
@ 2015-07-17 16:55       ` Guenter Roeck
  0 siblings, 0 replies; 52+ messages in thread
From: Guenter Roeck @ 2015-07-17 16:55 UTC (permalink / raw)
  To: Odzioba, Lukasz, Jean Delvare; +Cc: Yu, Fenghua, lm-sensors, linux-kernel

On 07/16/2015 06:17 AM, Odzioba, Lukasz wrote:
> On  Wednesday, July 15, 2015 11:08 PM Jean Delvare wrote:
>> I see the benefit of removing the arbitrary limit, but why use a list
>> instead of a dynamically allocated array? This is turning a O(1)
>> algorithm into a O(n) algorithm. I know n isn't too large in this case
>> but I still consider it bad practice if it can be avoided.
>
> This patch tries to solve two problems which are present on current hardware:
> -cpus with more than 32 cores
> -core id is greater than NUM_REAL_CORES
>
> In both cases it ends up with the following error in dmesg:
> coretemp coretemp.0: Adding Core XXX failed
>
> We could just bump NUM_REAL_CORES to 128 like we did in 2012 and call it
> solved, but the problem will come back eventually and it is waste of
> memory for cpus with handful of cores.
>
> If there is way to obtain maximum core id during module initialization,
> then we can allocate array and keep O(1) access. If we can't figure out
> maximum core id then we can increase size of the array when new cores are
> added. The problem with this is that core id enumeration can be sparse so
> again we have waste of memory.
>
>> Do you expect core IDs to become arbitrarily large?
>> Significantly larger than the core count?
>
> The question is what does significantly mean.
> According to Documentation/cputopology.txt:
> ---
> 2) /sys/devices/system/cpu/cpuX/topology/core_id:
>
> 	the CPU core ID of cpuX. Typically it is the hardware platform's
> 	identifier (rather than the kernel's).  The actual value is
> 	architecture and platform dependent.
> ---
>
> Even now we can have one core present with id like 60 (think of Xeon Phi).
> I haven't seen in the wild insane core ids like thousands, but I don't see
> a reason why we shouldn't handle it in a proper manner.
>
> Current patch does not use more memory than it is needed, but the pitfall is
> that it increased access complexity from O(1) to O(n). We could slide another
> patch on top of this one to reduce access complexity from O(n) to O(logn)
> by using i.e. radix tree. I preferred to send functional fix in the first
> place, and then work on optimization if there is a concern about it.
> Forgive me if it is not appropriate.
>

You don't really explain why your approach would be better than allocating
an array of pointers to struct temp_data and increasing its size using
krealloc if needed.

Guenter


_______________________________________________
lm-sensors mailing list
lm-sensors@lm-sensors.org
http://lists.lm-sensors.org/mailman/listinfo/lm-sensors

^ permalink raw reply	[flat|nested] 52+ messages in thread

* RE: [PATCH] hwmon: coretemp: use list instead of fixed size array for temp data
  2015-07-17 16:55       ` [lm-sensors] " Guenter Roeck
@ 2015-07-17 17:28         ` Odzioba, Lukasz
  -1 siblings, 0 replies; 52+ messages in thread
From: Odzioba, Lukasz @ 2015-07-17 17:28 UTC (permalink / raw)
  To: Guenter Roeck, Jean Delvare; +Cc: Yu, Fenghua, lm-sensors, linux-kernel

From: Guenter Roeck [mailto:linux@roeck-us.net] 
On Friday, July 17, 2015 6:55 PM Guenter Roeck wrote:

> You don't really explain why your approach would be better than
> allocating an array of pointers to struct temp_data and increasing
> its size using krealloc if needed.

Let's consider two cases of such implementation:
a) we use array of pointers with O(n) access algorithm
b) we use array of pointers with O(1) access algorithm

In both cases an array will have greater memory footprint unless
we implement reallocation ourselves when cpus are disabled which will
make code harder to maintain.
Case b) does not handle huge core ids and sparse enumeration well -
it is still to discuss whether we really need it since there is no
such hardware yet.

I am not saying that my solution is the best of possible ones.
I am saying that "the best" can vary depending on which criteria do you
choose from (time, memory, clean code...). Some may say that O(n) is
fine unless we have thousands of cores and this code is not on hot path,
others may be concerned more about memory on small/old devices.
I don't see holy grail here, If you see one please let me know.

If you think that we don't have to care so much about memory,
then I can create another patch which uses array instead of list.

Thanks,
Lukas

^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: [lm-sensors] [PATCH] hwmon: coretemp: use list instead of fixed size array for temp data
@ 2015-07-17 17:28         ` Odzioba, Lukasz
  0 siblings, 0 replies; 52+ messages in thread
From: Odzioba, Lukasz @ 2015-07-17 17:28 UTC (permalink / raw)
  To: Guenter Roeck, Jean Delvare; +Cc: Yu, Fenghua, lm-sensors, linux-kernel

From: Guenter Roeck [mailto:linux@roeck-us.net] 
On Friday, July 17, 2015 6:55 PM Guenter Roeck wrote:

> You don't really explain why your approach would be better than
> allocating an array of pointers to struct temp_data and increasing
> its size using krealloc if needed.

Let's consider two cases of such implementation:
a) we use array of pointers with O(n) access algorithm
b) we use array of pointers with O(1) access algorithm

In both cases an array will have greater memory footprint unless
we implement reallocation ourselves when cpus are disabled which will
make code harder to maintain.
Case b) does not handle huge core ids and sparse enumeration well -
it is still to discuss whether we really need it since there is no
such hardware yet.

I am not saying that my solution is the best of possible ones.
I am saying that "the best" can vary depending on which criteria do you
choose from (time, memory, clean code...). Some may say that O(n) is
fine unless we have thousands of cores and this code is not on hot path,
others may be concerned more about memory on small/old devices.
I don't see holy grail here, If you see one please let me know.

If you think that we don't have to care so much about memory,
then I can create another patch which uses array instead of list.

Thanks,
Lukas

_______________________________________________
lm-sensors mailing list
lm-sensors@lm-sensors.org
http://lists.lm-sensors.org/mailman/listinfo/lm-sensors

^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: [PATCH] hwmon: coretemp: use list instead of fixed size array for temp data
  2015-07-17 17:28         ` [lm-sensors] " Odzioba, Lukasz
@ 2015-07-17 18:01           ` Guenter Roeck
  -1 siblings, 0 replies; 52+ messages in thread
From: Guenter Roeck @ 2015-07-17 18:01 UTC (permalink / raw)
  To: Odzioba, Lukasz, Jean Delvare; +Cc: Yu, Fenghua, lm-sensors, linux-kernel

On 07/17/2015 10:28 AM, Odzioba, Lukasz wrote:
> From: Guenter Roeck [mailto:linux@roeck-us.net]
> On Friday, July 17, 2015 6:55 PM Guenter Roeck wrote:
>
>> You don't really explain why your approach would be better than
>> allocating an array of pointers to struct temp_data and increasing
>> its size using krealloc if needed.
>
> Let's consider two cases of such implementation:
> a) we use array of pointers with O(n) access algorithm
> b) we use array of pointers with O(1) access algorithm
>
> In both cases an array will have greater memory footprint unless
> we implement reallocation ourselves when cpus are disabled which will
> make code harder to maintain.

Please explain why krealloc() won't work, why using krealloc(()
would result in a larger memory footprint than using lists,
and why disabling CPUs would require any action in the first place.

> Case b) does not handle huge core ids and sparse enumeration well -
> it is still to discuss whether we really need it since there is no
> such hardware yet.
>
"yet" is a key term here. Presumably you have insider information.
Unless you can share this information, I don't see the point of
replacing an O(1) algorithm with O(n), especially since there
is a relatively simple alternative available to support more CPUs.

> I am not saying that my solution is the best of possible ones.
> I am saying that "the best" can vary depending on which criteria do you
> choose from (time, memory, clean code...). Some may say that O(n) is
> fine unless we have thousands of cores and this code is not on hot path,
> others may be concerned more about memory on small/old devices.
> I don't see holy grail here, If you see one please let me know.
>

Unless you clarify that Intel will introduce CPU IDs which can not be used
as array index because they are too sparse, I don't really see how the list
solution would consume less memory than an array of pointers, even if the
array is somewhat sparse. After all, a list consumes at least two pointers
per entry.

Thanks,
Guenter


^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: [lm-sensors] [PATCH] hwmon: coretemp: use list instead of fixed size array for temp data
@ 2015-07-17 18:01           ` Guenter Roeck
  0 siblings, 0 replies; 52+ messages in thread
From: Guenter Roeck @ 2015-07-17 18:01 UTC (permalink / raw)
  To: Odzioba, Lukasz, Jean Delvare; +Cc: Yu, Fenghua, lm-sensors, linux-kernel

On 07/17/2015 10:28 AM, Odzioba, Lukasz wrote:
> From: Guenter Roeck [mailto:linux@roeck-us.net]
> On Friday, July 17, 2015 6:55 PM Guenter Roeck wrote:
>
>> You don't really explain why your approach would be better than
>> allocating an array of pointers to struct temp_data and increasing
>> its size using krealloc if needed.
>
> Let's consider two cases of such implementation:
> a) we use array of pointers with O(n) access algorithm
> b) we use array of pointers with O(1) access algorithm
>
> In both cases an array will have greater memory footprint unless
> we implement reallocation ourselves when cpus are disabled which will
> make code harder to maintain.

Please explain why krealloc() won't work, why using krealloc(()
would result in a larger memory footprint than using lists,
and why disabling CPUs would require any action in the first place.

> Case b) does not handle huge core ids and sparse enumeration well -
> it is still to discuss whether we really need it since there is no
> such hardware yet.
>
"yet" is a key term here. Presumably you have insider information.
Unless you can share this information, I don't see the point of
replacing an O(1) algorithm with O(n), especially since there
is a relatively simple alternative available to support more CPUs.

> I am not saying that my solution is the best of possible ones.
> I am saying that "the best" can vary depending on which criteria do you
> choose from (time, memory, clean code...). Some may say that O(n) is
> fine unless we have thousands of cores and this code is not on hot path,
> others may be concerned more about memory on small/old devices.
> I don't see holy grail here, If you see one please let me know.
>

Unless you clarify that Intel will introduce CPU IDs which can not be used
as array index because they are too sparse, I don't really see how the list
solution would consume less memory than an array of pointers, even if the
array is somewhat sparse. After all, a list consumes at least two pointers
per entry.

Thanks,
Guenter


_______________________________________________
lm-sensors mailing list
lm-sensors@lm-sensors.org
http://lists.lm-sensors.org/mailman/listinfo/lm-sensors

^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: [PATCH] hwmon: coretemp: use list instead of fixed size array for temp data
  2015-07-17 17:28         ` [lm-sensors] " Odzioba, Lukasz
@ 2015-07-17 19:11           ` Jean Delvare
  -1 siblings, 0 replies; 52+ messages in thread
From: Jean Delvare @ 2015-07-17 19:11 UTC (permalink / raw)
  To: Odzioba, Lukasz; +Cc: Guenter Roeck, Yu, Fenghua, lm-sensors, linux-kernel

On Fri, 17 Jul 2015 17:28:20 +0000, Odzioba, Lukasz wrote:
> From: Guenter Roeck [mailto:linux@roeck-us.net] 
> On Friday, July 17, 2015 6:55 PM Guenter Roeck wrote:
> 
> > You don't really explain why your approach would be better than
> > allocating an array of pointers to struct temp_data and increasing
> > its size using krealloc if needed.
> 
> Let's consider two cases of such implementation:
> a) we use array of pointers with O(n) access algorithm
> b) we use array of pointers with O(1) access algorithm
> 
> In both cases an array will have greater memory footprint unless
> we implement reallocation ourselves when cpus are disabled which will
> make code harder to maintain.

I see no reason to reallocate when CPUs are disabled. This is rare
enough that I very much doubt we care.

> Case b) does not handle huge core ids and sparse enumeration well -
> it is still to discuss whether we really need it since there is no
> such hardware yet.

If you don't have a use case, I see no reason to change anything.

If you have a use case, it would be nice to tell us what it is, so that
we can make better comments on your proposal.

> I am not saying that my solution is the best of possible ones.
> I am saying that "the best" can vary depending on which criteria do you
> choose from (time, memory, clean code...). Some may say that O(n) is
> fine unless we have thousands of cores and this code is not on hot path,
> others may be concerned more about memory on small/old devices.
> I don't see holy grail here, If you see one please let me know.

The problem is that the lookup algorithm is only one piece of the
puzzle. When a user runs "sensors" or any other monitoring tool, you'll
do the look-up once for each logical CPU, or even for every attribute
of every CPU. So we're not talking about n, we're talking about
5 * n^2. And that can happen every other second. So while you may not
call it a "hot path", this is still frequent enough so I am worried
about performance aka algorithmic complexity.

Switching from linear complexity to quadratic complexity "because n is
going to increase soon" is quite opposite to what I would expect.

> If you think that we don't have to care so much about memory,
> then I can create another patch which uses array instead of list.

I'm not so worried about memory. Did you actually check how many bytes
of memory were used per supported logical CPU?

We could just drop NUM_REAL_CORES and use CONFIG_NR_CPUS instead, I
would be fine with that. This lets people worried about memory
consumption control it.

-- 
Jean Delvare
SUSE L3 Support

^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: [lm-sensors] [PATCH] hwmon: coretemp: use list instead of fixed size array for temp data
@ 2015-07-17 19:11           ` Jean Delvare
  0 siblings, 0 replies; 52+ messages in thread
From: Jean Delvare @ 2015-07-17 19:11 UTC (permalink / raw)
  To: Odzioba, Lukasz; +Cc: Guenter Roeck, Yu, Fenghua, lm-sensors, linux-kernel

On Fri, 17 Jul 2015 17:28:20 +0000, Odzioba, Lukasz wrote:
> From: Guenter Roeck [mailto:linux@roeck-us.net] 
> On Friday, July 17, 2015 6:55 PM Guenter Roeck wrote:
> 
> > You don't really explain why your approach would be better than
> > allocating an array of pointers to struct temp_data and increasing
> > its size using krealloc if needed.
> 
> Let's consider two cases of such implementation:
> a) we use array of pointers with O(n) access algorithm
> b) we use array of pointers with O(1) access algorithm
> 
> In both cases an array will have greater memory footprint unless
> we implement reallocation ourselves when cpus are disabled which will
> make code harder to maintain.

I see no reason to reallocate when CPUs are disabled. This is rare
enough that I very much doubt we care.

> Case b) does not handle huge core ids and sparse enumeration well -
> it is still to discuss whether we really need it since there is no
> such hardware yet.

If you don't have a use case, I see no reason to change anything.

If you have a use case, it would be nice to tell us what it is, so that
we can make better comments on your proposal.

> I am not saying that my solution is the best of possible ones.
> I am saying that "the best" can vary depending on which criteria do you
> choose from (time, memory, clean code...). Some may say that O(n) is
> fine unless we have thousands of cores and this code is not on hot path,
> others may be concerned more about memory on small/old devices.
> I don't see holy grail here, If you see one please let me know.

The problem is that the lookup algorithm is only one piece of the
puzzle. When a user runs "sensors" or any other monitoring tool, you'll
do the look-up once for each logical CPU, or even for every attribute
of every CPU. So we're not talking about n, we're talking about
5 * n^2. And that can happen every other second. So while you may not
call it a "hot path", this is still frequent enough so I am worried
about performance aka algorithmic complexity.

Switching from linear complexity to quadratic complexity "because n is
going to increase soon" is quite opposite to what I would expect.

> If you think that we don't have to care so much about memory,
> then I can create another patch which uses array instead of list.

I'm not so worried about memory. Did you actually check how many bytes
of memory were used per supported logical CPU?

We could just drop NUM_REAL_CORES and use CONFIG_NR_CPUS instead, I
would be fine with that. This lets people worried about memory
consumption control it.

-- 
Jean Delvare
SUSE L3 Support

_______________________________________________
lm-sensors mailing list
lm-sensors@lm-sensors.org
http://lists.lm-sensors.org/mailman/listinfo/lm-sensors

^ permalink raw reply	[flat|nested] 52+ messages in thread

* RE: [PATCH] hwmon: coretemp: use list instead of fixed size array for temp data
  2015-07-17 18:01           ` [lm-sensors] " Guenter Roeck
@ 2015-07-17 19:23             ` Odzioba, Lukasz
  -1 siblings, 0 replies; 52+ messages in thread
From: Odzioba, Lukasz @ 2015-07-17 19:23 UTC (permalink / raw)
  To: Guenter Roeck, Jean Delvare; +Cc: Yu, Fenghua, lm-sensors, linux-kernel

On Friday, July 17, 2015 8:02 PM Guenter Roeck wrote:
> Please explain why krealloc() won't work, why using krealloc(() would
> result in a larger memory footprint than using lists, and why disabling
> CPUs would require any action in the first place.

It will work, but it can use more memory for cpus with many cores.
If you have just one core visible to the kernel with id 59 
(i.e. the rest are disabled by hardware) out of 60-core cpu then you
have to allocate an array of 60 pointers instead of just one element of
the list. Of course you can say that for cpu with just one core list will
use 3x the memory needed by array and that's true. I see no point in 
arguing which case is more important, let's move on.

> "yet" is a key term here. Presumably you have insider information.

No I don't have such information.

> Unless you can share this information, I don't see the point of replacing
> an O(1) algorithm with O(n), especially since there is a relatively simple
> alternative available to support more CPUs.

Ok I understand that, and somewhat agree with that.

> Unless you clarify that Intel will introduce CPU IDs which cannot be used
> as array index because they are too sparse, I don't really see how the list 
> solution would consume less memory than an array of pointers, even if the 
> array is somewhat sparse. After all, a list consumes at least two pointers 
> per entry.

Ok maybe I should state that clearer, my bad. For my purposes removing limit
of supported cores or bumping it is all I need. Removing limit of maxid is 
just nice to have - nothing really practical any soon I guess.

I based this patch on Kirill's which used lists and then you guys did not
say that this is a bad idea, so I tried to solve problems reported earlier
- deadlocks and race conditions and submitted another version.

Maybe if I would start from scratch I would not ran into all this.
In general it is good that you found the weak point of it.

Thanks,
Lukas


^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: [lm-sensors] [PATCH] hwmon: coretemp: use list instead of fixed size array for temp data
@ 2015-07-17 19:23             ` Odzioba, Lukasz
  0 siblings, 0 replies; 52+ messages in thread
From: Odzioba, Lukasz @ 2015-07-17 19:23 UTC (permalink / raw)
  To: Guenter Roeck, Jean Delvare; +Cc: Yu, Fenghua, lm-sensors, linux-kernel

On Friday, July 17, 2015 8:02 PM Guenter Roeck wrote:
> Please explain why krealloc() won't work, why using krealloc(() would
> result in a larger memory footprint than using lists, and why disabling
> CPUs would require any action in the first place.

It will work, but it can use more memory for cpus with many cores.
If you have just one core visible to the kernel with id 59 
(i.e. the rest are disabled by hardware) out of 60-core cpu then you
have to allocate an array of 60 pointers instead of just one element of
the list. Of course you can say that for cpu with just one core list will
use 3x the memory needed by array and that's true. I see no point in 
arguing which case is more important, let's move on.

> "yet" is a key term here. Presumably you have insider information.

No I don't have such information.

> Unless you can share this information, I don't see the point of replacing
> an O(1) algorithm with O(n), especially since there is a relatively simple
> alternative available to support more CPUs.

Ok I understand that, and somewhat agree with that.

> Unless you clarify that Intel will introduce CPU IDs which cannot be used
> as array index because they are too sparse, I don't really see how the list 
> solution would consume less memory than an array of pointers, even if the 
> array is somewhat sparse. After all, a list consumes at least two pointers 
> per entry.

Ok maybe I should state that clearer, my bad. For my purposes removing limit
of supported cores or bumping it is all I need. Removing limit of maxid is 
just nice to have - nothing really practical any soon I guess.

I based this patch on Kirill's which used lists and then you guys did not
say that this is a bad idea, so I tried to solve problems reported earlier
- deadlocks and race conditions and submitted another version.

Maybe if I would start from scratch I would not ran into all this.
In general it is good that you found the weak point of it.

Thanks,
Lukas


_______________________________________________
lm-sensors mailing list
lm-sensors@lm-sensors.org
http://lists.lm-sensors.org/mailman/listinfo/lm-sensors

^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: [PATCH] hwmon: coretemp: use list instead of fixed size array for temp data
  2015-07-17 19:11           ` [lm-sensors] " Jean Delvare
@ 2015-07-17 19:36             ` Guenter Roeck
  -1 siblings, 0 replies; 52+ messages in thread
From: Guenter Roeck @ 2015-07-17 19:36 UTC (permalink / raw)
  To: Jean Delvare, Odzioba, Lukasz; +Cc: Yu, Fenghua, lm-sensors, linux-kernel

Hi Jean,

On 07/17/2015 12:11 PM, Jean Delvare wrote:

>
> We could just drop NUM_REAL_CORES and use CONFIG_NR_CPUS instead, I
> would be fine with that. This lets people worried about memory
> consumption control it.
>
Unfortunately this won't work because the CPU ID is non-linear;
an 8-core system may have a CPI ID larger than 7.

Guenter


^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: [lm-sensors] [PATCH] hwmon: coretemp: use list instead of fixed size array for temp data
@ 2015-07-17 19:36             ` Guenter Roeck
  0 siblings, 0 replies; 52+ messages in thread
From: Guenter Roeck @ 2015-07-17 19:36 UTC (permalink / raw)
  To: Jean Delvare, Odzioba, Lukasz; +Cc: Yu, Fenghua, lm-sensors, linux-kernel

Hi Jean,

On 07/17/2015 12:11 PM, Jean Delvare wrote:

>
> We could just drop NUM_REAL_CORES and use CONFIG_NR_CPUS instead, I
> would be fine with that. This lets people worried about memory
> consumption control it.
>
Unfortunately this won't work because the CPU ID is non-linear;
an 8-core system may have a CPI ID larger than 7.

Guenter


_______________________________________________
lm-sensors mailing list
lm-sensors@lm-sensors.org
http://lists.lm-sensors.org/mailman/listinfo/lm-sensors

^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: [PATCH] hwmon: coretemp: use list instead of fixed size array for temp data
  2015-07-17 19:36             ` [lm-sensors] " Guenter Roeck
@ 2015-07-17 21:25               ` Jean Delvare
  -1 siblings, 0 replies; 52+ messages in thread
From: Jean Delvare @ 2015-07-17 21:25 UTC (permalink / raw)
  To: Guenter Roeck; +Cc: Odzioba, Lukasz, Yu, Fenghua, lm-sensors, linux-kernel

On Fri, 17 Jul 2015 12:36:34 -0700, Guenter Roeck wrote:
> Hi Jean,
> 
> On 07/17/2015 12:11 PM, Jean Delvare wrote:
> 
> >
> > We could just drop NUM_REAL_CORES and use CONFIG_NR_CPUS instead, I
> > would be fine with that. This lets people worried about memory
> > consumption control it.
>
> Unfortunately this won't work because the CPU ID is non-linear;
> an 8-core system may have a CPU ID larger than 7.

Oh right, I forgot about that. Brilliant hardware/firmware engineers...

Well that does not prevent us from using CONFIG_NR_CPUS, "just" the
code would need to be modified to remove the assumption that the array
index matches the logical CPU ID.

BTW I wonder how the rest of the kernel handles the situation.
CONFIG_NR_CPUS is used where relevant so there certainly _is_ a linear
numbering which is independent from the CPU ID.

-- 
Jean Delvare
SUSE L3 Support

^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: [lm-sensors] [PATCH] hwmon: coretemp: use list instead of fixed size array for temp data
@ 2015-07-17 21:25               ` Jean Delvare
  0 siblings, 0 replies; 52+ messages in thread
From: Jean Delvare @ 2015-07-17 21:25 UTC (permalink / raw)
  To: Guenter Roeck; +Cc: Odzioba, Lukasz, Yu, Fenghua, lm-sensors, linux-kernel

On Fri, 17 Jul 2015 12:36:34 -0700, Guenter Roeck wrote:
> Hi Jean,
> 
> On 07/17/2015 12:11 PM, Jean Delvare wrote:
> 
> >
> > We could just drop NUM_REAL_CORES and use CONFIG_NR_CPUS instead, I
> > would be fine with that. This lets people worried about memory
> > consumption control it.
>
> Unfortunately this won't work because the CPU ID is non-linear;
> an 8-core system may have a CPU ID larger than 7.

Oh right, I forgot about that. Brilliant hardware/firmware engineers...

Well that does not prevent us from using CONFIG_NR_CPUS, "just" the
code would need to be modified to remove the assumption that the array
index matches the logical CPU ID.

BTW I wonder how the rest of the kernel handles the situation.
CONFIG_NR_CPUS is used where relevant so there certainly _is_ a linear
numbering which is independent from the CPU ID.

-- 
Jean Delvare
SUSE L3 Support

_______________________________________________
lm-sensors mailing list
lm-sensors@lm-sensors.org
http://lists.lm-sensors.org/mailman/listinfo/lm-sensors

^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: [PATCH] hwmon: coretemp: use list instead of fixed size array for temp data
  2015-07-17 19:23             ` [lm-sensors] " Odzioba, Lukasz
@ 2015-07-17 21:33               ` Jean Delvare
  -1 siblings, 0 replies; 52+ messages in thread
From: Jean Delvare @ 2015-07-17 21:33 UTC (permalink / raw)
  To: Odzioba, Lukasz; +Cc: Guenter Roeck, Yu, Fenghua, lm-sensors, linux-kernel

Hi Lukasz,

On Fri, 17 Jul 2015 19:23:44 +0000, Odzioba, Lukasz wrote:
> On Friday, July 17, 2015 8:02 PM Guenter Roeck wrote:
> > Please explain why krealloc() won't work, why using krealloc(() would
> > result in a larger memory footprint than using lists, and why disabling
> > CPUs would require any action in the first place.
> 
> It will work, but it can use more memory for cpus with many cores.
> If you have just one core visible to the kernel with id 59 
> (i.e. the rest are disabled by hardware) out of 60-core cpu then you
> have to allocate an array of 60 pointers instead of just one element of

Arrays of pointers are cheap. You can fit 512 pointers in a single
memory page.

> the list. Of course you can say that for cpu with just one core list will
> use 3x the memory needed by array and that's true. I see no point in 
> arguing which case is more important, let's move on.

I see the point in arguing: the example above is just silly and does
not match any real-world application. Nobody buys a 60-core CPU to run
it with a single core enabled.

When you have two or more alternative implementations possible,
thinking in terms of the most common cases is a key to make the right
decision. Thinking about servers with a lots of CPU cores versus
embedded devices with few cores and tight memory constraints, that is
useful. Making up corner cases is not.

-- 
Jean Delvare
SUSE L3 Support

^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: [lm-sensors] [PATCH] hwmon: coretemp: use list instead of fixed size array for temp data
@ 2015-07-17 21:33               ` Jean Delvare
  0 siblings, 0 replies; 52+ messages in thread
From: Jean Delvare @ 2015-07-17 21:33 UTC (permalink / raw)
  To: Odzioba, Lukasz; +Cc: Guenter Roeck, Yu, Fenghua, lm-sensors, linux-kernel

Hi Lukasz,

On Fri, 17 Jul 2015 19:23:44 +0000, Odzioba, Lukasz wrote:
> On Friday, July 17, 2015 8:02 PM Guenter Roeck wrote:
> > Please explain why krealloc() won't work, why using krealloc(() would
> > result in a larger memory footprint than using lists, and why disabling
> > CPUs would require any action in the first place.
> 
> It will work, but it can use more memory for cpus with many cores.
> If you have just one core visible to the kernel with id 59 
> (i.e. the rest are disabled by hardware) out of 60-core cpu then you
> have to allocate an array of 60 pointers instead of just one element of

Arrays of pointers are cheap. You can fit 512 pointers in a single
memory page.

> the list. Of course you can say that for cpu with just one core list will
> use 3x the memory needed by array and that's true. I see no point in 
> arguing which case is more important, let's move on.

I see the point in arguing: the example above is just silly and does
not match any real-world application. Nobody buys a 60-core CPU to run
it with a single core enabled.

When you have two or more alternative implementations possible,
thinking in terms of the most common cases is a key to make the right
decision. Thinking about servers with a lots of CPU cores versus
embedded devices with few cores and tight memory constraints, that is
useful. Making up corner cases is not.

-- 
Jean Delvare
SUSE L3 Support

_______________________________________________
lm-sensors mailing list
lm-sensors@lm-sensors.org
http://lists.lm-sensors.org/mailman/listinfo/lm-sensors

^ permalink raw reply	[flat|nested] 52+ messages in thread

end of thread, other threads:[~2015-07-17 21:33 UTC | newest]

Thread overview: 52+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2012-04-30 13:18 [PATCH] hwmon: coretemp: fix oops on cpu unplug Kirill A. Shutemov
2012-04-30 13:18 ` [lm-sensors] " Kirill A. Shutemov
2012-04-30 15:28 ` Guenter Roeck
2012-04-30 15:28   ` [lm-sensors] " Guenter Roeck
2012-04-30 16:19   ` Kirill A. Shutemov
2012-04-30 16:19     ` [lm-sensors] " Kirill A. Shutemov
2012-04-30 16:59     ` Guenter Roeck
2012-04-30 16:59       ` [lm-sensors] " Guenter Roeck
2012-05-01 15:20 ` Guenter Roeck
2012-05-01 15:20   ` [lm-sensors] " Guenter Roeck
2012-05-01 21:00   ` Kirill A. Shutemov
2012-05-01 21:00     ` [lm-sensors] " Kirill A. Shutemov
2012-05-01 21:25     ` Guenter Roeck
2012-05-01 21:25       ` [lm-sensors] " Guenter Roeck
2012-05-02 15:10       ` [PATCH] hwmon: coretemp: use list instead of fixed size array for temp data Kirill A. Shutemov
2012-05-02 15:10         ` [lm-sensors] " Kirill A. Shutemov
2012-05-03  5:29         ` R, Durgadoss
2012-05-03  5:29           ` R, Durgadoss
2012-05-03 10:04           ` Kirill A. Shutemov
2012-05-03 10:04             ` Kirill A. Shutemov
2012-05-03 11:18         ` [PATCH, v2] " Kirill A. Shutemov
2012-05-03 11:18           ` [lm-sensors] [PATCH, v2] hwmon: coretemp: use list instead of fixed size array for temp Kirill A. Shutemov
2012-05-04  5:41           ` [PATCH, v2] hwmon: coretemp: use list instead of fixed size array for temp data Guenter Roeck
2012-05-04  5:41             ` [lm-sensors] [PATCH, v2] hwmon: coretemp: use list instead of fixed size array for temp Guenter Roeck
2012-05-04  6:46             ` [PATCH, v2] hwmon: coretemp: use list instead of fixed size array for temp data Kirill A. Shutemov
2012-05-04  6:46               ` [lm-sensors] [PATCH, v2] hwmon: coretemp: use list instead of fixed size array for temp Kirill A. Shutemov
2012-05-04 13:34               ` [PATCH, v2] hwmon: coretemp: use list instead of fixed size array for temp data Guenter Roeck
2012-05-04 13:34                 ` [lm-sensors] [PATCH, v2] hwmon: coretemp: use list instead of fixed size array for temp Guenter Roeck
2012-05-04 13:42                 ` [PATCH, v2] hwmon: coretemp: use list instead of fixed size array for temp data Kirill A. Shutemov
2012-05-04 13:42                   ` [lm-sensors] [PATCH, v2] hwmon: coretemp: use list instead of fixed size array for temp Kirill A. Shutemov
2015-07-15 16:04 [PATCH] hwmon: coretemp: use list instead of fixed size array for temp data Lukasz Odzioba
2015-07-15 16:04 ` [lm-sensors] " Lukasz Odzioba
2015-07-15 21:07 ` Jean Delvare
2015-07-15 21:07   ` [lm-sensors] " Jean Delvare
2015-07-16 13:17   ` Odzioba, Lukasz
2015-07-16 13:17     ` [lm-sensors] " Odzioba, Lukasz
2015-07-17 16:55     ` Guenter Roeck
2015-07-17 16:55       ` [lm-sensors] " Guenter Roeck
2015-07-17 17:28       ` Odzioba, Lukasz
2015-07-17 17:28         ` [lm-sensors] " Odzioba, Lukasz
2015-07-17 18:01         ` Guenter Roeck
2015-07-17 18:01           ` [lm-sensors] " Guenter Roeck
2015-07-17 19:23           ` Odzioba, Lukasz
2015-07-17 19:23             ` [lm-sensors] " Odzioba, Lukasz
2015-07-17 21:33             ` Jean Delvare
2015-07-17 21:33               ` [lm-sensors] " Jean Delvare
2015-07-17 19:11         ` Jean Delvare
2015-07-17 19:11           ` [lm-sensors] " Jean Delvare
2015-07-17 19:36           ` Guenter Roeck
2015-07-17 19:36             ` [lm-sensors] " Guenter Roeck
2015-07-17 21:25             ` Jean Delvare
2015-07-17 21:25               ` [lm-sensors] " Jean Delvare

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.