linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH] hwmon: coretemp: fix oops on cpu unplug
@ 2012-04-30 13:18 Kirill A. Shutemov
  2012-04-30 15:28 ` Guenter Roeck
  2012-05-01 15:20 ` Guenter Roeck
  0 siblings, 2 replies; 15+ messages in thread
From: Kirill A. Shutemov @ 2012-04-30 13:18 UTC (permalink / raw)
  To: Fenghua Yu
  Cc: Andi Kleen, Jean Delvare, Guenter Roeck, lm-sensors,
	linux-kernel, Kirill A. Shutemov

From: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>

coretemp tries to access core_data array beyond bounds on cpu unplug if
core id of the cpu if more than NUM_REAL_CORES-1.

BUG: unable to handle kernel NULL pointer dereference at 000000000000013c
IP: [<ffffffffa00159af>] coretemp_cpu_callback+0x93/0x1ba [coretemp]
PGD 673e5a067 PUD 66e9b3067 PMD 0
Oops: 0000 [#1] SMP
CPU 79
Modules linked in: sunrpc cpufreq_ondemand acpi_cpufreq freq_table mperf bnep bluetooth rfkill ip6t_REJECT nf_conntrack_ipv6 nf_defrag_ipv6 ip6table_filter nf_conntrack_ipv4 nf_defrag_ipv4 ip6_tables xt_state nf_conntrack coretemp crc32c_intel asix tpm_tis pcspkr usbnet iTCO_wdt i2c_i801 microcode mii joydev tpm i2c_core iTCO_vendor_support tpm_bios i7core_edac igb ioatdma edac_core dca megaraid_sas [last unloaded: oprofile]

Pid: 3315, comm: set-cpus Tainted: G        W    3.4.0-rc5+ #2 QCI QSSC-S4R/QSSC-S4R
RIP: 0010:[<ffffffffa00159af>]  [<ffffffffa00159af>] coretemp_cpu_callback+0x93/0x1ba [coretemp]
RSP: 0018:ffff880472fb3d48  EFLAGS: 00010246
RAX: 0000000000000124 RBX: 0000000000000034 RCX: 00000000ffffffff
RDX: 0000000000000000 RSI: 0000000000000046 RDI: 0000000000000246
RBP: ffff880472fb3d88 R08: ffff88077fcd36c0 R09: 0000000000000001
R10: ffffffff8184bc48 R11: 0000000000000000 R12: ffff880273095800
R13: 0000000000000013 R14: ffff8802730a1810 R15: 0000000000000000
FS:  00007f694a20f720(0000) GS:ffff88077fcc0000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
CR2: 000000000000013c CR3: 000000067209b000 CR4: 00000000000007e0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
Process set-cpus (pid: 3315, threadinfo ffff880472fb2000, task ffff880471fa0000)
Stack:
 ffff880277b4c308 0000000000000003 ffff880472fb3d88 0000000000000005
 0000000000000034 00000000ffffffd1 ffffffff81cadc70 ffff880472fb3e14
 ffff880472fb3dc8 ffffffff8161f48d ffff880471fa0000 0000000000000034
Call Trace:
 [<ffffffff8161f48d>] notifier_call_chain+0x4d/0x70
 [<ffffffff8107f1be>] __raw_notifier_call_chain+0xe/0x10
 [<ffffffff81059d30>] __cpu_notify+0x20/0x40
 [<ffffffff815fa251>] _cpu_down+0x81/0x270
 [<ffffffff815fa477>] cpu_down+0x37/0x50
 [<ffffffff815fd6a3>] store_online+0x63/0xc0
 [<ffffffff813c7078>] dev_attr_store+0x18/0x30
 [<ffffffff811f02cf>] sysfs_write_file+0xef/0x170
 [<ffffffff81180443>] vfs_write+0xb3/0x180
 [<ffffffff8118076a>] sys_write+0x4a/0x90
 [<ffffffff816236a9>] system_call_fastpath+0x16/0x1b
Code: 48 c7 c7 94 60 01 a0 44 0f b7 ac 10 ac 00 00 00 31 c0 e8 41 b7 5f e1 41 83 c5 02 49 63 c5 49 8b 44 c4 10 48 85 c0 74 56 45 31 ff <39> 58 18 75 4e eb 1f 49 63 d7 4c 89 f7 48 89 45 c8 48 6b d2 28
RIP  [<ffffffffa00159af>] coretemp_cpu_callback+0x93/0x1ba [coretemp]
 RSP <ffff880472fb3d48>
CR2: 000000000000013c

Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
---
 drivers/hwmon/coretemp.c |    4 ++++
 1 files changed, 4 insertions(+), 0 deletions(-)

diff --git a/drivers/hwmon/coretemp.c b/drivers/hwmon/coretemp.c
index 0d3141f..54a70fe 100644
--- a/drivers/hwmon/coretemp.c
+++ b/drivers/hwmon/coretemp.c
@@ -709,6 +709,10 @@ static void __cpuinit put_core_offline(unsigned int cpu)
 
 	indx = TO_ATTR_NO(cpu);
 
+	/* The core id is too big, just return */
+	if (indx > MAX_CORE_DATA - 1)
+		return;
+
 	if (pdata->core_data[indx] && pdata->core_data[indx]->cpu == cpu)
 		coretemp_remove_core(pdata, &pdev->dev, indx);
 
-- 
1.7.9.1


^ permalink raw reply related	[flat|nested] 15+ messages in thread

* Re: [PATCH] hwmon: coretemp: fix oops on cpu unplug
  2012-04-30 13:18 [PATCH] hwmon: coretemp: fix oops on cpu unplug Kirill A. Shutemov
@ 2012-04-30 15:28 ` Guenter Roeck
  2012-04-30 16:19   ` Kirill A. Shutemov
  2012-05-01 15:20 ` Guenter Roeck
  1 sibling, 1 reply; 15+ messages in thread
From: Guenter Roeck @ 2012-04-30 15:28 UTC (permalink / raw)
  To: Kirill A. Shutemov
  Cc: Fenghua Yu, Andi Kleen, Jean Delvare, lm-sensors, linux-kernel

On Mon, 2012-04-30 at 09:18 -0400, Kirill A. Shutemov wrote:
> From: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>
> 
> coretemp tries to access core_data array beyond bounds on cpu unplug if
> core id of the cpu if more than NUM_REAL_CORES-1.
> 
> BUG: unable to handle kernel NULL pointer dereference at 000000000000013c
> IP: [<ffffffffa00159af>] coretemp_cpu_callback+0x93/0x1ba [coretemp]
> PGD 673e5a067 PUD 66e9b3067 PMD 0
> Oops: 0000 [#1] SMP
> CPU 79
> Modules linked in: sunrpc cpufreq_ondemand acpi_cpufreq freq_table mperf bnep bluetooth rfkill ip6t_REJECT nf_conntrack_ipv6 nf_defrag_ipv6 ip6table_filter nf_conntrack_ipv4 nf_defrag_ipv4 ip6_tables xt_state nf_conntrack coretemp crc32c_intel asix tpm_tis pcspkr usbnet iTCO_wdt i2c_i801 microcode mii joydev tpm i2c_core iTCO_vendor_support tpm_bios i7core_edac igb ioatdma edac_core dca megaraid_sas [last unloaded: oprofile]
> 
> Pid: 3315, comm: set-cpus Tainted: G        W    3.4.0-rc5+ #2 QCI QSSC-S4R/QSSC-S4R
> RIP: 0010:[<ffffffffa00159af>]  [<ffffffffa00159af>] coretemp_cpu_callback+0x93/0x1ba [coretemp]
> RSP: 0018:ffff880472fb3d48  EFLAGS: 00010246
> RAX: 0000000000000124 RBX: 0000000000000034 RCX: 00000000ffffffff
> RDX: 0000000000000000 RSI: 0000000000000046 RDI: 0000000000000246
> RBP: ffff880472fb3d88 R08: ffff88077fcd36c0 R09: 0000000000000001
> R10: ffffffff8184bc48 R11: 0000000000000000 R12: ffff880273095800
> R13: 0000000000000013 R14: ffff8802730a1810 R15: 0000000000000000
> FS:  00007f694a20f720(0000) GS:ffff88077fcc0000(0000) knlGS:0000000000000000
> CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
> CR2: 000000000000013c CR3: 000000067209b000 CR4: 00000000000007e0
> DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
> DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
> Process set-cpus (pid: 3315, threadinfo ffff880472fb2000, task ffff880471fa0000)
> Stack:
>  ffff880277b4c308 0000000000000003 ffff880472fb3d88 0000000000000005
>  0000000000000034 00000000ffffffd1 ffffffff81cadc70 ffff880472fb3e14
>  ffff880472fb3dc8 ffffffff8161f48d ffff880471fa0000 0000000000000034
> Call Trace:
>  [<ffffffff8161f48d>] notifier_call_chain+0x4d/0x70
>  [<ffffffff8107f1be>] __raw_notifier_call_chain+0xe/0x10
>  [<ffffffff81059d30>] __cpu_notify+0x20/0x40
>  [<ffffffff815fa251>] _cpu_down+0x81/0x270
>  [<ffffffff815fa477>] cpu_down+0x37/0x50
>  [<ffffffff815fd6a3>] store_online+0x63/0xc0
>  [<ffffffff813c7078>] dev_attr_store+0x18/0x30
>  [<ffffffff811f02cf>] sysfs_write_file+0xef/0x170
>  [<ffffffff81180443>] vfs_write+0xb3/0x180
>  [<ffffffff8118076a>] sys_write+0x4a/0x90
>  [<ffffffff816236a9>] system_call_fastpath+0x16/0x1b
> Code: 48 c7 c7 94 60 01 a0 44 0f b7 ac 10 ac 00 00 00 31 c0 e8 41 b7 5f e1 41 83 c5 02 49 63 c5 49 8b 44 c4 10 48 85 c0 74 56 45 31 ff <39> 58 18 75 4e eb 1f 49 63 d7 4c 89 f7 48 89 45 c8 48 6b d2 28
> RIP  [<ffffffffa00159af>] coretemp_cpu_callback+0x93/0x1ba [coretemp]
>  RSP <ffff880472fb3d48>
> CR2: 000000000000013c
> 
> Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
> ---
>  drivers/hwmon/coretemp.c |    4 ++++
>  1 files changed, 4 insertions(+), 0 deletions(-)
> 
> diff --git a/drivers/hwmon/coretemp.c b/drivers/hwmon/coretemp.c
> index 0d3141f..54a70fe 100644
> --- a/drivers/hwmon/coretemp.c
> +++ b/drivers/hwmon/coretemp.c
> @@ -709,6 +709,10 @@ static void __cpuinit put_core_offline(unsigned int cpu)
>  
>  	indx = TO_ATTR_NO(cpu);
>  
> +	/* The core id is too big, just return */
> +	if (indx > MAX_CORE_DATA - 1)
> +		return;
> +
>  	if (pdata->core_data[indx] && pdata->core_data[indx]->cpu == cpu)
>  		coretemp_remove_core(pdata, &pdev->dev, indx);
>  
Hi,

good catch. Couple of problems, though.

First, what number of cores are we talking about ? We should probably
increase NUM_REAL_CORES as well. Long term, we should get rid of the
dependency to prevent that problem from happening again, but that is a
different issue.

Second, we'll need the same code in get_core_online(). Otherwise the
platform device can be created for the new core (if the core is
re-enabled) but will never be deleted.

Thanks,
Guenter



^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH] hwmon: coretemp: fix oops on cpu unplug
  2012-04-30 15:28 ` Guenter Roeck
@ 2012-04-30 16:19   ` Kirill A. Shutemov
  2012-04-30 16:59     ` Guenter Roeck
  0 siblings, 1 reply; 15+ messages in thread
From: Kirill A. Shutemov @ 2012-04-30 16:19 UTC (permalink / raw)
  To: Guenter Roeck
  Cc: Fenghua Yu, Andi Kleen, Jean Delvare, lm-sensors, linux-kernel

On Mon, Apr 30, 2012 at 08:28:10AM -0700, Guenter Roeck wrote:
> Hi,
> 
> good catch. Couple of problems, though.
> 
> First, what number of cores are we talking about ? We should probably
> increase NUM_REAL_CORES as well. Long term, we should get rid of the
> dependency to prevent that problem from happening again, but that is a
> different issue.

It triggered by another bug. Cores on my 10-core processor have ids 0, 1,
2, 8, 9, 16, 17, 18, 24, 25. :-/
No need to change NUM_REAL_CORES at the moment.

> Second, we'll need the same code in get_core_online(). Otherwise the
> platform device can be created for the new core (if the core is
> re-enabled) but will never be deleted.

It has already handled in create_core_data().

-- 
 Kirill A. Shutemov

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH] hwmon: coretemp: fix oops on cpu unplug
  2012-04-30 16:19   ` Kirill A. Shutemov
@ 2012-04-30 16:59     ` Guenter Roeck
  0 siblings, 0 replies; 15+ messages in thread
From: Guenter Roeck @ 2012-04-30 16:59 UTC (permalink / raw)
  To: Kirill A. Shutemov
  Cc: Fenghua Yu, Andi Kleen, Jean Delvare, lm-sensors, linux-kernel

On Mon, 2012-04-30 at 12:19 -0400, Kirill A. Shutemov wrote:
> On Mon, Apr 30, 2012 at 08:28:10AM -0700, Guenter Roeck wrote:
> > Hi,
> > 
> > good catch. Couple of problems, though.
> > 
> > First, what number of cores are we talking about ? We should probably
> > increase NUM_REAL_CORES as well. Long term, we should get rid of the
> > dependency to prevent that problem from happening again, but that is a
> > different issue.
> 
> It triggered by another bug. Cores on my 10-core processor have ids 0, 1,
> 2, 8, 9, 16, 17, 18, 24, 25. :-/
> No need to change NUM_REAL_CORES at the moment.
> 
The problem, though, is that core_data[] is indexed by the core ID. So,
NUM_REAL_CORES is a bit misleading. It should probably be named
NUM_CORE_ID or something similar.

> > Second, we'll need the same code in get_core_online(). Otherwise the
> > platform device can be created for the new core (if the core is
> > re-enabled) but will never be deleted.
> 
> It has already handled in create_core_data().
> 
... which is called from coretemp_add_core(), which does not return an
error. Besides, it is called after the call to coretemp_device_add().

As a result, the platform device for the unsupported core(s) will be
created, even if the core is not supported. I just don't think that is a
good idea.

Besides, if you'd have the check in get_core_online(), you would not
even hit the problem in put_core_offline(), since the platform device
would not exist and the function would bail out because of that.

Thanks,
Guenter



^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH] hwmon: coretemp: fix oops on cpu unplug
  2012-04-30 13:18 [PATCH] hwmon: coretemp: fix oops on cpu unplug Kirill A. Shutemov
  2012-04-30 15:28 ` Guenter Roeck
@ 2012-05-01 15:20 ` Guenter Roeck
  2012-05-01 21:00   ` Kirill A. Shutemov
  1 sibling, 1 reply; 15+ messages in thread
From: Guenter Roeck @ 2012-05-01 15:20 UTC (permalink / raw)
  To: Kirill A. Shutemov
  Cc: Fenghua Yu, Andi Kleen, Jean Delvare, lm-sensors, linux-kernel

On Mon, Apr 30, 2012 at 09:18:01AM -0400, Kirill A. Shutemov wrote:
> From: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>
> 
> coretemp tries to access core_data array beyond bounds on cpu unplug if
> core id of the cpu if more than NUM_REAL_CORES-1.
> 
[ ... ]
> 
> Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>

Looking at it again, you were right. Adding the check to get_core_online()
doesn't really help as the platform device is per CPU, not per core.

Applied. I'll submit a separate patch to increase NUM_REAL_CORES.

Guenter

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH] hwmon: coretemp: fix oops on cpu unplug
  2012-05-01 15:20 ` Guenter Roeck
@ 2012-05-01 21:00   ` Kirill A. Shutemov
  2012-05-01 21:25     ` Guenter Roeck
  0 siblings, 1 reply; 15+ messages in thread
From: Kirill A. Shutemov @ 2012-05-01 21:00 UTC (permalink / raw)
  To: Guenter Roeck
  Cc: Fenghua Yu, Andi Kleen, Jean Delvare, lm-sensors, linux-kernel

On Tue, May 01, 2012 at 08:20:14AM -0700, Guenter Roeck wrote:
> On Mon, Apr 30, 2012 at 09:18:01AM -0400, Kirill A. Shutemov wrote:
> > From: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>
> > 
> > coretemp tries to access core_data array beyond bounds on cpu unplug if
> > core id of the cpu if more than NUM_REAL_CORES-1.
> > 
> [ ... ]
> > 
> > Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
> 
> Looking at it again, you were right. Adding the check to get_core_online()
> doesn't really help as the platform device is per CPU, not per core.
> 
> Applied. I'll submit a separate patch to increase NUM_REAL_CORES.

Actually, the problem is bigger.

Documentation/cputopology.txt:
====
2) /sys/devices/system/cpu/cpuX/topology/core_id:

        the CPU core ID of cpuX. Typically it is the hardware platform's
        identifier (rather than the kernel's).  The actual value is
        architecture and platform dependent.
====

We should not use core id as an index in an array since it's an arbitrary
number (from kernel POV).

-- 
 Kirill A. Shutemov

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH] hwmon: coretemp: fix oops on cpu unplug
  2012-05-01 21:00   ` Kirill A. Shutemov
@ 2012-05-01 21:25     ` Guenter Roeck
  2012-05-02 15:10       ` [PATCH] hwmon: coretemp: use list instead of fixed size array for temp data Kirill A. Shutemov
  0 siblings, 1 reply; 15+ messages in thread
From: Guenter Roeck @ 2012-05-01 21:25 UTC (permalink / raw)
  To: Kirill A. Shutemov
  Cc: Fenghua Yu, Andi Kleen, Jean Delvare, lm-sensors, linux-kernel

On Tue, May 01, 2012 at 05:00:41PM -0400, Kirill A. Shutemov wrote:
> On Tue, May 01, 2012 at 08:20:14AM -0700, Guenter Roeck wrote:
> > On Mon, Apr 30, 2012 at 09:18:01AM -0400, Kirill A. Shutemov wrote:
> > > From: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>
> > > 
> > > coretemp tries to access core_data array beyond bounds on cpu unplug if
> > > core id of the cpu if more than NUM_REAL_CORES-1.
> > > 
> > [ ... ]
> > > 
> > > Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
> > 
> > Looking at it again, you were right. Adding the check to get_core_online()
> > doesn't really help as the platform device is per CPU, not per core.
> > 
> > Applied. I'll submit a separate patch to increase NUM_REAL_CORES.
> 
> Actually, the problem is bigger.
> 
> Documentation/cputopology.txt:
> ====
> 2) /sys/devices/system/cpu/cpuX/topology/core_id:
> 
>         the CPU core ID of cpuX. Typically it is the hardware platform's
>         identifier (rather than the kernel's).  The actual value is
>         architecture and platform dependent.
> ====
> 
> We should not use core id as an index in an array since it's an arbitrary
> number (from kernel POV).
> 
Yes, we know we'll need a better fix going forward. Using a fixed size array is bad,
no matter what context.

Thanks,
Guenter

^ permalink raw reply	[flat|nested] 15+ messages in thread

* [PATCH] hwmon: coretemp: use list instead of fixed size array for temp data
  2012-05-01 21:25     ` Guenter Roeck
@ 2012-05-02 15:10       ` Kirill A. Shutemov
  2012-05-03  5:29         ` [lm-sensors] " R, Durgadoss
  2012-05-03 11:18         ` [PATCH, v2] " Kirill A. Shutemov
  0 siblings, 2 replies; 15+ messages in thread
From: Kirill A. Shutemov @ 2012-05-02 15:10 UTC (permalink / raw)
  To: Fenghua Yu, Guenter Roeck
  Cc: Andi Kleen, Jean Delvare, lm-sensors, linux-kernel, Kirill A. Shutemov

From: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>

Let's rework code to allow arbitrary number of cores on a CPU, not
limited by hardcoded array size.

Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
---
 drivers/hwmon/coretemp.c |  176 +++++++++++++++++++++++++++++++++-------------
 1 files changed, 127 insertions(+), 49 deletions(-)

diff --git a/drivers/hwmon/coretemp.c b/drivers/hwmon/coretemp.c
index 54a70fe..f5f9108 100644
--- a/drivers/hwmon/coretemp.c
+++ b/drivers/hwmon/coretemp.c
@@ -36,6 +36,8 @@
 #include <linux/cpu.h>
 #include <linux/pci.h>
 #include <linux/smp.h>
+#include <linux/list.h>
+#include <linux/kref.h>
 #include <linux/moduleparam.h>
 #include <asm/msr.h>
 #include <asm/processor.h>
@@ -52,11 +54,9 @@ module_param_named(tjmax, force_tjmax, int, 0444);
 MODULE_PARM_DESC(tjmax, "TjMax value in degrees Celsius");
 
 #define BASE_SYSFS_ATTR_NO	2	/* Sysfs Base attr no for coretemp */
-#define NUM_REAL_CORES		16	/* Number of Real cores per cpu */
 #define CORETEMP_NAME_LENGTH	17	/* String Length of attrs */
 #define MAX_CORE_ATTRS		4	/* Maximum no of basic attrs */
 #define TOTAL_ATTRS		(MAX_CORE_ATTRS + 1)
-#define MAX_CORE_DATA		(NUM_REAL_CORES + BASE_SYSFS_ATTR_NO)
 
 #define TO_PHYS_ID(cpu)		(cpu_data(cpu).phys_proc_id)
 #define TO_CORE_ID(cpu)		(cpu_data(cpu).cpu_core_id)
@@ -82,6 +82,9 @@ MODULE_PARM_DESC(tjmax, "TjMax value in degrees Celsius");
  * @valid: If this is 1, the current temperature is valid.
  */
 struct temp_data {
+	struct list_head list;
+	struct kref refcount;
+	int id;
 	int temp;
 	int ttarget;
 	int tjmax;
@@ -101,7 +104,8 @@ struct temp_data {
 struct platform_data {
 	struct device *hwmon_dev;
 	u16 phys_proc_id;
-	struct temp_data *core_data[MAX_CORE_DATA];
+	struct list_head temp_data_list;
+	spinlock_t temp_data_lock;
 	struct device_attribute name_attr;
 };
 
@@ -114,6 +118,29 @@ struct pdev_entry {
 static LIST_HEAD(pdev_list);
 static DEFINE_MUTEX(pdev_list_mutex);
 
+static void temp_data_release(struct kref *ref)
+{
+	struct temp_data *tdata = container_of(ref, struct temp_data, refcount);
+
+	kfree(tdata);
+}
+
+static struct temp_data *get_temp_data(struct platform_data *pdata, int id)
+{
+	struct temp_data *tdata;
+
+	spin_lock(&pdata->temp_data_lock);
+	list_for_each_entry(tdata, &pdata->temp_data_list, list)
+		if (tdata->id == id) {
+			kref_get(&tdata->refcount);
+			goto out;
+		}
+	tdata = NULL;
+out:
+	spin_unlock(&pdata->temp_data_lock);
+	return tdata;
+}
+
 static ssize_t show_name(struct device *dev,
 			struct device_attribute *devattr, char *buf)
 {
@@ -125,12 +152,19 @@ static ssize_t show_label(struct device *dev,
 {
 	struct sensor_device_attribute *attr = to_sensor_dev_attr(devattr);
 	struct platform_data *pdata = dev_get_drvdata(dev);
-	struct temp_data *tdata = pdata->core_data[attr->index];
+	struct temp_data *tdata = get_temp_data(pdata, attr->index);
+	ssize_t ret;
+
+	if (!tdata)
+		return -ENOENT;
 
 	if (tdata->is_pkg_data)
-		return sprintf(buf, "Physical id %u\n", pdata->phys_proc_id);
+		ret = sprintf(buf, "Physical id %u\n", pdata->phys_proc_id);
+	else
+		ret = sprintf(buf, "Core %u\n", tdata->cpu_core_id);
 
-	return sprintf(buf, "Core %u\n", tdata->cpu_core_id);
+	kref_put(&tdata->refcount, temp_data_release);
+	return ret;
 }
 
 static ssize_t show_crit_alarm(struct device *dev,
@@ -139,11 +173,18 @@ static ssize_t show_crit_alarm(struct device *dev,
 	u32 eax, edx;
 	struct sensor_device_attribute *attr = to_sensor_dev_attr(devattr);
 	struct platform_data *pdata = dev_get_drvdata(dev);
-	struct temp_data *tdata = pdata->core_data[attr->index];
+	struct temp_data *tdata = get_temp_data(pdata, attr->index);
+	ssize_t ret;
+
+	if (!tdata)
+		return -ENOENT;
 
 	rdmsr_on_cpu(tdata->cpu, tdata->status_reg, &eax, &edx);
 
-	return sprintf(buf, "%d\n", (eax >> 5) & 1);
+	ret = sprintf(buf, "%d\n", (eax >> 5) & 1);
+
+	kref_put(&tdata->refcount, temp_data_release);
+	return ret;
 }
 
 static ssize_t show_tjmax(struct device *dev,
@@ -151,8 +192,16 @@ static ssize_t show_tjmax(struct device *dev,
 {
 	struct sensor_device_attribute *attr = to_sensor_dev_attr(devattr);
 	struct platform_data *pdata = dev_get_drvdata(dev);
+	struct temp_data *tdata = get_temp_data(pdata, attr->index);
+	ssize_t ret;
+
+	if (!tdata)
+		return -ENOENT;
+
+	ret = sprintf(buf, "%d\n", tdata->tjmax);
 
-	return sprintf(buf, "%d\n", pdata->core_data[attr->index]->tjmax);
+	kref_put(&tdata->refcount, temp_data_release);
+	return ret;
 }
 
 static ssize_t show_ttarget(struct device *dev,
@@ -160,8 +209,16 @@ static ssize_t show_ttarget(struct device *dev,
 {
 	struct sensor_device_attribute *attr = to_sensor_dev_attr(devattr);
 	struct platform_data *pdata = dev_get_drvdata(dev);
+	struct temp_data *tdata = get_temp_data(pdata, attr->index);
+	ssize_t ret;
 
-	return sprintf(buf, "%d\n", pdata->core_data[attr->index]->ttarget);
+	if (!tdata)
+		return -ENOENT;
+
+	ret = sprintf(buf, "%d\n", tdata->ttarget);
+
+	kref_put(&tdata->refcount, temp_data_release);
+	return ret;
 }
 
 static ssize_t show_temp(struct device *dev,
@@ -170,7 +227,11 @@ static ssize_t show_temp(struct device *dev,
 	u32 eax, edx;
 	struct sensor_device_attribute *attr = to_sensor_dev_attr(devattr);
 	struct platform_data *pdata = dev_get_drvdata(dev);
-	struct temp_data *tdata = pdata->core_data[attr->index];
+	struct temp_data *tdata = get_temp_data(pdata, attr->index);
+	ssize_t ret = -EAGAIN;
+
+	if (!tdata)
+		return -ENOENT;
 
 	mutex_lock(&tdata->update_lock);
 
@@ -188,7 +249,12 @@ static ssize_t show_temp(struct device *dev,
 	}
 
 	mutex_unlock(&tdata->update_lock);
-	return tdata->valid ? sprintf(buf, "%d\n", tdata->temp) : -EAGAIN;
+
+	if (tdata->valid)
+		ret = sprintf(buf, "%d\n", tdata->temp);
+
+	kref_put(&tdata->refcount, temp_data_release);
+	return ret;
 }
 
 static int __cpuinit adjust_tjmax(struct cpuinfo_x86 *c, u32 id,
@@ -412,6 +478,7 @@ static struct temp_data __cpuinit *init_temp_data(unsigned int cpu,
 	tdata = kzalloc(sizeof(struct temp_data), GFP_KERNEL);
 	if (!tdata)
 		return NULL;
+	kref_init(&tdata->refcount);
 
 	tdata->status_reg = pkg_flag ? MSR_IA32_PACKAGE_THERM_STATUS :
 							MSR_IA32_THERM_STATUS;
@@ -423,7 +490,7 @@ static struct temp_data __cpuinit *init_temp_data(unsigned int cpu,
 	return tdata;
 }
 
-static int __cpuinit create_core_data(struct platform_device *pdev,
+static int __cpuinit create_temp_data(struct platform_device *pdev,
 				unsigned int cpu, int pkg_flag)
 {
 	struct temp_data *tdata;
@@ -440,9 +507,6 @@ static int __cpuinit create_core_data(struct platform_device *pdev,
 	 */
 	attr_no = pkg_flag ? 1 : TO_ATTR_NO(cpu);
 
-	if (attr_no > MAX_CORE_DATA - 1)
-		return -ERANGE;
-
 	/*
 	 * Provide a single set of attributes for all HT siblings of a core
 	 * to avoid duplicate sensors (the processor ID and core ID of all
@@ -450,8 +514,14 @@ static int __cpuinit create_core_data(struct platform_device *pdev,
 	 * Skip if a HT sibling of this core is already registered.
 	 * This is not an error.
 	 */
-	if (pdata->core_data[attr_no] != NULL)
-		return 0;
+	spin_lock(&pdata->temp_data_lock);
+	list_for_each_entry(tdata, &pdata->temp_data_list, list) {
+		if (tdata->id == attr_no) {
+			spin_unlock(&pdata->temp_data_lock);
+			return 0;
+		}
+	}
+	spin_unlock(&pdata->temp_data_lock);
 
 	tdata = init_temp_data(cpu, pkg_flag);
 	if (!tdata)
@@ -480,16 +550,23 @@ static int __cpuinit create_core_data(struct platform_device *pdev,
 		}
 	}
 
-	pdata->core_data[attr_no] = tdata;
+	tdata->id = attr_no;
+
+	spin_lock(&pdata->temp_data_lock);
+	list_add(&tdata->list, &pdata->temp_data_list);
+	spin_unlock(&pdata->temp_data_lock);
 
 	/* Create sysfs interfaces */
 	err = create_core_attrs(tdata, &pdev->dev, attr_no);
 	if (err)
-		goto exit_free;
+		goto exit_list_del;
 
 	return 0;
+exit_list_del:
+	spin_lock(&pdata->temp_data_lock);
+	list_del(&tdata->list);
+	spin_unlock(&pdata->temp_data_lock);
 exit_free:
-	pdata->core_data[attr_no] = NULL;
 	kfree(tdata);
 	return err;
 }
@@ -502,23 +579,21 @@ static void __cpuinit coretemp_add_core(unsigned int cpu, int pkg_flag)
 	if (!pdev)
 		return;
 
-	err = create_core_data(pdev, cpu, pkg_flag);
+	err = create_temp_data(pdev, cpu, pkg_flag);
 	if (err)
 		dev_err(&pdev->dev, "Adding Core %u failed\n", cpu);
 }
 
-static void coretemp_remove_core(struct platform_data *pdata,
-				struct device *dev, int indx)
+static void coretemp_remove_core(struct temp_data *tdata, struct device *dev)
 {
 	int i;
-	struct temp_data *tdata = pdata->core_data[indx];
 
 	/* Remove the sysfs attributes */
 	for (i = 0; i < tdata->attr_size; i++)
 		device_remove_file(dev, &tdata->sd_attrs[i].dev_attr);
 
-	kfree(pdata->core_data[indx]);
-	pdata->core_data[indx] = NULL;
+	list_del(&tdata->list);
+	kref_put(&tdata->refcount, temp_data_release);
 }
 
 static int __devinit coretemp_probe(struct platform_device *pdev)
@@ -536,6 +611,8 @@ static int __devinit coretemp_probe(struct platform_device *pdev)
 		goto exit_free;
 
 	pdata->phys_proc_id = pdev->id;
+	INIT_LIST_HEAD(&pdata->temp_data_list);
+	spin_lock_init(&pdata->temp_data_lock);
 	platform_set_drvdata(pdev, pdata);
 
 	pdata->hwmon_dev = hwmon_device_register(&pdev->dev);
@@ -557,11 +634,12 @@ exit_free:
 static int __devexit coretemp_remove(struct platform_device *pdev)
 {
 	struct platform_data *pdata = platform_get_drvdata(pdev);
-	int i;
+	struct temp_data *cur, *tmp;
 
-	for (i = MAX_CORE_DATA - 1; i >= 0; --i)
-		if (pdata->core_data[i])
-			coretemp_remove_core(pdata, &pdev->dev, i);
+	spin_lock(&pdata->temp_data_lock);
+	list_for_each_entry_safe(cur, tmp, &pdata->temp_data_list, list)
+		coretemp_remove_core(cur, &pdev->dev);
+	spin_unlock(&pdata->temp_data_lock);
 
 	device_remove_file(&pdev->dev, &pdata->name_attr);
 	hwmon_device_unregister(pdata->hwmon_dev);
@@ -641,16 +719,17 @@ static void __cpuinit coretemp_device_remove(unsigned int cpu)
 
 static bool __cpuinit is_any_core_online(struct platform_data *pdata)
 {
-	int i;
-
-	/* Find online cores, except pkgtemp data */
-	for (i = MAX_CORE_DATA - 1; i >= 0; --i) {
-		if (pdata->core_data[i] &&
-			!pdata->core_data[i]->is_pkg_data) {
-			return true;
-		}
-	}
-	return false;
+	struct temp_data *tdata;
+	bool ret = true;
+
+	spin_lock(&pdata->temp_data_lock);
+	list_for_each_entry(tdata, &pdata->temp_data_list, list)
+		if (!tdata->is_pkg_data)
+			 goto out;
+	ret = false;
+out:
+	spin_unlock(&pdata->temp_data_lock);
+	return ret;
 }
 
 static void __cpuinit get_core_online(unsigned int cpu)
@@ -697,9 +776,10 @@ static void __cpuinit get_core_online(unsigned int cpu)
 
 static void __cpuinit put_core_offline(unsigned int cpu)
 {
-	int i, indx;
+	int i, attr_no;
 	struct platform_data *pdata;
 	struct platform_device *pdev = coretemp_get_pdev(cpu);
+	struct temp_data *tdata;
 
 	/* If the physical CPU device does not exist, just return */
 	if (!pdev)
@@ -707,14 +787,12 @@ static void __cpuinit put_core_offline(unsigned int cpu)
 
 	pdata = platform_get_drvdata(pdev);
 
-	indx = TO_ATTR_NO(cpu);
-
-	/* The core id is too big, just return */
-	if (indx > MAX_CORE_DATA - 1)
-		return;
+	attr_no = TO_ATTR_NO(cpu);
 
-	if (pdata->core_data[indx] && pdata->core_data[indx]->cpu == cpu)
-		coretemp_remove_core(pdata, &pdev->dev, indx);
+	tdata = get_temp_data(pdata, attr_no);
+	if (tdata->cpu == cpu)
+		coretemp_remove_core(tdata, &pdev->dev);
+	kref_put(&tdata->refcount, temp_data_release);
 
 	/*
 	 * If a HT sibling of a core is taken offline, but another HT sibling
-- 
1.7.9.1


^ permalink raw reply related	[flat|nested] 15+ messages in thread

* RE: [lm-sensors] [PATCH] hwmon: coretemp: use list instead of fixed size array for temp data
  2012-05-02 15:10       ` [PATCH] hwmon: coretemp: use list instead of fixed size array for temp data Kirill A. Shutemov
@ 2012-05-03  5:29         ` R, Durgadoss
  2012-05-03 10:04           ` Kirill A. Shutemov
  2012-05-03 11:18         ` [PATCH, v2] " Kirill A. Shutemov
  1 sibling, 1 reply; 15+ messages in thread
From: R, Durgadoss @ 2012-05-03  5:29 UTC (permalink / raw)
  To: Kirill A. Shutemov, Yu, Fenghua, Guenter Roeck
  Cc: Andi Kleen, linux-kernel, lm-sensors

Hi,


> -----Original Message-----
> From: lm-sensors-bounces@lm-sensors.org [mailto:lm-sensors-bounces@lm-
> sensors.org] On Behalf Of Kirill A. Shutemov
> Sent: Wednesday, May 02, 2012 8:41 PM
> To: Yu, Fenghua; Guenter Roeck
> Cc: Andi Kleen; linux-kernel@vger.kernel.org; Kirill A. Shutemov; lm-sensors@lm-
> sensors.org
> Subject: [lm-sensors] [PATCH] hwmon: coretemp: use list instead of fixed size
> array for temp data
> 
> From: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>
> 
> Let's rework code to allow arbitrary number of cores on a CPU, not
> limited by hardcoded array size.
> 
> Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
> ---
>  drivers/hwmon/coretemp.c |  176 +++++++++++++++++++++++++++++++++---
> ----------
>  1 files changed, 127 insertions(+), 49 deletions(-)
> 
> diff --git a/drivers/hwmon/coretemp.c b/drivers/hwmon/coretemp.c
> index 54a70fe..f5f9108 100644
> --- a/drivers/hwmon/coretemp.c
> +++ b/drivers/hwmon/coretemp.c
> @@ -36,6 +36,8 @@
>  #include <linux/cpu.h>
>  #include <linux/pci.h>
>  #include <linux/smp.h>
> +#include <linux/list.h>
> +#include <linux/kref.h>
>  #include <linux/moduleparam.h>
>  #include <asm/msr.h>
>  #include <asm/processor.h>
> @@ -52,11 +54,9 @@ module_param_named(tjmax, force_tjmax, int, 0444);
>  MODULE_PARM_DESC(tjmax, "TjMax value in degrees Celsius");
> 
>  #define BASE_SYSFS_ATTR_NO	2	/* Sysfs Base attr no for coretemp
> */
> -#define NUM_REAL_CORES		16	/* Number of Real cores per cpu
> */
>  #define CORETEMP_NAME_LENGTH	17	/* String Length of attrs */
>  #define MAX_CORE_ATTRS		4	/* Maximum no of basic attrs */
>  #define TOTAL_ATTRS		(MAX_CORE_ATTRS + 1)
> -#define MAX_CORE_DATA		(NUM_REAL_CORES +
> BASE_SYSFS_ATTR_NO)
> 
>  #define TO_PHYS_ID(cpu)		(cpu_data(cpu).phys_proc_id)
>  #define TO_CORE_ID(cpu)		(cpu_data(cpu).cpu_core_id)
> @@ -82,6 +82,9 @@ MODULE_PARM_DESC(tjmax, "TjMax value in degrees
> Celsius");
>   * @valid: If this is 1, the current temperature is valid.
>   */
>  struct temp_data {
> +	struct list_head list;
> +	struct kref refcount;
> +	int id;
>  	int temp;
>  	int ttarget;
>  	int tjmax;
> @@ -101,7 +104,8 @@ struct temp_data {
>  struct platform_data {
>  	struct device *hwmon_dev;
>  	u16 phys_proc_id;
> -	struct temp_data *core_data[MAX_CORE_DATA];
> +	struct list_head temp_data_list;
> +	spinlock_t temp_data_lock;
>  	struct device_attribute name_attr;
>  };
> 
> @@ -114,6 +118,29 @@ struct pdev_entry {
>  static LIST_HEAD(pdev_list);
>  static DEFINE_MUTEX(pdev_list_mutex);
> 
> +static void temp_data_release(struct kref *ref)
> +{
> +	struct temp_data *tdata = container_of(ref, struct temp_data, refcount);
> +
> +	kfree(tdata);
> +}
> +
> +static struct temp_data *get_temp_data(struct platform_data *pdata, int id)
> +{
> +	struct temp_data *tdata;
> +
> +	spin_lock(&pdata->temp_data_lock);
> +	list_for_each_entry(tdata, &pdata->temp_data_list, list)
> +		if (tdata->id == id) {
> +			kref_get(&tdata->refcount);
> +			goto out;
> +		}
> +	tdata = NULL;
> +out:
> +	spin_unlock(&pdata->temp_data_lock);
> +	return tdata;
> +}
> +
>  static ssize_t show_name(struct device *dev,
>  			struct device_attribute *devattr, char *buf)
>  {
> @@ -125,12 +152,19 @@ static ssize_t show_label(struct device *dev,
>  {
>  	struct sensor_device_attribute *attr = to_sensor_dev_attr(devattr);
>  	struct platform_data *pdata = dev_get_drvdata(dev);
> -	struct temp_data *tdata = pdata->core_data[attr->index];
> +	struct temp_data *tdata = get_temp_data(pdata, attr->index);
> +	ssize_t ret;
> +
> +	if (!tdata)
> +		return -ENOENT;
> 
>  	if (tdata->is_pkg_data)
> -		return sprintf(buf, "Physical id %u\n", pdata->phys_proc_id);
> +		ret = sprintf(buf, "Physical id %u\n", pdata->phys_proc_id);
> +	else
> +		ret = sprintf(buf, "Core %u\n", tdata->cpu_core_id);
> 
> -	return sprintf(buf, "Core %u\n", tdata->cpu_core_id);
> +	kref_put(&tdata->refcount, temp_data_release);
> +	return ret;
>  }
> 
>  static ssize_t show_crit_alarm(struct device *dev,
> @@ -139,11 +173,18 @@ static ssize_t show_crit_alarm(struct device *dev,
>  	u32 eax, edx;
>  	struct sensor_device_attribute *attr = to_sensor_dev_attr(devattr);
>  	struct platform_data *pdata = dev_get_drvdata(dev);
> -	struct temp_data *tdata = pdata->core_data[attr->index];
> +	struct temp_data *tdata = get_temp_data(pdata, attr->index);
> +	ssize_t ret;
> +
> +	if (!tdata)
> +		return -ENOENT;
> 
>  	rdmsr_on_cpu(tdata->cpu, tdata->status_reg, &eax, &edx);
> 
> -	return sprintf(buf, "%d\n", (eax >> 5) & 1);
> +	ret = sprintf(buf, "%d\n", (eax >> 5) & 1);
> +
> +	kref_put(&tdata->refcount, temp_data_release);
> +	return ret;
>  }
> 
>  static ssize_t show_tjmax(struct device *dev,
> @@ -151,8 +192,16 @@ static ssize_t show_tjmax(struct device *dev,
>  {
>  	struct sensor_device_attribute *attr = to_sensor_dev_attr(devattr);
>  	struct platform_data *pdata = dev_get_drvdata(dev);
> +	struct temp_data *tdata = get_temp_data(pdata, attr->index);
> +	ssize_t ret;
> +
> +	if (!tdata)
> +		return -ENOENT;
> +
> +	ret = sprintf(buf, "%d\n", tdata->tjmax);
> 
> -	return sprintf(buf, "%d\n", pdata->core_data[attr->index]->tjmax);
> +	kref_put(&tdata->refcount, temp_data_release);
> +	return ret;
>  }
> 
>  static ssize_t show_ttarget(struct device *dev,
> @@ -160,8 +209,16 @@ static ssize_t show_ttarget(struct device *dev,
>  {
>  	struct sensor_device_attribute *attr = to_sensor_dev_attr(devattr);
>  	struct platform_data *pdata = dev_get_drvdata(dev);
> +	struct temp_data *tdata = get_temp_data(pdata, attr->index);
> +	ssize_t ret;
> 
> -	return sprintf(buf, "%d\n", pdata->core_data[attr->index]->ttarget);
> +	if (!tdata)
> +		return -ENOENT;
> +
> +	ret = sprintf(buf, "%d\n", tdata->ttarget);
> +
> +	kref_put(&tdata->refcount, temp_data_release);
> +	return ret;
>  }
> 
>  static ssize_t show_temp(struct device *dev,
> @@ -170,7 +227,11 @@ static ssize_t show_temp(struct device *dev,
>  	u32 eax, edx;
>  	struct sensor_device_attribute *attr = to_sensor_dev_attr(devattr);
>  	struct platform_data *pdata = dev_get_drvdata(dev);
> -	struct temp_data *tdata = pdata->core_data[attr->index];
> +	struct temp_data *tdata = get_temp_data(pdata, attr->index);
> +	ssize_t ret = -EAGAIN;
> +
> +	if (!tdata)
> +		return -ENOENT;
> 
>  	mutex_lock(&tdata->update_lock);
> 
> @@ -188,7 +249,12 @@ static ssize_t show_temp(struct device *dev,
>  	}
> 
>  	mutex_unlock(&tdata->update_lock);
> -	return tdata->valid ? sprintf(buf, "%d\n", tdata->temp) : -EAGAIN;
> +
> +	if (tdata->valid)
> +		ret = sprintf(buf, "%d\n", tdata->temp);
> +
> +	kref_put(&tdata->refcount, temp_data_release);
> +	return ret;
>  }
> 
>  static int __cpuinit adjust_tjmax(struct cpuinfo_x86 *c, u32 id,
> @@ -412,6 +478,7 @@ static struct temp_data __cpuinit
> *init_temp_data(unsigned int cpu,
>  	tdata = kzalloc(sizeof(struct temp_data), GFP_KERNEL);
>  	if (!tdata)
>  		return NULL;
> +	kref_init(&tdata->refcount);
> 
>  	tdata->status_reg = pkg_flag ? MSR_IA32_PACKAGE_THERM_STATUS :
> 
> 	MSR_IA32_THERM_STATUS;
> @@ -423,7 +490,7 @@ static struct temp_data __cpuinit
> *init_temp_data(unsigned int cpu,
>  	return tdata;
>  }
> 
> -static int __cpuinit create_core_data(struct platform_device *pdev,
> +static int __cpuinit create_temp_data(struct platform_device *pdev,
>  				unsigned int cpu, int pkg_flag)
>  {
>  	struct temp_data *tdata;
> @@ -440,9 +507,6 @@ static int __cpuinit create_core_data(struct
> platform_device *pdev,
>  	 */
>  	attr_no = pkg_flag ? 1 : TO_ATTR_NO(cpu);
> 
> -	if (attr_no > MAX_CORE_DATA - 1)
> -		return -ERANGE;
> -
>  	/*
>  	 * Provide a single set of attributes for all HT siblings of a core
>  	 * to avoid duplicate sensors (the processor ID and core ID of all
> @@ -450,8 +514,14 @@ static int __cpuinit create_core_data(struct
> platform_device *pdev,
>  	 * Skip if a HT sibling of this core is already registered.
>  	 * This is not an error.
>  	 */
> -	if (pdata->core_data[attr_no] != NULL)
> -		return 0;
> +	spin_lock(&pdata->temp_data_lock);
> +	list_for_each_entry(tdata, &pdata->temp_data_list, list) {
> +		if (tdata->id == attr_no) {
> +			spin_unlock(&pdata->temp_data_lock);
> +			return 0;
> +		}
> +	}
> +	spin_unlock(&pdata->temp_data_lock);
> 
>  	tdata = init_temp_data(cpu, pkg_flag);
>  	if (!tdata)
> @@ -480,16 +550,23 @@ static int __cpuinit create_core_data(struct
> platform_device *pdev,
>  		}
>  	}
> 
> -	pdata->core_data[attr_no] = tdata;
> +	tdata->id = attr_no;
> +
> +	spin_lock(&pdata->temp_data_lock);
> +	list_add(&tdata->list, &pdata->temp_data_list);
> +	spin_unlock(&pdata->temp_data_lock);
> 
>  	/* Create sysfs interfaces */
>  	err = create_core_attrs(tdata, &pdev->dev, attr_no);
>  	if (err)
> -		goto exit_free;
> +		goto exit_list_del;
> 
>  	return 0;
> +exit_list_del:
> +	spin_lock(&pdata->temp_data_lock);
> +	list_del(&tdata->list);
> +	spin_unlock(&pdata->temp_data_lock);
>  exit_free:
> -	pdata->core_data[attr_no] = NULL;
>  	kfree(tdata);
>  	return err;
>  }
> @@ -502,23 +579,21 @@ static void __cpuinit coretemp_add_core(unsigned int
> cpu, int pkg_flag)
>  	if (!pdev)
>  		return;
> 
> -	err = create_core_data(pdev, cpu, pkg_flag);
> +	err = create_temp_data(pdev, cpu, pkg_flag);
>  	if (err)
>  		dev_err(&pdev->dev, "Adding Core %u failed\n", cpu);
>  }
> 
> -static void coretemp_remove_core(struct platform_data *pdata,
> -				struct device *dev, int indx)
> +static void coretemp_remove_core(struct temp_data *tdata, struct device
> *dev)
>  {
>  	int i;
> -	struct temp_data *tdata = pdata->core_data[indx];
> 
>  	/* Remove the sysfs attributes */
>  	for (i = 0; i < tdata->attr_size; i++)
>  		device_remove_file(dev, &tdata->sd_attrs[i].dev_attr);
> 
> -	kfree(pdata->core_data[indx]);
> -	pdata->core_data[indx] = NULL;
> +	list_del(&tdata->list);
> +	kref_put(&tdata->refcount, temp_data_release);
>  }
> 
>  static int __devinit coretemp_probe(struct platform_device *pdev)
> @@ -536,6 +611,8 @@ static int __devinit coretemp_probe(struct
> platform_device *pdev)
>  		goto exit_free;
> 
>  	pdata->phys_proc_id = pdev->id;
> +	INIT_LIST_HEAD(&pdata->temp_data_list);
> +	spin_lock_init(&pdata->temp_data_lock);
>  	platform_set_drvdata(pdev, pdata);
> 
>  	pdata->hwmon_dev = hwmon_device_register(&pdev->dev);
> @@ -557,11 +634,12 @@ exit_free:
>  static int __devexit coretemp_remove(struct platform_device *pdev)
>  {
>  	struct platform_data *pdata = platform_get_drvdata(pdev);
> -	int i;
> +	struct temp_data *cur, *tmp;
> 
> -	for (i = MAX_CORE_DATA - 1; i >= 0; --i)
> -		if (pdata->core_data[i])
> -			coretemp_remove_core(pdata, &pdev->dev, i);
> +	spin_lock(&pdata->temp_data_lock);
> +	list_for_each_entry_safe(cur, tmp, &pdata->temp_data_list, list)
> +		coretemp_remove_core(cur, &pdev->dev);
> +	spin_unlock(&pdata->temp_data_lock);
> 
>  	device_remove_file(&pdev->dev, &pdata->name_attr);
>  	hwmon_device_unregister(pdata->hwmon_dev);
> @@ -641,16 +719,17 @@ static void __cpuinit
> coretemp_device_remove(unsigned int cpu)
> 
>  static bool __cpuinit is_any_core_online(struct platform_data *pdata)
>  {
> -	int i;
> -
> -	/* Find online cores, except pkgtemp data */
> -	for (i = MAX_CORE_DATA - 1; i >= 0; --i) {
> -		if (pdata->core_data[i] &&
> -			!pdata->core_data[i]->is_pkg_data) {
> -			return true;
> -		}
> -	}
> -	return false;
> +	struct temp_data *tdata;
> +	bool ret = true;
> +
> +	spin_lock(&pdata->temp_data_lock);
> +	list_for_each_entry(tdata, &pdata->temp_data_list, list)
> +		if (!tdata->is_pkg_data)
> +			 goto out;
> +	ret = false;
> +out:
> +	spin_unlock(&pdata->temp_data_lock);
> +	return ret;
>  }
> 
>  static void __cpuinit get_core_online(unsigned int cpu)
> @@ -697,9 +776,10 @@ static void __cpuinit get_core_online(unsigned int cpu)
> 
>  static void __cpuinit put_core_offline(unsigned int cpu)
>  {
> -	int i, indx;
> +	int i, attr_no;
>  	struct platform_data *pdata;
>  	struct platform_device *pdev = coretemp_get_pdev(cpu);
> +	struct temp_data *tdata;
> 
>  	/* If the physical CPU device does not exist, just return */
>  	if (!pdev)
> @@ -707,14 +787,12 @@ static void __cpuinit put_core_offline(unsigned int cpu)
> 
>  	pdata = platform_get_drvdata(pdev);
> 
> -	indx = TO_ATTR_NO(cpu);
> -
> -	/* The core id is too big, just return */
> -	if (indx > MAX_CORE_DATA - 1)
> -		return;
> +	attr_no = TO_ATTR_NO(cpu);
> 
> -	if (pdata->core_data[indx] && pdata->core_data[indx]->cpu == cpu)
> -		coretemp_remove_core(pdata, &pdev->dev, indx);
> +	tdata = get_temp_data(pdata, attr_no);
> +	if (tdata->cpu == cpu)

The get_temp_data can return a NULL. So, you might want to do,
if (tdata && tdata->cpu == cpu) to avoid a potential NULL ptr crash.

In general, why are we using spin_locks instead of mutex_locks,
for list manipulations .. ?

Thanks,
Durga


^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [lm-sensors] [PATCH] hwmon: coretemp: use list instead of fixed size array for temp data
  2012-05-03  5:29         ` [lm-sensors] " R, Durgadoss
@ 2012-05-03 10:04           ` Kirill A. Shutemov
  0 siblings, 0 replies; 15+ messages in thread
From: Kirill A. Shutemov @ 2012-05-03 10:04 UTC (permalink / raw)
  To: R, Durgadoss
  Cc: Yu, Fenghua, Guenter Roeck, Andi Kleen, linux-kernel, lm-sensors

[-- Attachment #1: Type: text/plain, Size: 986 bytes --]

On Thu, May 03, 2012 at 05:29:43AM +0000, R, Durgadoss wrote:
> > @@ -707,14 +787,12 @@ static void __cpuinit put_core_offline(unsigned int cpu)
> > 
> >  	pdata = platform_get_drvdata(pdev);
> > 
> > -	indx = TO_ATTR_NO(cpu);
> > -
> > -	/* The core id is too big, just return */
> > -	if (indx > MAX_CORE_DATA - 1)
> > -		return;
> > +	attr_no = TO_ATTR_NO(cpu);
> > 
> > -	if (pdata->core_data[indx] && pdata->core_data[indx]->cpu == cpu)
> > -		coretemp_remove_core(pdata, &pdev->dev, indx);
> > +	tdata = get_temp_data(pdata, attr_no);
> > +	if (tdata->cpu == cpu)
> 
> The get_temp_data can return a NULL. So, you might want to do,
> if (tdata && tdata->cpu == cpu) to avoid a potential NULL ptr crash.

Good catch, thanks.

> In general, why are we using spin_locks instead of mutex_locks,
> for list manipulations .. ?

I don't think it matters here. Contention is low and nobody sleeps, but
okay, I'll change it to mutex.

-- 
 Kirill A. Shutemov

[-- Attachment #2: Digital signature --]
[-- Type: application/pgp-signature, Size: 836 bytes --]

^ permalink raw reply	[flat|nested] 15+ messages in thread

* [PATCH, v2] hwmon: coretemp: use list instead of fixed size array for temp data
  2012-05-02 15:10       ` [PATCH] hwmon: coretemp: use list instead of fixed size array for temp data Kirill A. Shutemov
  2012-05-03  5:29         ` [lm-sensors] " R, Durgadoss
@ 2012-05-03 11:18         ` Kirill A. Shutemov
  2012-05-04  5:41           ` Guenter Roeck
  1 sibling, 1 reply; 15+ messages in thread
From: Kirill A. Shutemov @ 2012-05-03 11:18 UTC (permalink / raw)
  To: Fenghua Yu, Guenter Roeck, Durgadoss R
  Cc: Andi Kleen, Jean Delvare, lm-sensors, linux-kernel, Kirill A. Shutemov

From: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>

Let's rework code to allow arbitrary number of cores on a CPU, not
limited by hardcoded array size.

Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
---
 v2:
   - fix NULL pointer dereference. Thanks to R, Durgadoss;
   - use mutex instead of spinlock for list locking.
---
 drivers/hwmon/coretemp.c |  178 +++++++++++++++++++++++++++++++++-------------
 1 files changed, 129 insertions(+), 49 deletions(-)

diff --git a/drivers/hwmon/coretemp.c b/drivers/hwmon/coretemp.c
index 54a70fe..1c66131 100644
--- a/drivers/hwmon/coretemp.c
+++ b/drivers/hwmon/coretemp.c
@@ -36,6 +36,8 @@
 #include <linux/cpu.h>
 #include <linux/pci.h>
 #include <linux/smp.h>
+#include <linux/list.h>
+#include <linux/kref.h>
 #include <linux/moduleparam.h>
 #include <asm/msr.h>
 #include <asm/processor.h>
@@ -52,11 +54,9 @@ module_param_named(tjmax, force_tjmax, int, 0444);
 MODULE_PARM_DESC(tjmax, "TjMax value in degrees Celsius");
 
 #define BASE_SYSFS_ATTR_NO	2	/* Sysfs Base attr no for coretemp */
-#define NUM_REAL_CORES		16	/* Number of Real cores per cpu */
 #define CORETEMP_NAME_LENGTH	17	/* String Length of attrs */
 #define MAX_CORE_ATTRS		4	/* Maximum no of basic attrs */
 #define TOTAL_ATTRS		(MAX_CORE_ATTRS + 1)
-#define MAX_CORE_DATA		(NUM_REAL_CORES + BASE_SYSFS_ATTR_NO)
 
 #define TO_PHYS_ID(cpu)		(cpu_data(cpu).phys_proc_id)
 #define TO_CORE_ID(cpu)		(cpu_data(cpu).cpu_core_id)
@@ -82,6 +82,9 @@ MODULE_PARM_DESC(tjmax, "TjMax value in degrees Celsius");
  * @valid: If this is 1, the current temperature is valid.
  */
 struct temp_data {
+	struct list_head list;
+	struct kref refcount;
+	int id;
 	int temp;
 	int ttarget;
 	int tjmax;
@@ -101,7 +104,8 @@ struct temp_data {
 struct platform_data {
 	struct device *hwmon_dev;
 	u16 phys_proc_id;
-	struct temp_data *core_data[MAX_CORE_DATA];
+	struct list_head temp_data_list;
+	struct mutex temp_data_lock;
 	struct device_attribute name_attr;
 };
 
@@ -114,6 +118,29 @@ struct pdev_entry {
 static LIST_HEAD(pdev_list);
 static DEFINE_MUTEX(pdev_list_mutex);
 
+static void temp_data_release(struct kref *ref)
+{
+	struct temp_data *tdata = container_of(ref, struct temp_data, refcount);
+
+	kfree(tdata);
+}
+
+static struct temp_data *get_temp_data(struct platform_data *pdata, int id)
+{
+	struct temp_data *tdata;
+
+	mutex_lock(&pdata->temp_data_lock);
+	list_for_each_entry(tdata, &pdata->temp_data_list, list)
+		if (tdata->id == id) {
+			kref_get(&tdata->refcount);
+			goto out;
+		}
+	tdata = NULL;
+out:
+	mutex_unlock(&pdata->temp_data_lock);
+	return tdata;
+}
+
 static ssize_t show_name(struct device *dev,
 			struct device_attribute *devattr, char *buf)
 {
@@ -125,12 +152,19 @@ static ssize_t show_label(struct device *dev,
 {
 	struct sensor_device_attribute *attr = to_sensor_dev_attr(devattr);
 	struct platform_data *pdata = dev_get_drvdata(dev);
-	struct temp_data *tdata = pdata->core_data[attr->index];
+	struct temp_data *tdata = get_temp_data(pdata, attr->index);
+	ssize_t ret;
+
+	if (!tdata)
+		return -ENOENT;
 
 	if (tdata->is_pkg_data)
-		return sprintf(buf, "Physical id %u\n", pdata->phys_proc_id);
+		ret = sprintf(buf, "Physical id %u\n", pdata->phys_proc_id);
+	else
+		ret = sprintf(buf, "Core %u\n", tdata->cpu_core_id);
 
-	return sprintf(buf, "Core %u\n", tdata->cpu_core_id);
+	kref_put(&tdata->refcount, temp_data_release);
+	return ret;
 }
 
 static ssize_t show_crit_alarm(struct device *dev,
@@ -139,11 +173,18 @@ static ssize_t show_crit_alarm(struct device *dev,
 	u32 eax, edx;
 	struct sensor_device_attribute *attr = to_sensor_dev_attr(devattr);
 	struct platform_data *pdata = dev_get_drvdata(dev);
-	struct temp_data *tdata = pdata->core_data[attr->index];
+	struct temp_data *tdata = get_temp_data(pdata, attr->index);
+	ssize_t ret;
+
+	if (!tdata)
+		return -ENOENT;
 
 	rdmsr_on_cpu(tdata->cpu, tdata->status_reg, &eax, &edx);
 
-	return sprintf(buf, "%d\n", (eax >> 5) & 1);
+	ret = sprintf(buf, "%d\n", (eax >> 5) & 1);
+
+	kref_put(&tdata->refcount, temp_data_release);
+	return ret;
 }
 
 static ssize_t show_tjmax(struct device *dev,
@@ -151,8 +192,16 @@ static ssize_t show_tjmax(struct device *dev,
 {
 	struct sensor_device_attribute *attr = to_sensor_dev_attr(devattr);
 	struct platform_data *pdata = dev_get_drvdata(dev);
+	struct temp_data *tdata = get_temp_data(pdata, attr->index);
+	ssize_t ret;
+
+	if (!tdata)
+		return -ENOENT;
+
+	ret = sprintf(buf, "%d\n", tdata->tjmax);
 
-	return sprintf(buf, "%d\n", pdata->core_data[attr->index]->tjmax);
+	kref_put(&tdata->refcount, temp_data_release);
+	return ret;
 }
 
 static ssize_t show_ttarget(struct device *dev,
@@ -160,8 +209,16 @@ static ssize_t show_ttarget(struct device *dev,
 {
 	struct sensor_device_attribute *attr = to_sensor_dev_attr(devattr);
 	struct platform_data *pdata = dev_get_drvdata(dev);
+	struct temp_data *tdata = get_temp_data(pdata, attr->index);
+	ssize_t ret;
 
-	return sprintf(buf, "%d\n", pdata->core_data[attr->index]->ttarget);
+	if (!tdata)
+		return -ENOENT;
+
+	ret = sprintf(buf, "%d\n", tdata->ttarget);
+
+	kref_put(&tdata->refcount, temp_data_release);
+	return ret;
 }
 
 static ssize_t show_temp(struct device *dev,
@@ -170,7 +227,11 @@ static ssize_t show_temp(struct device *dev,
 	u32 eax, edx;
 	struct sensor_device_attribute *attr = to_sensor_dev_attr(devattr);
 	struct platform_data *pdata = dev_get_drvdata(dev);
-	struct temp_data *tdata = pdata->core_data[attr->index];
+	struct temp_data *tdata = get_temp_data(pdata, attr->index);
+	ssize_t ret = -EAGAIN;
+
+	if (!tdata)
+		return -ENOENT;
 
 	mutex_lock(&tdata->update_lock);
 
@@ -188,7 +249,12 @@ static ssize_t show_temp(struct device *dev,
 	}
 
 	mutex_unlock(&tdata->update_lock);
-	return tdata->valid ? sprintf(buf, "%d\n", tdata->temp) : -EAGAIN;
+
+	if (tdata->valid)
+		ret = sprintf(buf, "%d\n", tdata->temp);
+
+	kref_put(&tdata->refcount, temp_data_release);
+	return ret;
 }
 
 static int __cpuinit adjust_tjmax(struct cpuinfo_x86 *c, u32 id,
@@ -412,6 +478,7 @@ static struct temp_data __cpuinit *init_temp_data(unsigned int cpu,
 	tdata = kzalloc(sizeof(struct temp_data), GFP_KERNEL);
 	if (!tdata)
 		return NULL;
+	kref_init(&tdata->refcount);
 
 	tdata->status_reg = pkg_flag ? MSR_IA32_PACKAGE_THERM_STATUS :
 							MSR_IA32_THERM_STATUS;
@@ -423,7 +490,7 @@ static struct temp_data __cpuinit *init_temp_data(unsigned int cpu,
 	return tdata;
 }
 
-static int __cpuinit create_core_data(struct platform_device *pdev,
+static int __cpuinit create_temp_data(struct platform_device *pdev,
 				unsigned int cpu, int pkg_flag)
 {
 	struct temp_data *tdata;
@@ -440,9 +507,6 @@ static int __cpuinit create_core_data(struct platform_device *pdev,
 	 */
 	attr_no = pkg_flag ? 1 : TO_ATTR_NO(cpu);
 
-	if (attr_no > MAX_CORE_DATA - 1)
-		return -ERANGE;
-
 	/*
 	 * Provide a single set of attributes for all HT siblings of a core
 	 * to avoid duplicate sensors (the processor ID and core ID of all
@@ -450,8 +514,14 @@ static int __cpuinit create_core_data(struct platform_device *pdev,
 	 * Skip if a HT sibling of this core is already registered.
 	 * This is not an error.
 	 */
-	if (pdata->core_data[attr_no] != NULL)
-		return 0;
+	mutex_lock(&pdata->temp_data_lock);
+	list_for_each_entry(tdata, &pdata->temp_data_list, list) {
+		if (tdata->id == attr_no) {
+			mutex_unlock(&pdata->temp_data_lock);
+			return 0;
+		}
+	}
+	mutex_unlock(&pdata->temp_data_lock);
 
 	tdata = init_temp_data(cpu, pkg_flag);
 	if (!tdata)
@@ -480,16 +550,23 @@ static int __cpuinit create_core_data(struct platform_device *pdev,
 		}
 	}
 
-	pdata->core_data[attr_no] = tdata;
+	tdata->id = attr_no;
+
+	mutex_lock(&pdata->temp_data_lock);
+	list_add(&tdata->list, &pdata->temp_data_list);
+	mutex_unlock(&pdata->temp_data_lock);
 
 	/* Create sysfs interfaces */
 	err = create_core_attrs(tdata, &pdev->dev, attr_no);
 	if (err)
-		goto exit_free;
+		goto exit_list_del;
 
 	return 0;
+exit_list_del:
+	mutex_lock(&pdata->temp_data_lock);
+	list_del(&tdata->list);
+	mutex_unlock(&pdata->temp_data_lock);
 exit_free:
-	pdata->core_data[attr_no] = NULL;
 	kfree(tdata);
 	return err;
 }
@@ -502,23 +579,21 @@ static void __cpuinit coretemp_add_core(unsigned int cpu, int pkg_flag)
 	if (!pdev)
 		return;
 
-	err = create_core_data(pdev, cpu, pkg_flag);
+	err = create_temp_data(pdev, cpu, pkg_flag);
 	if (err)
 		dev_err(&pdev->dev, "Adding Core %u failed\n", cpu);
 }
 
-static void coretemp_remove_core(struct platform_data *pdata,
-				struct device *dev, int indx)
+static void coretemp_remove_core(struct temp_data *tdata, struct device *dev)
 {
 	int i;
-	struct temp_data *tdata = pdata->core_data[indx];
 
 	/* Remove the sysfs attributes */
 	for (i = 0; i < tdata->attr_size; i++)
 		device_remove_file(dev, &tdata->sd_attrs[i].dev_attr);
 
-	kfree(pdata->core_data[indx]);
-	pdata->core_data[indx] = NULL;
+	list_del(&tdata->list);
+	kref_put(&tdata->refcount, temp_data_release);
 }
 
 static int __devinit coretemp_probe(struct platform_device *pdev)
@@ -536,6 +611,8 @@ static int __devinit coretemp_probe(struct platform_device *pdev)
 		goto exit_free;
 
 	pdata->phys_proc_id = pdev->id;
+	INIT_LIST_HEAD(&pdata->temp_data_list);
+	mutex_init(&pdata->temp_data_lock);
 	platform_set_drvdata(pdev, pdata);
 
 	pdata->hwmon_dev = hwmon_device_register(&pdev->dev);
@@ -557,11 +634,12 @@ exit_free:
 static int __devexit coretemp_remove(struct platform_device *pdev)
 {
 	struct platform_data *pdata = platform_get_drvdata(pdev);
-	int i;
+	struct temp_data *cur, *tmp;
 
-	for (i = MAX_CORE_DATA - 1; i >= 0; --i)
-		if (pdata->core_data[i])
-			coretemp_remove_core(pdata, &pdev->dev, i);
+	mutex_lock(&pdata->temp_data_lock);
+	list_for_each_entry_safe(cur, tmp, &pdata->temp_data_list, list)
+		coretemp_remove_core(cur, &pdev->dev);
+	mutex_unlock(&pdata->temp_data_lock);
 
 	device_remove_file(&pdev->dev, &pdata->name_attr);
 	hwmon_device_unregister(pdata->hwmon_dev);
@@ -641,16 +719,17 @@ static void __cpuinit coretemp_device_remove(unsigned int cpu)
 
 static bool __cpuinit is_any_core_online(struct platform_data *pdata)
 {
-	int i;
-
-	/* Find online cores, except pkgtemp data */
-	for (i = MAX_CORE_DATA - 1; i >= 0; --i) {
-		if (pdata->core_data[i] &&
-			!pdata->core_data[i]->is_pkg_data) {
-			return true;
-		}
-	}
-	return false;
+	struct temp_data *tdata;
+	bool ret = true;
+
+	mutex_lock(&pdata->temp_data_lock);
+	list_for_each_entry(tdata, &pdata->temp_data_list, list)
+		if (!tdata->is_pkg_data)
+			 goto out;
+	ret = false;
+out:
+	mutex_unlock(&pdata->temp_data_lock);
+	return ret;
 }
 
 static void __cpuinit get_core_online(unsigned int cpu)
@@ -697,9 +776,10 @@ static void __cpuinit get_core_online(unsigned int cpu)
 
 static void __cpuinit put_core_offline(unsigned int cpu)
 {
-	int i, indx;
+	int i, attr_no;
 	struct platform_data *pdata;
 	struct platform_device *pdev = coretemp_get_pdev(cpu);
+	struct temp_data *tdata;
 
 	/* If the physical CPU device does not exist, just return */
 	if (!pdev)
@@ -707,14 +787,14 @@ static void __cpuinit put_core_offline(unsigned int cpu)
 
 	pdata = platform_get_drvdata(pdev);
 
-	indx = TO_ATTR_NO(cpu);
+	attr_no = TO_ATTR_NO(cpu);
 
-	/* The core id is too big, just return */
-	if (indx > MAX_CORE_DATA - 1)
-		return;
-
-	if (pdata->core_data[indx] && pdata->core_data[indx]->cpu == cpu)
-		coretemp_remove_core(pdata, &pdev->dev, indx);
+	tdata = get_temp_data(pdata, attr_no);
+	if (tdata) {
+		if (tdata->cpu == cpu)
+			coretemp_remove_core(tdata, &pdev->dev);
+		kref_put(&tdata->refcount, temp_data_release);
+	}
 
 	/*
 	 * If a HT sibling of a core is taken offline, but another HT sibling
-- 
1.7.9.1


^ permalink raw reply related	[flat|nested] 15+ messages in thread

* Re: [PATCH, v2] hwmon: coretemp: use list instead of fixed size array for temp data
  2012-05-03 11:18         ` [PATCH, v2] " Kirill A. Shutemov
@ 2012-05-04  5:41           ` Guenter Roeck
  2012-05-04  6:46             ` Kirill A. Shutemov
  0 siblings, 1 reply; 15+ messages in thread
From: Guenter Roeck @ 2012-05-04  5:41 UTC (permalink / raw)
  To: Kirill A. Shutemov
  Cc: Fenghua Yu, Durgadoss R, Andi Kleen, Jean Delvare, lm-sensors,
	linux-kernel

On Thu, May 03, 2012 at 07:18:56AM -0400, Kirill A. Shutemov wrote:
> From: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>
> 
> Let's rework code to allow arbitrary number of cores on a CPU, not
> limited by hardcoded array size.
> 
> Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
> ---
>  v2:
>    - fix NULL pointer dereference. Thanks to R, Durgadoss;
>    - use mutex instead of spinlock for list locking.
> ---
>  drivers/hwmon/coretemp.c |  178 +++++++++++++++++++++++++++++++++-------------
>  1 files changed, 129 insertions(+), 49 deletions(-)
> 
> diff --git a/drivers/hwmon/coretemp.c b/drivers/hwmon/coretemp.c
> index 54a70fe..1c66131 100644
> --- a/drivers/hwmon/coretemp.c
> +++ b/drivers/hwmon/coretemp.c
> @@ -36,6 +36,8 @@
>  #include <linux/cpu.h>
>  #include <linux/pci.h>
>  #include <linux/smp.h>
> +#include <linux/list.h>
> +#include <linux/kref.h>
>  #include <linux/moduleparam.h>
>  #include <asm/msr.h>
>  #include <asm/processor.h>
> @@ -52,11 +54,9 @@ module_param_named(tjmax, force_tjmax, int, 0444);
>  MODULE_PARM_DESC(tjmax, "TjMax value in degrees Celsius");
> 
>  #define BASE_SYSFS_ATTR_NO     2       /* Sysfs Base attr no for coretemp */
> -#define NUM_REAL_CORES         16      /* Number of Real cores per cpu */
>  #define CORETEMP_NAME_LENGTH   17      /* String Length of attrs */
>  #define MAX_CORE_ATTRS         4       /* Maximum no of basic attrs */
>  #define TOTAL_ATTRS            (MAX_CORE_ATTRS + 1)
> -#define MAX_CORE_DATA          (NUM_REAL_CORES + BASE_SYSFS_ATTR_NO)
> 
>  #define TO_PHYS_ID(cpu)                (cpu_data(cpu).phys_proc_id)
>  #define TO_CORE_ID(cpu)                (cpu_data(cpu).cpu_core_id)
> @@ -82,6 +82,9 @@ MODULE_PARM_DESC(tjmax, "TjMax value in degrees Celsius");
>   * @valid: If this is 1, the current temperature is valid.
>   */
>  struct temp_data {
> +       struct list_head list;
> +       struct kref refcount;

Hi,

the kref is not needed. The attribute access functions don't
need to be protected since the attributes for a core are deleted
before the core data itself is deleted. So it is not neccessary 
to hold a lock while accessing/using temp_data in the attribute
access functions. All you need is to hold a mutex while you are
manipulating or walking the list.

Thanks,
Guenter

> +       int id;
>         int temp;
>         int ttarget;
>         int tjmax;
> @@ -101,7 +104,8 @@ struct temp_data {
>  struct platform_data {
>         struct device *hwmon_dev;
>         u16 phys_proc_id;
> -       struct temp_data *core_data[MAX_CORE_DATA];
> +       struct list_head temp_data_list;
> +       struct mutex temp_data_lock;
>         struct device_attribute name_attr;
>  };
> 
> @@ -114,6 +118,29 @@ struct pdev_entry {
>  static LIST_HEAD(pdev_list);
>  static DEFINE_MUTEX(pdev_list_mutex);
> 
> +static void temp_data_release(struct kref *ref)
> +{
> +       struct temp_data *tdata = container_of(ref, struct temp_data, refcount);
> +
> +       kfree(tdata);
> +}
> +
> +static struct temp_data *get_temp_data(struct platform_data *pdata, int id)
> +{
> +       struct temp_data *tdata;
> +
> +       mutex_lock(&pdata->temp_data_lock);
> +       list_for_each_entry(tdata, &pdata->temp_data_list, list)
> +               if (tdata->id == id) {
> +                       kref_get(&tdata->refcount);
> +                       goto out;
> +               }
> +       tdata = NULL;
> +out:
> +       mutex_unlock(&pdata->temp_data_lock);
> +       return tdata;
> +}
> +
>  static ssize_t show_name(struct device *dev,
>                         struct device_attribute *devattr, char *buf)
>  {
> @@ -125,12 +152,19 @@ static ssize_t show_label(struct device *dev,
>  {
>         struct sensor_device_attribute *attr = to_sensor_dev_attr(devattr);
>         struct platform_data *pdata = dev_get_drvdata(dev);
> -       struct temp_data *tdata = pdata->core_data[attr->index];
> +       struct temp_data *tdata = get_temp_data(pdata, attr->index);
> +       ssize_t ret;
> +
> +       if (!tdata)
> +               return -ENOENT;
> 
>         if (tdata->is_pkg_data)
> -               return sprintf(buf, "Physical id %u\n", pdata->phys_proc_id);
> +               ret = sprintf(buf, "Physical id %u\n", pdata->phys_proc_id);
> +       else
> +               ret = sprintf(buf, "Core %u\n", tdata->cpu_core_id);
> 
> -       return sprintf(buf, "Core %u\n", tdata->cpu_core_id);
> +       kref_put(&tdata->refcount, temp_data_release);
> +       return ret;
>  }
> 
>  static ssize_t show_crit_alarm(struct device *dev,
> @@ -139,11 +173,18 @@ static ssize_t show_crit_alarm(struct device *dev,
>         u32 eax, edx;
>         struct sensor_device_attribute *attr = to_sensor_dev_attr(devattr);
>         struct platform_data *pdata = dev_get_drvdata(dev);
> -       struct temp_data *tdata = pdata->core_data[attr->index];
> +       struct temp_data *tdata = get_temp_data(pdata, attr->index);
> +       ssize_t ret;
> +
> +       if (!tdata)
> +               return -ENOENT;
> 
>         rdmsr_on_cpu(tdata->cpu, tdata->status_reg, &eax, &edx);
> 
> -       return sprintf(buf, "%d\n", (eax >> 5) & 1);
> +       ret = sprintf(buf, "%d\n", (eax >> 5) & 1);
> +
> +       kref_put(&tdata->refcount, temp_data_release);
> +       return ret;
>  }
> 
>  static ssize_t show_tjmax(struct device *dev,
> @@ -151,8 +192,16 @@ static ssize_t show_tjmax(struct device *dev,
>  {
>         struct sensor_device_attribute *attr = to_sensor_dev_attr(devattr);
>         struct platform_data *pdata = dev_get_drvdata(dev);
> +       struct temp_data *tdata = get_temp_data(pdata, attr->index);
> +       ssize_t ret;
> +
> +       if (!tdata)
> +               return -ENOENT;
> +
> +       ret = sprintf(buf, "%d\n", tdata->tjmax);
> 
> -       return sprintf(buf, "%d\n", pdata->core_data[attr->index]->tjmax);
> +       kref_put(&tdata->refcount, temp_data_release);
> +       return ret;
>  }
> 
>  static ssize_t show_ttarget(struct device *dev,
> @@ -160,8 +209,16 @@ static ssize_t show_ttarget(struct device *dev,
>  {
>         struct sensor_device_attribute *attr = to_sensor_dev_attr(devattr);
>         struct platform_data *pdata = dev_get_drvdata(dev);
> +       struct temp_data *tdata = get_temp_data(pdata, attr->index);
> +       ssize_t ret;
> 
> -       return sprintf(buf, "%d\n", pdata->core_data[attr->index]->ttarget);
> +       if (!tdata)
> +               return -ENOENT;
> +
> +       ret = sprintf(buf, "%d\n", tdata->ttarget);
> +
> +       kref_put(&tdata->refcount, temp_data_release);
> +       return ret;
>  }
> 
>  static ssize_t show_temp(struct device *dev,
> @@ -170,7 +227,11 @@ static ssize_t show_temp(struct device *dev,
>         u32 eax, edx;
>         struct sensor_device_attribute *attr = to_sensor_dev_attr(devattr);
>         struct platform_data *pdata = dev_get_drvdata(dev);
> -       struct temp_data *tdata = pdata->core_data[attr->index];
> +       struct temp_data *tdata = get_temp_data(pdata, attr->index);
> +       ssize_t ret = -EAGAIN;
> +
> +       if (!tdata)
> +               return -ENOENT;
> 
>         mutex_lock(&tdata->update_lock);
> 
> @@ -188,7 +249,12 @@ static ssize_t show_temp(struct device *dev,
>         }
> 
>         mutex_unlock(&tdata->update_lock);
> -       return tdata->valid ? sprintf(buf, "%d\n", tdata->temp) : -EAGAIN;
> +
> +       if (tdata->valid)
> +               ret = sprintf(buf, "%d\n", tdata->temp);
> +
> +       kref_put(&tdata->refcount, temp_data_release);
> +       return ret;
>  }
> 
>  static int __cpuinit adjust_tjmax(struct cpuinfo_x86 *c, u32 id,
> @@ -412,6 +478,7 @@ static struct temp_data __cpuinit *init_temp_data(unsigned int cpu,
>         tdata = kzalloc(sizeof(struct temp_data), GFP_KERNEL);
>         if (!tdata)
>                 return NULL;
> +       kref_init(&tdata->refcount);
> 
>         tdata->status_reg = pkg_flag ? MSR_IA32_PACKAGE_THERM_STATUS :
>                                                         MSR_IA32_THERM_STATUS;
> @@ -423,7 +490,7 @@ static struct temp_data __cpuinit *init_temp_data(unsigned int cpu,
>         return tdata;
>  }
> 
> -static int __cpuinit create_core_data(struct platform_device *pdev,
> +static int __cpuinit create_temp_data(struct platform_device *pdev,
>                                 unsigned int cpu, int pkg_flag)
>  {
>         struct temp_data *tdata;
> @@ -440,9 +507,6 @@ static int __cpuinit create_core_data(struct platform_device *pdev,
>          */
>         attr_no = pkg_flag ? 1 : TO_ATTR_NO(cpu);
> 
> -       if (attr_no > MAX_CORE_DATA - 1)
> -               return -ERANGE;
> -
>         /*
>          * Provide a single set of attributes for all HT siblings of a core
>          * to avoid duplicate sensors (the processor ID and core ID of all
> @@ -450,8 +514,14 @@ static int __cpuinit create_core_data(struct platform_device *pdev,
>          * Skip if a HT sibling of this core is already registered.
>          * This is not an error.
>          */
> -       if (pdata->core_data[attr_no] != NULL)
> -               return 0;
> +       mutex_lock(&pdata->temp_data_lock);
> +       list_for_each_entry(tdata, &pdata->temp_data_list, list) {
> +               if (tdata->id == attr_no) {
> +                       mutex_unlock(&pdata->temp_data_lock);
> +                       return 0;
> +               }
> +       }
> +       mutex_unlock(&pdata->temp_data_lock);
> 
>         tdata = init_temp_data(cpu, pkg_flag);
>         if (!tdata)
> @@ -480,16 +550,23 @@ static int __cpuinit create_core_data(struct platform_device *pdev,
>                 }
>         }
> 
> -       pdata->core_data[attr_no] = tdata;
> +       tdata->id = attr_no;
> +
> +       mutex_lock(&pdata->temp_data_lock);
> +       list_add(&tdata->list, &pdata->temp_data_list);
> +       mutex_unlock(&pdata->temp_data_lock);
> 
>         /* Create sysfs interfaces */
>         err = create_core_attrs(tdata, &pdev->dev, attr_no);
>         if (err)
> -               goto exit_free;
> +               goto exit_list_del;
> 
>         return 0;
> +exit_list_del:
> +       mutex_lock(&pdata->temp_data_lock);
> +       list_del(&tdata->list);
> +       mutex_unlock(&pdata->temp_data_lock);
>  exit_free:
> -       pdata->core_data[attr_no] = NULL;
>         kfree(tdata);
>         return err;
>  }
> @@ -502,23 +579,21 @@ static void __cpuinit coretemp_add_core(unsigned int cpu, int pkg_flag)
>         if (!pdev)
>                 return;
> 
> -       err = create_core_data(pdev, cpu, pkg_flag);
> +       err = create_temp_data(pdev, cpu, pkg_flag);
>         if (err)
>                 dev_err(&pdev->dev, "Adding Core %u failed\n", cpu);
>  }
> 
> -static void coretemp_remove_core(struct platform_data *pdata,
> -                               struct device *dev, int indx)
> +static void coretemp_remove_core(struct temp_data *tdata, struct device *dev)
>  {
>         int i;
> -       struct temp_data *tdata = pdata->core_data[indx];
> 
>         /* Remove the sysfs attributes */
>         for (i = 0; i < tdata->attr_size; i++)
>                 device_remove_file(dev, &tdata->sd_attrs[i].dev_attr);
> 
> -       kfree(pdata->core_data[indx]);
> -       pdata->core_data[indx] = NULL;
> +       list_del(&tdata->list);
> +       kref_put(&tdata->refcount, temp_data_release);
>  }
> 
>  static int __devinit coretemp_probe(struct platform_device *pdev)
> @@ -536,6 +611,8 @@ static int __devinit coretemp_probe(struct platform_device *pdev)
>                 goto exit_free;
> 
>         pdata->phys_proc_id = pdev->id;
> +       INIT_LIST_HEAD(&pdata->temp_data_list);
> +       mutex_init(&pdata->temp_data_lock);
>         platform_set_drvdata(pdev, pdata);
> 
>         pdata->hwmon_dev = hwmon_device_register(&pdev->dev);
> @@ -557,11 +634,12 @@ exit_free:
>  static int __devexit coretemp_remove(struct platform_device *pdev)
>  {
>         struct platform_data *pdata = platform_get_drvdata(pdev);
> -       int i;
> +       struct temp_data *cur, *tmp;
> 
> -       for (i = MAX_CORE_DATA - 1; i >= 0; --i)
> -               if (pdata->core_data[i])
> -                       coretemp_remove_core(pdata, &pdev->dev, i);
> +       mutex_lock(&pdata->temp_data_lock);
> +       list_for_each_entry_safe(cur, tmp, &pdata->temp_data_list, list)
> +               coretemp_remove_core(cur, &pdev->dev);
> +       mutex_unlock(&pdata->temp_data_lock);
> 
>         device_remove_file(&pdev->dev, &pdata->name_attr);
>         hwmon_device_unregister(pdata->hwmon_dev);
> @@ -641,16 +719,17 @@ static void __cpuinit coretemp_device_remove(unsigned int cpu)
> 
>  static bool __cpuinit is_any_core_online(struct platform_data *pdata)
>  {
> -       int i;
> -
> -       /* Find online cores, except pkgtemp data */
> -       for (i = MAX_CORE_DATA - 1; i >= 0; --i) {
> -               if (pdata->core_data[i] &&
> -                       !pdata->core_data[i]->is_pkg_data) {
> -                       return true;
> -               }
> -       }
> -       return false;
> +       struct temp_data *tdata;
> +       bool ret = true;
> +
> +       mutex_lock(&pdata->temp_data_lock);
> +       list_for_each_entry(tdata, &pdata->temp_data_list, list)
> +               if (!tdata->is_pkg_data)
> +                        goto out;
> +       ret = false;
> +out:
> +       mutex_unlock(&pdata->temp_data_lock);
> +       return ret;
>  }
> 
>  static void __cpuinit get_core_online(unsigned int cpu)
> @@ -697,9 +776,10 @@ static void __cpuinit get_core_online(unsigned int cpu)
> 
>  static void __cpuinit put_core_offline(unsigned int cpu)
>  {
> -       int i, indx;
> +       int i, attr_no;
>         struct platform_data *pdata;
>         struct platform_device *pdev = coretemp_get_pdev(cpu);
> +       struct temp_data *tdata;
> 
>         /* If the physical CPU device does not exist, just return */
>         if (!pdev)
> @@ -707,14 +787,14 @@ static void __cpuinit put_core_offline(unsigned int cpu)
> 
>         pdata = platform_get_drvdata(pdev);
> 
> -       indx = TO_ATTR_NO(cpu);
> +       attr_no = TO_ATTR_NO(cpu);
> 
> -       /* The core id is too big, just return */
> -       if (indx > MAX_CORE_DATA - 1)
> -               return;
> -
> -       if (pdata->core_data[indx] && pdata->core_data[indx]->cpu == cpu)
> -               coretemp_remove_core(pdata, &pdev->dev, indx);
> +       tdata = get_temp_data(pdata, attr_no);
> +       if (tdata) {
> +               if (tdata->cpu == cpu)
> +                       coretemp_remove_core(tdata, &pdev->dev);
> +               kref_put(&tdata->refcount, temp_data_release);
> +       }
> 
>         /*
>          * If a HT sibling of a core is taken offline, but another HT sibling
> --
> 1.7.9.1
> 

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH, v2] hwmon: coretemp: use list instead of fixed size array for temp data
  2012-05-04  5:41           ` Guenter Roeck
@ 2012-05-04  6:46             ` Kirill A. Shutemov
  2012-05-04 13:34               ` Guenter Roeck
  0 siblings, 1 reply; 15+ messages in thread
From: Kirill A. Shutemov @ 2012-05-04  6:46 UTC (permalink / raw)
  To: Guenter Roeck
  Cc: Fenghua Yu, Durgadoss R, Andi Kleen, Jean Delvare, lm-sensors,
	linux-kernel

[-- Attachment #1: Type: text/plain, Size: 2773 bytes --]

On Thu, May 03, 2012 at 10:41:22PM -0700, Guenter Roeck wrote:
> On Thu, May 03, 2012 at 07:18:56AM -0400, Kirill A. Shutemov wrote:
> > From: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>
> > 
> > Let's rework code to allow arbitrary number of cores on a CPU, not
> > limited by hardcoded array size.
> > 
> > Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
> > ---
> >  v2:
> >    - fix NULL pointer dereference. Thanks to R, Durgadoss;
> >    - use mutex instead of spinlock for list locking.
> > ---
> >  drivers/hwmon/coretemp.c |  178 +++++++++++++++++++++++++++++++++-------------
> >  1 files changed, 129 insertions(+), 49 deletions(-)
> > 
> > diff --git a/drivers/hwmon/coretemp.c b/drivers/hwmon/coretemp.c
> > index 54a70fe..1c66131 100644
> > --- a/drivers/hwmon/coretemp.c
> > +++ b/drivers/hwmon/coretemp.c
> > @@ -36,6 +36,8 @@
> >  #include <linux/cpu.h>
> >  #include <linux/pci.h>
> >  #include <linux/smp.h>
> > +#include <linux/list.h>
> > +#include <linux/kref.h>
> >  #include <linux/moduleparam.h>
> >  #include <asm/msr.h>
> >  #include <asm/processor.h>
> > @@ -52,11 +54,9 @@ module_param_named(tjmax, force_tjmax, int, 0444);
> >  MODULE_PARM_DESC(tjmax, "TjMax value in degrees Celsius");
> > 
> >  #define BASE_SYSFS_ATTR_NO     2       /* Sysfs Base attr no for coretemp */
> > -#define NUM_REAL_CORES         16      /* Number of Real cores per cpu */
> >  #define CORETEMP_NAME_LENGTH   17      /* String Length of attrs */
> >  #define MAX_CORE_ATTRS         4       /* Maximum no of basic attrs */
> >  #define TOTAL_ATTRS            (MAX_CORE_ATTRS + 1)
> > -#define MAX_CORE_DATA          (NUM_REAL_CORES + BASE_SYSFS_ATTR_NO)
> > 
> >  #define TO_PHYS_ID(cpu)                (cpu_data(cpu).phys_proc_id)
> >  #define TO_CORE_ID(cpu)                (cpu_data(cpu).cpu_core_id)
> > @@ -82,6 +82,9 @@ MODULE_PARM_DESC(tjmax, "TjMax value in degrees Celsius");
> >   * @valid: If this is 1, the current temperature is valid.
> >   */
> >  struct temp_data {
> > +       struct list_head list;
> > +       struct kref refcount;
> 
> Hi,
> 
> the kref is not needed. The attribute access functions don't
> need to be protected since the attributes for a core are deleted
> before the core data itself is deleted. So it is not neccessary 
> to hold a lock while accessing/using temp_data in the attribute
> access functions. All you need is to hold a mutex while you are
> manipulating or walking the list.

Without kref, what prevents following situation:

		CPU-A				CPU-B
	tdata = get_temp_data();
					coretemp_remove_core() {
					    device_remove_file();
					    kfree(tdata);
					}
	<tdata dereference>

?

-- 
 Kirill A. Shutemov

[-- Attachment #2: Digital signature --]
[-- Type: application/pgp-signature, Size: 836 bytes --]

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH, v2] hwmon: coretemp: use list instead of fixed size array for temp data
  2012-05-04  6:46             ` Kirill A. Shutemov
@ 2012-05-04 13:34               ` Guenter Roeck
  2012-05-04 13:42                 ` Kirill A. Shutemov
  0 siblings, 1 reply; 15+ messages in thread
From: Guenter Roeck @ 2012-05-04 13:34 UTC (permalink / raw)
  To: Kirill A. Shutemov
  Cc: Fenghua Yu, Durgadoss R, Andi Kleen, Jean Delvare, lm-sensors,
	linux-kernel

On Fri, May 04, 2012 at 02:46:06AM -0400, Kirill A. Shutemov wrote:
> On Thu, May 03, 2012 at 10:41:22PM -0700, Guenter Roeck wrote:
> > On Thu, May 03, 2012 at 07:18:56AM -0400, Kirill A. Shutemov wrote:
> > > From: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>
> > > 
> > > Let's rework code to allow arbitrary number of cores on a CPU, not
> > > limited by hardcoded array size.
> > > 
> > > Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
> > > ---
> > >  v2:
> > >    - fix NULL pointer dereference. Thanks to R, Durgadoss;
> > >    - use mutex instead of spinlock for list locking.
> > > ---
> > >  drivers/hwmon/coretemp.c |  178 +++++++++++++++++++++++++++++++++-------------
> > >  1 files changed, 129 insertions(+), 49 deletions(-)
> > > 
> > > diff --git a/drivers/hwmon/coretemp.c b/drivers/hwmon/coretemp.c
> > > index 54a70fe..1c66131 100644
> > > --- a/drivers/hwmon/coretemp.c
> > > +++ b/drivers/hwmon/coretemp.c
> > > @@ -36,6 +36,8 @@
> > >  #include <linux/cpu.h>
> > >  #include <linux/pci.h>
> > >  #include <linux/smp.h>
> > > +#include <linux/list.h>
> > > +#include <linux/kref.h>
> > >  #include <linux/moduleparam.h>
> > >  #include <asm/msr.h>
> > >  #include <asm/processor.h>
> > > @@ -52,11 +54,9 @@ module_param_named(tjmax, force_tjmax, int, 0444);
> > >  MODULE_PARM_DESC(tjmax, "TjMax value in degrees Celsius");
> > > 
> > >  #define BASE_SYSFS_ATTR_NO     2       /* Sysfs Base attr no for coretemp */
> > > -#define NUM_REAL_CORES         16      /* Number of Real cores per cpu */
> > >  #define CORETEMP_NAME_LENGTH   17      /* String Length of attrs */
> > >  #define MAX_CORE_ATTRS         4       /* Maximum no of basic attrs */
> > >  #define TOTAL_ATTRS            (MAX_CORE_ATTRS + 1)
> > > -#define MAX_CORE_DATA          (NUM_REAL_CORES + BASE_SYSFS_ATTR_NO)
> > > 
> > >  #define TO_PHYS_ID(cpu)                (cpu_data(cpu).phys_proc_id)
> > >  #define TO_CORE_ID(cpu)                (cpu_data(cpu).cpu_core_id)
> > > @@ -82,6 +82,9 @@ MODULE_PARM_DESC(tjmax, "TjMax value in degrees Celsius");
> > >   * @valid: If this is 1, the current temperature is valid.
> > >   */
> > >  struct temp_data {
> > > +       struct list_head list;
> > > +       struct kref refcount;
> > 
> > Hi,
> > 
> > the kref is not needed. The attribute access functions don't
> > need to be protected since the attributes for a core are deleted
> > before the core data itself is deleted. So it is not neccessary 
> > to hold a lock while accessing/using temp_data in the attribute
> > access functions. All you need is to hold a mutex while you are
> > manipulating or walking the list.
> 
> Without kref, what prevents following situation:
> 
> 		CPU-A				CPU-B
> 	tdata = get_temp_data();
> 					coretemp_remove_core() {
> 					    device_remove_file();
> 					    kfree(tdata);
> 					}
> 	<tdata dereference>
> 
The remove function requires a semaphore which is held by the access function,
so device_remove_file() will only proceed after CPU-A is done with the sysfs access.

Guenter

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH, v2] hwmon: coretemp: use list instead of fixed size array for temp data
  2012-05-04 13:34               ` Guenter Roeck
@ 2012-05-04 13:42                 ` Kirill A. Shutemov
  0 siblings, 0 replies; 15+ messages in thread
From: Kirill A. Shutemov @ 2012-05-04 13:42 UTC (permalink / raw)
  To: Guenter Roeck
  Cc: Fenghua Yu, Durgadoss R, Andi Kleen, Jean Delvare, lm-sensors,
	linux-kernel

[-- Attachment #1: Type: text/plain, Size: 3381 bytes --]

On Fri, May 04, 2012 at 06:34:19AM -0700, Guenter Roeck wrote:
> On Fri, May 04, 2012 at 02:46:06AM -0400, Kirill A. Shutemov wrote:
> > On Thu, May 03, 2012 at 10:41:22PM -0700, Guenter Roeck wrote:
> > > On Thu, May 03, 2012 at 07:18:56AM -0400, Kirill A. Shutemov wrote:
> > > > From: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>
> > > > 
> > > > Let's rework code to allow arbitrary number of cores on a CPU, not
> > > > limited by hardcoded array size.
> > > > 
> > > > Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
> > > > ---
> > > >  v2:
> > > >    - fix NULL pointer dereference. Thanks to R, Durgadoss;
> > > >    - use mutex instead of spinlock for list locking.
> > > > ---
> > > >  drivers/hwmon/coretemp.c |  178 +++++++++++++++++++++++++++++++++-------------
> > > >  1 files changed, 129 insertions(+), 49 deletions(-)
> > > > 
> > > > diff --git a/drivers/hwmon/coretemp.c b/drivers/hwmon/coretemp.c
> > > > index 54a70fe..1c66131 100644
> > > > --- a/drivers/hwmon/coretemp.c
> > > > +++ b/drivers/hwmon/coretemp.c
> > > > @@ -36,6 +36,8 @@
> > > >  #include <linux/cpu.h>
> > > >  #include <linux/pci.h>
> > > >  #include <linux/smp.h>
> > > > +#include <linux/list.h>
> > > > +#include <linux/kref.h>
> > > >  #include <linux/moduleparam.h>
> > > >  #include <asm/msr.h>
> > > >  #include <asm/processor.h>
> > > > @@ -52,11 +54,9 @@ module_param_named(tjmax, force_tjmax, int, 0444);
> > > >  MODULE_PARM_DESC(tjmax, "TjMax value in degrees Celsius");
> > > > 
> > > >  #define BASE_SYSFS_ATTR_NO     2       /* Sysfs Base attr no for coretemp */
> > > > -#define NUM_REAL_CORES         16      /* Number of Real cores per cpu */
> > > >  #define CORETEMP_NAME_LENGTH   17      /* String Length of attrs */
> > > >  #define MAX_CORE_ATTRS         4       /* Maximum no of basic attrs */
> > > >  #define TOTAL_ATTRS            (MAX_CORE_ATTRS + 1)
> > > > -#define MAX_CORE_DATA          (NUM_REAL_CORES + BASE_SYSFS_ATTR_NO)
> > > > 
> > > >  #define TO_PHYS_ID(cpu)                (cpu_data(cpu).phys_proc_id)
> > > >  #define TO_CORE_ID(cpu)                (cpu_data(cpu).cpu_core_id)
> > > > @@ -82,6 +82,9 @@ MODULE_PARM_DESC(tjmax, "TjMax value in degrees Celsius");
> > > >   * @valid: If this is 1, the current temperature is valid.
> > > >   */
> > > >  struct temp_data {
> > > > +       struct list_head list;
> > > > +       struct kref refcount;
> > > 
> > > Hi,
> > > 
> > > the kref is not needed. The attribute access functions don't
> > > need to be protected since the attributes for a core are deleted
> > > before the core data itself is deleted. So it is not neccessary 
> > > to hold a lock while accessing/using temp_data in the attribute
> > > access functions. All you need is to hold a mutex while you are
> > > manipulating or walking the list.
> > 
> > Without kref, what prevents following situation:
> > 
> > 		CPU-A				CPU-B
> > 	tdata = get_temp_data();
> > 					coretemp_remove_core() {
> > 					    device_remove_file();
> > 					    kfree(tdata);
> > 					}
> > 	<tdata dereference>
> > 
> The remove function requires a semaphore which is held by the access function,
> so device_remove_file() will only proceed after CPU-A is done with the sysfs access.

Understood. I'll update the patch.

-- 
 Kirill A. Shutemov

[-- Attachment #2: Digital signature --]
[-- Type: application/pgp-signature, Size: 836 bytes --]

^ permalink raw reply	[flat|nested] 15+ messages in thread

end of thread, other threads:[~2012-05-04 13:41 UTC | newest]

Thread overview: 15+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2012-04-30 13:18 [PATCH] hwmon: coretemp: fix oops on cpu unplug Kirill A. Shutemov
2012-04-30 15:28 ` Guenter Roeck
2012-04-30 16:19   ` Kirill A. Shutemov
2012-04-30 16:59     ` Guenter Roeck
2012-05-01 15:20 ` Guenter Roeck
2012-05-01 21:00   ` Kirill A. Shutemov
2012-05-01 21:25     ` Guenter Roeck
2012-05-02 15:10       ` [PATCH] hwmon: coretemp: use list instead of fixed size array for temp data Kirill A. Shutemov
2012-05-03  5:29         ` [lm-sensors] " R, Durgadoss
2012-05-03 10:04           ` Kirill A. Shutemov
2012-05-03 11:18         ` [PATCH, v2] " Kirill A. Shutemov
2012-05-04  5:41           ` Guenter Roeck
2012-05-04  6:46             ` Kirill A. Shutemov
2012-05-04 13:34               ` Guenter Roeck
2012-05-04 13:42                 ` Kirill A. Shutemov

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).