linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH 0/10] CPU hotplug error handling fixes
@ 2007-07-16 13:48 Akinobu Mita
  2007-07-16 13:50 ` [PATCH 1/10] sysfs: fix kmem_cache_free(NULL) Akinobu Mita
                   ` (9 more replies)
  0 siblings, 10 replies; 24+ messages in thread
From: Akinobu Mita @ 2007-07-16 13:48 UTC (permalink / raw)
  To: linux-kernel
  Cc: Rusty Russell, Greg Kroah-Hartman, Dmitriy Zavin, H. Peter Anvin,
	Andi Kleen, Ashok Raj

This series of patches fixes the error handling for cpu hotplug.
The problem is revealed by CPU hotplug/unplug test with fault-injection.

The patch 1-3 are sysfs or driver core related error handling fixes.
These are not directly related to cpu hotplug. But these are needed to
pass the stress test.

The patch 4 changes the behavior when one of the callbacks in notifier
chain returns NOTIFY_BAD with CPU_UP_PREPARE event. This change makes
cpu hotplug error handling simple.

The patch 5 simplifies the cpu hotplug event handling in topology.c
by the patch 4.

The patch 6-10 are error handling fixes in cpu hotplug event callbacks.
These fixes also depend on the change by the patch 4.

Here is the test script I have confirmed with these patches.
I guess we still have the similar bugs that I could not test due to no
hardware. So it may be worth someone trying this script.

----------[ cut here ]----------

#!/bin/bash

FAILTYPE=failslab
CPU=1
CPU_ONLINE=/sys/devices/system/cpu/cpu${CPU}/online

faulty_system()
{
        bash -c "echo 1 > /proc/self/make-it-fail && exec $*"
}

[ "$UID" == 0 ] || exit 1
[ -n "$FAILTYPE" -a -f /debug/$FAILTYPE/probability ] || exit 1
[ -f $CPU_ONLINE ] || exit 1

echo N > /debug/$FAILTYPE/ignore-gfp-wait
echo Y > /debug/$FAILTYPE/task-filter
echo 1 > /debug/$FAILTYPE/probability
echo -1 > /debug/$FAILTYPE/times

while true
do
	faulty_system "echo 0 > $CPU_ONLINE"
	faulty_system "echo 1 > $CPU_ONLINE"
done


^ permalink raw reply	[flat|nested] 24+ messages in thread

* [PATCH 1/10] sysfs: fix kmem_cache_free(NULL)
  2007-07-16 13:48 [PATCH 0/10] CPU hotplug error handling fixes Akinobu Mita
@ 2007-07-16 13:50 ` Akinobu Mita
  2007-07-16 15:43   ` Greg KH
  2007-07-16 13:51 ` [PATCH 2/10] sysdev: add error check in sysdev_register() Akinobu Mita
                   ` (8 subsequent siblings)
  9 siblings, 1 reply; 24+ messages in thread
From: Akinobu Mita @ 2007-07-16 13:50 UTC (permalink / raw)
  To: linux-kernel, Greg Kroah-Hartman

This patch fixes out of memory error handling in sysfs_new_dirent().
kmem_cache_free() with NULL is not allowed.

Cc: Greg Kroah-Hartman <gregkh@suse.de>
Signed-off-by: Akinobu Mita <akinobu.mita@gmail.com>

---
 fs/sysfs/dir.c |   13 +++++++------
 1 file changed, 7 insertions(+), 6 deletions(-)

Index: 2.6-mm/fs/sysfs/dir.c
===================================================================
--- 2.6-mm.orig/fs/sysfs/dir.c
+++ 2.6-mm/fs/sysfs/dir.c
@@ -361,20 +361,20 @@ static struct dentry_operations sysfs_de
 struct sysfs_dirent *sysfs_new_dirent(const char *name, umode_t mode, int type)
 {
 	char *dup_name = NULL;
-	struct sysfs_dirent *sd = NULL;
+	struct sysfs_dirent *sd;
 
 	if (type & SYSFS_COPY_NAME) {
 		name = dup_name = kstrdup(name, GFP_KERNEL);
 		if (!name)
-			goto err_out;
+			return NULL;
 	}
 
 	sd = kmem_cache_zalloc(sysfs_dir_cachep, GFP_KERNEL);
 	if (!sd)
-		goto err_out;
+		goto err_out1;
 
 	if (sysfs_alloc_ino(&sd->s_ino))
-		goto err_out;
+		goto err_out2;
 
 	atomic_set(&sd->s_count, 1);
 	atomic_set(&sd->s_active, 0);
@@ -386,9 +386,10 @@ struct sysfs_dirent *sysfs_new_dirent(co
 
 	return sd;
 
- err_out:
-	kfree(dup_name);
+ err_out2:
 	kmem_cache_free(sysfs_dir_cachep, sd);
+ err_out1:
+	kfree(dup_name);
 	return NULL;
 }
 

^ permalink raw reply	[flat|nested] 24+ messages in thread

* [PATCH 2/10] sysdev: add error check in sysdev_register()
  2007-07-16 13:48 [PATCH 0/10] CPU hotplug error handling fixes Akinobu Mita
  2007-07-16 13:50 ` [PATCH 1/10] sysfs: fix kmem_cache_free(NULL) Akinobu Mita
@ 2007-07-16 13:51 ` Akinobu Mita
  2007-07-16 15:12   ` Cornelia Huck
  2007-07-16 13:52 ` [PATCH 3/10] sysfs: fix error handling in create_files() Akinobu Mita
                   ` (7 subsequent siblings)
  9 siblings, 1 reply; 24+ messages in thread
From: Akinobu Mita @ 2007-07-16 13:51 UTC (permalink / raw)
  To: linux-kernel, Greg Kroah-Hartman

This patch enables to catch the errors returned by add() procedure of
sysdev driver in sysdev_register.

Cc: Greg Kroah-Hartman <gregkh@suse.de>
Signed-off-by: Akinobu Mita <akinobu.mita@gmail.com>

---
 drivers/base/sys.c |   48 ++++++++++++++++++++++++++++++++++++++----------
 1 file changed, 38 insertions(+), 10 deletions(-)

Index: 2.6-mm/drivers/base/sys.c
===================================================================
--- 2.6-mm.orig/drivers/base/sys.c
+++ 2.6-mm/drivers/base/sys.c
@@ -220,10 +220,13 @@ EXPORT_SYMBOL_GPL(sysdev_driver_unregist
  *	@sysdev:	device in question
  *
  */
-int sysdev_register(struct sys_device * sysdev)
+int sysdev_register(struct sys_device *sysdev)
 {
 	int error;
-	struct sysdev_class * cls = sysdev->cls;
+	struct sysdev_class *cls = sysdev->cls;
+	struct sysdev_driver *drv;
+	int added_sysdev = 0;
+	int added_cls = 0;
 
 	if (!cls)
 		return -EINVAL;
@@ -244,8 +247,6 @@ int sysdev_register(struct sys_device * 
 	error = kobject_register(&sysdev->kobj);
 
 	if (!error) {
-		struct sysdev_driver * drv;
-
 		mutex_lock(&sysdev_drivers_lock);
 		/* Generic notification is implicit, because it's that
 		 * code that should have called us.
@@ -253,23 +254,50 @@ int sysdev_register(struct sys_device * 
 
 		/* Notify global drivers */
 		list_for_each_entry(drv, &sysdev_drivers, entry) {
-			if (drv->add)
-				drv->add(sysdev);
+			if (drv->add) {
+				error = drv->add(sysdev);
+				if (error)
+					goto error_sysdev;
+			}
+			added_sysdev++;
 		}
 
 		/* Notify class auxillary drivers */
 		list_for_each_entry(drv, &cls->drivers, entry) {
-			if (drv->add)
-				drv->add(sysdev);
+			if (drv->add) {
+				error = drv->add(sysdev);
+				if (error)
+					goto error_cls;
+			}
+			added_cls++;
 		}
 		mutex_unlock(&sysdev_drivers_lock);
 	}
 	return error;
+
+error_cls:
+	list_for_each_entry(drv, &cls->drivers, entry) {
+		if (added_sysdev--)
+			break;
+		if (drv->add && drv->remove)
+			drv->remove(sysdev);
+	}
+error_sysdev:
+	list_for_each_entry(drv, &sysdev_drivers, entry) {
+		if (added_cls--)
+			break;
+		if (drv->add && drv->remove)
+			drv->remove(sysdev);
+	}
+	mutex_unlock(&sysdev_drivers_lock);
+	kobject_unregister(&sysdev->kobj);
+
+	return error;
 }
 
-void sysdev_unregister(struct sys_device * sysdev)
+void sysdev_unregister(struct sys_device *sysdev)
 {
-	struct sysdev_driver * drv;
+	struct sysdev_driver *drv;
 
 	mutex_lock(&sysdev_drivers_lock);
 	list_for_each_entry(drv, &sysdev_drivers, entry) {

^ permalink raw reply	[flat|nested] 24+ messages in thread

* [PATCH 3/10] sysfs: fix error handling in create_files()
  2007-07-16 13:48 [PATCH 0/10] CPU hotplug error handling fixes Akinobu Mita
  2007-07-16 13:50 ` [PATCH 1/10] sysfs: fix kmem_cache_free(NULL) Akinobu Mita
  2007-07-16 13:51 ` [PATCH 2/10] sysdev: add error check in sysdev_register() Akinobu Mita
@ 2007-07-16 13:52 ` Akinobu Mita
  2007-07-16 15:29   ` Cornelia Huck
  2007-07-16 13:53 ` [PATCH 4/10] cpu: deliver CPU_UP_CANCELED only to NOTIFY_OKed callbacks with CPU_UP_PREPARE Akinobu Mita
                   ` (6 subsequent siblings)
  9 siblings, 1 reply; 24+ messages in thread
From: Akinobu Mita @ 2007-07-16 13:52 UTC (permalink / raw)
  To: linux-kernel, Greg Kroah-Hartman

Current error handling in create_files() attempts to remove
all attributes passed by argument by remove_files(). But it should
only remove the attributes that have been successfully added.

Cc: Greg Kroah-Hartman <gregkh@suse.de>
Signed-off-by: Akinobu Mita <akinobu.mita@gmail.com>

---
 fs/sysfs/group.c |   18 ++++++++++++------
 1 file changed, 12 insertions(+), 6 deletions(-)

Index: 2.6-mm/fs/sysfs/group.c
===================================================================
--- 2.6-mm.orig/fs/sysfs/group.c
+++ 2.6-mm/fs/sysfs/group.c
@@ -30,13 +30,19 @@ static void remove_files(struct sysfs_di
 static int create_files(struct sysfs_dirent *dir_sd,
 			const struct attribute_group *grp)
 {
-	struct attribute *const* attr;
-	int error = 0;
+	int i;
+	int error;
+
+	for (i = 0; grp->attrs[i]; i++) {
+		error = sysfs_add_file(dir_sd, grp->attrs[i], SYSFS_KOBJ_ATTR);
+		if (error)
+			goto error;
+	}
+	return 0;
+error:
+	while (i--)
+		sysfs_hash_and_remove(dir_sd, grp->attrs[i]->name);
 
-	for (attr = grp->attrs; *attr && !error; attr++)
-		error = sysfs_add_file(dir_sd, *attr, SYSFS_KOBJ_ATTR);
-	if (error)
-		remove_files(dir_sd, grp);
 	return error;
 }
 

^ permalink raw reply	[flat|nested] 24+ messages in thread

* [PATCH 4/10] cpu: deliver CPU_UP_CANCELED only to NOTIFY_OKed callbacks with CPU_UP_PREPARE
  2007-07-16 13:48 [PATCH 0/10] CPU hotplug error handling fixes Akinobu Mita
                   ` (2 preceding siblings ...)
  2007-07-16 13:52 ` [PATCH 3/10] sysfs: fix error handling in create_files() Akinobu Mita
@ 2007-07-16 13:53 ` Akinobu Mita
  2007-07-17  8:23   ` Gautham R Shenoy
  2007-07-16 13:54 ` [PATCH 5/10] topology: remove topology_dev_map Akinobu Mita
                   ` (5 subsequent siblings)
  9 siblings, 1 reply; 24+ messages in thread
From: Akinobu Mita @ 2007-07-16 13:53 UTC (permalink / raw)
  To: linux-kernel, Rusty Russell, Greg Kroah-Hartman, Dmitriy Zavin,
	H. Peter Anvin, Andi Kleen, Ashok Raj

The functions in a CPU notifier chain is called with CPU_UP_PREPARE event
before making the CPU online. If one of the callback returns NOTIFY_BAD,
it stops to deliver CPU_UP_PREPARE event, and CPU online operation is canceled.
Then CPU_UP_CANCELED event is delivered to the functions in a CPU notifier
chain again.

This CPU_UP_CANCELED event is delivered to the functions which have been
called with CPU_UP_PREPARE, not delivered to the functions which haven't
been called with CPU_UP_PREPARE.

The problem that makes existing cpu hotplug error handlings complex is
that the CPU_UP_CANCELED event is delivered to the function that has
returned NOTIFY_BAD, too.

Usually we don't expect to call destructor function against the
object that has failed to initialize. It is like:

	err = register_something();
	if (err) {
		unregister_something(); // bug
		return err;
	}

So it is natural to deliver CPU_UP_CANCELED event only to the functions
that have returned NOTIFY_OK with CPU_UP_PREPARE event and not to call
the function that have returned NOTIFY_BAD. This is what this patch is doing.

Otherwise, every cpu hotplug notifiler has to track whether
notifiler event is failed or not for each cpu.
(drivers/base/topology.c is doing this with topology_dev_map)

Similary this patch makes same thing with CPU_DOWN_PREPARE and
CPU_DOWN_FAILED evnets.

Cc: Rusty Russell <rusty@rustcorp.com.au>
Signed-off-by: Akinobu Mita <akinobu.mita@gmail.com>

---
 kernel/cpu.c |    2 ++
 1 file changed, 2 insertions(+)

Index: 2.6-mm/kernel/cpu.c
===================================================================
--- 2.6-mm.orig/kernel/cpu.c
+++ 2.6-mm/kernel/cpu.c
@@ -138,6 +138,7 @@ static int _cpu_down(unsigned int cpu, i
 	err = __raw_notifier_call_chain(&cpu_chain, CPU_DOWN_PREPARE | mod,
 					hcpu, -1, &nr_calls);
 	if (err == NOTIFY_BAD) {
+		nr_calls--;
 		__raw_notifier_call_chain(&cpu_chain, CPU_DOWN_FAILED | mod,
 					  hcpu, nr_calls, NULL);
 		printk("%s: attempt to take down CPU %u failed\n",
@@ -221,6 +222,7 @@ static int __cpuinit _cpu_up(unsigned in
 	ret = __raw_notifier_call_chain(&cpu_chain, CPU_UP_PREPARE | mod, hcpu,
 							-1, &nr_calls);
 	if (ret == NOTIFY_BAD) {
+		nr_calls--;
 		printk("%s: attempt to bring up CPU %u failed\n",
 				__FUNCTION__, cpu);
 		ret = -EINVAL;

^ permalink raw reply	[flat|nested] 24+ messages in thread

* [PATCH 5/10] topology: remove topology_dev_map
  2007-07-16 13:48 [PATCH 0/10] CPU hotplug error handling fixes Akinobu Mita
                   ` (3 preceding siblings ...)
  2007-07-16 13:53 ` [PATCH 4/10] cpu: deliver CPU_UP_CANCELED only to NOTIFY_OKed callbacks with CPU_UP_PREPARE Akinobu Mita
@ 2007-07-16 13:54 ` Akinobu Mita
  2007-07-16 13:57 ` [PATCH 6/10] thermal_throttle: fix cpu hotplug error handling Akinobu Mita
                   ` (4 subsequent siblings)
  9 siblings, 0 replies; 24+ messages in thread
From: Akinobu Mita @ 2007-07-16 13:54 UTC (permalink / raw)
  To: linux-kernel, Rusty Russell, Greg Kroah-Hartman

By previous cpu hotplug notifier change, we don't need to track
topology_dev existence for each cpu by topology_dev_map.

Cc: Greg Kroah-Hartman <gregkh@suse.de>
Signed-off-by: Akinobu Mita <akinobu.mita@gmail.com>

---
 drivers/base/topology.c |   11 +----------
 1 file changed, 1 insertion(+), 10 deletions(-)

Index: 2.6-mm/drivers/base/topology.c
===================================================================
--- 2.6-mm.orig/drivers/base/topology.c
+++ 2.6-mm/drivers/base/topology.c
@@ -94,27 +94,18 @@ static struct attribute_group topology_a
 	.name = "topology"
 };
 
-static cpumask_t topology_dev_map = CPU_MASK_NONE;
-
 /* Add/Remove cpu_topology interface for CPU device */
 static int __cpuinit topology_add_dev(unsigned int cpu)
 {
-	int rc;
 	struct sys_device *sys_dev = get_cpu_sysdev(cpu);
 
-	rc = sysfs_create_group(&sys_dev->kobj, &topology_attr_group);
-	if (!rc)
-		cpu_set(cpu, topology_dev_map);
-	return rc;
+	return sysfs_create_group(&sys_dev->kobj, &topology_attr_group);
 }
 
 static void __cpuinit topology_remove_dev(unsigned int cpu)
 {
 	struct sys_device *sys_dev = get_cpu_sysdev(cpu);
 
-	if (!cpu_isset(cpu, topology_dev_map))
-		return;
-	cpu_clear(cpu, topology_dev_map);
 	sysfs_remove_group(&sys_dev->kobj, &topology_attr_group);
 }
 

^ permalink raw reply	[flat|nested] 24+ messages in thread

* [PATCH 6/10] thermal_throttle: fix cpu hotplug error handling
  2007-07-16 13:48 [PATCH 0/10] CPU hotplug error handling fixes Akinobu Mita
                   ` (4 preceding siblings ...)
  2007-07-16 13:54 ` [PATCH 5/10] topology: remove topology_dev_map Akinobu Mita
@ 2007-07-16 13:57 ` Akinobu Mita
  2007-07-16 13:58 ` [PATCH 7/10] msr: " Akinobu Mita
                   ` (3 subsequent siblings)
  9 siblings, 0 replies; 24+ messages in thread
From: Akinobu Mita @ 2007-07-16 13:57 UTC (permalink / raw)
  To: linux-kernel, Dmitriy Zavin

Make thermal_throttle_add_dev() CPU_UP_PREPARE event handler
instead of CPU_ONLINE event handler.

Cc: Dmitriy Zavin <dmitriyz@google.com>
Signed-off-by: Akinobu Mita <akinobu.mita@gmail.com>

---
 arch/i386/kernel/cpu/mcheck/therm_throt.c |   11 ++++++-----
 1 file changed, 6 insertions(+), 5 deletions(-)

Index: 2.6-mm/arch/i386/kernel/cpu/mcheck/therm_throt.c
===================================================================
--- 2.6-mm.orig/arch/i386/kernel/cpu/mcheck/therm_throt.c
+++ 2.6-mm/arch/i386/kernel/cpu/mcheck/therm_throt.c
@@ -131,23 +131,24 @@ static __cpuinit int thermal_throttle_cp
 {
 	unsigned int cpu = (unsigned long)hcpu;
 	struct sys_device *sys_dev;
-	int err;
+	int err = 0;
 
 	sys_dev = get_cpu_sysdev(cpu);
 	mutex_lock(&therm_cpu_lock);
 	switch (action) {
-	case CPU_ONLINE:
-	case CPU_ONLINE_FROZEN:
+	case CPU_UP_PREPARE:
+	case CPU_UP_PREPARE_FROZEN:
 		err = thermal_throttle_add_dev(sys_dev);
-		WARN_ON(err);
 		break;
+	case CPU_UP_CANCELED:
+	case CPU_UP_CANCELED_FROZEN:
 	case CPU_DEAD:
 	case CPU_DEAD_FROZEN:
 		thermal_throttle_remove_dev(sys_dev);
 		break;
 	}
 	mutex_unlock(&therm_cpu_lock);
-	return NOTIFY_OK;
+	return err ? NOTIFY_BAD : NOTIFY_OK;
 }
 
 static struct notifier_block thermal_throttle_cpu_notifier =

^ permalink raw reply	[flat|nested] 24+ messages in thread

* [PATCH 7/10] msr: fix cpu hotplug error handling
  2007-07-16 13:48 [PATCH 0/10] CPU hotplug error handling fixes Akinobu Mita
                   ` (5 preceding siblings ...)
  2007-07-16 13:57 ` [PATCH 6/10] thermal_throttle: fix cpu hotplug error handling Akinobu Mita
@ 2007-07-16 13:58 ` Akinobu Mita
  2007-07-16 13:58 ` [PATCH 8/10] cpuid: " Akinobu Mita
                   ` (2 subsequent siblings)
  9 siblings, 0 replies; 24+ messages in thread
From: Akinobu Mita @ 2007-07-16 13:58 UTC (permalink / raw)
  To: linux-kernel, H. Peter Anvin

Make msr_device_create() CPU_UP_PREPARE event handler
instead of CPU_ONLINE handler.

Cc: "H. Peter Anvin" <hpa@zytor.com>
Signed-off-by: Akinobu Mita <akinobu.mita@gmail.com>

---
 arch/i386/kernel/msr.c |   32 +++++++++++++++++++-------------
 1 file changed, 19 insertions(+), 13 deletions(-)

Index: 2.6-mm/arch/i386/kernel/msr.c
===================================================================
--- 2.6-mm.orig/arch/i386/kernel/msr.c
+++ 2.6-mm/arch/i386/kernel/msr.c
@@ -135,33 +135,39 @@ static const struct file_operations msr_
 	.open = msr_open,
 };
 
-static int msr_device_create(int i)
+static int msr_device_create(int cpu)
 {
-	int err = 0;
 	struct device *dev;
 
-	dev = device_create(msr_class, NULL, MKDEV(MSR_MAJOR, i), "msr%d",i);
-	if (IS_ERR(dev))
-		err = PTR_ERR(dev);
-	return err;
+	dev = device_create(msr_class, NULL, MKDEV(MSR_MAJOR, cpu),
+			    "msr%d", cpu);
+	return IS_ERR(dev) ? PTR_ERR(dev) : 0;
+}
+
+static void msr_device_destroy(int cpu)
+{
+	device_destroy(msr_class, MKDEV(MSR_MAJOR, cpu));
 }
 
 static int msr_class_cpu_callback(struct notifier_block *nfb,
 				unsigned long action, void *hcpu)
 {
 	unsigned int cpu = (unsigned long)hcpu;
+	int err = 0;
 
 	switch (action) {
-	case CPU_ONLINE:
-	case CPU_ONLINE_FROZEN:
-		msr_device_create(cpu);
+	case CPU_UP_PREPARE:
+	case CPU_UP_PREPARE_FROZEN:
+		err = msr_device_create(cpu);
 		break;
+	case CPU_UP_CANCELED:
+	case CPU_UP_CANCELED_FROZEN:
 	case CPU_DEAD:
 	case CPU_DEAD_FROZEN:
-		device_destroy(msr_class, MKDEV(MSR_MAJOR, cpu));
+		msr_device_destroy(cpu);
 		break;
 	}
-	return NOTIFY_OK;
+	return err ? NOTIFY_BAD : NOTIFY_OK;
 }
 
 static struct notifier_block __cpuinitdata msr_class_cpu_notifier =
@@ -198,7 +204,7 @@ static int __init msr_init(void)
 out_class:
 	i = 0;
 	for_each_online_cpu(i)
-		device_destroy(msr_class, MKDEV(MSR_MAJOR, i));
+		msr_device_destroy(i);
 	class_destroy(msr_class);
 out_chrdev:
 	unregister_chrdev(MSR_MAJOR, "cpu/msr");
@@ -210,7 +216,7 @@ static void __exit msr_exit(void)
 {
 	int cpu = 0;
 	for_each_online_cpu(cpu)
-		device_destroy(msr_class, MKDEV(MSR_MAJOR, cpu));
+		msr_device_destroy(cpu);
 	class_destroy(msr_class);
 	unregister_chrdev(MSR_MAJOR, "cpu/msr");
 	unregister_hotcpu_notifier(&msr_class_cpu_notifier);

^ permalink raw reply	[flat|nested] 24+ messages in thread

* [PATCH 8/10] cpuid: fix cpu hotplug error handling
  2007-07-16 13:48 [PATCH 0/10] CPU hotplug error handling fixes Akinobu Mita
                   ` (6 preceding siblings ...)
  2007-07-16 13:58 ` [PATCH 7/10] msr: " Akinobu Mita
@ 2007-07-16 13:58 ` Akinobu Mita
  2007-07-16 14:00 ` [PATCH 9/10] mce: " Akinobu Mita
  2007-07-16 14:01 ` [PATCH 10/10] intel_cacheinfo: " Akinobu Mita
  9 siblings, 0 replies; 24+ messages in thread
From: Akinobu Mita @ 2007-07-16 13:58 UTC (permalink / raw)
  To: linux-kernel, H. Peter Anvin

Make cpuid_device_create() CPU_UP_PREPARE event handler
instead of CPU_ONLINE event handler.

Cc: "H. Peter Anvin" <hpa@zytor.com>
Signed-off-by: Akinobu Mita <akinobu.mita@gmail.com>

---
 arch/i386/kernel/cpuid.c |   32 +++++++++++++++++++-------------
 1 file changed, 19 insertions(+), 13 deletions(-)

Index: 2.6-mm/arch/i386/kernel/cpuid.c
===================================================================
--- 2.6-mm.orig/arch/i386/kernel/cpuid.c
+++ 2.6-mm/arch/i386/kernel/cpuid.c
@@ -152,32 +152,38 @@ static const struct file_operations cpui
 	.open = cpuid_open,
 };
 
-static int cpuid_device_create(int i)
+static int cpuid_device_create(int cpu)
 {
-	int err = 0;
 	struct device *dev;
 
-	dev = device_create(cpuid_class, NULL, MKDEV(CPUID_MAJOR, i), "cpu%d",i);
-	if (IS_ERR(dev))
-		err = PTR_ERR(dev);
-	return err;
+	dev = device_create(cpuid_class, NULL, MKDEV(CPUID_MAJOR, cpu),
+			    "cpu%d", cpu);
+	return IS_ERR(dev) ? PTR_ERR(dev) : 0;
+}
+
+static void cpuid_device_destroy(int cpu)
+{
+	device_destroy(cpuid_class, MKDEV(CPUID_MAJOR, cpu));
 }
 
 static int cpuid_class_cpu_callback(struct notifier_block *nfb, unsigned long action, void *hcpu)
 {
 	unsigned int cpu = (unsigned long)hcpu;
+	int err = 0;
 
 	switch (action) {
-	case CPU_ONLINE:
-	case CPU_ONLINE_FROZEN:
-		cpuid_device_create(cpu);
+	case CPU_UP_PREPARE:
+	case CPU_UP_PREPARE_FROZEN:
+		err = cpuid_device_create(cpu);
 		break;
+	case CPU_UP_CANCELED:
+	case CPU_UP_CANCELED_FROZEN:
 	case CPU_DEAD:
 	case CPU_DEAD_FROZEN:
-		device_destroy(cpuid_class, MKDEV(CPUID_MAJOR, cpu));
+		cpuid_device_destroy(cpu);
 		break;
 	}
-	return NOTIFY_OK;
+	return err ? NOTIFY_BAD : NOTIFY_OK;
 }
 
 static struct notifier_block __cpuinitdata cpuid_class_cpu_notifier =
@@ -214,7 +220,7 @@ static int __init cpuid_init(void)
 out_class:
 	i = 0;
 	for_each_online_cpu(i) {
-		device_destroy(cpuid_class, MKDEV(CPUID_MAJOR, i));
+		cpuid_device_destroy(i);
 	}
 	class_destroy(cpuid_class);
 out_chrdev:
@@ -228,7 +234,7 @@ static void __exit cpuid_exit(void)
 	int cpu = 0;
 
 	for_each_online_cpu(cpu)
-		device_destroy(cpuid_class, MKDEV(CPUID_MAJOR, cpu));
+		cpuid_device_destroy(cpu);
 	class_destroy(cpuid_class);
 	unregister_chrdev(CPUID_MAJOR, "cpu/cpuid");
 	unregister_hotcpu_notifier(&cpuid_class_cpu_notifier);

^ permalink raw reply	[flat|nested] 24+ messages in thread

* [PATCH 9/10] mce: fix cpu hotplug error handling
  2007-07-16 13:48 [PATCH 0/10] CPU hotplug error handling fixes Akinobu Mita
                   ` (7 preceding siblings ...)
  2007-07-16 13:58 ` [PATCH 8/10] cpuid: " Akinobu Mita
@ 2007-07-16 14:00 ` Akinobu Mita
  2007-07-16 14:01 ` [PATCH 10/10] intel_cacheinfo: " Akinobu Mita
  9 siblings, 0 replies; 24+ messages in thread
From: Akinobu Mita @ 2007-07-16 14:00 UTC (permalink / raw)
  To: linux-kernel, Andi Kleen

- Fix error handling in mce_create_device()

  Error handling should not do sysdev_remove_file() with not yet added
  attributes.

- Change mce_create_device() from CPU_ONLINE event handler
  to CPU_UP_PREPARE event handler.

Cc: Andi Kleen <andi@firstfloor.org>
Signed-off-by: Akinobu Mita <akinobu.mita@gmail.com>

---
 arch/x86_64/kernel/mce.c |   35 ++++++++++++++++++++++++-----------
 1 file changed, 24 insertions(+), 11 deletions(-)

Index: 2.6-mm/arch/x86_64/kernel/mce.c
===================================================================
--- 2.6-mm.orig/arch/x86_64/kernel/mce.c
+++ 2.6-mm/arch/x86_64/kernel/mce.c
@@ -803,19 +803,30 @@ static __cpuinit int mce_create_device(u
 {
 	int err;
 	int i;
-	if (!mce_available(&cpu_data[cpu]))
-		return -EIO;
 
+	memset(&per_cpu(device_mce, cpu).kobj, 0, sizeof(struct kobject));
 	per_cpu(device_mce,cpu).id = cpu;
 	per_cpu(device_mce,cpu).cls = &mce_sysclass;
 
 	err = sysdev_register(&per_cpu(device_mce,cpu));
+	if (err)
+		return err;
+
+	for (i = 0; mce_attributes[i]; i++) {
+		err = sysdev_create_file(&per_cpu(device_mce,cpu),
+					 mce_attributes[i]);
+		if (err)
+			goto error;
+	}
 
-	if (!err) {
-		for (i = 0; mce_attributes[i]; i++)
-			sysdev_create_file(&per_cpu(device_mce,cpu),
-				mce_attributes[i]);
+	return 0;
+error:
+	while (i--) {
+		sysdev_remove_file(&per_cpu(device_mce,cpu),
+				   mce_attributes[i]);
 	}
+	sysdev_unregister(&per_cpu(device_mce,cpu));
+
 	return err;
 }
 
@@ -827,7 +838,6 @@ static void mce_remove_device(unsigned i
 		sysdev_remove_file(&per_cpu(device_mce,cpu),
 			mce_attributes[i]);
 	sysdev_unregister(&per_cpu(device_mce,cpu));
-	memset(&per_cpu(device_mce, cpu).kobj, 0, sizeof(struct kobject));
 }
 
 /* Get notified when a cpu comes on/off. Be hotplug friendly. */
@@ -835,18 +845,21 @@ static int
 mce_cpu_callback(struct notifier_block *nfb, unsigned long action, void *hcpu)
 {
 	unsigned int cpu = (unsigned long)hcpu;
+	int err = 0;
 
 	switch (action) {
-	case CPU_ONLINE:
-	case CPU_ONLINE_FROZEN:
-		mce_create_device(cpu);
+	case CPU_UP_PREPARE:
+	case CPU_UP_PREPARE_FROZEN:
+		err = mce_create_device(cpu);
 		break;
+	case CPU_UP_CANCELED:
+	case CPU_UP_CANCELED_FROZEN:
 	case CPU_DEAD:
 	case CPU_DEAD_FROZEN:
 		mce_remove_device(cpu);
 		break;
 	}
-	return NOTIFY_OK;
+	return err ? NOTIFY_BAD : NOTIFY_OK;
 }
 
 static struct notifier_block mce_cpu_notifier = {

^ permalink raw reply	[flat|nested] 24+ messages in thread

* [PATCH 10/10] intel_cacheinfo: fix cpu hotplug error handling
  2007-07-16 13:48 [PATCH 0/10] CPU hotplug error handling fixes Akinobu Mita
                   ` (8 preceding siblings ...)
  2007-07-16 14:00 ` [PATCH 9/10] mce: " Akinobu Mita
@ 2007-07-16 14:01 ` Akinobu Mita
  9 siblings, 0 replies; 24+ messages in thread
From: Akinobu Mita @ 2007-07-16 14:01 UTC (permalink / raw)
  To: linux-kernel, Ashok Raj

- Fix resource leakage in error case within detect_cache_attributes()

- Introduce cache_dev_map cpumask to track whether cache interface for
  CPU is successfully added by cache_add_dev() or not.

  cache_add_dev() may fail with out of memory error. In order to
  avoid cache_remove_dev() with that uninitialized cache interface when
  CPU_DEAD event is delivered we need to have the cache_dev_map cpumask.

  (We cannot change cache_add_dev() from CPU_ONLINE event handler
  to CPU_UP_PREPARE event handler. Because cache_add_dev() needs
  to do cpuid and store the results with its CPU online.)

Cc: Ashok Raj <ashok.raj@intel.com>
Signed-off-by: Akinobu Mita <akinobu.mita@gmail.com>

---
 arch/i386/kernel/cpu/intel_cacheinfo.c |   57 +++++++++++++++++++++++----------
 1 file changed, 40 insertions(+), 17 deletions(-)

Index: 2.6-mm/arch/i386/kernel/cpu/intel_cacheinfo.c
===================================================================
--- 2.6-mm.orig/arch/i386/kernel/cpu/intel_cacheinfo.c
+++ 2.6-mm/arch/i386/kernel/cpu/intel_cacheinfo.c
@@ -502,6 +502,11 @@ static void __init cache_remove_shared_c
 
 static void free_cache_attributes(unsigned int cpu)
 {
+	int i;
+
+	for (i = 0; i < num_cache_leaves; i++)
+		cache_remove_shared_cpu_map(cpu, i);
+
 	kfree(cpuid4_info[cpu]);
 	cpuid4_info[cpu] = NULL;
 }
@@ -509,8 +514,8 @@ static void free_cache_attributes(unsign
 static int __cpuinit detect_cache_attributes(unsigned int cpu)
 {
 	struct _cpuid4_info	*this_leaf;
-	unsigned long 		j;
-	int 			retval;
+	unsigned long		j;
+	int			retval;
 	cpumask_t		oldmask;
 
 	if (num_cache_leaves == 0)
@@ -527,19 +532,26 @@ static int __cpuinit detect_cache_attrib
 		goto out;
 
 	/* Do cpuid and store the results */
-	retval = 0;
 	for (j = 0; j < num_cache_leaves; j++) {
 		this_leaf = CPUID4_INFO_IDX(cpu, j);
 		retval = cpuid4_cache_lookup(j, this_leaf);
-		if (unlikely(retval < 0))
+		if (unlikely(retval < 0)) {
+			int i;
+
+			for (i = 0; i < j; i++)
+				cache_remove_shared_cpu_map(cpu, i);
 			break;
+		}
 		cache_shared_cpu_map_setup(cpu, j);
 	}
 	set_cpus_allowed(current, oldmask);
 
 out:
-	if (retval)
-		free_cache_attributes(cpu);
+	if (retval) {
+		kfree(cpuid4_info[cpu]);
+		cpuid4_info[cpu] = NULL;
+	}
+
 	return retval;
 }
 
@@ -683,13 +695,14 @@ static void cpuid4_cache_sysfs_exit(unsi
 
 static int __cpuinit cpuid4_cache_sysfs_init(unsigned int cpu)
 {
+	int err;
 
 	if (num_cache_leaves == 0)
 		return -ENOENT;
 
-	detect_cache_attributes(cpu);
-	if (cpuid4_info[cpu] == NULL)
-		return -ENOENT;
+	err = detect_cache_attributes(cpu);
+	if (err)
+		return err;
 
 	/* Allocate all required memory */
 	cache_kobject[cpu] = kzalloc(sizeof(struct kobject), GFP_KERNEL);
@@ -708,13 +721,15 @@ err_out:
 	return -ENOMEM;
 }
 
+static cpumask_t cache_dev_map = CPU_MASK_NONE;
+
 /* Add/Remove cache interface for CPU device */
 static int __cpuinit cache_add_dev(struct sys_device * sys_dev)
 {
 	unsigned int cpu = sys_dev->id;
 	unsigned long i, j;
 	struct _index_kobject *this_object;
-	int retval = 0;
+	int retval;
 
 	retval = cpuid4_cache_sysfs_init(cpu);
 	if (unlikely(retval < 0))
@@ -724,6 +739,10 @@ static int __cpuinit cache_add_dev(struc
 	kobject_set_name(cache_kobject[cpu], "%s", "cache");
 	cache_kobject[cpu]->ktype = &ktype_percpu_entry;
 	retval = kobject_register(cache_kobject[cpu]);
+	if (retval < 0) {
+		cpuid4_cache_sysfs_exit(cpu);
+		return retval;
+	}
 
 	for (i = 0; i < num_cache_leaves; i++) {
 		this_object = INDEX_KOBJECT_PTR(cpu,i);
@@ -743,6 +762,9 @@ static int __cpuinit cache_add_dev(struc
 			break;
 		}
 	}
+	if (!retval)
+		cpu_set(cpu, cache_dev_map);
+
 	return retval;
 }
 
@@ -751,13 +773,14 @@ static void __cpuinit cache_remove_dev(s
 	unsigned int cpu = sys_dev->id;
 	unsigned long i;
 
-	for (i = 0; i < num_cache_leaves; i++) {
-		cache_remove_shared_cpu_map(cpu, i);
+	if (!cpu_isset(cpu, cache_dev_map))
+		return;
+	cpu_clear(cpu, cache_dev_map);
+
+	for (i = 0; i < num_cache_leaves; i++)
 		kobject_unregister(&(INDEX_KOBJECT_PTR(cpu,i)->kobj));
-	}
 	kobject_unregister(cache_kobject[cpu]);
 	cpuid4_cache_sysfs_exit(cpu);
-	return;
 }
 
 static int __cpuinit cacheinfo_cpu_callback(struct notifier_block *nfb,
@@ -782,7 +805,7 @@ static int __cpuinit cacheinfo_cpu_callb
 
 static struct notifier_block __cpuinitdata cacheinfo_cpu_notifier =
 {
-    .notifier_call = cacheinfo_cpu_callback,
+	.notifier_call = cacheinfo_cpu_callback,
 };
 
 static int __cpuinit cache_sysfs_init(void)
@@ -795,8 +818,8 @@ static int __cpuinit cache_sysfs_init(vo
 	register_hotcpu_notifier(&cacheinfo_cpu_notifier);
 
 	for_each_online_cpu(i) {
-		cacheinfo_cpu_callback(&cacheinfo_cpu_notifier, CPU_ONLINE,
-			(void *)(long)i);
+		struct sys_device *sys_dev = get_cpu_sysdev(i);
+		cache_add_dev(sys_dev);
 	}
 
 	return 0;

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH 2/10] sysdev: add error check in sysdev_register()
  2007-07-16 13:51 ` [PATCH 2/10] sysdev: add error check in sysdev_register() Akinobu Mita
@ 2007-07-16 15:12   ` Cornelia Huck
  2007-07-16 16:18     ` Akinobu Mita
  0 siblings, 1 reply; 24+ messages in thread
From: Cornelia Huck @ 2007-07-16 15:12 UTC (permalink / raw)
  To: Akinobu Mita; +Cc: linux-kernel, Greg Kroah-Hartman

On Mon, 16 Jul 2007 22:51:38 +0900,
Akinobu Mita <akinobu.mita@gmail.com> wrote:

> This patch enables to catch the errors returned by add() procedure of
> sysdev driver in sysdev_register.

> @@ -253,23 +254,50 @@ int sysdev_register(struct sys_device * 
> 
>  		/* Notify global drivers */
>  		list_for_each_entry(drv, &sysdev_drivers, entry) {
> -			if (drv->add)
> -				drv->add(sysdev);
> +			if (drv->add) {
> +				error = drv->add(sysdev);
> +				if (error)
> +					goto error_sysdev;
> +			}
> +			added_sysdev++;
>  		}

This looks asymmetric to me. Imagine we have a device/driver
combination for which drv->add will return -ESOMEERROR. Now, it depends
on the order:

1. The device is registered when the driver is already registered:
registration of the device will fail.

2. The driver is registered when the device is already registered:
driver registration will succeed, and the device will stay registered.

I'm not sure if that is what we really want.

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH 3/10] sysfs: fix error handling in create_files()
  2007-07-16 13:52 ` [PATCH 3/10] sysfs: fix error handling in create_files() Akinobu Mita
@ 2007-07-16 15:29   ` Cornelia Huck
  2007-07-16 16:28     ` Akinobu Mita
  2007-07-16 16:29     ` Greg KH
  0 siblings, 2 replies; 24+ messages in thread
From: Cornelia Huck @ 2007-07-16 15:29 UTC (permalink / raw)
  To: Akinobu Mita; +Cc: linux-kernel, Greg Kroah-Hartman

On Mon, 16 Jul 2007 22:52:30 +0900,
Akinobu Mita <akinobu.mita@gmail.com> wrote:

> Current error handling in create_files() attempts to remove
> all attributes passed by argument by remove_files(). But it should
> only remove the attributes that have been successfully added.

While this is certainly cleaner, a question out of curiousity: Does
this fix any problem you saw? sysfs_hash_and_remove() used to be safe
on non-existing attributes...

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH 1/10] sysfs: fix kmem_cache_free(NULL)
  2007-07-16 13:50 ` [PATCH 1/10] sysfs: fix kmem_cache_free(NULL) Akinobu Mita
@ 2007-07-16 15:43   ` Greg KH
  2007-07-16 16:53     ` Akinobu Mita
  0 siblings, 1 reply; 24+ messages in thread
From: Greg KH @ 2007-07-16 15:43 UTC (permalink / raw)
  To: Akinobu Mita, linux-kernel

On Mon, Jul 16, 2007 at 10:50:28PM +0900, Akinobu Mita wrote:
> This patch fixes out of memory error handling in sysfs_new_dirent().
> kmem_cache_free() with NULL is not allowed.

Why not just allow kmem_cache_free() to allow NULL like other functions
in the kernel?

thanks,

greg k-h

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH 2/10] sysdev: add error check in sysdev_register()
  2007-07-16 15:12   ` Cornelia Huck
@ 2007-07-16 16:18     ` Akinobu Mita
  0 siblings, 0 replies; 24+ messages in thread
From: Akinobu Mita @ 2007-07-16 16:18 UTC (permalink / raw)
  To: Cornelia Huck; +Cc: linux-kernel, Greg Kroah-Hartman

2007/7/17, Cornelia Huck <cornelia.huck@de.ibm.com>:
> On Mon, 16 Jul 2007 22:51:38 +0900,
> Akinobu Mita <akinobu.mita@gmail.com> wrote:
>
> > This patch enables to catch the errors returned by add() procedure of
> > sysdev driver in sysdev_register.
>
> > @@ -253,23 +254,50 @@ int sysdev_register(struct sys_device *
> >
> >               /* Notify global drivers */
> >               list_for_each_entry(drv, &sysdev_drivers, entry) {
> > -                     if (drv->add)
> > -                             drv->add(sysdev);
> > +                     if (drv->add) {
> > +                             error = drv->add(sysdev);
> > +                             if (error)
> > +                                     goto error_sysdev;
> > +                     }
> > +                     added_sysdev++;
> >               }
>
> This looks asymmetric to me. Imagine we have a device/driver
> combination for which drv->add will return -ESOMEERROR. Now, it depends
> on the order:
>
> 1. The device is registered when the driver is already registered:
> registration of the device will fail.
>
> 2. The driver is registered when the device is already registered:
> driver registration will succeed, and the device will stay registered.
>
> I'm not sure if that is what we really want.

Thanks, I didn't realize the asymmetry.
I think we need to add same check to sysdev_driver_register().
But I need to check every sysdev driver's add() functions before
making this change. So this patch dropped from this series of patches
for now.

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH 3/10] sysfs: fix error handling in create_files()
  2007-07-16 15:29   ` Cornelia Huck
@ 2007-07-16 16:28     ` Akinobu Mita
  2007-07-16 16:29     ` Greg KH
  1 sibling, 0 replies; 24+ messages in thread
From: Akinobu Mita @ 2007-07-16 16:28 UTC (permalink / raw)
  To: Cornelia Huck; +Cc: linux-kernel, Greg Kroah-Hartman

2007/7/17, Cornelia Huck <cornelia.huck@de.ibm.com>:
> On Mon, 16 Jul 2007 22:52:30 +0900,
> Akinobu Mita <akinobu.mita@gmail.com> wrote:
>
> > Current error handling in create_files() attempts to remove
> > all attributes passed by argument by remove_files(). But it should
> > only remove the attributes that have been successfully added.
>
> While this is certainly cleaner, a question out of curiousity: Does
> this fix any problem you saw? sysfs_hash_and_remove() used to be safe
> on non-existing attributes...

Maybe I folded all suspicious bugfixes that I found while I spend a lot of
time to pass my cpu hotplug/unplug test script.

I'll test again without this patch.

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH 3/10] sysfs: fix error handling in create_files()
  2007-07-16 15:29   ` Cornelia Huck
  2007-07-16 16:28     ` Akinobu Mita
@ 2007-07-16 16:29     ` Greg KH
  2007-07-17 16:22       ` Akinobu Mita
  1 sibling, 1 reply; 24+ messages in thread
From: Greg KH @ 2007-07-16 16:29 UTC (permalink / raw)
  To: Cornelia Huck; +Cc: Akinobu Mita, linux-kernel

On Mon, Jul 16, 2007 at 05:29:45PM +0200, Cornelia Huck wrote:
> On Mon, 16 Jul 2007 22:52:30 +0900,
> Akinobu Mita <akinobu.mita@gmail.com> wrote:
> 
> > Current error handling in create_files() attempts to remove
> > all attributes passed by argument by remove_files(). But it should
> > only remove the attributes that have been successfully added.
> 
> While this is certainly cleaner, a question out of curiousity: Does
> this fix any problem you saw? sysfs_hash_and_remove() used to be safe
> on non-existing attributes...

I agree, the existing code should work just fine, are you finding that
it does not for some reason?

thanks,

greg k-h

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH 1/10] sysfs: fix kmem_cache_free(NULL)
  2007-07-16 15:43   ` Greg KH
@ 2007-07-16 16:53     ` Akinobu Mita
  2007-07-17  7:04       ` Pekka J Enberg
  0 siblings, 1 reply; 24+ messages in thread
From: Akinobu Mita @ 2007-07-16 16:53 UTC (permalink / raw)
  To: Greg KH; +Cc: linux-kernel, Pekka Enberg

2007/7/17, Greg KH <gregkh@suse.de>:
> On Mon, Jul 16, 2007 at 10:50:28PM +0900, Akinobu Mita wrote:
> > This patch fixes out of memory error handling in sysfs_new_dirent().
> > kmem_cache_free() with NULL is not allowed.
>
> Why not just allow kmem_cache_free() to allow NULL like other functions
> in the kernel?

It seems there was discussion about this and no conclusion
yet?

http://lkml.org/lkml/2007/3/19/42

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH 1/10] sysfs: fix kmem_cache_free(NULL)
  2007-07-16 16:53     ` Akinobu Mita
@ 2007-07-17  7:04       ` Pekka J Enberg
  0 siblings, 0 replies; 24+ messages in thread
From: Pekka J Enberg @ 2007-07-17  7:04 UTC (permalink / raw)
  To: Akinobu Mita; +Cc: Greg KH, linux-kernel

2007/7/17, Greg KH <gregkh@suse.de>:
> > Why not just allow kmem_cache_free() to allow NULL like other functions
> > in the kernel?

On Tue, 17 Jul 2007, Akinobu Mita wrote:
> It seems there was discussion about this and no conclusion
> yet?
 
IIRC the patch was rejected by relevant maintainers because it slows 
down kmem_cache_free() which is presumably bad for some hot-paths...

				Pekka

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH 4/10] cpu: deliver CPU_UP_CANCELED only to NOTIFY_OKed callbacks with CPU_UP_PREPARE
  2007-07-16 13:53 ` [PATCH 4/10] cpu: deliver CPU_UP_CANCELED only to NOTIFY_OKed callbacks with CPU_UP_PREPARE Akinobu Mita
@ 2007-07-17  8:23   ` Gautham R Shenoy
  2007-07-17 17:18     ` Akinobu Mita
  0 siblings, 1 reply; 24+ messages in thread
From: Gautham R Shenoy @ 2007-07-17  8:23 UTC (permalink / raw)
  To: Akinobu Mita, linux-kernel, Rusty Russell, Greg Kroah-Hartman,
	Dmitriy Zavin, H. Peter Anvin, Andi Kleen, Ashok Raj
  Cc: Srivatsa Vaddagiri, heiko.carstens, kiran, clameter, Andrew Morton

Hi Akinobu,

On Mon, Jul 16, 2007 at 10:53:47PM +0900, Akinobu Mita wrote:
> The functions in a CPU notifier chain is called with CPU_UP_PREPARE event
> before making the CPU online. If one of the callback returns NOTIFY_BAD,
> it stops to deliver CPU_UP_PREPARE event, and CPU online operation is canceled.
> Then CPU_UP_CANCELED event is delivered to the functions in a CPU notifier
> chain again.
> 
> This CPU_UP_CANCELED event is delivered to the functions which have been
> called with CPU_UP_PREPARE, not delivered to the functions which haven't
> been called with CPU_UP_PREPARE.
> 
> The problem that makes existing cpu hotplug error handlings complex is
> that the CPU_UP_CANCELED event is delivered to the function that has
> returned NOTIFY_BAD, too.
> 
> Usually we don't expect to call destructor function against the
> object that has failed to initialize. It is like:
> 
> 	err = register_something();
> 	if (err) {
> 		unregister_something(); // bug
> 		return err;
> 	}
> 
> So it is natural to deliver CPU_UP_CANCELED event only to the functions
> that have returned NOTIFY_OK with CPU_UP_PREPARE event and not to call
> the function that have returned NOTIFY_BAD. This is what this patch is doing.

Yes, this makes sense. However, it might break slab.
If I am not mistaken, slab code initializes multiple objects in 
CPU_UP_PREPARE and relies on the CPU_UP_CANCELLED to destroy the 
objects which successfully got initialized before the some object's
initialization went bad.

But then, I'm no expert on slab, so Cc'ing the right people.

There must be a way to share the code across CPU_UP_CANCELLED and
destruction of correctly initialized objects in CPU_UP_PREPARE 
(on a failure). That fix might be required before this patch goes in.

Regards
gautham.

> 
> Otherwise, every cpu hotplug notifiler has to track whether
> notifiler event is failed or not for each cpu.
> (drivers/base/topology.c is doing this with topology_dev_map)
> 
> Similary this patch makes same thing with CPU_DOWN_PREPARE and
> CPU_DOWN_FAILED evnets.
> 
> Cc: Rusty Russell <rusty@rustcorp.com.au>
> Signed-off-by: Akinobu Mita <akinobu.mita@gmail.com>
> 
> ---
>  kernel/cpu.c |    2 ++
>  1 file changed, 2 insertions(+)
> 
> Index: 2.6-mm/kernel/cpu.c
> ===================================================================
> --- 2.6-mm.orig/kernel/cpu.c
> +++ 2.6-mm/kernel/cpu.c
> @@ -138,6 +138,7 @@ static int _cpu_down(unsigned int cpu, i
>  	err = __raw_notifier_call_chain(&cpu_chain, CPU_DOWN_PREPARE | mod,
>  					hcpu, -1, &nr_calls);
>  	if (err == NOTIFY_BAD) {
> +		nr_calls--;
>  		__raw_notifier_call_chain(&cpu_chain, CPU_DOWN_FAILED | mod,
>  					  hcpu, nr_calls, NULL);
>  		printk("%s: attempt to take down CPU %u failed\n",
> @@ -221,6 +222,7 @@ static int __cpuinit _cpu_up(unsigned in
>  	ret = __raw_notifier_call_chain(&cpu_chain, CPU_UP_PREPARE | mod, hcpu,
>  							-1, &nr_calls);
>  	if (ret == NOTIFY_BAD) {
> +		nr_calls--;
>  		printk("%s: attempt to bring up CPU %u failed\n",
>  				__FUNCTION__, cpu);
>  		ret = -EINVAL;
> -
> To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
> Please read the FAQ at  http://www.tux.org/lkml/
> 

-- 
Gautham R Shenoy
Linux Technology Center
IBM India.
"Freedom comes with a price tag of responsibility, which is still a bargain,
because Freedom is priceless!"

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH 3/10] sysfs: fix error handling in create_files()
  2007-07-16 16:29     ` Greg KH
@ 2007-07-17 16:22       ` Akinobu Mita
  0 siblings, 0 replies; 24+ messages in thread
From: Akinobu Mita @ 2007-07-17 16:22 UTC (permalink / raw)
  To: Greg KH; +Cc: Cornelia Huck, linux-kernel

2007/7/17, Greg KH <gregkh@suse.de>:
> On Mon, Jul 16, 2007 at 05:29:45PM +0200, Cornelia Huck wrote:
> > On Mon, 16 Jul 2007 22:52:30 +0900,
> > Akinobu Mita <akinobu.mita@gmail.com> wrote:
> >
> > > Current error handling in create_files() attempts to remove
> > > all attributes passed by argument by remove_files(). But it should
> > > only remove the attributes that have been successfully added.
> >
> > While this is certainly cleaner, a question out of curiousity: Does
> > this fix any problem you saw? sysfs_hash_and_remove() used to be safe
> > on non-existing attributes...
>
> I agree, the existing code should work just fine, are you finding that
> it does not for some reason?

As you and Cornelia said, this patch doesn't fix anything.
I made a wrong guess for some oops call trace.

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH 4/10] cpu: deliver CPU_UP_CANCELED only to NOTIFY_OKed callbacks with CPU_UP_PREPARE
  2007-07-17  8:23   ` Gautham R Shenoy
@ 2007-07-17 17:18     ` Akinobu Mita
  2007-07-18  6:54       ` Gautham R Shenoy
  0 siblings, 1 reply; 24+ messages in thread
From: Akinobu Mita @ 2007-07-17 17:18 UTC (permalink / raw)
  To: ego
  Cc: linux-kernel, Rusty Russell, Greg Kroah-Hartman, Dmitriy Zavin,
	H. Peter Anvin, Andi Kleen, Ashok Raj, Srivatsa Vaddagiri,
	heiko.carstens, kiran, clameter, Andrew Morton

> > So it is natural to deliver CPU_UP_CANCELED event only to the functions
> > that have returned NOTIFY_OK with CPU_UP_PREPARE event and not to call
> > the function that have returned NOTIFY_BAD. This is what this patch is doing.
>
> Yes, this makes sense.

Thank you for making sure of it.

>[...] However, it might break slab.
> If I am not mistaken, slab code initializes multiple objects in
> CPU_UP_PREPARE and relies on the CPU_UP_CANCELLED to destroy the
> objects which successfully got initialized before the some object's
> initialization went bad.

My testing machine is ordinary dual core non numa box. So it might not
trigger the problem that you are warried about under heavy slab alloc
failure injection.

At first glance I couln't find the problem in cpu hottplug code in slab.c yet,
but found some memory leak path. (it doesn't break slab though)

Index: 2.6-git/mm/slab.c
===================================================================
--- 2.6-git.orig/mm/slab.c
+++ 2.6-git/mm/slab.c
@@ -1221,13 +1221,18 @@ static int __cpuinit cpuup_callback(stru
 				shared = alloc_arraycache(node,
 					cachep->shared * cachep->batchcount,
 					0xbaadf00d);
-				if (!shared)
+				if (!shared) {
+					kfree(nc);
 					goto bad;
+				}
 			}
 			if (use_alien_caches) {
                                 alien = alloc_alien_cache(node, cachep->limit);
-                                if (!alien)
-                                        goto bad;
+                                if (!alien) {
+					kfree(shared);
+					kfree(nc);
+					goto bad;
+				}
                         }
 			cachep->array[cpu] = nc;
 			l3 = cachep->nodelists[node];

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH 4/10] cpu: deliver CPU_UP_CANCELED only to NOTIFY_OKed callbacks with CPU_UP_PREPARE
  2007-07-17 17:18     ` Akinobu Mita
@ 2007-07-18  6:54       ` Gautham R Shenoy
  2007-07-18 16:28         ` Akinobu Mita
  0 siblings, 1 reply; 24+ messages in thread
From: Gautham R Shenoy @ 2007-07-18  6:54 UTC (permalink / raw)
  To: Akinobu Mita
  Cc: linux-kernel, Rusty Russell, Greg Kroah-Hartman, Dmitriy Zavin,
	H. Peter Anvin, Andi Kleen, Ashok Raj, Srivatsa Vaddagiri,
	heiko.carstens, kiran, clameter, Andrew Morton

Hi!
On Wed, Jul 18, 2007 at 02:18:41AM +0900, Akinobu Mita wrote:
> >> So it is natural to deliver CPU_UP_CANCELED event only to the functions
> >> that have returned NOTIFY_OK with CPU_UP_PREPARE event and not to call
> >> the function that have returned NOTIFY_BAD. This is what this patch is 
> >doing.
> >
> >Yes, this makes sense.
> 
> Thank you for making sure of it.
> 
> >[...] However, it might break slab.
> >If I am not mistaken, slab code initializes multiple objects in
> >CPU_UP_PREPARE and relies on the CPU_UP_CANCELLED to destroy the
> >objects which successfully got initialized before the some object's
> >initialization went bad.
> 
> My testing machine is ordinary dual core non numa box. So it might not
> trigger the problem that you are warried about under heavy slab alloc
> failure injection.
> 
> At first glance I couln't find the problem in cpu hottplug code in slab.c 
> yet,
> but found some memory leak path. (it doesn't break slab though)

That's what I meant. I shouldn't have used the word "break" :-)
In case of slab, freeing up of resources on an error during CPU_UP_PREPARE,
is currently handled in CPU_UP_CANCELLED.

But, like you reasoned out, it makes more sense for such a subsystem
to free up all the correctly allocated resources before sending a
NOTIFY_BAD, rather than handling it in CPU_UP_CANCELLED. And slab 
needed that fix, which you've provided, before we send the notification
to (nr_calls - 1) callers.

So could you add this patch to series? 

Thanks and Regards
gautham.
> 
> Index: 2.6-git/mm/slab.c
> ===================================================================
> --- 2.6-git.orig/mm/slab.c
> +++ 2.6-git/mm/slab.c
> @@ -1221,13 +1221,18 @@ static int __cpuinit cpuup_callback(stru
> 				shared = alloc_arraycache(node,
> 					cachep->shared * cachep->batchcount,
> 					0xbaadf00d);
> -				if (!shared)
> +				if (!shared) {
> +					kfree(nc);
> 					goto bad;
> +				}
> 			}
> 			if (use_alien_caches) {
>                                 alien = alloc_alien_cache(node, 
>                                 cachep->limit);
> -                                if (!alien)
> -                                        goto bad;
> +                                if (!alien) {
> +					kfree(shared);
> +					kfree(nc);
> +					goto bad;
> +				}
>                         }
> 			cachep->array[cpu] = nc;
> 			l3 = cachep->nodelists[node];

-- 
Gautham R Shenoy
Linux Technology Center
IBM India.
"Freedom comes with a price tag of responsibility, which is still a bargain,
because Freedom is priceless!"

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH 4/10] cpu: deliver CPU_UP_CANCELED only to NOTIFY_OKed callbacks with CPU_UP_PREPARE
  2007-07-18  6:54       ` Gautham R Shenoy
@ 2007-07-18 16:28         ` Akinobu Mita
  0 siblings, 0 replies; 24+ messages in thread
From: Akinobu Mita @ 2007-07-18 16:28 UTC (permalink / raw)
  To: ego
  Cc: linux-kernel, Rusty Russell, Greg Kroah-Hartman, Dmitriy Zavin,
	H. Peter Anvin, Andi Kleen, Ashok Raj, Srivatsa Vaddagiri,
	heiko.carstens, kiran, clameter, Andrew Morton

> > >[...] However, it might break slab.
> > >If I am not mistaken, slab code initializes multiple objects in
> > >CPU_UP_PREPARE and relies on the CPU_UP_CANCELLED to destroy the
> > >objects which successfully got initialized before the some object's
> > >initialization went bad.
> >
> > My testing machine is ordinary dual core non numa box. So it might not
> > trigger the problem that you are warried about under heavy slab alloc
> > failure injection.
> >
> > At first glance I couln't find the problem in cpu hottplug code in slab.c
> > yet,
> > but found some memory leak path. (it doesn't break slab though)
>
> That's what I meant. I shouldn't have used the word "break" :-)
> In case of slab, freeing up of resources on an error during CPU_UP_PREPARE,
> is currently handled in CPU_UP_CANCELLED.

Now I perfectly understand your concern. The last memleak fix
patch did not cover for each cachep->array[cpu] in cache_chain.
So cpu hotplug error handling in slab becomes worse by this change.

> But, like you reasoned out, it makes more sense for such a subsystem
> to free up all the correctly allocated resources before sending a
> NOTIFY_BAD, rather than handling it in CPU_UP_CANCELLED. And slab
> needed that fix, which you've provided, before we send the notification
> to (nr_calls - 1) callers.
>
> So could you add this patch to series?

Sure, and I'll CC you on the slab change.

^ permalink raw reply	[flat|nested] 24+ messages in thread

end of thread, other threads:[~2007-07-18 16:28 UTC | newest]

Thread overview: 24+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2007-07-16 13:48 [PATCH 0/10] CPU hotplug error handling fixes Akinobu Mita
2007-07-16 13:50 ` [PATCH 1/10] sysfs: fix kmem_cache_free(NULL) Akinobu Mita
2007-07-16 15:43   ` Greg KH
2007-07-16 16:53     ` Akinobu Mita
2007-07-17  7:04       ` Pekka J Enberg
2007-07-16 13:51 ` [PATCH 2/10] sysdev: add error check in sysdev_register() Akinobu Mita
2007-07-16 15:12   ` Cornelia Huck
2007-07-16 16:18     ` Akinobu Mita
2007-07-16 13:52 ` [PATCH 3/10] sysfs: fix error handling in create_files() Akinobu Mita
2007-07-16 15:29   ` Cornelia Huck
2007-07-16 16:28     ` Akinobu Mita
2007-07-16 16:29     ` Greg KH
2007-07-17 16:22       ` Akinobu Mita
2007-07-16 13:53 ` [PATCH 4/10] cpu: deliver CPU_UP_CANCELED only to NOTIFY_OKed callbacks with CPU_UP_PREPARE Akinobu Mita
2007-07-17  8:23   ` Gautham R Shenoy
2007-07-17 17:18     ` Akinobu Mita
2007-07-18  6:54       ` Gautham R Shenoy
2007-07-18 16:28         ` Akinobu Mita
2007-07-16 13:54 ` [PATCH 5/10] topology: remove topology_dev_map Akinobu Mita
2007-07-16 13:57 ` [PATCH 6/10] thermal_throttle: fix cpu hotplug error handling Akinobu Mita
2007-07-16 13:58 ` [PATCH 7/10] msr: " Akinobu Mita
2007-07-16 13:58 ` [PATCH 8/10] cpuid: " Akinobu Mita
2007-07-16 14:00 ` [PATCH 9/10] mce: " Akinobu Mita
2007-07-16 14:01 ` [PATCH 10/10] intel_cacheinfo: " Akinobu Mita

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).