linux-kernel.vger.kernel.org archive mirror
* [PATCH 0/2] cpufreq: cpufreq_driver_lock is hot on large systems
@ 2013-02-04 22:45 Nathan Zimmer
  2013-02-04 22:45 ` [PATCH 1/2] cpufreq: Convert the cpufreq_driver_lock to a rwlock Nathan Zimmer
                   ` (2 more replies)
  0 siblings, 3 replies; 55+ messages in thread
From: Nathan Zimmer @ 2013-02-04 22:45 UTC (permalink / raw)
  To: rjw; +Cc: linux-kernel, linux-pm, cpufreq, Nathan Zimmer

I am noticing the cpufreq_driver_lock is quite hot.
On an idle 512 system perf shows me most of the system time is spent on this
lock.  This is quite significant as top shows 5% of time in system time.
My solution was to first convert the lock to a rwlock and then to the rcu.


Nathan Zimmer (2):
  cpufreq: Convert the cpufreq_driver_lock to a rwlock
  cpufreq: Convert the cpufreq_driver_lock to use the rcu

 drivers/cpufreq/cpufreq.c | 139 ++++++++++++++++++++++++++--------------------
 1 file changed, 79 insertions(+), 60 deletions(-)

-- 
1.8.0.1



* [PATCH 1/2] cpufreq: Convert the cpufreq_driver_lock to a rwlock
  2013-02-04 22:45 [PATCH 0/2] cpufreq: cpufreq_driver_lock is hot on large systems Nathan Zimmer
@ 2013-02-04 22:45 ` Nathan Zimmer
  2013-02-05  8:11   ` Viresh Kumar
  2013-02-04 22:45 ` [PATCH 2/2] cpufreq: Convert the cpufreq_driver_lock to use the rcu Nathan Zimmer
  2013-02-05  1:07 ` [PATCH 0/2] cpufreq: cpufreq_driver_lock is hot on large systems Rafael J. Wysocki
  2 siblings, 1 reply; 55+ messages in thread
From: Nathan Zimmer @ 2013-02-04 22:45 UTC (permalink / raw)
  To: rjw; +Cc: linux-kernel, linux-pm, cpufreq, Nathan Zimmer

This completely eliminates the contention I am seeing in __cpufreq_cpu_get.
It also nicely stages the lock to be replaced by the rcu.

CC: "Rafael J. Wysocki" <rjw@sisk.pl>
Signed-off-by: Nathan Zimmer <nzimmer@sgi.com>
---
 drivers/cpufreq/cpufreq.c | 44 ++++++++++++++++++++++----------------------
 1 file changed, 22 insertions(+), 22 deletions(-)

diff --git a/drivers/cpufreq/cpufreq.c b/drivers/cpufreq/cpufreq.c
index 1f93dbd..13a83a2 100644
--- a/drivers/cpufreq/cpufreq.c
+++ b/drivers/cpufreq/cpufreq.c
@@ -45,7 +45,7 @@ static DEFINE_PER_CPU(struct cpufreq_policy *, cpufreq_cpu_data);
 /* This one keeps track of the previously set governor of a removed CPU */
 static DEFINE_PER_CPU(char[CPUFREQ_NAME_LEN], cpufreq_cpu_governor);
 #endif
-static DEFINE_SPINLOCK(cpufreq_driver_lock);
+static DEFINE_RWLOCK(cpufreq_driver_lock);
 
 /*
  * cpu_policy_rwsem is a per CPU reader-writer semaphore designed to cure
@@ -149,7 +149,7 @@ static struct cpufreq_policy *__cpufreq_cpu_get(unsigned int cpu, bool sysfs)
 		goto err_out;
 
 	/* get the cpufreq driver */
-	spin_lock_irqsave(&cpufreq_driver_lock, flags);
+	read_lock_irqsave(&cpufreq_driver_lock, flags);
 
 	if (!cpufreq_driver)
 		goto err_out_unlock;
@@ -167,13 +167,13 @@ static struct cpufreq_policy *__cpufreq_cpu_get(unsigned int cpu, bool sysfs)
 	if (!sysfs && !kobject_get(&data->kobj))
 		goto err_out_put_module;
 
-	spin_unlock_irqrestore(&cpufreq_driver_lock, flags);
+	read_unlock_irqrestore(&cpufreq_driver_lock, flags);
 	return data;
 
 err_out_put_module:
 	module_put(cpufreq_driver->owner);
 err_out_unlock:
-	spin_unlock_irqrestore(&cpufreq_driver_lock, flags);
+	read_unlock_irqrestore(&cpufreq_driver_lock, flags);
 err_out:
 	return NULL;
 }
@@ -751,10 +751,10 @@ static int cpufreq_add_dev_policy(unsigned int cpu,
 				return -EBUSY;
 			}
 
-			spin_lock_irqsave(&cpufreq_driver_lock, flags);
+			write_lock_irqsave(&cpufreq_driver_lock, flags);
 			cpumask_copy(managed_policy->cpus, policy->cpus);
 			per_cpu(cpufreq_cpu_data, cpu) = managed_policy;
-			spin_unlock_irqrestore(&cpufreq_driver_lock, flags);
+			write_unlock_irqrestore(&cpufreq_driver_lock, flags);
 
 			pr_debug("CPU already managed, adding link\n");
 			ret = sysfs_create_link(&dev->kobj,
@@ -850,14 +850,14 @@ static int cpufreq_add_dev_interface(unsigned int cpu,
 			goto err_out_kobj_put;
 	}
 
-	spin_lock_irqsave(&cpufreq_driver_lock, flags);
+	write_lock_irqsave(&cpufreq_driver_lock, flags);
 	for_each_cpu(j, policy->cpus) {
 		if (!cpu_online(j))
 			continue;
 		per_cpu(cpufreq_cpu_data, j) = policy;
 		per_cpu(cpufreq_policy_cpu, j) = policy->cpu;
 	}
-	spin_unlock_irqrestore(&cpufreq_driver_lock, flags);
+	write_unlock_irqrestore(&cpufreq_driver_lock, flags);
 
 	ret = cpufreq_add_dev_symlink(cpu, policy);
 	if (ret)
@@ -999,10 +999,10 @@ static int cpufreq_add_dev(struct device *dev, struct subsys_interface *sif)
 
 
 err_out_unregister:
-	spin_lock_irqsave(&cpufreq_driver_lock, flags);
+	write_lock_irqsave(&cpufreq_driver_lock, flags);
 	for_each_cpu(j, policy->cpus)
 		per_cpu(cpufreq_cpu_data, j) = NULL;
-	spin_unlock_irqrestore(&cpufreq_driver_lock, flags);
+	write_unlock_irqrestore(&cpufreq_driver_lock, flags);
 
 	kobject_put(&policy->kobj);
 	wait_for_completion(&policy->kobj_unregister);
@@ -1042,11 +1042,11 @@ static int __cpufreq_remove_dev(struct device *dev, struct subsys_interface *sif
 
 	pr_debug("unregistering CPU %u\n", cpu);
 
-	spin_lock_irqsave(&cpufreq_driver_lock, flags);
+	write_lock_irqsave(&cpufreq_driver_lock, flags);
 	data = per_cpu(cpufreq_cpu_data, cpu);
 
 	if (!data) {
-		spin_unlock_irqrestore(&cpufreq_driver_lock, flags);
+		write_unlock_irqrestore(&cpufreq_driver_lock, flags);
 		unlock_policy_rwsem_write(cpu);
 		return -EINVAL;
 	}
@@ -1060,7 +1060,7 @@ static int __cpufreq_remove_dev(struct device *dev, struct subsys_interface *sif
 	if (unlikely(cpu != data->cpu)) {
 		pr_debug("removing link\n");
 		cpumask_clear_cpu(cpu, data->cpus);
-		spin_unlock_irqrestore(&cpufreq_driver_lock, flags);
+		write_unlock_irqrestore(&cpufreq_driver_lock, flags);
 		kobj = &dev->kobj;
 		cpufreq_cpu_put(data);
 		unlock_policy_rwsem_write(cpu);
@@ -1089,7 +1089,7 @@ static int __cpufreq_remove_dev(struct device *dev, struct subsys_interface *sif
 		}
 	}
 
-	spin_unlock_irqrestore(&cpufreq_driver_lock, flags);
+	write_unlock_irqrestore(&cpufreq_driver_lock, flags);
 
 	if (unlikely(cpumask_weight(data->cpus) > 1)) {
 		for_each_cpu(j, data->cpus) {
@@ -1109,7 +1109,7 @@ static int __cpufreq_remove_dev(struct device *dev, struct subsys_interface *sif
 		}
 	}
 #else
-	spin_unlock_irqrestore(&cpufreq_driver_lock, flags);
+	write_unlock_irqrestore(&cpufreq_driver_lock, flags);
 #endif
 
 	if (cpufreq_driver->target)
@@ -1878,13 +1878,13 @@ int cpufreq_register_driver(struct cpufreq_driver *driver_data)
 	if (driver_data->setpolicy)
 		driver_data->flags |= CPUFREQ_CONST_LOOPS;
 
-	spin_lock_irqsave(&cpufreq_driver_lock, flags);
+	write_lock_irqsave(&cpufreq_driver_lock, flags);
 	if (cpufreq_driver) {
-		spin_unlock_irqrestore(&cpufreq_driver_lock, flags);
+		write_unlock_irqrestore(&cpufreq_driver_lock, flags);
 		return -EBUSY;
 	}
 	cpufreq_driver = driver_data;
-	spin_unlock_irqrestore(&cpufreq_driver_lock, flags);
+	write_unlock_irqrestore(&cpufreq_driver_lock, flags);
 
 	ret = subsys_interface_register(&cpufreq_interface);
 	if (ret)
@@ -1916,9 +1916,9 @@ int cpufreq_register_driver(struct cpufreq_driver *driver_data)
 err_if_unreg:
 	subsys_interface_unregister(&cpufreq_interface);
 err_null_driver:
-	spin_lock_irqsave(&cpufreq_driver_lock, flags);
+	write_lock_irqsave(&cpufreq_driver_lock, flags);
 	cpufreq_driver = NULL;
-	spin_unlock_irqrestore(&cpufreq_driver_lock, flags);
+	write_unlock_irqrestore(&cpufreq_driver_lock, flags);
 	return ret;
 }
 EXPORT_SYMBOL_GPL(cpufreq_register_driver);
@@ -1944,9 +1944,9 @@ int cpufreq_unregister_driver(struct cpufreq_driver *driver)
 	subsys_interface_unregister(&cpufreq_interface);
 	unregister_hotcpu_notifier(&cpufreq_cpu_notifier);
 
-	spin_lock_irqsave(&cpufreq_driver_lock, flags);
+	write_lock_irqsave(&cpufreq_driver_lock, flags);
 	cpufreq_driver = NULL;
-	spin_unlock_irqrestore(&cpufreq_driver_lock, flags);
+	write_unlock_irqrestore(&cpufreq_driver_lock, flags);
 
 	return 0;
 }
-- 
1.8.0.1



* [PATCH 2/2] cpufreq: Convert the cpufreq_driver_lock to use the rcu
  2013-02-04 22:45 [PATCH 0/2] cpufreq: cpufreq_driver_lock is hot on large systems Nathan Zimmer
  2013-02-04 22:45 ` [PATCH 1/2] cpufreq: Convert the cpufreq_driver_lock to a rwlock Nathan Zimmer
@ 2013-02-04 22:45 ` Nathan Zimmer
  2013-02-05  1:07 ` [PATCH 0/2] cpufreq: cpufreq_driver_lock is hot on large systems Rafael J. Wysocki
  2 siblings, 0 replies; 55+ messages in thread
From: Nathan Zimmer @ 2013-02-04 22:45 UTC (permalink / raw)
  To: rjw; +Cc: linux-kernel, linux-pm, cpufreq, Nathan Zimmer

In general rwlocks are discouraged, so we are moving it to use RCU instead.
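
A rough userspace sketch of the publish/read pattern, under stated
assumptions: C11 atomics stand in for the kernel primitives (a release store
for rcu_assign_pointer(), an acquire load for rcu_dereference()), a pthread
mutex stands in for the writer-side spinlock, and struct driver,
driver_flags(), register_driver() and unregister_driver() are hypothetical
names.  It does not model grace periods at all.

```c
/* Userspace analogy only: C11 atomics approximate RCU publish/read semantics. */
#include <pthread.h>
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical stand-in for struct cpufreq_driver. */
struct driver {
	const char *name;
	unsigned int flags;
};

static _Atomic(struct driver *) current_driver;	/* published pointer */
static pthread_mutex_t driver_lock = PTHREAD_MUTEX_INITIALIZER;	/* writers only */

/* Read path: no lock taken; load the pointer once and use only the local copy. */
static unsigned int driver_flags(void)
{
	struct driver *drv = atomic_load_explicit(&current_driver,
						  memory_order_acquire);

	return drv ? drv->flags : 0;
}

/* Write path: writers serialize on the mutex, then publish with a release store. */
static int register_driver(struct driver *drv)
{
	int ret = 0;

	pthread_mutex_lock(&driver_lock);
	if (atomic_load_explicit(&current_driver, memory_order_relaxed))
		ret = -1;	/* a driver is already registered */
	else
		atomic_store_explicit(&current_driver, drv, memory_order_release);
	pthread_mutex_unlock(&driver_lock);
	return ret;
}

/* Write path: unpublish; this sketch never frees, so no grace period is needed. */
static void unregister_driver(void)
{
	pthread_mutex_lock(&driver_lock);
	atomic_store_explicit(&current_driver, NULL, memory_order_release);
	pthread_mutex_unlock(&driver_lock);
}
```

In the kernel, rcu_dereference() is meant to run inside an
rcu_read_lock()/rcu_read_unlock() section, and a writer that frees the old
object has to wait for a grace period first (for example with
synchronize_rcu()); the sketch sidesteps both because nothing is ever freed.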

CC: "Rafael J. Wysocki" <rjw@sisk.pl>
Signed-off-by: Nathan Zimmer <nzimmer@sgi.com>
---
 drivers/cpufreq/cpufreq.c | 177 +++++++++++++++++++++++++---------------------
 1 file changed, 98 insertions(+), 79 deletions(-)

diff --git a/drivers/cpufreq/cpufreq.c b/drivers/cpufreq/cpufreq.c
index 13a83a2..5c5b9f4 100644
--- a/drivers/cpufreq/cpufreq.c
+++ b/drivers/cpufreq/cpufreq.c
@@ -39,13 +39,13 @@
  * level driver of CPUFreq support, and its spinlock. This lock
  * also protects the cpufreq_cpu_data array.
  */
-static struct cpufreq_driver *cpufreq_driver;
+static struct cpufreq_driver __rcu *cpufreq_driver;
 static DEFINE_PER_CPU(struct cpufreq_policy *, cpufreq_cpu_data);
 #ifdef CONFIG_HOTPLUG_CPU
 /* This one keeps track of the previously set governor of a removed CPU */
 static DEFINE_PER_CPU(char[CPUFREQ_NAME_LEN], cpufreq_cpu_governor);
 #endif
-static DEFINE_RWLOCK(cpufreq_driver_lock);
+static DEFINE_SPINLOCK(cpufreq_driver_lock);
 
 /*
  * cpu_policy_rwsem is a per CPU reader-writer semaphore designed to cure
@@ -143,21 +143,20 @@ static DEFINE_MUTEX(cpufreq_governor_mutex);
 static struct cpufreq_policy *__cpufreq_cpu_get(unsigned int cpu, bool sysfs)
 {
 	struct cpufreq_policy *data;
-	unsigned long flags;
+	struct cpufreq_driver *driver;
 
 	if (cpu >= nr_cpu_ids)
 		goto err_out;
 
 	/* get the cpufreq driver */
-	read_lock_irqsave(&cpufreq_driver_lock, flags);
-
-	if (!cpufreq_driver)
+	rcu_read_lock();
+	driver = rcu_dereference(cpufreq_driver);
+	if (!driver)
 		goto err_out_unlock;
 
-	if (!try_module_get(cpufreq_driver->owner))
+	if (!try_module_get(driver->owner))
 		goto err_out_unlock;
 
-
 	/* get the CPU */
 	data = per_cpu(cpufreq_cpu_data, cpu);
 
@@ -167,13 +166,13 @@ static struct cpufreq_policy *__cpufreq_cpu_get(unsigned int cpu, bool sysfs)
 	if (!sysfs && !kobject_get(&data->kobj))
 		goto err_out_put_module;
 
-	read_unlock_irqrestore(&cpufreq_driver_lock, flags);
+	rcu_read_unlock();
 	return data;
 
 err_out_put_module:
-	module_put(cpufreq_driver->owner);
+	module_put(driver->owner);
 err_out_unlock:
-	read_unlock_irqrestore(&cpufreq_driver_lock, flags);
+	rcu_read_unlock();
 err_out:
 	return NULL;
 }
@@ -193,7 +192,7 @@ static void __cpufreq_cpu_put(struct cpufreq_policy *data, bool sysfs)
 {
 	if (!sysfs)
 		kobject_put(&data->kobj);
-	module_put(cpufreq_driver->owner);
+	module_put(rcu_dereference(cpufreq_driver)->owner);
 }
 
 void cpufreq_cpu_put(struct cpufreq_policy *data)
@@ -261,10 +260,13 @@ static inline void adjust_jiffies(unsigned long val, struct cpufreq_freqs *ci)
 void cpufreq_notify_transition(struct cpufreq_freqs *freqs, unsigned int state)
 {
 	struct cpufreq_policy *policy;
+	struct cpufreq_driver *driver;
 
 	BUG_ON(irqs_disabled());
 
-	freqs->flags = cpufreq_driver->flags;
+	driver = rcu_dereference(cpufreq_driver);
+
+	freqs->flags = driver->flags;
 	pr_debug("notification %u of frequency transition to %u kHz\n",
 		state, freqs->new);
 
@@ -276,7 +278,7 @@ void cpufreq_notify_transition(struct cpufreq_freqs *freqs, unsigned int state)
 		 * which is not equal to what the cpufreq core thinks is
 		 * "old frequency".
 		 */
-		if (!(cpufreq_driver->flags & CPUFREQ_CONST_LOOPS)) {
+		if (!(driver->flags & CPUFREQ_CONST_LOOPS)) {
 			if ((policy) && (policy->cpu == freqs->cpu) &&
 			    (policy->cur) && (policy->cur != freqs->old)) {
 				pr_debug("Warning: CPU frequency is"
@@ -329,11 +331,12 @@ static int cpufreq_parse_governor(char *str_governor, unsigned int *policy,
 				struct cpufreq_governor **governor)
 {
 	int err = -EINVAL;
+	struct cpufreq_driver *driver = rcu_dereference(cpufreq_driver);
 
-	if (!cpufreq_driver)
+	if (!driver)
 		goto out;
 
-	if (cpufreq_driver->setpolicy) {
+	if (driver->setpolicy) {
 		if (!strnicmp(str_governor, "performance", CPUFREQ_NAME_LEN)) {
 			*policy = CPUFREQ_POLICY_PERFORMANCE;
 			err = 0;
@@ -342,7 +345,7 @@ static int cpufreq_parse_governor(char *str_governor, unsigned int *policy,
 			*policy = CPUFREQ_POLICY_POWERSAVE;
 			err = 0;
 		}
-	} else if (cpufreq_driver->target) {
+	} else if (driver->target) {
 		struct cpufreq_governor *t;
 
 		mutex_lock(&cpufreq_governor_mutex);
@@ -493,7 +496,8 @@ static ssize_t store_scaling_governor(struct cpufreq_policy *policy,
  */
 static ssize_t show_scaling_driver(struct cpufreq_policy *policy, char *buf)
 {
-	return scnprintf(buf, CPUFREQ_NAME_PLEN, "%s\n", cpufreq_driver->name);
+	return scnprintf(buf, CPUFREQ_NAME_PLEN, "%s\n",
+			 rcu_dereference(cpufreq_driver)->name);
 }
 
 /**
@@ -505,7 +509,7 @@ static ssize_t show_scaling_available_governors(struct cpufreq_policy *policy,
 	ssize_t i = 0;
 	struct cpufreq_governor *t;
 
-	if (!cpufreq_driver->target) {
+	if (!rcu_dereference(cpufreq_driver)->target) {
 		i += sprintf(buf, "performance powersave");
 		goto out;
 	}
@@ -589,8 +593,10 @@ static ssize_t show_bios_limit(struct cpufreq_policy *policy, char *buf)
 {
 	unsigned int limit;
 	int ret;
-	if (cpufreq_driver->bios_limit) {
-		ret = cpufreq_driver->bios_limit(policy->cpu, &limit);
+	struct cpufreq_driver *driver = rcu_dereference(cpufreq_driver);
+
+	if (driver->bios_limit) {
+		ret = driver->bios_limit(policy->cpu, &limit);
 		if (!ret)
 			return sprintf(buf, "%u\n", limit);
 	}
@@ -711,6 +717,7 @@ static int cpufreq_add_dev_policy(unsigned int cpu,
 				  struct device *dev)
 {
 	int ret = 0;
+	struct cpufreq_driver *driver;
 #ifdef CONFIG_SMP
 	unsigned long flags;
 	unsigned int j;
@@ -724,6 +731,7 @@ static int cpufreq_add_dev_policy(unsigned int cpu,
 		       policy->governor->name, cpu);
 	}
 #endif
+	driver = rcu_dereference(cpufreq_driver);
 
 	for_each_cpu(j, policy->cpus) {
 		struct cpufreq_policy *managed_policy;
@@ -745,16 +753,16 @@ static int cpufreq_add_dev_policy(unsigned int cpu,
 
 			if (lock_policy_rwsem_write(cpu) < 0) {
 				/* Should not go through policy unlock path */
-				if (cpufreq_driver->exit)
-					cpufreq_driver->exit(policy);
+				if (driver->exit)
+					driver->exit(policy);
 				cpufreq_cpu_put(managed_policy);
 				return -EBUSY;
 			}
 
-			write_lock_irqsave(&cpufreq_driver_lock, flags);
+			spin_lock_irqsave(&cpufreq_driver_lock, flags);
 			cpumask_copy(managed_policy->cpus, policy->cpus);
 			per_cpu(cpufreq_cpu_data, cpu) = managed_policy;
-			write_unlock_irqrestore(&cpufreq_driver_lock, flags);
+			spin_unlock_irqrestore(&cpufreq_driver_lock, flags);
 
 			pr_debug("CPU already managed, adding link\n");
 			ret = sysfs_create_link(&dev->kobj,
@@ -767,8 +775,8 @@ static int cpufreq_add_dev_policy(unsigned int cpu,
 			 * Call driver->exit() because only the cpu parent of
 			 * the kobj needed to call init().
 			 */
-			if (cpufreq_driver->exit)
-				cpufreq_driver->exit(policy);
+			if (driver->exit)
+				driver->exit(policy);
 
 			if (!ret)
 				return 1;
@@ -819,6 +827,7 @@ static int cpufreq_add_dev_interface(unsigned int cpu,
 	unsigned long flags;
 	int ret = 0;
 	unsigned int j;
+	struct cpufreq_driver *driver = rcu_dereference(cpufreq_driver);
 
 	/* prepare interface data */
 	ret = kobject_init_and_add(&policy->kobj, &ktype_cpufreq,
@@ -827,37 +836,37 @@ static int cpufreq_add_dev_interface(unsigned int cpu,
 		return ret;
 
 	/* set up files for this cpu device */
-	drv_attr = cpufreq_driver->attr;
+	drv_attr = driver->attr;
 	while ((drv_attr) && (*drv_attr)) {
 		ret = sysfs_create_file(&policy->kobj, &((*drv_attr)->attr));
 		if (ret)
 			goto err_out_kobj_put;
 		drv_attr++;
 	}
-	if (cpufreq_driver->get) {
+	if (driver->get) {
 		ret = sysfs_create_file(&policy->kobj, &cpuinfo_cur_freq.attr);
 		if (ret)
 			goto err_out_kobj_put;
 	}
-	if (cpufreq_driver->target) {
+	if (driver->target) {
 		ret = sysfs_create_file(&policy->kobj, &scaling_cur_freq.attr);
 		if (ret)
 			goto err_out_kobj_put;
 	}
-	if (cpufreq_driver->bios_limit) {
+	if (driver->bios_limit) {
 		ret = sysfs_create_file(&policy->kobj, &bios_limit.attr);
 		if (ret)
 			goto err_out_kobj_put;
 	}
 
-	write_lock_irqsave(&cpufreq_driver_lock, flags);
+	spin_lock_irqsave(&cpufreq_driver_lock, flags);
 	for_each_cpu(j, policy->cpus) {
 		if (!cpu_online(j))
 			continue;
 		per_cpu(cpufreq_cpu_data, j) = policy;
 		per_cpu(cpufreq_policy_cpu, j) = policy->cpu;
 	}
-	write_unlock_irqrestore(&cpufreq_driver_lock, flags);
+	spin_unlock_irqrestore(&cpufreq_driver_lock, flags);
 
 	ret = cpufreq_add_dev_symlink(cpu, policy);
 	if (ret)
@@ -874,8 +883,8 @@ static int cpufreq_add_dev_interface(unsigned int cpu,
 
 	if (ret) {
 		pr_debug("setting policy failed\n");
-		if (cpufreq_driver->exit)
-			cpufreq_driver->exit(policy);
+		if (driver->exit)
+			driver->exit(policy);
 	}
 	return ret;
 
@@ -905,6 +914,7 @@ static int cpufreq_add_dev(struct device *dev, struct subsys_interface *sif)
 #ifdef CONFIG_HOTPLUG_CPU
 	int sibling;
 #endif
+	struct cpufreq_driver *driver = rcu_dereference(cpufreq_driver);
 
 	if (cpu_is_offline(cpu))
 		return 0;
@@ -921,7 +931,7 @@ static int cpufreq_add_dev(struct device *dev, struct subsys_interface *sif)
 	}
 #endif
 
-	if (!try_module_get(cpufreq_driver->owner)) {
+	if (!try_module_get(driver->owner)) {
 		ret = -EINVAL;
 		goto module_out;
 	}
@@ -965,7 +975,7 @@ static int cpufreq_add_dev(struct device *dev, struct subsys_interface *sif)
 	/* call driver. From then on the cpufreq must be able
 	 * to accept all calls to ->verify and ->setpolicy for this CPU
 	 */
-	ret = cpufreq_driver->init(policy);
+	ret = driver->init(policy);
 	if (ret) {
 		pr_debug("initialization failed\n");
 		goto err_unlock_policy;
@@ -992,17 +1002,17 @@ static int cpufreq_add_dev(struct device *dev, struct subsys_interface *sif)
 	unlock_policy_rwsem_write(cpu);
 
 	kobject_uevent(&policy->kobj, KOBJ_ADD);
-	module_put(cpufreq_driver->owner);
+	module_put(driver->owner);
 	pr_debug("initialization complete\n");
 
 	return 0;
 
 
 err_out_unregister:
-	write_lock_irqsave(&cpufreq_driver_lock, flags);
+	spin_lock_irqsave(&cpufreq_driver_lock, flags);
 	for_each_cpu(j, policy->cpus)
 		per_cpu(cpufreq_cpu_data, j) = NULL;
-	write_unlock_irqrestore(&cpufreq_driver_lock, flags);
+	spin_unlock_irqrestore(&cpufreq_driver_lock, flags);
 
 	kobject_put(&policy->kobj);
 	wait_for_completion(&policy->kobj_unregister);
@@ -1015,7 +1025,7 @@ err_free_cpumask:
 err_free_policy:
 	kfree(policy);
 nomem_out:
-	module_put(cpufreq_driver->owner);
+	module_put(driver->owner);
 module_out:
 	return ret;
 }
@@ -1039,14 +1049,15 @@ static int __cpufreq_remove_dev(struct device *dev, struct subsys_interface *sif
 	struct device *cpu_dev;
 	unsigned int j;
 #endif
+	struct cpufreq_driver *driver = rcu_dereference(cpufreq_driver);
 
 	pr_debug("unregistering CPU %u\n", cpu);
 
-	write_lock_irqsave(&cpufreq_driver_lock, flags);
+	spin_lock_irqsave(&cpufreq_driver_lock, flags);
 	data = per_cpu(cpufreq_cpu_data, cpu);
 
 	if (!data) {
-		write_unlock_irqrestore(&cpufreq_driver_lock, flags);
+		spin_unlock_irqrestore(&cpufreq_driver_lock, flags);
 		unlock_policy_rwsem_write(cpu);
 		return -EINVAL;
 	}
@@ -1060,7 +1071,7 @@ static int __cpufreq_remove_dev(struct device *dev, struct subsys_interface *sif
 	if (unlikely(cpu != data->cpu)) {
 		pr_debug("removing link\n");
 		cpumask_clear_cpu(cpu, data->cpus);
-		write_unlock_irqrestore(&cpufreq_driver_lock, flags);
+		spin_unlock_irqrestore(&cpufreq_driver_lock, flags);
 		kobj = &dev->kobj;
 		cpufreq_cpu_put(data);
 		unlock_policy_rwsem_write(cpu);
@@ -1089,7 +1100,7 @@ static int __cpufreq_remove_dev(struct device *dev, struct subsys_interface *sif
 		}
 	}
 
-	write_unlock_irqrestore(&cpufreq_driver_lock, flags);
+	spin_unlock_irqrestore(&cpufreq_driver_lock, flags);
 
 	if (unlikely(cpumask_weight(data->cpus) > 1)) {
 		for_each_cpu(j, data->cpus) {
@@ -1109,10 +1120,10 @@ static int __cpufreq_remove_dev(struct device *dev, struct subsys_interface *sif
 		}
 	}
 #else
-	write_unlock_irqrestore(&cpufreq_driver_lock, flags);
+	spin_unlock_irqrestore(&cpufreq_driver_lock, flags);
 #endif
 
-	if (cpufreq_driver->target)
+	if (driver->target)
 		__cpufreq_governor(data, CPUFREQ_GOV_STOP);
 
 	kobj = &data->kobj;
@@ -1129,8 +1140,8 @@ static int __cpufreq_remove_dev(struct device *dev, struct subsys_interface *sif
 	pr_debug("wait complete\n");
 
 	lock_policy_rwsem_write(cpu);
-	if (cpufreq_driver->exit)
-		cpufreq_driver->exit(data);
+	if (driver->exit)
+		driver->exit(data);
 	unlock_policy_rwsem_write(cpu);
 
 #ifdef CONFIG_HOTPLUG_CPU
@@ -1253,14 +1264,15 @@ static unsigned int __cpufreq_get(unsigned int cpu)
 {
 	struct cpufreq_policy *policy = per_cpu(cpufreq_cpu_data, cpu);
 	unsigned int ret_freq = 0;
+	struct cpufreq_driver *driver = rcu_dereference(cpufreq_driver);
 
-	if (!cpufreq_driver->get)
+	if (!driver->get)
 		return ret_freq;
 
-	ret_freq = cpufreq_driver->get(cpu);
+	ret_freq = driver->get(cpu);
 
 	if (ret_freq && policy->cur &&
-		!(cpufreq_driver->flags & CPUFREQ_CONST_LOOPS)) {
+		!(driver->flags & CPUFREQ_CONST_LOOPS)) {
 		/* verify no discrepancy between actual and
 					saved value exists */
 		if (unlikely(ret_freq != policy->cur)) {
@@ -1320,6 +1332,7 @@ static int cpufreq_bp_suspend(void)
 
 	int cpu = smp_processor_id();
 	struct cpufreq_policy *cpu_policy;
+	struct cpufreq_driver *driver = rcu_dereference(cpufreq_driver);
 
 	pr_debug("suspending cpu %u\n", cpu);
 
@@ -1328,8 +1341,8 @@ static int cpufreq_bp_suspend(void)
 	if (!cpu_policy)
 		return 0;
 
-	if (cpufreq_driver->suspend) {
-		ret = cpufreq_driver->suspend(cpu_policy);
+	if (driver->suspend) {
+		ret = driver->suspend(cpu_policy);
 		if (ret)
 			printk(KERN_ERR "cpufreq: suspend failed in ->suspend "
 					"step on CPU %u\n", cpu_policy->cpu);
@@ -1358,6 +1371,7 @@ static void cpufreq_bp_resume(void)
 
 	int cpu = smp_processor_id();
 	struct cpufreq_policy *cpu_policy;
+	struct cpufreq_driver *driver = rcu_dereference(cpufreq_driver);
 
 	pr_debug("resuming cpu %u\n", cpu);
 
@@ -1366,8 +1380,8 @@ static void cpufreq_bp_resume(void)
 	if (!cpu_policy)
 		return;
 
-	if (cpufreq_driver->resume) {
-		ret = cpufreq_driver->resume(cpu_policy);
+	if (driver->resume) {
+		ret = driver->resume(cpu_policy);
 		if (ret) {
 			printk(KERN_ERR "cpufreq: resume failed in ->resume "
 					"step on CPU %u\n", cpu_policy->cpu);
@@ -1471,6 +1485,7 @@ int __cpufreq_driver_target(struct cpufreq_policy *policy,
 {
 	int retval = -EINVAL;
 	unsigned int old_target_freq = target_freq;
+	struct cpufreq_driver *driver = rcu_dereference(cpufreq_driver);
 
 	if (cpufreq_disabled())
 		return -ENODEV;
@@ -1487,8 +1502,8 @@ int __cpufreq_driver_target(struct cpufreq_policy *policy,
 	if (target_freq == policy->cur)
 		return 0;
 
-	if (cpu_online(policy->cpu) && cpufreq_driver->target)
-		retval = cpufreq_driver->target(policy, target_freq, relation);
+	if (cpu_online(policy->cpu) && driver->target)
+		retval = driver->target(policy, target_freq, relation);
 
 	return retval;
 }
@@ -1521,15 +1536,16 @@ EXPORT_SYMBOL_GPL(cpufreq_driver_target);
 int __cpufreq_driver_getavg(struct cpufreq_policy *policy, unsigned int cpu)
 {
 	int ret = 0;
+	struct cpufreq_driver *driver = rcu_dereference(cpufreq_driver);
 
-	if (!(cpu_online(cpu) && cpufreq_driver->getavg))
+	if (!(cpu_online(cpu) && driver->getavg))
 		return 0;
 
 	policy = cpufreq_cpu_get(policy->cpu);
 	if (!policy)
 		return -EINVAL;
 
-	ret = cpufreq_driver->getavg(policy, cpu);
+	ret = driver->getavg(policy, cpu);
 
 	cpufreq_cpu_put(policy);
 	return ret;
@@ -1679,6 +1695,7 @@ static int __cpufreq_set_policy(struct cpufreq_policy *data,
 				struct cpufreq_policy *policy)
 {
 	int ret = 0;
+	struct cpufreq_driver *driver = rcu_dereference(cpufreq_driver);
 
 	pr_debug("setting new policy for CPU %u: %u - %u kHz\n", policy->cpu,
 		policy->min, policy->max);
@@ -1692,7 +1709,7 @@ static int __cpufreq_set_policy(struct cpufreq_policy *data,
 	}
 
 	/* verify the cpu speed can be set within this limit */
-	ret = cpufreq_driver->verify(policy);
+	ret = driver->verify(policy);
 	if (ret)
 		goto error_out;
 
@@ -1706,7 +1723,7 @@ static int __cpufreq_set_policy(struct cpufreq_policy *data,
 
 	/* verify the cpu speed can be set within this limit,
 	   which might be different to the first one */
-	ret = cpufreq_driver->verify(policy);
+	ret = driver->verify(policy);
 	if (ret)
 		goto error_out;
 
@@ -1720,10 +1737,10 @@ static int __cpufreq_set_policy(struct cpufreq_policy *data,
 	pr_debug("new min and max freqs are %u - %u kHz\n",
 					data->min, data->max);
 
-	if (cpufreq_driver->setpolicy) {
+	if (driver->setpolicy) {
 		data->policy = policy->policy;
 		pr_debug("setting range\n");
-		ret = cpufreq_driver->setpolicy(policy);
+		ret = driver->setpolicy(policy);
 	} else {
 		if (policy->governor != data->governor) {
 			/* save old, working values */
@@ -1771,6 +1788,7 @@ int cpufreq_update_policy(unsigned int cpu)
 	struct cpufreq_policy *data = cpufreq_cpu_get(cpu);
 	struct cpufreq_policy policy;
 	int ret;
+	struct cpufreq_driver *driver = rcu_dereference(cpufreq_driver);
 
 	if (!data) {
 		ret = -ENODEV;
@@ -1791,8 +1809,8 @@ int cpufreq_update_policy(unsigned int cpu)
 
 	/* BIOS might change freq behind our back
 	  -> ask driver for current freq and notify governors about a change */
-	if (cpufreq_driver->get) {
-		policy.cur = cpufreq_driver->get(cpu);
+	if (driver->get) {
+		policy.cur = driver->get(cpu);
 		if (!data->cur) {
 			pr_debug("Driver did not initialize current freq");
 			data->cur = policy.cur;
@@ -1878,19 +1896,19 @@ int cpufreq_register_driver(struct cpufreq_driver *driver_data)
 	if (driver_data->setpolicy)
 		driver_data->flags |= CPUFREQ_CONST_LOOPS;
 
-	write_lock_irqsave(&cpufreq_driver_lock, flags);
-	if (cpufreq_driver) {
-		write_unlock_irqrestore(&cpufreq_driver_lock, flags);
+	spin_lock_irqsave(&cpufreq_driver_lock, flags);
+	if (rcu_dereference(cpufreq_driver)) {
+		spin_unlock_irqrestore(&cpufreq_driver_lock, flags);
 		return -EBUSY;
 	}
-	cpufreq_driver = driver_data;
-	write_unlock_irqrestore(&cpufreq_driver_lock, flags);
+	rcu_assign_pointer(cpufreq_driver, driver_data);
+	spin_unlock_irqrestore(&cpufreq_driver_lock, flags);
 
 	ret = subsys_interface_register(&cpufreq_interface);
 	if (ret)
 		goto err_null_driver;
 
-	if (!(cpufreq_driver->flags & CPUFREQ_STICKY)) {
+	if (!(rcu_dereference(cpufreq_driver)->flags & CPUFREQ_STICKY)) {
 		int i;
 		ret = -ENODEV;
 
@@ -1916,9 +1934,9 @@ int cpufreq_register_driver(struct cpufreq_driver *driver_data)
 err_if_unreg:
 	subsys_interface_unregister(&cpufreq_interface);
 err_null_driver:
-	write_lock_irqsave(&cpufreq_driver_lock, flags);
-	cpufreq_driver = NULL;
-	write_unlock_irqrestore(&cpufreq_driver_lock, flags);
+	spin_lock_irqsave(&cpufreq_driver_lock, flags);
+	rcu_assign_pointer(cpufreq_driver, NULL);
+	spin_unlock_irqrestore(&cpufreq_driver_lock, flags);
 	return ret;
 }
 EXPORT_SYMBOL_GPL(cpufreq_register_driver);
@@ -1935,8 +1953,9 @@ EXPORT_SYMBOL_GPL(cpufreq_register_driver);
 int cpufreq_unregister_driver(struct cpufreq_driver *driver)
 {
 	unsigned long flags;
+	struct cpufreq_driver *old_driver = rcu_dereference(cpufreq_driver);
 
-	if (!cpufreq_driver || (driver != cpufreq_driver))
+	if (!old_driver || (driver != old_driver))
 		return -EINVAL;
 
 	pr_debug("unregistering driver %s\n", driver->name);
@@ -1944,9 +1963,9 @@ int cpufreq_unregister_driver(struct cpufreq_driver *driver)
 	subsys_interface_unregister(&cpufreq_interface);
 	unregister_hotcpu_notifier(&cpufreq_cpu_notifier);
 
-	write_lock_irqsave(&cpufreq_driver_lock, flags);
-	cpufreq_driver = NULL;
-	write_unlock_irqrestore(&cpufreq_driver_lock, flags);
+	spin_lock_irqsave(&cpufreq_driver_lock, flags);
+	rcu_assign_pointer(cpufreq_driver, NULL);
+	spin_unlock_irqrestore(&cpufreq_driver_lock, flags);
 
 	return 0;
 }
-- 
1.8.0.1



* Re: [PATCH 0/2] cpufreq: cpufreq_driver_lock is hot on large systems
  2013-02-04 22:45 [PATCH 0/2] cpufreq: cpufreq_driver_lock is hot on large systems Nathan Zimmer
  2013-02-04 22:45 ` [PATCH 1/2] cpufreq: Convert the cpufreq_driver_lock to a rwlock Nathan Zimmer
  2013-02-04 22:45 ` [PATCH 2/2] cpufreq: Convert the cpufreq_driver_lock to use the rcu Nathan Zimmer
@ 2013-02-05  1:07 ` Rafael J. Wysocki
  2013-02-05  8:28   ` Viresh Kumar
  2 siblings, 1 reply; 55+ messages in thread
From: Rafael J. Wysocki @ 2013-02-05  1:07 UTC (permalink / raw)
  To: Nathan Zimmer, Viresh Kumar; +Cc: linux-kernel, linux-pm, cpufreq, Shawn Guo

On Monday, February 04, 2013 04:45:11 PM Nathan Zimmer wrote:
> I am noticing the cpufreq_driver_lock is quite hot.
> On an idle 512 system perf shows me most of the system time is spent on this
> lock.  This is quite significant as top shows 5% of time in system time.
> My solution was to first convert the lock to a rwlock and then to the rcu.
> 
> 
> Nathan Zimmer (2):
>   cpufreq: Convert the cpufreq_driver_lock to a rwlock
>   cpufreq: Convert the cpufreq_driver_lock to use the rcu
> 
>  drivers/cpufreq/cpufreq.c | 139 ++++++++++++++++++++++++++--------------------
>  1 file changed, 79 insertions(+), 60 deletions(-)

I like these changes.

Viresh, anyone, any comments?

Rafael


-- 
I speak only for myself.
Rafael J. Wysocki, Intel Open Source Technology Center.


* Re: [PATCH 1/2] cpufreq: Convert the cpufreq_driver_lock to a rwlock
  2013-02-04 22:45 ` [PATCH 1/2] cpufreq: Convert the cpufreq_driver_lock to a rwlock Nathan Zimmer
@ 2013-02-05  8:11   ` Viresh Kumar
  0 siblings, 0 replies; 55+ messages in thread
From: Viresh Kumar @ 2013-02-05  8:11 UTC (permalink / raw)
  To: Nathan Zimmer; +Cc: rjw, linux-kernel, linux-pm, cpufreq

On Tue, Feb 5, 2013 at 4:15 AM, Nathan Zimmer <nzimmer@sgi.com> wrote:
> This completely eliminates the contention I am seeing in __cpufreq_cpu_get.
> It also nicely stages the lock to be replaced by the rcu.
>
> CC: "Rafael J. Wysocki" <rjw@sisk.pl>
> Signed-off-by: Nathan Zimmer <nzimmer@sgi.com>
> ---
>  drivers/cpufreq/cpufreq.c | 44 ++++++++++++++++++++++----------------------
>  1 file changed, 22 insertions(+), 22 deletions(-)

You have based it on an old version of this file. Please rebase it on
Rafael's latest linux-next branch.

The patch looks good otherwise.


* Re: [PATCH 0/2] cpufreq: cpufreq_driver_lock is hot on large systems
  2013-02-05  1:07 ` [PATCH 0/2] cpufreq: cpufreq_driver_lock is hot on large systems Rafael J. Wysocki
@ 2013-02-05  8:28   ` Viresh Kumar
  2013-02-05 10:03     ` Rafael J. Wysocki
  0 siblings, 1 reply; 55+ messages in thread
From: Viresh Kumar @ 2013-02-05  8:28 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: Nathan Zimmer, linux-kernel, linux-pm, cpufreq, Shawn Guo, linaro-dev

On Tue, Feb 5, 2013 at 6:37 AM, Rafael J. Wysocki <rjw@sisk.pl> wrote:
> On Monday, February 04, 2013 04:45:11 PM Nathan Zimmer wrote:
>> I am noticing the cpufreq_driver_lock is quite hot.
>> On an idle 512 system perf shows me most of the system time is spent on this
>> lock.  This is quite significant as top shows 5% of time in system time.
>> My solution was to first convert the lock to a rwlock and then to the rcu.
>>
>>
>> Nathan Zimmer (2):
>>   cpufreq: Convert the cpufreq_driver_lock to a rwlock
>>   cpufreq: Convert the cpufreq_driver_lock to use the rcu
>>
>>  drivers/cpufreq/cpufreq.c | 139 ++++++++++++++++++++++++++--------------------
>>  1 file changed, 79 insertions(+), 60 deletions(-)
>
> I like these changes.
>
> Viresh, anyone, any comments?

Hi Nathan/Rafael,

I also like the basic idea behind the patchset, but I don't like the way it
is divided into patches. In my view, undoing something that was added earlier
in the same patchset is highly discouraged, and that is exactly what happens
here.

Patch 2 is a revert of patch 1 plus the RCU changes.

So I would expect a single patch, i.e. both patches merged and rebased on the
latest code.

--
viresh

^ permalink raw reply	[flat|nested] 55+ messages in thread

* Re: [PATCH 0/2] cpufreq: cpufreq_driver_lock is hot on large systems
  2013-02-05 10:03     ` Rafael J. Wysocki
@ 2013-02-05  9:58       ` Viresh Kumar
  2013-02-05 10:13         ` Rafael J. Wysocki
  0 siblings, 1 reply; 55+ messages in thread
From: Viresh Kumar @ 2013-02-05  9:58 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: Nathan Zimmer, linux-kernel, linux-pm, cpufreq, Shawn Guo, linaro-dev

On Tue, Feb 5, 2013 at 3:33 PM, Rafael J. Wysocki <rjw@sisk.pl> wrote:
> I actually don't agree with that, because Nathan's approach shows the
> reasoning that leads to the RCU introduction quite clearly.  So if you
> don't have technical problems with the patchset, I'm going to take it as is.

Great!!

Okay. I don't have any technical problems with it; I reviewed most of it
carefully. The only pending item is the rebase on linux-next; after that I can
give my ack for it.

^ permalink raw reply	[flat|nested] 55+ messages in thread

* Re: [PATCH 0/2] cpufreq: cpufreq_driver_lock is hot on large systems
  2013-02-05  8:28   ` Viresh Kumar
@ 2013-02-05 10:03     ` Rafael J. Wysocki
  2013-02-05  9:58       ` Viresh Kumar
  0 siblings, 1 reply; 55+ messages in thread
From: Rafael J. Wysocki @ 2013-02-05 10:03 UTC (permalink / raw)
  To: Viresh Kumar
  Cc: Nathan Zimmer, linux-kernel, linux-pm, cpufreq, Shawn Guo, linaro-dev

On Tuesday, February 05, 2013 01:58:20 PM Viresh Kumar wrote:
> On Tue, Feb 5, 2013 at 6:37 AM, Rafael J. Wysocki <rjw@sisk.pl> wrote:
> > On Monday, February 04, 2013 04:45:11 PM Nathan Zimmer wrote:
> >> I am noticing the cpufreq_driver_lock is quite hot.
> >> On an idle 512 system perf shows me most of the system time is spent on this
> > >> lock.  This is quite significant as top shows 5% of time in system time.
> >> My solution was to first convert the lock to a rwlock and then to the rcu.
> >>
> >>
> >> Nathan Zimmer (2):
> >>   cpufreq: Convert the cpufreq_driver_lock to a rwlock
> >>   cpufreq: Convert the cpufreq_driver_lock to use the rcu
> >>
> >>  drivers/cpufreq/cpufreq.c | 139 ++++++++++++++++++++++++++--------------------
> >>  1 file changed, 79 insertions(+), 60 deletions(-)
> >
> > I like these changes.
> >
> > Viresh, anyone, any comments?
> 
> Hi Nathan/Rafael,
> 
> I also like the basic idea behind the patchset, but I don't like the way it
> is divided into patches. In my view, undoing something that was added earlier
> in the same patchset is highly discouraged, and that is exactly what happens
> here.
> 
> Patch 2 is a revert of patch 1 plus the RCU changes.
> 
> So I would expect a single patch, i.e. both patches merged and rebased on the
> latest code.

I actually don't agree with that, because Nathan's approach shows the
reasoning that leads to the RCU introduction quite clearly.  So if you
don't have technical problems with the patchset, I'm going to take it as is.

Thanks,
Rafael


-- 
I speak only for myself.
Rafael J. Wysocki, Intel Open Source Technology Center.

^ permalink raw reply	[flat|nested] 55+ messages in thread

* Re: [PATCH 0/2] cpufreq: cpufreq_driver_lock is hot on large systems
  2013-02-05  9:58       ` Viresh Kumar
@ 2013-02-05 10:13         ` Rafael J. Wysocki
  2013-02-05 14:58           ` Nathan Zimmer
  0 siblings, 1 reply; 55+ messages in thread
From: Rafael J. Wysocki @ 2013-02-05 10:13 UTC (permalink / raw)
  To: Viresh Kumar
  Cc: Nathan Zimmer, linux-kernel, linux-pm, cpufreq, Shawn Guo, linaro-dev

On Tuesday, February 05, 2013 03:28:30 PM Viresh Kumar wrote:
> On Tue, Feb 5, 2013 at 3:33 PM, Rafael J. Wysocki <rjw@sisk.pl> wrote:
> > I actually don't agree with that, because Nathan's approach shows the
> > reasoning that leads to the RCU introduction quite clearly.  So if you
> > don't have technical problems with the patchset, I'm going to take it as is.
> 
> Great!!
> 
> Okay. I don't have any technical problems with it; I reviewed most of it
> carefully. The only pending item is the rebase on linux-next; after that I can
> give my ack for it.

Yes, it would be great if it were rebased and retested.

Thanks,
Rafael


-- 
I speak only for myself.
Rafael J. Wysocki, Intel Open Source Technology Center.

^ permalink raw reply	[flat|nested] 55+ messages in thread

* RE: [PATCH 0/2] cpufreq: cpufreq_driver_lock is hot on large systems
  2013-02-05 10:13         ` Rafael J. Wysocki
@ 2013-02-05 14:58           ` Nathan Zimmer
  2013-02-05 22:00             ` Rafael J. Wysocki
  2013-02-06  2:04             ` [PATCH v2 linux-next " Nathan Zimmer
  0 siblings, 2 replies; 55+ messages in thread
From: Nathan Zimmer @ 2013-02-05 14:58 UTC (permalink / raw)
  To: Rafael J. Wysocki, Viresh Kumar
  Cc: linux-kernel, linux-pm, cpufreq, Shawn Guo, linaro-dev

Ok, I'll rebase and retest from linux-next then.



^ permalink raw reply	[flat|nested] 55+ messages in thread

* Re: [PATCH 0/2] cpufreq: cpufreq_driver_lock is hot on large systems
  2013-02-05 14:58           ` Nathan Zimmer
@ 2013-02-05 22:00             ` Rafael J. Wysocki
  2013-02-06  2:04             ` [PATCH v2 linux-next " Nathan Zimmer
  1 sibling, 0 replies; 55+ messages in thread
From: Rafael J. Wysocki @ 2013-02-05 22:00 UTC (permalink / raw)
  To: Nathan Zimmer
  Cc: Viresh Kumar, linux-kernel, linux-pm, cpufreq, Shawn Guo, linaro-dev

On Tuesday, February 05, 2013 02:58:35 PM Nathan Zimmer wrote:
> Ok, I'll rebase and retest from linux-next then.

Thanks!


-- 
I speak only for myself.
Rafael J. Wysocki, Intel Open Source Technology Center.

^ permalink raw reply	[flat|nested] 55+ messages in thread

* [PATCH v2 linux-next 0/2] cpufreq: cpufreq_driver_lock is hot on large systems
  2013-02-05 14:58           ` Nathan Zimmer
  2013-02-05 22:00             ` Rafael J. Wysocki
@ 2013-02-06  2:04             ` Nathan Zimmer
  2013-02-06  2:04               ` [PATCH v2 linux-next 1/2] cpufreq: Convert the cpufreq_driver_lock to a rwlock Nathan Zimmer
                                 ` (2 more replies)
  1 sibling, 3 replies; 55+ messages in thread
From: Nathan Zimmer @ 2013-02-06  2:04 UTC (permalink / raw)
  To: viresh.kumar, rjw; +Cc: linux-kernel, linux-pm, cpufreq, Nathan Zimmer

I am noticing the cpufreq_driver_lock is quite hot.
On an idle 512 system perf shows me most of the system time is spent on this
lock.  This is quite significant as top shows 5% of time in system time.
My solution was to first convert the lock to a rwlock and then to the rcu.

v2: Rebase to linux-next

Nathan Zimmer (2):
  cpufreq: Convert the cpufreq_driver_lock to a rwlock
  cpufreq: Convert the cpufreq_driver_lock to use the rcu

 drivers/cpufreq/cpufreq.c | 150 +++++++++++++++++++++++++---------------------
 1 file changed, 83 insertions(+), 67 deletions(-)

-- 
1.8.0.1


^ permalink raw reply	[flat|nested] 55+ messages in thread

* [PATCH v2 linux-next 1/2] cpufreq: Convert the cpufreq_driver_lock to a rwlock
  2013-02-06  2:04             ` [PATCH v2 linux-next " Nathan Zimmer
@ 2013-02-06  2:04               ` Nathan Zimmer
  2013-02-06  2:47                 ` Viresh Kumar
  2013-02-06  2:04               ` [PATCH v2 linux-next 2/2] cpufreq: Convert the cpufreq_driver_lock to use the rcu Nathan Zimmer
  2013-02-20 23:56               ` [PATCH v3 0/2] cpufreq: cpufreq_driver_lock is hot on large systems Nathan Zimmer
  2 siblings, 1 reply; 55+ messages in thread
From: Nathan Zimmer @ 2013-02-06  2:04 UTC (permalink / raw)
  To: viresh.kumar, rjw; +Cc: linux-kernel, linux-pm, cpufreq, Nathan Zimmer

This eliminates the contention I am seeing in __cpufreq_cpu_get.
It also nicely stages the lock to be replaced by the rcu.

Cc: Viresh Kumar <viresh.kumar@linaro.org>
Cc: "Rafael J. Wysocki" <rjw@sisk.pl>
Signed-off-by: Nathan Zimmer <nzimmer@sgi.com>
---
 drivers/cpufreq/cpufreq.c | 42 +++++++++++++++++++++---------------------
 1 file changed, 21 insertions(+), 21 deletions(-)
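
Note (illustration only, not part of the patch): the diff below turns lookups
into readers and registration/removal into writers. A minimal userspace sketch
of that split follows, assuming C11 atomics in place of the kernel's rwlock_t
and leaving out the irqsave part entirely. All demo_* names are hypothetical
and do not exist in cpufreq.c.

```c
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical stand-in for struct cpufreq_driver. */
struct demo_driver {
	const char *name;
};

/* Toy lock word: 0 = free, N > 0 = N readers inside, -1 = one writer. */
static atomic_int demo_lock;
static struct demo_driver *demo_driver;

static void demo_read_lock(void)
{
	int v = atomic_load(&demo_lock);

	for (;;) {
		if (v < 0) {			/* writer active: spin */
			v = atomic_load(&demo_lock);
			continue;
		}
		/* On failure v is refreshed with the current lock word. */
		if (atomic_compare_exchange_weak(&demo_lock, &v, v + 1))
			return;
	}
}

static void demo_read_unlock(void)
{
	atomic_fetch_sub(&demo_lock, 1);
}

static void demo_write_lock(void)
{
	int expected = 0;

	/* Only succeeds when there are no readers and no writer. */
	while (!atomic_compare_exchange_weak(&demo_lock, &expected, -1))
		expected = 0;
}

static void demo_write_unlock(void)
{
	atomic_store(&demo_lock, 0);
}

/* Writer path, in the spirit of cpufreq_register_driver(). */
int demo_register(struct demo_driver *drv)
{
	int ret = 0;

	demo_write_lock();
	if (demo_driver)
		ret = -1;		/* a driver is already registered */
	else
		demo_driver = drv;
	demo_write_unlock();
	return ret;
}

/* Writer path, in the spirit of cpufreq_unregister_driver(). */
void demo_unregister(void)
{
	demo_write_lock();
	demo_driver = NULL;
	demo_write_unlock();
}

/* Reader path: many callers may hold the read side at the same time. */
const char *demo_current_name(void)
{
	const char *name = NULL;

	demo_read_lock();
	if (demo_driver)
		name = demo_driver->name;
	demo_read_unlock();
	return name;
}
```

Readers no longer exclude each other in this sketch, but every
demo_read_lock() still writes to the shared lock word, so the cache line
holding it keeps moving between CPUs.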

diff --git a/drivers/cpufreq/cpufreq.c b/drivers/cpufreq/cpufreq.c
index 9656420..ef25244 100644
--- a/drivers/cpufreq/cpufreq.c
+++ b/drivers/cpufreq/cpufreq.c
@@ -45,7 +45,7 @@ static DEFINE_PER_CPU(struct cpufreq_policy *, cpufreq_cpu_data);
 /* This one keeps track of the previously set governor of a removed CPU */
 static DEFINE_PER_CPU(char[CPUFREQ_NAME_LEN], cpufreq_cpu_governor);
 #endif
-static DEFINE_SPINLOCK(cpufreq_driver_lock);
+static DEFINE_RWLOCK(cpufreq_driver_lock);
 
 /*
  * cpu_policy_rwsem is a per CPU reader-writer semaphore designed to cure
@@ -149,7 +149,7 @@ static struct cpufreq_policy *__cpufreq_cpu_get(unsigned int cpu, bool sysfs)
 		goto err_out;
 
 	/* get the cpufreq driver */
-	spin_lock_irqsave(&cpufreq_driver_lock, flags);
+	read_lock_irqsave(&cpufreq_driver_lock, flags);
 
 	if (!cpufreq_driver)
 		goto err_out_unlock;
@@ -167,13 +167,13 @@ static struct cpufreq_policy *__cpufreq_cpu_get(unsigned int cpu, bool sysfs)
 	if (!sysfs && !kobject_get(&data->kobj))
 		goto err_out_put_module;
 
-	spin_unlock_irqrestore(&cpufreq_driver_lock, flags);
+	read_unlock_irqrestore(&cpufreq_driver_lock, flags);
 	return data;
 
 err_out_put_module:
 	module_put(cpufreq_driver->owner);
 err_out_unlock:
-	spin_unlock_irqrestore(&cpufreq_driver_lock, flags);
+	read_unlock_irqrestore(&cpufreq_driver_lock, flags);
 err_out:
 	return NULL;
 }
@@ -775,14 +775,14 @@ static int cpufreq_add_dev_interface(unsigned int cpu,
 			goto err_out_kobj_put;
 	}
 
-	spin_lock_irqsave(&cpufreq_driver_lock, flags);
+	write_lock_irqsave(&cpufreq_driver_lock, flags);
 	for_each_cpu(j, policy->cpus) {
 		if (!cpu_online(j))
 			continue;
 		per_cpu(cpufreq_cpu_data, j) = policy;
 		per_cpu(cpufreq_policy_cpu, j) = policy->cpu;
 	}
-	spin_unlock_irqrestore(&cpufreq_driver_lock, flags);
+	write_unlock_irqrestore(&cpufreq_driver_lock, flags);
 
 	ret = cpufreq_add_dev_symlink(cpu, policy);
 	if (ret)
@@ -827,10 +827,10 @@ static int cpufreq_add_policy_cpu(unsigned int cpu, unsigned int sibling,
 
 	__cpufreq_governor(policy, CPUFREQ_GOV_STOP);
 
-	spin_lock_irqsave(&cpufreq_driver_lock, flags);
+	write_lock_irqsave(&cpufreq_driver_lock, flags);
 	cpumask_set_cpu(cpu, policy->cpus);
 	per_cpu(cpufreq_cpu_data, cpu) = policy;
-	spin_unlock_irqrestore(&cpufreq_driver_lock, flags);
+	write_unlock_irqrestore(&cpufreq_driver_lock, flags);
 
 	__cpufreq_governor(policy, CPUFREQ_GOV_START);
 	__cpufreq_governor(policy, CPUFREQ_GOV_LIMITS);
@@ -977,10 +977,10 @@ static int cpufreq_add_dev(struct device *dev, struct subsys_interface *sif)
 	return 0;
 
 err_out_unregister:
-	spin_lock_irqsave(&cpufreq_driver_lock, flags);
+	write_lock_irqsave(&cpufreq_driver_lock, flags);
 	for_each_cpu(j, policy->cpus)
 		per_cpu(cpufreq_cpu_data, j) = NULL;
-	spin_unlock_irqrestore(&cpufreq_driver_lock, flags);
+	write_unlock_irqrestore(&cpufreq_driver_lock, flags);
 
 	kobject_put(&policy->kobj);
 	wait_for_completion(&policy->kobj_unregister);
@@ -1036,12 +1036,12 @@ static int __cpufreq_remove_dev(struct device *dev, struct subsys_interface *sif
 
 	pr_debug("%s: unregistering CPU %u\n", __func__, cpu);
 
-	spin_lock_irqsave(&cpufreq_driver_lock, flags);
+	write_lock_irqsave(&cpufreq_driver_lock, flags);
 	data = per_cpu(cpufreq_cpu_data, cpu);
 
 	if (!data) {
 		pr_debug("%s: No cpu_data found\n", __func__);
-		spin_unlock_irqrestore(&cpufreq_driver_lock, flags);
+		write_unlock_irqrestore(&cpufreq_driver_lock, flags);
 		unlock_policy_rwsem_write(cpu);
 		return -EINVAL;
 	}
@@ -1068,7 +1068,7 @@ static int __cpufreq_remove_dev(struct device *dev, struct subsys_interface *sif
 			cpumask_set_cpu(cpu, data->cpus);
 			ret = sysfs_create_link(&cpu_dev->kobj, &data->kobj,
 					"cpufreq");
-			spin_unlock_irqrestore(&cpufreq_driver_lock, flags);
+			write_unlock_irqrestore(&cpufreq_driver_lock, flags);
 			unlock_policy_rwsem_write(cpu);
 			return -EINVAL;
 		}
@@ -1078,7 +1078,7 @@ static int __cpufreq_remove_dev(struct device *dev, struct subsys_interface *sif
 				__func__, cpu_dev->id, cpu);
 	}
 
-	spin_unlock_irqrestore(&cpufreq_driver_lock, flags);
+	write_unlock_irqrestore(&cpufreq_driver_lock, flags);
 
 	pr_debug("%s: removing link, cpu: %d\n", __func__, cpu);
 	cpufreq_cpu_put(data);
@@ -1866,13 +1866,13 @@ int cpufreq_register_driver(struct cpufreq_driver *driver_data)
 	if (driver_data->setpolicy)
 		driver_data->flags |= CPUFREQ_CONST_LOOPS;
 
-	spin_lock_irqsave(&cpufreq_driver_lock, flags);
+	write_lock_irqsave(&cpufreq_driver_lock, flags);
 	if (cpufreq_driver) {
-		spin_unlock_irqrestore(&cpufreq_driver_lock, flags);
+		write_unlock_irqrestore(&cpufreq_driver_lock, flags);
 		return -EBUSY;
 	}
 	cpufreq_driver = driver_data;
-	spin_unlock_irqrestore(&cpufreq_driver_lock, flags);
+	write_unlock_irqrestore(&cpufreq_driver_lock, flags);
 
 	ret = subsys_interface_register(&cpufreq_interface);
 	if (ret)
@@ -1904,9 +1904,9 @@ int cpufreq_register_driver(struct cpufreq_driver *driver_data)
 err_if_unreg:
 	subsys_interface_unregister(&cpufreq_interface);
 err_null_driver:
-	spin_lock_irqsave(&cpufreq_driver_lock, flags);
+	write_lock_irqsave(&cpufreq_driver_lock, flags);
 	cpufreq_driver = NULL;
-	spin_unlock_irqrestore(&cpufreq_driver_lock, flags);
+	write_unlock_irqrestore(&cpufreq_driver_lock, flags);
 	return ret;
 }
 EXPORT_SYMBOL_GPL(cpufreq_register_driver);
@@ -1932,9 +1932,9 @@ int cpufreq_unregister_driver(struct cpufreq_driver *driver)
 	subsys_interface_unregister(&cpufreq_interface);
 	unregister_hotcpu_notifier(&cpufreq_cpu_notifier);
 
-	spin_lock_irqsave(&cpufreq_driver_lock, flags);
+	write_lock_irqsave(&cpufreq_driver_lock, flags);
 	cpufreq_driver = NULL;
-	spin_unlock_irqrestore(&cpufreq_driver_lock, flags);
+	write_unlock_irqrestore(&cpufreq_driver_lock, flags);
 
 	return 0;
 }
-- 
1.8.0.1


^ permalink raw reply related	[flat|nested] 55+ messages in thread

* [PATCH v2 linux-next 2/2] cpufreq: Convert the cpufreq_driver_lock to use the rcu
  2013-02-06  2:04             ` [PATCH v2 linux-next " Nathan Zimmer
  2013-02-06  2:04               ` [PATCH v2 linux-next 1/2] cpufreq: Convert the cpufreq_driver_lock to a rwlock Nathan Zimmer
@ 2013-02-06  2:04               ` Nathan Zimmer
  2013-02-06  2:52                 ` Viresh Kumar
  2013-02-07 23:29                 ` Rafael J. Wysocki
  2013-02-20 23:56               ` [PATCH v3 0/2] cpufreq: cpufreq_driver_lock is hot on large systems Nathan Zimmer
  2 siblings, 2 replies; 55+ messages in thread
From: Nathan Zimmer @ 2013-02-06  2:04 UTC (permalink / raw)
  To: viresh.kumar, rjw; +Cc: linux-kernel, linux-pm, cpufreq, Nathan Zimmer

In general rwlocks are discouraged, so we are moving it to use RCU instead.

Cc: Viresh Kumar <viresh.kumar@linaro.org>
Cc: "Rafael J. Wysocki" <rjw@sisk.pl>
Signed-off-by: Nathan Zimmer <nzimmer@sgi.com>
---
 drivers/cpufreq/cpufreq.c | 173 +++++++++++++++++++++++++---------------------
 1 file changed, 96 insertions(+), 77 deletions(-)
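
Note (illustration only, not part of the patch): the read side in the diff
below becomes a plain pointer load via rcu_dereference(). The sketch here
models only that publish/read half in userspace, assuming C11 acquire/release
atomics as a rough analogue. It does not model grace periods
(synchronize_rcu() or call_rcu()), and it uses a compare-and-swap where the
patch keeps a spinlock for writers. All demo_* names are hypothetical.

```c
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical stand-in for struct cpufreq_driver. */
struct demo_driver {
	const char *name;
	unsigned int flags;
};

/* The published pointer; readers never take a lock to look at it. */
static _Atomic(struct demo_driver *) demo_driver;

/*
 * Reader side: a single load with acquire ordering, so the fields of the
 * driver are visible once the pointer is. Readers write nothing shared.
 */
static struct demo_driver *demo_deref(void)
{
	return atomic_load_explicit(&demo_driver, memory_order_acquire);
}

/*
 * Writer side: publish drv only if no driver is registered yet. The
 * release ordering makes drv's fields visible before the pointer.
 */
int demo_register(struct demo_driver *drv)
{
	struct demo_driver *expected = NULL;

	if (!atomic_compare_exchange_strong_explicit(&demo_driver, &expected,
						     drv, memory_order_release,
						     memory_order_relaxed))
		return -1;		/* a driver is already registered */
	return 0;
}

/*
 * Unpublish the driver. Kernel code would have to wait for a grace
 * period before the old driver may go away; that step is NOT modeled.
 */
void demo_unregister(void)
{
	atomic_store_explicit(&demo_driver, NULL, memory_order_release);
}

/* Typical reader: take one snapshot of the pointer and use only that. */
const char *demo_current_name(void)
{
	struct demo_driver *drv = demo_deref();

	return drv ? drv->name : NULL;
}
```

Taking one snapshot of the pointer and using only that local copy is the same
pattern as the "driver = rcu_dereference(cpufreq_driver)" locals in the diff.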

diff --git a/drivers/cpufreq/cpufreq.c b/drivers/cpufreq/cpufreq.c
index ef25244..a04ceb9 100644
--- a/drivers/cpufreq/cpufreq.c
+++ b/drivers/cpufreq/cpufreq.c
@@ -39,13 +39,13 @@
  * level driver of CPUFreq support, and its spinlock. This lock
  * also protects the cpufreq_cpu_data array.
  */
-static struct cpufreq_driver *cpufreq_driver;
+static struct cpufreq_driver __rcu *cpufreq_driver;
 static DEFINE_PER_CPU(struct cpufreq_policy *, cpufreq_cpu_data);
 #ifdef CONFIG_HOTPLUG_CPU
 /* This one keeps track of the previously set governor of a removed CPU */
 static DEFINE_PER_CPU(char[CPUFREQ_NAME_LEN], cpufreq_cpu_governor);
 #endif
-static DEFINE_RWLOCK(cpufreq_driver_lock);
+static DEFINE_SPINLOCK(cpufreq_driver_lock);
 
 /*
  * cpu_policy_rwsem is a per CPU reader-writer semaphore designed to cure
@@ -143,21 +143,20 @@ static DEFINE_MUTEX(cpufreq_governor_mutex);
 static struct cpufreq_policy *__cpufreq_cpu_get(unsigned int cpu, bool sysfs)
 {
 	struct cpufreq_policy *data;
-	unsigned long flags;
+	struct cpufreq_driver *driver;
 
 	if (cpu >= nr_cpu_ids)
 		goto err_out;
 
 	/* get the cpufreq driver */
-	read_lock_irqsave(&cpufreq_driver_lock, flags);
-
-	if (!cpufreq_driver)
+	rcu_read_lock();
+	driver = rcu_dereference(cpufreq_driver);
+	if (!driver)
 		goto err_out_unlock;
 
-	if (!try_module_get(cpufreq_driver->owner))
+	if (!try_module_get(driver->owner))
 		goto err_out_unlock;
 
-
 	/* get the CPU */
 	data = per_cpu(cpufreq_cpu_data, cpu);
 
@@ -167,13 +166,13 @@ static struct cpufreq_policy *__cpufreq_cpu_get(unsigned int cpu, bool sysfs)
 	if (!sysfs && !kobject_get(&data->kobj))
 		goto err_out_put_module;
 
-	read_unlock_irqrestore(&cpufreq_driver_lock, flags);
+	rcu_read_unlock();
 	return data;
 
 err_out_put_module:
-	module_put(cpufreq_driver->owner);
+	module_put(driver->owner);
 err_out_unlock:
-	read_unlock_irqrestore(&cpufreq_driver_lock, flags);
+	rcu_read_unlock();
 err_out:
 	return NULL;
 }
@@ -196,7 +195,7 @@ static void __cpufreq_cpu_put(struct cpufreq_policy *data, bool sysfs)
 {
 	if (!sysfs)
 		kobject_put(&data->kobj);
-	module_put(cpufreq_driver->owner);
+	module_put(rcu_dereference(cpufreq_driver)->owner);
 }
 
 void cpufreq_cpu_put(struct cpufreq_policy *data)
@@ -267,13 +266,16 @@ static inline void adjust_jiffies(unsigned long val, struct cpufreq_freqs *ci)
 void cpufreq_notify_transition(struct cpufreq_freqs *freqs, unsigned int state)
 {
 	struct cpufreq_policy *policy;
+	struct cpufreq_driver *driver;
 
 	BUG_ON(irqs_disabled());
 
 	if (cpufreq_disabled())
 		return;
 
-	freqs->flags = cpufreq_driver->flags;
+	driver = rcu_dereference(cpufreq_driver);
+
+	freqs->flags = driver->flags;
 	pr_debug("notification %u of frequency transition to %u kHz\n",
 		state, freqs->new);
 
@@ -285,7 +287,7 @@ void cpufreq_notify_transition(struct cpufreq_freqs *freqs, unsigned int state)
 		 * which is not equal to what the cpufreq core thinks is
 		 * "old frequency".
 		 */
-		if (!(cpufreq_driver->flags & CPUFREQ_CONST_LOOPS)) {
+		if (!(driver->flags & CPUFREQ_CONST_LOOPS)) {
 			if ((policy) && (policy->cpu == freqs->cpu) &&
 			    (policy->cur) && (policy->cur != freqs->old)) {
 				pr_debug("Warning: CPU frequency is"
@@ -337,11 +339,12 @@ static int cpufreq_parse_governor(char *str_governor, unsigned int *policy,
 				struct cpufreq_governor **governor)
 {
 	int err = -EINVAL;
+	struct cpufreq_driver *driver = rcu_dereference(cpufreq_driver);
 
-	if (!cpufreq_driver)
+	if (!driver)
 		goto out;
 
-	if (cpufreq_driver->setpolicy) {
+	if (driver->setpolicy) {
 		if (!strnicmp(str_governor, "performance", CPUFREQ_NAME_LEN)) {
 			*policy = CPUFREQ_POLICY_PERFORMANCE;
 			err = 0;
@@ -350,7 +353,7 @@ static int cpufreq_parse_governor(char *str_governor, unsigned int *policy,
 			*policy = CPUFREQ_POLICY_POWERSAVE;
 			err = 0;
 		}
-	} else if (cpufreq_driver->target) {
+	} else if (driver->target) {
 		struct cpufreq_governor *t;
 
 		mutex_lock(&cpufreq_governor_mutex);
@@ -501,7 +504,8 @@ static ssize_t store_scaling_governor(struct cpufreq_policy *policy,
  */
 static ssize_t show_scaling_driver(struct cpufreq_policy *policy, char *buf)
 {
-	return scnprintf(buf, CPUFREQ_NAME_PLEN, "%s\n", cpufreq_driver->name);
+	return scnprintf(buf, CPUFREQ_NAME_PLEN, "%s\n",
+			 rcu_dereference(cpufreq_driver)->name);
 }
 
 /**
@@ -513,7 +517,7 @@ static ssize_t show_scaling_available_governors(struct cpufreq_policy *policy,
 	ssize_t i = 0;
 	struct cpufreq_governor *t;
 
-	if (!cpufreq_driver->target) {
+	if (!rcu_dereference(cpufreq_driver)->target) {
 		i += sprintf(buf, "performance powersave");
 		goto out;
 	}
@@ -595,8 +599,10 @@ static ssize_t show_bios_limit(struct cpufreq_policy *policy, char *buf)
 {
 	unsigned int limit;
 	int ret;
-	if (cpufreq_driver->bios_limit) {
-		ret = cpufreq_driver->bios_limit(policy->cpu, &limit);
+	struct cpufreq_driver *driver = rcu_dereference(cpufreq_driver);
+
+	if (driver->bios_limit) {
+		ret = driver->bios_limit(policy->cpu, &limit);
 		if (!ret)
 			return sprintf(buf, "%u\n", limit);
 	}
@@ -744,6 +750,7 @@ static int cpufreq_add_dev_interface(unsigned int cpu,
 	unsigned long flags;
 	int ret = 0;
 	unsigned int j;
+	struct cpufreq_driver *driver = rcu_dereference(cpufreq_driver);
 
 	/* prepare interface data */
 	ret = kobject_init_and_add(&policy->kobj, &ktype_cpufreq,
@@ -752,37 +759,37 @@ static int cpufreq_add_dev_interface(unsigned int cpu,
 		return ret;
 
 	/* set up files for this cpu device */
-	drv_attr = cpufreq_driver->attr;
+	drv_attr = driver->attr;
 	while ((drv_attr) && (*drv_attr)) {
 		ret = sysfs_create_file(&policy->kobj, &((*drv_attr)->attr));
 		if (ret)
 			goto err_out_kobj_put;
 		drv_attr++;
 	}
-	if (cpufreq_driver->get) {
+	if (driver->get) {
 		ret = sysfs_create_file(&policy->kobj, &cpuinfo_cur_freq.attr);
 		if (ret)
 			goto err_out_kobj_put;
 	}
-	if (cpufreq_driver->target) {
+	if (driver->target) {
 		ret = sysfs_create_file(&policy->kobj, &scaling_cur_freq.attr);
 		if (ret)
 			goto err_out_kobj_put;
 	}
-	if (cpufreq_driver->bios_limit) {
+	if (driver->bios_limit) {
 		ret = sysfs_create_file(&policy->kobj, &bios_limit.attr);
 		if (ret)
 			goto err_out_kobj_put;
 	}
 
-	write_lock_irqsave(&cpufreq_driver_lock, flags);
+	spin_lock_irqsave(&cpufreq_driver_lock, flags);
 	for_each_cpu(j, policy->cpus) {
 		if (!cpu_online(j))
 			continue;
 		per_cpu(cpufreq_cpu_data, j) = policy;
 		per_cpu(cpufreq_policy_cpu, j) = policy->cpu;
 	}
-	write_unlock_irqrestore(&cpufreq_driver_lock, flags);
+	spin_unlock_irqrestore(&cpufreq_driver_lock, flags);
 
 	ret = cpufreq_add_dev_symlink(cpu, policy);
 	if (ret)
@@ -799,8 +806,8 @@ static int cpufreq_add_dev_interface(unsigned int cpu,
 
 	if (ret) {
 		pr_debug("setting policy failed\n");
-		if (cpufreq_driver->exit)
-			cpufreq_driver->exit(policy);
+		if (driver->exit)
+			driver->exit(policy);
 	}
 	return ret;
 
@@ -827,10 +834,10 @@ static int cpufreq_add_policy_cpu(unsigned int cpu, unsigned int sibling,
 
 	__cpufreq_governor(policy, CPUFREQ_GOV_STOP);
 
-	write_lock_irqsave(&cpufreq_driver_lock, flags);
+	spin_lock_irqsave(&cpufreq_driver_lock, flags);
 	cpumask_set_cpu(cpu, policy->cpus);
 	per_cpu(cpufreq_cpu_data, cpu) = policy;
-	write_unlock_irqrestore(&cpufreq_driver_lock, flags);
+	spin_unlock_irqrestore(&cpufreq_driver_lock, flags);
 
 	__cpufreq_governor(policy, CPUFREQ_GOV_START);
 	__cpufreq_governor(policy, CPUFREQ_GOV_LIMITS);
@@ -861,6 +868,7 @@ static int cpufreq_add_dev(struct device *dev, struct subsys_interface *sif)
 	unsigned int j, cpu = dev->id;
 	int ret = -ENOMEM, found = 0;
 	struct cpufreq_policy *policy;
+	struct cpufreq_driver *driver;
 	unsigned long flags;
 #ifdef CONFIG_HOTPLUG_CPU
 	struct cpufreq_governor *gov;
@@ -871,6 +879,7 @@ static int cpufreq_add_dev(struct device *dev, struct subsys_interface *sif)
 		return 0;
 
 	pr_debug("adding CPU %u\n", cpu);
+	driver = rcu_dereference(cpufreq_driver);
 
 #ifdef CONFIG_SMP
 	/* check whether a different CPU already registered this
@@ -891,7 +900,7 @@ static int cpufreq_add_dev(struct device *dev, struct subsys_interface *sif)
 #endif
 #endif
 
-	if (!try_module_get(cpufreq_driver->owner)) {
+	if (!try_module_get(driver->owner)) {
 		ret = -EINVAL;
 		goto module_out;
 	}
@@ -934,7 +943,7 @@ static int cpufreq_add_dev(struct device *dev, struct subsys_interface *sif)
 	/* call driver. From then on the cpufreq must be able
 	 * to accept all calls to ->verify and ->setpolicy for this CPU
 	 */
-	ret = cpufreq_driver->init(policy);
+	ret = driver->init(policy);
 	if (ret) {
 		pr_debug("initialization failed\n");
 		goto err_unlock_policy;
@@ -971,16 +980,16 @@ static int cpufreq_add_dev(struct device *dev, struct subsys_interface *sif)
 	unlock_policy_rwsem_write(cpu);
 
 	kobject_uevent(&policy->kobj, KOBJ_ADD);
-	module_put(cpufreq_driver->owner);
+	module_put(rcu_dereference(cpufreq_driver)->owner);
 	pr_debug("initialization complete\n");
 
 	return 0;
 
 err_out_unregister:
-	write_lock_irqsave(&cpufreq_driver_lock, flags);
+	spin_lock_irqsave(&cpufreq_driver_lock, flags);
 	for_each_cpu(j, policy->cpus)
 		per_cpu(cpufreq_cpu_data, j) = NULL;
-	write_unlock_irqrestore(&cpufreq_driver_lock, flags);
+	spin_unlock_irqrestore(&cpufreq_driver_lock, flags);
 
 	kobject_put(&policy->kobj);
 	wait_for_completion(&policy->kobj_unregister);
@@ -993,7 +1002,7 @@ err_free_cpumask:
 err_free_policy:
 	kfree(policy);
 nomem_out:
-	module_put(cpufreq_driver->owner);
+	module_put(driver->owner);
 module_out:
 	return ret;
 }
@@ -1033,20 +1042,21 @@ static int __cpufreq_remove_dev(struct device *dev, struct subsys_interface *sif
 	struct kobject *kobj;
 	struct completion *cmp;
 	struct device *cpu_dev;
+	struct cpufreq_driver *driver = rcu_dereference(cpufreq_driver);
 
 	pr_debug("%s: unregistering CPU %u\n", __func__, cpu);
 
-	write_lock_irqsave(&cpufreq_driver_lock, flags);
+	spin_lock_irqsave(&cpufreq_driver_lock, flags);
 	data = per_cpu(cpufreq_cpu_data, cpu);
 
 	if (!data) {
 		pr_debug("%s: No cpu_data found\n", __func__);
-		write_unlock_irqrestore(&cpufreq_driver_lock, flags);
+		spin_unlock_irqrestore(&cpufreq_driver_lock, flags);
 		unlock_policy_rwsem_write(cpu);
 		return -EINVAL;
 	}
 
-	if (cpufreq_driver->target)
+	if (driver->target)
 		__cpufreq_governor(data, CPUFREQ_GOV_STOP);
 
 #ifdef CONFIG_HOTPLUG_CPU
@@ -1068,7 +1078,7 @@ static int __cpufreq_remove_dev(struct device *dev, struct subsys_interface *sif
 			cpumask_set_cpu(cpu, data->cpus);
 			ret = sysfs_create_link(&cpu_dev->kobj, &data->kobj,
 					"cpufreq");
-			write_unlock_irqrestore(&cpufreq_driver_lock, flags);
+			spin_unlock_irqrestore(&cpufreq_driver_lock, flags);
 			unlock_policy_rwsem_write(cpu);
 			return -EINVAL;
 		}
@@ -1078,7 +1088,7 @@ static int __cpufreq_remove_dev(struct device *dev, struct subsys_interface *sif
 				__func__, cpu_dev->id, cpu);
 	}
 
-	write_unlock_irqrestore(&cpufreq_driver_lock, flags);
+	spin_unlock_irqrestore(&cpufreq_driver_lock, flags);
 
 	pr_debug("%s: removing link, cpu: %d\n", __func__, cpu);
 	cpufreq_cpu_put(data);
@@ -1102,14 +1112,14 @@ static int __cpufreq_remove_dev(struct device *dev, struct subsys_interface *sif
 		pr_debug("wait complete\n");
 
 		lock_policy_rwsem_write(cpu);
-		if (cpufreq_driver->exit)
-			cpufreq_driver->exit(data);
+		if (driver->exit)
+			driver->exit(data);
 		unlock_policy_rwsem_write(cpu);
 
 		free_cpumask_var(data->related_cpus);
 		free_cpumask_var(data->cpus);
 		kfree(data);
-	} else if (cpufreq_driver->target) {
+	} else if (driver->target) {
 		__cpufreq_governor(data, CPUFREQ_GOV_START);
 		__cpufreq_governor(data, CPUFREQ_GOV_LIMITS);
 	}
@@ -1214,14 +1224,15 @@ static unsigned int __cpufreq_get(unsigned int cpu)
 {
 	struct cpufreq_policy *policy = per_cpu(cpufreq_cpu_data, cpu);
 	unsigned int ret_freq = 0;
+	struct cpufreq_driver *driver = rcu_dereference(cpufreq_driver);
 
-	if (!cpufreq_driver->get)
+	if (!driver->get)
 		return ret_freq;
 
-	ret_freq = cpufreq_driver->get(cpu);
+	ret_freq = driver->get(cpu);
 
 	if (ret_freq && policy->cur &&
-		!(cpufreq_driver->flags & CPUFREQ_CONST_LOOPS)) {
+		!(driver->flags & CPUFREQ_CONST_LOOPS)) {
 		/* verify no discrepancy between actual and
 					saved value exists */
 		if (unlikely(ret_freq != policy->cur)) {
@@ -1281,6 +1292,7 @@ static int cpufreq_bp_suspend(void)
 
 	int cpu = smp_processor_id();
 	struct cpufreq_policy *cpu_policy;
+	struct cpufreq_driver *driver = rcu_dereference(cpufreq_driver);
 
 	pr_debug("suspending cpu %u\n", cpu);
 
@@ -1289,8 +1301,8 @@ static int cpufreq_bp_suspend(void)
 	if (!cpu_policy)
 		return 0;
 
-	if (cpufreq_driver->suspend) {
-		ret = cpufreq_driver->suspend(cpu_policy);
+	if (driver->suspend) {
+		ret = driver->suspend(cpu_policy);
 		if (ret)
 			printk(KERN_ERR "cpufreq: suspend failed in ->suspend "
 					"step on CPU %u\n", cpu_policy->cpu);
@@ -1319,6 +1331,7 @@ static void cpufreq_bp_resume(void)
 
 	int cpu = smp_processor_id();
 	struct cpufreq_policy *cpu_policy;
+	struct cpufreq_driver *driver = rcu_dereference(cpufreq_driver);
 
 	pr_debug("resuming cpu %u\n", cpu);
 
@@ -1327,8 +1340,8 @@ static void cpufreq_bp_resume(void)
 	if (!cpu_policy)
 		return;
 
-	if (cpufreq_driver->resume) {
-		ret = cpufreq_driver->resume(cpu_policy);
+	if (driver->resume) {
+		ret = driver->resume(cpu_policy);
 		if (ret) {
 			printk(KERN_ERR "cpufreq: resume failed in ->resume "
 					"step on CPU %u\n", cpu_policy->cpu);
@@ -1355,8 +1368,9 @@ static struct syscore_ops cpufreq_syscore_ops = {
  */
 const char *cpufreq_get_current_driver(void)
 {
-	if (cpufreq_driver)
-		return cpufreq_driver->name;
+	struct cpufreq_driver *driver = rcu_dereference(cpufreq_driver);
+	if (driver)
+		return driver->name;
 
 	return NULL;
 }
@@ -1452,6 +1466,7 @@ int __cpufreq_driver_target(struct cpufreq_policy *policy,
 {
 	int retval = -EINVAL;
 	unsigned int old_target_freq = target_freq;
+	struct cpufreq_driver *driver = rcu_dereference(cpufreq_driver);
 
 	if (cpufreq_disabled())
 		return -ENODEV;
@@ -1468,8 +1483,8 @@ int __cpufreq_driver_target(struct cpufreq_policy *policy,
 	if (target_freq == policy->cur)
 		return 0;
 
-	if (cpu_online(policy->cpu) && cpufreq_driver->target)
-		retval = cpufreq_driver->target(policy, target_freq, relation);
+	if (cpu_online(policy->cpu) && driver->target)
+		retval = driver->target(policy, target_freq, relation);
 
 	return retval;
 }
@@ -1502,18 +1517,19 @@ EXPORT_SYMBOL_GPL(cpufreq_driver_target);
 int __cpufreq_driver_getavg(struct cpufreq_policy *policy, unsigned int cpu)
 {
 	int ret = 0;
+	struct cpufreq_driver *driver = rcu_dereference(cpufreq_driver);
 
 	if (cpufreq_disabled())
 		return ret;
 
-	if (!(cpu_online(cpu) && cpufreq_driver->getavg))
+	if (!(cpu_online(cpu) && driver->getavg))
 		return 0;
 
 	policy = cpufreq_cpu_get(policy->cpu);
 	if (!policy)
 		return -EINVAL;
 
-	ret = cpufreq_driver->getavg(policy, cpu);
+	ret = driver->getavg(policy, cpu);
 
 	cpufreq_cpu_put(policy);
 	return ret;
@@ -1667,6 +1683,7 @@ static int __cpufreq_set_policy(struct cpufreq_policy *data,
 				struct cpufreq_policy *policy)
 {
 	int ret = 0;
+	struct cpufreq_driver *driver = rcu_dereference(cpufreq_driver);
 
 	pr_debug("setting new policy for CPU %u: %u - %u kHz\n", policy->cpu,
 		policy->min, policy->max);
@@ -1680,7 +1697,7 @@ static int __cpufreq_set_policy(struct cpufreq_policy *data,
 	}
 
 	/* verify the cpu speed can be set within this limit */
-	ret = cpufreq_driver->verify(policy);
+	ret = driver->verify(policy);
 	if (ret)
 		goto error_out;
 
@@ -1694,7 +1711,7 @@ static int __cpufreq_set_policy(struct cpufreq_policy *data,
 
 	/* verify the cpu speed can be set within this limit,
 	   which might be different to the first one */
-	ret = cpufreq_driver->verify(policy);
+	ret = driver->verify(policy);
 	if (ret)
 		goto error_out;
 
@@ -1708,10 +1725,10 @@ static int __cpufreq_set_policy(struct cpufreq_policy *data,
 	pr_debug("new min and max freqs are %u - %u kHz\n",
 					data->min, data->max);
 
-	if (cpufreq_driver->setpolicy) {
+	if (driver->setpolicy) {
 		data->policy = policy->policy;
 		pr_debug("setting range\n");
-		ret = cpufreq_driver->setpolicy(policy);
+		ret = driver->setpolicy(policy);
 	} else {
 		if (policy->governor != data->governor) {
 			/* save old, working values */
@@ -1759,6 +1776,7 @@ int cpufreq_update_policy(unsigned int cpu)
 	struct cpufreq_policy *data = cpufreq_cpu_get(cpu);
 	struct cpufreq_policy policy;
 	int ret;
+	struct cpufreq_driver *driver = rcu_dereference(cpufreq_driver);
 
 	if (!data) {
 		ret = -ENODEV;
@@ -1779,8 +1797,8 @@ int cpufreq_update_policy(unsigned int cpu)
 
 	/* BIOS might change freq behind our back
 	  -> ask driver for current freq and notify governors about a change */
-	if (cpufreq_driver->get) {
-		policy.cur = cpufreq_driver->get(cpu);
+	if (driver->get) {
+		policy.cur = driver->get(cpu);
 		if (!data->cur) {
 			pr_debug("Driver did not initialize current freq");
 			data->cur = policy.cur;
@@ -1866,19 +1884,19 @@ int cpufreq_register_driver(struct cpufreq_driver *driver_data)
 	if (driver_data->setpolicy)
 		driver_data->flags |= CPUFREQ_CONST_LOOPS;
 
-	write_lock_irqsave(&cpufreq_driver_lock, flags);
-	if (cpufreq_driver) {
-		write_unlock_irqrestore(&cpufreq_driver_lock, flags);
+	spin_lock_irqsave(&cpufreq_driver_lock, flags);
+	if (rcu_dereference(cpufreq_driver)) {
+		spin_unlock_irqrestore(&cpufreq_driver_lock, flags);
 		return -EBUSY;
 	}
-	cpufreq_driver = driver_data;
-	write_unlock_irqrestore(&cpufreq_driver_lock, flags);
+	rcu_assign_pointer(cpufreq_driver, driver_data);
+	spin_unlock_irqrestore(&cpufreq_driver_lock, flags);
 
 	ret = subsys_interface_register(&cpufreq_interface);
 	if (ret)
 		goto err_null_driver;
 
-	if (!(cpufreq_driver->flags & CPUFREQ_STICKY)) {
+	if (!(rcu_dereference(cpufreq_driver)->flags & CPUFREQ_STICKY)) {
 		int i;
 		ret = -ENODEV;
 
@@ -1904,9 +1922,9 @@ int cpufreq_register_driver(struct cpufreq_driver *driver_data)
 err_if_unreg:
 	subsys_interface_unregister(&cpufreq_interface);
 err_null_driver:
-	write_lock_irqsave(&cpufreq_driver_lock, flags);
-	cpufreq_driver = NULL;
-	write_unlock_irqrestore(&cpufreq_driver_lock, flags);
+	spin_lock_irqsave(&cpufreq_driver_lock, flags);
+	rcu_assign_pointer(cpufreq_driver, NULL);
+	spin_unlock_irqrestore(&cpufreq_driver_lock, flags);
 	return ret;
 }
 EXPORT_SYMBOL_GPL(cpufreq_register_driver);
@@ -1923,8 +1941,9 @@ EXPORT_SYMBOL_GPL(cpufreq_register_driver);
 int cpufreq_unregister_driver(struct cpufreq_driver *driver)
 {
 	unsigned long flags;
+	struct cpufreq_driver *old_driver = rcu_dereference(cpufreq_driver);
 
-	if (!cpufreq_driver || (driver != cpufreq_driver))
+	if (!old_driver || (driver != old_driver))
 		return -EINVAL;
 
 	pr_debug("unregistering driver %s\n", driver->name);
@@ -1932,9 +1951,9 @@ int cpufreq_unregister_driver(struct cpufreq_driver *driver)
 	subsys_interface_unregister(&cpufreq_interface);
 	unregister_hotcpu_notifier(&cpufreq_cpu_notifier);
 
-	write_lock_irqsave(&cpufreq_driver_lock, flags);
-	cpufreq_driver = NULL;
-	write_unlock_irqrestore(&cpufreq_driver_lock, flags);
+	spin_lock_irqsave(&cpufreq_driver_lock, flags);
+	rcu_assign_pointer(cpufreq_driver, NULL);
+	spin_unlock_irqrestore(&cpufreq_driver_lock, flags);
 
 	return 0;
 }
-- 
1.8.0.1


^ permalink raw reply related	[flat|nested] 55+ messages in thread

* Re: [PATCH v2 linux-next 1/2] cpufreq: Convert the cpufreq_driver_lock to a rwlock
  2013-02-06  2:04               ` [PATCH v2 linux-next 1/2] cpufreq: Convert the cpufreq_driver_lock to a rwlock Nathan Zimmer
@ 2013-02-06  2:47                 ` Viresh Kumar
  0 siblings, 0 replies; 55+ messages in thread
From: Viresh Kumar @ 2013-02-06  2:47 UTC (permalink / raw)
  To: Nathan Zimmer; +Cc: rjw, linux-kernel, linux-pm, cpufreq

On 6 February 2013 07:34, Nathan Zimmer <nzimmer@sgi.com> wrote:
> This eliminates the contention I am seeing in __cpufreq_cpu_get.
> It also nicely stages the lock to be replaced by the rcu.
>
> Cc: Viresh Kumar <viresh.kumar@linaro.org>
> Cc: "Rafael J. Wysocki" <rjw@sisk.pl>
> Signed-off-by: Nathan Zimmer <nzimmer@sgi.com>
> ---
>  drivers/cpufreq/cpufreq.c | 42 +++++++++++++++++++++---------------------
>  1 file changed, 21 insertions(+), 21 deletions(-)

Acked-by: Viresh Kumar <viresh.kumar@linaro.org>

^ permalink raw reply	[flat|nested] 55+ messages in thread

* Re: [PATCH v2 linux-next 2/2] cpufreq: Convert the cpufreq_driver_lock to use the rcu
  2013-02-06  2:04               ` [PATCH v2 linux-next 2/2] cpufreq: Convert the cpufreq_driver_lock to use the rcu Nathan Zimmer
@ 2013-02-06  2:52                 ` Viresh Kumar
  2013-02-06  8:51                   ` Viresh Kumar
  2013-02-07 23:29                 ` Rafael J. Wysocki
  1 sibling, 1 reply; 55+ messages in thread
From: Viresh Kumar @ 2013-02-06  2:52 UTC (permalink / raw)
  To: Nathan Zimmer; +Cc: rjw, linux-kernel, linux-pm, cpufreq

On 6 February 2013 07:34, Nathan Zimmer <nzimmer@sgi.com> wrote:
> In general rwlocks are discouraged, so we are moving it to use the rcu instead.
>
> Cc: Viresh Kumar <viresh.kumar@linaro.org>
> Cc: "Rafael J. Wysocki" <rjw@sisk.pl>
> Signed-off-by: Nathan Zimmer <nzimmer@sgi.com>
> ---
>  drivers/cpufreq/cpufreq.c | 173 +++++++++++++++++++++++++---------------------
>  1 file changed, 96 insertions(+), 77 deletions(-)

bleeding-edge got updated again and this patch would have conflicts :)

Acked-by: Viresh Kumar <viresh.kumar@linaro.org>

^ permalink raw reply	[flat|nested] 55+ messages in thread

* Re: [PATCH v2 linux-next 2/2] cpufreq: Convert the cpufreq_driver_lock to use the rcu
  2013-02-06  2:52                 ` Viresh Kumar
@ 2013-02-06  8:51                   ` Viresh Kumar
  2013-02-06 13:00                     ` Rafael J. Wysocki
  0 siblings, 1 reply; 55+ messages in thread
From: Viresh Kumar @ 2013-02-06  8:51 UTC (permalink / raw)
  To: Rafael J. Wysocki; +Cc: Nathan Zimmer, linux-kernel, linux-pm, cpufreq

On 6 February 2013 08:22, Viresh Kumar <viresh.kumar@linaro.org> wrote:
> On 6 February 2013 07:34, Nathan Zimmer <nzimmer@sgi.com> wrote:
>> In general rwlocks are discouraged, so we are moving it to use the rcu instead.
>>
>> Cc: Viresh Kumar <viresh.kumar@linaro.org>
>> Cc: "Rafael J. Wysocki" <rjw@sisk.pl>
>> Signed-off-by: Nathan Zimmer <nzimmer@sgi.com>
>> ---
>>  drivers/cpufreq/cpufreq.c | 173 +++++++++++++++++++++++++---------------------
>>  1 file changed, 96 insertions(+), 77 deletions(-)
>
> bleeding-edge got updated again and this patch would have conflicts :)
>
> Acked-by: Viresh Kumar <viresh.kumar@linaro.org>

Rafael,

I have applied them over bleeding-edge after resolving a few conflicts.
Please pick them up from:

Branch: for-rafael
Repo: git://git.linaro.org/people/vireshk/linux.git
http://git.linaro.org/gitweb?p=people/vireshk/linux.git;a=shortlog;h=refs/heads/for-rafael

I have got my other cpufreq work too in that repo:
http://git.linaro.org/gitweb?p=people/vireshk/linux.git;a=shortlog;h=refs/heads/cpufreq-updates

^ permalink raw reply	[flat|nested] 55+ messages in thread

* Re: [PATCH v2 linux-next 2/2] cpufreq: Convert the cpufreq_driver_lock to use the rcu
  2013-02-06  8:51                   ` Viresh Kumar
@ 2013-02-06 13:00                     ` Rafael J. Wysocki
  0 siblings, 0 replies; 55+ messages in thread
From: Rafael J. Wysocki @ 2013-02-06 13:00 UTC (permalink / raw)
  To: Viresh Kumar; +Cc: Nathan Zimmer, linux-kernel, linux-pm, cpufreq

On Wednesday, February 06, 2013 02:21:11 PM Viresh Kumar wrote:
> On 6 February 2013 08:22, Viresh Kumar <viresh.kumar@linaro.org> wrote:
> > On 6 February 2013 07:34, Nathan Zimmer <nzimmer@sgi.com> wrote:
> >> In general rwlocks are discouraged, so we are moving it to use the rcu instead.
> >>
> >> Cc: Viresh Kumar <viresh.kumar@linaro.org>
> >> Cc: "Rafael J. Wysocki" <rjw@sisk.pl>
> >> Signed-off-by: Nathan Zimmer <nzimmer@sgi.com>
> >> ---
> >>  drivers/cpufreq/cpufreq.c | 173 +++++++++++++++++++++++++---------------------
> >>  1 file changed, 96 insertions(+), 77 deletions(-)
> >
> > bleeding-edge got updated again and this patch would have conflicts :)
> >
> > Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
> 
> Rafael,
> 
> I have applied them over bleeding-edge after resolving a few conflicts.
> Please pick them up from:
> 
> Branch: for-rafael
> Repo: git://git.linaro.org/people/vireshk/linux.git
> http://git.linaro.org/gitweb?p=people/vireshk/linux.git;a=shortlog;h=refs/heads/for-rafael

That's really helpful, thanks a lot!

Both applied to bleeding-edge.

Thanks,
Rafael


-- 
I speak only for myself.
Rafael J. Wysocki, Intel Open Source Technology Center.

^ permalink raw reply	[flat|nested] 55+ messages in thread

* Re: [PATCH v2 linux-next 2/2] cpufreq: Convert the cpufreq_driver_lock to use the rcu
  2013-02-06  2:04               ` [PATCH v2 linux-next 2/2] cpufreq: Convert the cpufreq_driver_lock to use the rcu Nathan Zimmer
  2013-02-06  2:52                 ` Viresh Kumar
@ 2013-02-07 23:29                 ` Rafael J. Wysocki
  2013-02-11 17:13                   ` Nathan Zimmer
  1 sibling, 1 reply; 55+ messages in thread
From: Rafael J. Wysocki @ 2013-02-07 23:29 UTC (permalink / raw)
  To: Nathan Zimmer; +Cc: viresh.kumar, linux-kernel, linux-pm, cpufreq

On Tuesday, February 05, 2013 08:04:50 PM Nathan Zimmer wrote:
> In general rwlocks are discouraged, so we are moving it to use the rcu instead.
> 
> Cc: Viresh Kumar <viresh.kumar@linaro.org>
> Cc: "Rafael J. Wysocki" <rjw@sisk.pl>
> Signed-off-by: Nathan Zimmer <nzimmer@sgi.com>
> ---
>  drivers/cpufreq/cpufreq.c | 173 +++++++++++++++++++++++++---------------------
>  1 file changed, 96 insertions(+), 77 deletions(-)
> 
> diff --git a/drivers/cpufreq/cpufreq.c b/drivers/cpufreq/cpufreq.c
> index ef25244..a04ceb9 100644
> --- a/drivers/cpufreq/cpufreq.c
> +++ b/drivers/cpufreq/cpufreq.c
> @@ -39,13 +39,13 @@
>   * level driver of CPUFreq support, and its spinlock. This lock
>   * also protects the cpufreq_cpu_data array.
>   */
> -static struct cpufreq_driver *cpufreq_driver;
> +static struct cpufreq_driver __rcu *cpufreq_driver;
>  static DEFINE_PER_CPU(struct cpufreq_policy *, cpufreq_cpu_data);
>  #ifdef CONFIG_HOTPLUG_CPU
>  /* This one keeps track of the previously set governor of a removed CPU */
>  static DEFINE_PER_CPU(char[CPUFREQ_NAME_LEN], cpufreq_cpu_governor);
>  #endif
> -static DEFINE_RWLOCK(cpufreq_driver_lock);
> +static DEFINE_SPINLOCK(cpufreq_driver_lock);
>  
>  /*
>   * cpu_policy_rwsem is a per CPU reader-writer semaphore designed to cure
> @@ -143,21 +143,20 @@ static DEFINE_MUTEX(cpufreq_governor_mutex);
>  static struct cpufreq_policy *__cpufreq_cpu_get(unsigned int cpu, bool sysfs)
>  {
>  	struct cpufreq_policy *data;
> -	unsigned long flags;
> +	struct cpufreq_driver *driver;
>  
>  	if (cpu >= nr_cpu_ids)
>  		goto err_out;
>  
>  	/* get the cpufreq driver */
> -	read_lock_irqsave(&cpufreq_driver_lock, flags);
> -
> -	if (!cpufreq_driver)
> +	rcu_read_lock();
> +	driver = rcu_dereference(cpufreq_driver);
> +	if (!driver)
>  		goto err_out_unlock;
>  
> -	if (!try_module_get(cpufreq_driver->owner))
> +	if (!try_module_get(driver->owner))
>  		goto err_out_unlock;
>  
> -
>  	/* get the CPU */
>  	data = per_cpu(cpufreq_cpu_data, cpu);
>  
> @@ -167,13 +166,13 @@ static struct cpufreq_policy *__cpufreq_cpu_get(unsigned int cpu, bool sysfs)
>  	if (!sysfs && !kobject_get(&data->kobj))
>  		goto err_out_put_module;
>  
> -	read_unlock_irqrestore(&cpufreq_driver_lock, flags);
> +	rcu_read_unlock();
>  	return data;
>  
>  err_out_put_module:
> -	module_put(cpufreq_driver->owner);
> +	module_put(driver->owner);
>  err_out_unlock:
> -	read_unlock_irqrestore(&cpufreq_driver_lock, flags);
> +	rcu_read_unlock();
>  err_out:
>  	return NULL;
>  }
> @@ -196,7 +195,7 @@ static void __cpufreq_cpu_put(struct cpufreq_policy *data, bool sysfs)
>  {
>  	if (!sysfs)
>  		kobject_put(&data->kobj);
> -	module_put(cpufreq_driver->owner);
> +	module_put(rcu_dereference(cpufreq_driver)->owner);
>  }
>  
>  void cpufreq_cpu_put(struct cpufreq_policy *data)
> @@ -267,13 +266,16 @@ static inline void adjust_jiffies(unsigned long val, struct cpufreq_freqs *ci)
>  void cpufreq_notify_transition(struct cpufreq_freqs *freqs, unsigned int state)
>  {
>  	struct cpufreq_policy *policy;
> +	struct cpufreq_driver *driver;
>  
>  	BUG_ON(irqs_disabled());
>  
>  	if (cpufreq_disabled())
>  		return;
>  
> -	freqs->flags = cpufreq_driver->flags;
> +	driver = rcu_dereference(cpufreq_driver);

Pardon me for not asking that question earlier, but what sense does it make
to use rcu_dereference() outside of an rcu_read_lock()/rcu_read_unlock()
scope?  Here and in the other places?

Rafael


-- 
I speak only for myself.
Rafael J. Wysocki, Intel Open Source Technology Center.

^ permalink raw reply	[flat|nested] 55+ messages in thread

* RE: [PATCH v2 linux-next 2/2] cpufreq: Convert the cpufreq_driver_lock to use the rcu
  2013-02-07 23:29                 ` Rafael J. Wysocki
@ 2013-02-11 17:13                   ` Nathan Zimmer
  2013-02-11 19:36                     ` Rafael J. Wysocki
  0 siblings, 1 reply; 55+ messages in thread
From: Nathan Zimmer @ 2013-02-11 17:13 UTC (permalink / raw)
  To: Rafael J. Wysocki; +Cc: viresh.kumar, linux-kernel, linux-pm, cpufreq

There are some spots that I need to give a much deeper review, cpufreq_register_driver for example.

But I believe 
> @@ -196,7 +195,7 @@ static void __cpufreq_cpu_put(struct cpufreq_policy *data, bool sysfs)
>  {
>       if (!sysfs)
>               kobject_put(&data->kobj);
> -     module_put(cpufreq_driver->owner);
> +     module_put(rcu_dereference(cpufreq_driver)->owner);
>  }
would be ok.  In the documentation whatisRCU.txt they give a very similar example.




^ permalink raw reply	[flat|nested] 55+ messages in thread

* Re: [PATCH v2 linux-next 2/2] cpufreq: Convert the cpufreq_driver_lock to use the rcu
  2013-02-11 17:13                   ` Nathan Zimmer
@ 2013-02-11 19:36                     ` Rafael J. Wysocki
  2013-02-12  4:03                       ` Nathan Zimmer
  2013-02-12 15:59                       ` Paul E. McKenney
  0 siblings, 2 replies; 55+ messages in thread
From: Rafael J. Wysocki @ 2013-02-11 19:36 UTC (permalink / raw)
  To: Nathan Zimmer; +Cc: viresh.kumar, linux-kernel, linux-pm, cpufreq

On Monday, February 11, 2013 05:13:30 PM Nathan Zimmer wrote:
> There are some spots that I need to give a much deeper review, cpufreq_register_driver for example.
> 
> But I believe 
> > @@ -196,7 +195,7 @@ static void __cpufreq_cpu_put(struct cpufreq_policy *data, bool sysfs)
> >  {
> >       if (!sysfs)
> >               kobject_put(&data->kobj);
> > -     module_put(cpufreq_driver->owner);
> > +     module_put(rcu_dereference(cpufreq_driver)->owner);
> >  }
> would be ok.  In the documentation whatisRCU.txt they give a very similar example.

Well, the very same document states the following:

        Note that the value returned by rcu_dereference() is valid
        only within the enclosing RCU read-side critical section.

Thanks,
Rafael


-- 
I speak only for myself.
Rafael J. Wysocki, Intel Open Source Technology Center.

^ permalink raw reply	[flat|nested] 55+ messages in thread
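
The point raised above about __cpufreq_cpu_put() can be illustrated with a
small userspace sketch.  It is not the cpufreq code and not the kernel RCU
API: every stub_* name below is a hypothetical stand-in that only counts
nesting and references.  What it tries to show is one common way to use an
object after the read-side critical section ends: pin it with a reference
while still inside the section, and later release it through the pinned
pointer rather than by dereferencing the global again.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical userspace stand-ins.  They only count nesting and
 * references so the pattern can be exercised; they provide no real
 * synchronization and are not the kernel primitives.
 */
static int stub_rcu_depth;

void stub_rcu_read_lock(void)
{
	stub_rcu_depth++;
}

void stub_rcu_read_unlock(void)
{
	assert(stub_rcu_depth > 0);
	stub_rcu_depth--;
}

struct stub_driver {
	const char *name;
	int refcount;	/* stands in for the module reference count */
};

/* Stands in for the __rcu-annotated global driver pointer. */
static struct stub_driver *stub_current_driver;

void stub_publish_driver(struct stub_driver *drv)
{
	stub_current_driver = drv;
}

/* Stand-in for rcu_dereference(): asserts it runs inside a read-side section. */
struct stub_driver *stub_rcu_dereference_driver(void)
{
	assert(stub_rcu_depth > 0);
	return stub_current_driver;
}

/*
 * Look up the driver and pin it before leaving the critical section.
 * Returns NULL when no driver is published.
 */
struct stub_driver *stub_driver_get(void)
{
	struct stub_driver *drv;

	stub_rcu_read_lock();
	drv = stub_rcu_dereference_driver();
	if (drv)
		drv->refcount++;	/* analogous to a successful try_module_get() */
	stub_rcu_read_unlock();
	return drv;
}

/* Release through the pinned pointer, not through a fresh read of the global. */
void stub_driver_put(struct stub_driver *drv)
{
	drv->refcount--;	/* analogous to module_put() */
}
```

In this sketch the put side still works after the global pointer has been
cleared, because it never reads the global.  Whether the real code should
carry the pointer this way is a design decision for the patch author.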

* RE: [PATCH v2 linux-next 2/2] cpufreq: Convert the cpufreq_driver_lock to use the rcu
  2013-02-11 19:36                     ` Rafael J. Wysocki
@ 2013-02-12  4:03                       ` Nathan Zimmer
  2013-02-12 15:59                       ` Paul E. McKenney
  1 sibling, 0 replies; 55+ messages in thread
From: Nathan Zimmer @ 2013-02-12  4:03 UTC (permalink / raw)
  To: Rafael J. Wysocki; +Cc: viresh.kumar, linux-kernel, linux-pm, cpufreq

Argh, you're right.  I completely misread that section.
It'll take me a few days to respin and retest properly.

Thanks, 
Nate


^ permalink raw reply	[flat|nested] 55+ messages in thread

* Re: [PATCH v2 linux-next 2/2] cpufreq: Convert the cpufreq_driver_lock to use the rcu
  2013-02-11 19:36                     ` Rafael J. Wysocki
  2013-02-12  4:03                       ` Nathan Zimmer
@ 2013-02-12 15:59                       ` Paul E. McKenney
  2013-02-13 13:20                         ` Rafael J. Wysocki
  1 sibling, 1 reply; 55+ messages in thread
From: Paul E. McKenney @ 2013-02-12 15:59 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: Nathan Zimmer, viresh.kumar, linux-kernel, linux-pm, cpufreq

On Mon, Feb 11, 2013 at 08:36:17PM +0100, Rafael J. Wysocki wrote:
> On Monday, February 11, 2013 05:13:30 PM Nathan Zimmer wrote:
> > There are some spots that I need to give a much deeper review, cpufreq_register_driver for example.
> > 
> > But I believe 
> > > @@ -196,7 +195,7 @@ static void __cpufreq_cpu_put(struct cpufreq_policy *data, bool sysfs)
> > >  {
> > >       if (!sysfs)
> > >               kobject_put(&data->kobj);
> > > -     module_put(cpufreq_driver->owner);
> > > +     module_put(rcu_dereference(cpufreq_driver)->owner);
> > >  }
> > would be ok.  In the documentation whatisRCU.txt they give a very similar example.
> 
> Well, the very same document states the following:
> 
>         Note that the value returned by rcu_dereference() is valid
>         only within the enclosing RCU read-side critical section.

Ah, there is a code sample in that document showing a bug.  I added
comments to the code sample making it clear even to someone skimming
the document that the code is buggy.

							Thanx, Paul
------------------------------------------------------------------------

rcu: Make bugginess of code sample more evident

One of the code samples in whatisRCU.txt shows a bug, but someone scanning
the document quickly might mistake it for a valid use of RCU.  Add some
screaming comments to help keep speed-readers on track.

Reported-by: Nathan Zimmer <nzimmer@sgi.com>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>

diff --git a/Documentation/RCU/whatisRCU.txt b/Documentation/RCU/whatisRCU.txt
index 0cc7820..10df0b8 100644
--- a/Documentation/RCU/whatisRCU.txt
+++ b/Documentation/RCU/whatisRCU.txt
@@ -265,9 +265,9 @@ rcu_dereference()
 		rcu_read_lock();
 		p = rcu_dereference(head.next);
 		rcu_read_unlock();
-		x = p->address;
+		x = p->address;	/* BUG!!! */
 		rcu_read_lock();
-		y = p->data;
+		y = p->data;	/* BUG!!! */
 		rcu_read_unlock();
 
 	Holding a reference from one RCU read-side critical section


^ permalink raw reply related	[flat|nested] 55+ messages in thread
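
For contrast with the buggy sample flagged in the patch above, here is a
minimal userspace sketch of a non-buggy shape: both fields are read inside
the same read-side critical section in which the pointer was fetched, and
only plain copies leave it.  The demo_* functions are hypothetical stand-ins
that track nesting only; they are not the kernel primitives and do no real
synchronization.

```c
#include <assert.h>
#include <stddef.h>

struct demo_node {
	int address;
	int data;
};

/* Hypothetical stand-ins: nesting counter plus the protected pointer. */
static int demo_rcu_depth;
static struct demo_node *demo_head_next;

void demo_rcu_read_lock(void)
{
	demo_rcu_depth++;
}

void demo_rcu_read_unlock(void)
{
	assert(demo_rcu_depth > 0);
	demo_rcu_depth--;
}

void demo_set_head_next(struct demo_node *n)
{
	demo_head_next = n;
}

/* Stand-in for rcu_dereference(head.next): asserts a section is open. */
struct demo_node *demo_rcu_dereference_next(void)
{
	assert(demo_rcu_depth > 0);
	return demo_head_next;
}

/*
 * Copy both fields out while the critical section is still open.
 * Returns 0 on success, -1 when there is no node.
 */
int demo_read_node(int *x, int *y)
{
	struct demo_node *p;
	int ret = -1;

	demo_rcu_read_lock();
	p = demo_rcu_dereference_next();
	if (p) {
		*x = p->address;	/* p is still valid here */
		*y = p->data;		/* same critical section as the fetch */
		ret = 0;
	}
	demo_rcu_read_unlock();
	/* p must not be used past this point; x and y are plain copies. */
	return ret;
}
```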

* Re: [PATCH v2 linux-next 2/2] cpufreq: Convert the cpufreq_driver_lock to use the rcu
  2013-02-12 15:59                       ` Paul E. McKenney
@ 2013-02-13 13:20                         ` Rafael J. Wysocki
  0 siblings, 0 replies; 55+ messages in thread
From: Rafael J. Wysocki @ 2013-02-13 13:20 UTC (permalink / raw)
  To: paulmck; +Cc: Nathan Zimmer, viresh.kumar, linux-kernel, linux-pm, cpufreq

On Tuesday, February 12, 2013 07:59:54 AM Paul E. McKenney wrote:
> On Mon, Feb 11, 2013 at 08:36:17PM +0100, Rafael J. Wysocki wrote:
> > On Monday, February 11, 2013 05:13:30 PM Nathan Zimmer wrote:
> > > There are some spots that I need to give a much deeper review, cpufreq_register_driver for example.
> > > 
> > > But I believe 
> > > > @@ -196,7 +195,7 @@ static void __cpufreq_cpu_put(struct cpufreq_policy *data, bool sysfs)
> > > >  {
> > > >       if (!sysfs)
> > > >               kobject_put(&data->kobj);
> > > > -     module_put(cpufreq_driver->owner);
> > > > +     module_put(rcu_dereference(cpufreq_driver)->owner);
> > > >  }
> > > would be ok.  In the documentation whatisRCU.txt they give a very similar example.
> > 
> > Well, the very same document states the following:
> > 
> >         Note that the value returned by rcu_dereference() is valid
> >         only within the enclosing RCU read-side critical section.
> 
> Ah, there is a code sample in that document showing a bug.  I added
> comments to the code sample making it clear even to someone skimming
> the document that the code is buggy.
> 
> 							Thanx, Paul
> ------------------------------------------------------------------------
> 
> rcu: Make bugginess of code sample more evident
> 
> One of the code samples in whatisRCU.txt shows a bug, but someone scanning
> the document quickly might mistake it for a valid use of RCU.  Add some
> screaming comments to help keep speed-readers on track.
> 
> Reported-by: Nathan Zimmer <nzimmer@sgi.com>
> Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>

Acked-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>

> diff --git a/Documentation/RCU/whatisRCU.txt b/Documentation/RCU/whatisRCU.txt
> index 0cc7820..10df0b8 100644
> --- a/Documentation/RCU/whatisRCU.txt
> +++ b/Documentation/RCU/whatisRCU.txt
> @@ -265,9 +265,9 @@ rcu_dereference()
>  		rcu_read_lock();
>  		p = rcu_dereference(head.next);
>  		rcu_read_unlock();
> -		x = p->address;
> +		x = p->address;	/* BUG!!! */
>  		rcu_read_lock();
> -		y = p->data;
> +		y = p->data;	/* BUG!!! */
>  		rcu_read_unlock();
>  
>  	Holding a reference from one RCU read-side critical section
> 
-- 
I speak only for myself.
Rafael J. Wysocki, Intel Open Source Technology Center.

^ permalink raw reply	[flat|nested] 55+ messages in thread

* [PATCH v3 0/2] cpufreq: cpufreq_driver_lock is hot on large systems
  2013-02-06  2:04             ` [PATCH v2 linux-next " Nathan Zimmer
  2013-02-06  2:04               ` [PATCH v2 linux-next 1/2] cpufreq: Convert the cpufreq_driver_lock to a rwlock Nathan Zimmer
  2013-02-06  2:04               ` [PATCH v2 linux-next 2/2] cpufreq: Convert the cpufreq_driver_lock to use the rcu Nathan Zimmer
@ 2013-02-20 23:56               ` Nathan Zimmer
  2013-02-20 23:56                 ` [PATCH v3 1/2] cpufreq: Convert the cpufreq_driver_lock to a rwlock Nathan Zimmer
  2013-02-20 23:56                 ` [PATCH v3 2/2] cpufreq: Convert the cpufreq_driver_lock to use the rcu Nathan Zimmer
  2 siblings, 2 replies; 55+ messages in thread
From: Nathan Zimmer @ 2013-02-20 23:56 UTC (permalink / raw)
  To: viresh.kumar, rjw; +Cc: cpufreq, linux-pm, linux-kernel, Nathan Zimmer

I am noticing the cpufreq_driver_lock is quite hot.
On an idle 512 system perf shows me most of the system time is spent on this
lock.  This is quite significant as top shows 5% of time in system time.
My solution was to first convert the lock to a rwlock and then to the rcu.

v2: Rebase

v3: Read the RCU documentation instead of skimming it.  Also, I based this on
git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm.git pm+acpi-3.9-rc1
I assumed that was what you would prefer, Rafael.

Nathan Zimmer (2):
  cpufreq: Convert the cpufreq_driver_lock to a rwlock
  cpufreq: Convert the cpufreq_driver_lock to use the rcu

 drivers/cpufreq/cpufreq.c | 286 ++++++++++++++++++++++++++++++++++------------
 1 file changed, 211 insertions(+), 75 deletions(-)

-- 
1.8.1.2


^ permalink raw reply	[flat|nested] 55+ messages in thread

* [PATCH v3 1/2] cpufreq: Convert the cpufreq_driver_lock to a rwlock
  2013-02-20 23:56               ` [PATCH v3 0/2] cpufreq: cpufreq_driver_lock is hot on large systems Nathan Zimmer
@ 2013-02-20 23:56                 ` Nathan Zimmer
  2013-02-20 23:56                 ` [PATCH v3 2/2] cpufreq: Convert the cpufreq_driver_lock to use the rcu Nathan Zimmer
  1 sibling, 0 replies; 55+ messages in thread
From: Nathan Zimmer @ 2013-02-20 23:56 UTC (permalink / raw)
  To: viresh.kumar, rjw; +Cc: cpufreq, linux-pm, linux-kernel, Nathan Zimmer

This eliminates the contention I am seeing in __cpufreq_cpu_get.
It also nicely stages the lock to be replaced by the rcu.

Cc: Viresh Kumar <viresh.kumar@linaro.org>
Cc: "Rafael J. Wysocki" <rjw@sisk.pl>
Signed-off-by: Nathan Zimmer <nzimmer@sgi.com>
---
 drivers/cpufreq/cpufreq.c | 52 +++++++++++++++++++++++------------------------
 1 file changed, 26 insertions(+), 26 deletions(-)

diff --git a/drivers/cpufreq/cpufreq.c b/drivers/cpufreq/cpufreq.c
index b02824d..c5996fe 100644
--- a/drivers/cpufreq/cpufreq.c
+++ b/drivers/cpufreq/cpufreq.c
@@ -45,7 +45,7 @@ static DEFINE_PER_CPU(struct cpufreq_policy *, cpufreq_cpu_data);
 /* This one keeps track of the previously set governor of a removed CPU */
 static DEFINE_PER_CPU(char[CPUFREQ_NAME_LEN], cpufreq_cpu_governor);
 #endif
-static DEFINE_SPINLOCK(cpufreq_driver_lock);
+static DEFINE_RWLOCK(cpufreq_driver_lock);
 
 /*
  * cpu_policy_rwsem is a per CPU reader-writer semaphore designed to cure
@@ -137,7 +137,7 @@ static struct cpufreq_policy *__cpufreq_cpu_get(unsigned int cpu, bool sysfs)
 		goto err_out;
 
 	/* get the cpufreq driver */
-	spin_lock_irqsave(&cpufreq_driver_lock, flags);
+	read_lock_irqsave(&cpufreq_driver_lock, flags);
 
 	if (!cpufreq_driver)
 		goto err_out_unlock;
@@ -155,13 +155,13 @@ static struct cpufreq_policy *__cpufreq_cpu_get(unsigned int cpu, bool sysfs)
 	if (!sysfs && !kobject_get(&data->kobj))
 		goto err_out_put_module;
 
-	spin_unlock_irqrestore(&cpufreq_driver_lock, flags);
+	read_unlock_irqrestore(&cpufreq_driver_lock, flags);
 	return data;
 
 err_out_put_module:
 	module_put(cpufreq_driver->owner);
 err_out_unlock:
-	spin_unlock_irqrestore(&cpufreq_driver_lock, flags);
+	read_unlock_irqrestore(&cpufreq_driver_lock, flags);
 err_out:
 	return NULL;
 }
@@ -266,9 +266,9 @@ void cpufreq_notify_transition(struct cpufreq_freqs *freqs, unsigned int state)
 	pr_debug("notification %u of frequency transition to %u kHz\n",
 		state, freqs->new);
 
-	spin_lock_irqsave(&cpufreq_driver_lock, flags);
+	read_lock_irqsave(&cpufreq_driver_lock, flags);
 	policy = per_cpu(cpufreq_cpu_data, freqs->cpu);
-	spin_unlock_irqrestore(&cpufreq_driver_lock, flags);
+	read_unlock_irqrestore(&cpufreq_driver_lock, flags);
 
 	switch (state) {
 
@@ -765,12 +765,12 @@ static int cpufreq_add_dev_interface(unsigned int cpu,
 			goto err_out_kobj_put;
 	}
 
-	spin_lock_irqsave(&cpufreq_driver_lock, flags);
+	write_lock_irqsave(&cpufreq_driver_lock, flags);
 	for_each_cpu(j, policy->cpus) {
 		per_cpu(cpufreq_cpu_data, j) = policy;
 		per_cpu(cpufreq_policy_cpu, j) = policy->cpu;
 	}
-	spin_unlock_irqrestore(&cpufreq_driver_lock, flags);
+	write_unlock_irqrestore(&cpufreq_driver_lock, flags);
 
 	ret = cpufreq_add_dev_symlink(cpu, policy);
 	if (ret)
@@ -813,12 +813,12 @@ static int cpufreq_add_policy_cpu(unsigned int cpu, unsigned int sibling,
 
 	lock_policy_rwsem_write(sibling);
 
-	spin_lock_irqsave(&cpufreq_driver_lock, flags);
+	write_lock_irqsave(&cpufreq_driver_lock, flags);
 
 	cpumask_set_cpu(cpu, policy->cpus);
 	per_cpu(cpufreq_policy_cpu, cpu) = policy->cpu;
 	per_cpu(cpufreq_cpu_data, cpu) = policy;
-	spin_unlock_irqrestore(&cpufreq_driver_lock, flags);
+	write_unlock_irqrestore(&cpufreq_driver_lock, flags);
 
 	unlock_policy_rwsem_write(sibling);
 
@@ -871,15 +871,15 @@ static int cpufreq_add_dev(struct device *dev, struct subsys_interface *sif)
 
 #ifdef CONFIG_HOTPLUG_CPU
 	/* Check if this cpu was hot-unplugged earlier and has siblings */
-	spin_lock_irqsave(&cpufreq_driver_lock, flags);
+	read_lock_irqsave(&cpufreq_driver_lock, flags);
 	for_each_online_cpu(sibling) {
 		struct cpufreq_policy *cp = per_cpu(cpufreq_cpu_data, sibling);
 		if (cp && cpumask_test_cpu(cpu, cp->related_cpus)) {
-			spin_unlock_irqrestore(&cpufreq_driver_lock, flags);
+			read_unlock_irqrestore(&cpufreq_driver_lock, flags);
 			return cpufreq_add_policy_cpu(cpu, sibling, dev);
 		}
 	}
-	spin_unlock_irqrestore(&cpufreq_driver_lock, flags);
+	read_unlock_irqrestore(&cpufreq_driver_lock, flags);
 #endif
 #endif
 
@@ -952,10 +952,10 @@ static int cpufreq_add_dev(struct device *dev, struct subsys_interface *sif)
 	return 0;
 
 err_out_unregister:
-	spin_lock_irqsave(&cpufreq_driver_lock, flags);
+	write_lock_irqsave(&cpufreq_driver_lock, flags);
 	for_each_cpu(j, policy->cpus)
 		per_cpu(cpufreq_cpu_data, j) = NULL;
-	spin_unlock_irqrestore(&cpufreq_driver_lock, flags);
+	write_unlock_irqrestore(&cpufreq_driver_lock, flags);
 
 	kobject_put(&policy->kobj);
 	wait_for_completion(&policy->kobj_unregister);
@@ -1008,12 +1008,12 @@ static int __cpufreq_remove_dev(struct device *dev, struct subsys_interface *sif
 
 	pr_debug("%s: unregistering CPU %u\n", __func__, cpu);
 
-	spin_lock_irqsave(&cpufreq_driver_lock, flags);
+	write_lock_irqsave(&cpufreq_driver_lock, flags);
 
 	data = per_cpu(cpufreq_cpu_data, cpu);
 	per_cpu(cpufreq_cpu_data, cpu) = NULL;
 
-	spin_unlock_irqrestore(&cpufreq_driver_lock, flags);
+	write_unlock_irqrestore(&cpufreq_driver_lock, flags);
 
 	if (!data) {
 		pr_debug("%s: No cpu_data found\n", __func__);
@@ -1047,9 +1047,9 @@ static int __cpufreq_remove_dev(struct device *dev, struct subsys_interface *sif
 			WARN_ON(lock_policy_rwsem_write(cpu));
 			cpumask_set_cpu(cpu, data->cpus);
 
-			spin_lock_irqsave(&cpufreq_driver_lock, flags);
+			write_lock_irqsave(&cpufreq_driver_lock, flags);
 			per_cpu(cpufreq_cpu_data, cpu) = data;
-			spin_unlock_irqrestore(&cpufreq_driver_lock, flags);
+			write_unlock_irqrestore(&cpufreq_driver_lock, flags);
 
 			unlock_policy_rwsem_write(cpu);
 
@@ -1848,13 +1848,13 @@ int cpufreq_register_driver(struct cpufreq_driver *driver_data)
 	if (driver_data->setpolicy)
 		driver_data->flags |= CPUFREQ_CONST_LOOPS;
 
-	spin_lock_irqsave(&cpufreq_driver_lock, flags);
+	write_lock_irqsave(&cpufreq_driver_lock, flags);
 	if (cpufreq_driver) {
-		spin_unlock_irqrestore(&cpufreq_driver_lock, flags);
+		write_unlock_irqrestore(&cpufreq_driver_lock, flags);
 		return -EBUSY;
 	}
 	cpufreq_driver = driver_data;
-	spin_unlock_irqrestore(&cpufreq_driver_lock, flags);
+	write_unlock_irqrestore(&cpufreq_driver_lock, flags);
 
 	ret = subsys_interface_register(&cpufreq_interface);
 	if (ret)
@@ -1886,9 +1886,9 @@ int cpufreq_register_driver(struct cpufreq_driver *driver_data)
 err_if_unreg:
 	subsys_interface_unregister(&cpufreq_interface);
 err_null_driver:
-	spin_lock_irqsave(&cpufreq_driver_lock, flags);
+	write_lock_irqsave(&cpufreq_driver_lock, flags);
 	cpufreq_driver = NULL;
-	spin_unlock_irqrestore(&cpufreq_driver_lock, flags);
+	write_unlock_irqrestore(&cpufreq_driver_lock, flags);
 	return ret;
 }
 EXPORT_SYMBOL_GPL(cpufreq_register_driver);
@@ -1914,9 +1914,9 @@ int cpufreq_unregister_driver(struct cpufreq_driver *driver)
 	subsys_interface_unregister(&cpufreq_interface);
 	unregister_hotcpu_notifier(&cpufreq_cpu_notifier);
 
-	spin_lock_irqsave(&cpufreq_driver_lock, flags);
+	write_lock_irqsave(&cpufreq_driver_lock, flags);
 	cpufreq_driver = NULL;
-	spin_unlock_irqrestore(&cpufreq_driver_lock, flags);
+	write_unlock_irqrestore(&cpufreq_driver_lock, flags);
 
 	return 0;
 }
-- 
1.8.1.2



* [PATCH v3 2/2] cpufreq: Convert the cpufreq_driver_lock to use the rcu
  2013-02-20 23:56               ` [PATCH v3 0/2] cpufreq: cpufreq_driver_lock is hot on large systems Nathan Zimmer
  2013-02-20 23:56                 ` [PATCH v3 1/2] cpufreq: Convert the cpufreq_driver_lock to a rwlock Nathan Zimmer
@ 2013-02-20 23:56                 ` Nathan Zimmer
  2013-02-21  5:50                   ` Viresh Kumar
  1 sibling, 1 reply; 55+ messages in thread
From: Nathan Zimmer @ 2013-02-20 23:56 UTC (permalink / raw)
  To: viresh.kumar, rjw; +Cc: cpufreq, linux-pm, linux-kernel, Nathan Zimmer

In general rwlocks are discouraged, so we are moving it to use RCU instead.
This does require a bit of care since the cpufreq_driver_lock protects both
the cpufreq_driver and the cpufreq_cpu_data array.
Also, since many of the function pointers on cpufreq_driver may sleep when
called, we have to grab them under rcu_read_lock() but call them after
rcu_read_unlock().

Cc: Viresh Kumar <viresh.kumar@linaro.org>
Cc: "Rafael J. Wysocki" <rjw@sisk.pl>
Signed-off-by: Nathan Zimmer <nzimmer@sgi.com>
---
 drivers/cpufreq/cpufreq.c | 312 +++++++++++++++++++++++++++++++++-------------
 1 file changed, 224 insertions(+), 88 deletions(-)
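
The "grab under rcu_read_lock(), call after rcu_read_unlock()" pattern from
the changelog above can be sketched in plain userspace C. This is only an
illustration under stated assumptions: every demo_* name is hypothetical, the
lock/unlock helpers merely track nesting, and C11 acquire/release atomics
stand in for rcu_dereference()/rcu_assign_pointer(). It is not kernel code.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

/* Userspace stand-ins for rcu_read_lock()/rcu_read_unlock(); they only
 * track nesting so the sketch can check where callbacks run. */
static int demo_read_depth;
static void demo_rcu_read_lock(void)   { demo_read_depth++; }
static void demo_rcu_read_unlock(void) { demo_read_depth--; }

/* Hypothetical, cut-down version of struct cpufreq_driver. */
struct demo_driver {
	int (*target)(unsigned int target_freq);
};

/* Stand-in for the __rcu-annotated cpufreq_driver pointer. */
static _Atomic(struct demo_driver *) demo_driver_ptr;

/* A callback that "may sleep": it must never run in the read section. */
static int demo_target(unsigned int target_freq)
{
	assert(demo_read_depth == 0);
	return (int)(target_freq / 1000);
}

static struct demo_driver demo_drv = { .target = demo_target };

static void demo_register(struct demo_driver *drv)
{
	/* Stand-in for rcu_assign_pointer(): release-ordered publish. */
	atomic_store_explicit(&demo_driver_ptr, drv, memory_order_release);
}

static int demo_driver_target(unsigned int target_freq)
{
	struct demo_driver *drv;
	int (*target)(unsigned int target_freq) = NULL;

	demo_rcu_read_lock();
	/* Stand-in for rcu_dereference(). */
	drv = atomic_load_explicit(&demo_driver_ptr, memory_order_acquire);
	if (drv)
		target = drv->target;	/* copy the pointer only */
	demo_rcu_read_unlock();

	if (!target)
		return -1;
	return target(target_freq);	/* called outside the read section */
}
```

The NULL check on the driver pointer is an addition of the sketch. The sketch
also does not model what keeps the driver alive between the unlock and the
call.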

diff --git a/drivers/cpufreq/cpufreq.c b/drivers/cpufreq/cpufreq.c
index c5996fe..110ec02 100644
--- a/drivers/cpufreq/cpufreq.c
+++ b/drivers/cpufreq/cpufreq.c
@@ -39,13 +39,13 @@
  * level driver of CPUFreq support, and its spinlock. This lock
  * also protects the cpufreq_cpu_data array.
  */
-static struct cpufreq_driver *cpufreq_driver;
+static struct cpufreq_driver __rcu *cpufreq_driver;
 static DEFINE_PER_CPU(struct cpufreq_policy *, cpufreq_cpu_data);
 #ifdef CONFIG_HOTPLUG_CPU
 /* This one keeps track of the previously set governor of a removed CPU */
 static DEFINE_PER_CPU(char[CPUFREQ_NAME_LEN], cpufreq_cpu_governor);
 #endif
-static DEFINE_RWLOCK(cpufreq_driver_lock);
+static DEFINE_SPINLOCK(cpufreq_driver_lock);
 
 /*
  * cpu_policy_rwsem is a per CPU reader-writer semaphore designed to cure
@@ -131,18 +131,19 @@ static DEFINE_MUTEX(cpufreq_governor_mutex);
 static struct cpufreq_policy *__cpufreq_cpu_get(unsigned int cpu, bool sysfs)
 {
 	struct cpufreq_policy *data;
-	unsigned long flags;
+	struct cpufreq_driver *driver;
 
 	if (cpu >= nr_cpu_ids)
 		goto err_out;
 
 	/* get the cpufreq driver */
-	read_lock_irqsave(&cpufreq_driver_lock, flags);
+	rcu_read_lock();
+	driver = rcu_dereference(cpufreq_driver);
 
-	if (!cpufreq_driver)
+	if (!driver)
 		goto err_out_unlock;
 
-	if (!try_module_get(cpufreq_driver->owner))
+	if (!try_module_get(driver->owner))
 		goto err_out_unlock;
 
 
@@ -155,13 +156,13 @@ static struct cpufreq_policy *__cpufreq_cpu_get(unsigned int cpu, bool sysfs)
 	if (!sysfs && !kobject_get(&data->kobj))
 		goto err_out_put_module;
 
-	read_unlock_irqrestore(&cpufreq_driver_lock, flags);
+	rcu_read_unlock();
 	return data;
 
 err_out_put_module:
-	module_put(cpufreq_driver->owner);
+	module_put(driver->owner);
 err_out_unlock:
-	read_unlock_irqrestore(&cpufreq_driver_lock, flags);
+	rcu_read_unlock();
 err_out:
 	return NULL;
 }
@@ -184,7 +185,9 @@ static void __cpufreq_cpu_put(struct cpufreq_policy *data, bool sysfs)
 {
 	if (!sysfs)
 		kobject_put(&data->kobj);
-	module_put(cpufreq_driver->owner);
+	rcu_read_lock();
+	module_put(rcu_dereference(cpufreq_driver)->owner);
+	rcu_read_unlock();
 }
 
 void cpufreq_cpu_put(struct cpufreq_policy *data)
@@ -255,20 +258,21 @@ static inline void adjust_jiffies(unsigned long val, struct cpufreq_freqs *ci)
 void cpufreq_notify_transition(struct cpufreq_freqs *freqs, unsigned int state)
 {
 	struct cpufreq_policy *policy;
-	unsigned long flags;
+	u8 flags;
 
 	BUG_ON(irqs_disabled());
 
 	if (cpufreq_disabled())
 		return;
 
-	freqs->flags = cpufreq_driver->flags;
 	pr_debug("notification %u of frequency transition to %u kHz\n",
 		state, freqs->new);
 
-	read_lock_irqsave(&cpufreq_driver_lock, flags);
+	rcu_read_lock();
+	flags = rcu_dereference(cpufreq_driver)->flags;
 	policy = per_cpu(cpufreq_cpu_data, freqs->cpu);
-	read_unlock_irqrestore(&cpufreq_driver_lock, flags);
+	rcu_read_unlock();
+	freqs->flags = flags;
 
 	switch (state) {
 
@@ -277,7 +281,7 @@ void cpufreq_notify_transition(struct cpufreq_freqs *freqs, unsigned int state)
 		 * which is not equal to what the cpufreq core thinks is
 		 * "old frequency".
 		 */
-		if (!(cpufreq_driver->flags & CPUFREQ_CONST_LOOPS)) {
+		if (!(flags & CPUFREQ_CONST_LOOPS)) {
 			if ((policy) && (policy->cpu == freqs->cpu) &&
 			    (policy->cur) && (policy->cur != freqs->old)) {
 				pr_debug("Warning: CPU frequency is"
@@ -329,11 +333,23 @@ static int cpufreq_parse_governor(char *str_governor, unsigned int *policy,
 				struct cpufreq_governor **governor)
 {
 	int err = -EINVAL;
-
-	if (!cpufreq_driver)
+	struct cpufreq_driver *driver;
+	int (*setpolicy)(struct cpufreq_policy *policy);
+	int (*target)(struct cpufreq_policy *policy,
+		      unsigned int target_freq,
+		      unsigned int relation);
+
+	rcu_read_lock();
+	driver = rcu_dereference(cpufreq_driver);
+	if (!driver) {
+		rcu_read_unlock();
 		goto out;
+	}
+	setpolicy = driver->setpolicy;
+	target = driver->target;
+	rcu_read_unlock();
 
-	if (cpufreq_driver->setpolicy) {
+	if (setpolicy) {
 		if (!strnicmp(str_governor, "performance", CPUFREQ_NAME_LEN)) {
 			*policy = CPUFREQ_POLICY_PERFORMANCE;
 			err = 0;
@@ -342,7 +358,7 @@ static int cpufreq_parse_governor(char *str_governor, unsigned int *policy,
 			*policy = CPUFREQ_POLICY_POWERSAVE;
 			err = 0;
 		}
-	} else if (cpufreq_driver->target) {
+	} else if (target) {
 		struct cpufreq_governor *t;
 
 		mutex_lock(&cpufreq_governor_mutex);
@@ -493,7 +509,11 @@ static ssize_t store_scaling_governor(struct cpufreq_policy *policy,
  */
 static ssize_t show_scaling_driver(struct cpufreq_policy *policy, char *buf)
 {
-	return scnprintf(buf, CPUFREQ_NAME_PLEN, "%s\n", cpufreq_driver->name);
+	char *name;
+	rcu_read_lock();
+	name = rcu_dereference(cpufreq_driver)->name;
+	rcu_read_unlock();
+	return scnprintf(buf, CPUFREQ_NAME_PLEN, "%s\n", name);
 }
 
 /**
@@ -505,10 +525,13 @@ static ssize_t show_scaling_available_governors(struct cpufreq_policy *policy,
 	ssize_t i = 0;
 	struct cpufreq_governor *t;
 
-	if (!cpufreq_driver->target) {
+	rcu_read_lock();
+	if (!rcu_dereference(cpufreq_driver)->target) {
+		rcu_read_unlock();
 		i += sprintf(buf, "performance powersave");
 		goto out;
 	}
+	rcu_read_unlock();
 
 	list_for_each_entry(t, &cpufreq_governor_list, governor_list) {
 		if (i >= (ssize_t) ((PAGE_SIZE / sizeof(char))
@@ -587,8 +610,14 @@ static ssize_t show_bios_limit(struct cpufreq_policy *policy, char *buf)
 {
 	unsigned int limit;
 	int ret;
-	if (cpufreq_driver->bios_limit) {
-		ret = cpufreq_driver->bios_limit(policy->cpu, &limit);
+	int (*bios_limit)(int cpu, unsigned int *limit);
+
+	rcu_read_lock();
+	bios_limit = rcu_dereference(cpufreq_driver)->bios_limit;
+	rcu_read_unlock();
+
+	if (bios_limit) {
+		ret = bios_limit(policy->cpu, &limit);
 		if (!ret)
 			return sprintf(buf, "%u\n", limit);
 	}
@@ -730,6 +759,8 @@ static int cpufreq_add_dev_interface(unsigned int cpu,
 				     struct device *dev)
 {
 	struct cpufreq_policy new_policy;
+	struct cpufreq_driver *driver;
+	int (*exit)(struct cpufreq_policy *policy);
 	struct freq_attr **drv_attr;
 	unsigned long flags;
 	int ret = 0;
@@ -742,35 +773,39 @@ static int cpufreq_add_dev_interface(unsigned int cpu,
 		return ret;
 
 	/* set up files for this cpu device */
-	drv_attr = cpufreq_driver->attr;
+	rcu_read_lock();
+	driver = rcu_dereference(cpufreq_driver);
+	drv_attr = driver->attr;
 	while ((drv_attr) && (*drv_attr)) {
 		ret = sysfs_create_file(&policy->kobj, &((*drv_attr)->attr));
 		if (ret)
 			goto err_out_kobj_put;
 		drv_attr++;
 	}
-	if (cpufreq_driver->get) {
+	if (driver->get) {
 		ret = sysfs_create_file(&policy->kobj, &cpuinfo_cur_freq.attr);
 		if (ret)
 			goto err_out_kobj_put;
 	}
-	if (cpufreq_driver->target) {
+	if (driver->target) {
 		ret = sysfs_create_file(&policy->kobj, &scaling_cur_freq.attr);
 		if (ret)
 			goto err_out_kobj_put;
 	}
-	if (cpufreq_driver->bios_limit) {
+	if (driver->bios_limit) {
 		ret = sysfs_create_file(&policy->kobj, &bios_limit.attr);
 		if (ret)
 			goto err_out_kobj_put;
 	}
+	rcu_read_unlock();
 
-	write_lock_irqsave(&cpufreq_driver_lock, flags);
+	spin_lock_irqsave(&cpufreq_driver_lock, flags);
 	for_each_cpu(j, policy->cpus) {
 		per_cpu(cpufreq_cpu_data, j) = policy;
 		per_cpu(cpufreq_policy_cpu, j) = policy->cpu;
 	}
-	write_unlock_irqrestore(&cpufreq_driver_lock, flags);
+	spin_unlock_irqrestore(&cpufreq_driver_lock, flags);
+	synchronize_rcu();
 
 	ret = cpufreq_add_dev_symlink(cpu, policy);
 	if (ret)
@@ -787,8 +822,11 @@ static int cpufreq_add_dev_interface(unsigned int cpu,
 
 	if (ret) {
 		pr_debug("setting policy failed\n");
-		if (cpufreq_driver->exit)
-			cpufreq_driver->exit(policy);
+		rcu_read_lock();
+		exit = rcu_dereference(cpufreq_driver)->exit;
+		rcu_read_unlock();
+		if (exit)
+			exit(policy);
 	}
 	return ret;
 
@@ -813,12 +851,13 @@ static int cpufreq_add_policy_cpu(unsigned int cpu, unsigned int sibling,
 
 	lock_policy_rwsem_write(sibling);
 
-	write_lock_irqsave(&cpufreq_driver_lock, flags);
+	spin_lock_irqsave(&cpufreq_driver_lock, flags);
 
 	cpumask_set_cpu(cpu, policy->cpus);
 	per_cpu(cpufreq_policy_cpu, cpu) = policy->cpu;
 	per_cpu(cpufreq_cpu_data, cpu) = policy;
-	write_unlock_irqrestore(&cpufreq_driver_lock, flags);
+	spin_unlock_irqrestore(&cpufreq_driver_lock, flags);
+	synchronize_rcu();
 
 	unlock_policy_rwsem_write(sibling);
 
@@ -849,6 +888,10 @@ static int cpufreq_add_dev(struct device *dev, struct subsys_interface *sif)
 	unsigned int j, cpu = dev->id;
 	int ret = -ENOMEM;
 	struct cpufreq_policy *policy;
+	struct cpufreq_driver *driver;
+	int (*init)(struct cpufreq_policy *policy);
+	struct module *owner;
+
 	unsigned long flags;
 #ifdef CONFIG_HOTPLUG_CPU
 	struct cpufreq_governor *gov;
@@ -869,21 +912,24 @@ static int cpufreq_add_dev(struct device *dev, struct subsys_interface *sif)
 		return 0;
 	}
 
+	rcu_read_lock();
 #ifdef CONFIG_HOTPLUG_CPU
 	/* Check if this cpu was hot-unplugged earlier and has siblings */
-	read_lock_irqsave(&cpufreq_driver_lock, flags);
 	for_each_online_cpu(sibling) {
 		struct cpufreq_policy *cp = per_cpu(cpufreq_cpu_data, sibling);
 		if (cp && cpumask_test_cpu(cpu, cp->related_cpus)) {
-			read_unlock_irqrestore(&cpufreq_driver_lock, flags);
+			rcu_read_unlock();
 			return cpufreq_add_policy_cpu(cpu, sibling, dev);
 		}
 	}
-	read_unlock_irqrestore(&cpufreq_driver_lock, flags);
 #endif
 #endif
+	driver = rcu_dereference(cpufreq_driver);
+	init = driver->init;
+	owner = driver->owner;
+	rcu_read_unlock();
 
-	if (!try_module_get(cpufreq_driver->owner)) {
+	if (!try_module_get(owner)) {
 		ret = -EINVAL;
 		goto module_out;
 	}
@@ -911,7 +957,7 @@ static int cpufreq_add_dev(struct device *dev, struct subsys_interface *sif)
 	/* call driver. From then on the cpufreq must be able
 	 * to accept all calls to ->verify and ->setpolicy for this CPU
 	 */
-	ret = cpufreq_driver->init(policy);
+	ret = init(policy);
 	if (ret) {
 		pr_debug("initialization failed\n");
 		goto err_set_policy_cpu;
@@ -946,16 +992,17 @@ static int cpufreq_add_dev(struct device *dev, struct subsys_interface *sif)
 		goto err_out_unregister;
 
 	kobject_uevent(&policy->kobj, KOBJ_ADD);
-	module_put(cpufreq_driver->owner);
+	module_put(owner);
 	pr_debug("initialization complete\n");
 
 	return 0;
 
 err_out_unregister:
-	write_lock_irqsave(&cpufreq_driver_lock, flags);
+	spin_lock_irqsave(&cpufreq_driver_lock, flags);
 	for_each_cpu(j, policy->cpus)
 		per_cpu(cpufreq_cpu_data, j) = NULL;
-	write_unlock_irqrestore(&cpufreq_driver_lock, flags);
+	spin_unlock_irqrestore(&cpufreq_driver_lock, flags);
+	synchronize_rcu();
 
 	kobject_put(&policy->kobj);
 	wait_for_completion(&policy->kobj_unregister);
@@ -968,7 +1015,7 @@ err_free_cpumask:
 err_free_policy:
 	kfree(policy);
 nomem_out:
-	module_put(cpufreq_driver->owner);
+	module_put(owner);
 module_out:
 	return ret;
 }
@@ -1002,29 +1049,46 @@ static int __cpufreq_remove_dev(struct device *dev, struct subsys_interface *sif
 	unsigned int cpu = dev->id, ret, cpus;
 	unsigned long flags;
 	struct cpufreq_policy *data;
+	struct cpufreq_driver *driver;
+	int (*target)(struct cpufreq_policy *policy,
+		       unsigned int target_freq,
+		       unsigned int relation);
+	int (*exit)(struct cpufreq_policy *policy);
+#ifdef CONFIG_HOTPLUG_CPU
+	int (*setpolicy)(struct cpufreq_policy *policy);
+#endif
 	struct kobject *kobj;
 	struct completion *cmp;
 	struct device *cpu_dev;
 
 	pr_debug("%s: unregistering CPU %u\n", __func__, cpu);
 
-	write_lock_irqsave(&cpufreq_driver_lock, flags);
+	spin_lock_irqsave(&cpufreq_driver_lock, flags);
 
 	data = per_cpu(cpufreq_cpu_data, cpu);
 	per_cpu(cpufreq_cpu_data, cpu) = NULL;
-
-	write_unlock_irqrestore(&cpufreq_driver_lock, flags);
+	spin_unlock_irqrestore(&cpufreq_driver_lock, flags);
+	synchronize_rcu();
 
 	if (!data) {
 		pr_debug("%s: No cpu_data found\n", __func__);
 		return -EINVAL;
 	}
 
-	if (cpufreq_driver->target)
+	rcu_read_lock();
+	driver = rcu_dereference(cpufreq_driver);
+	target = driver->target;
+	exit = driver->exit;
+#ifdef CONFIG_HOTPLUG_CPU
+	setpolicy = driver->setpolicy;
+#endif
+	rcu_read_unlock();
+
+	if (target)
 		__cpufreq_governor(data, CPUFREQ_GOV_STOP);
 
 #ifdef CONFIG_HOTPLUG_CPU
-	if (!cpufreq_driver->setpolicy)
+	if (!setpolicy)
 		strncpy(per_cpu(cpufreq_cpu_governor, cpu),
 			data->governor->name, CPUFREQ_NAME_LEN);
 #endif
@@ -1047,9 +1111,10 @@ static int __cpufreq_remove_dev(struct device *dev, struct subsys_interface *sif
 			WARN_ON(lock_policy_rwsem_write(cpu));
 			cpumask_set_cpu(cpu, data->cpus);
 
-			write_lock_irqsave(&cpufreq_driver_lock, flags);
+			spin_lock_irqsave(&cpufreq_driver_lock, flags);
 			per_cpu(cpufreq_cpu_data, cpu) = data;
-			write_unlock_irqrestore(&cpufreq_driver_lock, flags);
+			spin_unlock_irqrestore(&cpufreq_driver_lock, flags);
+			synchronize_rcu();
 
 			unlock_policy_rwsem_write(cpu);
 
@@ -1084,13 +1149,13 @@ static int __cpufreq_remove_dev(struct device *dev, struct subsys_interface *sif
 		wait_for_completion(cmp);
 		pr_debug("wait complete\n");
 
-		if (cpufreq_driver->exit)
-			cpufreq_driver->exit(data);
+		if (exit)
+			exit(data);
 
 		free_cpumask_var(data->related_cpus);
 		free_cpumask_var(data->cpus);
 		kfree(data);
-	} else if (cpufreq_driver->target) {
+	} else if (target) {
 		__cpufreq_governor(data, CPUFREQ_GOV_START);
 		__cpufreq_governor(data, CPUFREQ_GOV_LIMITS);
 	}
@@ -1157,10 +1222,18 @@ static void cpufreq_out_of_sync(unsigned int cpu, unsigned int old_freq,
 unsigned int cpufreq_quick_get(unsigned int cpu)
 {
 	struct cpufreq_policy *policy;
+	struct cpufreq_driver *driver;
+	unsigned int (*get)(unsigned int cpu);
 	unsigned int ret_freq = 0;
 
-	if (cpufreq_driver && cpufreq_driver->setpolicy && cpufreq_driver->get)
-		return cpufreq_driver->get(cpu);
+	rcu_read_lock();
+	driver = rcu_dereference(cpufreq_driver);
+	if (driver && driver->setpolicy && driver->get) {
+		get = driver->get;
+		rcu_read_unlock();
+		return get(cpu);
+	}
+	rcu_read_unlock();
 
 	policy = cpufreq_cpu_get(cpu);
 	if (policy) {
@@ -1197,14 +1270,23 @@ static unsigned int __cpufreq_get(unsigned int cpu)
 {
 	struct cpufreq_policy *policy = per_cpu(cpufreq_cpu_data, cpu);
 	unsigned int ret_freq = 0;
+	struct cpufreq_driver *driver;
+	unsigned int (*get)(unsigned int cpu);
+	u8 flags;
 
-	if (!cpufreq_driver->get)
+	rcu_read_lock();
+	driver = rcu_dereference(cpufreq_driver);
+	get = driver->get;
+	flags = driver->flags;
+	rcu_read_unlock();
+
+	if (!get)
 		return ret_freq;
 
-	ret_freq = cpufreq_driver->get(cpu);
+	ret_freq = get(cpu);
 
 	if (ret_freq && policy->cur &&
-		!(cpufreq_driver->flags & CPUFREQ_CONST_LOOPS)) {
+		!(flags & CPUFREQ_CONST_LOOPS)) {
 		/* verify no discrepancy between actual and
 					saved value exists */
 		if (unlikely(ret_freq != policy->cur)) {
@@ -1261,6 +1343,7 @@ static struct subsys_interface cpufreq_interface = {
 static int cpufreq_bp_suspend(void)
 {
 	int ret = 0;
+	int (*suspend)(struct cpufreq_policy *policy);
 
 	int cpu = smp_processor_id();
 	struct cpufreq_policy *cpu_policy;
@@ -1272,8 +1355,11 @@ static int cpufreq_bp_suspend(void)
 	if (!cpu_policy)
 		return 0;
 
-	if (cpufreq_driver->suspend) {
-		ret = cpufreq_driver->suspend(cpu_policy);
+	rcu_read_lock();
+	suspend = rcu_dereference(cpufreq_driver)->suspend;
+	rcu_read_unlock();
+	if (suspend) {
+		ret = suspend(cpu_policy);
 		if (ret)
 			printk(KERN_ERR "cpufreq: suspend failed in ->suspend "
 					"step on CPU %u\n", cpu_policy->cpu);
@@ -1299,6 +1385,7 @@ static int cpufreq_bp_suspend(void)
 static void cpufreq_bp_resume(void)
 {
 	int ret = 0;
+	int (*resume)(struct cpufreq_policy *policy);
 
 	int cpu = smp_processor_id();
 	struct cpufreq_policy *cpu_policy;
@@ -1310,8 +1397,11 @@ static void cpufreq_bp_resume(void)
 	if (!cpu_policy)
 		return;
 
-	if (cpufreq_driver->resume) {
-		ret = cpufreq_driver->resume(cpu_policy);
+	rcu_read_lock();
+	resume = rcu_dereference(cpufreq_driver)->resume;
+	rcu_read_unlock();
+	if (resume) {
+		ret = resume(cpu_policy);
 		if (ret) {
 			printk(KERN_ERR "cpufreq: resume failed in ->resume "
 					"step on CPU %u\n", cpu_policy->cpu);
@@ -1338,10 +1428,16 @@ static struct syscore_ops cpufreq_syscore_ops = {
  */
 const char *cpufreq_get_current_driver(void)
 {
-	if (cpufreq_driver)
-		return cpufreq_driver->name;
+	char *name = NULL;
+	struct cpufreq_driver *driver;
 
-	return NULL;
+	rcu_read_lock();
+	driver = rcu_dereference(cpufreq_driver);
+	if (driver)
+		name = driver->name;
+	rcu_read_unlock();
+
+	return name;
 }
 EXPORT_SYMBOL_GPL(cpufreq_get_current_driver);
 
@@ -1435,6 +1531,9 @@ int __cpufreq_driver_target(struct cpufreq_policy *policy,
 {
 	int retval = -EINVAL;
 	unsigned int old_target_freq = target_freq;
+	int (*target)(struct cpufreq_policy *policy,
+		      unsigned int target_freq,
+		      unsigned int relation);
 
 	if (cpufreq_disabled())
 		return -ENODEV;
@@ -1451,8 +1550,11 @@ int __cpufreq_driver_target(struct cpufreq_policy *policy,
 	if (target_freq == policy->cur)
 		return 0;
 
-	if (cpufreq_driver->target)
-		retval = cpufreq_driver->target(policy, target_freq, relation);
+	rcu_read_lock();
+	target = rcu_dereference(cpufreq_driver)->target;
+	rcu_read_unlock();
+	if (target)
+		retval = target(policy, target_freq, relation);
 
 	return retval;
 }
@@ -1485,18 +1587,24 @@ EXPORT_SYMBOL_GPL(cpufreq_driver_target);
 int __cpufreq_driver_getavg(struct cpufreq_policy *policy, unsigned int cpu)
 {
 	int ret = 0;
+	unsigned int (*getavg)(struct cpufreq_policy *policy,
+			       unsigned int cpu);
 
 	if (cpufreq_disabled())
 		return ret;
 
-	if (!cpufreq_driver->getavg)
+	rcu_read_lock();
+	getavg = rcu_dereference(cpufreq_driver)->getavg;
+	rcu_read_unlock();
+
+	if (!getavg)
 		return 0;
 
 	policy = cpufreq_cpu_get(policy->cpu);
 	if (!policy)
 		return -EINVAL;
 
-	ret = cpufreq_driver->getavg(policy, cpu);
+	ret = getavg(policy, cpu);
 
 	cpufreq_cpu_put(policy);
 	return ret;
@@ -1652,6 +1760,9 @@ static int __cpufreq_set_policy(struct cpufreq_policy *data,
 				struct cpufreq_policy *policy)
 {
 	int ret = 0;
+	struct cpufreq_driver *driver;
+	int (*verify)(struct cpufreq_policy *policy);
+	int (*setpolicy)(struct cpufreq_policy *policy);
 
 	pr_debug("setting new policy for CPU %u: %u - %u kHz\n", policy->cpu,
 		policy->min, policy->max);
@@ -1664,8 +1775,14 @@ static int __cpufreq_set_policy(struct cpufreq_policy *data,
 		goto error_out;
 	}
 
+	rcu_read_lock();
+	driver = rcu_dereference(cpufreq_driver);
+	verify = driver->verify;
+	setpolicy = driver->setpolicy;
+	rcu_read_unlock();
+
 	/* verify the cpu speed can be set within this limit */
-	ret = cpufreq_driver->verify(policy);
+	ret = verify(policy);
 	if (ret)
 		goto error_out;
 
@@ -1679,7 +1796,7 @@ static int __cpufreq_set_policy(struct cpufreq_policy *data,
 
 	/* verify the cpu speed can be set within this limit,
 	   which might be different to the first one */
-	ret = cpufreq_driver->verify(policy);
+	ret = verify(policy);
 	if (ret)
 		goto error_out;
 
@@ -1693,10 +1810,10 @@ static int __cpufreq_set_policy(struct cpufreq_policy *data,
 	pr_debug("new min and max freqs are %u - %u kHz\n",
 					data->min, data->max);
 
-	if (cpufreq_driver->setpolicy) {
+	if (setpolicy) {
 		data->policy = policy->policy;
 		pr_debug("setting range\n");
-		ret = cpufreq_driver->setpolicy(policy);
+		ret = setpolicy(policy);
 	} else {
 		if (policy->governor != data->governor) {
 			/* save old, working values */
@@ -1743,6 +1860,11 @@ int cpufreq_update_policy(unsigned int cpu)
 {
 	struct cpufreq_policy *data = cpufreq_cpu_get(cpu);
 	struct cpufreq_policy policy;
+	struct cpufreq_driver *driver;
+	unsigned int (*get)(unsigned int cpu);
+	int (*target)(struct cpufreq_policy *policy,
+		      unsigned int target_freq,
+		      unsigned int relation);
 	int ret;
 
 	if (!data) {
@@ -1762,15 +1884,21 @@ int cpufreq_update_policy(unsigned int cpu)
 	policy.policy = data->user_policy.policy;
 	policy.governor = data->user_policy.governor;
 
+	rcu_read_lock();
+	driver = rcu_dereference(cpufreq_driver);
+	get = driver->get;
+	target = driver->target;
+	rcu_read_unlock();
+
 	/* BIOS might change freq behind our back
 	  -> ask driver for current freq and notify governors about a change */
-	if (cpufreq_driver->get) {
-		policy.cur = cpufreq_driver->get(cpu);
+	if (get) {
+		policy.cur = get(cpu);
 		if (!data->cur) {
 			pr_debug("Driver did not initialize current freq");
 			data->cur = policy.cur;
 		} else {
-			if (data->cur != policy.cur && cpufreq_driver->target)
+			if (data->cur != policy.cur && target)
 				cpufreq_out_of_sync(cpu, data->cur,
 								policy.cur);
 		}
@@ -1848,19 +1976,20 @@ int cpufreq_register_driver(struct cpufreq_driver *driver_data)
 	if (driver_data->setpolicy)
 		driver_data->flags |= CPUFREQ_CONST_LOOPS;
 
-	write_lock_irqsave(&cpufreq_driver_lock, flags);
-	if (cpufreq_driver) {
-		write_unlock_irqrestore(&cpufreq_driver_lock, flags);
+	spin_lock_irqsave(&cpufreq_driver_lock, flags);
+	if (rcu_access_pointer(cpufreq_driver)) {
+		spin_unlock_irqrestore(&cpufreq_driver_lock, flags);
 		return -EBUSY;
 	}
-	cpufreq_driver = driver_data;
-	write_unlock_irqrestore(&cpufreq_driver_lock, flags);
+	rcu_assign_pointer(cpufreq_driver, driver_data);
+	spin_unlock_irqrestore(&cpufreq_driver_lock, flags);
+	synchronize_rcu();
 
 	ret = subsys_interface_register(&cpufreq_interface);
 	if (ret)
 		goto err_null_driver;
 
-	if (!(cpufreq_driver->flags & CPUFREQ_STICKY)) {
+	if (!(driver_data->flags & CPUFREQ_STICKY)) {
 		int i;
 		ret = -ENODEV;
 
@@ -1886,9 +2015,10 @@ int cpufreq_register_driver(struct cpufreq_driver *driver_data)
 err_if_unreg:
 	subsys_interface_unregister(&cpufreq_interface);
 err_null_driver:
-	write_lock_irqsave(&cpufreq_driver_lock, flags);
-	cpufreq_driver = NULL;
-	write_unlock_irqrestore(&cpufreq_driver_lock, flags);
+	spin_lock_irqsave(&cpufreq_driver_lock, flags);
+	rcu_assign_pointer(cpufreq_driver, NULL);
+	spin_unlock_irqrestore(&cpufreq_driver_lock, flags);
+	synchronize_rcu();
 	return ret;
 }
 EXPORT_SYMBOL_GPL(cpufreq_register_driver);
@@ -1905,8 +2035,13 @@ EXPORT_SYMBOL_GPL(cpufreq_register_driver);
 int cpufreq_unregister_driver(struct cpufreq_driver *driver)
 {
 	unsigned long flags;
+	struct cpufreq_driver *old_driver;
+
+	rcu_read_lock();
+	old_driver = rcu_access_pointer(cpufreq_driver);
+	rcu_read_unlock();
 
-	if (!cpufreq_driver || (driver != cpufreq_driver))
+	if (!old_driver || (driver != old_driver))
 		return -EINVAL;
 
 	pr_debug("unregistering driver %s\n", driver->name);
@@ -1914,9 +2049,10 @@ int cpufreq_unregister_driver(struct cpufreq_driver *driver)
 	subsys_interface_unregister(&cpufreq_interface);
 	unregister_hotcpu_notifier(&cpufreq_cpu_notifier);
 
-	write_lock_irqsave(&cpufreq_driver_lock, flags);
-	cpufreq_driver = NULL;
-	write_unlock_irqrestore(&cpufreq_driver_lock, flags);
+	spin_lock_irqsave(&cpufreq_driver_lock, flags);
+	rcu_assign_pointer(cpufreq_driver, NULL);
+	spin_unlock_irqrestore(&cpufreq_driver_lock, flags);
+	synchronize_rcu();
 
 	return 0;
 }
-- 
1.8.1.2



* Re: [PATCH v3 2/2] cpufreq: Convert the cpufreq_driver_lock to use the rcu
  2013-02-20 23:56                 ` [PATCH v3 2/2] cpufreq: Convert the cpufreq_driver_lock to use the rcu Nathan Zimmer
@ 2013-02-21  5:50                   ` Viresh Kumar
  2013-02-21 17:49                     ` Nathan Zimmer
  0 siblings, 1 reply; 55+ messages in thread
From: Viresh Kumar @ 2013-02-21  5:50 UTC (permalink / raw)
  To: Nathan Zimmer; +Cc: rjw, cpufreq, linux-pm, linux-kernel

On 21 February 2013 05:26, Nathan Zimmer <nzimmer@sgi.com> wrote:
> In general rwlocks are discouraged, so we are moving it to use RCU instead.
> This does require a bit of care since the cpufreq_driver_lock protects both
> the cpufreq_driver and the cpufreq_cpu_data array.
> Also, since many of the function pointers on cpufreq_driver may sleep when
> called, we have to grab them under rcu_read_lock() but call them after
> rcu_read_unlock().

Even I have started reading the RCU documentation now :)

> Cc: Viresh Kumar <viresh.kumar@linaro.org>
> Cc: "Rafael J. Wysocki" <rjw@sisk.pl>
> Signed-off-by: Nathan Zimmer <nzimmer@sgi.com>
> ---
>  drivers/cpufreq/cpufreq.c | 312 +++++++++++++++++++++++++++++++++-------------
>  1 file changed, 224 insertions(+), 88 deletions(-)
>
> diff --git a/drivers/cpufreq/cpufreq.c b/drivers/cpufreq/cpufreq.c

> @@ -255,20 +258,21 @@ static inline void adjust_jiffies(unsigned long val, struct cpufreq_freqs *ci)
>  void cpufreq_notify_transition(struct cpufreq_freqs *freqs, unsigned int state)
>  {
>         struct cpufreq_policy *policy;
> -       unsigned long flags;
> +       u8 flags;

I think you can get rid of flags.

>         BUG_ON(irqs_disabled());
>
>         if (cpufreq_disabled())
>                 return;
>
> -       freqs->flags = cpufreq_driver->flags;
>         pr_debug("notification %u of frequency transition to %u kHz\n",
>                 state, freqs->new);
>
> -       read_lock_irqsave(&cpufreq_driver_lock, flags);
> +       rcu_read_lock();
> +       flags = rcu_dereference(cpufreq_driver)->flags;

Use freqs->flags here ...

>         policy = per_cpu(cpufreq_cpu_data, freqs->cpu);
> -       read_unlock_irqrestore(&cpufreq_driver_lock, flags);
> +       rcu_read_unlock();
> +       freqs->flags = flags;
>
>         switch (state) {
>
> @@ -277,7 +281,7 @@ void cpufreq_notify_transition(struct cpufreq_freqs *freqs, unsigned int state)
>                  * which is not equal to what the cpufreq core thinks is
>                  * "old frequency".
>                  */
> -               if (!(cpufreq_driver->flags & CPUFREQ_CONST_LOOPS)) {
> +               if (!(flags & CPUFREQ_CONST_LOOPS)) {

and here.

>                         if ((policy) && (policy->cpu == freqs->cpu) &&
>                             (policy->cur) && (policy->cur != freqs->old)) {
>                                 pr_debug("Warning: CPU frequency is"
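
What the suggestion above might look like, as an untested userspace sketch.
The demo_* names and DEMO_CONST_LOOPS are hypothetical stand-ins for the
kernel types and CPUFREQ_CONST_LOOPS, and the lock/unlock helpers are empty
stubs rather than real RCU:

```c
#include <assert.h>
#include <stddef.h>

#define DEMO_CONST_LOOPS 0x02	/* hypothetical stand-in flag value */

/* Cut-down stand-ins for struct cpufreq_driver and struct cpufreq_freqs. */
struct demo_driver { unsigned char flags; };
struct demo_freqs  { unsigned int cpu; unsigned char flags; };

static struct demo_driver *demo_driver_ptr;

/* Empty stubs marking where the read-side section would be. */
static void demo_rcu_read_lock(void)   { }
static void demo_rcu_read_unlock(void) { }

/* Returns 1 if the "old frequency" sanity check would run, 0 otherwise. */
static int demo_notify_transition(struct demo_freqs *freqs)
{
	demo_rcu_read_lock();
	/* Store straight into freqs->flags; no local 'flags' variable. */
	freqs->flags = demo_driver_ptr->flags;
	demo_rcu_read_unlock();

	/* Later tests read freqs->flags instead of the driver. */
	return !(freqs->flags & DEMO_CONST_LOOPS);
}
```

With this shape the driver is touched only inside the read-side section, and
everything after it works from the copy held in freqs->flags.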


> @@ -742,35 +773,39 @@ static int cpufreq_add_dev_interface(unsigned int cpu,

> -       write_lock_irqsave(&cpufreq_driver_lock, flags);
> +       spin_lock_irqsave(&cpufreq_driver_lock, flags);
>         for_each_cpu(j, policy->cpus) {
>                 per_cpu(cpufreq_cpu_data, j) = policy;
>                 per_cpu(cpufreq_policy_cpu, j) = policy->cpu;
>         }
> -       write_unlock_irqrestore(&cpufreq_driver_lock, flags);
> +       spin_unlock_irqrestore(&cpufreq_driver_lock, flags);
> +       synchronize_rcu();

I don't think (but I could be wrong too :) ) that we need a synchronize_rcu()
here. We need it only at places where we have updated the cpufreq_driver
pointer, as we aren't doing any RCU-specific read/update for cpufreq_cpu_data.
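
To illustrate the point about waiting only after the driver pointer itself
changes, here is a minimal userspace sketch. It is not kernel code: the
toy_* names are hypothetical, and a shared reader counter stands in for a real
RCU grace period (real RCU does not use a shared counter like this). The only
place that has to wait for readers is the one that unpublishes the pointer.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical stand-in for struct cpufreq_driver. */
struct toy_driver {
	unsigned int flags;
};

/* Published driver pointer; NULL means "no driver registered". */
static _Atomic(struct toy_driver *) toy_driver_ptr;

/* Toy replacement for a grace period: count of readers in flight. */
static atomic_int toy_readers;

static void toy_read_lock(void)
{
	atomic_fetch_add(&toy_readers, 1);
}

static void toy_read_unlock(void)
{
	atomic_fetch_sub(&toy_readers, 1);
}

/* Wait until every reader that may still hold the old pointer has left. */
static void toy_synchronize(void)
{
	while (atomic_load(&toy_readers) != 0)
		;
}

/* Reader: dereference the pointer only inside the read-side section. */
int toy_read_flags(void)
{
	struct toy_driver *drv;
	int ret = -1;

	toy_read_lock();
	drv = atomic_load(&toy_driver_ptr);
	if (drv)
		ret = (int)drv->flags;
	toy_read_unlock();
	return ret;
}

/* Publishing a new driver needs no wait: there is no old object to retire. */
void toy_publish(struct toy_driver *drv)
{
	atomic_store(&toy_driver_ptr, drv);
}

/* Unpublishing is where the wait belongs, before the caller reuses or frees. */
struct toy_driver *toy_unpublish(void)
{
	struct toy_driver *old = atomic_exchange(&toy_driver_ptr, NULL);

	toy_synchronize();
	return old;
}
```

In this sketch, data that is only ever read and written under an ordinary lock
(as cpufreq_cpu_data is under the spinlock in the patch) would never call
toy_synchronize() at all.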

^ permalink raw reply	[flat|nested] 55+ messages in thread

* Re: [PATCH v3 2/2] cpufreq: Convert the cpufreq_driver_lock to use the rcu
  2013-02-21  5:50                   ` Viresh Kumar
@ 2013-02-21 17:49                     ` Nathan Zimmer
  2013-02-22 16:24                       ` [PATCH v4 0/2] cpufreq: cpufreq_driver_lock is hot on large systems Nathan Zimmer
  0 siblings, 1 reply; 55+ messages in thread
From: Nathan Zimmer @ 2013-02-21 17:49 UTC (permalink / raw)
  To: Viresh Kumar; +Cc: rjw, cpufreq, linux-pm, linux-kernel

On 02/20/2013 11:50 PM, Viresh Kumar wrote:
> On 21 February 2013 05:26, Nathan Zimmer <nzimmer@sgi.com> wrote:
>> In general rwlocks are discouraged so we are moving it to use the rcu instead.
>> This does require a bit of care since the cpufreq_driver_lock protects both
>> the cpufreq_driver and the cpufreq_cpu_data array.
>> Also since many of the function pointers on cpufreq_driver may sleep when
>> called we have to grab them under the rcu_read_lock but call them after
>> rcu_read_unlock();
> Even i have started reading rcu documentation now :)
>
>> Cc: Viresh Kumar <viresh.kumar@linaro.org>
>> Cc: "Rafael J. Wysocki" <rjw@sisk.pl>
>> Signed-off-by: Nathan Zimmer <nzimmer@sgi.com>
>> ---
>>   drivers/cpufreq/cpufreq.c | 312 +++++++++++++++++++++++++++++++++-------------
>>   1 file changed, 224 insertions(+), 88 deletions(-)
>>
>> diff --git a/drivers/cpufreq/cpufreq.c b/drivers/cpufreq/cpufreq.c
>> @@ -255,20 +258,21 @@ static inline void adjust_jiffies(unsigned long val, struct cpufreq_freqs *ci)
>>   void cpufreq_notify_transition(struct cpufreq_freqs *freqs, unsigned int state)
>>   {
>>          struct cpufreq_policy *policy;
>> -       unsigned long flags;
>> +       u8 flags;
> I think you can get rid of flags.
>
>>          BUG_ON(irqs_disabled());
>>
>>          if (cpufreq_disabled())
>>                  return;
>>
>> -       freqs->flags = cpufreq_driver->flags;
>>          pr_debug("notification %u of frequency transition to %u kHz\n",
>>                  state, freqs->new);
>>
>> -       read_lock_irqsave(&cpufreq_driver_lock, flags);
>> +       rcu_read_lock();
>> +       flags = rcu_dereference(cpufreq_driver)->flags;
> use freqs->flags here ...
>
>>          policy = per_cpu(cpufreq_cpu_data, freqs->cpu);
>> -       read_unlock_irqrestore(&cpufreq_driver_lock, flags);
>> +       rcu_read_unlock();
>> +       freqs->flags = flags;
>>
>>          switch (state) {
>>
>> @@ -277,7 +281,7 @@ void cpufreq_notify_transition(struct cpufreq_freqs *freqs, unsigned int state)
>>                   * which is not equal to what the cpufreq core thinks is
>>                   * "old frequency".
>>                   */
>> -               if (!(cpufreq_driver->flags & CPUFREQ_CONST_LOOPS)) {
>> +               if (!(flags & CPUFREQ_CONST_LOOPS)) {
> and here.
Of course.
>>                          if ((policy) && (policy->cpu == freqs->cpu) &&
>>                              (policy->cur) && (policy->cur != freqs->old)) {
>>                                  pr_debug("Warning: CPU frequency is"
>
>> @@ -742,35 +773,39 @@ static int cpufreq_add_dev_interface(unsigned int cpu,
>> -       write_lock_irqsave(&cpufreq_driver_lock, flags);
>> +       spin_lock_irqsave(&cpufreq_driver_lock, flags);
>>          for_each_cpu(j, policy->cpus) {
>>                  per_cpu(cpufreq_cpu_data, j) = policy;
>>                  per_cpu(cpufreq_policy_cpu, j) = policy->cpu;
>>          }
>> -       write_unlock_irqrestore(&cpufreq_driver_lock, flags);
>> +       spin_unlock_irqrestore(&cpufreq_driver_lock, flags);
>> +       synchronize_rcu();
> I don't think (but i can be wrong too :) ), that we need a synchronize_rcu()
> here. We need it only at places where we have updated the cpufreq_driver
> pointer.
>
> As we aren't doing any rcu specific read/update for cpufreq_cpu_data.
Good point.
I placed similar synchronize_rcu() calls in cpufreq_add_policy_cpu and
cpufreq_add_dev.
I will remove those as well.


Thanks, I will respin.
Nate


^ permalink raw reply	[flat|nested] 55+ messages in thread

* [PATCH v4 0/2] cpufreq: cpufreq_driver_lock is hot on large systems
  2013-02-21 17:49                     ` Nathan Zimmer
@ 2013-02-22 16:24                       ` Nathan Zimmer
  2013-02-22 16:24                         ` [PATCH v4 1/2] cpufreq: Convert the cpufreq_driver_lock to a rwlock Nathan Zimmer
                                           ` (2 more replies)
  0 siblings, 3 replies; 55+ messages in thread
From: Nathan Zimmer @ 2013-02-22 16:24 UTC (permalink / raw)
  To: viresh.kumar, rjw; +Cc: cpufreq, linux-pm, linux-kernel, Nathan Zimmer

I am noticing that the cpufreq_driver_lock is quite hot.
On an idle 512 system, perf shows me that most of the system time is spent on
this lock.  This is quite significant, as top shows 5% of time in system time.
My solution was to first convert the lock to a rwlock and then to RCU.
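
As a rough illustration of why the rwlock step helps read-mostly paths, here
is a minimal userspace sketch of reader/writer exclusion. It is not the kernel
rwlock implementation; the toy_rw_* names and the single state word are
assumptions made for the example. Readers can hold the lock together, while a
writer needs it exclusively.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/*
 * Hypothetical toy lock:
 *   state  > 0  that many readers hold the lock
 *   state == 0  free
 *   state == -1 one writer holds the lock
 */
struct toy_rwlock {
	atomic_int state;
};

/* Readers succeed as long as no writer holds the lock. */
bool toy_rw_read_trylock(struct toy_rwlock *l)
{
	int s = atomic_load(&l->state);

	while (s >= 0) {
		/* On failure, s is reloaded with the current state. */
		if (atomic_compare_exchange_weak(&l->state, &s, s + 1))
			return true;
	}
	return false;
}

void toy_rw_read_unlock(struct toy_rwlock *l)
{
	atomic_fetch_sub(&l->state, 1);
}

/* A writer succeeds only when the lock is completely free. */
bool toy_rw_write_trylock(struct toy_rwlock *l)
{
	int expected = 0;

	return atomic_compare_exchange_strong(&l->state, &expected, -1);
}

void toy_rw_write_unlock(struct toy_rwlock *l)
{
	atomic_store(&l->state, 0);
}
```

Note that even in this toy version every reader still writes to the shared
state word, so readers no longer exclude each other but they do still touch
the same memory. That remaining cost is one plausible reason for going on to
RCU in the second patch.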

v2: Rebase

v3: Read the RCU documentation instead of skimming it.  Also, I based it on
git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm.git pm+acpi-3.9-rc1
I assumed that was what you would prefer, Rafael.

v4: Removed an unnecessary synchronize_rcu().


Nathan Zimmer (2):
  cpufreq: Convert the cpufreq_driver_lock to a rwlock
  cpufreq: Convert the cpufreq_driver_lock to use the rcu

 drivers/cpufreq/cpufreq.c | 286 ++++++++++++++++++++++++++++++++++------------
 1 file changed, 211 insertions(+), 75 deletions(-)

-- 
1.8.1.2


^ permalink raw reply	[flat|nested] 55+ messages in thread

* [PATCH v4 1/2] cpufreq: Convert the cpufreq_driver_lock to a rwlock
  2013-02-22 16:24                       ` [PATCH v4 0/2] cpufreq: cpufreq_driver_lock is hot on large systems Nathan Zimmer
@ 2013-02-22 16:24                         ` Nathan Zimmer
  2013-02-23  3:57                           ` Viresh Kumar
  2013-02-22 16:24                         ` [PATCH v4 2/2] cpufreq: Convert the cpufreq_driver_lock to use the rcu Nathan Zimmer
  2013-03-11 23:23                         ` [PATCH v4 0/2] cpufreq: cpufreq_driver_lock is hot on large systems Rafael J. Wysocki
  2 siblings, 1 reply; 55+ messages in thread
From: Nathan Zimmer @ 2013-02-22 16:24 UTC (permalink / raw)
  To: viresh.kumar, rjw; +Cc: cpufreq, linux-pm, linux-kernel, Nathan Zimmer

This eliminates the contention I am seeing in __cpufreq_cpu_get.
It also nicely stages the lock to be replaced by the rcu.

Cc: Viresh Kumar <viresh.kumar@linaro.org>
Cc: "Rafael J. Wysocki" <rjw@sisk.pl>
Signed-off-by: Nathan Zimmer <nzimmer@sgi.com>
---
 drivers/cpufreq/cpufreq.c | 52 +++++++++++++++++++++++------------------------
 1 file changed, 26 insertions(+), 26 deletions(-)

diff --git a/drivers/cpufreq/cpufreq.c b/drivers/cpufreq/cpufreq.c
index b02824d..c5996fe 100644
--- a/drivers/cpufreq/cpufreq.c
+++ b/drivers/cpufreq/cpufreq.c
@@ -45,7 +45,7 @@ static DEFINE_PER_CPU(struct cpufreq_policy *, cpufreq_cpu_data);
 /* This one keeps track of the previously set governor of a removed CPU */
 static DEFINE_PER_CPU(char[CPUFREQ_NAME_LEN], cpufreq_cpu_governor);
 #endif
-static DEFINE_SPINLOCK(cpufreq_driver_lock);
+static DEFINE_RWLOCK(cpufreq_driver_lock);
 
 /*
  * cpu_policy_rwsem is a per CPU reader-writer semaphore designed to cure
@@ -137,7 +137,7 @@ static struct cpufreq_policy *__cpufreq_cpu_get(unsigned int cpu, bool sysfs)
 		goto err_out;
 
 	/* get the cpufreq driver */
-	spin_lock_irqsave(&cpufreq_driver_lock, flags);
+	read_lock_irqsave(&cpufreq_driver_lock, flags);
 
 	if (!cpufreq_driver)
 		goto err_out_unlock;
@@ -155,13 +155,13 @@ static struct cpufreq_policy *__cpufreq_cpu_get(unsigned int cpu, bool sysfs)
 	if (!sysfs && !kobject_get(&data->kobj))
 		goto err_out_put_module;
 
-	spin_unlock_irqrestore(&cpufreq_driver_lock, flags);
+	read_unlock_irqrestore(&cpufreq_driver_lock, flags);
 	return data;
 
 err_out_put_module:
 	module_put(cpufreq_driver->owner);
 err_out_unlock:
-	spin_unlock_irqrestore(&cpufreq_driver_lock, flags);
+	read_unlock_irqrestore(&cpufreq_driver_lock, flags);
 err_out:
 	return NULL;
 }
@@ -266,9 +266,9 @@ void cpufreq_notify_transition(struct cpufreq_freqs *freqs, unsigned int state)
 	pr_debug("notification %u of frequency transition to %u kHz\n",
 		state, freqs->new);
 
-	spin_lock_irqsave(&cpufreq_driver_lock, flags);
+	read_lock_irqsave(&cpufreq_driver_lock, flags);
 	policy = per_cpu(cpufreq_cpu_data, freqs->cpu);
-	spin_unlock_irqrestore(&cpufreq_driver_lock, flags);
+	read_unlock_irqrestore(&cpufreq_driver_lock, flags);
 
 	switch (state) {
 
@@ -765,12 +765,12 @@ static int cpufreq_add_dev_interface(unsigned int cpu,
 			goto err_out_kobj_put;
 	}
 
-	spin_lock_irqsave(&cpufreq_driver_lock, flags);
+	write_lock_irqsave(&cpufreq_driver_lock, flags);
 	for_each_cpu(j, policy->cpus) {
 		per_cpu(cpufreq_cpu_data, j) = policy;
 		per_cpu(cpufreq_policy_cpu, j) = policy->cpu;
 	}
-	spin_unlock_irqrestore(&cpufreq_driver_lock, flags);
+	write_unlock_irqrestore(&cpufreq_driver_lock, flags);
 
 	ret = cpufreq_add_dev_symlink(cpu, policy);
 	if (ret)
@@ -813,12 +813,12 @@ static int cpufreq_add_policy_cpu(unsigned int cpu, unsigned int sibling,
 
 	lock_policy_rwsem_write(sibling);
 
-	spin_lock_irqsave(&cpufreq_driver_lock, flags);
+	write_lock_irqsave(&cpufreq_driver_lock, flags);
 
 	cpumask_set_cpu(cpu, policy->cpus);
 	per_cpu(cpufreq_policy_cpu, cpu) = policy->cpu;
 	per_cpu(cpufreq_cpu_data, cpu) = policy;
-	spin_unlock_irqrestore(&cpufreq_driver_lock, flags);
+	write_unlock_irqrestore(&cpufreq_driver_lock, flags);
 
 	unlock_policy_rwsem_write(sibling);
 
@@ -871,15 +871,15 @@ static int cpufreq_add_dev(struct device *dev, struct subsys_interface *sif)
 
 #ifdef CONFIG_HOTPLUG_CPU
 	/* Check if this cpu was hot-unplugged earlier and has siblings */
-	spin_lock_irqsave(&cpufreq_driver_lock, flags);
+	read_lock_irqsave(&cpufreq_driver_lock, flags);
 	for_each_online_cpu(sibling) {
 		struct cpufreq_policy *cp = per_cpu(cpufreq_cpu_data, sibling);
 		if (cp && cpumask_test_cpu(cpu, cp->related_cpus)) {
-			spin_unlock_irqrestore(&cpufreq_driver_lock, flags);
+			read_unlock_irqrestore(&cpufreq_driver_lock, flags);
 			return cpufreq_add_policy_cpu(cpu, sibling, dev);
 		}
 	}
-	spin_unlock_irqrestore(&cpufreq_driver_lock, flags);
+	read_unlock_irqrestore(&cpufreq_driver_lock, flags);
 #endif
 #endif
 
@@ -952,10 +952,10 @@ static int cpufreq_add_dev(struct device *dev, struct subsys_interface *sif)
 	return 0;
 
 err_out_unregister:
-	spin_lock_irqsave(&cpufreq_driver_lock, flags);
+	write_lock_irqsave(&cpufreq_driver_lock, flags);
 	for_each_cpu(j, policy->cpus)
 		per_cpu(cpufreq_cpu_data, j) = NULL;
-	spin_unlock_irqrestore(&cpufreq_driver_lock, flags);
+	write_unlock_irqrestore(&cpufreq_driver_lock, flags);
 
 	kobject_put(&policy->kobj);
 	wait_for_completion(&policy->kobj_unregister);
@@ -1008,12 +1008,12 @@ static int __cpufreq_remove_dev(struct device *dev, struct subsys_interface *sif
 
 	pr_debug("%s: unregistering CPU %u\n", __func__, cpu);
 
-	spin_lock_irqsave(&cpufreq_driver_lock, flags);
+	write_lock_irqsave(&cpufreq_driver_lock, flags);
 
 	data = per_cpu(cpufreq_cpu_data, cpu);
 	per_cpu(cpufreq_cpu_data, cpu) = NULL;
 
-	spin_unlock_irqrestore(&cpufreq_driver_lock, flags);
+	write_unlock_irqrestore(&cpufreq_driver_lock, flags);
 
 	if (!data) {
 		pr_debug("%s: No cpu_data found\n", __func__);
@@ -1047,9 +1047,9 @@ static int __cpufreq_remove_dev(struct device *dev, struct subsys_interface *sif
 			WARN_ON(lock_policy_rwsem_write(cpu));
 			cpumask_set_cpu(cpu, data->cpus);
 
-			spin_lock_irqsave(&cpufreq_driver_lock, flags);
+			write_lock_irqsave(&cpufreq_driver_lock, flags);
 			per_cpu(cpufreq_cpu_data, cpu) = data;
-			spin_unlock_irqrestore(&cpufreq_driver_lock, flags);
+			write_unlock_irqrestore(&cpufreq_driver_lock, flags);
 
 			unlock_policy_rwsem_write(cpu);
 
@@ -1848,13 +1848,13 @@ int cpufreq_register_driver(struct cpufreq_driver *driver_data)
 	if (driver_data->setpolicy)
 		driver_data->flags |= CPUFREQ_CONST_LOOPS;
 
-	spin_lock_irqsave(&cpufreq_driver_lock, flags);
+	write_lock_irqsave(&cpufreq_driver_lock, flags);
 	if (cpufreq_driver) {
-		spin_unlock_irqrestore(&cpufreq_driver_lock, flags);
+		write_unlock_irqrestore(&cpufreq_driver_lock, flags);
 		return -EBUSY;
 	}
 	cpufreq_driver = driver_data;
-	spin_unlock_irqrestore(&cpufreq_driver_lock, flags);
+	write_unlock_irqrestore(&cpufreq_driver_lock, flags);
 
 	ret = subsys_interface_register(&cpufreq_interface);
 	if (ret)
@@ -1886,9 +1886,9 @@ int cpufreq_register_driver(struct cpufreq_driver *driver_data)
 err_if_unreg:
 	subsys_interface_unregister(&cpufreq_interface);
 err_null_driver:
-	spin_lock_irqsave(&cpufreq_driver_lock, flags);
+	write_lock_irqsave(&cpufreq_driver_lock, flags);
 	cpufreq_driver = NULL;
-	spin_unlock_irqrestore(&cpufreq_driver_lock, flags);
+	write_unlock_irqrestore(&cpufreq_driver_lock, flags);
 	return ret;
 }
 EXPORT_SYMBOL_GPL(cpufreq_register_driver);
@@ -1914,9 +1914,9 @@ int cpufreq_unregister_driver(struct cpufreq_driver *driver)
 	subsys_interface_unregister(&cpufreq_interface);
 	unregister_hotcpu_notifier(&cpufreq_cpu_notifier);
 
-	spin_lock_irqsave(&cpufreq_driver_lock, flags);
+	write_lock_irqsave(&cpufreq_driver_lock, flags);
 	cpufreq_driver = NULL;
-	spin_unlock_irqrestore(&cpufreq_driver_lock, flags);
+	write_unlock_irqrestore(&cpufreq_driver_lock, flags);
 
 	return 0;
 }
-- 
1.8.1.2


^ permalink raw reply related	[flat|nested] 55+ messages in thread

* [PATCH v4 2/2] cpufreq: Convert the cpufreq_driver_lock to use the rcu
  2013-02-22 16:24                       ` [PATCH v4 0/2] cpufreq: cpufreq_driver_lock is hot on large systems Nathan Zimmer
  2013-02-22 16:24                         ` [PATCH v4 1/2] cpufreq: Convert the cpufreq_driver_lock to a rwlock Nathan Zimmer
@ 2013-02-22 16:24                         ` Nathan Zimmer
  2013-02-23  3:39                           ` Viresh Kumar
  2013-03-11 23:23                         ` [PATCH v4 0/2] cpufreq: cpufreq_driver_lock is hot on large systems Rafael J. Wysocki
  2 siblings, 1 reply; 55+ messages in thread
From: Nathan Zimmer @ 2013-02-22 16:24 UTC (permalink / raw)
  To: viresh.kumar, rjw; +Cc: cpufreq, linux-pm, linux-kernel, Nathan Zimmer

In general rwlocks are discouraged, so we are moving the lock to use RCU
instead.  This does require a bit of care since the cpufreq_driver_lock
protects both the cpufreq_driver and the cpufreq_cpu_data array.
Also, since many of the function pointers on cpufreq_driver may sleep when
called, we have to grab them under rcu_read_lock() but call them after
rcu_read_unlock().
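
The "grab the pointer inside, call it outside" pattern described above can be
sketched in userspace as follows. This is an illustration only: the demo_*
names are hypothetical, a depth counter stands in for
rcu_read_lock()/rcu_read_unlock(), and the NULL check on the driver is an
addition of the sketch, not something taken from the patch.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical stand-ins for the cpufreq policy and driver structures. */
struct demo_policy {
	unsigned int cpu;
};

struct demo_driver {
	int (*target)(struct demo_policy *policy, unsigned int freq);
};

static _Atomic(struct demo_driver *) demo_driver_ptr;

/* Stand-in for the read-side section: > 0 means "inside". */
static atomic_int demo_read_depth;

/* Depth observed by the callback, so a caller can check where it ran. */
static int demo_depth_seen = -1;

static void demo_read_lock(void)
{
	atomic_fetch_add(&demo_read_depth, 1);
}

static void demo_read_unlock(void)
{
	atomic_fetch_sub(&demo_read_depth, 1);
}

/* Example callback; in the kernel this is the kind of call that may sleep. */
int demo_target(struct demo_policy *policy, unsigned int freq)
{
	demo_depth_seen = atomic_load(&demo_read_depth);
	return (int)(policy->cpu + freq);
}

int demo_driver_target(struct demo_policy *policy, unsigned int freq)
{
	int (*target)(struct demo_policy *policy, unsigned int freq) = NULL;
	struct demo_driver *drv;

	/* Snapshot the function pointer inside the read-side section ... */
	demo_read_lock();
	drv = atomic_load(&demo_driver_ptr);
	if (drv)
		target = drv->target;
	demo_read_unlock();

	/* ... and only call it once the section has ended. */
	if (!target)
		return -1;
	return target(policy, freq);
}
```

Once the read-side section has ended, RCU by itself no longer guarantees that
the driver stays registered while the callback runs; the sketch does not model
that, and something else (for example a module reference) has to cover it.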

Cc: Viresh Kumar <viresh.kumar@linaro.org>
Cc: "Rafael J. Wysocki" <rjw@sisk.pl>
Signed-off-by: Nathan Zimmer <nzimmer@sgi.com>
---
 drivers/cpufreq/cpufreq.c | 305 +++++++++++++++++++++++++++++++++-------------
 1 file changed, 217 insertions(+), 88 deletions(-)

diff --git a/drivers/cpufreq/cpufreq.c b/drivers/cpufreq/cpufreq.c
index c5996fe..c0e90f3 100644
--- a/drivers/cpufreq/cpufreq.c
+++ b/drivers/cpufreq/cpufreq.c
@@ -39,13 +39,13 @@
  * level driver of CPUFreq support, and its spinlock. This lock
  * also protects the cpufreq_cpu_data array.
  */
-static struct cpufreq_driver *cpufreq_driver;
+static struct cpufreq_driver __rcu *cpufreq_driver;
 static DEFINE_PER_CPU(struct cpufreq_policy *, cpufreq_cpu_data);
 #ifdef CONFIG_HOTPLUG_CPU
 /* This one keeps track of the previously set governor of a removed CPU */
 static DEFINE_PER_CPU(char[CPUFREQ_NAME_LEN], cpufreq_cpu_governor);
 #endif
-static DEFINE_RWLOCK(cpufreq_driver_lock);
+static DEFINE_SPINLOCK(cpufreq_driver_lock);
 
 /*
  * cpu_policy_rwsem is a per CPU reader-writer semaphore designed to cure
@@ -131,18 +131,19 @@ static DEFINE_MUTEX(cpufreq_governor_mutex);
 static struct cpufreq_policy *__cpufreq_cpu_get(unsigned int cpu, bool sysfs)
 {
 	struct cpufreq_policy *data;
-	unsigned long flags;
+	struct cpufreq_driver *driver;
 
 	if (cpu >= nr_cpu_ids)
 		goto err_out;
 
 	/* get the cpufreq driver */
-	read_lock_irqsave(&cpufreq_driver_lock, flags);
+	rcu_read_lock();
+	driver = rcu_dereference(cpufreq_driver);
 
-	if (!cpufreq_driver)
+	if (!driver)
 		goto err_out_unlock;
 
-	if (!try_module_get(cpufreq_driver->owner))
+	if (!try_module_get(driver->owner))
 		goto err_out_unlock;
 
 
@@ -155,13 +156,13 @@ static struct cpufreq_policy *__cpufreq_cpu_get(unsigned int cpu, bool sysfs)
 	if (!sysfs && !kobject_get(&data->kobj))
 		goto err_out_put_module;
 
-	read_unlock_irqrestore(&cpufreq_driver_lock, flags);
+	rcu_read_unlock();
 	return data;
 
 err_out_put_module:
-	module_put(cpufreq_driver->owner);
+	module_put(driver->owner);
 err_out_unlock:
-	read_unlock_irqrestore(&cpufreq_driver_lock, flags);
+	rcu_read_unlock();
 err_out:
 	return NULL;
 }
@@ -184,7 +185,9 @@ static void __cpufreq_cpu_put(struct cpufreq_policy *data, bool sysfs)
 {
 	if (!sysfs)
 		kobject_put(&data->kobj);
-	module_put(cpufreq_driver->owner);
+	rcu_read_lock();
+	module_put(rcu_dereference(cpufreq_driver)->owner);
+	rcu_read_unlock();
 }
 
 void cpufreq_cpu_put(struct cpufreq_policy *data)
@@ -255,20 +258,19 @@ static inline void adjust_jiffies(unsigned long val, struct cpufreq_freqs *ci)
 void cpufreq_notify_transition(struct cpufreq_freqs *freqs, unsigned int state)
 {
 	struct cpufreq_policy *policy;
-	unsigned long flags;
 
 	BUG_ON(irqs_disabled());
 
 	if (cpufreq_disabled())
 		return;
 
-	freqs->flags = cpufreq_driver->flags;
 	pr_debug("notification %u of frequency transition to %u kHz\n",
 		state, freqs->new);
 
-	read_lock_irqsave(&cpufreq_driver_lock, flags);
+	rcu_read_lock();
+	freqs->flags = rcu_dereference(cpufreq_driver)->flags;
 	policy = per_cpu(cpufreq_cpu_data, freqs->cpu);
-	read_unlock_irqrestore(&cpufreq_driver_lock, flags);
+	rcu_read_unlock();
 
 	switch (state) {
 
@@ -277,7 +279,7 @@ void cpufreq_notify_transition(struct cpufreq_freqs *freqs, unsigned int state)
 		 * which is not equal to what the cpufreq core thinks is
 		 * "old frequency".
 		 */
-		if (!(cpufreq_driver->flags & CPUFREQ_CONST_LOOPS)) {
+		if (!(freqs->flags & CPUFREQ_CONST_LOOPS)) {
 			if ((policy) && (policy->cpu == freqs->cpu) &&
 			    (policy->cur) && (policy->cur != freqs->old)) {
 				pr_debug("Warning: CPU frequency is"
@@ -329,11 +331,23 @@ static int cpufreq_parse_governor(char *str_governor, unsigned int *policy,
 				struct cpufreq_governor **governor)
 {
 	int err = -EINVAL;
-
-	if (!cpufreq_driver)
+	struct cpufreq_driver *driver;
+	int (*setpolicy)(struct cpufreq_policy *policy);
+	int (*target)(struct cpufreq_policy *policy,
+		      unsigned int target_freq,
+		      unsigned int relation);
+
+	rcu_read_lock();
+	driver = rcu_dereference(cpufreq_driver);
+	if (!driver) {
+		rcu_read_unlock();
 		goto out;
+	}
+	setpolicy = driver->setpolicy;
+	target = driver->target;
+	rcu_read_unlock();
 
-	if (cpufreq_driver->setpolicy) {
+	if (setpolicy) {
 		if (!strnicmp(str_governor, "performance", CPUFREQ_NAME_LEN)) {
 			*policy = CPUFREQ_POLICY_PERFORMANCE;
 			err = 0;
@@ -342,7 +356,7 @@ static int cpufreq_parse_governor(char *str_governor, unsigned int *policy,
 			*policy = CPUFREQ_POLICY_POWERSAVE;
 			err = 0;
 		}
-	} else if (cpufreq_driver->target) {
+	} else if (target) {
 		struct cpufreq_governor *t;
 
 		mutex_lock(&cpufreq_governor_mutex);
@@ -493,7 +507,11 @@ static ssize_t store_scaling_governor(struct cpufreq_policy *policy,
  */
 static ssize_t show_scaling_driver(struct cpufreq_policy *policy, char *buf)
 {
-	return scnprintf(buf, CPUFREQ_NAME_PLEN, "%s\n", cpufreq_driver->name);
+	char *name;
+	rcu_read_lock();
+	name = rcu_dereference(cpufreq_driver)->name;
+	rcu_read_unlock();
+	return scnprintf(buf, CPUFREQ_NAME_PLEN, "%s\n", name);
 }
 
 /**
@@ -505,10 +523,13 @@ static ssize_t show_scaling_available_governors(struct cpufreq_policy *policy,
 	ssize_t i = 0;
 	struct cpufreq_governor *t;
 
-	if (!cpufreq_driver->target) {
+	rcu_read_lock();
+	if (!rcu_dereference(cpufreq_driver)->target) {
+		rcu_read_unlock();
 		i += sprintf(buf, "performance powersave");
 		goto out;
 	}
+	rcu_read_unlock();
 
 	list_for_each_entry(t, &cpufreq_governor_list, governor_list) {
 		if (i >= (ssize_t) ((PAGE_SIZE / sizeof(char))
@@ -587,8 +608,14 @@ static ssize_t show_bios_limit(struct cpufreq_policy *policy, char *buf)
 {
 	unsigned int limit;
 	int ret;
-	if (cpufreq_driver->bios_limit) {
-		ret = cpufreq_driver->bios_limit(policy->cpu, &limit);
+	int (*bios_limit)(int cpu, unsigned int *limit);
+
+	rcu_read_lock();
+	bios_limit = rcu_dereference(cpufreq_driver)->bios_limit;
+	rcu_read_unlock();
+
+	if (bios_limit) {
+		ret = bios_limit(policy->cpu, &limit);
 		if (!ret)
 			return sprintf(buf, "%u\n", limit);
 	}
@@ -730,6 +757,8 @@ static int cpufreq_add_dev_interface(unsigned int cpu,
 				     struct device *dev)
 {
 	struct cpufreq_policy new_policy;
+	struct cpufreq_driver *driver;
+	int (*exit)(struct cpufreq_policy *policy);
 	struct freq_attr **drv_attr;
 	unsigned long flags;
 	int ret = 0;
@@ -742,35 +771,38 @@ static int cpufreq_add_dev_interface(unsigned int cpu,
 		return ret;
 
 	/* set up files for this cpu device */
-	drv_attr = cpufreq_driver->attr;
+	rcu_read_lock();
+	driver = rcu_dereference(cpufreq_driver);
+	drv_attr = driver->attr;
 	while ((drv_attr) && (*drv_attr)) {
 		ret = sysfs_create_file(&policy->kobj, &((*drv_attr)->attr));
 		if (ret)
 			goto err_out_kobj_put;
 		drv_attr++;
 	}
-	if (cpufreq_driver->get) {
+	if (driver->get) {
 		ret = sysfs_create_file(&policy->kobj, &cpuinfo_cur_freq.attr);
 		if (ret)
 			goto err_out_kobj_put;
 	}
-	if (cpufreq_driver->target) {
+	if (driver->target) {
 		ret = sysfs_create_file(&policy->kobj, &scaling_cur_freq.attr);
 		if (ret)
 			goto err_out_kobj_put;
 	}
-	if (cpufreq_driver->bios_limit) {
+	if (driver->bios_limit) {
 		ret = sysfs_create_file(&policy->kobj, &bios_limit.attr);
 		if (ret)
 			goto err_out_kobj_put;
 	}
+	rcu_read_unlock();
 
-	write_lock_irqsave(&cpufreq_driver_lock, flags);
+	spin_lock_irqsave(&cpufreq_driver_lock, flags);
 	for_each_cpu(j, policy->cpus) {
 		per_cpu(cpufreq_cpu_data, j) = policy;
 		per_cpu(cpufreq_policy_cpu, j) = policy->cpu;
 	}
-	write_unlock_irqrestore(&cpufreq_driver_lock, flags);
+	spin_unlock_irqrestore(&cpufreq_driver_lock, flags);
 
 	ret = cpufreq_add_dev_symlink(cpu, policy);
 	if (ret)
@@ -787,8 +819,11 @@ static int cpufreq_add_dev_interface(unsigned int cpu,
 
 	if (ret) {
 		pr_debug("setting policy failed\n");
-		if (cpufreq_driver->exit)
-			cpufreq_driver->exit(policy);
+		rcu_read_lock();
+		exit = rcu_dereference(cpufreq_driver)->exit;
+		rcu_read_unlock();
+		if (exit)
+			exit(policy);
 	}
 	return ret;
 
@@ -813,12 +848,12 @@ static int cpufreq_add_policy_cpu(unsigned int cpu, unsigned int sibling,
 
 	lock_policy_rwsem_write(sibling);
 
-	write_lock_irqsave(&cpufreq_driver_lock, flags);
+	spin_lock_irqsave(&cpufreq_driver_lock, flags);
 
 	cpumask_set_cpu(cpu, policy->cpus);
 	per_cpu(cpufreq_policy_cpu, cpu) = policy->cpu;
 	per_cpu(cpufreq_cpu_data, cpu) = policy;
-	write_unlock_irqrestore(&cpufreq_driver_lock, flags);
+	spin_unlock_irqrestore(&cpufreq_driver_lock, flags);
 
 	unlock_policy_rwsem_write(sibling);
 
@@ -849,6 +884,10 @@ static int cpufreq_add_dev(struct device *dev, struct subsys_interface *sif)
 	unsigned int j, cpu = dev->id;
 	int ret = -ENOMEM;
 	struct cpufreq_policy *policy;
+	struct cpufreq_driver *driver;
+	int (*init)(struct cpufreq_policy *policy);
+	struct module *owner;
+
 	unsigned long flags;
 #ifdef CONFIG_HOTPLUG_CPU
 	struct cpufreq_governor *gov;
@@ -869,21 +908,24 @@ static int cpufreq_add_dev(struct device *dev, struct subsys_interface *sif)
 		return 0;
 	}
 
+	rcu_read_lock();
 #ifdef CONFIG_HOTPLUG_CPU
 	/* Check if this cpu was hot-unplugged earlier and has siblings */
-	read_lock_irqsave(&cpufreq_driver_lock, flags);
 	for_each_online_cpu(sibling) {
 		struct cpufreq_policy *cp = per_cpu(cpufreq_cpu_data, sibling);
 		if (cp && cpumask_test_cpu(cpu, cp->related_cpus)) {
-			read_unlock_irqrestore(&cpufreq_driver_lock, flags);
+			rcu_read_unlock();
 			return cpufreq_add_policy_cpu(cpu, sibling, dev);
 		}
 	}
-	read_unlock_irqrestore(&cpufreq_driver_lock, flags);
 #endif
 #endif
+	driver = rcu_dereference(cpufreq_driver);
+	init = driver->init;
+	owner = driver->owner;
+	rcu_read_unlock();
 
-	if (!try_module_get(cpufreq_driver->owner)) {
+	if (!try_module_get(owner)) {
 		ret = -EINVAL;
 		goto module_out;
 	}
@@ -911,7 +953,7 @@ static int cpufreq_add_dev(struct device *dev, struct subsys_interface *sif)
 	/* call driver. From then on the cpufreq must be able
 	 * to accept all calls to ->verify and ->setpolicy for this CPU
 	 */
-	ret = cpufreq_driver->init(policy);
+	ret = init(policy);
 	if (ret) {
 		pr_debug("initialization failed\n");
 		goto err_set_policy_cpu;
@@ -946,16 +988,16 @@ static int cpufreq_add_dev(struct device *dev, struct subsys_interface *sif)
 		goto err_out_unregister;
 
 	kobject_uevent(&policy->kobj, KOBJ_ADD);
-	module_put(cpufreq_driver->owner);
+	module_put(owner);
 	pr_debug("initialization complete\n");
 
 	return 0;
 
 err_out_unregister:
-	write_lock_irqsave(&cpufreq_driver_lock, flags);
+	spin_lock_irqsave(&cpufreq_driver_lock, flags);
 	for_each_cpu(j, policy->cpus)
 		per_cpu(cpufreq_cpu_data, j) = NULL;
-	write_unlock_irqrestore(&cpufreq_driver_lock, flags);
+	spin_unlock_irqrestore(&cpufreq_driver_lock, flags);
 
 	kobject_put(&policy->kobj);
 	wait_for_completion(&policy->kobj_unregister);
@@ -968,7 +1010,7 @@ err_free_cpumask:
 err_free_policy:
 	kfree(policy);
 nomem_out:
-	module_put(cpufreq_driver->owner);
+	module_put(owner);
 module_out:
 	return ret;
 }
@@ -1002,29 +1044,45 @@ static int __cpufreq_remove_dev(struct device *dev, struct subsys_interface *sif
 	unsigned int cpu = dev->id, ret, cpus;
 	unsigned long flags;
 	struct cpufreq_policy *data;
+	struct cpufreq_driver *driver;
+	int (*target)(struct cpufreq_policy *policy,
+		       unsigned int target_freq,
+		       unsigned int relation);
+	int (*exit)(struct cpufreq_policy *policy);
+#ifdef CONFIG_HOTPLUG_CPU
+	int (*setpolicy)(struct cpufreq_policy *policy);
+#endif
 	struct kobject *kobj;
 	struct completion *cmp;
 	struct device *cpu_dev;
 
 	pr_debug("%s: unregistering CPU %u\n", __func__, cpu);
 
-	write_lock_irqsave(&cpufreq_driver_lock, flags);
+	spin_lock_irqsave(&cpufreq_driver_lock, flags);
 
 	data = per_cpu(cpufreq_cpu_data, cpu);
 	per_cpu(cpufreq_cpu_data, cpu) = NULL;
-
-	write_unlock_irqrestore(&cpufreq_driver_lock, flags);
+	spin_unlock_irqrestore(&cpufreq_driver_lock, flags);
 
 	if (!data) {
 		pr_debug("%s: No cpu_data found\n", __func__);
 		return -EINVAL;
 	}
 
-	if (cpufreq_driver->target)
+	rcu_read_lock();
+	driver = rcu_dereference(cpufreq_driver);
+	target = driver->target;
+	exit = driver->exit;
+#ifdef CONFIG_HOTPLUG_CPU
+	setpolicy = driver->setpolicy;
+#endif
+	rcu_read_unlock();
+
+	if (target)
 		__cpufreq_governor(data, CPUFREQ_GOV_STOP);
 
 #ifdef CONFIG_HOTPLUG_CPU
-	if (!cpufreq_driver->setpolicy)
+	if (!setpolicy)
 		strncpy(per_cpu(cpufreq_cpu_governor, cpu),
 			data->governor->name, CPUFREQ_NAME_LEN);
 #endif
@@ -1047,9 +1105,9 @@ static int __cpufreq_remove_dev(struct device *dev, struct subsys_interface *sif
 			WARN_ON(lock_policy_rwsem_write(cpu));
 			cpumask_set_cpu(cpu, data->cpus);
 
-			write_lock_irqsave(&cpufreq_driver_lock, flags);
+			spin_lock_irqsave(&cpufreq_driver_lock, flags);
 			per_cpu(cpufreq_cpu_data, cpu) = data;
-			write_unlock_irqrestore(&cpufreq_driver_lock, flags);
+			spin_unlock_irqrestore(&cpufreq_driver_lock, flags);
 
 			unlock_policy_rwsem_write(cpu);
 
@@ -1084,13 +1142,13 @@ static int __cpufreq_remove_dev(struct device *dev, struct subsys_interface *sif
 		wait_for_completion(cmp);
 		pr_debug("wait complete\n");
 
-		if (cpufreq_driver->exit)
-			cpufreq_driver->exit(data);
+		if (exit)
+			exit(data);
 
 		free_cpumask_var(data->related_cpus);
 		free_cpumask_var(data->cpus);
 		kfree(data);
-	} else if (cpufreq_driver->target) {
+	} else if (target) {
 		__cpufreq_governor(data, CPUFREQ_GOV_START);
 		__cpufreq_governor(data, CPUFREQ_GOV_LIMITS);
 	}
@@ -1157,10 +1215,18 @@ static void cpufreq_out_of_sync(unsigned int cpu, unsigned int old_freq,
 unsigned int cpufreq_quick_get(unsigned int cpu)
 {
 	struct cpufreq_policy *policy;
+	struct cpufreq_driver *driver;
+	unsigned int (*get)(unsigned int cpu);
 	unsigned int ret_freq = 0;
 
-	if (cpufreq_driver && cpufreq_driver->setpolicy && cpufreq_driver->get)
-		return cpufreq_driver->get(cpu);
+	rcu_read_lock();
+	driver = rcu_dereference(cpufreq_driver);
+	if (driver && driver->setpolicy && driver->get) {
+		get = driver->get;
+		rcu_read_unlock();
+		return get(cpu);
+	}
+	rcu_read_unlock();
 
 	policy = cpufreq_cpu_get(cpu);
 	if (policy) {
@@ -1197,14 +1263,23 @@ static unsigned int __cpufreq_get(unsigned int cpu)
 {
 	struct cpufreq_policy *policy = per_cpu(cpufreq_cpu_data, cpu);
 	unsigned int ret_freq = 0;
+	struct cpufreq_driver *driver;
+	unsigned int (*get)(unsigned int cpu);
+	u8 flags;
 
-	if (!cpufreq_driver->get)
+	rcu_read_lock();
+	driver = rcu_dereference(cpufreq_driver);
+	get = driver->get;
+	flags = driver->flags;
+	rcu_read_unlock();
+
+	if (!get)
 		return ret_freq;
 
-	ret_freq = cpufreq_driver->get(cpu);
+	ret_freq = get(cpu);
 
 	if (ret_freq && policy->cur &&
-		!(cpufreq_driver->flags & CPUFREQ_CONST_LOOPS)) {
+		!(flags & CPUFREQ_CONST_LOOPS)) {
 		/* verify no discrepancy between actual and
 					saved value exists */
 		if (unlikely(ret_freq != policy->cur)) {
@@ -1261,6 +1336,7 @@ static struct subsys_interface cpufreq_interface = {
 static int cpufreq_bp_suspend(void)
 {
 	int ret = 0;
+	int (*suspend)(struct cpufreq_policy *policy);
 
 	int cpu = smp_processor_id();
 	struct cpufreq_policy *cpu_policy;
@@ -1272,8 +1348,11 @@ static int cpufreq_bp_suspend(void)
 	if (!cpu_policy)
 		return 0;
 
-	if (cpufreq_driver->suspend) {
-		ret = cpufreq_driver->suspend(cpu_policy);
+	rcu_read_lock();
+	suspend = rcu_dereference(cpufreq_driver)->suspend;
+	rcu_read_unlock();
+	if (suspend) {
+		ret = suspend(cpu_policy);
 		if (ret)
 			printk(KERN_ERR "cpufreq: suspend failed in ->suspend "
 					"step on CPU %u\n", cpu_policy->cpu);
@@ -1299,6 +1378,7 @@ static int cpufreq_bp_suspend(void)
 static void cpufreq_bp_resume(void)
 {
 	int ret = 0;
+	int (*resume)(struct cpufreq_policy *policy);
 
 	int cpu = smp_processor_id();
 	struct cpufreq_policy *cpu_policy;
@@ -1310,8 +1390,11 @@ static void cpufreq_bp_resume(void)
 	if (!cpu_policy)
 		return;
 
-	if (cpufreq_driver->resume) {
-		ret = cpufreq_driver->resume(cpu_policy);
+	rcu_read_lock();
+	resume = rcu_dereference(cpufreq_driver)->resume;
+	rcu_read_unlock();
+	if (resume) {
+		ret = resume(cpu_policy);
 		if (ret) {
 			printk(KERN_ERR "cpufreq: resume failed in ->resume "
 					"step on CPU %u\n", cpu_policy->cpu);
@@ -1338,10 +1421,16 @@ static struct syscore_ops cpufreq_syscore_ops = {
  */
 const char *cpufreq_get_current_driver(void)
 {
-	if (cpufreq_driver)
-		return cpufreq_driver->name;
+	char *name = NULL;
+	struct cpufreq_driver *driver;
 
-	return NULL;
+	rcu_read_lock();
+	driver = rcu_dereference(cpufreq_driver);
+	if (driver)
+		name = driver->name;
+	rcu_read_unlock();
+
+	return name;
 }
 EXPORT_SYMBOL_GPL(cpufreq_get_current_driver);
 
@@ -1435,6 +1524,9 @@ int __cpufreq_driver_target(struct cpufreq_policy *policy,
 {
 	int retval = -EINVAL;
 	unsigned int old_target_freq = target_freq;
+	int (*target)(struct cpufreq_policy *policy,
+		      unsigned int target_freq,
+		      unsigned int relation);
 
 	if (cpufreq_disabled())
 		return -ENODEV;
@@ -1451,8 +1543,11 @@ int __cpufreq_driver_target(struct cpufreq_policy *policy,
 	if (target_freq == policy->cur)
 		return 0;
 
-	if (cpufreq_driver->target)
-		retval = cpufreq_driver->target(policy, target_freq, relation);
+	rcu_read_lock();
+	target = rcu_dereference(cpufreq_driver)->target;
+	rcu_read_unlock();
+	if (target)
+		retval = target(policy, target_freq, relation);
 
 	return retval;
 }
@@ -1485,18 +1580,24 @@ EXPORT_SYMBOL_GPL(cpufreq_driver_target);
 int __cpufreq_driver_getavg(struct cpufreq_policy *policy, unsigned int cpu)
 {
 	int ret = 0;
+	unsigned int (*getavg)(struct cpufreq_policy *policy,
+			       unsigned int cpu);
 
 	if (cpufreq_disabled())
 		return ret;
 
-	if (!cpufreq_driver->getavg)
+	rcu_read_lock();
+	getavg = rcu_dereference(cpufreq_driver)->getavg;
+	rcu_read_unlock();
+
+	if (!getavg)
 		return 0;
 
 	policy = cpufreq_cpu_get(policy->cpu);
 	if (!policy)
 		return -EINVAL;
 
-	ret = cpufreq_driver->getavg(policy, cpu);
+	ret = getavg(policy, cpu);
 
 	cpufreq_cpu_put(policy);
 	return ret;
@@ -1652,6 +1753,9 @@ static int __cpufreq_set_policy(struct cpufreq_policy *data,
 				struct cpufreq_policy *policy)
 {
 	int ret = 0;
+	struct cpufreq_driver *driver;
+	int (*verify)(struct cpufreq_policy *policy);
+	int (*setpolicy)(struct cpufreq_policy *policy);
 
 	pr_debug("setting new policy for CPU %u: %u - %u kHz\n", policy->cpu,
 		policy->min, policy->max);
@@ -1664,8 +1768,14 @@ static int __cpufreq_set_policy(struct cpufreq_policy *data,
 		goto error_out;
 	}
 
+	rcu_read_lock();
+	driver = rcu_dereference(cpufreq_driver);
+	verify = driver->verify;
+	setpolicy = driver->setpolicy;
+	rcu_read_unlock();
+
 	/* verify the cpu speed can be set within this limit */
-	ret = cpufreq_driver->verify(policy);
+	ret = verify(policy);
 	if (ret)
 		goto error_out;
 
@@ -1679,7 +1789,7 @@ static int __cpufreq_set_policy(struct cpufreq_policy *data,
 
 	/* verify the cpu speed can be set within this limit,
 	   which might be different to the first one */
-	ret = cpufreq_driver->verify(policy);
+	ret = verify(policy);
 	if (ret)
 		goto error_out;
 
@@ -1693,10 +1803,10 @@ static int __cpufreq_set_policy(struct cpufreq_policy *data,
 	pr_debug("new min and max freqs are %u - %u kHz\n",
 					data->min, data->max);
 
-	if (cpufreq_driver->setpolicy) {
+	if (setpolicy) {
 		data->policy = policy->policy;
 		pr_debug("setting range\n");
-		ret = cpufreq_driver->setpolicy(policy);
+		ret = setpolicy(policy);
 	} else {
 		if (policy->governor != data->governor) {
 			/* save old, working values */
@@ -1743,6 +1853,11 @@ int cpufreq_update_policy(unsigned int cpu)
 {
 	struct cpufreq_policy *data = cpufreq_cpu_get(cpu);
 	struct cpufreq_policy policy;
+	struct cpufreq_driver *driver;
+	unsigned int (*get)(unsigned int cpu);
+	int (*target)(struct cpufreq_policy *policy,
+		      unsigned int target_freq,
+		      unsigned int relation);
 	int ret;
 
 	if (!data) {
@@ -1762,15 +1877,21 @@ int cpufreq_update_policy(unsigned int cpu)
 	policy.policy = data->user_policy.policy;
 	policy.governor = data->user_policy.governor;
 
+	rcu_read_lock();
+	driver = rcu_dereference(cpufreq_driver);
+	get = driver->get;
+	target = driver->target;
+	rcu_read_unlock();
+
 	/* BIOS might change freq behind our back
 	  -> ask driver for current freq and notify governors about a change */
-	if (cpufreq_driver->get) {
-		policy.cur = cpufreq_driver->get(cpu);
+	if (get) {
+		policy.cur = get(cpu);
 		if (!data->cur) {
 			pr_debug("Driver did not initialize current freq");
 			data->cur = policy.cur;
 		} else {
-			if (data->cur != policy.cur && cpufreq_driver->target)
+			if (data->cur != policy.cur && target)
 				cpufreq_out_of_sync(cpu, data->cur,
 								policy.cur);
 		}
@@ -1848,19 +1969,20 @@ int cpufreq_register_driver(struct cpufreq_driver *driver_data)
 	if (driver_data->setpolicy)
 		driver_data->flags |= CPUFREQ_CONST_LOOPS;
 
-	write_lock_irqsave(&cpufreq_driver_lock, flags);
-	if (cpufreq_driver) {
-		write_unlock_irqrestore(&cpufreq_driver_lock, flags);
+	spin_lock_irqsave(&cpufreq_driver_lock, flags);
+	if (rcu_access_pointer(cpufreq_driver)) {
+		spin_unlock_irqrestore(&cpufreq_driver_lock, flags);
 		return -EBUSY;
 	}
-	cpufreq_driver = driver_data;
-	write_unlock_irqrestore(&cpufreq_driver_lock, flags);
+	rcu_assign_pointer(cpufreq_driver, driver_data);
+	spin_unlock_irqrestore(&cpufreq_driver_lock, flags);
+	synchronize_rcu();
 
 	ret = subsys_interface_register(&cpufreq_interface);
 	if (ret)
 		goto err_null_driver;
 
-	if (!(cpufreq_driver->flags & CPUFREQ_STICKY)) {
+	if (!(driver_data->flags & CPUFREQ_STICKY)) {
 		int i;
 		ret = -ENODEV;
 
@@ -1886,9 +2008,10 @@ int cpufreq_register_driver(struct cpufreq_driver *driver_data)
 err_if_unreg:
 	subsys_interface_unregister(&cpufreq_interface);
 err_null_driver:
-	write_lock_irqsave(&cpufreq_driver_lock, flags);
-	cpufreq_driver = NULL;
-	write_unlock_irqrestore(&cpufreq_driver_lock, flags);
+	spin_lock_irqsave(&cpufreq_driver_lock, flags);
+	rcu_assign_pointer(cpufreq_driver, NULL);
+	spin_unlock_irqrestore(&cpufreq_driver_lock, flags);
+	synchronize_rcu();
 	return ret;
 }
 EXPORT_SYMBOL_GPL(cpufreq_register_driver);
@@ -1905,8 +2028,13 @@ EXPORT_SYMBOL_GPL(cpufreq_register_driver);
 int cpufreq_unregister_driver(struct cpufreq_driver *driver)
 {
 	unsigned long flags;
+	struct cpufreq_driver *old_driver;
+
+	rcu_read_lock();
+	old_driver = rcu_access_pointer(cpufreq_driver);
+	rcu_read_unlock();
 
-	if (!cpufreq_driver || (driver != cpufreq_driver))
+	if (!old_driver || (driver != old_driver))
 		return -EINVAL;
 
 	pr_debug("unregistering driver %s\n", driver->name);
@@ -1914,9 +2042,10 @@ int cpufreq_unregister_driver(struct cpufreq_driver *driver)
 	subsys_interface_unregister(&cpufreq_interface);
 	unregister_hotcpu_notifier(&cpufreq_cpu_notifier);
 
-	write_lock_irqsave(&cpufreq_driver_lock, flags);
-	cpufreq_driver = NULL;
-	write_unlock_irqrestore(&cpufreq_driver_lock, flags);
+	spin_lock_irqsave(&cpufreq_driver_lock, flags);
+	rcu_assign_pointer(cpufreq_driver, NULL);
+	spin_unlock_irqrestore(&cpufreq_driver_lock, flags);
+	synchronize_rcu();
 
 	return 0;
 }
-- 
1.8.1.2



* Re: [PATCH v4 2/2] cpufreq: Convert the cpufreq_driver_lock to use the rcu
  2013-02-22 16:24                         ` [PATCH v4 2/2] cpufreq: Convert the cpufreq_driver_lock to use the rcu Nathan Zimmer
@ 2013-02-23  3:39                           ` Viresh Kumar
  2013-02-25 20:07                             ` Nathan Zimmer
  0 siblings, 1 reply; 55+ messages in thread
From: Viresh Kumar @ 2013-02-23  3:39 UTC (permalink / raw)
  To: Nathan Zimmer; +Cc: rjw, cpufreq, linux-pm, linux-kernel

Hi Nathan,

Sorry for pointing this out so late, but I still feel we are missing something
really important.

On 22 February 2013 21:54, Nathan Zimmer <nzimmer@sgi.com> wrote:

> -       read_lock_irqsave(&cpufreq_driver_lock, flags);
> +       rcu_read_lock();
> +       freqs->flags = rcu_dereference(cpufreq_driver)->flags;
>         policy = per_cpu(cpufreq_cpu_data, freqs->cpu);
> -       read_unlock_irqrestore(&cpufreq_driver_lock, flags);
> +       rcu_read_unlock();

> -       write_lock_irqsave(&cpufreq_driver_lock, flags);
> +       spin_lock_irqsave(&cpufreq_driver_lock, flags);
>         for_each_cpu(j, policy->cpus) {
>                 per_cpu(cpufreq_cpu_data, j) = policy;
>                 per_cpu(cpufreq_policy_cpu, j) = policy->cpu;
>         }
> -       write_unlock_irqrestore(&cpufreq_driver_lock, flags);
> +       spin_unlock_irqrestore(&cpufreq_driver_lock, flags);

Look at how we are protecting cpufreq_cpu_data here. rcu_read_[un]lock()
only marks the start/end of a read-side critical section. How can we be sure
that cpufreq_cpu_data is not read while we are updating it?

The RCU lock/unlock only protects the cpufreq_driver pointer, not
this data. We still need the same locking for cpufreq_cpu_data.

What do you say?


* Re: [PATCH v4 1/2] cpufreq: Convert the cpufreq_driver_lock to a rwlock
  2013-02-22 16:24                         ` [PATCH v4 1/2] cpufreq: Convert the cpufreq_driver_lock to a rwlock Nathan Zimmer
@ 2013-02-23  3:57                           ` Viresh Kumar
  0 siblings, 0 replies; 55+ messages in thread
From: Viresh Kumar @ 2013-02-23  3:57 UTC (permalink / raw)
  To: Nathan Zimmer; +Cc: rjw, cpufreq, linux-pm, linux-kernel

On 22 February 2013 21:54, Nathan Zimmer <nzimmer@sgi.com> wrote:
> This eliminates the contention I am seeing in __cpufreq_cpu_get.
> It also nicely stages the lock to be replaced by the rcu.
>
> Cc: Viresh Kumar <viresh.kumar@linaro.org>
> Cc: "Rafael J. Wysocki" <rjw@sisk.pl>
> Signed-off-by: Nathan Zimmer <nzimmer@sgi.com>

Acked-by: Viresh Kumar <viresh.kumar@linaro.org>

@Rafael: I am too bored of seeing this patch again and again :)
Can we get this applied alone for now?


* Re: [PATCH v4 2/2] cpufreq: Convert the cpufreq_driver_lock to use the rcu
  2013-02-23  3:39                           ` Viresh Kumar
@ 2013-02-25 20:07                             ` Nathan Zimmer
  0 siblings, 0 replies; 55+ messages in thread
From: Nathan Zimmer @ 2013-02-25 20:07 UTC (permalink / raw)
  To: Viresh Kumar; +Cc: rjw, cpufreq, linux-pm, linux-kernel

On 02/22/2013 09:39 PM, Viresh Kumar wrote:
> Hi Nathan,
>
> Sorry for pointing this out so late, but I still feel we are missing something
> really important.
>
> On 22 February 2013 21:54, Nathan Zimmer <nzimmer@sgi.com> wrote:
>
>> -       read_lock_irqsave(&cpufreq_driver_lock, flags);
>> +       rcu_read_lock();
>> +       freqs->flags = rcu_dereference(cpufreq_driver)->flags;
>>          policy = per_cpu(cpufreq_cpu_data, freqs->cpu);
>> -       read_unlock_irqrestore(&cpufreq_driver_lock, flags);
>> +       rcu_read_unlock();
>> -       write_lock_irqsave(&cpufreq_driver_lock, flags);
>> +       spin_lock_irqsave(&cpufreq_driver_lock, flags);
>>          for_each_cpu(j, policy->cpus) {
>>                  per_cpu(cpufreq_cpu_data, j) = policy;
>>                  per_cpu(cpufreq_policy_cpu, j) = policy->cpu;
>>          }
>> -       write_unlock_irqrestore(&cpufreq_driver_lock, flags);
>> +       spin_unlock_irqrestore(&cpufreq_driver_lock, flags);
> Look at how we are protecting cpufreq_cpu_data here. rcu_read_[un]lock()
> only marks the start/end of a read-side critical section. How can we be sure
> that cpufreq_cpu_data is not read while we are updating it?
>
> The RCU lock/unlock only protects the cpufreq_driver pointer, not
> this data. We still need the same locking for cpufreq_cpu_data.
>
> What do you say?

That would mean putting the lock back around __cpufreq_cpu_get.
But I do think you're right.

Perhaps a better approach at this point is to have one lock for
cpufreq_cpu_data, and a second lock combined with RCU to protect cpufreq_driver.
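
The split described above can be sketched in userspace C. This is only an
illustration under stated assumptions, not kernel code: a C11 atomic pointer
stands in for the RCU-protected cpufreq_driver, a pthread mutex stands in for
the separate data spinlock, and every demo_* name and NR_CPUS_DEMO is
hypothetical. An acquire load does not provide RCU's grace-period guarantees;
it only models the pointer read.

```c
#include <pthread.h>
#include <stdatomic.h>
#include <stddef.h>

#define NR_CPUS_DEMO 4	/* hypothetical table size for the sketch */

struct demo_driver { const char *name; unsigned int flags; };
struct demo_policy { unsigned int cpu; unsigned int cur; };

/* Stand-in for the RCU-protected cpufreq_driver pointer. */
static _Atomic(struct demo_driver *) demo_driver;

/* Stand-in for cpufreq_cpu_data, guarded by its own lock. */
static struct demo_policy *demo_cpu_data[NR_CPUS_DEMO];
static pthread_mutex_t demo_data_lock = PTHREAD_MUTEX_INITIALIZER;

/* Publish a driver; models rcu_assign_pointer(). */
static void demo_publish_driver(struct demo_driver *drv)
{
	atomic_store_explicit(&demo_driver, drv, memory_order_release);
}

/* Read a driver field without touching the data lock. */
static unsigned int demo_driver_flags(void)
{
	struct demo_driver *drv =
		atomic_load_explicit(&demo_driver, memory_order_acquire);

	return drv ? drv->flags : 0;
}

/* Updates of the per-CPU table take the data lock... */
static void demo_set_policy(unsigned int cpu, struct demo_policy *policy)
{
	if (cpu >= NR_CPUS_DEMO)
		return;
	pthread_mutex_lock(&demo_data_lock);
	demo_cpu_data[cpu] = policy;
	pthread_mutex_unlock(&demo_data_lock);
}

/* ...and so do reads, because the pointer protection does not cover them. */
static struct demo_policy *demo_get_policy(unsigned int cpu)
{
	struct demo_policy *policy;

	if (cpu >= NR_CPUS_DEMO)
		return NULL;
	pthread_mutex_lock(&demo_data_lock);
	policy = demo_cpu_data[cpu];
	pthread_mutex_unlock(&demo_data_lock);
	return policy;
}
```

The point of the split is that demo_driver_flags() never contends on
demo_data_lock, while readers and writers of the table still exclude each
other.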




* Re: [PATCH v4 0/2] cpufreq: cpufreq_driver_lock is hot on large systems
  2013-02-22 16:24                       ` [PATCH v4 0/2] cpufreq: cpufreq_driver_lock is hot on large systems Nathan Zimmer
  2013-02-22 16:24                         ` [PATCH v4 1/2] cpufreq: Convert the cpufreq_driver_lock to a rwlock Nathan Zimmer
  2013-02-22 16:24                         ` [PATCH v4 2/2] cpufreq: Convert the cpufreq_driver_lock to use the rcu Nathan Zimmer
@ 2013-03-11 23:23                         ` Rafael J. Wysocki
  2013-03-13 20:50                           ` Nathan Zimmer
  2 siblings, 1 reply; 55+ messages in thread
From: Rafael J. Wysocki @ 2013-03-11 23:23 UTC (permalink / raw)
  To: Nathan Zimmer; +Cc: viresh.kumar, cpufreq, linux-pm, linux-kernel

On Friday, February 22, 2013 10:24:33 AM Nathan Zimmer wrote:
> I am noticing the cpufreq_driver_lock is quite hot.
> On an idle 512 system perf shows me most of the system time is spent on this
> lock.  This is quite significant as top shows 5% of time in system time.
> My solution was to first convert the lock to a rwlock and then to the rcu.
> 
> v2: Rebase
> 
> v3: Read the RCU documentation instead of skimming it.  Also I based on 
> git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm.git pm+acpi-3.9-rc1
> I assumed that was what you would prefer Rafael.
> 
> v4: Removed an unnecessary synchronize_rcu().
> 
> 
> Nathan Zimmer (2):
>   cpufreq: Convert the cpufreq_driver_lock to a rwlock
>   cpufreq: Convert the cpufreq_driver_lock to use the rcu
> 
>  drivers/cpufreq/cpufreq.c | 286 ++++++++++++++++++++++++++++++++++------------
>  1 file changed, 211 insertions(+), 75 deletions(-)

I'm going to take patch [1/2] for v3.10, but patch [2/2] still needs some
work it seems.  Is that correct?  If so, are you going to send an update?

Rafael


-- 
I speak only for myself.
Rafael J. Wysocki, Intel Open Source Technology Center.


* Re: [PATCH v4 0/2] cpufreq: cpufreq_driver_lock is hot on large systems
  2013-03-11 23:23                         ` [PATCH v4 0/2] cpufreq: cpufreq_driver_lock is hot on large systems Rafael J. Wysocki
@ 2013-03-13 20:50                           ` Nathan Zimmer
  2013-04-01 15:33                             ` [PATCH v5] cpufreq: split the cpufreq_driver_lock and use the rcu (was cpufreq: cpufreq_driver_lock is hot on large systems) Nathan Zimmer
  0 siblings, 1 reply; 55+ messages in thread
From: Nathan Zimmer @ 2013-03-13 20:50 UTC (permalink / raw)
  To: Rafael J. Wysocki; +Cc: viresh.kumar, cpufreq, linux-pm, linux-kernel

On 03/11/2013 06:23 PM, Rafael J. Wysocki wrote:
> On Friday, February 22, 2013 10:24:33 AM Nathan Zimmer wrote:
>> I am noticing the cpufreq_driver_lock is quite hot.
>> On an idle 512 system perf shows me most of the system time is spent on this
>> lock.  This is quite significant as top shows 5% of time in system time.
>> My solution was to first convert the lock to a rwlock and then to the rcu.
>>
>> v2: Rebase
>>
>> v3: Read the RCU documentation instead of skimming it.  Also I based on
>> git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm.git pm+acpi-3.9-rc1
>> I assumed that was what you would prefer Rafael.
>>
>> v4: Removed an unnecessary synchronize_rcu().
>>
>>
>> Nathan Zimmer (2):
>>    cpufreq: Convert the cpufreq_driver_lock to a rwlock
>>    cpufreq: Convert the cpufreq_driver_lock to use the rcu
>>
>>   drivers/cpufreq/cpufreq.c | 286 ++++++++++++++++++++++++++++++++++------------
>>   1 file changed, 211 insertions(+), 75 deletions(-)
> I'm going to take patch [1/2] for v3.10, but patch [2/2] still needs some
> work it seems.  Is that correct?  If so, are you going to send an update?
>
> Rafael
>

Viresh pointed out that cpufreq_cpu_data still needs a lock.
This means placing a vanilla spinlock back into __cpufreq_cpu_get, which
is what I need to avoid.  I haven't yet had the time to sort that out.


Nate




* [PATCH v5] cpufreq: split the cpufreq_driver_lock and use the rcu (was cpufreq: cpufreq_driver_lock is hot on large systems)
  2013-03-13 20:50                           ` Nathan Zimmer
@ 2013-04-01 15:33                             ` Nathan Zimmer
  2013-04-01 16:28                               ` Viresh Kumar
  0 siblings, 1 reply; 55+ messages in thread
From: Nathan Zimmer @ 2013-04-01 15:33 UTC (permalink / raw)
  To: viresh.kumar, rjw; +Cc: cpufreq, linux-pm, linux-kernel, Nathan Zimmer

The cpufreq_driver_lock is hot with some configs.
This lock covers both cpufreq_driver and cpufreq_cpu_data, so part one of the
proposed fix is to split up the lock a bit.
cpufreq_cpu_data is now covered by the cpufreq_data_lock.
cpufreq_driver is now covered by the cpufreq_driver_lock and RCU.

This means that the cpufreq_driver_lock is no longer hot.
Some measurable heat remains on the cpufreq_data_lock, but it is significantly
less than previously measured.
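
As a rough userspace sketch of the driver-pointer side of this scheme:
writers serialize on a lock and publish the pointer, while readers load the
pointer and snapshot the callback they need without taking that lock. The
assumptions are loud ones: a pthread mutex models cpufreq_driver_lock, a C11
atomic pointer models the __rcu pointer, all demo_* names are hypothetical,
and no grace period (synchronize_rcu()) or module lifetime is modelled.

```c
#include <errno.h>
#include <pthread.h>
#include <stdatomic.h>
#include <stddef.h>

struct demo_driver {
	const char *name;
	unsigned int (*get)(unsigned int cpu);
};

/* Stand-in for the __rcu cpufreq_driver pointer and its writer lock. */
static _Atomic(struct demo_driver *) demo_driver;
static pthread_mutex_t demo_driver_lock = PTHREAD_MUTEX_INITIALIZER;

/* Hypothetical callback used only to exercise the sketch. */
static unsigned int demo_fixed_get(unsigned int cpu)
{
	return 1000 + cpu;
}

/* Writers serialize on the lock; only one driver may be registered. */
static int demo_register_driver(struct demo_driver *drv)
{
	int ret = 0;

	pthread_mutex_lock(&demo_driver_lock);
	if (atomic_load_explicit(&demo_driver, memory_order_relaxed))
		ret = -EBUSY;
	else	/* models rcu_assign_pointer() */
		atomic_store_explicit(&demo_driver, drv, memory_order_release);
	pthread_mutex_unlock(&demo_driver_lock);
	return ret;
}

static int demo_unregister_driver(struct demo_driver *drv)
{
	int ret = 0;

	pthread_mutex_lock(&demo_driver_lock);
	if (!drv || atomic_load_explicit(&demo_driver, memory_order_relaxed) != drv)
		ret = -EINVAL;
	else
		atomic_store_explicit(&demo_driver, NULL, memory_order_release);
	pthread_mutex_unlock(&demo_driver_lock);
	/* The kernel code would wait for readers here (synchronize_rcu()). */
	return ret;
}

/* Readers never take demo_driver_lock: load the pointer, copy the callback. */
static unsigned int demo_get_freq(unsigned int cpu)
{
	struct demo_driver *drv =
		atomic_load_explicit(&demo_driver, memory_order_acquire);
	unsigned int (*get)(unsigned int) = drv ? drv->get : NULL;

	return get ? get(cpu) : 0;
}
```

Whether a callback may safely be invoked after the read-side section ends
depends on how long the driver is guaranteed to stay alive, which this sketch
does not model.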

Cc: Viresh Kumar <viresh.kumar@linaro.org>
Cc: "Rafael J. Wysocki" <rjw@sisk.pl>
Signed-off-by: Nathan Zimmer <nzimmer@sgi.com>
---
 drivers/cpufreq/cpufreq.c | 305 +++++++++++++++++++++++++++++++++-------------
 1 file changed, 222 insertions(+), 83 deletions(-)

diff --git a/drivers/cpufreq/cpufreq.c b/drivers/cpufreq/cpufreq.c
index b02824d..387a5f8 100644
--- a/drivers/cpufreq/cpufreq.c
+++ b/drivers/cpufreq/cpufreq.c
@@ -39,13 +39,15 @@
  * level driver of CPUFreq support, and its spinlock. This lock
  * also protects the cpufreq_cpu_data array.
  */
-static struct cpufreq_driver *cpufreq_driver;
+static struct cpufreq_driver __rcu *cpufreq_driver;
+static DEFINE_SPINLOCK(cpufreq_driver_lock);
+
+static DEFINE_SPINLOCK(cpufreq_data_lock);
 static DEFINE_PER_CPU(struct cpufreq_policy *, cpufreq_cpu_data);
 #ifdef CONFIG_HOTPLUG_CPU
 /* This one keeps track of the previously set governor of a removed CPU */
 static DEFINE_PER_CPU(char[CPUFREQ_NAME_LEN], cpufreq_cpu_governor);
 #endif
-static DEFINE_SPINLOCK(cpufreq_driver_lock);
 
 /*
  * cpu_policy_rwsem is a per CPU reader-writer semaphore designed to cure
@@ -131,22 +133,24 @@ static DEFINE_MUTEX(cpufreq_governor_mutex);
 static struct cpufreq_policy *__cpufreq_cpu_get(unsigned int cpu, bool sysfs)
 {
 	struct cpufreq_policy *data;
+	struct cpufreq_driver *driver;
 	unsigned long flags;
 
 	if (cpu >= nr_cpu_ids)
 		goto err_out;
 
 	/* get the cpufreq driver */
-	spin_lock_irqsave(&cpufreq_driver_lock, flags);
+	rcu_read_lock();
+	driver = rcu_dereference(cpufreq_driver);
 
-	if (!cpufreq_driver)
+	if (!driver)
 		goto err_out_unlock;
 
-	if (!try_module_get(cpufreq_driver->owner))
+	if (!try_module_get(driver->owner))
 		goto err_out_unlock;
 
-
 	/* get the CPU */
+	spin_lock_irqsave(&cpufreq_data_lock, flags);
 	data = per_cpu(cpufreq_cpu_data, cpu);
 
 	if (!data)
@@ -155,13 +159,15 @@ static struct cpufreq_policy *__cpufreq_cpu_get(unsigned int cpu, bool sysfs)
 	if (!sysfs && !kobject_get(&data->kobj))
 		goto err_out_put_module;
 
-	spin_unlock_irqrestore(&cpufreq_driver_lock, flags);
+	spin_unlock_irqrestore(&cpufreq_data_lock, flags);
+	rcu_read_unlock();
 	return data;
 
 err_out_put_module:
-	module_put(cpufreq_driver->owner);
+	module_put(driver->owner);
+	spin_unlock_irqrestore(&cpufreq_data_lock, flags);
 err_out_unlock:
-	spin_unlock_irqrestore(&cpufreq_driver_lock, flags);
+	rcu_read_unlock();
 err_out:
 	return NULL;
 }
@@ -184,7 +190,9 @@ static void __cpufreq_cpu_put(struct cpufreq_policy *data, bool sysfs)
 {
 	if (!sysfs)
 		kobject_put(&data->kobj);
-	module_put(cpufreq_driver->owner);
+	rcu_read_lock();
+	module_put(rcu_dereference(cpufreq_driver)->owner);
+	rcu_read_unlock();
 }
 
 void cpufreq_cpu_put(struct cpufreq_policy *data)
@@ -262,13 +270,15 @@ void cpufreq_notify_transition(struct cpufreq_freqs *freqs, unsigned int state)
 	if (cpufreq_disabled())
 		return;
 
-	freqs->flags = cpufreq_driver->flags;
+	rcu_read_lock();
+	freqs->flags = rcu_dereference(cpufreq_driver)->flags;
+	rcu_read_unlock();
 	pr_debug("notification %u of frequency transition to %u kHz\n",
 		state, freqs->new);
 
-	spin_lock_irqsave(&cpufreq_driver_lock, flags);
+	spin_lock_irqsave(&cpufreq_data_lock, flags);
 	policy = per_cpu(cpufreq_cpu_data, freqs->cpu);
-	spin_unlock_irqrestore(&cpufreq_driver_lock, flags);
+	spin_unlock_irqrestore(&cpufreq_data_lock, flags);
 
 	switch (state) {
 
@@ -277,7 +287,7 @@ void cpufreq_notify_transition(struct cpufreq_freqs *freqs, unsigned int state)
 		 * which is not equal to what the cpufreq core thinks is
 		 * "old frequency".
 		 */
-		if (!(cpufreq_driver->flags & CPUFREQ_CONST_LOOPS)) {
+		if (!(freqs->flags & CPUFREQ_CONST_LOOPS)) {
 			if ((policy) && (policy->cpu == freqs->cpu) &&
 			    (policy->cur) && (policy->cur != freqs->old)) {
 				pr_debug("Warning: CPU frequency is"
@@ -329,11 +339,23 @@ static int cpufreq_parse_governor(char *str_governor, unsigned int *policy,
 				struct cpufreq_governor **governor)
 {
 	int err = -EINVAL;
-
-	if (!cpufreq_driver)
+	struct cpufreq_driver *driver;
+	int (*setpolicy)(struct cpufreq_policy *policy);
+	int (*target)(struct cpufreq_policy *policy,
+		      unsigned int target_freq,
+		      unsigned int relation);
+
+	rcu_read_lock();
+	driver = rcu_dereference(cpufreq_driver);
+	if (!driver) {
+		rcu_read_unlock();
 		goto out;
+	}
+	setpolicy = driver->setpolicy;
+	target = driver->target;
+	rcu_read_unlock();
 
-	if (cpufreq_driver->setpolicy) {
+	if (setpolicy) {
 		if (!strnicmp(str_governor, "performance", CPUFREQ_NAME_LEN)) {
 			*policy = CPUFREQ_POLICY_PERFORMANCE;
 			err = 0;
@@ -342,7 +364,7 @@ static int cpufreq_parse_governor(char *str_governor, unsigned int *policy,
 			*policy = CPUFREQ_POLICY_POWERSAVE;
 			err = 0;
 		}
-	} else if (cpufreq_driver->target) {
+	} else if (target) {
 		struct cpufreq_governor *t;
 
 		mutex_lock(&cpufreq_governor_mutex);
@@ -493,7 +515,11 @@ static ssize_t store_scaling_governor(struct cpufreq_policy *policy,
  */
 static ssize_t show_scaling_driver(struct cpufreq_policy *policy, char *buf)
 {
-	return scnprintf(buf, CPUFREQ_NAME_PLEN, "%s\n", cpufreq_driver->name);
+	char *name;
+	rcu_read_lock();
+	name = rcu_dereference(cpufreq_driver)->name;
+	rcu_read_unlock();
+	return scnprintf(buf, CPUFREQ_NAME_PLEN, "%s\n", name);
 }
 
 /**
@@ -505,10 +531,13 @@ static ssize_t show_scaling_available_governors(struct cpufreq_policy *policy,
 	ssize_t i = 0;
 	struct cpufreq_governor *t;
 
-	if (!cpufreq_driver->target) {
+	rcu_read_lock();
+	if (!rcu_dereference(cpufreq_driver)->target) {
 		i += sprintf(buf, "performance powersave");
+		rcu_read_unlock();
 		goto out;
 	}
+	rcu_read_unlock();
 
 	list_for_each_entry(t, &cpufreq_governor_list, governor_list) {
 		if (i >= (ssize_t) ((PAGE_SIZE / sizeof(char))
@@ -586,9 +615,15 @@ static ssize_t show_scaling_setspeed(struct cpufreq_policy *policy, char *buf)
 static ssize_t show_bios_limit(struct cpufreq_policy *policy, char *buf)
 {
 	unsigned int limit;
+	int (*bios_limit)(int cpu, unsigned int *limit);
 	int ret;
-	if (cpufreq_driver->bios_limit) {
-		ret = cpufreq_driver->bios_limit(policy->cpu, &limit);
+
+	rcu_read_lock();
+	bios_limit = rcu_dereference(cpufreq_driver)->bios_limit;
+	rcu_read_unlock();
+
+	if (bios_limit) {
+		ret = bios_limit(policy->cpu, &limit);
 		if (!ret)
 			return sprintf(buf, "%u\n", limit);
 	}
@@ -731,6 +766,8 @@ static int cpufreq_add_dev_interface(unsigned int cpu,
 {
 	struct cpufreq_policy new_policy;
 	struct freq_attr **drv_attr;
+	struct cpufreq_driver *driver;
+	int (*exit)(struct cpufreq_policy *policy);
 	unsigned long flags;
 	int ret = 0;
 	unsigned int j;
@@ -742,35 +779,38 @@ static int cpufreq_add_dev_interface(unsigned int cpu,
 		return ret;
 
 	/* set up files for this cpu device */
-	drv_attr = cpufreq_driver->attr;
+	rcu_read_lock();
+	driver = rcu_dereference(cpufreq_driver);
+	drv_attr = driver->attr;
 	while ((drv_attr) && (*drv_attr)) {
 		ret = sysfs_create_file(&policy->kobj, &((*drv_attr)->attr));
 		if (ret)
-			goto err_out_kobj_put;
+			goto err_out_unlock;
 		drv_attr++;
 	}
-	if (cpufreq_driver->get) {
+	if (driver->get) {
 		ret = sysfs_create_file(&policy->kobj, &cpuinfo_cur_freq.attr);
 		if (ret)
-			goto err_out_kobj_put;
+			goto err_out_unlock;
 	}
-	if (cpufreq_driver->target) {
+	if (driver->target) {
 		ret = sysfs_create_file(&policy->kobj, &scaling_cur_freq.attr);
 		if (ret)
-			goto err_out_kobj_put;
+			goto err_out_unlock;
 	}
-	if (cpufreq_driver->bios_limit) {
+	if (driver->bios_limit) {
 		ret = sysfs_create_file(&policy->kobj, &bios_limit.attr);
 		if (ret)
-			goto err_out_kobj_put;
+			goto err_out_unlock;
 	}
+	rcu_read_unlock();
 
-	spin_lock_irqsave(&cpufreq_driver_lock, flags);
+	spin_lock_irqsave(&cpufreq_data_lock, flags);
 	for_each_cpu(j, policy->cpus) {
 		per_cpu(cpufreq_cpu_data, j) = policy;
 		per_cpu(cpufreq_policy_cpu, j) = policy->cpu;
 	}
-	spin_unlock_irqrestore(&cpufreq_driver_lock, flags);
+	spin_unlock_irqrestore(&cpufreq_data_lock, flags);
 
 	ret = cpufreq_add_dev_symlink(cpu, policy);
 	if (ret)
@@ -787,11 +827,17 @@ static int cpufreq_add_dev_interface(unsigned int cpu,
 
 	if (ret) {
 		pr_debug("setting policy failed\n");
-		if (cpufreq_driver->exit)
-			cpufreq_driver->exit(policy);
+		rcu_read_lock();
+		exit = rcu_dereference(cpufreq_driver)->exit;
+		if (exit)
+			exit(policy);
+		rcu_read_unlock();
+
 	}
 	return ret;
 
+err_out_unlock:
+	rcu_read_unlock();
 err_out_kobj_put:
 	kobject_put(&policy->kobj);
 	wait_for_completion(&policy->kobj_unregister);
@@ -813,12 +859,12 @@ static int cpufreq_add_policy_cpu(unsigned int cpu, unsigned int sibling,
 
 	lock_policy_rwsem_write(sibling);
 
-	spin_lock_irqsave(&cpufreq_driver_lock, flags);
+	spin_lock_irqsave(&cpufreq_data_lock, flags);
 
 	cpumask_set_cpu(cpu, policy->cpus);
 	per_cpu(cpufreq_policy_cpu, cpu) = policy->cpu;
 	per_cpu(cpufreq_cpu_data, cpu) = policy;
-	spin_unlock_irqrestore(&cpufreq_driver_lock, flags);
+	spin_unlock_irqrestore(&cpufreq_data_lock, flags);
 
 	unlock_policy_rwsem_write(sibling);
 
@@ -849,6 +895,8 @@ static int cpufreq_add_dev(struct device *dev, struct subsys_interface *sif)
 	unsigned int j, cpu = dev->id;
 	int ret = -ENOMEM;
 	struct cpufreq_policy *policy;
+	struct cpufreq_driver *driver;
+	int (*init)(struct cpufreq_policy *policy);
 	unsigned long flags;
 #ifdef CONFIG_HOTPLUG_CPU
 	struct cpufreq_governor *gov;
@@ -871,22 +919,27 @@ static int cpufreq_add_dev(struct device *dev, struct subsys_interface *sif)
 
 #ifdef CONFIG_HOTPLUG_CPU
 	/* Check if this cpu was hot-unplugged earlier and has siblings */
-	spin_lock_irqsave(&cpufreq_driver_lock, flags);
+	spin_lock_irqsave(&cpufreq_data_lock, flags);
 	for_each_online_cpu(sibling) {
 		struct cpufreq_policy *cp = per_cpu(cpufreq_cpu_data, sibling);
 		if (cp && cpumask_test_cpu(cpu, cp->related_cpus)) {
-			spin_unlock_irqrestore(&cpufreq_driver_lock, flags);
+			spin_unlock_irqrestore(&cpufreq_data_lock, flags);
 			return cpufreq_add_policy_cpu(cpu, sibling, dev);
 		}
 	}
-	spin_unlock_irqrestore(&cpufreq_driver_lock, flags);
+	spin_unlock_irqrestore(&cpufreq_data_lock, flags);
 #endif
 #endif
 
-	if (!try_module_get(cpufreq_driver->owner)) {
+	rcu_read_lock();
+	driver = rcu_dereference(cpufreq_driver);
+	if (!try_module_get(driver->owner)) {
+		rcu_read_unlock();
 		ret = -EINVAL;
 		goto module_out;
 	}
+	init = driver->init;
+	rcu_read_unlock();
 
 	policy = kzalloc(sizeof(struct cpufreq_policy), GFP_KERNEL);
 	if (!policy)
@@ -911,7 +964,7 @@ static int cpufreq_add_dev(struct device *dev, struct subsys_interface *sif)
 	/* call driver. From then on the cpufreq must be able
 	 * to accept all calls to ->verify and ->setpolicy for this CPU
 	 */
-	ret = cpufreq_driver->init(policy);
+	ret = init(policy);
 	if (ret) {
 		pr_debug("initialization failed\n");
 		goto err_set_policy_cpu;
@@ -946,16 +999,18 @@ static int cpufreq_add_dev(struct device *dev, struct subsys_interface *sif)
 		goto err_out_unregister;
 
 	kobject_uevent(&policy->kobj, KOBJ_ADD);
-	module_put(cpufreq_driver->owner);
+	rcu_read_lock();
+	module_put(rcu_dereference(cpufreq_driver)->owner);
+	rcu_read_unlock();
 	pr_debug("initialization complete\n");
 
 	return 0;
 
 err_out_unregister:
-	spin_lock_irqsave(&cpufreq_driver_lock, flags);
+	spin_lock_irqsave(&cpufreq_data_lock, flags);
 	for_each_cpu(j, policy->cpus)
 		per_cpu(cpufreq_cpu_data, j) = NULL;
-	spin_unlock_irqrestore(&cpufreq_driver_lock, flags);
+	spin_unlock_irqrestore(&cpufreq_data_lock, flags);
 
 	kobject_put(&policy->kobj);
 	wait_for_completion(&policy->kobj_unregister);
@@ -968,7 +1023,9 @@ err_free_cpumask:
 err_free_policy:
 	kfree(policy);
 nomem_out:
-	module_put(cpufreq_driver->owner);
+	rcu_read_lock();
+	module_put(rcu_dereference(cpufreq_driver)->owner);
+	rcu_read_unlock();
 module_out:
 	return ret;
 }
@@ -1002,32 +1059,42 @@ static int __cpufreq_remove_dev(struct device *dev, struct subsys_interface *sif
 	unsigned int cpu = dev->id, ret, cpus;
 	unsigned long flags;
 	struct cpufreq_policy *data;
+	struct cpufreq_driver *driver;
 	struct kobject *kobj;
 	struct completion *cmp;
 	struct device *cpu_dev;
+	int (*target)(struct cpufreq_policy *policy,
+		       unsigned int target_freq,
+		       unsigned int relation);
+	int (*exit)(struct cpufreq_policy *policy);
 
 	pr_debug("%s: unregistering CPU %u\n", __func__, cpu);
 
-	spin_lock_irqsave(&cpufreq_driver_lock, flags);
+	spin_lock_irqsave(&cpufreq_data_lock, flags);
 
 	data = per_cpu(cpufreq_cpu_data, cpu);
 	per_cpu(cpufreq_cpu_data, cpu) = NULL;
 
-	spin_unlock_irqrestore(&cpufreq_driver_lock, flags);
+	spin_unlock_irqrestore(&cpufreq_data_lock, flags);
 
 	if (!data) {
 		pr_debug("%s: No cpu_data found\n", __func__);
 		return -EINVAL;
 	}
 
-	if (cpufreq_driver->target)
+	rcu_read_lock();
+	driver = rcu_dereference(cpufreq_driver);
+	target = driver->target;
+	exit = driver->exit;
+	if (target)
 		__cpufreq_governor(data, CPUFREQ_GOV_STOP);
 
 #ifdef CONFIG_HOTPLUG_CPU
-	if (!cpufreq_driver->setpolicy)
+	if (!driver->setpolicy)
 		strncpy(per_cpu(cpufreq_cpu_governor, cpu),
 			data->governor->name, CPUFREQ_NAME_LEN);
 #endif
+	rcu_read_unlock();
 
 	WARN_ON(lock_policy_rwsem_write(cpu));
 	cpus = cpumask_weight(data->cpus);
@@ -1047,9 +1114,9 @@ static int __cpufreq_remove_dev(struct device *dev, struct subsys_interface *sif
 			WARN_ON(lock_policy_rwsem_write(cpu));
 			cpumask_set_cpu(cpu, data->cpus);
 
-			spin_lock_irqsave(&cpufreq_driver_lock, flags);
+			spin_lock_irqsave(&cpufreq_data_lock, flags);
 			per_cpu(cpufreq_cpu_data, cpu) = data;
-			spin_unlock_irqrestore(&cpufreq_driver_lock, flags);
+			spin_unlock_irqrestore(&cpufreq_data_lock, flags);
 
 			unlock_policy_rwsem_write(cpu);
 
@@ -1084,13 +1151,13 @@ static int __cpufreq_remove_dev(struct device *dev, struct subsys_interface *sif
 		wait_for_completion(cmp);
 		pr_debug("wait complete\n");
 
-		if (cpufreq_driver->exit)
-			cpufreq_driver->exit(data);
+		if (exit)
+			exit(data);
 
 		free_cpumask_var(data->related_cpus);
 		free_cpumask_var(data->cpus);
 		kfree(data);
-	} else if (cpufreq_driver->target) {
+	} else if (target) {
 		__cpufreq_governor(data, CPUFREQ_GOV_START);
 		__cpufreq_governor(data, CPUFREQ_GOV_LIMITS);
 	}
@@ -1157,10 +1224,18 @@ static void cpufreq_out_of_sync(unsigned int cpu, unsigned int old_freq,
 unsigned int cpufreq_quick_get(unsigned int cpu)
 {
 	struct cpufreq_policy *policy;
+	struct cpufreq_driver *driver;
+	unsigned int (*get)(unsigned int cpu);
 	unsigned int ret_freq = 0;
 
-	if (cpufreq_driver && cpufreq_driver->setpolicy && cpufreq_driver->get)
-		return cpufreq_driver->get(cpu);
+	rcu_read_lock();
+	driver = rcu_dereference(cpufreq_driver);
+	if (driver && driver->setpolicy && driver->get) {
+		get = driver->get;
+		rcu_read_unlock();
+		return get(cpu);
+	}
+	rcu_read_unlock();
 
 	policy = cpufreq_cpu_get(cpu);
 	if (policy) {
@@ -1196,15 +1271,26 @@ EXPORT_SYMBOL(cpufreq_quick_get_max);
 static unsigned int __cpufreq_get(unsigned int cpu)
 {
 	struct cpufreq_policy *policy = per_cpu(cpufreq_cpu_data, cpu);
+	struct cpufreq_driver *driver;
+	unsigned int (*get)(unsigned int cpu);
 	unsigned int ret_freq = 0;
+	u8 flags;
+
 
-	if (!cpufreq_driver->get)
+	rcu_read_lock();
+	driver = rcu_dereference(cpufreq_driver);
+	if (!driver->get) {
+		rcu_read_unlock();
 		return ret_freq;
+	}
+	flags = driver->flags;
+	get = driver->get;
+	rcu_read_unlock();
 
-	ret_freq = cpufreq_driver->get(cpu);
+	ret_freq = get(cpu);
 
 	if (ret_freq && policy->cur &&
-		!(cpufreq_driver->flags & CPUFREQ_CONST_LOOPS)) {
+		!(flags & CPUFREQ_CONST_LOOPS)) {
 		/* verify no discrepancy between actual and
 					saved value exists */
 		if (unlikely(ret_freq != policy->cur)) {
@@ -1260,6 +1346,7 @@ static struct subsys_interface cpufreq_interface = {
  */
 static int cpufreq_bp_suspend(void)
 {
+	int (*suspend)(struct cpufreq_policy *policy);
 	int ret = 0;
 
 	int cpu = smp_processor_id();
@@ -1272,8 +1359,11 @@ static int cpufreq_bp_suspend(void)
 	if (!cpu_policy)
 		return 0;
 
-	if (cpufreq_driver->suspend) {
-		ret = cpufreq_driver->suspend(cpu_policy);
+	rcu_read_lock();
+	suspend = rcu_dereference(cpufreq_driver)->suspend;
+	rcu_read_unlock();
+	if (suspend) {
+		ret = suspend(cpu_policy);
 		if (ret)
 			printk(KERN_ERR "cpufreq: suspend failed in ->suspend "
 					"step on CPU %u\n", cpu_policy->cpu);
@@ -1299,6 +1389,7 @@ static int cpufreq_bp_suspend(void)
 static void cpufreq_bp_resume(void)
 {
 	int ret = 0;
+	int (*resume)(struct cpufreq_policy *policy);
 
 	int cpu = smp_processor_id();
 	struct cpufreq_policy *cpu_policy;
@@ -1310,8 +1401,12 @@ static void cpufreq_bp_resume(void)
 	if (!cpu_policy)
 		return;
 
-	if (cpufreq_driver->resume) {
-		ret = cpufreq_driver->resume(cpu_policy);
+	rcu_read_lock();
+	resume = rcu_dereference(cpufreq_driver)->resume;
+	rcu_read_unlock();
+
+	if (resume) {
+		ret = resume(cpu_policy);
 		if (ret) {
 			printk(KERN_ERR "cpufreq: resume failed in ->resume "
 					"step on CPU %u\n", cpu_policy->cpu);
@@ -1338,10 +1433,14 @@ static struct syscore_ops cpufreq_syscore_ops = {
  */
 const char *cpufreq_get_current_driver(void)
 {
-	if (cpufreq_driver)
-		return cpufreq_driver->name;
-
-	return NULL;
+	struct cpufreq_driver *driver;
+	const char *name = NULL;
+	rcu_read_lock();
+	driver = rcu_dereference(cpufreq_driver);
+	if (driver)
+		name = driver->name;
+	rcu_read_unlock();
+	return name;
 }
 EXPORT_SYMBOL_GPL(cpufreq_get_current_driver);
 
@@ -1435,6 +1534,9 @@ int __cpufreq_driver_target(struct cpufreq_policy *policy,
 {
 	int retval = -EINVAL;
 	unsigned int old_target_freq = target_freq;
+	int (*target)(struct cpufreq_policy *policy,
+		      unsigned int target_freq,
+		      unsigned int relation);
 
 	if (cpufreq_disabled())
 		return -ENODEV;
@@ -1451,8 +1553,11 @@ int __cpufreq_driver_target(struct cpufreq_policy *policy,
 	if (target_freq == policy->cur)
 		return 0;
 
-	if (cpufreq_driver->target)
-		retval = cpufreq_driver->target(policy, target_freq, relation);
+	rcu_read_lock();
+	target = rcu_dereference(cpufreq_driver)->target;
+	rcu_read_unlock();
+	if (target)
+		retval = target(policy, target_freq, relation);
 
 	return retval;
 }
@@ -1485,18 +1590,24 @@ EXPORT_SYMBOL_GPL(cpufreq_driver_target);
 int __cpufreq_driver_getavg(struct cpufreq_policy *policy, unsigned int cpu)
 {
 	int ret = 0;
+	unsigned int (*getavg)(struct cpufreq_policy *policy,
+			       unsigned int cpu);
 
 	if (cpufreq_disabled())
 		return ret;
 
-	if (!cpufreq_driver->getavg)
+	rcu_read_lock();
+	getavg = rcu_dereference(cpufreq_driver)->getavg;
+	rcu_read_unlock();
+
+	if (!getavg)
 		return 0;
 
 	policy = cpufreq_cpu_get(policy->cpu);
 	if (!policy)
 		return -EINVAL;
 
-	ret = cpufreq_driver->getavg(policy, cpu);
+	ret = getavg(policy, cpu);
 
 	cpufreq_cpu_put(policy);
 	return ret;
@@ -1652,6 +1763,9 @@ static int __cpufreq_set_policy(struct cpufreq_policy *data,
 				struct cpufreq_policy *policy)
 {
 	int ret = 0;
+	struct cpufreq_driver *driver;
+	int (*verify)(struct cpufreq_policy *policy);
+	int (*setpolicy)(struct cpufreq_policy *policy);
 
 	pr_debug("setting new policy for CPU %u: %u - %u kHz\n", policy->cpu,
 		policy->min, policy->max);
@@ -1665,7 +1779,13 @@ static int __cpufreq_set_policy(struct cpufreq_policy *data,
 	}
 
 	/* verify the cpu speed can be set within this limit */
-	ret = cpufreq_driver->verify(policy);
+	rcu_read_lock();
+	driver = rcu_dereference(cpufreq_driver);
+	verify = driver->verify;
+	setpolicy = driver->setpolicy;
+	rcu_read_unlock();
+
+	ret = verify(policy);
 	if (ret)
 		goto error_out;
 
@@ -1679,7 +1799,7 @@ static int __cpufreq_set_policy(struct cpufreq_policy *data,
 
 	/* verify the cpu speed can be set within this limit,
 	   which might be different to the first one */
-	ret = cpufreq_driver->verify(policy);
+	ret = verify(policy);
 	if (ret)
 		goto error_out;
 
@@ -1693,10 +1813,10 @@ static int __cpufreq_set_policy(struct cpufreq_policy *data,
 	pr_debug("new min and max freqs are %u - %u kHz\n",
 					data->min, data->max);
 
-	if (cpufreq_driver->setpolicy) {
+	if (setpolicy) {
 		data->policy = policy->policy;
 		pr_debug("setting range\n");
-		ret = cpufreq_driver->setpolicy(policy);
+		ret = setpolicy(policy);
 	} else {
 		if (policy->governor != data->governor) {
 			/* save old, working values */
@@ -1743,6 +1863,11 @@ int cpufreq_update_policy(unsigned int cpu)
 {
 	struct cpufreq_policy *data = cpufreq_cpu_get(cpu);
 	struct cpufreq_policy policy;
+	struct cpufreq_driver *driver;
+	unsigned int (*get)(unsigned int cpu);
+	int (*target)(struct cpufreq_policy *policy,
+		      unsigned int target_freq,
+		      unsigned int relation);
 	int ret;
 
 	if (!data) {
@@ -1764,13 +1889,18 @@ int cpufreq_update_policy(unsigned int cpu)
 
 	/* BIOS might change freq behind our back
 	  -> ask driver for current freq and notify governors about a change */
-	if (cpufreq_driver->get) {
-		policy.cur = cpufreq_driver->get(cpu);
+	rcu_read_lock();
+	driver = rcu_dereference(cpufreq_driver);
+	get = driver->get;
+	target = driver->target;
+	rcu_read_unlock();
+	if (get) {
+		policy.cur = get(cpu);
 		if (!data->cur) {
 			pr_debug("Driver did not initialize current freq");
 			data->cur = policy.cur;
 		} else {
-			if (data->cur != policy.cur && cpufreq_driver->target)
+			if (data->cur != policy.cur && target)
 				cpufreq_out_of_sync(cpu, data->cur,
 								policy.cur);
 		}
@@ -1849,18 +1979,19 @@ int cpufreq_register_driver(struct cpufreq_driver *driver_data)
 		driver_data->flags |= CPUFREQ_CONST_LOOPS;
 
 	spin_lock_irqsave(&cpufreq_driver_lock, flags);
-	if (cpufreq_driver) {
+	if (rcu_access_pointer(cpufreq_driver)) {
 		spin_unlock_irqrestore(&cpufreq_driver_lock, flags);
 		return -EBUSY;
 	}
-	cpufreq_driver = driver_data;
+	rcu_assign_pointer(cpufreq_driver, driver_data);
 	spin_unlock_irqrestore(&cpufreq_driver_lock, flags);
+	synchronize_rcu();
 
 	ret = subsys_interface_register(&cpufreq_interface);
 	if (ret)
 		goto err_null_driver;
 
-	if (!(cpufreq_driver->flags & CPUFREQ_STICKY)) {
+	if (!(driver_data->flags & CPUFREQ_STICKY)) {
 		int i;
 		ret = -ENODEV;
 
@@ -1887,8 +2018,9 @@ err_if_unreg:
 	subsys_interface_unregister(&cpufreq_interface);
 err_null_driver:
 	spin_lock_irqsave(&cpufreq_driver_lock, flags);
-	cpufreq_driver = NULL;
+	rcu_assign_pointer(cpufreq_driver, NULL);
 	spin_unlock_irqrestore(&cpufreq_driver_lock, flags);
+	synchronize_rcu();
 	return ret;
 }
 EXPORT_SYMBOL_GPL(cpufreq_register_driver);
@@ -1905,9 +2037,15 @@ EXPORT_SYMBOL_GPL(cpufreq_register_driver);
 int cpufreq_unregister_driver(struct cpufreq_driver *driver)
 {
 	unsigned long flags;
+	struct cpufreq_driver *old_driver;
 
-	if (!cpufreq_driver || (driver != cpufreq_driver))
+	rcu_read_lock();
+	old_driver = rcu_access_pointer(cpufreq_driver);
+	if (!old_driver || (driver != old_driver)) {
+		rcu_read_unlock();
 		return -EINVAL;
+	}
+	rcu_read_unlock();
 
 	pr_debug("unregistering driver %s\n", driver->name);
 
@@ -1917,6 +2055,7 @@ int cpufreq_unregister_driver(struct cpufreq_driver *driver)
 	spin_lock_irqsave(&cpufreq_driver_lock, flags);
 	cpufreq_driver = NULL;
 	spin_unlock_irqrestore(&cpufreq_driver_lock, flags);
+	synchronize_rcu();
 
 	return 0;
 }
-- 
1.8.1.2


^ permalink raw reply related	[flat|nested] 55+ messages in thread

* Re: [PATCH v5] cpufreq: split the cpufreq_driver_lock and use the rcu (was cpufreq: cpufreq_driver_lock is hot on large systems)
  2013-04-01 15:33                             ` [PATCH v5] cpufreq: split the cpufreq_driver_lock and use the rcu (was cpufreq: cpufreq_driver_lock is hot on large systems) Nathan Zimmer
@ 2013-04-01 16:28                               ` Viresh Kumar
  2013-04-01 17:17                                 ` Nathan Zimmer
  0 siblings, 1 reply; 55+ messages in thread
From: Viresh Kumar @ 2013-04-01 16:28 UTC (permalink / raw)
  To: Nathan Zimmer; +Cc: rjw, cpufreq, linux-pm, linux-kernel

Hi Nathan,

Welcome back :)

On 1 April 2013 21:03, Nathan Zimmer <nzimmer@sgi.com> wrote:

You need to resend this patch, as we don't want the current mail subject as the
commit subject. You could have used the area after the three dashes "---" inside
the commit for logs which you don't want to commit.

> The cpufreq_driver_lock is hot with some configs.
> This lock covers both cpufreq_driver and cpufreq_cpu_data so part one of the

s/ so/, so/

> proposed fix is to split up the lock abit.

s/abit/a bit/

What's the other part?

> cpufreq_cpu_data is now covered by the cpufreq_data_lock.
> cpufreq_driver is now covered by the cpufreq_driver lock and the rcu.
>
> This means that the cpufreq_driver_lock is no longer hot.
> There remains some measurable heat on the cpufreq_data_lock it is significantly

s/it/but it/

> less then previous measured though.
>
> Cc: Viresh Kumar <viresh.kumar@linaro.org>
> Cc: "Rafael J. Wysocki" <rjw@sisk.pl>
> Signed-off-by: Nathan Zimmer <nzimmer@sgi.com>
> ---
>  drivers/cpufreq/cpufreq.c | 305 +++++++++++++++++++++++++++++++++-------------
>  1 file changed, 222 insertions(+), 83 deletions(-)
>
> diff --git a/drivers/cpufreq/cpufreq.c b/drivers/cpufreq/cpufreq.c

> @@ -329,11 +339,23 @@ static int cpufreq_parse_governor(char *str_governor, unsigned int *policy,
>                                 struct cpufreq_governor **governor)
>  {
>         int err = -EINVAL;
> -
> -       if (!cpufreq_driver)
> +       struct cpufreq_driver *driver;
> +       int (*setpolicy)(struct cpufreq_policy *policy);
> +       int (*target)(struct cpufreq_policy *policy,
> +                     unsigned int target_freq,
> +                     unsigned int relation);

You can keep bools here instead of complex function pointers.
setpolicy_supported and target_supported
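
A minimal userspace sketch of the suggestion above, assuming hypothetical
stand-in types (demo_driver, demo_caps) rather than the real struct
cpufreq_driver. When the caller only needs to know whether a callback exists,
it can record that as a bool while the driver pointer is valid and never carry
the function pointer itself around:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the kernel's struct cpufreq_driver. */
struct demo_driver {
	int (*setpolicy)(int cpu);
	int (*target)(int cpu, unsigned int freq);
};

/* Plain copies of "is this callback present?", safe to use later. */
struct demo_caps {
	bool has_setpolicy;
	bool has_target;
};

/*
 * Record which callbacks the driver provides. In the kernel this would run
 * between rcu_read_lock() and rcu_read_unlock(); the bools stay meaningful
 * after the unlock because they no longer refer to the driver structure.
 */
static struct demo_caps demo_snapshot_caps(const struct demo_driver *drv)
{
	struct demo_caps caps = {
		.has_setpolicy = drv->setpolicy != NULL,
		.has_target = drv->target != NULL,
	};

	return caps;
}

/* Dummy callback used only to populate the example driver. */
static int demo_target(int cpu, unsigned int freq)
{
	(void)cpu;
	(void)freq;
	return 0;
}
```

The function pointer is still the right thing to copy in places that go on to
call the callback after dropping the read-side lock.
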

> +       rcu_read_lock();
> +       driver = rcu_dereference(cpufreq_driver);
> +       if (!driver) {
> +               rcu_read_unlock();
>                 goto out;
> +       }
> +       setpolicy = driver->setpolicy;
> +       target = driver->target;
> +       rcu_read_unlock();
>
> -       if (cpufreq_driver->setpolicy) {
> +       if (setpolicy) {
>                 if (!strnicmp(str_governor, "performance", CPUFREQ_NAME_LEN)) {
>                         *policy = CPUFREQ_POLICY_PERFORMANCE;
>                         err = 0;
> @@ -342,7 +364,7 @@ static int cpufreq_parse_governor(char *str_governor, unsigned int *policy,
>                         *policy = CPUFREQ_POLICY_POWERSAVE;
>                         err = 0;
>                 }
> -       } else if (cpufreq_driver->target) {
> +       } else if (target) {
>                 struct cpufreq_governor *t;
>
>                 mutex_lock(&cpufreq_governor_mutex);

> @@ -731,6 +766,8 @@ static int cpufreq_add_dev_interface(unsigned int cpu,
>  {
>         struct cpufreq_policy new_policy;
>         struct freq_attr **drv_attr;
> +       struct cpufreq_driver *driver;
> +       int (*exit)(struct cpufreq_policy *policy);

Declare it in the block which used it.

>         if (ret) {
>                 pr_debug("setting policy failed\n");
> -               if (cpufreq_driver->exit)
> -                       cpufreq_driver->exit(policy);
> +               rcu_read_lock();
> +               exit = rcu_dereference(cpufreq_driver)->exit;
> +               if (exit)
> +                       exit(policy);
> +               rcu_read_unlock();
> +
>         }

> @@ -1002,32 +1059,42 @@ static int __cpufreq_remove_dev(struct device *dev, struct subsys_interface *sif
>         unsigned int cpu = dev->id, ret, cpus;
>         unsigned long flags;
>         struct cpufreq_policy *data;
> +       struct cpufreq_driver *driver;
>         struct kobject *kobj;
>         struct completion *cmp;
>         struct device *cpu_dev;
> +       int (*target)(struct cpufreq_policy *policy,
> +                      unsigned int target_freq,
> +                      unsigned int relation);

can be bool?

> +       int (*exit)(struct cpufreq_policy *policy);
>


One more generic comment: What about a reader-writer lock for
cpufreq_data_lock??

^ permalink raw reply	[flat|nested] 55+ messages in thread

* Re: [PATCH v5] cpufreq: split the cpufreq_driver_lock and use the rcu (was cpufreq: cpufreq_driver_lock is hot on large systems)
  2013-04-01 16:28                               ` Viresh Kumar
@ 2013-04-01 17:17                                 ` Nathan Zimmer
  2013-04-01 20:11                                   ` [PATCH v6 0/2] cpufreq: cpufreq_driver_lock is hot on large systems Nathan Zimmer
  0 siblings, 1 reply; 55+ messages in thread
From: Nathan Zimmer @ 2013-04-01 17:17 UTC (permalink / raw)
  To: Viresh Kumar; +Cc: rjw, cpufreq, linux-pm, linux-kernel

On 04/01/2013 11:28 AM, Viresh Kumar wrote:
> Hi Nathan,
>
> Welcome back :)
>
> On 1 April 2013 21:03, Nathan Zimmer <nzimmer@sgi.com> wrote:
>
> You need to resend this patch, as we don't want the current mail subject as the
> commit subject. You could have used the area after the three dashes "---" inside
> the commit for logs which you don't want to commit.
Ok.
>> The cpufreq_driver_lock is hot with some configs.
>> This lock covers both cpufreq_driver and cpufreq_cpu_data so part one of the
> s/ so/, so/
>
>> proposed fix is to split up the lock abit.
> s/abit/a bit/
>
> What's the other part?
>
>> cpufreq_cpu_data is now covered by the cpufreq_data_lock.
>> cpufreq_driver is now covered by the cpufreq_driver lock and the rcu.
>>
>> This means that the cpufreq_driver_lock is no longer hot.
>> There remains some measurable heat on the cpufreq_data_lock it is significantly
> s/it/but it/

>> less then previous measured though.
>>
>> Cc: Viresh Kumar <viresh.kumar@linaro.org>
>> Cc: "Rafael J. Wysocki" <rjw@sisk.pl>
>> Signed-off-by: Nathan Zimmer <nzimmer@sgi.com>
>> ---
>>   drivers/cpufreq/cpufreq.c | 305 +++++++++++++++++++++++++++++++++-------------
>>   1 file changed, 222 insertions(+), 83 deletions(-)
>>
>> diff --git a/drivers/cpufreq/cpufreq.c b/drivers/cpufreq/cpufreq.c
>> @@ -329,11 +339,23 @@ static int cpufreq_parse_governor(char *str_governor, unsigned int *policy,
>>                                  struct cpufreq_governor **governor)
>>   {
>>          int err = -EINVAL;
>> -
>> -       if (!cpufreq_driver)
>> +       struct cpufreq_driver *driver;
>> +       int (*setpolicy)(struct cpufreq_policy *policy);
>> +       int (*target)(struct cpufreq_policy *policy,
>> +                     unsigned int target_freq,
>> +                     unsigned int relation);
> You can keep bools here instead of complex function pointers.
> setpolicy_supported and target_supported
Good point.  In a few places I needed the function pointer but not here.
I'll convert the unneeded ones to bools and resend.

>> +       rcu_read_lock();
>> +       driver = rcu_dereference(cpufreq_driver);
>> +       if (!driver) {
>> +               rcu_read_unlock();
>>                  goto out;
>> +       }
>> +       setpolicy = driver->setpolicy;
>> +       target = driver->target;
>> +       rcu_read_unlock();
>>
>> -       if (cpufreq_driver->setpolicy) {
>> +       if (setpolicy) {
>>                  if (!strnicmp(str_governor, "performance", CPUFREQ_NAME_LEN)) {
>>                          *policy = CPUFREQ_POLICY_PERFORMANCE;
>>                          err = 0;
>> @@ -342,7 +364,7 @@ static int cpufreq_parse_governor(char *str_governor, unsigned int *policy,
>>                          *policy = CPUFREQ_POLICY_POWERSAVE;
>>                          err = 0;
>>                  }
>> -       } else if (cpufreq_driver->target) {
>> +       } else if (target) {
>>                  struct cpufreq_governor *t;
>>
>>                  mutex_lock(&cpufreq_governor_mutex);
>> @@ -731,6 +766,8 @@ static int cpufreq_add_dev_interface(unsigned int cpu,
>>   {
>>          struct cpufreq_policy new_policy;
>>          struct freq_attr **drv_attr;
>> +       struct cpufreq_driver *driver;
>> +       int (*exit)(struct cpufreq_policy *policy);
> Declare it in the block which used it.
>
>>          if (ret) {
>>                  pr_debug("setting policy failed\n");
>> -               if (cpufreq_driver->exit)
>> -                       cpufreq_driver->exit(policy);
>> +               rcu_read_lock();
>> +               exit = rcu_dereference(cpufreq_driver)->exit;
>> +               if (exit)
>> +                       exit(policy);
>> +               rcu_read_unlock();
>> +
>>          }
>> @@ -1002,32 +1059,42 @@ static int __cpufreq_remove_dev(struct device *dev, struct subsys_interface *sif
>>          unsigned int cpu = dev->id, ret, cpus;
>>          unsigned long flags;
>>          struct cpufreq_policy *data;
>> +       struct cpufreq_driver *driver;
>>          struct kobject *kobj;
>>          struct completion *cmp;
>>          struct device *cpu_dev;
>> +       int (*target)(struct cpufreq_policy *policy,
>> +                      unsigned int target_freq,
>> +                      unsigned int relation);
> can be bool?
>
>> +       int (*exit)(struct cpufreq_policy *policy);
>>
>
> One more generic comment: What about a reader-writer lock for
> cpufreq_data_lock??
I had been looking for ways to use the rcu but wasn't having much success.
Let me try a rwlock and grab some numbers after lunch.

^ permalink raw reply	[flat|nested] 55+ messages in thread

* [PATCH v6 0/2] cpufreq: cpufreq_driver_lock is hot on large systems
  2013-04-01 17:17                                 ` Nathan Zimmer
@ 2013-04-01 20:11                                   ` Nathan Zimmer
  2013-04-01 20:11                                     ` [PATCH v6 1/2] cpufreq: split the cpufreq_driver_lock and use the rcu Nathan Zimmer
  2013-04-01 20:11                                     ` [PATCH v6 2/2] cpufreq: covert the cpufreq_data_lock to a spinlock Nathan Zimmer
  0 siblings, 2 replies; 55+ messages in thread
From: Nathan Zimmer @ 2013-04-01 20:11 UTC (permalink / raw)
  To: viresh.kumar, rjw; +Cc: cpufreq, linux-pm, linux-kernel, Nathan Zimmer

I am noticing the cpufreq_driver_lock is quite hot.
On an idle 512 system perf shows me most of the system time is spent on this
lock.  This is quite significant as top shows 5% of time in system time.
My solution was to first split the lock into two parts, cpufreq_driver_lock and
cpufreq_data_lock, with the cpufreq_driver also being protected by the RCU.
There was measurable heat left on the cpufreq_data_lock in __cpufreq_cpu_get.
So in the second part I converted the cpufreq_data_lock to be a rwlock since
an rcu solution was not apparent, at least to me.
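
The read side of the first part can be pictured with a small userspace sketch.
It is not kernel RCU: it uses C11 atomics as a stand-in for
rcu_assign_pointer() and rcu_dereference(), has no grace period, and all
demo_* names are hypothetical. It only illustrates the pattern the patch
relies on, which is to load the published driver pointer once, copy out the
callback that is needed, and call through the copy:

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical stand-in for the kernel's struct cpufreq_driver. */
struct demo_driver {
	unsigned int (*get)(unsigned int cpu);
	unsigned int flags;
};

/* The single published driver pointer; NULL means "no driver registered". */
static _Atomic(struct demo_driver *) demo_driver_ptr;

/*
 * Writer side, loosely analogous to rcu_assign_pointer() under the driver
 * lock: publish the driver only if none is registered yet. Returns 0 on
 * success and -1 if a driver is already present (the kernel returns -EBUSY).
 */
static int demo_register(struct demo_driver *drv)
{
	struct demo_driver *expected = NULL;

	if (!atomic_compare_exchange_strong(&demo_driver_ptr, &expected, drv))
		return -1;
	return 0;
}

/*
 * Reader side, loosely analogous to rcu_dereference(): load the pointer
 * once, snapshot the callback, then call it through the local copy.
 */
static unsigned int demo_get_freq(unsigned int cpu)
{
	struct demo_driver *drv =
		atomic_load_explicit(&demo_driver_ptr, memory_order_acquire);
	unsigned int (*get)(unsigned int cpu);

	if (!drv || !drv->get)
		return 0;

	get = drv->get;
	return get(cpu);
}

/* Dummy callback: pretends every CPU runs at 1 GHz plus its own number. */
static unsigned int demo_fixed_get(unsigned int cpu)
{
	return 1000000u + cpu;
}
```

Readers never take a lock in this scheme, which is why the contention seen on
the driver lock goes away; the cost moves to the writer, which in the kernel
has to wait for a grace period with synchronize_rcu().
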

v5: Go a different way and split up the lock and use the rcu
v6: use bools instead of checking function pointers
    convert the cpufreq_data_lock to a rwlock

Nathan Zimmer (2):
  cpufreq: split the cpufreq_driver_lock and use the rcu
  cpufreq: covert the cpufreq_data_lock to a spinlock

 drivers/cpufreq/cpufreq.c | 302 +++++++++++++++++++++++++++++++++-------------
 1 file changed, 219 insertions(+), 83 deletions(-)

-- 
1.8.1.2


^ permalink raw reply	[flat|nested] 55+ messages in thread

* [PATCH v6 1/2] cpufreq: split the cpufreq_driver_lock and use the rcu
  2013-04-01 20:11                                   ` [PATCH v6 0/2] cpufreq: cpufreq_driver_lock is hot on large systems Nathan Zimmer
@ 2013-04-01 20:11                                     ` Nathan Zimmer
  2013-04-02  5:05                                       ` Viresh Kumar
  2013-04-01 20:11                                     ` [PATCH v6 2/2] cpufreq: covert the cpufreq_data_lock to a spinlock Nathan Zimmer
  1 sibling, 1 reply; 55+ messages in thread
From: Nathan Zimmer @ 2013-04-01 20:11 UTC (permalink / raw)
  To: viresh.kumar, rjw; +Cc: cpufreq, linux-pm, linux-kernel, Nathan Zimmer

The cpufreq_driver_lock is hot with some configs.
This lock covers both cpufreq_driver and cpufreq_cpu_data, so part one of the
proposed fix is to split up the lock into two pieces.
cpufreq_cpu_data is now covered by the cpufreq_data_lock.
cpufreq_driver is now covered by the cpufreq_driver_lock and the rcu.

This means that the cpufreq_driver_lock is no longer hot.
There remains some measurable heat on the cpufreq_data_lock, but it is
significantly less than previously measured.
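
As a rough userspace illustration of why the split helps, the sketch below
uses two independent toy spinlocks built on C11 atomic_flag. The demo_* names
are hypothetical and this is not the kernel spinlock implementation; it only
shows that a holder of the driver lock no longer blocks users of the data
lock:

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Toy spinlock; a stand-in for the kernel's spinlock_t. */
struct demo_spinlock {
	atomic_flag locked;
};

/* One lock per protected object, mirroring the split in the patch. */
static struct demo_spinlock demo_driver_lock = { ATOMIC_FLAG_INIT };
static struct demo_spinlock demo_data_lock = { ATOMIC_FLAG_INIT };

/* Try to take the lock once; returns true if it was acquired. */
static bool demo_trylock(struct demo_spinlock *lock)
{
	return !atomic_flag_test_and_set_explicit(&lock->locked,
						  memory_order_acquire);
}

/* Spin until the lock is acquired. */
static void demo_lock(struct demo_spinlock *lock)
{
	while (!demo_trylock(lock))
		;
}

/* Release the lock. */
static void demo_unlock(struct demo_spinlock *lock)
{
	atomic_flag_clear_explicit(&lock->locked, memory_order_release);
}
```

With a single lock, every lookup of per-CPU policy data would queue behind any
code touching the driver pointer; with two locks the two kinds of access only
contend with their own kind.
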

Cc: Viresh Kumar <viresh.kumar@linaro.org>
Cc: "Rafael J. Wysocki" <rjw@sisk.pl>
Signed-off-by: Nathan Zimmer <nzimmer@sgi.com>
---
 drivers/cpufreq/cpufreq.c | 302 +++++++++++++++++++++++++++++++++-------------
 1 file changed, 219 insertions(+), 83 deletions(-)

diff --git a/drivers/cpufreq/cpufreq.c b/drivers/cpufreq/cpufreq.c
index b02824d..5139eab 100644
--- a/drivers/cpufreq/cpufreq.c
+++ b/drivers/cpufreq/cpufreq.c
@@ -39,13 +39,15 @@
  * level driver of CPUFreq support, and its spinlock. This lock
  * also protects the cpufreq_cpu_data array.
  */
-static struct cpufreq_driver *cpufreq_driver;
+static struct cpufreq_driver __rcu *cpufreq_driver;
+static DEFINE_SPINLOCK(cpufreq_driver_lock);
+
+static DEFINE_SPINLOCK(cpufreq_data_lock);
 static DEFINE_PER_CPU(struct cpufreq_policy *, cpufreq_cpu_data);
 #ifdef CONFIG_HOTPLUG_CPU
 /* This one keeps track of the previously set governor of a removed CPU */
 static DEFINE_PER_CPU(char[CPUFREQ_NAME_LEN], cpufreq_cpu_governor);
 #endif
-static DEFINE_SPINLOCK(cpufreq_driver_lock);
 
 /*
  * cpu_policy_rwsem is a per CPU reader-writer semaphore designed to cure
@@ -131,22 +133,24 @@ static DEFINE_MUTEX(cpufreq_governor_mutex);
 static struct cpufreq_policy *__cpufreq_cpu_get(unsigned int cpu, bool sysfs)
 {
 	struct cpufreq_policy *data;
+	struct cpufreq_driver *driver;
 	unsigned long flags;
 
 	if (cpu >= nr_cpu_ids)
 		goto err_out;
 
 	/* get the cpufreq driver */
-	spin_lock_irqsave(&cpufreq_driver_lock, flags);
+	rcu_read_lock();
+	driver = rcu_dereference(cpufreq_driver);
 
-	if (!cpufreq_driver)
+	if (!driver)
 		goto err_out_unlock;
 
-	if (!try_module_get(cpufreq_driver->owner))
+	if (!try_module_get(driver->owner))
 		goto err_out_unlock;
 
-
 	/* get the CPU */
+	spin_lock_irqsave(&cpufreq_data_lock, flags);
 	data = per_cpu(cpufreq_cpu_data, cpu);
 
 	if (!data)
@@ -155,13 +159,15 @@ static struct cpufreq_policy *__cpufreq_cpu_get(unsigned int cpu, bool sysfs)
 	if (!sysfs && !kobject_get(&data->kobj))
 		goto err_out_put_module;
 
-	spin_unlock_irqrestore(&cpufreq_driver_lock, flags);
+	spin_unlock_irqrestore(&cpufreq_data_lock, flags);
+	rcu_read_unlock();
 	return data;
 
 err_out_put_module:
-	module_put(cpufreq_driver->owner);
+	module_put(driver->owner);
+	spin_unlock_irqrestore(&cpufreq_data_lock, flags);
 err_out_unlock:
-	spin_unlock_irqrestore(&cpufreq_driver_lock, flags);
+	rcu_read_unlock();
 err_out:
 	return NULL;
 }
@@ -184,7 +190,9 @@ static void __cpufreq_cpu_put(struct cpufreq_policy *data, bool sysfs)
 {
 	if (!sysfs)
 		kobject_put(&data->kobj);
-	module_put(cpufreq_driver->owner);
+	rcu_read_lock();
+	module_put(rcu_dereference(cpufreq_driver)->owner);
+	rcu_read_unlock();
 }
 
 void cpufreq_cpu_put(struct cpufreq_policy *data)
@@ -262,13 +270,15 @@ void cpufreq_notify_transition(struct cpufreq_freqs *freqs, unsigned int state)
 	if (cpufreq_disabled())
 		return;
 
-	freqs->flags = cpufreq_driver->flags;
+	rcu_read_lock();
+	freqs->flags = rcu_dereference(cpufreq_driver)->flags;
+	rcu_read_unlock();
 	pr_debug("notification %u of frequency transition to %u kHz\n",
 		state, freqs->new);
 
-	spin_lock_irqsave(&cpufreq_driver_lock, flags);
+	spin_lock_irqsave(&cpufreq_data_lock, flags);
 	policy = per_cpu(cpufreq_cpu_data, freqs->cpu);
-	spin_unlock_irqrestore(&cpufreq_driver_lock, flags);
+	spin_unlock_irqrestore(&cpufreq_data_lock, flags);
 
 	switch (state) {
 
@@ -277,7 +287,7 @@ void cpufreq_notify_transition(struct cpufreq_freqs *freqs, unsigned int state)
 		 * which is not equal to what the cpufreq core thinks is
 		 * "old frequency".
 		 */
-		if (!(cpufreq_driver->flags & CPUFREQ_CONST_LOOPS)) {
+		if (!(freqs->flags & CPUFREQ_CONST_LOOPS)) {
 			if ((policy) && (policy->cpu == freqs->cpu) &&
 			    (policy->cur) && (policy->cur != freqs->old)) {
 				pr_debug("Warning: CPU frequency is"
@@ -329,11 +339,21 @@ static int cpufreq_parse_governor(char *str_governor, unsigned int *policy,
 				struct cpufreq_governor **governor)
 {
 	int err = -EINVAL;
-
-	if (!cpufreq_driver)
+	struct cpufreq_driver *driver;
+	bool has_setpolicy;
+	bool has_target;
+
+	rcu_read_lock();
+	driver = rcu_dereference(cpufreq_driver);
+	if (!driver) {
+		rcu_read_unlock();
 		goto out;
+	}
+	has_setpolicy = driver->setpolicy ? true : false;
+	has_target = driver->target ? true : false;
+	rcu_read_unlock();
 
-	if (cpufreq_driver->setpolicy) {
+	if (has_setpolicy) {
 		if (!strnicmp(str_governor, "performance", CPUFREQ_NAME_LEN)) {
 			*policy = CPUFREQ_POLICY_PERFORMANCE;
 			err = 0;
@@ -342,7 +362,7 @@ static int cpufreq_parse_governor(char *str_governor, unsigned int *policy,
 			*policy = CPUFREQ_POLICY_POWERSAVE;
 			err = 0;
 		}
-	} else if (cpufreq_driver->target) {
+	} else if (has_target) {
 		struct cpufreq_governor *t;
 
 		mutex_lock(&cpufreq_governor_mutex);
@@ -493,7 +513,11 @@ static ssize_t store_scaling_governor(struct cpufreq_policy *policy,
  */
 static ssize_t show_scaling_driver(struct cpufreq_policy *policy, char *buf)
 {
-	return scnprintf(buf, CPUFREQ_NAME_PLEN, "%s\n", cpufreq_driver->name);
+	char *name;
+	rcu_read_lock();
+	name = rcu_dereference(cpufreq_driver)->name;
+	rcu_read_unlock();
+	return scnprintf(buf, CPUFREQ_NAME_PLEN, "%s\n", name);
 }
 
 /**
@@ -505,10 +529,13 @@ static ssize_t show_scaling_available_governors(struct cpufreq_policy *policy,
 	ssize_t i = 0;
 	struct cpufreq_governor *t;
 
-	if (!cpufreq_driver->target) {
+	rcu_read_lock();
+	if (!rcu_dereference(cpufreq_driver)->target) {
 		i += sprintf(buf, "performance powersave");
+		rcu_read_unlock();
 		goto out;
 	}
+	rcu_read_unlock();
 
 	list_for_each_entry(t, &cpufreq_governor_list, governor_list) {
 		if (i >= (ssize_t) ((PAGE_SIZE / sizeof(char))
@@ -586,9 +613,15 @@ static ssize_t show_scaling_setspeed(struct cpufreq_policy *policy, char *buf)
 static ssize_t show_bios_limit(struct cpufreq_policy *policy, char *buf)
 {
 	unsigned int limit;
+	int (*bios_limit)(int cpu, unsigned int *limit);
 	int ret;
-	if (cpufreq_driver->bios_limit) {
-		ret = cpufreq_driver->bios_limit(policy->cpu, &limit);
+
+	rcu_read_lock();
+	bios_limit = rcu_dereference(cpufreq_driver)->bios_limit;
+	rcu_read_unlock();
+
+	if (bios_limit) {
+		ret = bios_limit(policy->cpu, &limit);
 		if (!ret)
 			return sprintf(buf, "%u\n", limit);
 	}
@@ -731,6 +764,7 @@ static int cpufreq_add_dev_interface(unsigned int cpu,
 {
 	struct cpufreq_policy new_policy;
 	struct freq_attr **drv_attr;
+	struct cpufreq_driver *driver;
 	unsigned long flags;
 	int ret = 0;
 	unsigned int j;
@@ -742,35 +776,38 @@ static int cpufreq_add_dev_interface(unsigned int cpu,
 		return ret;
 
 	/* set up files for this cpu device */
-	drv_attr = cpufreq_driver->attr;
+	rcu_read_lock();
+	driver = rcu_dereference(cpufreq_driver);
+	drv_attr = driver->attr;
 	while ((drv_attr) && (*drv_attr)) {
 		ret = sysfs_create_file(&policy->kobj, &((*drv_attr)->attr));
 		if (ret)
-			goto err_out_kobj_put;
+			goto err_out_unlock;
 		drv_attr++;
 	}
-	if (cpufreq_driver->get) {
+	if (driver->get) {
 		ret = sysfs_create_file(&policy->kobj, &cpuinfo_cur_freq.attr);
 		if (ret)
-			goto err_out_kobj_put;
+			goto err_out_unlock;
 	}
-	if (cpufreq_driver->target) {
+	if (driver->target) {
 		ret = sysfs_create_file(&policy->kobj, &scaling_cur_freq.attr);
 		if (ret)
-			goto err_out_kobj_put;
+			goto err_out_unlock;
 	}
-	if (cpufreq_driver->bios_limit) {
+	if (driver->bios_limit) {
 		ret = sysfs_create_file(&policy->kobj, &bios_limit.attr);
 		if (ret)
-			goto err_out_kobj_put;
+			goto err_out_unlock;
 	}
+	rcu_read_unlock();
 
-	spin_lock_irqsave(&cpufreq_driver_lock, flags);
+	spin_lock_irqsave(&cpufreq_data_lock, flags);
 	for_each_cpu(j, policy->cpus) {
 		per_cpu(cpufreq_cpu_data, j) = policy;
 		per_cpu(cpufreq_policy_cpu, j) = policy->cpu;
 	}
-	spin_unlock_irqrestore(&cpufreq_driver_lock, flags);
+	spin_unlock_irqrestore(&cpufreq_data_lock, flags);
 
 	ret = cpufreq_add_dev_symlink(cpu, policy);
 	if (ret)
@@ -786,12 +823,20 @@ static int cpufreq_add_dev_interface(unsigned int cpu,
 	policy->user_policy.governor = policy->governor;
 
 	if (ret) {
+		int (*exit)(struct cpufreq_policy *policy);
+
 		pr_debug("setting policy failed\n");
-		if (cpufreq_driver->exit)
-			cpufreq_driver->exit(policy);
+		rcu_read_lock();
+		exit = rcu_dereference(cpufreq_driver)->exit;
+		if (exit)
+			exit(policy);
+		rcu_read_unlock();
+
 	}
 	return ret;
 
+err_out_unlock:
+	rcu_read_unlock();
 err_out_kobj_put:
 	kobject_put(&policy->kobj);
 	wait_for_completion(&policy->kobj_unregister);
@@ -813,12 +858,12 @@ static int cpufreq_add_policy_cpu(unsigned int cpu, unsigned int sibling,
 
 	lock_policy_rwsem_write(sibling);
 
-	spin_lock_irqsave(&cpufreq_driver_lock, flags);
+	spin_lock_irqsave(&cpufreq_data_lock, flags);
 
 	cpumask_set_cpu(cpu, policy->cpus);
 	per_cpu(cpufreq_policy_cpu, cpu) = policy->cpu;
 	per_cpu(cpufreq_cpu_data, cpu) = policy;
-	spin_unlock_irqrestore(&cpufreq_driver_lock, flags);
+	spin_unlock_irqrestore(&cpufreq_data_lock, flags);
 
 	unlock_policy_rwsem_write(sibling);
 
@@ -849,6 +894,8 @@ static int cpufreq_add_dev(struct device *dev, struct subsys_interface *sif)
 	unsigned int j, cpu = dev->id;
 	int ret = -ENOMEM;
 	struct cpufreq_policy *policy;
+	struct cpufreq_driver *driver;
+	int (*init)(struct cpufreq_policy *policy);
 	unsigned long flags;
 #ifdef CONFIG_HOTPLUG_CPU
 	struct cpufreq_governor *gov;
@@ -871,22 +918,27 @@ static int cpufreq_add_dev(struct device *dev, struct subsys_interface *sif)
 
 #ifdef CONFIG_HOTPLUG_CPU
 	/* Check if this cpu was hot-unplugged earlier and has siblings */
-	spin_lock_irqsave(&cpufreq_driver_lock, flags);
+	spin_lock_irqsave(&cpufreq_data_lock, flags);
 	for_each_online_cpu(sibling) {
 		struct cpufreq_policy *cp = per_cpu(cpufreq_cpu_data, sibling);
 		if (cp && cpumask_test_cpu(cpu, cp->related_cpus)) {
-			spin_unlock_irqrestore(&cpufreq_driver_lock, flags);
+			spin_unlock_irqrestore(&cpufreq_data_lock, flags);
 			return cpufreq_add_policy_cpu(cpu, sibling, dev);
 		}
 	}
-	spin_unlock_irqrestore(&cpufreq_driver_lock, flags);
+	spin_unlock_irqrestore(&cpufreq_data_lock, flags);
 #endif
 #endif
 
-	if (!try_module_get(cpufreq_driver->owner)) {
+	rcu_read_lock();
+	driver = rcu_dereference(cpufreq_driver);
+	if (!try_module_get(driver->owner)) {
+		rcu_read_unlock();
 		ret = -EINVAL;
 		goto module_out;
 	}
+	init = driver->init;
+	rcu_read_unlock();
 
 	policy = kzalloc(sizeof(struct cpufreq_policy), GFP_KERNEL);
 	if (!policy)
@@ -911,7 +963,7 @@ static int cpufreq_add_dev(struct device *dev, struct subsys_interface *sif)
 	/* call driver. From then on the cpufreq must be able
 	 * to accept all calls to ->verify and ->setpolicy for this CPU
 	 */
-	ret = cpufreq_driver->init(policy);
+	ret = init(policy);
 	if (ret) {
 		pr_debug("initialization failed\n");
 		goto err_set_policy_cpu;
@@ -946,16 +998,18 @@ static int cpufreq_add_dev(struct device *dev, struct subsys_interface *sif)
 		goto err_out_unregister;
 
 	kobject_uevent(&policy->kobj, KOBJ_ADD);
-	module_put(cpufreq_driver->owner);
+	rcu_read_lock();
+	module_put(rcu_dereference(cpufreq_driver)->owner);
+	rcu_read_unlock();
 	pr_debug("initialization complete\n");
 
 	return 0;
 
 err_out_unregister:
-	spin_lock_irqsave(&cpufreq_driver_lock, flags);
+	spin_lock_irqsave(&cpufreq_data_lock, flags);
 	for_each_cpu(j, policy->cpus)
 		per_cpu(cpufreq_cpu_data, j) = NULL;
-	spin_unlock_irqrestore(&cpufreq_driver_lock, flags);
+	spin_unlock_irqrestore(&cpufreq_data_lock, flags);
 
 	kobject_put(&policy->kobj);
 	wait_for_completion(&policy->kobj_unregister);
@@ -968,7 +1022,9 @@ err_free_cpumask:
 err_free_policy:
 	kfree(policy);
 nomem_out:
-	module_put(cpufreq_driver->owner);
+	rcu_read_lock();
+	module_put(rcu_dereference(cpufreq_driver)->owner);
+	rcu_read_unlock();
 module_out:
 	return ret;
 }
@@ -1002,32 +1058,40 @@ static int __cpufreq_remove_dev(struct device *dev, struct subsys_interface *sif
 	unsigned int cpu = dev->id, ret, cpus;
 	unsigned long flags;
 	struct cpufreq_policy *data;
+	struct cpufreq_driver *driver;
 	struct kobject *kobj;
 	struct completion *cmp;
 	struct device *cpu_dev;
+	bool has_target;
+	int (*exit)(struct cpufreq_policy *policy);
 
 	pr_debug("%s: unregistering CPU %u\n", __func__, cpu);
 
-	spin_lock_irqsave(&cpufreq_driver_lock, flags);
+	spin_lock_irqsave(&cpufreq_data_lock, flags);
 
 	data = per_cpu(cpufreq_cpu_data, cpu);
 	per_cpu(cpufreq_cpu_data, cpu) = NULL;
 
-	spin_unlock_irqrestore(&cpufreq_driver_lock, flags);
+	spin_unlock_irqrestore(&cpufreq_data_lock, flags);
 
 	if (!data) {
 		pr_debug("%s: No cpu_data found\n", __func__);
 		return -EINVAL;
 	}
 
-	if (cpufreq_driver->target)
+	rcu_read_lock();
+	driver = rcu_dereference(cpufreq_driver);
+	has_target = driver->target ? true : false;
+	exit = driver->exit;
+	if (has_target)
 		__cpufreq_governor(data, CPUFREQ_GOV_STOP);
 
 #ifdef CONFIG_HOTPLUG_CPU
-	if (!cpufreq_driver->setpolicy)
+	if (!driver->setpolicy)
 		strncpy(per_cpu(cpufreq_cpu_governor, cpu),
 			data->governor->name, CPUFREQ_NAME_LEN);
 #endif
+	rcu_read_unlock();
 
 	WARN_ON(lock_policy_rwsem_write(cpu));
 	cpus = cpumask_weight(data->cpus);
@@ -1047,9 +1111,9 @@ static int __cpufreq_remove_dev(struct device *dev, struct subsys_interface *sif
 			WARN_ON(lock_policy_rwsem_write(cpu));
 			cpumask_set_cpu(cpu, data->cpus);
 
-			spin_lock_irqsave(&cpufreq_driver_lock, flags);
+			spin_lock_irqsave(&cpufreq_data_lock, flags);
 			per_cpu(cpufreq_cpu_data, cpu) = data;
-			spin_unlock_irqrestore(&cpufreq_driver_lock, flags);
+			spin_unlock_irqrestore(&cpufreq_data_lock, flags);
 
 			unlock_policy_rwsem_write(cpu);
 
@@ -1084,13 +1148,13 @@ static int __cpufreq_remove_dev(struct device *dev, struct subsys_interface *sif
 		wait_for_completion(cmp);
 		pr_debug("wait complete\n");
 
-		if (cpufreq_driver->exit)
-			cpufreq_driver->exit(data);
+		if (exit)
+			exit(data);
 
 		free_cpumask_var(data->related_cpus);
 		free_cpumask_var(data->cpus);
 		kfree(data);
-	} else if (cpufreq_driver->target) {
+	} else if (has_target) {
 		__cpufreq_governor(data, CPUFREQ_GOV_START);
 		__cpufreq_governor(data, CPUFREQ_GOV_LIMITS);
 	}
@@ -1157,10 +1221,18 @@ static void cpufreq_out_of_sync(unsigned int cpu, unsigned int old_freq,
 unsigned int cpufreq_quick_get(unsigned int cpu)
 {
 	struct cpufreq_policy *policy;
+	struct cpufreq_driver *driver;
+	unsigned int (*get)(unsigned int cpu);
 	unsigned int ret_freq = 0;
 
-	if (cpufreq_driver && cpufreq_driver->setpolicy && cpufreq_driver->get)
-		return cpufreq_driver->get(cpu);
+	rcu_read_lock();
+	driver = rcu_dereference(cpufreq_driver);
+	if (driver && driver->setpolicy && driver->get) {
+		get = driver->get;
+		rcu_read_unlock();
+		return get(cpu);
+	}
+	rcu_read_unlock();
 
 	policy = cpufreq_cpu_get(cpu);
 	if (policy) {
@@ -1196,15 +1268,26 @@ EXPORT_SYMBOL(cpufreq_quick_get_max);
 static unsigned int __cpufreq_get(unsigned int cpu)
 {
 	struct cpufreq_policy *policy = per_cpu(cpufreq_cpu_data, cpu);
+	struct cpufreq_driver *driver;
+	unsigned int (*get)(unsigned int cpu);
 	unsigned int ret_freq = 0;
+	u8 flags;
 
-	if (!cpufreq_driver->get)
+
+	rcu_read_lock();
+	driver = rcu_dereference(cpufreq_driver);
+	if (!driver->get) {
+		rcu_read_unlock();
 		return ret_freq;
+	}
+	flags = driver->flags;
+	get = driver->get;
+	rcu_read_unlock();
 
-	ret_freq = cpufreq_driver->get(cpu);
+	ret_freq = get(cpu);
 
 	if (ret_freq && policy->cur &&
-		!(cpufreq_driver->flags & CPUFREQ_CONST_LOOPS)) {
+		!(flags & CPUFREQ_CONST_LOOPS)) {
 		/* verify no discrepancy between actual and
 					saved value exists */
 		if (unlikely(ret_freq != policy->cur)) {
@@ -1260,6 +1343,7 @@ static struct subsys_interface cpufreq_interface = {
  */
 static int cpufreq_bp_suspend(void)
 {
+	int (*suspend)(struct cpufreq_policy *policy);
 	int ret = 0;
 
 	int cpu = smp_processor_id();
@@ -1272,8 +1356,11 @@ static int cpufreq_bp_suspend(void)
 	if (!cpu_policy)
 		return 0;
 
-	if (cpufreq_driver->suspend) {
-		ret = cpufreq_driver->suspend(cpu_policy);
+	rcu_read_lock();
+	suspend = rcu_dereference(cpufreq_driver)->suspend;
+	rcu_read_unlock();
+	if (suspend) {
+		ret = suspend(cpu_policy);
 		if (ret)
 			printk(KERN_ERR "cpufreq: suspend failed in ->suspend "
 					"step on CPU %u\n", cpu_policy->cpu);
@@ -1299,6 +1386,7 @@ static int cpufreq_bp_suspend(void)
 static void cpufreq_bp_resume(void)
 {
 	int ret = 0;
+	int (*resume)(struct cpufreq_policy *policy);
 
 	int cpu = smp_processor_id();
 	struct cpufreq_policy *cpu_policy;
@@ -1310,8 +1398,12 @@ static void cpufreq_bp_resume(void)
 	if (!cpu_policy)
 		return;
 
-	if (cpufreq_driver->resume) {
-		ret = cpufreq_driver->resume(cpu_policy);
+	rcu_read_lock();
+	resume = rcu_dereference(cpufreq_driver)->resume;
+	rcu_read_unlock();
+
+	if (resume) {
+		ret = resume(cpu_policy);
 		if (ret) {
 			printk(KERN_ERR "cpufreq: resume failed in ->resume "
 					"step on CPU %u\n", cpu_policy->cpu);
@@ -1338,10 +1430,14 @@ static struct syscore_ops cpufreq_syscore_ops = {
  */
 const char *cpufreq_get_current_driver(void)
 {
-	if (cpufreq_driver)
-		return cpufreq_driver->name;
-
-	return NULL;
+	struct cpufreq_driver *driver;
+	const char *name = NULL;
+	rcu_read_lock();
+	driver = rcu_dereference(cpufreq_driver);
+	if (driver)
+		name = driver->name;
+	rcu_read_unlock();
+	return name;
 }
 EXPORT_SYMBOL_GPL(cpufreq_get_current_driver);
 
@@ -1435,6 +1531,9 @@ int __cpufreq_driver_target(struct cpufreq_policy *policy,
 {
 	int retval = -EINVAL;
 	unsigned int old_target_freq = target_freq;
+	int (*target)(struct cpufreq_policy *policy,
+		      unsigned int target_freq,
+		      unsigned int relation);
 
 	if (cpufreq_disabled())
 		return -ENODEV;
@@ -1451,8 +1550,11 @@ int __cpufreq_driver_target(struct cpufreq_policy *policy,
 	if (target_freq == policy->cur)
 		return 0;
 
-	if (cpufreq_driver->target)
-		retval = cpufreq_driver->target(policy, target_freq, relation);
+	rcu_read_lock();
+	target = rcu_dereference(cpufreq_driver)->target;
+	rcu_read_unlock();
+	if (target)
+		retval = target(policy, target_freq, relation);
 
 	return retval;
 }
@@ -1485,18 +1587,24 @@ EXPORT_SYMBOL_GPL(cpufreq_driver_target);
 int __cpufreq_driver_getavg(struct cpufreq_policy *policy, unsigned int cpu)
 {
 	int ret = 0;
+	unsigned int (*getavg)(struct cpufreq_policy *policy,
+			       unsigned int cpu);
 
 	if (cpufreq_disabled())
 		return ret;
 
-	if (!cpufreq_driver->getavg)
+	rcu_read_lock();
+	getavg = rcu_dereference(cpufreq_driver)->getavg;
+	rcu_read_unlock();
+
+	if (!getavg)
 		return 0;
 
 	policy = cpufreq_cpu_get(policy->cpu);
 	if (!policy)
 		return -EINVAL;
 
-	ret = cpufreq_driver->getavg(policy, cpu);
+	ret = getavg(policy, cpu);
 
 	cpufreq_cpu_put(policy);
 	return ret;
@@ -1652,6 +1760,9 @@ static int __cpufreq_set_policy(struct cpufreq_policy *data,
 				struct cpufreq_policy *policy)
 {
 	int ret = 0;
+	struct cpufreq_driver *driver;
+	int (*verify)(struct cpufreq_policy *policy);
+	int (*setpolicy)(struct cpufreq_policy *policy);
 
 	pr_debug("setting new policy for CPU %u: %u - %u kHz\n", policy->cpu,
 		policy->min, policy->max);
@@ -1665,7 +1776,13 @@ static int __cpufreq_set_policy(struct cpufreq_policy *data,
 	}
 
 	/* verify the cpu speed can be set within this limit */
-	ret = cpufreq_driver->verify(policy);
+	rcu_read_lock();
+	driver = rcu_dereference(cpufreq_driver);
+	verify = driver->verify;
+	setpolicy = driver->setpolicy;
+	rcu_read_unlock();
+
+	ret = verify(policy);
 	if (ret)
 		goto error_out;
 
@@ -1679,7 +1796,7 @@ static int __cpufreq_set_policy(struct cpufreq_policy *data,
 
 	/* verify the cpu speed can be set within this limit,
 	   which might be different to the first one */
-	ret = cpufreq_driver->verify(policy);
+	ret = verify(policy);
 	if (ret)
 		goto error_out;
 
@@ -1693,10 +1810,10 @@ static int __cpufreq_set_policy(struct cpufreq_policy *data,
 	pr_debug("new min and max freqs are %u - %u kHz\n",
 					data->min, data->max);
 
-	if (cpufreq_driver->setpolicy) {
+	if (setpolicy) {
 		data->policy = policy->policy;
 		pr_debug("setting range\n");
-		ret = cpufreq_driver->setpolicy(policy);
+		ret = setpolicy(policy);
 	} else {
 		if (policy->governor != data->governor) {
 			/* save old, working values */
@@ -1743,6 +1860,11 @@ int cpufreq_update_policy(unsigned int cpu)
 {
 	struct cpufreq_policy *data = cpufreq_cpu_get(cpu);
 	struct cpufreq_policy policy;
+	struct cpufreq_driver *driver;
+	unsigned int (*get)(unsigned int cpu);
+	int (*target)(struct cpufreq_policy *policy,
+		      unsigned int target_freq,
+		      unsigned int relation);
 	int ret;
 
 	if (!data) {
@@ -1764,13 +1886,18 @@ int cpufreq_update_policy(unsigned int cpu)
 
 	/* BIOS might change freq behind our back
 	  -> ask driver for current freq and notify governors about a change */
-	if (cpufreq_driver->get) {
-		policy.cur = cpufreq_driver->get(cpu);
+	rcu_read_lock();
+	driver = rcu_dereference(cpufreq_driver);
+	get = driver->get;
+	target = driver->target;
+	rcu_read_unlock();
+	if (get) {
+		policy.cur = get(cpu);
 		if (!data->cur) {
 			pr_debug("Driver did not initialize current freq");
 			data->cur = policy.cur;
 		} else {
-			if (data->cur != policy.cur && cpufreq_driver->target)
+			if (data->cur != policy.cur && target)
 				cpufreq_out_of_sync(cpu, data->cur,
 								policy.cur);
 		}
@@ -1849,18 +1976,19 @@ int cpufreq_register_driver(struct cpufreq_driver *driver_data)
 		driver_data->flags |= CPUFREQ_CONST_LOOPS;
 
 	spin_lock_irqsave(&cpufreq_driver_lock, flags);
-	if (cpufreq_driver) {
+	if (rcu_access_pointer(cpufreq_driver)) {
 		spin_unlock_irqrestore(&cpufreq_driver_lock, flags);
 		return -EBUSY;
 	}
-	cpufreq_driver = driver_data;
+	rcu_assign_pointer(cpufreq_driver, driver_data);
 	spin_unlock_irqrestore(&cpufreq_driver_lock, flags);
+	synchronize_rcu();
 
 	ret = subsys_interface_register(&cpufreq_interface);
 	if (ret)
 		goto err_null_driver;
 
-	if (!(cpufreq_driver->flags & CPUFREQ_STICKY)) {
+	if (!(driver_data->flags & CPUFREQ_STICKY)) {
 		int i;
 		ret = -ENODEV;
 
@@ -1887,8 +2015,9 @@ err_if_unreg:
 	subsys_interface_unregister(&cpufreq_interface);
 err_null_driver:
 	spin_lock_irqsave(&cpufreq_driver_lock, flags);
-	cpufreq_driver = NULL;
+	rcu_assign_pointer(cpufreq_driver, NULL);
 	spin_unlock_irqrestore(&cpufreq_driver_lock, flags);
+	synchronize_rcu();
 	return ret;
 }
 EXPORT_SYMBOL_GPL(cpufreq_register_driver);
@@ -1905,9 +2034,15 @@ EXPORT_SYMBOL_GPL(cpufreq_register_driver);
 int cpufreq_unregister_driver(struct cpufreq_driver *driver)
 {
 	unsigned long flags;
+	struct cpufreq_driver *old_driver;
 
-	if (!cpufreq_driver || (driver != cpufreq_driver))
+	rcu_read_lock();
+	old_driver = rcu_access_pointer(cpufreq_driver);
+	if (!old_driver || (driver != old_driver)) {
+		rcu_read_unlock();
 		return -EINVAL;
+	}
+	rcu_read_unlock();
 
 	pr_debug("unregistering driver %s\n", driver->name);
 
@@ -1917,6 +2052,7 @@ int cpufreq_unregister_driver(struct cpufreq_driver *driver)
 	spin_lock_irqsave(&cpufreq_driver_lock, flags);
 	cpufreq_driver = NULL;
 	spin_unlock_irqrestore(&cpufreq_driver_lock, flags);
+	synchronize_rcu();
 
 	return 0;
 }
-- 
1.8.1.2


^ permalink raw reply related	[flat|nested] 55+ messages in thread

* [PATCH v6 2/2] cpufreq: covert the cpufreq_data_lock to a spinlock
  2013-04-01 20:11                                   ` [PATCH v6 0/2] cpufreq: cpufreq_driver_lock is hot on large systems Nathan Zimmer
  2013-04-01 20:11                                     ` [PATCH v6 1/2] cpufreq: split the cpufreq_driver_lock and use the rcu Nathan Zimmer
@ 2013-04-01 20:11                                     ` Nathan Zimmer
  2013-04-01 20:41                                       ` Rafael J. Wysocki
  1 sibling, 1 reply; 55+ messages in thread
From: Nathan Zimmer @ 2013-04-01 20:11 UTC (permalink / raw)
  To: viresh.kumar, rjw; +Cc: cpufreq, linux-pm, linux-kernel, Nathan Zimmer

This eliminates the rest of the contention found in __cpufreq_cpu_get.
I am not seeing a way to use the rcu so we will have to make do with a
rwlock for now.

Cc: Viresh Kumar <viresh.kumar@linaro.org>
Cc: "Rafael J. Wysocki" <rjw@sisk.pl>
Signed-off-by: Nathan Zimmer <nzimmer@sgi.com>
---
 drivers/cpufreq/cpufreq.c | 38 +++++++++++++++++++-------------------
 1 file changed, 19 insertions(+), 19 deletions(-)

diff --git a/drivers/cpufreq/cpufreq.c b/drivers/cpufreq/cpufreq.c
index 5139eab..7438c34 100644
--- a/drivers/cpufreq/cpufreq.c
+++ b/drivers/cpufreq/cpufreq.c
@@ -42,7 +42,7 @@
 static struct cpufreq_driver __rcu *cpufreq_driver;
 static DEFINE_SPINLOCK(cpufreq_driver_lock);
 
-static DEFINE_SPINLOCK(cpufreq_data_lock);
+static DEFINE_RWLOCK(cpufreq_data_lock);
 static DEFINE_PER_CPU(struct cpufreq_policy *, cpufreq_cpu_data);
 #ifdef CONFIG_HOTPLUG_CPU
 /* This one keeps track of the previously set governor of a removed CPU */
@@ -150,7 +150,7 @@ static struct cpufreq_policy *__cpufreq_cpu_get(unsigned int cpu, bool sysfs)
 		goto err_out_unlock;
 
 	/* get the CPU */
-	spin_lock_irqsave(&cpufreq_data_lock, flags);
+	read_lock_irqsave(&cpufreq_data_lock, flags);
 	data = per_cpu(cpufreq_cpu_data, cpu);
 
 	if (!data)
@@ -159,13 +159,13 @@ static struct cpufreq_policy *__cpufreq_cpu_get(unsigned int cpu, bool sysfs)
 	if (!sysfs && !kobject_get(&data->kobj))
 		goto err_out_put_module;
 
-	spin_unlock_irqrestore(&cpufreq_data_lock, flags);
+	read_unlock_irqrestore(&cpufreq_data_lock, flags);
 	rcu_read_unlock();
 	return data;
 
 err_out_put_module:
 	module_put(driver->owner);
-	spin_unlock_irqrestore(&cpufreq_data_lock, flags);
+	read_unlock_irqrestore(&cpufreq_data_lock, flags);
 err_out_unlock:
 	rcu_read_unlock();
 err_out:
@@ -276,9 +276,9 @@ void cpufreq_notify_transition(struct cpufreq_freqs *freqs, unsigned int state)
 	pr_debug("notification %u of frequency transition to %u kHz\n",
 		state, freqs->new);
 
-	spin_lock_irqsave(&cpufreq_data_lock, flags);
+	read_lock_irqsave(&cpufreq_data_lock, flags);
 	policy = per_cpu(cpufreq_cpu_data, freqs->cpu);
-	spin_unlock_irqrestore(&cpufreq_data_lock, flags);
+	read_unlock_irqrestore(&cpufreq_data_lock, flags);
 
 	switch (state) {
 
@@ -802,12 +802,12 @@ static int cpufreq_add_dev_interface(unsigned int cpu,
 	}
 	rcu_read_unlock();
 
-	spin_lock_irqsave(&cpufreq_data_lock, flags);
+	write_lock_irqsave(&cpufreq_data_lock, flags);
 	for_each_cpu(j, policy->cpus) {
 		per_cpu(cpufreq_cpu_data, j) = policy;
 		per_cpu(cpufreq_policy_cpu, j) = policy->cpu;
 	}
-	spin_unlock_irqrestore(&cpufreq_data_lock, flags);
+	write_unlock_irqrestore(&cpufreq_data_lock, flags);
 
 	ret = cpufreq_add_dev_symlink(cpu, policy);
 	if (ret)
@@ -858,12 +858,12 @@ static int cpufreq_add_policy_cpu(unsigned int cpu, unsigned int sibling,
 
 	lock_policy_rwsem_write(sibling);
 
-	spin_lock_irqsave(&cpufreq_data_lock, flags);
+	write_lock_irqsave(&cpufreq_data_lock, flags);
 
 	cpumask_set_cpu(cpu, policy->cpus);
 	per_cpu(cpufreq_policy_cpu, cpu) = policy->cpu;
 	per_cpu(cpufreq_cpu_data, cpu) = policy;
-	spin_unlock_irqrestore(&cpufreq_data_lock, flags);
+	write_unlock_irqrestore(&cpufreq_data_lock, flags);
 
 	unlock_policy_rwsem_write(sibling);
 
@@ -918,15 +918,15 @@ static int cpufreq_add_dev(struct device *dev, struct subsys_interface *sif)
 
 #ifdef CONFIG_HOTPLUG_CPU
 	/* Check if this cpu was hot-unplugged earlier and has siblings */
-	spin_lock_irqsave(&cpufreq_data_lock, flags);
+	read_lock_irqsave(&cpufreq_data_lock, flags);
 	for_each_online_cpu(sibling) {
 		struct cpufreq_policy *cp = per_cpu(cpufreq_cpu_data, sibling);
 		if (cp && cpumask_test_cpu(cpu, cp->related_cpus)) {
-			spin_unlock_irqrestore(&cpufreq_data_lock, flags);
+			read_unlock_irqrestore(&cpufreq_data_lock, flags);
 			return cpufreq_add_policy_cpu(cpu, sibling, dev);
 		}
 	}
-	spin_unlock_irqrestore(&cpufreq_data_lock, flags);
+	read_unlock_irqrestore(&cpufreq_data_lock, flags);
 #endif
 #endif
 
@@ -1006,10 +1006,10 @@ static int cpufreq_add_dev(struct device *dev, struct subsys_interface *sif)
 	return 0;
 
 err_out_unregister:
-	spin_lock_irqsave(&cpufreq_data_lock, flags);
+	write_lock_irqsave(&cpufreq_data_lock, flags);
 	for_each_cpu(j, policy->cpus)
 		per_cpu(cpufreq_cpu_data, j) = NULL;
-	spin_unlock_irqrestore(&cpufreq_data_lock, flags);
+	write_unlock_irqrestore(&cpufreq_data_lock, flags);
 
 	kobject_put(&policy->kobj);
 	wait_for_completion(&policy->kobj_unregister);
@@ -1067,12 +1067,12 @@ static int __cpufreq_remove_dev(struct device *dev, struct subsys_interface *sif
 
 	pr_debug("%s: unregistering CPU %u\n", __func__, cpu);
 
-	spin_lock_irqsave(&cpufreq_data_lock, flags);
+	write_lock_irqsave(&cpufreq_data_lock, flags);
 
 	data = per_cpu(cpufreq_cpu_data, cpu);
 	per_cpu(cpufreq_cpu_data, cpu) = NULL;
 
-	spin_unlock_irqrestore(&cpufreq_data_lock, flags);
+	write_unlock_irqrestore(&cpufreq_data_lock, flags);
 
 	if (!data) {
 		pr_debug("%s: No cpu_data found\n", __func__);
@@ -1111,9 +1111,9 @@ static int __cpufreq_remove_dev(struct device *dev, struct subsys_interface *sif
 			WARN_ON(lock_policy_rwsem_write(cpu));
 			cpumask_set_cpu(cpu, data->cpus);
 
-			spin_lock_irqsave(&cpufreq_data_lock, flags);
+			write_lock_irqsave(&cpufreq_data_lock, flags);
 			per_cpu(cpufreq_cpu_data, cpu) = data;
-			spin_unlock_irqrestore(&cpufreq_data_lock, flags);
+			write_unlock_irqrestore(&cpufreq_data_lock, flags);
 
 			unlock_policy_rwsem_write(cpu);
 
-- 
1.8.1.2


^ permalink raw reply related	[flat|nested] 55+ messages in thread

* Re: [PATCH v6 2/2] cpufreq: covert the cpufreq_data_lock to a spinlock
  2013-04-01 20:11                                     ` [PATCH v6 2/2] cpufreq: covert the cpufreq_data_lock to a spinlock Nathan Zimmer
@ 2013-04-01 20:41                                       ` Rafael J. Wysocki
  2013-04-02  0:56                                         ` Nathan Zimmer
  0 siblings, 1 reply; 55+ messages in thread
From: Rafael J. Wysocki @ 2013-04-01 20:41 UTC (permalink / raw)
  To: Nathan Zimmer; +Cc: viresh.kumar, cpufreq, linux-pm, linux-kernel

On Monday, April 01, 2013 03:11:09 PM Nathan Zimmer wrote:
> This eliminates the rest of the contention found in __cpufreq_cpu_get.
> I am not seeing a way to use the rcu so we will have to make do with a
> rwlock for now.
> 
> Cc: Viresh Kumar <viresh.kumar@linaro.org>
> Cc: "Rafael J. Wysocki" <rjw@sisk.pl>
> Signed-off-by: Nathan Zimmer <nzimmer@sgi.com>

I've already applied this one.

Can you please check if the version in my tree is OK?

Rafael


-- 
I speak only for myself.
Rafael J. Wysocki, Intel Open Source Technology Center.

^ permalink raw reply	[flat|nested] 55+ messages in thread

* Re: [PATCH v6 2/2] cpufreq: covert the cpufreq_data_lock to a spinlock
  2013-04-01 20:41                                       ` Rafael J. Wysocki
@ 2013-04-02  0:56                                         ` Nathan Zimmer
  2013-04-02  5:04                                           ` Viresh Kumar
  0 siblings, 1 reply; 55+ messages in thread
From: Nathan Zimmer @ 2013-04-02  0:56 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: Nathan Zimmer, viresh.kumar, cpufreq, linux-pm, linux-kernel

On Mon, Apr 01, 2013 at 10:41:27PM +0200, Rafael J. Wysocki wrote:
> On Monday, April 01, 2013 03:11:09 PM Nathan Zimmer wrote:
> > This eliminates the rest of the contention found in __cpufreq_cpu_get.
> > I am not seeing a way to use the rcu so we will have to make do with a
> > rwlock for now.
> > 
> > Cc: Viresh Kumar <viresh.kumar@linaro.org>
> > Cc: "Rafael J. Wysocki" <rjw@sisk.pl>
> > Signed-off-by: Nathan Zimmer <nzimmer@sgi.com>
> 
> I've already applied this one.
> 
> Can you please check if the version in my tree is OK?
> 
> Rafael
> 

Nope, the previous version was too different, probably best to just replace it.
> 

^ permalink raw reply	[flat|nested] 55+ messages in thread

* Re: [PATCH v6 2/2] cpufreq: covert the cpufreq_data_lock to a spinlock
  2013-04-02  0:56                                         ` Nathan Zimmer
@ 2013-04-02  5:04                                           ` Viresh Kumar
  2013-04-02 12:48                                             ` Rafael J. Wysocki
  0 siblings, 1 reply; 55+ messages in thread
From: Viresh Kumar @ 2013-04-02  5:04 UTC (permalink / raw)
  To: Nathan Zimmer; +Cc: Rafael J. Wysocki, cpufreq, linux-pm, linux-kernel

On 2 April 2013 06:26, Nathan Zimmer <nzimmer@sgi.com> wrote:
> On Mon, Apr 01, 2013 at 10:41:27PM +0200, Rafael J. Wysocki wrote:
>> On Monday, April 01, 2013 03:11:09 PM Nathan Zimmer wrote:
>> > This eliminates the rest of the contention found in __cpufreq_cpu_get.
>> > I am not seeing a way to use the rcu so we will have to make do with a
>> > rwlock for now.
>> >
>> > Cc: Viresh Kumar <viresh.kumar@linaro.org>
>> > Cc: "Rafael J. Wysocki" <rjw@sisk.pl>
>> > Signed-off-by: Nathan Zimmer <nzimmer@sgi.com>
>>
>> I've already applied this one.
>>
>> Can you please check if the version in my tree is OK?
>>
>> Rafael
>>
>
> Nope, the previous version was too different, probably best to just replace it.

Nathan,

First of all, I should admit that I didn't have your last patch while
reviewing this one earlier. Thanks Rafael.

Now, I believe the previous patch which Rafael has pushed was good and we
can simply keep it. What you can do is just add a patch on top of it (which
would mostly be 1/2 of your patchset) that simply separates the rcu stuff
out of the lock and leaves the lock for cpufreq_data.
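
A rough userspace sketch of what "separating the rcu stuff" looks like on
the read side, assuming a C11 atomic pointer with acquire/release ordering
stands in for rcu_dereference()/rcu_assign_pointer().  The names
cur_driver, fake_get() and quick_get() are hypothetical, and the sketch has
no grace-period handling, so it only shows the snapshot-then-call shape
used in the patch:

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

struct driver {
	unsigned int (*get)(unsigned int cpu);
};

/* Published driver pointer; NULL while no driver is registered. */
static _Atomic(struct driver *) cur_driver;

/* Hypothetical callback used only to exercise the sketch. */
static unsigned int fake_get(unsigned int cpu)
{
	return 1000 * (cpu + 1);
}

static unsigned int quick_get(unsigned int cpu)
{
	unsigned int (*get)(unsigned int cpu) = NULL;
	struct driver *drv;

	/* The kernel code would call rcu_read_lock() here. */
	drv = atomic_load_explicit(&cur_driver, memory_order_acquire);
	if (drv)
		get = drv->get;		/* snapshot the callback */
	/* ... and rcu_read_unlock() here. */

	/* The callback runs outside the read-side section. */
	return get ? get(cpu) : 0;
}
```

No lock is taken on this path at all, which is why the data lock can be
left to cover cpufreq_cpu_data only.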

--
viresh

^ permalink raw reply	[flat|nested] 55+ messages in thread

* Re: [PATCH v6 1/2] cpufreq: split the cpufreq_driver_lock and use the rcu
  2013-04-01 20:11                                     ` [PATCH v6 1/2] cpufreq: split the cpufreq_driver_lock and use the rcu Nathan Zimmer
@ 2013-04-02  5:05                                       ` Viresh Kumar
  2013-04-02 14:55                                         ` Nathan Zimmer
  0 siblings, 1 reply; 55+ messages in thread
From: Viresh Kumar @ 2013-04-02  5:05 UTC (permalink / raw)
  To: Nathan Zimmer; +Cc: rjw, cpufreq, linux-pm, linux-kernel

On 2 April 2013 01:41, Nathan Zimmer <nzimmer@sgi.com> wrote:
> diff --git a/drivers/cpufreq/cpufreq.c b/drivers/cpufreq/cpufreq.c

> +static struct cpufreq_driver __rcu *cpufreq_driver;
> +static DEFINE_SPINLOCK(cpufreq_driver_lock);

Do you really need this lock? It is only used in cpufreq_register_driver
and cpufreq_unregister_driver, and it doesn't protect other routines at all.
Since we are using rcu stuff now, this lock is probably just not required.

> +static DEFINE_SPINLOCK(cpufreq_data_lock);

Only this one is required and it can be the rwlock which is already pushed
by Rafael.

^ permalink raw reply	[flat|nested] 55+ messages in thread

* Re: [PATCH v6 2/2] cpufreq: covert the cpufreq_data_lock to a spinlock
  2013-04-02  5:04                                           ` Viresh Kumar
@ 2013-04-02 12:48                                             ` Rafael J. Wysocki
  2013-04-02 14:58                                               ` Nathan Zimmer
  0 siblings, 1 reply; 55+ messages in thread
From: Rafael J. Wysocki @ 2013-04-02 12:48 UTC (permalink / raw)
  To: Viresh Kumar, Nathan Zimmer; +Cc: cpufreq, linux-pm, linux-kernel

On Tuesday, April 02, 2013 10:34:21 AM Viresh Kumar wrote:
> On 2 April 2013 06:26, Nathan Zimmer <nzimmer@sgi.com> wrote:
> > On Mon, Apr 01, 2013 at 10:41:27PM +0200, Rafael J. Wysocki wrote:
> >> On Monday, April 01, 2013 03:11:09 PM Nathan Zimmer wrote:
> >> > This eliminates the rest of the contention found in __cpufreq_cpu_get.
> >> > I am not seeing a way to use the rcu so we will have to make do with a
> >> > rwlock for now.
> >> >
> >> > Cc: Viresh Kumar <viresh.kumar@linaro.org>
> >> > Cc: "Rafael J. Wysocki" <rjw@sisk.pl>
> >> > Signed-off-by: Nathan Zimmer <nzimmer@sgi.com>
> >>
> >> I've already applied this one.
> >>
> >> Can you please check if the version in my tree is OK?
> >>
> >> Rafael
> >>
> >
> > Nope, the previous version was too different, probably best to just replace it.
> 
> Nathan,
> 
> First of all, I should admit that I didn't have your last patch while
> reviewing this one earlier. Thanks Rafael.
> 
> Now, I believe the previous patch which Rafael has pushed was good and we
> can simply keep it. What you can do is just add a patch on top of it (which
> would mostly be 1/2 of your patchset) that simply separates the rcu stuff
> out of the lock and leaves the lock for cpufreq_data.

Yeah, I'd very much prefer that.

Nathan, I'm going to keep the rwlock patch unless it is demonstrably incorrect.

Thanks,
Rafael


-- 
I speak only for myself.
Rafael J. Wysocki, Intel Open Source Technology Center.

^ permalink raw reply	[flat|nested] 55+ messages in thread

* Re: [PATCH v6 1/2] cpufreq: split the cpufreq_driver_lock and use the rcu
  2013-04-02  5:05                                       ` Viresh Kumar
@ 2013-04-02 14:55                                         ` Nathan Zimmer
  2013-04-02 14:59                                           ` Viresh Kumar
  0 siblings, 1 reply; 55+ messages in thread
From: Nathan Zimmer @ 2013-04-02 14:55 UTC (permalink / raw)
  To: Viresh Kumar; +Cc: Nathan Zimmer, rjw, cpufreq, linux-pm, linux-kernel

On Tue, Apr 02, 2013 at 10:35:46AM +0530, Viresh Kumar wrote:
> On 2 April 2013 01:41, Nathan Zimmer <nzimmer@sgi.com> wrote:
> > diff --git a/drivers/cpufreq/cpufreq.c b/drivers/cpufreq/cpufreq.c
> 
> > +static struct cpufreq_driver __rcu *cpufreq_driver;
> > +static DEFINE_SPINLOCK(cpufreq_driver_lock);
> 
> Do you really need this lock? It is only used in cpufreq_register_driver
> and cpufreq_unregister_driver, and it doesn't protect other routines at all.
> Since we are using rcu stuff now, this lock is probably just not required.
> 
The lock is unneeded if we expect register and unregister driver to not be
called from multiple threads at once.  I didn't make that assumption.
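
To illustrate the case I am worried about, here is a minimal userspace
sketch, assuming a pthread mutex stands in for cpufreq_driver_lock and a
C11 atomic pointer stands in for the RCU-published cpufreq_driver.  The
names register_driver() and unregister_driver() are hypothetical, the
identity check is done under the lock (the patch checks it beforehand), and
the synchronize_rcu() step is omitted:

```c
#include <assert.h>
#include <errno.h>
#include <pthread.h>
#include <stdatomic.h>
#include <stddef.h>

struct driver {
	const char *name;
};

static pthread_mutex_t driver_lock = PTHREAD_MUTEX_INITIALIZER;
static _Atomic(struct driver *) cur_driver;

/* Publish d unless another driver is already registered. */
static int register_driver(struct driver *d)
{
	int ret = 0;

	assert(d != NULL);	/* precondition of this sketch */
	pthread_mutex_lock(&driver_lock);
	if (atomic_load_explicit(&cur_driver, memory_order_relaxed))
		ret = -EBUSY;
	else
		atomic_store_explicit(&cur_driver, d, memory_order_release);
	pthread_mutex_unlock(&driver_lock);
	return ret;
}

/* Clear the pointer only if d is the driver currently registered. */
static int unregister_driver(struct driver *d)
{
	int ret = 0;

	pthread_mutex_lock(&driver_lock);
	if (!d || atomic_load_explicit(&cur_driver, memory_order_relaxed) != d)
		ret = -EINVAL;
	else
		atomic_store_explicit(&cur_driver, NULL, memory_order_release);
	pthread_mutex_unlock(&driver_lock);
	return ret;
}
```

Without the lock, two concurrent register calls could both see a NULL
pointer and both believe they registered; with it, the check and the store
are one step, so the second caller gets -EBUSY.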

> > +static DEFINE_SPINLOCK(cpufreq_data_lock);
> 
> Only this one is required and it can be the rwlock which is already pushed
> by Rafael.

^ permalink raw reply	[flat|nested] 55+ messages in thread

* Re: [PATCH v6 2/2] cpufreq: covert the cpufreq_data_lock to a spinlock
  2013-04-02 12:48                                             ` Rafael J. Wysocki
@ 2013-04-02 14:58                                               ` Nathan Zimmer
  0 siblings, 0 replies; 55+ messages in thread
From: Nathan Zimmer @ 2013-04-02 14:58 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: Viresh Kumar, Nathan Zimmer, cpufreq, linux-pm, linux-kernel

On Tue, Apr 02, 2013 at 02:48:07PM +0200, Rafael J. Wysocki wrote:
> On Tuesday, April 02, 2013 10:34:21 AM Viresh Kumar wrote:
> > On 2 April 2013 06:26, Nathan Zimmer <nzimmer@sgi.com> wrote:
> > > On Mon, Apr 01, 2013 at 10:41:27PM +0200, Rafael J. Wysocki wrote:
> > >> On Monday, April 01, 2013 03:11:09 PM Nathan Zimmer wrote:
> > >> > This eliminates the rest of the contention found in __cpufreq_cpu_get.
> > >> > I am not seeing a way to use the rcu so we will have to make do with a
> > >> > rwlock for now.
> > >> >
> > >> > Cc: Viresh Kumar <viresh.kumar@linaro.org>
> > >> > Cc: "Rafael J. Wysocki" <rjw@sisk.pl>
> > >> > Signed-off-by: Nathan Zimmer <nzimmer@sgi.com>
> > >>
> > >> I've already applied this one.
> > >>
> > >> Can you please check if the version in my tree is OK?
> > >>
> > >> Rafael
> > >>
> > >
> > > Nope, the previous version was too different, probably best to just replace it.
> > 
> > Nathan,
> > 
> > First of all I should admit that I didn't have your last patch while reviewing
> > this one earlier. Thanks Rafael.
> > 
> > Now, I believe the previous patch which Rafael has pushed was good and we
> > can simply keep it. What you can do is, just add a patch over it (which would
> > mostly be 1/2 of your patchset), that simply separates rcu stuff out of the lock
> > and leaves the lock for cpufreq_data.
> 
> Yeah, I'd very much prefer that.
> 
> Nathan, I'm going to keep the rwlock patch unless it is demonstrably incorrect.
> 
> Thanks,
> Rafael
> 
> 
> -- 
> I speak only for myself.
> Rafael J. Wysocki, Intel Open Source Technology Center.

Ok I'll go that route.



* Re: [PATCH v6 1/2] cpufreq: split the cpufreq_driver_lock and use the rcu
  2013-04-02 14:55                                         ` Nathan Zimmer
@ 2013-04-02 14:59                                           ` Viresh Kumar
  2013-04-02 15:40                                             ` Nathan Zimmer
  2013-04-02 22:57                                             ` Rafael J. Wysocki
  0 siblings, 2 replies; 55+ messages in thread
From: Viresh Kumar @ 2013-04-02 14:59 UTC (permalink / raw)
  To: Nathan Zimmer; +Cc: rjw, cpufreq, linux-pm, linux-kernel

On 2 April 2013 20:25, Nathan Zimmer <nzimmer@sgi.com> wrote:
> The lock is unneeded if we expect register and unregister driver to not be
> called from multiple threads at once.  I didn't make that assumption.

Hmm.. But doesn't rcu part take care of that too?? Two writers
updating stuff simultaneously?


* Re: [PATCH v6 1/2] cpufreq: split the cpufreq_driver_lock and use the rcu
  2013-04-02 14:59                                           ` Viresh Kumar
@ 2013-04-02 15:40                                             ` Nathan Zimmer
  2013-04-02 15:52                                               ` Viresh Kumar
  2013-04-02 22:57                                             ` Rafael J. Wysocki
  1 sibling, 1 reply; 55+ messages in thread
From: Nathan Zimmer @ 2013-04-02 15:40 UTC (permalink / raw)
  To: Viresh Kumar; +Cc: Nathan Zimmer, rjw, cpufreq, linux-pm, linux-kernel

On Tue, Apr 02, 2013 at 08:29:12PM +0530, Viresh Kumar wrote:
> On 2 April 2013 20:25, Nathan Zimmer <nzimmer@sgi.com> wrote:
> > The lock is unneeded if we expect register and unregister driver to not be
> > called from multiple threads at once.  I didn't make that assumption.
> 
> Hmm.. But doesn't rcu part take care of that too?? Two writers
> updating stuff simultaneously?

My concern is in cpufreq_register_driver.  Since we must set the pointer only
when it is NULL, we have to hold the lock across both the check and the
assignment.

int cpufreq_register_driver(struct cpufreq_driver *driver_data)
{
...
        spin_lock_irqsave(&cpufreq_driver_lock, flags);
        if (rcu_access_pointer(cpufreq_driver)) {
                spin_unlock_irqrestore(&cpufreq_driver_lock, flags);
                return -EBUSY;
        }
        rcu_assign_pointer(cpufreq_driver, driver_data);
        spin_unlock_irqrestore(&cpufreq_driver_lock, flags);
        synchronize_rcu();
...
}
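
A userspace sketch of the same check-then-set pattern may make the point
easier to see.  Everything named demo_* below is hypothetical and not part of
cpufreq.c; a pthread mutex stands in for the spinlock, and a C11 release store
stands in for rcu_assign_pointer().  The synchronize_rcu() step is left out.

```c
#include <errno.h>
#include <pthread.h>
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical stand-in for struct cpufreq_driver. */
struct demo_driver {
	const char *name;
};

/* Published pointer: readers load it without taking any lock. */
static _Atomic(struct demo_driver *) demo_current_driver;

/* Writer-side lock: serializes register/unregister against each other. */
static pthread_mutex_t demo_driver_lock = PTHREAD_MUTEX_INITIALIZER;

/* Returns 0 on success, -EINVAL for NULL, -EBUSY if a driver is registered. */
int demo_register_driver(struct demo_driver *drv)
{
	int ret = 0;

	if (!drv)
		return -EINVAL;

	pthread_mutex_lock(&demo_driver_lock);
	/* The NULL check and the store form one critical section. */
	if (atomic_load_explicit(&demo_current_driver, memory_order_relaxed))
		ret = -EBUSY;
	else
		atomic_store_explicit(&demo_current_driver, drv,
				      memory_order_release);
	pthread_mutex_unlock(&demo_driver_lock);
	return ret;
}

/* Returns 0 on success, -EINVAL if drv is not the registered driver. */
int demo_unregister_driver(struct demo_driver *drv)
{
	int ret = 0;

	pthread_mutex_lock(&demo_driver_lock);
	if (!drv ||
	    atomic_load_explicit(&demo_current_driver, memory_order_relaxed) != drv)
		ret = -EINVAL;
	else
		atomic_store_explicit(&demo_current_driver,
				      (struct demo_driver *)NULL,
				      memory_order_release);
	pthread_mutex_unlock(&demo_driver_lock);
	return ret;
}

/* Lock-free reader: a single acquire load of the published pointer. */
struct demo_driver *demo_get_driver(void)
{
	return atomic_load_explicit(&demo_current_driver, memory_order_acquire);
}
```

Without the mutex, two callers could both see NULL in the check and both
store their pointer, so the second registration would silently replace the
first instead of getting -EBUSY.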




* Re: [PATCH v6 1/2] cpufreq: split the cpufreq_driver_lock and use the rcu
  2013-04-02 15:40                                             ` Nathan Zimmer
@ 2013-04-02 15:52                                               ` Viresh Kumar
  0 siblings, 0 replies; 55+ messages in thread
From: Viresh Kumar @ 2013-04-02 15:52 UTC (permalink / raw)
  To: Nathan Zimmer; +Cc: rjw, cpufreq, linux-pm, linux-kernel

On 2 April 2013 21:10, Nathan Zimmer <nzimmer@sgi.com> wrote:
> My concern is in cpufreq_register_driver.  Since we must set the pointer only
> when it is NULL, we have to hold the lock across both the check and the
> assignment.
>
> int cpufreq_register_driver(struct cpufreq_driver *driver_data)
> {
> ...
>         spin_lock_irqsave(&cpufreq_driver_lock, flags);
>         if (rcu_access_pointer(cpufreq_driver)) {
>                 spin_unlock_irqrestore(&cpufreq_driver_lock, flags);
>                 return -EBUSY;
>         }
>         rcu_assign_pointer(cpufreq_driver, driver_data);
>         spin_unlock_irqrestore(&cpufreq_driver_lock, flags);
>         synchronize_rcu();
> ...
> }

How will the lock help you here?? A lock is useful only when somebody
else who wants to access the pointer is waiting on the lock while we are
updating it.

Because all other accesses to cpufreq_driver don't have any lock, this
lock is just a waste of time.


* Re: [PATCH v6 1/2] cpufreq: split the cpufreq_driver_lock and use the rcu
  2013-04-02 14:59                                           ` Viresh Kumar
  2013-04-02 15:40                                             ` Nathan Zimmer
@ 2013-04-02 22:57                                             ` Rafael J. Wysocki
  2013-04-03  5:25                                               ` Viresh Kumar
  1 sibling, 1 reply; 55+ messages in thread
From: Rafael J. Wysocki @ 2013-04-02 22:57 UTC (permalink / raw)
  To: Viresh Kumar; +Cc: Nathan Zimmer, cpufreq, linux-pm, linux-kernel

On Tuesday, April 02, 2013 08:29:12 PM Viresh Kumar wrote:
> On 2 April 2013 20:25, Nathan Zimmer <nzimmer@sgi.com> wrote:
> > The lock is unneeded if we expect register and unregister driver to not be
> > called from multiple threads at once.  I didn't make that assumption.
> 
> Hmm.. But doesn't rcu part take care of that too?? Two writers
> updating stuff simultaneously?

RCU doesn't cover that in general.  Additional locking is needed to provide
synchronization between writers.
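
As a rough illustration only, here is a userspace sketch of the usual
read-copy-update shape.  All demo_* names are hypothetical, a pthread mutex
plays the role of the writer-side lock, and demo_reclaim() is a crude stand-in
for what synchronize_rcu() plus kfree() would do in the kernel.  Readers take
no lock; updaters do the read, copy, modify and publish steps under one lock.

```c
#include <pthread.h>
#include <stdatomic.h>
#include <stdlib.h>

/* Hypothetical shared object, replaced wholesale on every update. */
struct demo_cfg {
	int generation;
	struct demo_cfg *retired_next;	/* links retired versions */
};

static _Atomic(struct demo_cfg *) demo_cfg_ptr;	/* published version */
static struct demo_cfg *demo_retired;		/* old versions, not yet freed */
static pthread_mutex_t demo_update_lock = PTHREAD_MUTEX_INITIALIZER;

/* Reader side: one acquire load, no lock taken. */
int demo_read_generation(void)
{
	struct demo_cfg *cur =
		atomic_load_explicit(&demo_cfg_ptr, memory_order_acquire);

	return cur ? cur->generation : 0;
}

/* Updater side: returns 0 on success, -1 if allocation fails. */
int demo_bump_generation(void)
{
	struct demo_cfg *old;
	struct demo_cfg *new_cfg = malloc(sizeof(*new_cfg));

	if (!new_cfg)
		return -1;

	/* Read, copy, modify and publish must not interleave between writers. */
	pthread_mutex_lock(&demo_update_lock);
	old = atomic_load_explicit(&demo_cfg_ptr, memory_order_relaxed);
	new_cfg->generation = old ? old->generation + 1 : 1;
	new_cfg->retired_next = NULL;
	atomic_store_explicit(&demo_cfg_ptr, new_cfg, memory_order_release);
	if (old) {
		/* Readers may still hold 'old', so retire it instead of freeing. */
		old->retired_next = demo_retired;
		demo_retired = old;
	}
	pthread_mutex_unlock(&demo_update_lock);
	return 0;
}

/*
 * Frees retired versions and returns how many were freed.  The caller must
 * guarantee that no reader still holds one of them.
 */
int demo_reclaim(void)
{
	struct demo_cfg *list;
	int freed = 0;

	pthread_mutex_lock(&demo_update_lock);
	list = demo_retired;
	demo_retired = NULL;
	pthread_mutex_unlock(&demo_update_lock);

	while (list) {
		struct demo_cfg *next = list->retired_next;

		free(list);
		list = next;
		freed++;
	}
	return freed;
}
```

If demo_update_lock were dropped, two updaters could both read generation N
and both publish N + 1, losing one update.  The lock-free read side does
nothing to prevent that.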

Thanks,
Rafael


-- 
I speak only for myself.
Rafael J. Wysocki, Intel Open Source Technology Center.


* Re: [PATCH v6 1/2] cpufreq: split the cpufreq_driver_lock and use the rcu
  2013-04-02 22:57                                             ` Rafael J. Wysocki
@ 2013-04-03  5:25                                               ` Viresh Kumar
  0 siblings, 0 replies; 55+ messages in thread
From: Viresh Kumar @ 2013-04-03  5:25 UTC (permalink / raw)
  To: Rafael J. Wysocki; +Cc: Nathan Zimmer, cpufreq, linux-pm, linux-kernel

On 3 April 2013 04:27, Rafael J. Wysocki <rjw@sisk.pl> wrote:
> On Tuesday, April 02, 2013 08:29:12 PM Viresh Kumar wrote:
>> On 2 April 2013 20:25, Nathan Zimmer <nzimmer@sgi.com> wrote:
>> > The lock is unneeded if we expect register and unregister driver to not be
>> > called from multiple threads at once.  I didn't make that assumption.
>>
>> Hmm.. But doesn't rcu part take care of that too?? Two writers
>> updating stuff simultaneously?
>
> RCU doesn't cover that in general.  Additional locking is needed to provide
> synchronization between writers.

Hmm.. I read the same from rcu documentation now...

Nathan, What about using a single spinlock (instead of two) that will take care
of all locking requirements of cpufreq.c ... i.e. both cpufreq_cpu_data and
cpufreq_driver_{register|unregister}... We don't need two locks actually.


end of thread, other threads:[~2013-04-03  5:25 UTC | newest]

Thread overview: 55+ messages
2013-02-04 22:45 [PATCH 0/2] cpufreq: cpufreq_driver_lock is hot on large systems Nathan Zimmer
2013-02-04 22:45 ` [PATCH 1/2] cpufreq: Convert the cpufreq_driver_lock to a rwlock Nathan Zimmer
2013-02-05  8:11   ` Viresh Kumar
2013-02-04 22:45 ` [PATCH 2/2] cpufreq: Convert the cpufreq_driver_lock to use the rcu Nathan Zimmer
2013-02-05  1:07 ` [PATCH 0/2] cpufreq: cpufreq_driver_lock is hot on large systems Rafael J. Wysocki
2013-02-05  8:28   ` Viresh Kumar
2013-02-05 10:03     ` Rafael J. Wysocki
2013-02-05  9:58       ` Viresh Kumar
2013-02-05 10:13         ` Rafael J. Wysocki
2013-02-05 14:58           ` Nathan Zimmer
2013-02-05 22:00             ` Rafael J. Wysocki
2013-02-06  2:04             ` [PATCH v2 linux-next " Nathan Zimmer
2013-02-06  2:04               ` [PATCH v2 linux-next 1/2] cpufreq: Convert the cpufreq_driver_lock to a rwlock Nathan Zimmer
2013-02-06  2:47                 ` Viresh Kumar
2013-02-06  2:04               ` [PATCH v2 linux-next 2/2] cpufreq: Convert the cpufreq_driver_lock to use the rcu Nathan Zimmer
2013-02-06  2:52                 ` Viresh Kumar
2013-02-06  8:51                   ` Viresh Kumar
2013-02-06 13:00                     ` Rafael J. Wysocki
2013-02-07 23:29                 ` Rafael J. Wysocki
2013-02-11 17:13                   ` Nathan Zimmer
2013-02-11 19:36                     ` Rafael J. Wysocki
2013-02-12  4:03                       ` Nathan Zimmer
2013-02-12 15:59                       ` Paul E. McKenney
2013-02-13 13:20                         ` Rafael J. Wysocki
2013-02-20 23:56               ` [PATCH v3 0/2] cpufreq: cpufreq_driver_lock is hot on large systems Nathan Zimmer
2013-02-20 23:56                 ` [PATCH v3 1/2] cpufreq: Convert the cpufreq_driver_lock to a rwlock Nathan Zimmer
2013-02-20 23:56                 ` [PATCH v3 2/2] cpufreq: Convert the cpufreq_driver_lock to use the rcu Nathan Zimmer
2013-02-21  5:50                   ` Viresh Kumar
2013-02-21 17:49                     ` Nathan Zimmer
2013-02-22 16:24                       ` [PATCH v4 0/2] cpufreq: cpufreq_driver_lock is hot on large systems Nathan Zimmer
2013-02-22 16:24                         ` [PATCH v4 1/2] cpufreq: Convert the cpufreq_driver_lock to a rwlock Nathan Zimmer
2013-02-23  3:57                           ` Viresh Kumar
2013-02-22 16:24                         ` [PATCH v4 2/2] cpufreq: Convert the cpufreq_driver_lock to use the rcu Nathan Zimmer
2013-02-23  3:39                           ` Viresh Kumar
2013-02-25 20:07                             ` Nathan Zimmer
2013-03-11 23:23                         ` [PATCH v4 0/2] cpufreq: cpufreq_driver_lock is hot on large systems Rafael J. Wysocki
2013-03-13 20:50                           ` Nathan Zimmer
2013-04-01 15:33                             ` [PATCH v5] cpufreq: split the cpufreq_driver_lock and use the rcu (was cpufreq: cpufreq_driver_lock is hot on large systems) Nathan Zimmer
2013-04-01 16:28                               ` Viresh Kumar
2013-04-01 17:17                                 ` Nathan Zimmer
2013-04-01 20:11                                   ` [PATCH v6 0/2] cpufreq: cpufreq_driver_lock is hot on large systems Nathan Zimmer
2013-04-01 20:11                                     ` [PATCH v6 1/2] cpufreq: split the cpufreq_driver_lock and use the rcu Nathan Zimmer
2013-04-02  5:05                                       ` Viresh Kumar
2013-04-02 14:55                                         ` Nathan Zimmer
2013-04-02 14:59                                           ` Viresh Kumar
2013-04-02 15:40                                             ` Nathan Zimmer
2013-04-02 15:52                                               ` Viresh Kumar
2013-04-02 22:57                                             ` Rafael J. Wysocki
2013-04-03  5:25                                               ` Viresh Kumar
2013-04-01 20:11                                     ` [PATCH v6 2/2] cpufreq: covert the cpufreq_data_lock to a spinlock Nathan Zimmer
2013-04-01 20:41                                       ` Rafael J. Wysocki
2013-04-02  0:56                                         ` Nathan Zimmer
2013-04-02  5:04                                           ` Viresh Kumar
2013-04-02 12:48                                             ` Rafael J. Wysocki
2013-04-02 14:58                                               ` Nathan Zimmer
