linux-kernel.vger.kernel.org archive mirror
* [PATCH 0/2] cpufreq: Fix a circular lock dependency problem
@ 2018-07-23 17:49 Waiman Long
  2018-07-23 17:49 ` [PATCH 1/2] cpu/hotplug: Add a cpus_read_trylock() function Waiman Long
  2018-07-23 17:49 ` [PATCH 2/2] cpufreq: Fix a circular lock dependency problem Waiman Long
  0 siblings, 2 replies; 7+ messages in thread
From: Waiman Long @ 2018-07-23 17:49 UTC (permalink / raw)
  To: Rafael J. Wysocki, Viresh Kumar, Thomas Gleixner, Peter Zijlstra,
	Ingo Molnar
  Cc: linux-kernel, linux-pm, Paul E. McKenney, Greg Kroah-Hartman,
	Konrad Rzeszutek Wilk, Waiman Long

This patchset works around a circular lock dependency issue in the
cpufreq driver reported by lockdep. The two locks involved are the
cpu_hotplug_lock and the reference count of a sysfs file.

The cpufreq_register_driver() function uses the lock sequence:

  cpus_read_lock --> kn->count

Whereas the cpufreq sysfs store method uses the sequence:

  kn->count --> cpus_read_lock

This is not really an issue as a shared lock is taken on the
cpu_hotplug_lock. However, the lockdep code isn't able to handle
shared locking. So one way to work around this is to define a
cpus_read_trylock() function and use it in the store method instead.

Waiman Long (2):
  cpu/hotplug: Add a cpus_read_trylock() function
  cpufreq: Fix a circular lock dependency problem

 drivers/cpufreq/cpufreq.c | 16 +++++++++++++++-
 include/linux/cpu.h       |  2 ++
 kernel/cpu.c              |  6 ++++++
 3 files changed, 23 insertions(+), 1 deletion(-)

-- 
1.8.3.1



* [PATCH 1/2] cpu/hotplug: Add a cpus_read_trylock() function
  2018-07-23 17:49 [PATCH 0/2] cpufreq: Fix a circular lock dependency problem Waiman Long
@ 2018-07-23 17:49 ` Waiman Long
  2018-07-23 17:49 ` [PATCH 2/2] cpufreq: Fix a circular lock dependency problem Waiman Long
  1 sibling, 0 replies; 7+ messages in thread
From: Waiman Long @ 2018-07-23 17:49 UTC (permalink / raw)
  To: Rafael J. Wysocki, Viresh Kumar, Thomas Gleixner, Peter Zijlstra,
	Ingo Molnar
  Cc: linux-kernel, linux-pm, Paul E. McKenney, Greg Kroah-Hartman,
	Konrad Rzeszutek Wilk, Waiman Long

There are use cases where it is useful to have a cpus_read_trylock()
function to work around circular lock dependency problems involving
the cpu_hotplug_lock.

Signed-off-by: Waiman Long <longman@redhat.com>
---
 include/linux/cpu.h | 2 ++
 kernel/cpu.c        | 6 ++++++
 2 files changed, 8 insertions(+)

diff --git a/include/linux/cpu.h b/include/linux/cpu.h
index a97a63e..e850bfe 100644
--- a/include/linux/cpu.h
+++ b/include/linux/cpu.h
@@ -103,6 +103,7 @@ static inline void cpu_maps_update_done(void)
 extern void cpus_write_unlock(void);
 extern void cpus_read_lock(void);
 extern void cpus_read_unlock(void);
+extern int  cpus_read_trylock(void);
 extern void lockdep_assert_cpus_held(void);
 extern void cpu_hotplug_disable(void);
 extern void cpu_hotplug_enable(void);
@@ -115,6 +116,7 @@ static inline void cpus_write_lock(void) { }
 static inline void cpus_write_unlock(void) { }
 static inline void cpus_read_lock(void) { }
 static inline void cpus_read_unlock(void) { }
+static inline int  cpus_read_trylock(void) { return true; }
 static inline void lockdep_assert_cpus_held(void) { }
 static inline void cpu_hotplug_disable(void) { }
 static inline void cpu_hotplug_enable(void) { }
diff --git a/kernel/cpu.c b/kernel/cpu.c
index 0db8938..307486b 100644
--- a/kernel/cpu.c
+++ b/kernel/cpu.c
@@ -290,6 +290,12 @@ void cpus_read_lock(void)
 }
 EXPORT_SYMBOL_GPL(cpus_read_lock);
 
+int cpus_read_trylock(void)
+{
+	return percpu_down_read_trylock(&cpu_hotplug_lock);
+}
+EXPORT_SYMBOL_GPL(cpus_read_trylock);
+
 void cpus_read_unlock(void)
 {
 	percpu_up_read(&cpu_hotplug_lock);
-- 
1.8.3.1



* [PATCH 2/2] cpufreq: Fix a circular lock dependency problem
  2018-07-23 17:49 [PATCH 0/2] cpufreq: Fix a circular lock dependency problem Waiman Long
  2018-07-23 17:49 ` [PATCH 1/2] cpu/hotplug: Add a cpus_read_trylock() function Waiman Long
@ 2018-07-23 17:49 ` Waiman Long
  2018-07-23 19:16   ` Peter Zijlstra
  1 sibling, 1 reply; 7+ messages in thread
From: Waiman Long @ 2018-07-23 17:49 UTC (permalink / raw)
  To: Rafael J. Wysocki, Viresh Kumar, Thomas Gleixner, Peter Zijlstra,
	Ingo Molnar
  Cc: linux-kernel, linux-pm, Paul E. McKenney, Greg Kroah-Hartman,
	Konrad Rzeszutek Wilk, Waiman Long

With lockdep turned on, the following circular lock dependency problem
was reported:

[   57.470040] ======================================================
[   57.502900] WARNING: possible circular locking dependency detected
[   57.535208] 4.18.0-0.rc3.1.el8+7.x86_64+debug #1 Tainted: G
[   57.577761] ------------------------------------------------------
[   57.609714] tuned/1505 is trying to acquire lock:
[   57.633808] 00000000559deec5 (cpu_hotplug_lock.rw_sem){++++}, at: store+0x27/0x120
[   57.672880]
[   57.672880] but task is already holding lock:
[   57.702184] 000000002136ca64 (kn->count#118){++++}, at: kernfs_fop_write+0x1d0/0x410
[   57.742176]
[   57.742176] which lock already depends on the new lock.
[   57.742176]
[   57.785220]
[   57.785220] the existing dependency chain (in reverse order) is:
    :
[   58.932512] other info that might help us debug this:
[   58.932512]
[   58.973344] Chain exists of:
[   58.973344]   cpu_hotplug_lock.rw_sem --> subsys mutex#5 --> kn->count#118
[   58.973344]
[   59.030795]  Possible unsafe locking scenario:
[   59.030795]
[   59.061248]        CPU0                    CPU1
[   59.085377]        ----                    ----
[   59.108160]   lock(kn->count#118);
[   59.124935]                                lock(subsys mutex#5);
[   59.156330]                                lock(kn->count#118);
[   59.186088]   lock(cpu_hotplug_lock.rw_sem);
[   59.208541]
[   59.208541]  *** DEADLOCK ***

In the cpufreq_register_driver() function, the lock sequence is:

  cpus_read_lock --> kn->count

For the cpufreq sysfs store method, the lock sequence is:

  kn->count --> cpus_read_lock

These sequences are actually safe as they are taking a shared lock on
cpu_hotplug_lock. However, the current lockdep code doesn't check for
shared locking when detecting circular lock dependencies.  Fixing that
could be a substantial effort.

Instead, we can work around this problem by using cpus_read_trylock()
in the store method, which is much simpler. The chance of not getting
the read lock is extremely small. If that happens, the userspace
application that writes the sysfs file will get an error.

Signed-off-by: Waiman Long <longman@redhat.com>
---
 drivers/cpufreq/cpufreq.c | 16 +++++++++++++++-
 1 file changed, 15 insertions(+), 1 deletion(-)

diff --git a/drivers/cpufreq/cpufreq.c b/drivers/cpufreq/cpufreq.c
index b0dfd32..9cf02d7 100644
--- a/drivers/cpufreq/cpufreq.c
+++ b/drivers/cpufreq/cpufreq.c
@@ -922,8 +922,22 @@ static ssize_t store(struct kobject *kobj, struct attribute *attr,
 	struct cpufreq_policy *policy = to_policy(kobj);
 	struct freq_attr *fattr = to_attr(attr);
 	ssize_t ret = -EINVAL;
+	int retries = 3;
 
-	cpus_read_lock();
+	/*
+	 * cpus_read_trylock() is used here to work around a circular lock
+	 * dependency problem with respect to the cpufreq_register_driver().
+	 * With a simple retry loop, the chance of not being able to get the
+	 * read lock is extremely small.
+	 */
+	while (!cpus_read_trylock()) {
+		if (retries-- <= 0)
+			return -EBUSY;
+		/*
+		 * Sleep for about 50ms and retry again.
+		 */
+		msleep(50);
+	}
 
 	if (cpu_online(policy->cpu)) {
 		down_write(&policy->rwsem);
-- 
1.8.3.1



* Re: [PATCH 2/2] cpufreq: Fix a circular lock dependency problem
  2018-07-23 17:49 ` [PATCH 2/2] cpufreq: Fix a circular lock dependency problem Waiman Long
@ 2018-07-23 19:16   ` Peter Zijlstra
  2018-07-23 19:27     ` Waiman Long
  2018-07-24  8:31     ` Rafael J. Wysocki
  0 siblings, 2 replies; 7+ messages in thread
From: Peter Zijlstra @ 2018-07-23 19:16 UTC (permalink / raw)
  To: Waiman Long
  Cc: Rafael J. Wysocki, Viresh Kumar, Thomas Gleixner, Ingo Molnar,
	linux-kernel, linux-pm, Paul E. McKenney, Greg Kroah-Hartman,
	Konrad Rzeszutek Wilk

On Mon, Jul 23, 2018 at 01:49:39PM -0400, Waiman Long wrote:
>  drivers/cpufreq/cpufreq.c | 16 +++++++++++++++-
>  1 file changed, 15 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/cpufreq/cpufreq.c b/drivers/cpufreq/cpufreq.c
> index b0dfd32..9cf02d7 100644
> --- a/drivers/cpufreq/cpufreq.c
> +++ b/drivers/cpufreq/cpufreq.c
> @@ -922,8 +922,22 @@ static ssize_t store(struct kobject *kobj, struct attribute *attr,
>  	struct cpufreq_policy *policy = to_policy(kobj);
>  	struct freq_attr *fattr = to_attr(attr);
>  	ssize_t ret = -EINVAL;
> +	int retries = 3;
>  
> -	cpus_read_lock();
> +	/*
> +	 * cpus_read_trylock() is used here to work around a circular lock
> +	 * dependency problem with respect to the cpufreq_register_driver().
> +	 * With a simple retry loop, the chance of not able to get the
> +	 * read lock is extremely small.
> +	 */
> +	while (!cpus_read_trylock()) {
> +		if (retries-- <= 0)
> +			return -EBUSY;
> +		/*
> +		 * Sleep for about 50ms and retry again.
> +		 */
> +		msleep(50);
> +	}

That's atrocious.




* Re: [PATCH 2/2] cpufreq: Fix a circular lock dependency problem
  2018-07-23 19:16   ` Peter Zijlstra
@ 2018-07-23 19:27     ` Waiman Long
  2018-07-24  8:36       ` Rafael J. Wysocki
  2018-07-24  8:31     ` Rafael J. Wysocki
  1 sibling, 1 reply; 7+ messages in thread
From: Waiman Long @ 2018-07-23 19:27 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: Rafael J. Wysocki, Viresh Kumar, Thomas Gleixner, Ingo Molnar,
	linux-kernel, linux-pm, Paul E. McKenney, Greg Kroah-Hartman,
	Konrad Rzeszutek Wilk

On 07/23/2018 03:16 PM, Peter Zijlstra wrote:
> On Mon, Jul 23, 2018 at 01:49:39PM -0400, Waiman Long wrote:
>>  drivers/cpufreq/cpufreq.c | 16 +++++++++++++++-
>>  1 file changed, 15 insertions(+), 1 deletion(-)
>>
>> diff --git a/drivers/cpufreq/cpufreq.c b/drivers/cpufreq/cpufreq.c
>> index b0dfd32..9cf02d7 100644
>> --- a/drivers/cpufreq/cpufreq.c
>> +++ b/drivers/cpufreq/cpufreq.c
>> @@ -922,8 +922,22 @@ static ssize_t store(struct kobject *kobj, struct attribute *attr,
>>  	struct cpufreq_policy *policy = to_policy(kobj);
>>  	struct freq_attr *fattr = to_attr(attr);
>>  	ssize_t ret = -EINVAL;
>> +	int retries = 3;
>>  
>> -	cpus_read_lock();
>> +	/*
>> +	 * cpus_read_trylock() is used here to work around a circular lock
>> +	 * dependency problem with respect to the cpufreq_register_driver().
>> +	 * With a simple retry loop, the chance of not able to get the
>> +	 * read lock is extremely small.
>> +	 */
>> +	while (!cpus_read_trylock()) {
>> +		if (retries-- <= 0)
>> +			return -EBUSY;
>> +		/*
>> +		 * Sleep for about 50ms and retry again.
>> +		 */
>> +		msleep(50);
>> +	}
> That's atrocious.
>
>
I had thought about just returning an error if the trylock fails, as CPU
hotplug rarely happens. I can revert to that simple case if others have
no objection.

Cheers,
Longman



* Re: [PATCH 2/2] cpufreq: Fix a circular lock dependency problem
  2018-07-23 19:16   ` Peter Zijlstra
  2018-07-23 19:27     ` Waiman Long
@ 2018-07-24  8:31     ` Rafael J. Wysocki
  1 sibling, 0 replies; 7+ messages in thread
From: Rafael J. Wysocki @ 2018-07-24  8:31 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: Waiman Long, Rafael J. Wysocki, Viresh Kumar, Thomas Gleixner,
	Ingo Molnar, Linux Kernel Mailing List, Linux PM,
	Paul E. McKenney, Greg Kroah-Hartman, Konrad Rzeszutek Wilk

On Mon, Jul 23, 2018 at 9:16 PM, Peter Zijlstra <peterz@infradead.org> wrote:
> On Mon, Jul 23, 2018 at 01:49:39PM -0400, Waiman Long wrote:
>>  drivers/cpufreq/cpufreq.c | 16 +++++++++++++++-
>>  1 file changed, 15 insertions(+), 1 deletion(-)
>>
>> diff --git a/drivers/cpufreq/cpufreq.c b/drivers/cpufreq/cpufreq.c
>> index b0dfd32..9cf02d7 100644
>> --- a/drivers/cpufreq/cpufreq.c
>> +++ b/drivers/cpufreq/cpufreq.c
>> @@ -922,8 +922,22 @@ static ssize_t store(struct kobject *kobj, struct attribute *attr,
>>       struct cpufreq_policy *policy = to_policy(kobj);
>>       struct freq_attr *fattr = to_attr(attr);
>>       ssize_t ret = -EINVAL;
>> +     int retries = 3;
>>
>> -     cpus_read_lock();
>> +     /*
>> +      * cpus_read_trylock() is used here to work around a circular lock
>> +      * dependency problem with respect to the cpufreq_register_driver().
>> +      * With a simple retry loop, the chance of not able to get the
>> +      * read lock is extremely small.
>> +      */
>> +     while (!cpus_read_trylock()) {
>> +             if (retries-- <= 0)
>> +                     return -EBUSY;
>> +             /*
>> +              * Sleep for about 50ms and retry again.
>> +              */
>> +             msleep(50);
>> +     }
>
> That's atrocious.

Agreed.


* Re: [PATCH 2/2] cpufreq: Fix a circular lock dependency problem
  2018-07-23 19:27     ` Waiman Long
@ 2018-07-24  8:36       ` Rafael J. Wysocki
  0 siblings, 0 replies; 7+ messages in thread
From: Rafael J. Wysocki @ 2018-07-24  8:36 UTC (permalink / raw)
  To: Waiman Long
  Cc: Peter Zijlstra, Rafael J. Wysocki, Viresh Kumar, Thomas Gleixner,
	Ingo Molnar, Linux Kernel Mailing List, Linux PM,
	Paul E. McKenney, Greg Kroah-Hartman, Konrad Rzeszutek Wilk

On Mon, Jul 23, 2018 at 9:27 PM, Waiman Long <longman@redhat.com> wrote:
> On 07/23/2018 03:16 PM, Peter Zijlstra wrote:
>> On Mon, Jul 23, 2018 at 01:49:39PM -0400, Waiman Long wrote:
>>>  drivers/cpufreq/cpufreq.c | 16 +++++++++++++++-
>>>  1 file changed, 15 insertions(+), 1 deletion(-)
>>>
>>> diff --git a/drivers/cpufreq/cpufreq.c b/drivers/cpufreq/cpufreq.c
>>> index b0dfd32..9cf02d7 100644
>>> --- a/drivers/cpufreq/cpufreq.c
>>> +++ b/drivers/cpufreq/cpufreq.c
>>> @@ -922,8 +922,22 @@ static ssize_t store(struct kobject *kobj, struct attribute *attr,
>>>      struct cpufreq_policy *policy = to_policy(kobj);
>>>      struct freq_attr *fattr = to_attr(attr);
>>>      ssize_t ret = -EINVAL;
>>> +    int retries = 3;
>>>
>>> -    cpus_read_lock();
>>> +    /*
>>> +     * cpus_read_trylock() is used here to work around a circular lock
>>> +     * dependency problem with respect to the cpufreq_register_driver().
>>> +     * With a simple retry loop, the chance of not able to get the
>>> +     * read lock is extremely small.
>>> +     */
>>> +    while (!cpus_read_trylock()) {
>>> +            if (retries-- <= 0)
>>> +                    return -EBUSY;
>>> +            /*
>>> +             * Sleep for about 50ms and retry again.
>>> +             */
>>> +            msleep(50);
>>> +    }
>> That's atrocious.
>>
>>
> I had thought about just returning an error if the trylock fails, as CPU
> hotplug rarely happens. I can revert to that simple case if others have
> no objection.

Yes, you can return -EBUSY or -EAGAIN right away from here if the
cpus_read_trylock() is not successful.  There is not much reason for
the sysfs operation to continue in that case.

