* [PATCH] locking/mutex: Mark racy reads of owner->on_cpu
From: Marco Elver @ 2021-12-02 10:12 UTC
To: elver, Peter Zijlstra, Ingo Molnar, Will Deacon, Waiman Long,
Boqun Feng, linux-kernel
Cc: kasan-dev, Thomas Gleixner, Mark Rutland, Paul E. McKenney
One of the more frequent data races reported by KCSAN is the racy read
in mutex_spin_on_owner(), which is usually reported as "race of unknown
origin" without showing the writer. This is due to the racing write
occurring in kernel/sched. Locally enabling KCSAN in kernel/sched shows:
| write (marked) to 0xffff97f205079934 of 4 bytes by task 316 on cpu 6:
| finish_task kernel/sched/core.c:4632 [inline]
| finish_task_switch kernel/sched/core.c:4848
| context_switch kernel/sched/core.c:4975 [inline]
| __schedule kernel/sched/core.c:6253
| schedule kernel/sched/core.c:6326
| schedule_preempt_disabled kernel/sched/core.c:6385
| __mutex_lock_common kernel/locking/mutex.c:680
| __mutex_lock kernel/locking/mutex.c:740 [inline]
| __mutex_lock_slowpath kernel/locking/mutex.c:1028
| mutex_lock kernel/locking/mutex.c:283
| tty_open_by_driver drivers/tty/tty_io.c:2062 [inline]
| ...
|
| read to 0xffff97f205079934 of 4 bytes by task 322 on cpu 3:
| mutex_spin_on_owner kernel/locking/mutex.c:370
| mutex_optimistic_spin kernel/locking/mutex.c:480
| __mutex_lock_common kernel/locking/mutex.c:610
| __mutex_lock kernel/locking/mutex.c:740 [inline]
| __mutex_lock_slowpath kernel/locking/mutex.c:1028
| mutex_lock kernel/locking/mutex.c:283
| tty_open_by_driver drivers/tty/tty_io.c:2062 [inline]
| ...
|
| value changed: 0x00000001 -> 0x00000000
This race is clearly intentional, and the potential for miscompilation
is slim due to surrounding barrier() and cpu_relax(), and the value
being used as a boolean.
Nevertheless, marking this reader would more clearly denote intent and
make it obvious that concurrency is expected. Use READ_ONCE() to avoid
having to reason about compiler optimizations now and in future.
Similarly, mark the read to owner->on_cpu in mutex_can_spin_on_owner(),
which immediately precedes the loop executing mutex_spin_on_owner().
Signed-off-by: Marco Elver <elver@google.com>
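
For readers outside the kernel tree, the effect of the marking can be sketched in plain C. This is only an illustration under stated assumptions: READ_ONCE_SKETCH is a simplified userspace stand-in for the kernel's READ_ONCE() (the real macro does more type checking), and struct task_sketch and owner_running_sketch() are hypothetical names that do not appear in the patch. The volatile access keeps the compiler from caching, fusing, or re-reading the load, which is the property the commit message relies on.

```c
#include <stdbool.h>

/*
 * Simplified stand-in for READ_ONCE(): force a single volatile load.
 * Assumption: GCC or Clang, since __typeof__ is a compiler extension.
 */
#define READ_ONCE_SKETCH(x) (*(const volatile __typeof__(x) *)&(x))

/* Hypothetical cut-down task structure holding only the racy field. */
struct task_sketch {
	int on_cpu;	/* written concurrently by the scheduler in the real kernel */
};

/* Returns true while the lock owner still appears to be running on a CPU. */
static bool owner_running_sketch(struct task_sketch *owner)
{
	/* Marked read: concurrency is expected here, so say so explicitly. */
	return READ_ONCE_SKETCH(owner->on_cpu) != 0;
}
```

A spin loop would call owner_running_sketch() on every iteration and stop spinning as soon as it returns false.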
---
I decided to send this out now due to the discussion at [1], because it
is one of the first things that people notice when enabling KCSAN.
[1] https://lkml.kernel.org/r/811af0bc-0c99-37f6-a39a-095418b10661@huawei.com
It had been reported before, but never with the 2nd stack trace -- so at
the very least this patch can now serve as a reference.
Thanks,
-- Marco
---
kernel/locking/mutex.c | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/kernel/locking/mutex.c b/kernel/locking/mutex.c
index db1913611192..50c03a3fa61e 100644
--- a/kernel/locking/mutex.c
+++ b/kernel/locking/mutex.c
@@ -367,7 +367,7 @@ bool mutex_spin_on_owner(struct mutex *lock, struct task_struct *owner,
/*
* Use vcpu_is_preempted to detect lock holder preemption issue.
*/
- if (!owner->on_cpu || need_resched() ||
+ if (!READ_ONCE(owner->on_cpu) || need_resched() ||
vcpu_is_preempted(task_cpu(owner))) {
ret = false;
break;
@@ -410,7 +410,7 @@ static inline int mutex_can_spin_on_owner(struct mutex *lock)
*/
if (owner)
- retval = owner->on_cpu && !vcpu_is_preempted(task_cpu(owner));
+ retval = READ_ONCE(owner->on_cpu) && !vcpu_is_preempted(task_cpu(owner));
/*
* If lock->owner is not set, the mutex has been released. Return true
--
2.34.0.rc2.393.gf8c9666880-goog
* Re: [PATCH] locking/mutex: Mark racy reads of owner->on_cpu
From: Marco Elver @ 2021-12-02 11:53 UTC
To: elver, Peter Zijlstra, Ingo Molnar, Will Deacon, Waiman Long,
Boqun Feng, linux-kernel
Cc: kasan-dev, Thomas Gleixner, Mark Rutland, Paul E. McKenney, Kefeng Wang
On Thu, 2 Dec 2021 at 11:13, Marco Elver <elver@google.com> wrote:
> One of the more frequent data races reported by KCSAN is the racy read
> in mutex_spin_on_owner(), which is usually reported as "race of unknown
> origin" without showing the writer. This is due to the racing write
> occurring in kernel/sched. Locally enabling KCSAN in kernel/sched shows:
>
> | write (marked) to 0xffff97f205079934 of 4 bytes by task 316 on cpu 6:
> | finish_task kernel/sched/core.c:4632 [inline]
> | finish_task_switch kernel/sched/core.c:4848
> | context_switch kernel/sched/core.c:4975 [inline]
> | __schedule kernel/sched/core.c:6253
> | schedule kernel/sched/core.c:6326
> | schedule_preempt_disabled kernel/sched/core.c:6385
> | __mutex_lock_common kernel/locking/mutex.c:680
> | __mutex_lock kernel/locking/mutex.c:740 [inline]
> | __mutex_lock_slowpath kernel/locking/mutex.c:1028
> | mutex_lock kernel/locking/mutex.c:283
> | tty_open_by_driver drivers/tty/tty_io.c:2062 [inline]
> | ...
> |
> | read to 0xffff97f205079934 of 4 bytes by task 322 on cpu 3:
> | mutex_spin_on_owner kernel/locking/mutex.c:370
> | mutex_optimistic_spin kernel/locking/mutex.c:480
> | __mutex_lock_common kernel/locking/mutex.c:610
> | __mutex_lock kernel/locking/mutex.c:740 [inline]
> | __mutex_lock_slowpath kernel/locking/mutex.c:1028
> | mutex_lock kernel/locking/mutex.c:283
> | tty_open_by_driver drivers/tty/tty_io.c:2062 [inline]
> | ...
> |
> | value changed: 0x00000001 -> 0x00000000
>
> This race is clearly intentional, and the potential for miscompilation
> is slim due to surrounding barrier() and cpu_relax(), and the value
> being used as a boolean.
>
> Nevertheless, marking this reader would more clearly denote intent and
> make it obvious that concurrency is expected. Use READ_ONCE() to avoid
> having to reason about compiler optimizations now and in future.
>
> Similarly, mark the read to owner->on_cpu in mutex_can_spin_on_owner(),
> which immediately precedes the loop executing mutex_spin_on_owner().
>
> Signed-off-by: Marco Elver <elver@google.com>
[...]
Kefeng kindly pointed out that there is an alternative: refactoring
owner_on_cpu() out of rwsem, which would address both mutex and
rwsem:
https://lore.kernel.org/all/b641f1ea-6def-0fe4-d273-03c35c4aa7d6@huawei.com/
Preferences?
Thanks,
-- Marco
* Re: [PATCH] locking/mutex: Mark racy reads of owner->on_cpu
From: Peter Zijlstra @ 2021-12-02 14:46 UTC
To: Marco Elver
Cc: Ingo Molnar, Will Deacon, Waiman Long, Boqun Feng, linux-kernel,
kasan-dev, Thomas Gleixner, Mark Rutland, Paul E. McKenney,
Kefeng Wang
On Thu, Dec 02, 2021 at 12:53:14PM +0100, Marco Elver wrote:
> On Thu, 2 Dec 2021 at 11:13, Marco Elver <elver@google.com> wrote:
> > One of the more frequent data races reported by KCSAN is the racy read
> > in mutex_spin_on_owner(), which is usually reported as "race of unknown
> > origin" without showing the writer. This is due to the racing write
> > occurring in kernel/sched. Locally enabling KCSAN in kernel/sched shows:
> >
> > | write (marked) to 0xffff97f205079934 of 4 bytes by task 316 on cpu 6:
> > | finish_task kernel/sched/core.c:4632 [inline]
> > | finish_task_switch kernel/sched/core.c:4848
> > | context_switch kernel/sched/core.c:4975 [inline]
> > | __schedule kernel/sched/core.c:6253
> > | schedule kernel/sched/core.c:6326
> > | schedule_preempt_disabled kernel/sched/core.c:6385
> > | __mutex_lock_common kernel/locking/mutex.c:680
> > | __mutex_lock kernel/locking/mutex.c:740 [inline]
> > | __mutex_lock_slowpath kernel/locking/mutex.c:1028
> > | mutex_lock kernel/locking/mutex.c:283
> > | tty_open_by_driver drivers/tty/tty_io.c:2062 [inline]
> > | ...
> > |
> > | read to 0xffff97f205079934 of 4 bytes by task 322 on cpu 3:
> > | mutex_spin_on_owner kernel/locking/mutex.c:370
> > | mutex_optimistic_spin kernel/locking/mutex.c:480
> > | __mutex_lock_common kernel/locking/mutex.c:610
> > | __mutex_lock kernel/locking/mutex.c:740 [inline]
> > | __mutex_lock_slowpath kernel/locking/mutex.c:1028
> > | mutex_lock kernel/locking/mutex.c:283
> > | tty_open_by_driver drivers/tty/tty_io.c:2062 [inline]
> > | ...
> > |
> > | value changed: 0x00000001 -> 0x00000000
> >
> > This race is clearly intentional, and the potential for miscompilation
> > is slim due to surrounding barrier() and cpu_relax(), and the value
> > being used as a boolean.
> >
> > Nevertheless, marking this reader would more clearly denote intent and
> > make it obvious that concurrency is expected. Use READ_ONCE() to avoid
> > having to reason about compiler optimizations now and in future.
> >
> > Similarly, mark the read to owner->on_cpu in mutex_can_spin_on_owner(),
> > which immediately precedes the loop executing mutex_spin_on_owner().
> >
> > Signed-off-by: Marco Elver <elver@google.com>
> [...]
>
> Kefeng kindly pointed out that there is an alternative, which would
> refactor owner_on_cpu() from rwsem that would address both mutex and
> rwsem:
> https://lore.kernel.org/all/b641f1ea-6def-0fe4-d273-03c35c4aa7d6@huawei.com/
That seems to make sense, except it should probably go under CONFIG_SMP,
since ->on_cpu doesn't otherwise exist.
* Re: [PATCH] locking/mutex: Mark racy reads of owner->on_cpu
From: Waiman Long @ 2021-12-02 15:46 UTC
To: Marco Elver, Peter Zijlstra, Ingo Molnar, Will Deacon,
Boqun Feng, linux-kernel
Cc: kasan-dev, Thomas Gleixner, Mark Rutland, Paul E. McKenney, Kefeng Wang
On 12/2/21 06:53, Marco Elver wrote:
> On Thu, 2 Dec 2021 at 11:13, Marco Elver <elver@google.com> wrote:
>> One of the more frequent data races reported by KCSAN is the racy read
>> in mutex_spin_on_owner(), which is usually reported as "race of unknown
>> origin" without showing the writer. This is due to the racing write
>> occurring in kernel/sched. Locally enabling KCSAN in kernel/sched shows:
>>
>> | write (marked) to 0xffff97f205079934 of 4 bytes by task 316 on cpu 6:
>> | finish_task kernel/sched/core.c:4632 [inline]
>> | finish_task_switch kernel/sched/core.c:4848
>> | context_switch kernel/sched/core.c:4975 [inline]
>> | __schedule kernel/sched/core.c:6253
>> | schedule kernel/sched/core.c:6326
>> | schedule_preempt_disabled kernel/sched/core.c:6385
>> | __mutex_lock_common kernel/locking/mutex.c:680
>> | __mutex_lock kernel/locking/mutex.c:740 [inline]
>> | __mutex_lock_slowpath kernel/locking/mutex.c:1028
>> | mutex_lock kernel/locking/mutex.c:283
>> | tty_open_by_driver drivers/tty/tty_io.c:2062 [inline]
>> | ...
>> |
>> | read to 0xffff97f205079934 of 4 bytes by task 322 on cpu 3:
>> | mutex_spin_on_owner kernel/locking/mutex.c:370
>> | mutex_optimistic_spin kernel/locking/mutex.c:480
>> | __mutex_lock_common kernel/locking/mutex.c:610
>> | __mutex_lock kernel/locking/mutex.c:740 [inline]
>> | __mutex_lock_slowpath kernel/locking/mutex.c:1028
>> | mutex_lock kernel/locking/mutex.c:283
>> | tty_open_by_driver drivers/tty/tty_io.c:2062 [inline]
>> | ...
>> |
>> | value changed: 0x00000001 -> 0x00000000
>>
>> This race is clearly intentional, and the potential for miscompilation
>> is slim due to surrounding barrier() and cpu_relax(), and the value
>> being used as a boolean.
>>
>> Nevertheless, marking this reader would more clearly denote intent and
>> make it obvious that concurrency is expected. Use READ_ONCE() to avoid
>> having to reason about compiler optimizations now and in future.
>>
>> Similarly, mark the read to owner->on_cpu in mutex_can_spin_on_owner(),
>> which immediately precedes the loop executing mutex_spin_on_owner().
>>
>> Signed-off-by: Marco Elver <elver@google.com>
> [...]
>
> Kefeng kindly pointed out that there is an alternative, which would
> refactor owner_on_cpu() from rwsem that would address both mutex and
> rwsem:
> https://lore.kernel.org/all/b641f1ea-6def-0fe4-d273-03c35c4aa7d6@huawei.com/
>
> Preferences?
I would like to see owner_on_cpu() extracted out from
kernel/locking/rwsem.c into include/linux/sched.h right after
vcpu_is_preempted(), for instance, and with READ_ONCE() added. Then it
can be used in mutex.c as well. This problem is common to both mutex and
rwsem.
Cheers,
Longman
> Thanks,
> -- Marco
* Re: [PATCH] locking/mutex: Mark racy reads of owner->on_cpu
From: Thomas Gleixner @ 2021-12-17 21:41 UTC
To: Waiman Long, Marco Elver, Peter Zijlstra, Ingo Molnar,
Will Deacon, Boqun Feng, linux-kernel
Cc: kasan-dev, Mark Rutland, Paul E. McKenney, Kefeng Wang
On Thu, Dec 02 2021 at 10:46, Waiman Long wrote:
> On 12/2/21 06:53, Marco Elver wrote:
> I would like to see owner_on_cpu() extracted out from
> kernel/locking/rwsem.c into include/linux/sched.h right after
> vcpu_is_preempted(), for instance, and with READ_ONCE() added. Then it
> can be used in mutex.c as well. This problem is common to both mutex and
> rwsem.
And rtmutex.c
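
The shared helper discussed in this thread could look roughly like the following. This is a userspace sketch under stated assumptions, not the code that any of the participants posted: every name ending in _sketch is hypothetical, SKETCH_CONFIG_SMP stands in for CONFIG_SMP, and vcpu_is_preempted_sketch() is a stub that always reports "not preempted". It combines the READ_ONCE() marking with the SMP guard Peter asked for, so that mutex, rwsem and rtmutex could all call a single function.

```c
#include <stdbool.h>

/* Stand-in for CONFIG_SMP; ->on_cpu only exists on SMP builds. */
#define SKETCH_CONFIG_SMP 1

/* Simplified stand-in for READ_ONCE(); assumes GCC or Clang (__typeof__). */
#define READ_ONCE_SKETCH(x) (*(const volatile __typeof__(x) *)&(x))

/* Hypothetical cut-down task structure. */
struct task_sketch {
	int on_cpu;	/* nonzero while the task is running on a CPU */
	int cpu;	/* CPU the task last ran on */
};

/* Stub: a real kernel asks the hypervisor; here no vCPU is ever preempted. */
static bool vcpu_is_preempted_sketch(int cpu)
{
	(void)cpu;
	return false;
}

static int task_cpu_sketch(const struct task_sketch *t)
{
	return t->cpu;
}

#if SKETCH_CONFIG_SMP
/*
 * Spinning only makes sense while the owner is on a CPU and that CPU is
 * not a preempted vCPU. The read of on_cpu is marked because it races
 * with the scheduler's write.
 */
static inline bool owner_on_cpu_sketch(struct task_sketch *owner)
{
	return READ_ONCE_SKETCH(owner->on_cpu) &&
	       !vcpu_is_preempted_sketch(task_cpu_sketch(owner));
}
#endif
```

With a helper of this shape, both checks changed by the patch above would reduce to a single call on the owner pointer.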