* [PATCH] kthread: Fix race condition between kthread_parkme() and kthread_unpark()
@ 2017-04-29 2:32 Junaid Shahid
2017-05-30 22:37 ` Junaid Shahid
2017-09-27 12:17 ` Peter Zijlstra
0 siblings, 2 replies; 9+ messages in thread
From: Junaid Shahid @ 2017-04-29 2:32 UTC (permalink / raw)
To: linux-kernel; +Cc: andreslc, gthelen, rusty
In general, if kthread_unpark() and kthread_parkme() execute together,
the kthread is supposed to be in an unparked state. This is because
kthread_unpark() either wakes up the thread if it already got parked,
or it cancels a prior kthread_park() call and hence renders the
kthread_parkme() moot.
However, if kthread_unpark() happens to execute between the time that
kthread_parkme() checks the KTHREAD_SHOULD_STOP flag and sets the
KTHREAD_IS_PARKED flag, then kthread_unpark() will not wake up the
thread and it will remain in a parked state.
This is fixed by making the checking of KTHREAD_SHOULD_STOP and the
setting of KTHREAD_IS_PARKED atomic via a cmpxchg inside kthread_parkme.
Signed-off-by: Junaid Shahid <junaids@google.com>
---
kernel/kthread.c | 16 ++++++++++++----
1 file changed, 12 insertions(+), 4 deletions(-)
diff --git a/kernel/kthread.c b/kernel/kthread.c
index 26db528c1d88..651f03baf62f 100644
--- a/kernel/kthread.c
+++ b/kernel/kthread.c
@@ -169,12 +169,20 @@ void *kthread_probe_data(struct task_struct *task)
static void __kthread_parkme(struct kthread *self)
{
+ ulong flags;
+
__set_current_state(TASK_PARKED);
- while (test_bit(KTHREAD_SHOULD_PARK, &self->flags)) {
- if (!test_and_set_bit(KTHREAD_IS_PARKED, &self->flags))
- complete(&self->parked);
- schedule();
+ flags = self->flags;
+
+ while (test_bit(KTHREAD_SHOULD_PARK, &flags)) {
+ if (cmpxchg(&self->flags, flags,
+ flags | (1 << KTHREAD_IS_PARKED)) == flags) {
+ if (!test_bit(KTHREAD_IS_PARKED, &flags))
+ complete(&self->parked);
+ schedule();
+ }
__set_current_state(TASK_PARKED);
+ flags = self->flags;
}
clear_bit(KTHREAD_IS_PARKED, &self->flags);
__set_current_state(TASK_RUNNING);
--
2.13.0.rc0.306.g87b477812d-goog
^ permalink raw reply related [flat|nested] 9+ messages in thread
* Re: [PATCH] kthread: Fix race condition between kthread_parkme() and kthread_unpark()
2017-04-29 2:32 [PATCH] kthread: Fix race condition between kthread_parkme() and kthread_unpark() Junaid Shahid
@ 2017-05-30 22:37 ` Junaid Shahid
2017-05-30 22:40 ` Andrew Morton
2017-09-27 12:17 ` Peter Zijlstra
1 sibling, 1 reply; 9+ messages in thread
From: Junaid Shahid @ 2017-05-30 22:37 UTC (permalink / raw)
To: linux-kernel, akpm; +Cc: andreslc, gthelen, rusty
(Resending)
On Friday, April 28, 2017 07:32:36 PM Junaid Shahid wrote:
> In general, if kthread_unpark() and kthread_parkme() execute together,
> the kthread is supposed to be in an unparked state. This is because
> kthread_unpark() either wakes up the thread if it already got parked,
> or it cancels a prior kthread_park() call and hence renders the
> kthread_parkme() moot.
>
> However, if kthread_unpark() happens to execute between the time that
> kthread_parkme() checks the KTHREAD_SHOULD_STOP flag and sets the
> KTHREAD_IS_PARKED flag, then kthread_unpark() will not wake up the
> thread and it will remain in a parked state.
>
> This is fixed by making the checking of KTHREAD_SHOULD_STOP and the
> setting of KTHREAD_IS_PARKED atomic via a cmpxchg inside kthread_parkme.
>
> Signed-off-by: Junaid Shahid <junaids@google.com>
> ---
> kernel/kthread.c | 16 ++++++++++++----
> 1 file changed, 12 insertions(+), 4 deletions(-)
>
> diff --git a/kernel/kthread.c b/kernel/kthread.c
> index 26db528c1d88..651f03baf62f 100644
> --- a/kernel/kthread.c
> +++ b/kernel/kthread.c
> @@ -169,12 +169,20 @@ void *kthread_probe_data(struct task_struct *task)
>
> static void __kthread_parkme(struct kthread *self)
> {
> + ulong flags;
> +
> __set_current_state(TASK_PARKED);
> - while (test_bit(KTHREAD_SHOULD_PARK, &self->flags)) {
> - if (!test_and_set_bit(KTHREAD_IS_PARKED, &self->flags))
> - complete(&self->parked);
> - schedule();
> + flags = self->flags;
> +
> + while (test_bit(KTHREAD_SHOULD_PARK, &flags)) {
> + if (cmpxchg(&self->flags, flags,
> + flags | (1 << KTHREAD_IS_PARKED)) == flags) {
> + if (!test_bit(KTHREAD_IS_PARKED, &flags))
> + complete(&self->parked);
> + schedule();
> + }
> __set_current_state(TASK_PARKED);
> + flags = self->flags;
> }
> clear_bit(KTHREAD_IS_PARKED, &self->flags);
> __set_current_state(TASK_RUNNING);
>
^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [PATCH] kthread: Fix race condition between kthread_parkme() and kthread_unpark()
2017-05-30 22:37 ` Junaid Shahid
@ 2017-05-30 22:40 ` Andrew Morton
2017-07-18 0:13 ` Junaid Shahid
0 siblings, 1 reply; 9+ messages in thread
From: Andrew Morton @ 2017-05-30 22:40 UTC (permalink / raw)
To: Junaid Shahid
Cc: linux-kernel, andreslc, gthelen, rusty, Thomas Gleixner, Tejun Heo
(cc tglx & tj)
On Tue, 30 May 2017 15:37:23 -0700 Junaid Shahid <junaids@google.com> wrote:
> (Resending)
>
> On Friday, April 28, 2017 07:32:36 PM Junaid Shahid wrote:
> > In general, if kthread_unpark() and kthread_parkme() execute together,
> > the kthread is supposed to be in an unparked state. This is because
> > kthread_unpark() either wakes up the thread if it already got parked,
> > or it cancels a prior kthread_park() call and hence renders the
> > kthread_parkme() moot.
> >
> > However, if kthread_unpark() happens to execute between the time that
> > kthread_parkme() checks the KTHREAD_SHOULD_STOP flag and sets the
> > KTHREAD_IS_PARKED flag, then kthread_unpark() will not wake up the
> > thread and it will remain in a parked state.
> >
> > This is fixed by making the checking of KTHREAD_SHOULD_STOP and the
> > setting of KTHREAD_IS_PARKED atomic via a cmpxchg inside kthread_parkme.
> >
> > Signed-off-by: Junaid Shahid <junaids@google.com>
> > ---
> > kernel/kthread.c | 16 ++++++++++++----
> > 1 file changed, 12 insertions(+), 4 deletions(-)
> >
> > diff --git a/kernel/kthread.c b/kernel/kthread.c
> > index 26db528c1d88..651f03baf62f 100644
> > --- a/kernel/kthread.c
> > +++ b/kernel/kthread.c
> > @@ -169,12 +169,20 @@ void *kthread_probe_data(struct task_struct *task)
> >
> > static void __kthread_parkme(struct kthread *self)
> > {
> > + ulong flags;
> > +
> > __set_current_state(TASK_PARKED);
> > - while (test_bit(KTHREAD_SHOULD_PARK, &self->flags)) {
> > - if (!test_and_set_bit(KTHREAD_IS_PARKED, &self->flags))
> > - complete(&self->parked);
> > - schedule();
> > + flags = self->flags;
> > +
> > + while (test_bit(KTHREAD_SHOULD_PARK, &flags)) {
> > + if (cmpxchg(&self->flags, flags,
> > + flags | (1 << KTHREAD_IS_PARKED)) == flags) {
> > + if (!test_bit(KTHREAD_IS_PARKED, &flags))
> > + complete(&self->parked);
> > + schedule();
> > + }
> > __set_current_state(TASK_PARKED);
> > + flags = self->flags;
> > }
> > clear_bit(KTHREAD_IS_PARKED, &self->flags);
> > __set_current_state(TASK_RUNNING);
> >
^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [PATCH] kthread: Fix race condition between kthread_parkme() and kthread_unpark()
2017-05-30 22:40 ` Andrew Morton
@ 2017-07-18 0:13 ` Junaid Shahid
2017-07-18 6:35 ` Thomas Gleixner
0 siblings, 1 reply; 9+ messages in thread
From: Junaid Shahid @ 2017-07-18 0:13 UTC (permalink / raw)
To: Andrew Morton, Thomas Gleixner, Tejun Heo
Cc: linux-kernel, andreslc, gthelen, rusty
Hi,
Has anyone been able to take a look at this?
Thanks,
Junaid
On Tuesday, May 30, 2017 03:40:02 PM Andrew Morton wrote:
> (cc tglx & tj)
>
> On Tue, 30 May 2017 15:37:23 -0700 Junaid Shahid <junaids@google.com> wrote:
>
> > (Resending)
> >
> > On Friday, April 28, 2017 07:32:36 PM Junaid Shahid wrote:
> > > In general, if kthread_unpark() and kthread_parkme() execute together,
> > > the kthread is supposed to be in an unparked state. This is because
> > > kthread_unpark() either wakes up the thread if it already got parked,
> > > or it cancels a prior kthread_park() call and hence renders the
> > > kthread_parkme() moot.
> > >
> > > However, if kthread_unpark() happens to execute between the time that
> > > kthread_parkme() checks the KTHREAD_SHOULD_STOP flag and sets the
> > > KTHREAD_IS_PARKED flag, then kthread_unpark() will not wake up the
> > > thread and it will remain in a parked state.
> > >
> > > This is fixed by making the checking of KTHREAD_SHOULD_STOP and the
> > > setting of KTHREAD_IS_PARKED atomic via a cmpxchg inside kthread_parkme.
> > >
> > > Signed-off-by: Junaid Shahid <junaids@google.com>
> > > ---
> > > kernel/kthread.c | 16 ++++++++++++----
> > > 1 file changed, 12 insertions(+), 4 deletions(-)
> > >
> > > diff --git a/kernel/kthread.c b/kernel/kthread.c
> > > index 26db528c1d88..651f03baf62f 100644
> > > --- a/kernel/kthread.c
> > > +++ b/kernel/kthread.c
> > > @@ -169,12 +169,20 @@ void *kthread_probe_data(struct task_struct *task)
> > >
> > > static void __kthread_parkme(struct kthread *self)
> > > {
> > > + ulong flags;
> > > +
> > > __set_current_state(TASK_PARKED);
> > > - while (test_bit(KTHREAD_SHOULD_PARK, &self->flags)) {
> > > - if (!test_and_set_bit(KTHREAD_IS_PARKED, &self->flags))
> > > - complete(&self->parked);
> > > - schedule();
> > > + flags = self->flags;
> > > +
> > > + while (test_bit(KTHREAD_SHOULD_PARK, &flags)) {
> > > + if (cmpxchg(&self->flags, flags,
> > > + flags | (1 << KTHREAD_IS_PARKED)) == flags) {
> > > + if (!test_bit(KTHREAD_IS_PARKED, &flags))
> > > + complete(&self->parked);
> > > + schedule();
> > > + }
> > > __set_current_state(TASK_PARKED);
> > > + flags = self->flags;
> > > }
> > > clear_bit(KTHREAD_IS_PARKED, &self->flags);
> > > __set_current_state(TASK_RUNNING);
> > >
^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [PATCH] kthread: Fix race condition between kthread_parkme() and kthread_unpark()
2017-07-18 0:13 ` Junaid Shahid
@ 2017-07-18 6:35 ` Thomas Gleixner
0 siblings, 0 replies; 9+ messages in thread
From: Thomas Gleixner @ 2017-07-18 6:35 UTC (permalink / raw)
To: Junaid Shahid
Cc: Andrew Morton, Tejun Heo, linux-kernel, andreslc, gthelen, rusty
On Mon, 17 Jul 2017, Junaid Shahid wrote:
> Hi,
>
> Has anyone been able to take a look at this?
Yes. It's in my pile of stuff to look at.
^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [PATCH] kthread: Fix race condition between kthread_parkme() and kthread_unpark()
2017-04-29 2:32 [PATCH] kthread: Fix race condition between kthread_parkme() and kthread_unpark() Junaid Shahid
2017-05-30 22:37 ` Junaid Shahid
@ 2017-09-27 12:17 ` Peter Zijlstra
[not found] ` <CAL-GctE0BLVk=ps09dnir8pcpnfzg88jMK4CKkRVmY7-ho87nQ@mail.gmail.com>
1 sibling, 1 reply; 9+ messages in thread
From: Peter Zijlstra @ 2017-09-27 12:17 UTC (permalink / raw)
To: Junaid Shahid; +Cc: linux-kernel, andreslc, gthelen, rusty, Thomas Gleixner
On Fri, Apr 28, 2017 at 07:32:36PM -0700, Junaid Shahid wrote:
> In general, if kthread_unpark() and kthread_parkme() execute together,
> the kthread is supposed to be in an unparked state. This is because
> kthread_unpark() either wakes up the thread if it already got parked,
> or it cancels a prior kthread_park() call and hence renders the
> kthread_parkme() moot.
>
> However, if kthread_unpark() happens to execute between the time that
> kthread_parkme() checks the KTHREAD_SHOULD_STOP flag and sets the
> KTHREAD_IS_PARKED flag, then kthread_unpark() will not wake up the
> thread and it will remain in a parked state.
>
> This is fixed by making the checking of KTHREAD_SHOULD_STOP and the
> setting of KTHREAD_IS_PARKED atomic via a cmpxchg inside kthread_parkme.
>
> Signed-off-by: Junaid Shahid <junaids@google.com>
> ---
> kernel/kthread.c | 16 ++++++++++++----
> 1 file changed, 12 insertions(+), 4 deletions(-)
>
> diff --git a/kernel/kthread.c b/kernel/kthread.c
> index 26db528c1d88..651f03baf62f 100644
> --- a/kernel/kthread.c
> +++ b/kernel/kthread.c
> @@ -169,12 +169,20 @@ void *kthread_probe_data(struct task_struct *task)
>
> static void __kthread_parkme(struct kthread *self)
> {
> + ulong flags;
That should be: "unsigned long".
> +
> __set_current_state(TASK_PARKED);
> - while (test_bit(KTHREAD_SHOULD_PARK, &self->flags)) {
> - if (!test_and_set_bit(KTHREAD_IS_PARKED, &self->flags))
> - complete(&self->parked);
> - schedule();
> + flags = self->flags;
And at the very least, that should be:
flags = READ_ONCE(self->flags);
> +
> + while (test_bit(KTHREAD_SHOULD_PARK, &flags)) {
Otherwise there's very little that stops the compiler from doing a
reload here.
> + if (cmpxchg(&self->flags, flags,
> + flags | (1 << KTHREAD_IS_PARKED)) == flags) {
> + if (!test_bit(KTHREAD_IS_PARKED, &flags))
> + complete(&self->parked);
> + schedule();
> + }
> __set_current_state(TASK_PARKED);
> + flags = self->flags;
> }
That said, this is really quite horrible.
The alternative is of course changing kthread_unpark() to use a cmpxchg
to clear SHOULD_PARK and then check IS_PARKED on its return value, but
that's equally horrible.
How about something like the below? I still think its overly fragile,
but barring more invasive changes the below more or less does as you
suggest.
---
kernel/kthread.c | 40 ++++++++++++++++++++++++++++++++++------
1 file changed, 34 insertions(+), 6 deletions(-)
diff --git a/kernel/kthread.c b/kernel/kthread.c
index 1c19edf82427..f437f53f6fbc 100644
--- a/kernel/kthread.c
+++ b/kernel/kthread.c
@@ -169,14 +169,38 @@ void *kthread_probe_data(struct task_struct *task)
static void __kthread_parkme(struct kthread *self)
{
- __set_current_state(TASK_PARKED);
- while (test_bit(KTHREAD_SHOULD_PARK, &self->flags)) {
- if (!test_and_set_bit(KTHREAD_IS_PARKED, &self->flags))
- complete(&self->parked);
- schedule();
+ unsigned long flags;
+
+ for (;;) {
__set_current_state(TASK_PARKED);
+
+ /*
+ * CPU0 CPU1
+ *
+ * [S] ->state = PARKED [S] ->flags &= ~SHOULD_PARK
+ * [L] ->flags [RmW] ->flags &= ~IS_PARKED
+ * MB MB
+ * [RmW] ->flags |= IS_PARKED wakeup()
+ *
+ * Thus, either CPU0 sets IS_PARKED and CPU1's wakeup must observe PARKED
+ * or we observe !SHOULD_PARK and terminate.
+ */
+
+ flags = READ_ONCE(self->flags);
+ if (!test_bit(KTHREAD_SHOULD_PARK, &flags))
+ break;
+
+ /*
+ * Ensure SHOULD_PARK remains set while we set IS_PARKED.
+ * Thereby we ensure kthread_unpark() cannot clear SHOULD_PARK
+ * without us noticing.
+ */
+
+ if (try_cmpxchg(&self->flags, &flags, flags | (1UL << KTHREAD_IS_PARKED))) {
+ complete(&self->parked);
+ schedule();
+ }
}
- clear_bit(KTHREAD_IS_PARKED, &self->flags);
__set_current_state(TASK_RUNNING);
}
@@ -454,6 +478,10 @@ void kthread_unpark(struct task_struct *k)
/*
* Newly created kthread was parked when the CPU was offline.
* The binding was lost and we need to set it again.
+ *
+ * Since we've observed IS_PARKED, the other thread will (have)
+ * called schedule() and thus __kthread_bind()'s
+ * wait_task_inactive() will succeed.
*/
if (test_bit(KTHREAD_IS_PER_CPU, &kthread->flags))
__kthread_bind(k, kthread->cpu, TASK_PARKED);
^ permalink raw reply related [flat|nested] 9+ messages in thread
* Re: [PATCH] kthread: Fix race condition between kthread_parkme() and kthread_unpark()
[not found] ` <CAL-GctE0BLVk=ps09dnir8pcpnfzg88jMK4CKkRVmY7-ho87nQ@mail.gmail.com>
@ 2017-09-29 7:59 ` Thomas Gleixner
2017-09-29 8:28 ` Peter Zijlstra
0 siblings, 1 reply; 9+ messages in thread
From: Thomas Gleixner @ 2017-09-29 7:59 UTC (permalink / raw)
To: Junaid Shahid
Cc: Peter Zijlstra, linux-kernel, Andres Lagar-Cavilla, Greg Thelen,
Rusty Russell
On Thu, 28 Sep 2017, Junaid Shahid wrote:
> Hi Peter,
>
> It looks like try_cmpxchg is not available on non-x86 archs, but other than
> that the version that you proposed looks good.
>
> One thing that I am a bit curious about is that the original code, before
> either patch, had a test_and_set_bit for KTHREAD_IS_PARKED rather than just
> a set_bit. I can't think of any reason why that was needed, since it
> doesn't look like TASK_PARKED tasks are susceptible to spurious wakeups. Do
> you by any chance happen to know if there was any specific reason for it?
Everything is susceptible to spurious wakeups and has to deal with it.
Thanks,
tglx
^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [PATCH] kthread: Fix race condition between kthread_parkme() and kthread_unpark()
2017-09-29 7:59 ` Thomas Gleixner
@ 2017-09-29 8:28 ` Peter Zijlstra
2017-09-29 20:58 ` Junaid Shahid
0 siblings, 1 reply; 9+ messages in thread
From: Peter Zijlstra @ 2017-09-29 8:28 UTC (permalink / raw)
To: Thomas Gleixner
Cc: Junaid Shahid, linux-kernel, Andres Lagar-Cavilla, Greg Thelen,
Rusty Russell
On Fri, Sep 29, 2017 at 09:59:55AM +0200, Thomas Gleixner wrote:
> On Thu, 28 Sep 2017, Junaid Shahid wrote:
>
> > Hi Peter,
> >
> > It looks like try_cmpxchg is not available on non-x86 archs, but other than
> > that the version that you proposed looks good.
> >
> > One thing that I am a bit curious about is that the original code, before
> > either patch, had a test_and_set_bit for KTHREAD_IS_PARKED rather than just
> > a set_bit. I can't think of any reason why that was needed, since it
> > doesn't look like TASK_PARKED tasks are susceptible to spurious wakeups. Do
> > you by any chance happen to know if there was any specific reason for it?
>
> Everything is susceptible to spurious wakeups and has to deal with it.
Right, we should code as if they are at all times possible. Currently,
for TASK_PARKED, I don't think they can happen, but I've had patches
that introduce them on purpose (regardless the state) just to stress the
code.
IIRC only TASK_STOPPED and/or TASK_TRACED hard rely on not getting any.
^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [PATCH] kthread: Fix race condition between kthread_parkme() and kthread_unpark()
2017-09-29 8:28 ` Peter Zijlstra
@ 2017-09-29 20:58 ` Junaid Shahid
0 siblings, 0 replies; 9+ messages in thread
From: Junaid Shahid @ 2017-09-29 20:58 UTC (permalink / raw)
To: Peter Zijlstra
Cc: Thomas Gleixner, linux-kernel, Andres Lagar-Cavilla, Greg Thelen,
Rusty Russell
Thanks for the clarification. But in that case, shouldn’t the patch check whether IS_PARKED was already set before calling complete(&self->parked)? Otherwise, the completion count for self->parked could be more than 1 as a result of spurious wakeups, which could make a future call to kthread_park complete prematurely.
Thanks,
Junaid
On Friday, September 29, 2017 10:28:38 AM Peter Zijlstra wrote:
> On Fri, Sep 29, 2017 at 09:59:55AM +0200, Thomas Gleixner wrote:
> > On Thu, 28 Sep 2017, Junaid Shahid wrote:
> >
> > > Hi Peter,
> > >
> > > It looks like try_cmpxchg is not available on non-x86 archs, but other than
> > > that the version that you proposed looks good.
> > >
> > > One thing that I am a bit curious about is that the original code, before
> > > either patch, had a test_and_set_bit for KTHREAD_IS_PARKED rather than just
> > > a set_bit. I can't think of any reason why that was needed, since it
> > > doesn't look like TASK_PARKED tasks are susceptible to spurious wakeups. Do
> > > you by any chance happen to know if there was any specific reason for it?
> >
> > Everything is susceptible to spurious wakeups and has to deal with it.
>
> Right, we should code as if they are at all times possible. Currently,
> for TASK_PARKED, I don't think they can happen, but I've had patches
> that introduce them on purpose (regardless the state) just to stress the
> code.
>
> IIRC only TASK_STOPPED and/or TASK_TRACED hard rely on not getting any.
^ permalink raw reply [flat|nested] 9+ messages in thread
end of thread, other threads:[~2017-09-29 20:58 UTC | newest]
Thread overview: 9+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2017-04-29 2:32 [PATCH] kthread: Fix race condition between kthread_parkme() and kthread_unpark() Junaid Shahid
2017-05-30 22:37 ` Junaid Shahid
2017-05-30 22:40 ` Andrew Morton
2017-07-18 0:13 ` Junaid Shahid
2017-07-18 6:35 ` Thomas Gleixner
2017-09-27 12:17 ` Peter Zijlstra
[not found] ` <CAL-GctE0BLVk=ps09dnir8pcpnfzg88jMK4CKkRVmY7-ho87nQ@mail.gmail.com>
2017-09-29 7:59 ` Thomas Gleixner
2017-09-29 8:28 ` Peter Zijlstra
2017-09-29 20:58 ` Junaid Shahid
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.