* [PATCH v3] psi: get poll_work to run when calling poll syscall next time
@ 2019-08-21 3:26 Joseph Qi
2019-08-22 22:21 ` Andrew Morton
0 siblings, 1 reply; 7+ messages in thread
From: Joseph Qi @ 2019-08-21 3:26 UTC (permalink / raw)
To: Ingo Molnar, Peter Zijlstra, linux-kernel
Cc: Johannes Weiner, Suren Baghdasaryan, Jason Xing, Caspar Zhang, Joseph Qi
From: Jason Xing <kerneljasonxing@linux.alibaba.com>
Only when calling the poll syscall the first time can user
receive POLLPRI correctly. After that, user always fails to
acquire the event signal.
Reproduce case:
1. Get the monitor code in Documentation/accounting/psi.txt
2. Run it, and wait for the event triggered.
3. Kill and restart the process.
The question is why we can end up with poll_scheduled = 1 but the work
not running (which would reset it to 0). And the answer is because the
scheduling side sees group->poll_kworker under RCU protection and then
schedules it, but here we cancel the work and destroy the worker. The
cancel needs to pair with resetting the poll_scheduled flag.
Signed-off-by: Jason Xing <kerneljasonxing@linux.alibaba.com>
Reviewed-by: Caspar Zhang <caspar@linux.alibaba.com>
Reviewed-by: Suren Baghdasaryan <surenb@google.com>
Acked-by: Johannes Weiner <hannes@cmpxchg.org>
Signed-off-by: Joseph Qi <joseph.qi@linux.alibaba.com>
---
v3: Change the description as Johannes Weiner suggested.
kernel/sched/psi.c | 8 ++++++++
1 file changed, 8 insertions(+)
diff --git a/kernel/sched/psi.c b/kernel/sched/psi.c
index 23fbbcc..6e52b67 100644
--- a/kernel/sched/psi.c
+++ b/kernel/sched/psi.c
@@ -1131,7 +1131,15 @@ static void psi_trigger_destroy(struct kref *ref)
* deadlock while waiting for psi_poll_work to acquire trigger_lock
*/
if (kworker_to_destroy) {
+ /*
+ * After the RCU grace period has expired, the worker
+ * can no longer be found through group->poll_kworker.
+ * But it might have been already scheduled before
+ * that - deschedule it cleanly before destroying it.
+ */
kthread_cancel_delayed_work_sync(&group->poll_work);
+ atomic_set(&group->poll_scheduled, 0);
+
kthread_destroy_worker(kworker_to_destroy);
}
kfree(t);
--
1.8.3.1
^ permalink raw reply related [flat|nested] 7+ messages in thread
* Re: [PATCH v3] psi: get poll_work to run when calling poll syscall next time
2019-08-21 3:26 [PATCH v3] psi: get poll_work to run when calling poll syscall next time Joseph Qi
@ 2019-08-22 22:21 ` Andrew Morton
2019-08-22 23:11 ` Suren Baghdasaryan
2019-08-23 0:53 ` Joseph Qi
0 siblings, 2 replies; 7+ messages in thread
From: Andrew Morton @ 2019-08-22 22:21 UTC (permalink / raw)
To: Joseph Qi
Cc: Ingo Molnar, Peter Zijlstra, linux-kernel, Johannes Weiner,
Suren Baghdasaryan, Jason Xing, Caspar Zhang
On Wed, 21 Aug 2019 11:26:25 +0800 Joseph Qi <joseph.qi@linux.alibaba.com> wrote:
> Only when calling the poll syscall the first time can user
> receive POLLPRI correctly. After that, user always fails to
> acquire the event signal.
>
> Reproduce case:
> 1. Get the monitor code in Documentation/accounting/psi.txt
> 2. Run it, and wait for the event triggered.
> 3. Kill and restart the process.
>
> The question is why we can end up with poll_scheduled = 1 but the work
> not running (which would reset it to 0). And the answer is because the
> scheduling side sees group->poll_kworker under RCU protection and then
> schedules it, but here we cancel the work and destroy the worker. The
> cancel needs to pair with resetting the poll_scheduled flag.
Should this be backported into -stable kernels?
^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: [PATCH v3] psi: get poll_work to run when calling poll syscall next time
2019-08-22 22:21 ` Andrew Morton
@ 2019-08-22 23:11 ` Suren Baghdasaryan
2019-08-22 23:17 ` Andrew Morton
2019-08-23 0:53 ` Joseph Qi
1 sibling, 1 reply; 7+ messages in thread
From: Suren Baghdasaryan @ 2019-08-22 23:11 UTC (permalink / raw)
To: Andrew Morton, Greg Kroah-Hartman
Cc: Joseph Qi, Ingo Molnar, Peter Zijlstra, LKML, Johannes Weiner,
Jason Xing, Caspar Zhang, stable
On Thu, Aug 22, 2019 at 3:21 PM Andrew Morton <akpm@linux-foundation.org> wrote:
>
> On Wed, 21 Aug 2019 11:26:25 +0800 Joseph Qi <joseph.qi@linux.alibaba.com> wrote:
>
> > Only when calling the poll syscall the first time can user
> > receive POLLPRI correctly. After that, user always fails to
> > acquire the event signal.
> >
> > Reproduce case:
> > 1. Get the monitor code in Documentation/accounting/psi.txt
> > 2. Run it, and wait for the event triggered.
> > 3. Kill and restart the process.
> >
> > The question is why we can end up with poll_scheduled = 1 but the work
> > not running (which would reset it to 0). And the answer is because the
> > scheduling side sees group->poll_kworker under RCU protection and then
> > schedules it, but here we cancel the work and destroy the worker. The
> > cancel needs to pair with resetting the poll_scheduled flag.
>
> Should this be backported into -stable kernels?
Adding GregKH and stable@vger.kernel.org
I was able to cleanly apply this patch to stable master and
linux-5.2.y branches (these are the only branches that have psi
triggers).
Greg, Andrew got this patch into -mm tree. Please advise on how we
should proceed to land it in stable 5.2.y and master.
Thanks,
Suren.
^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: [PATCH v3] psi: get poll_work to run when calling poll syscall next time
2019-08-22 23:11 ` Suren Baghdasaryan
@ 2019-08-22 23:17 ` Andrew Morton
2019-08-22 23:21 ` Suren Baghdasaryan
0 siblings, 1 reply; 7+ messages in thread
From: Andrew Morton @ 2019-08-22 23:17 UTC (permalink / raw)
To: Suren Baghdasaryan
Cc: Greg Kroah-Hartman, Joseph Qi, Ingo Molnar, Peter Zijlstra, LKML,
Johannes Weiner, Jason Xing, Caspar Zhang, stable
On Thu, 22 Aug 2019 16:11:15 -0700 Suren Baghdasaryan <surenb@google.com> wrote:
> On Thu, Aug 22, 2019 at 3:21 PM Andrew Morton <akpm@linux-foundation.org> wrote:
> >
> > On Wed, 21 Aug 2019 11:26:25 +0800 Joseph Qi <joseph.qi@linux.alibaba.com> wrote:
> >
> > > Only when calling the poll syscall the first time can user
> > > receive POLLPRI correctly. After that, user always fails to
> > > acquire the event signal.
> > >
> > > Reproduce case:
> > > 1. Get the monitor code in Documentation/accounting/psi.txt
> > > 2. Run it, and wait for the event triggered.
> > > 3. Kill and restart the process.
> > >
> > > The question is why we can end up with poll_scheduled = 1 but the work
> > > not running (which would reset it to 0). And the answer is because the
> > > scheduling side sees group->poll_kworker under RCU protection and then
> > > schedules it, but here we cancel the work and destroy the worker. The
> > > cancel needs to pair with resetting the poll_scheduled flag.
> >
> > Should this be backported into -stable kernels?
>
> Adding GregKH and stable@vger.kernel.org
>
> I was able to cleanly apply this patch to stable master and
> linux-5.2.y branches (these are the only branches that have psi
> triggers).
> Greg, Andrew got this patch into -mm tree. Please advise on how we
> should proceed to land it in stable 5.2.y and master.
That isn't the point - we know how to merge patches ;)
What I'm asking is whether it is desirable that -stable kernels have
this patch. It certainly sounds like it from the changelog, so I'm
wondering if the omission of cc:stable was intentional?
^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: [PATCH v3] psi: get poll_work to run when calling poll syscall next time
2019-08-22 23:17 ` Andrew Morton
@ 2019-08-22 23:21 ` Suren Baghdasaryan
0 siblings, 0 replies; 7+ messages in thread
From: Suren Baghdasaryan @ 2019-08-22 23:21 UTC (permalink / raw)
To: Andrew Morton
Cc: Greg Kroah-Hartman, Joseph Qi, Ingo Molnar, Peter Zijlstra, LKML,
Johannes Weiner, Jason Xing, Caspar Zhang, stable
On Thu, Aug 22, 2019 at 4:17 PM Andrew Morton <akpm@linux-foundation.org> wrote:
>
> On Thu, 22 Aug 2019 16:11:15 -0700 Suren Baghdasaryan <surenb@google.com> wrote:
>
> > On Thu, Aug 22, 2019 at 3:21 PM Andrew Morton <akpm@linux-foundation.org> wrote:
> > >
> > > On Wed, 21 Aug 2019 11:26:25 +0800 Joseph Qi <joseph.qi@linux.alibaba.com> wrote:
> > >
> > > > Only when calling the poll syscall the first time can user
> > > > receive POLLPRI correctly. After that, user always fails to
> > > > acquire the event signal.
> > > >
> > > > Reproduce case:
> > > > 1. Get the monitor code in Documentation/accounting/psi.txt
> > > > 2. Run it, and wait for the event triggered.
> > > > 3. Kill and restart the process.
> > > >
> > > > The question is why we can end up with poll_scheduled = 1 but the work
> > > > not running (which would reset it to 0). And the answer is because the
> > > > scheduling side sees group->poll_kworker under RCU protection and then
> > > > schedules it, but here we cancel the work and destroy the worker. The
> > > > cancel needs to pair with resetting the poll_scheduled flag.
> > >
> > > Should this be backported into -stable kernels?
> >
> > Adding GregKH and stable@vger.kernel.org
> >
> > I was able to cleanly apply this patch to stable master and
> > linux-5.2.y branches (these are the only branches that have psi
> > triggers).
> > Greg, Andrew got this patch into -mm tree. Please advise on how we
> > should proceed to land it in stable 5.2.y and master.
>
> That isn't the point - we know how to merge patches ;)
>
> What I'm asking is whether it is desirable that -stable kernels have
> this patch. It certainly sounds like it from the changelog, so I'm
> wondering if the omission of cc:stable was intentional?
Sorry for my misunderstanding. I believe cc:stable omission was
unintentional. It's a fix for a bug which exists in stable branches I
mentioned above.
Thanks!
^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: [PATCH v3] psi: get poll_work to run when calling poll syscall next time
2019-08-22 22:21 ` Andrew Morton
2019-08-22 23:11 ` Suren Baghdasaryan
@ 2019-08-23 0:53 ` Joseph Qi
2019-08-24 20:51 ` Andrew Morton
1 sibling, 1 reply; 7+ messages in thread
From: Joseph Qi @ 2019-08-23 0:53 UTC (permalink / raw)
To: Andrew Morton
Cc: Ingo Molnar, Peter Zijlstra, linux-kernel, Johannes Weiner,
Suren Baghdasaryan, Jason Xing, Caspar Zhang
On 19/8/23 06:21, Andrew Morton wrote:
> On Wed, 21 Aug 2019 11:26:25 +0800 Joseph Qi <joseph.qi@linux.alibaba.com> wrote:
>
>> Only when calling the poll syscall the first time can user
>> receive POLLPRI correctly. After that, user always fails to
>> acquire the event signal.
>>
>> Reproduce case:
>> 1. Get the monitor code in Documentation/accounting/psi.txt
>> 2. Run it, and wait for the event triggered.
>> 3. Kill and restart the process.
>>
>> The question is why we can end up with poll_scheduled = 1 but the work
>> not running (which would reset it to 0). And the answer is because the
>> scheduling side sees group->poll_kworker under RCU protection and then
>> schedules it, but here we cancel the work and destroy the worker. The
>> cancel needs to pair with resetting the poll_scheduled flag.
>
> Should this be backported into -stable kernels?
>
Sorry for missing that, should I resend it with cc stable tag?
Thanks,
Joseph
^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: [PATCH v3] psi: get poll_work to run when calling poll syscall next time
2019-08-23 0:53 ` Joseph Qi
@ 2019-08-24 20:51 ` Andrew Morton
0 siblings, 0 replies; 7+ messages in thread
From: Andrew Morton @ 2019-08-24 20:51 UTC (permalink / raw)
To: Joseph Qi
Cc: Ingo Molnar, Peter Zijlstra, linux-kernel, Johannes Weiner,
Suren Baghdasaryan, Jason Xing, Caspar Zhang
On Fri, 23 Aug 2019 08:53:09 +0800 Joseph Qi <joseph.qi@linux.alibaba.com> wrote:
> > Should this be backported into -stable kernels?
> >
> Sorry for missing that, should I resend it with cc stable tag?
I added cc:stable to this patch.
^ permalink raw reply [flat|nested] 7+ messages in thread
end of thread, other threads:[~2019-08-24 20:51 UTC | newest]
Thread overview: 7+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2019-08-21 3:26 [PATCH v3] psi: get poll_work to run when calling poll syscall next time Joseph Qi
2019-08-22 22:21 ` Andrew Morton
2019-08-22 23:11 ` Suren Baghdasaryan
2019-08-22 23:17 ` Andrew Morton
2019-08-22 23:21 ` Suren Baghdasaryan
2019-08-23 0:53 ` Joseph Qi
2019-08-24 20:51 ` Andrew Morton
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).