linux-kernel.vger.kernel.org archive mirror
* [PATCH] psi: get poll_work to run when calling poll syscall next time
@ 2019-07-23  6:45 Jason Xing
  2019-07-23 10:02 ` Caspar Zhang
                   ` (3 more replies)
  0 siblings, 4 replies; 8+ messages in thread
From: Jason Xing @ 2019-07-23  6:45 UTC (permalink / raw)
  To: hannes, surenb, mingo, peterz
  Cc: dennis, axboe, lizefan, tj, kerneljasonxing, linux-kernel

The user receives POLLPRI correctly only the first time the
poll syscall is used. After that, the user never receives the
event again.

To reproduce:
1. Get the monitor code in Documentation/accounting/psi.txt
2. Run it, and wait for the event to trigger.
3. Kill and restart the process.

As long as the user does not kill the monitor process, poll_work
keeps working. After the monitor is killed and restarted,
poll_work never runs again in the kernel because poll_scheduled
is left with a stale value. Therefore, reset the value, as
group_init() does, after the last trigger is destroyed.

Signed-off-by: Jason Xing <kerneljasonxing@linux.alibaba.com>
---
 kernel/sched/psi.c | 6 ++++++
 1 file changed, 6 insertions(+)

diff --git a/kernel/sched/psi.c b/kernel/sched/psi.c
index 7acc632..66f4385 100644
--- a/kernel/sched/psi.c
+++ b/kernel/sched/psi.c
@@ -1133,6 +1133,12 @@ static void psi_trigger_destroy(struct kref *ref)
 	if (kworker_to_destroy) {
 		kthread_cancel_delayed_work_sync(&group->poll_work);
 		kthread_destroy_worker(kworker_to_destroy);
+		/*
+		 * The poll_work should have the chance to be put into the
+		 * kthread queue when calling poll syscall next time. So
+		 * reset poll_scheduled to zero as group_init() does
+		 */
+		atomic_set(&group->poll_scheduled, 0);
 	}
 	kfree(t);
 }
-- 
1.8.3.1
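
The reproduction steps in the patch description rely on the monitor example from Documentation/accounting/psi.txt. As a rough sketch of what such a monitor does, the helpers below register a trigger on a PSI file and wait for POLLPRI. The helper names psi_open_trigger() and psi_wait_event() are hypothetical and exist only for illustration; the trigger string format ("some 150000 1000000", a 150ms threshold within a 1s window) follows the documented example.

```c
#include <fcntl.h>
#include <poll.h>
#include <string.h>
#include <unistd.h>

/*
 * Hypothetical helper: open a PSI file such as /proc/pressure/memory
 * and register a trigger on it. Returns the fd, or -1 on failure.
 */
int psi_open_trigger(const char *path, const char *trig)
{
	int fd = open(path, O_RDWR | O_NONBLOCK);

	if (fd < 0)
		return -1;
	/* The trigger is written as a NUL-terminated string. */
	if (write(fd, trig, strlen(trig) + 1) < 0) {
		close(fd);
		return -1;
	}
	return fd;
}

/*
 * Hypothetical helper: wait for one trigger event.
 * Returns 1 on POLLPRI, 0 on timeout, -1 on error or POLLERR.
 */
int psi_wait_event(int fd, int timeout_ms)
{
	struct pollfd fds = { .fd = fd, .events = POLLPRI };
	int n = poll(&fds, 1, timeout_ms);

	if (n < 0)
		return -1;
	if (n == 0)
		return 0;
	if (fds.revents & POLLERR)
		return -1;
	return (fds.revents & POLLPRI) ? 1 : -1;
}
```

A monitor would call psi_open_trigger("/proc/pressure/memory", "some 150000 1000000") once and then loop on psi_wait_event(fd, -1). Killing that process closes the fd; restarting it registers a new trigger, which is the step after which the event reportedly never arrives.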



* Re: [PATCH] psi: get poll_work to run when calling poll syscall next time
  2019-07-23  6:45 [PATCH] psi: get poll_work to run when calling poll syscall next time Jason Xing
@ 2019-07-23 10:02 ` Caspar Zhang
  2019-07-29  8:12 ` Jason Xing
                   ` (2 subsequent siblings)
  3 siblings, 0 replies; 8+ messages in thread
From: Caspar Zhang @ 2019-07-23 10:02 UTC (permalink / raw)
  To: Jason Xing
  Cc: hannes, surenb, mingo, peterz, dennis, axboe, lizefan, tj,
	linux-kernel, caspar, joseph.qi

On Tue, Jul 23, 2019 at 02:45:39PM +0800, Jason Xing wrote:
> Only when calling the poll syscall the first time can user
> receive POLLPRI correctly. After that, user always fails to
> acquire the event signal.
>
> Reproduce case:
> 1. Get the monitor code in Documentation/accounting/psi.txt
> 2. Run it, and wait for the event triggered.
> 3. Kill and restart the process.
>
> If the user doesn't kill the monitor process, it seems the
> poll_work works fine. After killing and restarting the monitor,
> the poll_work in kernel will never run again due to the wrong
> value of poll_scheduled. Therefore, we should reset the value
> as group_init() does after the last trigger is destroyed.
>
> Signed-off-by: Jason Xing <kerneljasonxing@linux.alibaba.com>

Reviewed-by: Caspar Zhang <caspar@linux.alibaba.com>

> ---
>  kernel/sched/psi.c | 6 ++++++
>  1 file changed, 6 insertions(+)
>
> diff --git a/kernel/sched/psi.c b/kernel/sched/psi.c
> index 7acc632..66f4385 100644
> --- a/kernel/sched/psi.c
> +++ b/kernel/sched/psi.c
> @@ -1133,6 +1133,12 @@ static void psi_trigger_destroy(struct kref *ref)
>  	if (kworker_to_destroy) {
>  		kthread_cancel_delayed_work_sync(&group->poll_work);
>  		kthread_destroy_worker(kworker_to_destroy);
> +		/*
> +		 * The poll_work should have the chance to be put into the
> +		 * kthread queue when calling poll syscall next time. So
> +		 * reset poll_scheduled to zero as group_init() does
> +		 */
> +		atomic_set(&group->poll_scheduled, 0);
>  	}
>  	kfree(t);
>  }
> --
> 1.8.3.1
>

--
        Thanks,
        Caspar


* Re: [PATCH] psi: get poll_work to run when calling poll syscall next time
  2019-07-23  6:45 [PATCH] psi: get poll_work to run when calling poll syscall next time Jason Xing
  2019-07-23 10:02 ` Caspar Zhang
@ 2019-07-29  8:12 ` Jason Xing
  2019-07-29 15:29 ` Johannes Weiner
  2019-07-30  5:16 ` [PATCH v2] " Jason Xing
  3 siblings, 0 replies; 8+ messages in thread
From: Jason Xing @ 2019-07-29  8:12 UTC (permalink / raw)
  To: hannes, surenb, mingo, peterz; +Cc: dennis, axboe, lizefan, tj, linux-kernel

Hello,

Could someone take a quick look at this patch? It is not complicated:
it adds just one line to PSI so that poll() works correctly after the
monitor is restarted.

Thanks,
Jason



* Re: [PATCH] psi: get poll_work to run when calling poll syscall next time
  2019-07-23  6:45 [PATCH] psi: get poll_work to run when calling poll syscall next time Jason Xing
  2019-07-23 10:02 ` Caspar Zhang
  2019-07-29  8:12 ` Jason Xing
@ 2019-07-29 15:29 ` Johannes Weiner
  2019-07-29 16:27   ` Suren Baghdasaryan
  2019-07-30  5:16 ` [PATCH v2] " Jason Xing
  3 siblings, 1 reply; 8+ messages in thread
From: Johannes Weiner @ 2019-07-29 15:29 UTC (permalink / raw)
  To: Jason Xing
  Cc: surenb, mingo, peterz, dennis, axboe, lizefan, tj, linux-kernel

Hi Jason,

On Tue, Jul 23, 2019 at 02:45:39PM +0800, Jason Xing wrote:
> Only when calling the poll syscall the first time can user
> receive POLLPRI correctly. After that, user always fails to
> acquire the event signal.
> 
> Reproduce case:
> 1. Get the monitor code in Documentation/accounting/psi.txt
> 2. Run it, and wait for the event triggered.
> 3. Kill and restart the process.
> 
> If the user doesn't kill the monitor process, it seems the
> poll_work works fine. After killing and restarting the monitor,
> the poll_work in kernel will never run again due to the wrong
> value of poll_scheduled. Therefore, we should reset the value
> as group_init() does after the last trigger is destroyed.
> 
> Signed-off-by: Jason Xing <kerneljasonxing@linux.alibaba.com>

Good catch, and the fix makes sense to me. However, it was a bit hard
to understand how the problem occurs:

> ---
>  kernel/sched/psi.c | 6 ++++++
>  1 file changed, 6 insertions(+)
> 
> diff --git a/kernel/sched/psi.c b/kernel/sched/psi.c
> index 7acc632..66f4385 100644
> --- a/kernel/sched/psi.c
> +++ b/kernel/sched/psi.c
> @@ -1133,6 +1133,12 @@ static void psi_trigger_destroy(struct kref *ref)
>  	if (kworker_to_destroy) {
>  		kthread_cancel_delayed_work_sync(&group->poll_work);
>  		kthread_destroy_worker(kworker_to_destroy);
> +		/*
> +		 * The poll_work should have the chance to be put into the
> +		 * kthread queue when calling poll syscall next time. So
> +		 * reset poll_scheduled to zero as group_init() does
> +		 */
> +		atomic_set(&group->poll_scheduled, 0);

The question is why we can end up with poll_scheduled = 1 but the work
not running (which would reset it to 0). And the answer is because the
scheduling side sees group->poll_kworker under RCU protection and then
schedules it, but here we cancel the work and destroy the worker. The
cancel needs to pair with resetting the poll_scheduled flag:

	if (kworker_to_destroy) {
		/*
		 * After the RCU grace period has expired, the worker
		 * can no longer be found through group->poll_kworker.
		 * But it might have been already scheduled before
		 * that - deschedule it cleanly before destroying it.
		 */
		kthread_cancel_delayed_work_sync(&group->poll_work);
		atomic_set(&group->poll_scheduled, 0);

		kthread_destroy_worker(kworker_to_destroy);
	}

With that change, please add:

Acked-by: Johannes Weiner <hannes@cmpxchg.org>

Thanks!
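
The stuck-flag scenario described above can be sketched with a small userspace model. This is not kernel code: struct model_group and the model_*() functions are hypothetical stand-ins, C11 atomics replace the kernel's atomic_t helpers, and a plain boolean stands in for the kthread work queue. The model only assumes what the explanation states: scheduling sets poll_scheduled to 1, running the work resets it to 0, and cancelling the work means it never runs.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Userspace stand-in for the relevant psi_group state (hypothetical). */
struct model_group {
	atomic_int poll_scheduled;	/* 1 while the work counts as queued */
	bool work_queued;		/* stand-in for the kthread work queue */
	int work_runs;			/* number of times the work ran */
};

/* Scheduling side: queue the work only if the flag flips from 0 to 1. */
bool model_schedule(struct model_group *g)
{
	int expected = 0;

	if (!atomic_compare_exchange_strong(&g->poll_scheduled, &expected, 1))
		return false;	/* believed to be queued already */
	g->work_queued = true;
	return true;
}

/* The work itself: running it clears the flag again. */
void model_run_work(struct model_group *g)
{
	if (!g->work_queued)
		return;
	g->work_queued = false;
	g->work_runs++;
	atomic_store(&g->poll_scheduled, 0);
}

/* Teardown: cancel pending work and optionally reset the flag too. */
void model_destroy(struct model_group *g, bool reset_flag)
{
	g->work_queued = false;	/* cancelled work never runs */
	if (reset_flag)
		atomic_store(&g->poll_scheduled, 0);
}
```

In this model, calling model_schedule(), then model_destroy(g, false), leaves poll_scheduled at 1, so every later model_schedule() returns false and the work is never queued again. With model_destroy(g, true), the cancel is paired with the reset and the next model_schedule() succeeds.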


* Re: [PATCH] psi: get poll_work to run when calling poll syscall next time
  2019-07-29 15:29 ` Johannes Weiner
@ 2019-07-29 16:27   ` Suren Baghdasaryan
  0 siblings, 0 replies; 8+ messages in thread
From: Suren Baghdasaryan @ 2019-07-29 16:27 UTC (permalink / raw)
  To: Johannes Weiner
  Cc: Jason Xing, Ingo Molnar, Peter Zijlstra, Dennis Zhou, axboe,
	lizefan, Tejun Heo, LKML

On Mon, Jul 29, 2019 at 8:30 AM Johannes Weiner <hannes@cmpxchg.org> wrote:
>
> Hi Jason,
>
> On Tue, Jul 23, 2019 at 02:45:39PM +0800, Jason Xing wrote:
> > Only when calling the poll syscall the first time can user
> > receive POLLPRI correctly. After that, user always fails to
> > acquire the event signal.
> >
> > Reproduce case:
> > 1. Get the monitor code in Documentation/accounting/psi.txt
> > 2. Run it, and wait for the event triggered.
> > 3. Kill and restart the process.
> >
> > If the user doesn't kill the monitor process, it seems the
> > poll_work works fine. After killing and restarting the monitor,
> > the poll_work in kernel will never run again due to the wrong
> > value of poll_scheduled. Therefore, we should reset the value
> > as group_init() does after the last trigger is destroyed.
> >
> > Signed-off-by: Jason Xing <kerneljasonxing@linux.alibaba.com>
>
> Good catch, and the fix makes sense to me. However, it was a bit hard
> to understand how the problem occurs:
>
> > ---
> >  kernel/sched/psi.c | 6 ++++++
> >  1 file changed, 6 insertions(+)
> >
> > diff --git a/kernel/sched/psi.c b/kernel/sched/psi.c
> > index 7acc632..66f4385 100644
> > --- a/kernel/sched/psi.c
> > +++ b/kernel/sched/psi.c
> > @@ -1133,6 +1133,12 @@ static void psi_trigger_destroy(struct kref *ref)
> >       if (kworker_to_destroy) {
> >               kthread_cancel_delayed_work_sync(&group->poll_work);
> >               kthread_destroy_worker(kworker_to_destroy);
> > +             /*
> > +              * The poll_work should have the chance to be put into the
> > +              * kthread queue when calling poll syscall next time. So
> > +              * reset poll_scheduled to zero as group_init() does
> > +              */
> > +             atomic_set(&group->poll_scheduled, 0);
>
> The question is why we can end up with poll_scheduled = 1 but the work
> not running (which would reset it to 0). And the answer is because the
> scheduling side sees group->poll_kworker under RCU protection and then
> schedules it, but here we cancel the work and destroy the worker. The
> cancel needs to pair with resetting the poll_scheduled flag:
>
>         if (kworker_to_destroy) {
>                 /*
>                  * After the RCU grace period has expired, the worker
>                  * can no longer be found through group->poll_kworker.
>                  * But it might have been already scheduled before
>                  * that - deschedule it cleanly before destroying it.
>                  */
>                 kthread_cancel_delayed_work_sync(&group->poll_work);
>                 atomic_set(&group->poll_scheduled, 0);
>
>                 kthread_destroy_worker(kworker_to_destroy);
>         }
>
> With that change, please add:
>
> Acked-by: Johannes Weiner <hannes@cmpxchg.org>
>
> Thanks!

The changes make sense to me as well. Thanks!

Reviewed-by: Suren Baghdasaryan <surenb@google.com>


* [PATCH v2] psi: get poll_work to run when calling poll syscall next time
  2019-07-23  6:45 [PATCH] psi: get poll_work to run when calling poll syscall next time Jason Xing
                   ` (2 preceding siblings ...)
  2019-07-29 15:29 ` Johannes Weiner
@ 2019-07-30  5:16 ` Jason Xing
  2019-08-02  6:20   ` Jason Xing
  2019-08-15  1:59   ` Jason Xing
  3 siblings, 2 replies; 8+ messages in thread
From: Jason Xing @ 2019-07-30  5:16 UTC (permalink / raw)
  To: hannes, surenb
  Cc: dennis, mingo, axboe, lizefan, peterz, tj, kerneljasonxing,
	linux-kernel, caspar, joseph.qi

The user receives POLLPRI correctly only the first time the
poll syscall is used. After that, the user never receives the
event again.

To reproduce:
1. Get the monitor code in Documentation/accounting/psi.txt
2. Run it, and wait for the event to trigger.
3. Kill and restart the process.

As long as the user does not kill the monitor process, poll_work
keeps working. After the monitor is killed and restarted,
poll_work never runs again in the kernel because poll_scheduled
is left with a stale value. Therefore, reset the value, as
group_init() does, after the last trigger is destroyed.

[PATCH V2]
In v2, atomic_set(&group->poll_scheduled, 0) is moved to the
right place, directly after the work is cancelled.
Johannes's explanation, quoted here, describes the problem best:
"The question is why we can end up with poll_scheduled = 1 but the work
not running (which would reset it to 0). And the answer is because the
scheduling side sees group->poll_kworker under RCU protection and then
schedules it, but here we cancel the work and destroy the worker. The
cancel needs to pair with resetting the poll_scheduled flag."

Signed-off-by: Jason Xing <kerneljasonxing@linux.alibaba.com>
Reviewed-by: Caspar Zhang <caspar@linux.alibaba.com>
Reviewed-by: Joseph Qi <joseph.qi@linux.alibaba.com>
Reviewed-by: Suren Baghdasaryan <surenb@google.com>
Acked-by: Johannes Weiner <hannes@cmpxchg.org>
---
 kernel/sched/psi.c | 7 +++++++
 1 file changed, 7 insertions(+)

diff --git a/kernel/sched/psi.c b/kernel/sched/psi.c
index 7acc632..acdada0 100644
--- a/kernel/sched/psi.c
+++ b/kernel/sched/psi.c
@@ -1131,7 +1131,14 @@ static void psi_trigger_destroy(struct kref *ref)
 	 * deadlock while waiting for psi_poll_work to acquire trigger_lock
 	 */
 	if (kworker_to_destroy) {
+		/*
+		 * After the RCU grace period has expired, the worker
+		 * can no longer be found through group->poll_kworker.
+		 * But it might have been already scheduled before
+		 * that - deschedule it cleanly before destroying it.
+		 */
 		kthread_cancel_delayed_work_sync(&group->poll_work);
+		atomic_set(&group->poll_scheduled, 0);
 		kthread_destroy_worker(kworker_to_destroy);
 	}
 	kfree(t);
-- 
1.8.3.1



* Re: [PATCH v2] psi: get poll_work to run when calling poll syscall next time
  2019-07-30  5:16 ` [PATCH v2] " Jason Xing
@ 2019-08-02  6:20   ` Jason Xing
  2019-08-15  1:59   ` Jason Xing
  1 sibling, 0 replies; 8+ messages in thread
From: Jason Xing @ 2019-08-02  6:20 UTC (permalink / raw)
  To: hannes, surenb
  Cc: dennis, mingo, axboe, lizefan, peterz, tj, linux-kernel, caspar,
	joseph.qi

Hi all,

Following the review from Johannes, I reworked the original patch and
submitted version 2 a few days ago.

Please let me know if all this sounds good, or if there are any issues.

Thanks,
Jason



* Re: [PATCH v2] psi: get poll_work to run when calling poll syscall next time
  2019-07-30  5:16 ` [PATCH v2] " Jason Xing
  2019-08-02  6:20   ` Jason Xing
@ 2019-08-15  1:59   ` Jason Xing
  1 sibling, 0 replies; 8+ messages in thread
From: Jason Xing @ 2019-08-15  1:59 UTC (permalink / raw)
  To: hannes, surenb
  Cc: dennis, mingo, axboe, lizefan, peterz, tj, linux-kernel, caspar,
	joseph.qi

Hello,

This patch has been pending for a couple of days without a response.
Any comments or suggestions on v2 would be appreciated.

Thanks,
Jason


