All of lore.kernel.org
 help / color / mirror / Atom feed
From: Johannes Weiner <hannes@cmpxchg.org>
To: Peter Zijlstra <peterz@infradead.org>
Cc: Suren Baghdasaryan <surenb@google.com>,
	mingo@redhat.com, juri.lelli@redhat.com,
	vincent.guittot@linaro.org, dietmar.eggemann@arm.com,
	rostedt@goodmis.org, bsegall@google.com, mgorman@suse.de,
	bristot@redhat.com, matthias.bgg@gmail.com, minchan@google.com,
	timmurray@google.com, yt.chang@mediatek.com,
	wenju.xu@mediatek.com, jonathan.jmchen@mediatek.com,
	linux-kernel@vger.kernel.org,
	linux-arm-kernel@lists.infradead.org,
	linux-mediatek@lists.infradead.org, kernel-team@android.com
Subject: Re: [PATCH 1/1] psi: stop relying on timer_pending for poll_work rescheduling
Date: Tue, 22 Jun 2021 11:07:51 -0400	[thread overview]
Message-ID: <YNH8x9HRyfvF53Pl@cmpxchg.org> (raw)
In-Reply-To: <YNHr81D/fPD2Q8kM@hirez.programming.kicks-ass.net>

On Tue, Jun 22, 2021 at 03:56:03PM +0200, Peter Zijlstra wrote:
> On Thu, Jun 17, 2021 at 02:26:54PM -0700, Suren Baghdasaryan wrote:
> > Fixes: 461daba06bdc ("psi: eliminate kthread_worker from psi trigger scheduling mechanism")
> > Reported-by: Kathleen Chang <yt.chang@mediatek.com>
> > Reported-by: Wenju Xu <wenju.xu@mediatek.com>
> > Reported-by: Jonathan Chen <jonathan.jmchen@mediatek.com>
> > Signed-off-by: Suren Baghdasaryan <surenb@google.com>
> 
> Johannes?

Looks generally good to me, I'd just want to get to the bottom of the
memory ordering before acking...

> > -/* Schedule polling if it's not already scheduled. */
> > -static void psi_schedule_poll_work(struct psi_group *group, unsigned long delay)
> > +/* Schedule polling if it's not already scheduled or forced. */
> > +static void psi_schedule_poll_work(struct psi_group *group, unsigned long delay,
> > +				   bool force)
> >  {
> >  	struct task_struct *task;
> >  
> > -	/*
> > -	 * Do not reschedule if already scheduled.
> > -	 * Possible race with a timer scheduled after this check but before
> > -	 * mod_timer below can be tolerated because group->polling_next_update
> > -	 * will keep updates on schedule.
> > -	 */
> > -	if (timer_pending(&group->poll_timer))
> > +	/* cmpxchg should be called even when !force to set poll_scheduled */
> > +	if (atomic_cmpxchg(&group->poll_scheduled, 0, 1) != 0 && !force)
> 
> Do you care about memory ordering here? Afaict the whole thing is
> supposed to be ordered by ->trigger_lock, so you don't.

It's not always held when we get here.

The worker holds it when it reschedules itself, to serialize against
userspace destroying the trigger itself. But the scheduler doesn't
hold it when it kicks the worker on an actionable task change.

That said, I think the ordering we care about there is that when the
scheduler side sees the worker still queued, the worker must see the
scheduler's updates to the percpu states and process them correctly.
But that should be ensured already by the ordering implied by the
seqcount sections around both the writer and the reader side.

Is there another possible race that I'm missing?

WARNING: multiple messages have this Message-ID (diff)
From: Johannes Weiner <hannes@cmpxchg.org>
To: Peter Zijlstra <peterz@infradead.org>
Cc: Suren Baghdasaryan <surenb@google.com>,
	mingo@redhat.com, juri.lelli@redhat.com,
	vincent.guittot@linaro.org, dietmar.eggemann@arm.com,
	rostedt@goodmis.org, bsegall@google.com, mgorman@suse.de,
	bristot@redhat.com, matthias.bgg@gmail.com, minchan@google.com,
	timmurray@google.com, yt.chang@mediatek.com,
	wenju.xu@mediatek.com, jonathan.jmchen@mediatek.com,
	linux-kernel@vger.kernel.org,
	linux-arm-kernel@lists.infradead.org,
	linux-mediatek@lists.infradead.org, kernel-team@android.com
Subject: Re: [PATCH 1/1] psi: stop relying on timer_pending for poll_work rescheduling
Date: Tue, 22 Jun 2021 11:07:51 -0400	[thread overview]
Message-ID: <YNH8x9HRyfvF53Pl@cmpxchg.org> (raw)
In-Reply-To: <YNHr81D/fPD2Q8kM@hirez.programming.kicks-ass.net>

On Tue, Jun 22, 2021 at 03:56:03PM +0200, Peter Zijlstra wrote:
> On Thu, Jun 17, 2021 at 02:26:54PM -0700, Suren Baghdasaryan wrote:
> > Fixes: 461daba06bdc ("psi: eliminate kthread_worker from psi trigger scheduling mechanism")
> > Reported-by: Kathleen Chang <yt.chang@mediatek.com>
> > Reported-by: Wenju Xu <wenju.xu@mediatek.com>
> > Reported-by: Jonathan Chen <jonathan.jmchen@mediatek.com>
> > Signed-off-by: Suren Baghdasaryan <surenb@google.com>
> 
> Johannes?

Looks generally good to me, I'd just want to get to the bottom of the
memory ordering before acking...

> > -/* Schedule polling if it's not already scheduled. */
> > -static void psi_schedule_poll_work(struct psi_group *group, unsigned long delay)
> > +/* Schedule polling if it's not already scheduled or forced. */
> > +static void psi_schedule_poll_work(struct psi_group *group, unsigned long delay,
> > +				   bool force)
> >  {
> >  	struct task_struct *task;
> >  
> > -	/*
> > -	 * Do not reschedule if already scheduled.
> > -	 * Possible race with a timer scheduled after this check but before
> > -	 * mod_timer below can be tolerated because group->polling_next_update
> > -	 * will keep updates on schedule.
> > -	 */
> > -	if (timer_pending(&group->poll_timer))
> > +	/* cmpxchg should be called even when !force to set poll_scheduled */
> > +	if (atomic_cmpxchg(&group->poll_scheduled, 0, 1) != 0 && !force)
> 
> Do you care about memory ordering here? Afaict the whole thing is
> supposed to be ordered by ->trigger_lock, so you don't.

It's not always held when we get here.

The worker holds it when it reschedules itself, to serialize against
userspace destroying the trigger itself. But the scheduler doesn't
hold it when it kicks the worker on an actionable task change.

That said, I think the ordering we care about there is that when the
scheduler side sees the worker still queued, the worker must see the
scheduler's updates to the percpu states and process them correctly.
But that should be ensured already by the ordering implied by the
seqcount sections around both the writer and the reader side.

Is there another possible race that I'm missing?

_______________________________________________
Linux-mediatek mailing list
Linux-mediatek@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-mediatek

WARNING: multiple messages have this Message-ID (diff)
From: Johannes Weiner <hannes@cmpxchg.org>
To: Peter Zijlstra <peterz@infradead.org>
Cc: Suren Baghdasaryan <surenb@google.com>,
	mingo@redhat.com, juri.lelli@redhat.com,
	vincent.guittot@linaro.org, dietmar.eggemann@arm.com,
	rostedt@goodmis.org, bsegall@google.com, mgorman@suse.de,
	bristot@redhat.com, matthias.bgg@gmail.com, minchan@google.com,
	timmurray@google.com, yt.chang@mediatek.com,
	wenju.xu@mediatek.com, jonathan.jmchen@mediatek.com,
	linux-kernel@vger.kernel.org,
	linux-arm-kernel@lists.infradead.org,
	linux-mediatek@lists.infradead.org, kernel-team@android.com
Subject: Re: [PATCH 1/1] psi: stop relying on timer_pending for poll_work rescheduling
Date: Tue, 22 Jun 2021 11:07:51 -0400	[thread overview]
Message-ID: <YNH8x9HRyfvF53Pl@cmpxchg.org> (raw)
In-Reply-To: <YNHr81D/fPD2Q8kM@hirez.programming.kicks-ass.net>

On Tue, Jun 22, 2021 at 03:56:03PM +0200, Peter Zijlstra wrote:
> On Thu, Jun 17, 2021 at 02:26:54PM -0700, Suren Baghdasaryan wrote:
> > Fixes: 461daba06bdc ("psi: eliminate kthread_worker from psi trigger scheduling mechanism")
> > Reported-by: Kathleen Chang <yt.chang@mediatek.com>
> > Reported-by: Wenju Xu <wenju.xu@mediatek.com>
> > Reported-by: Jonathan Chen <jonathan.jmchen@mediatek.com>
> > Signed-off-by: Suren Baghdasaryan <surenb@google.com>
> 
> Johannes?

Looks generally good to me, I'd just want to get to the bottom of the
memory ordering before acking...

> > -/* Schedule polling if it's not already scheduled. */
> > -static void psi_schedule_poll_work(struct psi_group *group, unsigned long delay)
> > +/* Schedule polling if it's not already scheduled or forced. */
> > +static void psi_schedule_poll_work(struct psi_group *group, unsigned long delay,
> > +				   bool force)
> >  {
> >  	struct task_struct *task;
> >  
> > -	/*
> > -	 * Do not reschedule if already scheduled.
> > -	 * Possible race with a timer scheduled after this check but before
> > -	 * mod_timer below can be tolerated because group->polling_next_update
> > -	 * will keep updates on schedule.
> > -	 */
> > -	if (timer_pending(&group->poll_timer))
> > +	/* cmpxchg should be called even when !force to set poll_scheduled */
> > +	if (atomic_cmpxchg(&group->poll_scheduled, 0, 1) != 0 && !force)
> 
> Do you care about memory ordering here? Afaict the whole thing is
> supposed to be ordered by ->trigger_lock, so you don't.

It's not always held when we get here.

The worker holds it when it reschedules itself, to serialize against
userspace destroying the trigger itself. But the scheduler doesn't
hold it when it kicks the worker on an actionable task change.

That said, I think the ordering we care about there is that when the
scheduler side sees the worker still queued, the worker must see the
scheduler's updates to the percpu states and process them correctly.
But that should be ensured already by the ordering implied by the
seqcount sections around both the writer and the reader side.

Is there another possible race that I'm missing?

_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

  reply	other threads:[~2021-06-22 15:08 UTC|newest]

Thread overview: 18+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-06-17 21:26 [PATCH 1/1] psi: stop relying on timer_pending for poll_work rescheduling Suren Baghdasaryan
2021-06-17 21:26 ` Suren Baghdasaryan
2021-06-17 21:26 ` Suren Baghdasaryan
2021-06-21 12:30 ` YT Chang
2021-06-21 12:30   ` YT Chang
2021-06-21 12:30   ` YT Chang
2021-06-21 17:49   ` Suren Baghdasaryan
2021-06-21 17:49     ` Suren Baghdasaryan
2021-06-21 17:49     ` Suren Baghdasaryan
2021-06-22 13:56 ` Peter Zijlstra
2021-06-22 13:56   ` Peter Zijlstra
2021-06-22 13:56   ` Peter Zijlstra
2021-06-22 15:07   ` Johannes Weiner [this message]
2021-06-22 15:07     ` Johannes Weiner
2021-06-22 15:07     ` Johannes Weiner
2021-06-22 15:48     ` Suren Baghdasaryan
2021-06-22 15:48       ` Suren Baghdasaryan
2021-06-22 15:48       ` Suren Baghdasaryan

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=YNH8x9HRyfvF53Pl@cmpxchg.org \
    --to=hannes@cmpxchg.org \
    --cc=bristot@redhat.com \
    --cc=bsegall@google.com \
    --cc=dietmar.eggemann@arm.com \
    --cc=jonathan.jmchen@mediatek.com \
    --cc=juri.lelli@redhat.com \
    --cc=kernel-team@android.com \
    --cc=linux-arm-kernel@lists.infradead.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-mediatek@lists.infradead.org \
    --cc=matthias.bgg@gmail.com \
    --cc=mgorman@suse.de \
    --cc=minchan@google.com \
    --cc=mingo@redhat.com \
    --cc=peterz@infradead.org \
    --cc=rostedt@goodmis.org \
    --cc=surenb@google.com \
    --cc=timmurray@google.com \
    --cc=vincent.guittot@linaro.org \
    --cc=wenju.xu@mediatek.com \
    --cc=yt.chang@mediatek.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.