From: Tianchen Ding <dtcccc@linux.alibaba.com>
To: Peter Zijlstra <peterz@infradead.org>, Chen Yu <yu.c.chen@intel.com>
Cc: Ingo Molnar <mingo@redhat.com>,
	Juri Lelli <juri.lelli@redhat.com>,
	Vincent Guittot <vincent.guittot@linaro.org>,
	Dietmar Eggemann <dietmar.eggemann@arm.com>,
	Steven Rostedt <rostedt@goodmis.org>,
	Ben Segall <bsegall@google.com>, Mel Gorman <mgorman@suse.de>,
	Daniel Bristot de Oliveira <bristot@redhat.com>,
	Valentin Schneider <vschneid@redhat.com>,
	linux-kernel@vger.kernel.org
Subject: Re: [PATCH] sched: Clear ttwu_pending after enqueue_task
Date: Wed, 2 Nov 2022 14:40:36 +0800
Message-ID: <9ed75cad-3718-356f-21ca-1b8ec601f335@linux.alibaba.com>
In-Reply-To: <Y2E0TeFJorjOXikX@hirez.programming.kicks-ass.net>

On 2022/11/1 22:59, Peter Zijlstra wrote:
> On Tue, Nov 01, 2022 at 09:51:25PM +0800, Chen Yu wrote:
> 
>>> Could you try the below instead? Also note the comment; since you did
>>> the work to figure out why -- best record that for posterity.
>>>
>>> @@ -3737,6 +3730,13 @@ void sched_ttwu_pending(void *arg)
>>>   			set_task_cpu(p, cpu_of(rq));
>>>   
>>>   		ttwu_do_activate(rq, p, p->sched_remote_wakeup ? WF_MIGRATED : 0, &rf);
>>> +		/*
>>> +		 * Must be after enqueueing at least once task such that
>>> +		 * idle_cpu() does not observe a false-negative -- if it does,
>>> +		 * it is possible for select_idle_siblings() to stack a number
>>> +		 * of tasks on this CPU during that window.
>>> +		 */
>>> +		WRITE_ONCE(rq->ttwu_pending, 0);
>> Just curious why do we put above code inside llist_for_each_entry_safe loop?
> 
>> My understanding is that once 1 task is queued, select_idle_cpu() would not
>> treat this rq as idle anymore because nr_running is not 0. But would this bring
>> overhead to write the rq->ttwu_pending multiple times, do I miss something?
> 
> So the consideration is that by clearing it late, you might also clear a
> next set; consider something like:
> 
> 
> 	cpu0			cpu1			cpu2
> 
> 	ttwu_queue()
> 	  ->ttwu_pending = 1;
> 	  llist_add()
> 
> 				sched_ttwu_pending()
> 				  llist_del_all()
> 				  ... long ...
> 							ttwu_queue()
> 							  ->ttwu_pending = 1
> 							  llist_add()
> 
> 				  ... time ...
> 				  ->ttwu_pending = 0
> 
> Which leaves you with a non-empty list but with ttwu_pending == 0.
> 
> But I suppose that's not actually better with my variant, since it keeps
> writing 0s. We can make it more complicated again, but perhaps it
> doesn't matter and your version is good enough.
> 

Yeah. Since your version repeatedly writes 0 to ttwu_pending, it ends up with
the same effect as mine. Although my tests show no difference in performance,
it may still add some overhead.

IMO, in the latest linux-next code, all callers that query rq->ttwu_pending
only care about whether the cpu is idle, because they always check it together
with nr_running. No one actually cares whether wake_entry.llist is empty. So
for the purpose of checking cpu idle state, moving rq->ttwu_pending = 0 to
after the task is enqueued covers the whole window.
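
To make the ordering argument concrete, here is a minimal userspace sketch,
not kernel code. struct model_rq, model_cpu_is_idle() and the two observe_*()
helpers are hypothetical names used only for illustration, and the idle check
is simplified to the two fields discussed here (the real idle_cpu() also looks
at the current task).

```c
#include <stdbool.h>

/* Hypothetical userspace stand-in for the two rq fields discussed here. */
struct model_rq {
	unsigned int nr_running;	/* tasks already enqueued */
	int ttwu_pending;		/* remote wakeups queued, maybe not enqueued yet */
};

/* Simplified idle check: idle only if both fields are zero. */
static bool model_cpu_is_idle(const struct model_rq *rq)
{
	return rq->nr_running == 0 && rq->ttwu_pending == 0;
}

/*
 * Old ordering: clear the flag, then enqueue.
 * Returns what a lockless observer sees between the two steps.
 */
static bool observe_clear_before_enqueue(struct model_rq *rq)
{
	bool seen_idle;

	rq->ttwu_pending = 0;
	seen_idle = model_cpu_is_idle(rq);	/* both fields are 0 here */
	rq->nr_running++;
	return seen_idle;
}

/*
 * New ordering: enqueue, then clear the flag.
 * The observer between the two steps already sees nr_running > 0.
 */
static bool observe_clear_after_enqueue(struct model_rq *rq)
{
	bool seen_idle;

	rq->nr_running++;
	seen_idle = model_cpu_is_idle(rq);
	rq->ttwu_pending = 0;
	return seen_idle;
}
```

Starting from nr_running == 0 and ttwu_pending == 1, the first helper reports
the cpu as idle in the middle of the wakeup, while the second one never does.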

In your case, although ttwu_pending is set to 0 while some tasks are still
pending, nr_running is sure to be > 0 at that time, so callers that query both
ttwu_pending and nr_running will know this cpu is not idle.
(The callers querying these two values are currently lockless, so there may be
a race in a really small window? But this case is extremely rare, and I think
we should not make it more complicated.)
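
As a rough illustration of that interleaving, the sketch below replays the
cpu0/cpu1/cpu2 sequence single-threaded in userspace. It is not kernel code:
struct replay_rq and all replay_*() helpers are hypothetical, the llist is
reduced to a counter, and the flag is assumed to be cleared once after the
batch has been enqueued.

```c
#include <stdbool.h>

/* Hypothetical model: the wake list is reduced to a counter. */
struct replay_rq {
	unsigned int nr_running;	/* tasks already enqueued */
	unsigned int queued;		/* stand-in for entries on wake_entry.llist */
	int ttwu_pending;
};

/* Waker side: set the flag, then add an entry to the list. */
static void replay_ttwu_queue(struct replay_rq *rq)
{
	rq->ttwu_pending = 1;
	rq->queued++;
}

/* Target side: take everything currently on the list. */
static unsigned int replay_llist_del_all(struct replay_rq *rq)
{
	unsigned int batch = rq->queued;

	rq->queued = 0;
	return batch;
}

/* Target side: enqueue the batch first, clear the flag afterwards. */
static void replay_enqueue_then_clear(struct replay_rq *rq, unsigned int batch)
{
	rq->nr_running += batch;
	rq->ttwu_pending = 0;
}

/* Same simplified idle check: both fields must be zero. */
static bool replay_looks_idle(const struct replay_rq *rq)
{
	return rq->nr_running == 0 && rq->ttwu_pending == 0;
}

/* Replay the quoted interleaving in program order. */
static void replay_scenario(struct replay_rq *rq)
{
	unsigned int batch;

	replay_ttwu_queue(rq);			/* cpu0 */
	batch = replay_llist_del_all(rq);	/* cpu1 */
	replay_ttwu_queue(rq);			/* cpu2, during the long window */
	replay_enqueue_then_clear(rq, batch);	/* cpu1 */
}
```

After the replay the list is non-empty while ttwu_pending is 0, but nr_running
is 1, so the combined check still reports the cpu as busy.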

> But please update with a comment on why it needs to be after
> ttwu_do_activate().

OK. Should I send a v2, or will you add the comment directly?

Thanks.


Thread overview: 11+ messages
2022-11-01  7:36 [PATCH] sched: Clear ttwu_pending after enqueue_task Tianchen Ding
2022-11-01 10:34 ` Peter Zijlstra
2022-11-01 13:51   ` Chen Yu
2022-11-01 14:59     ` Peter Zijlstra
2022-11-02  3:01       ` Chen Yu
2022-11-02  6:40       ` Tianchen Ding [this message]
2022-11-02  6:40   ` Tianchen Ding
2022-11-04  2:36 ` [PATCH v2] " Tianchen Ding
2022-11-04  8:00   ` Chen Yu
2022-11-14 15:27   ` Mel Gorman
2022-11-16  9:22   ` [tip: sched/core] sched: Clear ttwu_pending after enqueue_task() tip-bot2 for Tianchen Ding
