From: Tim Chen <tim.c.chen@linux.intel.com>
To: Peter Zijlstra <peterz@infradead.org>,
Vincent Guittot <vincent.guittot@linaro.org>
Cc: mingo@redhat.com, juri.lelli@redhat.com,
dietmar.eggemann@arm.com, rostedt@goodmis.org,
bsegall@google.com, mgorman@suse.de, bristot@redhat.com,
linux-kernel@vger.kernel.org, joel@joelfernandes.org
Subject: Re: [PATCH 2/2] sched: sched: Fix rq->next_balance time updated to earlier than current time
Date: Mon, 15 Nov 2021 11:40:03 -0800
Message-ID: <272f4cee-251a-b107-78fe-9a38b33b1084@linux.intel.com>
In-Reply-To: <YY6GfilrilzTmhZx@hirez.programming.kicks-ass.net>

On 11/12/21 7:21 AM, Peter Zijlstra wrote:
> On Fri, Nov 12, 2021 at 11:04:58AM +0100, Vincent Guittot wrote:
>> From: Tim Chen <tim.c.chen@linux.intel.com>
>>
>> In traces on newidle_balance(), this_rq->next_balance
>> time goes backward and earlier than current time jiffies, e.g.
>>
>> 11.602 ( ): probe:newidle_balance:(ffffffff810d2470) this_rq=0xffff88fe7f8aae00 next_balance=0x1004fb76c jiffies=0x1004fb739
>> 11.624 ( ): probe:newidle_balance:(ffffffff810d2470) this_rq=0xffff88fe7f8aae00 next_balance=0x1004fb731 jiffies=0x1004fb739
>> 13.856 ( ): probe:newidle_balance:(ffffffff810d2470) this_rq=0xffff88fe7f8aae00 next_balance=0x1004fb76c jiffies=0x1004fb73b
>> 13.910 ( ): probe:newidle_balance:(ffffffff810d2470) this_rq=0xffff88fe7f8aae00 next_balance=0x1004fb731 jiffies=0x1004fb73b
>> 14.637 ( ): probe:newidle_balance:(ffffffff810d2470) this_rq=0xffff88fe7f8aae00 next_balance=0x1004fb76c jiffies=0x1004fb73c
>> 14.666 ( ): probe:newidle_balance:(ffffffff810d2470) this_rq=0xffff88fe7f8aae00 next_balance=0x1004fb731 jiffies=0x1004fb73c
>
> No explanation of what these numbers mean, or where they're taken from.

Sorry, I should have added more explanation. I put a probe on newidle_balance()
and dumped the values of the this_rq pointer, this_rq->next_balance and jiffies
on entry to newidle_balance(), using the following commands:

	perf probe 'newidle_balance this_rq this_rq->next_balance jiffies'
	perf trace -e probe:newidle_balance

In the first line of the trace, next_balance starts off at 0x1004fb76c:

	11.602 ( ): probe:newidle_balance:(ffffffff810d2470) this_rq=0xffff88fe7f8aae00 next_balance=0x1004fb76c jiffies=0x1004fb739

In the second line, next_balance goes backward to 0x1004fb731, which is
8 jiffies earlier than the jiffies value 0x1004fb739:

	11.624 ( ): probe:newidle_balance:(ffffffff810d2470) this_rq=0xffff88fe7f8aae00 next_balance=0x1004fb731 jiffies=0x1004fb739

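To make the comparison concrete, here is a minimal userspace sketch (not
kernel code) of the wrap-safe check applied to the trace values. The helper
names after64() and jiffies_until() are made up for this illustration, and
64-bit counters are assumed because the traced values do not fit in 32 bits.

```c
#include <stdint.h>

/*
 * Userspace stand-in for the kernel's time_after(a, b): true when a is
 * later than b. Taking the difference as a signed value keeps the result
 * correct across counter wraparound. Hypothetical helper, 64-bit assumed.
 */
static int after64(uint64_t a, uint64_t b)
{
	return (int64_t)(b - a) < 0;
}

/*
 * Signed distance in jiffies from "now" to next_balance.
 * A negative result means next_balance lies in the past.
 */
static int64_t jiffies_until(uint64_t next_balance, uint64_t now)
{
	return (int64_t)(next_balance - now);
}
```

With jiffies=0x1004fb739, jiffies_until() gives 51 for next_balance=0x1004fb76c
and -8 for next_balance=0x1004fb731, and after64(jiffies, next_balance) is
true only for the second value.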
>
>> It doesn't make sense to have a next_balance in the past.
>> Fix newidle_balance() and update_next_balance() so the next
>> balance time is at least jiffies+1.
>
> The changelog is deficient in that it doesn't explain how the times end
> up in the past, therefore we cannot evaluate if the provided solution is
> sufficient etc..
>
>> Signed-off-by: Tim Chen <tim.c.chen@linux.intel.com>
>> Signed-off-by: Vincent Guittot <vincent.guittot@linaro.org>
>> ---
>> kernel/sched/fair.c | 7 ++++++-
>> 1 file changed, 6 insertions(+), 1 deletion(-)
>>
>> diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
>> index a162b0ec8963..1050037578a9 100644
>> --- a/kernel/sched/fair.c
>> +++ b/kernel/sched/fair.c
>> @@ -10138,7 +10138,10 @@ update_next_balance(struct sched_domain *sd, unsigned long *next_balance)
>>
>> /* used by idle balance, so cpu_busy = 0 */
>> interval = get_sd_balance_interval(sd, 0);
>> - next = sd->last_balance + interval;
>> + if (time_after(jiffies+1, sd->last_balance + interval))
>> + next = jiffies+1;
>> + else
>> + next = sd->last_balance + interval;
>>
>> if (time_after(*next_balance, next))
>> *next_balance = next;
>> @@ -10974,6 +10977,8 @@ static int newidle_balance(struct rq *this_rq, struct rq_flags *rf)
>>
>> out:
>> /* Move the next balance forward */
>> + if (time_after(jiffies+1, this_rq->next_balance))
>> + this_rq->next_balance = jiffies+1;
>
> jiffies roll over here..
>
> Also, what's the point of the update_next_balance() addition in the face
> of this one? AFAICT this hunk completely renders the other hunk useless.
>
>> if (time_after(this_rq->next_balance, next_balance))
>> this_rq->next_balance = next_balance;
>
> and you've violated your own premise :-)

Agreed, this hunk is redundant. We should keep only the update_next_balance() hunk.

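For reference, the behaviour of the update_next_balance() hunk on its own can
be modelled in userspace roughly as below. This is only a sketch under stated
assumptions: update_next_balance_model() and after64() are hypothetical names,
"now" stands in for jiffies, the interval is passed in directly instead of
coming from get_sd_balance_interval(), and 64-bit counters are assumed.

```c
#include <stdint.h>

/* Wrap-safe "a is later than b", a stand-in for the kernel's time_after(). */
static int after64(uint64_t a, uint64_t b)
{
	return (int64_t)(b - a) < 0;
}

/*
 * Model of the patched update_next_balance() hunk: the candidate time is
 * last_balance + interval, clamped so it is never earlier than now + 1.
 * *next_balance is then only ever pulled earlier, down to that candidate.
 */
static void update_next_balance_model(uint64_t now, uint64_t last_balance,
				      uint64_t interval, uint64_t *next_balance)
{
	uint64_t next = last_balance + interval;

	/* A stale last_balance would otherwise yield a time in the past. */
	if (after64(now + 1, next))
		next = now + 1;

	if (after64(*next_balance, next))
		*next_balance = next;
}
```

With made-up inputs where last_balance + interval is already behind now, the
result is now + 1 instead of a time in the past; a candidate later than the
current *next_balance leaves it unchanged.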
>
> Now, this pattern is repeated throughout, if it's a problem here, why
> isn't it a problem in say rebalance_domains() ?

In rebalance_domains(), next_balance is assigned an initial value of
jiffies + 60*HZ and can only increase. So when we update this_rq->next_balance
with the computed next_balance, it should always be later than the current
jiffies.

>
> Can we please unify the code across sites instead of growing different
> hacks in different places?
>
I'll take a closer look at the next_balance computation in other places.
Tim