linux-kernel.vger.kernel.org archive mirror
* [PATCH 2/2] sched: sched: Fix rq->next_balance time updated to earlier than current time
@ 2021-11-12 10:04 Vincent Guittot
  2021-11-12 15:21 ` Peter Zijlstra
  0 siblings, 1 reply; 3+ messages in thread
From: Vincent Guittot @ 2021-11-12 10:04 UTC
  To: mingo, peterz, juri.lelli, dietmar.eggemann, rostedt, bsegall,
	mgorman, bristot, linux-kernel, tim.c.chen
  Cc: joel, Vincent Guittot

From: Tim Chen <tim.c.chen@linux.intel.com>

In traces on newidle_balance(), this_rq->next_balance
time goes backward and earlier than current time jiffies, e.g.

11.602 (         ): probe:newidle_balance:(ffffffff810d2470) this_rq=0xffff88fe7f8aae00 next_balance=0x1004fb76c jiffies=0x1004fb739
11.624 (         ): probe:newidle_balance:(ffffffff810d2470) this_rq=0xffff88fe7f8aae00 next_balance=0x1004fb731 jiffies=0x1004fb739
13.856 (         ): probe:newidle_balance:(ffffffff810d2470) this_rq=0xffff88fe7f8aae00 next_balance=0x1004fb76c jiffies=0x1004fb73b
13.910 (         ): probe:newidle_balance:(ffffffff810d2470) this_rq=0xffff88fe7f8aae00 next_balance=0x1004fb731 jiffies=0x1004fb73b
14.637 (         ): probe:newidle_balance:(ffffffff810d2470) this_rq=0xffff88fe7f8aae00 next_balance=0x1004fb76c jiffies=0x1004fb73c
14.666 (         ): probe:newidle_balance:(ffffffff810d2470) this_rq=0xffff88fe7f8aae00 next_balance=0x1004fb731 jiffies=0x1004fb73c

It doesn't make sense to have a next_balance in the past.
Fix newidle_balance() and update_next_balance() so the next
balance time is at least jiffies+1.

Signed-off-by: Tim Chen <tim.c.chen@linux.intel.com>
[Rebase]
Signed-off-by: Vincent Guittot <vincent.guittot@linaro.org>
---
 kernel/sched/fair.c | 7 ++++++-
 1 file changed, 6 insertions(+), 1 deletion(-)

diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index a162b0ec8963..1050037578a9 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -10138,7 +10138,10 @@ update_next_balance(struct sched_domain *sd, unsigned long *next_balance)
 
 	/* used by idle balance, so cpu_busy = 0 */
 	interval = get_sd_balance_interval(sd, 0);
-	next = sd->last_balance + interval;
+	if (time_after(jiffies+1, sd->last_balance + interval))
+		next = jiffies+1;
+	else
+		next = sd->last_balance + interval;
 
 	if (time_after(*next_balance, next))
 		*next_balance = next;
@@ -10974,6 +10977,8 @@ static int newidle_balance(struct rq *this_rq, struct rq_flags *rf)
 
 out:
 	/* Move the next balance forward */
+	if (time_after(jiffies+1, this_rq->next_balance))
+		this_rq->next_balance = jiffies+1;
 	if (time_after(this_rq->next_balance, next_balance))
 		this_rq->next_balance = next_balance;
 
-- 
2.17.1



* Re: [PATCH 2/2] sched: sched: Fix rq->next_balance time updated to earlier than current time
  2021-11-12 10:04 [PATCH 2/2] sched: sched: Fix rq->next_balance time updated to earlier than current time Vincent Guittot
@ 2021-11-12 15:21 ` Peter Zijlstra
  2021-11-15 19:40   ` Tim Chen
  0 siblings, 1 reply; 3+ messages in thread
From: Peter Zijlstra @ 2021-11-12 15:21 UTC
  To: Vincent Guittot
  Cc: mingo, juri.lelli, dietmar.eggemann, rostedt, bsegall, mgorman,
	bristot, linux-kernel, tim.c.chen, joel

On Fri, Nov 12, 2021 at 11:04:58AM +0100, Vincent Guittot wrote:
> From: Tim Chen <tim.c.chen@linux.intel.com>
> 
> In traces on newidle_balance(), this_rq->next_balance
> time goes backward and earlier than current time jiffies, e.g.
> 
> 11.602 (         ): probe:newidle_balance:(ffffffff810d2470) this_rq=0xffff88fe7f8aae00 next_balance=0x1004fb76c jiffies=0x1004fb739
> 11.624 (         ): probe:newidle_balance:(ffffffff810d2470) this_rq=0xffff88fe7f8aae00 next_balance=0x1004fb731 jiffies=0x1004fb739
> 13.856 (         ): probe:newidle_balance:(ffffffff810d2470) this_rq=0xffff88fe7f8aae00 next_balance=0x1004fb76c jiffies=0x1004fb73b
> 13.910 (         ): probe:newidle_balance:(ffffffff810d2470) this_rq=0xffff88fe7f8aae00 next_balance=0x1004fb731 jiffies=0x1004fb73b
> 14.637 (         ): probe:newidle_balance:(ffffffff810d2470) this_rq=0xffff88fe7f8aae00 next_balance=0x1004fb76c jiffies=0x1004fb73c
> 14.666 (         ): probe:newidle_balance:(ffffffff810d2470) this_rq=0xffff88fe7f8aae00 next_balance=0x1004fb731 jiffies=0x1004fb73c

No explanation of what these numbers mean, or where they're taken from.

> It doesn't make sense to have a next_balance in the past.
> Fix newidle_balance() and update_next_balance() so the next
> balance time is at least jiffies+1.

The changelog is deficient in that it doesn't explain how the times end
up in the past, therefore we cannot evaluate if the provided solution is
sufficient etc..

> Signed-off-by: Tim Chen <tim.c.chen@linux.intel.com>
> Signed-off-by: Vincent Guittot <vincent.guittot@linaro.org>
> ---
>  kernel/sched/fair.c | 7 ++++++-
>  1 file changed, 6 insertions(+), 1 deletion(-)
> 
> diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
> index a162b0ec8963..1050037578a9 100644
> --- a/kernel/sched/fair.c
> +++ b/kernel/sched/fair.c
> @@ -10138,7 +10138,10 @@ update_next_balance(struct sched_domain *sd, unsigned long *next_balance)
>  
>  	/* used by idle balance, so cpu_busy = 0 */
>  	interval = get_sd_balance_interval(sd, 0);
> -	next = sd->last_balance + interval;
> +	if (time_after(jiffies+1, sd->last_balance + interval))
> +		next = jiffies+1;
> +	else
> +		next = sd->last_balance + interval;
>  
>  	if (time_after(*next_balance, next))
>  		*next_balance = next;
> @@ -10974,6 +10977,8 @@ static int newidle_balance(struct rq *this_rq, struct rq_flags *rf)
>  
>  out:
>  	/* Move the next balance forward */
> +	if (time_after(jiffies+1, this_rq->next_balance))
> +		this_rq->next_balance = jiffies+1;

jiffies roll over here..

Also, what's the point of the update_next_balance() addition in the face
of this one? AFAICT this hunk completely renders the other hunk useless.

>  	if (time_after(this_rq->next_balance, next_balance))
>  		this_rq->next_balance = next_balance;

and you've violated your own premise :-)

Now, this pattern is repeated throughout, if it's a problem here, why
isn't it a problem in say rebalance_domains() ?

Can we please unify the code across sites instead of growing different
hacks in different places?



* Re: [PATCH 2/2] sched: sched: Fix rq->next_balance time updated to earlier than current time
  2021-11-12 15:21 ` Peter Zijlstra
@ 2021-11-15 19:40   ` Tim Chen
  0 siblings, 0 replies; 3+ messages in thread
From: Tim Chen @ 2021-11-15 19:40 UTC
  To: Peter Zijlstra, Vincent Guittot
  Cc: mingo, juri.lelli, dietmar.eggemann, rostedt, bsegall, mgorman,
	bristot, linux-kernel, joel

On 11/12/21 7:21 AM, Peter Zijlstra wrote:
> On Fri, Nov 12, 2021 at 11:04:58AM +0100, Vincent Guittot wrote:
>> From: Tim Chen <tim.c.chen@linux.intel.com>
>>
>> In traces on newidle_balance(), this_rq->next_balance
>> time goes backward and earlier than current time jiffies, e.g.
>>
>> 11.602 (         ): probe:newidle_balance:(ffffffff810d2470) this_rq=0xffff88fe7f8aae00 next_balance=0x1004fb76c jiffies=0x1004fb739
>> 11.624 (         ): probe:newidle_balance:(ffffffff810d2470) this_rq=0xffff88fe7f8aae00 next_balance=0x1004fb731 jiffies=0x1004fb739
>> 13.856 (         ): probe:newidle_balance:(ffffffff810d2470) this_rq=0xffff88fe7f8aae00 next_balance=0x1004fb76c jiffies=0x1004fb73b
>> 13.910 (         ): probe:newidle_balance:(ffffffff810d2470) this_rq=0xffff88fe7f8aae00 next_balance=0x1004fb731 jiffies=0x1004fb73b
>> 14.637 (         ): probe:newidle_balance:(ffffffff810d2470) this_rq=0xffff88fe7f8aae00 next_balance=0x1004fb76c jiffies=0x1004fb73c
>> 14.666 (         ): probe:newidle_balance:(ffffffff810d2470) this_rq=0xffff88fe7f8aae00 next_balance=0x1004fb731 jiffies=0x1004fb73c
> 
> No explanation of what these numbers mean, or where they're taken from.

Sorry, I should have added more explanation. I put a probe on newidle_balance()
and dumped the values of the this_rq pointer, this_rq->next_balance and jiffies
on entry to newidle_balance() using the following commands:

perf probe 'newidle_balance this_rq this_rq->next_balance jiffies'
perf trace -e probe:newidle_balance

In the first line of the trace, next_balance starts off at 0x1004fb76c:

11.602 (         ): probe:newidle_balance:(ffffffff810d2470) this_rq=0xffff88fe7f8aae00 next_balance=0x1004fb76c jiffies=0x1004fb739

In the second line, next_balance goes backward to 0x1004fb731, which is
earlier than the jiffies value 0x1004fb739:

11.624 (         ): probe:newidle_balance:(ffffffff810d2470) this_rq=0xffff88fe7f8aae00 next_balance=0x1004fb731 jiffies=0x1004fb739
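
As a quick check of the two trace lines above, the signed distance between
next_balance and jiffies can be computed with the same wrap-safe subtraction
that jiffies comparisons rely on. This is a userspace sketch, not kernel
code; jiffies_delta() is a hypothetical helper name, and the input values
are the ones from the trace.

```c
#include <stdint.h>

/*
 * Hypothetical helper: signed distance from 'now' to 'deadline' in jiffies.
 * Positive means the deadline is in the future, negative means it is in
 * the past. The unsigned subtraction followed by a signed cast keeps the
 * result meaningful across counter wraparound.
 */
static int64_t jiffies_delta(uint64_t deadline, uint64_t now)
{
	return (int64_t)(deadline - now);
}
```

With the traced values, jiffies_delta(0x1004fb76c, 0x1004fb739) is 51
(next_balance is 51 jiffies ahead), while
jiffies_delta(0x1004fb731, 0x1004fb739) is -8 (next_balance is 8 jiffies
in the past).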


> 
>> It doesn't make sense to have a next_balance in the past.
>> Fix newidle_balance() and update_next_balance() so the next
>> balance time is at least jiffies+1.
> 
> The changelog is deficient in that it doesn't explain how the times end
> up in the past, therefore we cannot evaluate if the provided solution is
> sufficient etc..
> 
>> Signed-off-by: Tim Chen <tim.c.chen@linux.intel.com>
>> Signed-off-by: Vincent Guittot <vincent.guittot@linaro.org>
>> ---
>>  kernel/sched/fair.c | 7 ++++++-
>>  1 file changed, 6 insertions(+), 1 deletion(-)
>>
>> diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
>> index a162b0ec8963..1050037578a9 100644
>> --- a/kernel/sched/fair.c
>> +++ b/kernel/sched/fair.c
>> @@ -10138,7 +10138,10 @@ update_next_balance(struct sched_domain *sd, unsigned long *next_balance)
>>  
>>  	/* used by idle balance, so cpu_busy = 0 */
>>  	interval = get_sd_balance_interval(sd, 0);
>> -	next = sd->last_balance + interval;
>> +	if (time_after(jiffies+1, sd->last_balance + interval))
>> +		next = jiffies+1;
>> +	else
>> +		next = sd->last_balance + interval;
>>  
>>  	if (time_after(*next_balance, next))
>>  		*next_balance = next;
>> @@ -10974,6 +10977,8 @@ static int newidle_balance(struct rq *this_rq, struct rq_flags *rf)
>>  
>>  out:
>>  	/* Move the next balance forward */
>> +	if (time_after(jiffies+1, this_rq->next_balance))
>> +		this_rq->next_balance = jiffies+1;
> 
> jiffies roll over here..
> 
> Also, what's the point of the update_next_balance() addition in the face
> of this one? AFAICT this hunk completely renders the other hunk useless.
> 
>>  	if (time_after(this_rq->next_balance, next_balance))
>>  		this_rq->next_balance = next_balance;
> 
> and you've violated your own premise :-)

Agreed, this hunk is redundant. We should keep only the update_next_balance() hunk.
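
For reference, the clamp in the update_next_balance() hunk can be modelled
in userspace roughly as below. This is only a sketch under stated
assumptions: time_after_u64() imitates the kernel's time_after() with
fixed-width types, clamp_next() is a hypothetical name, and 'now' stands in
for jiffies.

```c
#include <stdint.h>

/* Wrap-safe "a is after b", modelled on the kernel's time_after(). */
static int time_after_u64(uint64_t a, uint64_t b)
{
	return (int64_t)(b - a) < 0;
}

/*
 * Hypothetical stand-in for the patched computation of 'next':
 * never return a deadline earlier than now + 1.
 */
static uint64_t clamp_next(uint64_t now, uint64_t last_balance,
			   uint64_t interval)
{
	uint64_t next = last_balance + interval;

	if (time_after_u64(now + 1, next))
		next = now + 1;
	return next;
}
```

For example, with now = 0x1004fb739 and a stale last_balance + interval of
0x1004fb731 (the split into last_balance = 0x1004fb700 and interval = 0x31
is made up for illustration), clamp_next() returns 0x1004fb73a, i.e.
jiffies+1.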

> 
> Now, this pattern is repeated throughout, if it's a problem here, why
> isn't it a problem in say rebalance_domains() ?

In rebalance_domains(), next_balance is assigned an initial value of jiffies+60*HZ
and could only increase.

So when we update this_rq->next_balance with the computed next_balance,
it should always be later than the current jiffies.

> 
> Can we please unify the code across sites instead of growing different
> hacks in different places?
> 

I'll take a closer look at the next_balance computation in other places.

Tim
