linux-kernel.vger.kernel.org archive mirror
From: shrikanth hegde <sshegde@linux.vnet.ibm.com>
To: Benjamin Segall <bsegall@google.com>
Cc: mingo@redhat.com, peterz@infradead.org,
	vincent.guittot@linaro.org, dietmar.eggemann@arm.com,
	tglx@linutronix.de, srikar@linux.vnet.ibm.com,
	arjan@linux.intel.com, svaidy@linux.ibm.com,
	linux-kernel@vger.kernel.org
Subject: Re: [RFC PATCH] sched/fair: Interleave cfs bandwidth timers for improved single thread performance at low utilization
Date: Wed, 15 Feb 2023 16:31:29 +0530
Message-ID: <cd37483e-bf11-ec74-c240-74935bb44809@linux.vnet.ibm.com>
In-Reply-To: <xm268rh06i97.fsf@google.com>

>>
>>              6.2.rc5                           with patch
>>         1CG    power   2CG    power   | 1CG  power     2CG        power
>> 1Core   218     44     315      46    | 219    45    277(+12%)    47(-2%)
>>         219     43     315      45    | 219    44    244(+22%)    48(-6%)
>>                                       |
>> 2Core   108     48     158      52    | 109    50    114(+26%)    59(-13%)
>>         109     49     157      52    | 109    49    136(+13%)    56(-7%)
>>                                       |
>> 4Core    60     59      89      65    |  62    58     72(+19%)    68(-5%)
>>          61     61      90      65    |  62    60     68(+24%)    73(-12%)
>>                                       |
>> 8Core    33     77      48      83    |  33    77     37(+23%)    91(-10%)
>>          33     77      48      84    |  33    77     38(+21%)    90(-7%)
>>
>> There is no benefit at utilization of 50% or more, and there is no
>> degradation either.
>>
>> This is RFC PATCH V2, where the code has been moved from hrtimer to
>> sched. This patch sets the initial value to a multiple of period/10.
>> Timers can still align if the times at which the cgroups are started
>> fall within the same period/10 interval. On a real-life workload, time
>> gives sufficient randomness. Better interleaving is possible by being
>> more deterministic. For example, when there are 2 cgroups, they should
>> have initial values of 0/50ms or 10/60ms and so on. When there are 3 cgroups,
>> 0/3/6ms or 1/4/7ms etc. That is more complicated, as it has to account
>> for cgroup addition/deletion and accuracy w.r.t. period/quota.
>> If that approach is better here, I will come up with that patch.
> 
> This does seem vaguely reasonable, though the power argument of
> consolidating wakeups and such is something that we intentionally do in
> other situations.
> 
Thank you, Benjamin, for taking a look and spending time reviewing this.

> How reasonable do you think it is to just say (and what do the
> equivalent numbers look like on your particular benchmark) "put some
> variance on your period config if you want variance"?

Run-to-run variance is expected with this patch, since the patch uses the
time, up to the last period/10, as the basis for interleaving.
That is what I could infer from your comment about variance. Please correct
me if I have misunderstood.
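
To make the variance point concrete, here is a minimal userspace sketch of
the expiry arithmetic, not kernel code. The helper names forward() and
first_expiry_patched() are hypothetical: the first models what
hrtimer_forward_now() does to the expiry, the second models the patched
start_cfs_bandwidth(). Times are plain integers in any one unit, and the
timer is assumed to start from an expiry of 0.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Model of hrtimer_forward(): move 'expires' past 'now' in whole
 * multiples of 'interval' and return the new expiry.
 */
static int64_t forward(int64_t expires, int64_t now, int64_t interval)
{
	int64_t delta = now - expires;

	if (delta < 0)
		return expires;		/* already in the future */
	if (delta >= interval)
		expires += (delta / interval) * interval;
	return expires + interval;
}

/*
 * Model of the patched start_cfs_bandwidth(): first advance the expiry
 * in steps of period/10, then forward it by one full period.
 */
static int64_t first_expiry_patched(int64_t expires, int64_t now,
				    int64_t period)
{
	int64_t incr = period / 10;
	int64_t delta = now - expires;

	assert(incr > 0);
	if (delta >= period)
		expires += (delta / incr) * incr;
	return forward(expires, now, period);
}
```

With a period of 100 and a start at now = 1234, the unpatched path gives an
expiry of 1300 while the patched path gives 1330. A second cgroup started at
1258 gets 1350, so the two are interleaved, but one started at 1239 also
gets 1330, so the two timers align. The phase therefore depends on which
period/10 slot the start time falls in, which is where the run-to-run
variance comes from.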

>>
>> Signed-off-by: Shrikanth Hegde <sshegde@linux.vnet.ibm.com>
>> ---
>>  kernel/sched/fair.c | 17 ++++++++++++++---
>>  1 file changed, 14 insertions(+), 3 deletions(-)
>>
>> diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
>> index ff4dbbae3b10..7b69c329e05d 100644
>> --- a/kernel/sched/fair.c
>> +++ b/kernel/sched/fair.c
>> @@ -5939,14 +5939,25 @@ static void init_cfs_rq_runtime(struct cfs_rq *cfs_rq)
>>
>>  void start_cfs_bandwidth(struct cfs_bandwidth *cfs_b)
>>  {
>> -	lockdep_assert_held(&cfs_b->lock);
>> +	struct hrtimer *period_timer = &cfs_b->period_timer;
>> +	s64 incr = ktime_to_ns(cfs_b->period) / 10;
>> +	ktime_t delta;
>> +	u64 orun = 1;
>>
>> +	lockdep_assert_held(&cfs_b->lock);
>>  	if (cfs_b->period_active)
>>  		return;
>>
>>  	cfs_b->period_active = 1;
>> -	hrtimer_forward_now(&cfs_b->period_timer, cfs_b->period);
>> -	hrtimer_start_expires(&cfs_b->period_timer, HRTIMER_MODE_ABS_PINNED);
>> +	delta = ktime_sub(period_timer->base->get_time(),
>> +			hrtimer_get_expires(period_timer));
>> +	if (unlikely(delta >= cfs_b->period)) {
> 
> Probably could have a short comment here that's something like "forward
> the hrtimer by period / 10 to reduce synchronized wakeups"
> 
Sure. I will add that in the next version of this patch.

>> +		orun = ktime_divns(delta, incr);
>> +		hrtimer_add_expires_ns(period_timer, incr * orun);
>> +	}
>> +
>> +	hrtimer_forward_now(period_timer, cfs_b->period);
>> +	hrtimer_start_expires(period_timer, HRTIMER_MODE_ABS_PINNED);
>>  }
>>
>>  static void destroy_cfs_bandwidth(struct cfs_bandwidth *cfs_b)
>> --
>> 2.31.1

Thread overview: 9+ messages
     [not found] <20230214120502.934324-1-sshegde@linux.vnet.ibm.com>
2023-02-14 21:37 ` [RFC PATCH] sched/fair: Interleave cfs bandwidth timers for improved single thread performance at low utilization Benjamin Segall
2023-02-15 11:01   ` shrikanth hegde [this message]
2023-02-15 21:32     ` Benjamin Segall
2023-02-16 19:57       ` shrikanth hegde
2023-02-14 15:24 shrikanth hegde
2023-02-20 17:38 ` Peter Zijlstra
2023-02-21 18:53   ` shrikanth hegde
2023-02-21 21:43     ` Benjamin Segall
2023-02-22  9:36       ` Peter Zijlstra
