All of lore.kernel.org
 help / color / mirror / Atom feed
From: Dave Chiluk <chiluk+linux@indeed.com>
To: Ben Segall <bsegall@google.com>
Cc: Phil Auld <pauld@redhat.com>, Peter Oskolkov <posk@posk.io>,
	Peter Zijlstra <peterz@infradead.org>,
	Ingo Molnar <mingo@redhat.com>,
	cgroups@vger.kernel.org,
	Linux Kernel Mailing List <linux-kernel@vger.kernel.org>,
	Brendan Gregg <bgregg@netflix.com>, Kyle Anderson <kwa@yelp.com>,
	Gabriel Munos <gmunoz@netflix.com>,
	John Hammond <jhammond@indeed.com>,
	Cong Wang <xiyou.wangcong@gmail.com>
Subject: Re: [PATCH v4 1/1] sched/fair: Return all runtime when cfs_b has very little remaining.
Date: Wed, 26 Jun 2019 17:10:04 -0500	[thread overview]
Message-ID: <CAC=E7cUxcNkc7T7AXCr3PO6rqqxrk269JW3SDnXG-LtO-6-BVQ@mail.gmail.com> (raw)
In-Reply-To: <xm26tvcex50s.fsf@bsegall-linux.svl.corp.google.com>

On Mon, Jun 24, 2019 at 10:33:07AM -0700, bsegall@google.com wrote:
> This still has a similar cost as reducing min_cfs_rq_runtime to 0 - we
> now take a tg-global lock on every group se dequeue. Setting min=0 means
> that we have to take it on both enqueue and dequeue, while baseline
> means we take it once per min_cfs_rq_runtime in the worst case.
Yep, it's only slightly better than simply setting min_cfs_rq_runtime=0.
There's definitely a tradeoff of having to grab the lock every time.

The other non-obvious benefit to this is that when the application
starts hitting throttling the entire application starts hitting
throttling closer to simultaneously.  Previously, threads that don't do
much could continue sipping on their 1ms of min_cfs_rq_runtime, while
main threads were throttled.  With this the application would more or
less halt within 5ms of full quota usage.

> In addition how much this helps is very dependent on the exact pattern
> of sleep/wake - you can still strand all but 15ms of runtime with a
> pretty reasonable pattern.

I thought this change would have an upper bound stranded time of:
NUMCPUs * min_cfs_rq_runtime - 3 * sched_cfs_bandwidth_slice().
From the stranding of 1ms on each queue minues the amount that we'd
return when we forcibly hit the 3 x bandwidth slice Am I missing
something here? Additionally that's worst case, and would require that
threads be scheduled on distinct cpus mostly sequentially, and then
never run again.

> If the cost of taking this global lock across all cpus without a
> ratelimit was somehow not a problem, I'd much prefer to just set
> min_cfs_rq_runtime = 0. (Assuming it is, I definitely prefer the "lie
> and sorta have 2x period 2x runtime" solution of removing expiration)

Can I take this as an technical ack of the v3 patchset?  Do you have any
other comments for that patchset you'd like to see corrected before it
gets integrated?  If you are indeed good with that patchset, I'll clean
up the comment and Documentation text and re-submit as a v5 patchset. I

actually like that patchset best as well, for all the reasons already
discussed.  Especially the one where it's been running like this for 5
years prior to 512ac999.

Also given my new-found understanding of min_cfs_rq_runtime, doesn't
this mean that if we don't expire runtime we could potentially have a
worst case of 1ms * NUMCPUs of extra runtime instead of 2x runtime?
This is because min_cfs_rq_runtime could theoretically live on a per-cpu
indefinitely?  Still this is actually much smaller than I previously had
feared.  Also it still holds that average runtime is still bounded by
runtime/period on average.

Thanks

  reply	other threads:[~2019-06-26 22:10 UTC|newest]

Thread overview: 40+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2019-05-17 19:30 [PATCH] sched/fair: Fix low cpu usage with high throttling by removing expiration of cpu slices Dave Chiluk
2019-05-23 18:44 ` [PATCH v2 0/1] sched/fair: Fix low cpu usage with high throttling by removing expiration of cpu-local slices Dave Chiluk
2019-05-23 18:44   ` [PATCH v2 1/1] " Dave Chiluk
2019-05-23 21:01     ` Peter Oskolkov
2019-05-24 14:32       ` Phil Auld
2019-05-24 15:14         ` Dave Chiluk
2019-05-24 15:59           ` Phil Auld
2019-05-24 16:28           ` Peter Oskolkov
2019-05-24 21:35             ` Dave Chiluk
2019-05-24 22:07               ` Peter Oskolkov
2019-05-28 22:25                 ` Dave Chiluk
2019-05-24  8:55     ` Peter Zijlstra
2019-05-29 19:08 ` [PATCH v3 0/1] " Dave Chiluk
2019-05-29 19:08   ` [PATCH v3 1/1] " Dave Chiluk
2019-05-29 19:28     ` Phil Auld
2019-05-29 19:50     ` bsegall
2019-05-29 21:05     ` bsegall
2019-05-30 17:53       ` Dave Chiluk
2019-05-30 20:44         ` bsegall
     [not found] ` <1561391404-14450-1-git-send-email-chiluk+linux@indeed.com>
2019-06-24 15:50   ` [PATCH v4 1/1] sched/fair: Return all runtime when cfs_b has very little remaining Dave Chiluk
2019-06-24 17:33     ` bsegall
2019-06-26 22:10       ` Dave Chiluk [this message]
2019-06-27 20:18         ` bsegall
2019-06-27 19:09 ` [PATCH] sched/fair: Fix low cpu usage with high throttling by removing expiration of cpu-local slices Dave Chiluk
2019-06-27 19:49 ` [PATCH v5 0/1] " Dave Chiluk
2019-06-27 19:49   ` [PATCH v5 1/1] " Dave Chiluk
2019-07-01 20:15     ` bsegall
2019-07-11  9:51       ` Peter Zijlstra
2019-07-11 17:46         ` bsegall
     [not found]           ` <CAC=E7cV4sO50NpYOZ06n_BkZTcBqf1KQp83prc+oave3ircBrw@mail.gmail.com>
2019-07-12 18:01             ` bsegall
2019-07-12 22:09             ` bsegall
2019-07-15 15:44               ` Dave Chiluk
2019-07-16 19:58     ` bsegall
2019-07-23 16:44 ` [PATCH v6 0/1] " Dave Chiluk
2019-07-23 16:44   ` [PATCH v6 1/1] " Dave Chiluk
2019-07-23 17:13     ` Phil Auld
2019-07-23 22:12       ` Dave Chiluk
2019-07-23 23:26         ` Phil Auld
2019-07-26 18:14       ` Peter Zijlstra
2019-08-08 10:53     ` [tip:sched/core] " tip-bot for Dave Chiluk

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to='CAC=E7cUxcNkc7T7AXCr3PO6rqqxrk269JW3SDnXG-LtO-6-BVQ@mail.gmail.com' \
    --to=chiluk+linux@indeed.com \
    --cc=bgregg@netflix.com \
    --cc=bsegall@google.com \
    --cc=cgroups@vger.kernel.org \
    --cc=gmunoz@netflix.com \
    --cc=jhammond@indeed.com \
    --cc=kwa@yelp.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=mingo@redhat.com \
    --cc=pauld@redhat.com \
    --cc=peterz@infradead.org \
    --cc=posk@posk.io \
    --cc=xiyou.wangcong@gmail.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.