From: Sasha Levin <sashal@kernel.org>
To: linux-kernel@vger.kernel.org, stable@vger.kernel.org
Cc: Xuewei Zhang <xueweiz@google.com>, Phil Auld <pauld@redhat.com>,
Peter Zijlstra <peterz@infradead.org>,
Anton Blanchard <anton@ozlabs.org>,
Ben Segall <bsegall@google.com>,
Dietmar Eggemann <dietmar.eggemann@arm.com>,
Juri Lelli <juri.lelli@redhat.com>,
Linus Torvalds <torvalds@linux-foundation.org>,
Mel Gorman <mgorman@suse.de>,
Steven Rostedt <rostedt@goodmis.org>,
Thomas Gleixner <tglx@linutronix.de>,
Vincent Guittot <vincent.guittot@linaro.org>,
Ingo Molnar <mingo@kernel.org>, Sasha Levin <sashal@kernel.org>
Subject: [PATCH AUTOSEL 5.3 63/89] sched/fair: Scale bandwidth quota and period without losing quota/period ratio precision
Date: Fri, 18 Oct 2019 18:02:58 -0400 [thread overview]
Message-ID: <20191018220324.8165-63-sashal@kernel.org> (raw)
In-Reply-To: <20191018220324.8165-1-sashal@kernel.org>
From: Xuewei Zhang <xueweiz@google.com>
[ Upstream commit 4929a4e6faa0f13289a67cae98139e727f0d4a97 ]
The quota/period ratio is used to ensure a child task group won't get
more bandwidth than the parent task group, and is calculated as:
normalized_cfs_quota() = [(quota_us << 20) / period_us]
If the quota/period ratio was changed during this scaling due to
precision loss, it will cause inconsistency between parent and child
task groups.
See below example:
A userspace container manager (kubelet) does three operations:
1) Create a parent cgroup, set quota to 1,000us and period to 10,000us.
2) Create a few children cgroups.
3) Set quota to 1,000us and period to 10,000us on a child cgroup.
These operations are expected to succeed. However, if the scaling of
147/128 happens before step 3, quota and period of the parent cgroup
will be changed:
new_quota: 1148437ns, 1148us
new_period: 11484375ns, 11484us
And when step 3 comes in, the ratio of the child cgroup will be
104857, which will be larger than the parent cgroup ratio (104821),
and will fail.
Scaling them by a factor of 2 will fix the problem.
Tested-by: Phil Auld <pauld@redhat.com>
Signed-off-by: Xuewei Zhang <xueweiz@google.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Acked-by: Phil Auld <pauld@redhat.com>
Cc: Anton Blanchard <anton@ozlabs.org>
Cc: Ben Segall <bsegall@google.com>
Cc: Dietmar Eggemann <dietmar.eggemann@arm.com>
Cc: Juri Lelli <juri.lelli@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Mel Gorman <mgorman@suse.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Vincent Guittot <vincent.guittot@linaro.org>
Fixes: 2e8e19226398 ("sched/fair: Limit sched_cfs_period_timer() loop to avoid hard lockup")
Link: https://lkml.kernel.org/r/20191004001243.140897-1-xueweiz@google.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
kernel/sched/fair.c | 36 ++++++++++++++++++++++--------------
1 file changed, 22 insertions(+), 14 deletions(-)
diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index 86cfc5d5129ce..16b5d29bd7300 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -4995,20 +4995,28 @@ static enum hrtimer_restart sched_cfs_period_timer(struct hrtimer *timer)
if (++count > 3) {
u64 new, old = ktime_to_ns(cfs_b->period);
- new = (old * 147) / 128; /* ~115% */
- new = min(new, max_cfs_quota_period);
-
- cfs_b->period = ns_to_ktime(new);
-
- /* since max is 1s, this is limited to 1e9^2, which fits in u64 */
- cfs_b->quota *= new;
- cfs_b->quota = div64_u64(cfs_b->quota, old);
-
- pr_warn_ratelimited(
- "cfs_period_timer[cpu%d]: period too short, scaling up (new cfs_period_us %lld, cfs_quota_us = %lld)\n",
- smp_processor_id(),
- div_u64(new, NSEC_PER_USEC),
- div_u64(cfs_b->quota, NSEC_PER_USEC));
+ /*
+ * Grow period by a factor of 2 to avoid losing precision.
+ * Precision loss in the quota/period ratio can cause __cfs_schedulable
+ * to fail.
+ */
+ new = old * 2;
+ if (new < max_cfs_quota_period) {
+ cfs_b->period = ns_to_ktime(new);
+ cfs_b->quota *= 2;
+
+ pr_warn_ratelimited(
+ "cfs_period_timer[cpu%d]: period too short, scaling up (new cfs_period_us = %lld, cfs_quota_us = %lld)\n",
+ smp_processor_id(),
+ div_u64(new, NSEC_PER_USEC),
+ div_u64(cfs_b->quota, NSEC_PER_USEC));
+ } else {
+ pr_warn_ratelimited(
+ "cfs_period_timer[cpu%d]: period too short, but cannot scale up without losing precision (cfs_period_us = %lld, cfs_quota_us = %lld)\n",
+ smp_processor_id(),
+ div_u64(old, NSEC_PER_USEC),
+ div_u64(cfs_b->quota, NSEC_PER_USEC));
+ }
/* reset count so we don't come right back in here */
count = 0;
--
2.20.1
next prev parent reply other threads:[~2019-10-18 22:04 UTC|newest]
Thread overview: 91+ messages / expand[flat|nested] mbox.gz Atom feed top
2019-10-18 22:01 [PATCH AUTOSEL 5.3 01/89] iio: adc: meson_saradc: Fix memory allocation order Sasha Levin
2019-10-18 22:01 ` [PATCH AUTOSEL 5.3 02/89] iio: fix center temperature of bmc150-accel-core Sasha Levin
2019-10-18 22:01 ` [PATCH AUTOSEL 5.3 03/89] libsubcmd: Make _FORTIFY_SOURCE defines dependent on the feature Sasha Levin
2019-10-18 22:01 ` [PATCH AUTOSEL 5.3 04/89] perf tests: Avoid raising SEGV using an obvious NULL dereference Sasha Levin
2019-10-18 22:02 ` [PATCH AUTOSEL 5.3 05/89] perf map: Fix overlapped map handling Sasha Levin
2019-10-18 22:02 ` [PATCH AUTOSEL 5.3 06/89] perf script brstackinsn: Fix recovery from LBR/binary mismatch Sasha Levin
2019-10-18 22:02 ` [PATCH AUTOSEL 5.3 07/89] perf jevents: Fix period for Intel fixed counters Sasha Levin
2019-10-18 22:02 ` [PATCH AUTOSEL 5.3 08/89] perf tools: Propagate get_cpuid() error Sasha Levin
2019-10-18 22:02 ` [PATCH AUTOSEL 5.3 09/89] perf annotate: Propagate perf_env__arch() error Sasha Levin
2019-10-18 22:02 ` [PATCH AUTOSEL 5.3 10/89] perf annotate: Fix the signedness of failure returns Sasha Levin
2019-10-18 22:02 ` [PATCH AUTOSEL 5.3 11/89] perf annotate: Propagate the symbol__annotate() error return Sasha Levin
2019-10-18 22:02 ` [PATCH AUTOSEL 5.3 12/89] perf annotate: Fix arch specific ->init() failure errors Sasha Levin
2019-10-18 22:02 ` [PATCH AUTOSEL 5.3 13/89] perf annotate: Return appropriate error code for allocation failures Sasha Levin
2019-10-18 22:02 ` [PATCH AUTOSEL 5.3 14/89] perf annotate: Don't return -1 for error when doing BPF disassembly Sasha Levin
2019-10-18 22:02 ` [PATCH AUTOSEL 5.3 15/89] staging: rtl8188eu: fix null dereference when kzalloc fails Sasha Levin
2019-10-18 22:02 ` [PATCH AUTOSEL 5.3 16/89] crypto: arm/aes-ce - add dependency on AES library Sasha Levin
2019-10-18 22:02 ` [PATCH AUTOSEL 5.3 17/89] RDMA/siw: Fix serialization issue in write_space() Sasha Levin
2019-10-18 22:02 ` [PATCH AUTOSEL 5.3 18/89] RDMA/hfi1: Prevent memory leak in sdma_init Sasha Levin
2019-10-18 22:02 ` [PATCH AUTOSEL 5.3 19/89] RDMA/iw_cxgb4: fix SRQ access from dump_qp() Sasha Levin
2019-10-18 22:02 ` [PATCH AUTOSEL 5.3 20/89] RDMA/iwcm: Fix a lock inversion issue Sasha Levin
2019-10-18 22:02 ` [PATCH AUTOSEL 5.3 21/89] HID: hyperv: Use in-place iterator API in the channel callback Sasha Levin
2019-10-18 22:02 ` [PATCH AUTOSEL 5.3 22/89] kselftest: exclude failed TARGETS from runlist Sasha Levin
2019-10-18 22:02 ` [PATCH AUTOSEL 5.3 23/89] selftests/kselftest/runner.sh: Add 45 second timeout per test Sasha Levin
2019-10-18 22:02 ` [PATCH AUTOSEL 5.3 24/89] nfs: Fix nfsi->nrequests count error on nfs_inode_remove_request Sasha Levin
2019-10-18 22:02 ` [PATCH AUTOSEL 5.3 25/89] arm64: cpufeature: Effectively expose FRINT capability to userspace Sasha Levin
2019-10-18 22:02 ` [PATCH AUTOSEL 5.3 26/89] arm64: Fix incorrect irqflag restore for priority masking for compat Sasha Levin
2019-10-18 22:02 ` [PATCH AUTOSEL 5.3 27/89] arm64: ftrace: Ensure synchronisation in PLT setup for Neoverse-N1 #1542419 Sasha Levin
2019-10-18 22:02 ` [PATCH AUTOSEL 5.3 28/89] tty: serial: owl: Fix the link time qualifier of 'owl_uart_exit()' Sasha Levin
2019-10-18 22:02 ` [PATCH AUTOSEL 5.3 29/89] tty: serial: rda: Fix the link time qualifier of 'rda_uart_exit()' Sasha Levin
2019-10-18 22:02 ` [PATCH AUTOSEL 5.3 30/89] serial/sifive: select SERIAL_EARLYCON Sasha Levin
2019-10-18 22:02 ` [PATCH AUTOSEL 5.3 31/89] tty: n_hdlc: fix build on SPARC Sasha Levin
2019-10-18 22:02 ` [PATCH AUTOSEL 5.3 32/89] misc: fastrpc: prevent memory leak in fastrpc_dma_buf_attach Sasha Levin
2019-10-18 22:02 ` [PATCH AUTOSEL 5.3 33/89] RDMA/core: Fix an error handling path in 'res_get_common_doit()' Sasha Levin
2019-10-18 22:02 ` [PATCH AUTOSEL 5.3 34/89] RDMA/cm: Fix memory leak in cm_add/remove_one Sasha Levin
2019-10-18 22:02 ` [PATCH AUTOSEL 5.3 35/89] RDMA/cxgb4: Do not dma memory off of the stack Sasha Levin
2019-10-18 22:02 ` [PATCH AUTOSEL 5.3 36/89] RDMA/nldev: Reshuffle the code to avoid need to rebind QP in error path Sasha Levin
2019-10-18 22:02 ` [PATCH AUTOSEL 5.3 37/89] RDMA/mlx5: Do not allow rereg of a ODP MR Sasha Levin
2019-10-18 22:02 ` [PATCH AUTOSEL 5.3 38/89] RDMA/mlx5: Order num_pending_prefetch properly with synchronize_srcu Sasha Levin
2019-10-18 22:02 ` [PATCH AUTOSEL 5.3 39/89] RDMA/mlx5: Add missing synchronize_srcu() for MW cases Sasha Levin
2019-10-18 22:02 ` [PATCH AUTOSEL 5.3 40/89] gpio: max77620: Use correct unit for debounce times Sasha Levin
2019-10-18 22:02 ` [PATCH AUTOSEL 5.3 41/89] fs: cifs: mute -Wunused-const-variable message Sasha Levin
2019-10-18 22:02 ` [PATCH AUTOSEL 5.3 42/89] arm64: vdso32: Fix broken compat vDSO build warnings Sasha Levin
2019-10-18 22:02 ` [PATCH AUTOSEL 5.3 43/89] arm64: vdso32: Detect binutils support for dmb ishld Sasha Levin
2019-10-18 22:02 ` [PATCH AUTOSEL 5.3 44/89] serial: mctrl_gpio: Check for NULL pointer Sasha Levin
2019-10-18 22:02 ` [PATCH AUTOSEL 5.3 45/89] serial: 8250_omap: Fix gpio check for auto RTS/CTS Sasha Levin
2019-10-18 22:02 ` [PATCH AUTOSEL 5.3 46/89] arm64: Default to building compat vDSO with clang when CONFIG_CC_IS_CLANG Sasha Levin
2019-10-18 22:02 ` [PATCH AUTOSEL 5.3 47/89] arm64: vdso32: Don't use KBUILD_CPPFLAGS unconditionally Sasha Levin
2019-10-18 22:02 ` [PATCH AUTOSEL 5.3 48/89] efi/cper: Fix endianness of PCIe class code Sasha Levin
2019-10-18 22:02 ` [PATCH AUTOSEL 5.3 49/89] efi/x86: Do not clean dummy variable in kexec path Sasha Levin
2019-10-18 22:02 ` [PATCH AUTOSEL 5.3 50/89] kbuild: fix build error of 'make nsdeps' in clean tree Sasha Levin
2019-10-19 0:14 ` Masahiro Yamada
2019-10-29 9:09 ` Sasha Levin
2019-10-18 22:02 ` [PATCH AUTOSEL 5.3 51/89] MIPS: include: Mark __cmpxchg as __always_inline Sasha Levin
2019-10-18 22:02 ` [PATCH AUTOSEL 5.3 52/89] riscv: avoid kernel hangs when trapped in BUG() Sasha Levin
2019-10-18 22:02 ` [PATCH AUTOSEL 5.3 53/89] riscv: avoid sending a SIGTRAP to a user thread trapped in WARN() Sasha Levin
2019-10-18 22:02 ` [PATCH AUTOSEL 5.3 54/89] riscv: Correct the handling of unexpected ebreak in do_trap_break() Sasha Levin
2019-10-18 22:02 ` [PATCH AUTOSEL 5.3 55/89] x86/xen: Return from panic notifier Sasha Levin
2019-10-18 22:02 ` [PATCH AUTOSEL 5.3 56/89] ocfs2: clear zero in unaligned direct IO Sasha Levin
2019-10-18 22:02 ` [PATCH AUTOSEL 5.3 57/89] fs: ocfs2: fix possible null-pointer dereferences in ocfs2_xa_prepare_entry() Sasha Levin
2019-10-18 22:02 ` [PATCH AUTOSEL 5.3 58/89] fs: ocfs2: fix a possible null-pointer dereference in ocfs2_write_end_nolock() Sasha Levin
2019-10-18 22:02 ` [PATCH AUTOSEL 5.3 59/89] fs: ocfs2: fix a possible null-pointer dereference in ocfs2_info_scan_inode_alloc() Sasha Levin
2019-10-18 22:02 ` [PATCH AUTOSEL 5.3 60/89] btrfs: silence maybe-uninitialized warning in clone_range Sasha Levin
2019-10-18 22:02 ` [PATCH AUTOSEL 5.3 61/89] arm64: armv8_deprecated: Checking return value for memory allocation Sasha Levin
2019-10-18 22:02 ` [PATCH AUTOSEL 5.3 62/89] x86/cpu: Add Comet Lake to the Intel CPU models header Sasha Levin
2019-10-18 22:02 ` Sasha Levin [this message]
2019-10-18 22:02 ` [PATCH AUTOSEL 5.3 64/89] sched/vtime: Fix guest/system mis-accounting on task switch Sasha Levin
2019-10-18 22:03 ` [PATCH AUTOSEL 5.3 65/89] perf/core: Rework memory accounting in perf_mmap() Sasha Levin
2019-10-18 22:03 ` [PATCH AUTOSEL 5.3 66/89] perf/core: Fix corner case in perf_rotate_context() Sasha Levin
2019-10-18 22:03 ` [PATCH AUTOSEL 5.3 67/89] perf/x86/amd: Change/fix NMI latency mitigation to use a timestamp Sasha Levin
2019-10-18 22:03 ` [PATCH AUTOSEL 5.3 68/89] drm/amdgpu: fix memory leak Sasha Levin
2019-10-18 22:03 ` [PATCH AUTOSEL 5.3 69/89] iio: adc: hx711: fix bug in sampling of data Sasha Levin
2019-10-18 22:03 ` [PATCH AUTOSEL 5.3 70/89] iio: accel: adxl372: Fix/remove limitation for FIFO samples Sasha Levin
2019-10-18 22:03 ` [PATCH AUTOSEL 5.3 71/89] iio: accel: adxl372: Fix push to buffers lost samples Sasha Levin
2019-10-18 22:03 ` [PATCH AUTOSEL 5.3 72/89] iio: accel: adxl372: Perform a reset at start up Sasha Levin
2019-10-18 22:03 ` [PATCH AUTOSEL 5.3 73/89] iio: imu: adis16400: release allocated memory on failure Sasha Levin
2019-10-18 22:03 ` [PATCH AUTOSEL 5.3 74/89] iio: imu: adis16400: fix memory leak Sasha Levin
2019-10-18 22:03 ` [PATCH AUTOSEL 5.3 75/89] iio: imu: st_lsm6dsx: fix waitime for st_lsm6dsx i2c controller Sasha Levin
2019-10-18 22:03 ` [PATCH AUTOSEL 5.3 76/89] iio: light: fix vcnl4000 devicetree hooks Sasha Levin
2019-10-18 22:03 ` [PATCH AUTOSEL 5.3 77/89] iio: light: add missing vcnl4040 of_compatible Sasha Levin
2019-10-18 22:03 ` [PATCH AUTOSEL 5.3 78/89] iio: adc: ad799x: fix probe error handling Sasha Levin
2019-10-18 22:03 ` [PATCH AUTOSEL 5.3 79/89] iio: light: opt3001: fix mutex unlock race Sasha Levin
2019-10-18 22:03 ` [PATCH AUTOSEL 5.3 80/89] MIPS: include: Mark __xchg as __always_inline Sasha Levin
2019-10-18 22:03 ` [PATCH AUTOSEL 5.3 81/89] MIPS: fw: sni: Fix out of bounds init of o32 stack Sasha Levin
2019-10-18 22:03 ` [PATCH AUTOSEL 5.3 82/89] s390/cio: fix virtio-ccw DMA without PV Sasha Levin
2019-10-18 22:03 ` [PATCH AUTOSEL 5.3 83/89] USB: usb-skeleton: fix use-after-free after driver unbind Sasha Levin
2019-10-18 22:03 ` [PATCH AUTOSEL 5.3 84/89] virt: vbox: fix memory leak in hgcm_call_preprocess_linaddr Sasha Levin
2019-10-18 22:03 ` [PATCH AUTOSEL 5.3 85/89] nbd: fix possible sysfs duplicate warning Sasha Levin
2019-10-18 22:03 ` [PATCH AUTOSEL 5.3 86/89] NFSv4: Fix leak of clp->cl_acceptor string Sasha Levin
2019-10-18 22:03 ` [PATCH AUTOSEL 5.3 87/89] SUNRPC: fix race to sk_err after xs_error_report Sasha Levin
2019-10-18 22:03 ` [PATCH AUTOSEL 5.3 88/89] s390/uaccess: avoid (false positive) compiler warnings Sasha Levin
2019-10-18 22:03 ` [PATCH AUTOSEL 5.3 89/89] tracing: Initialize iter->seq after zeroing in tracing_read_pipe() Sasha Levin
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20191018220324.8165-63-sashal@kernel.org \
--to=sashal@kernel.org \
--cc=anton@ozlabs.org \
--cc=bsegall@google.com \
--cc=dietmar.eggemann@arm.com \
--cc=juri.lelli@redhat.com \
--cc=linux-kernel@vger.kernel.org \
--cc=mgorman@suse.de \
--cc=mingo@kernel.org \
--cc=pauld@redhat.com \
--cc=peterz@infradead.org \
--cc=rostedt@goodmis.org \
--cc=stable@vger.kernel.org \
--cc=tglx@linutronix.de \
--cc=torvalds@linux-foundation.org \
--cc=vincent.guittot@linaro.org \
--cc=xueweiz@google.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).