From: Odin Ugedal <odin@uged.al>
To: Ingo Molnar <mingo@redhat.com>,
	Peter Zijlstra <peterz@infradead.org>,
	Juri Lelli <juri.lelli@redhat.com>,
	Vincent Guittot <vincent.guittot@linaro.org>,
	Dietmar Eggemann <dietmar.eggemann@arm.com>,
	Steven Rostedt <rostedt@goodmis.org>,
	Ben Segall <bsegall@google.com>, Mel Gorman <mgorman@suse.de>,
	Daniel Bristot de Oliveira <bristot@redhat.com>
Cc: cgroups@vger.kernel.org, linux-kernel@vger.kernel.org,
	Odin Ugedal <odin@uged.al>
Subject: [PATCH 0/3] sched/fair: Fix load decay issues related to throttling
Date: Tue, 18 May 2021 14:51:59 +0200
Message-ID: <20210518125202.78658-1-odin@uged.al>

Here is a follow-up with some fairness fixes related to throttling, PELT and
load decay in general.

It is related to the discussion in:

https://lore.kernel.org/lkml/20210425080902.11854-1-odin@uged.al and
https://lkml.kernel.org/r/20210501141950.23622-2-odin@uged.al

Tested on v5.13-rc2 (since that contains the fix from above).

The patch descriptions should make sense on their own, and I have attached some
simple reproduction scripts at the end of this mail. I also appended a patch
fixing some ASCII art that I have looked at several times without
understanding it; it turns out it breaks if tabs are not 8 spaces wide. I can
submit that as a separate patch if necessary.

Also, I have no idea what to call the "insert_on_unthrottle" var, so feel
free to come up with suggestions.


There are probably "better" and more reliable ways to reproduce this, but
these work for me "most of the time" and give OK context, imo. Throttling
is not deterministic, so keep that in mind. I have been testing with
CONFIG_HZ=250, so if you use =1000 (or anything else), you might get other
results or find it harder to reproduce.
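
Since the outcome depends on the tick rate, it can help to check CONFIG_HZ on
the test machine before running the scripts. A minimal sketch, assuming the
kernel config is exposed as /proc/config.gz (needs CONFIG_IKCONFIG_PROC) or as
/boot/config-$(uname -r); the helper name config_hz is made up for this
example:

```bash
# Hypothetical helper: print the CONFIG_HZ value found in a kernel config
# that is read from stdin. Lines such as CONFIG_HZ_250=y do not match.
config_hz() {
  sed -n 's/^CONFIG_HZ=\([0-9][0-9]*\)$/\1/p'
}

# Assumption: one of these two locations holds the running kernel's config.
if [ -r /proc/config.gz ]; then
  zcat /proc/config.gz | config_hz
elif [ -r "/boot/config-$(uname -r)" ]; then
  config_hz < "/boot/config-$(uname -r)"
fi
```

If neither file exists, the snippet prints nothing.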

Reprod script for "Add tg_load_contrib cfs_rq decay checking":
--- bash start
CGROUP=/sys/fs/cgroup/slice

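function run_sandbox {
  local CG="$1"      # cgroup to create
  local LCPU="$2"    # CPU to pin the task to
  local SHARES="$3"  # cpu.weight for the sub-cgroup
  local CMD="$4"     # workload to run

  # Start a shell that blocks on a FIFO, so the task can be placed in the
  # cgroup before the workload starts.
  local PIPE="$(mktemp -u)"
  mkfifo "$PIPE"
  sh -c "read < $PIPE ; exec $CMD" &
  local TASK="$!"
  mkdir -p "$CG/sub"
  tee "$CG"/cgroup.subtree_control <<< "+cpuset +cpu"
  tee "$CG"/sub/cgroup.procs <<< "$TASK"
  tee "$CG"/sub/cpuset.cpus <<< "$LCPU"
  tee "$CG"/sub/cpu.weight <<< "$SHARES"
  # Throttle the group: 10 ms of CPU time per 100 ms period.
  tee "$CG"/cpu.max <<< "10000 100000"

  sleep .1
  # Release the blocked shell so it execs the workload.
  tee "$PIPE" <<< sandbox_done
  rm "$PIPE"
}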

mkdir -p "$CGROUP"
tee "$CGROUP"/cgroup.subtree_control <<< "+cpuset +cpu"

run_sandbox "$CGROUP/cg-1" "0" 100 "stress --cpu 1"
run_sandbox "$CGROUP/cg-2" "3" 100 "stress --cpu 1"
# Migrate the cg-1 task from CPU 0 to CPUs 1, 2 and then 3 while it is
# being throttled.
sleep 1.02
tee "$CGROUP"/cg-1/sub/cpuset.cpus <<< "1"
sleep 1.05
tee "$CGROUP"/cg-1/sub/cpuset.cpus <<< "2"
sleep 1.07
tee "$CGROUP"/cg-1/sub/cpuset.cpus <<< "3"

sleep 2

# Lift the bandwidth limit on both groups.
tee "$CGROUP"/cg-1/cpu.max <<< "max"
tee "$CGROUP"/cg-2/cpu.max <<< "max"

# Wait for Enter before cleaning up.
read
killall stress
sleep .2
rmdir /sys/fs/cgroup/slice/{cg-{1,2}{/sub,},}

# Often gives:
# cat /sys/kernel/debug/sched/debug | grep ":/slice" -A 28 | egrep "(:/slice)|tg_load_avg"
#
# cfs_rq[3]:/slice/cg-2/sub
#   .tg_load_avg_contrib           : 1024
#   .tg_load_avg                   : 1024
# cfs_rq[3]:/slice/cg-1/sub
#   .tg_load_avg_contrib           : 1023
#   .tg_load_avg                   : 1023
# cfs_rq[3]:/slice/cg-1
#   .tg_load_avg_contrib           : 1040
#   .tg_load_avg                   : 2062
# cfs_rq[3]:/slice/cg-2
#   .tg_load_avg_contrib           : 1013
#   .tg_load_avg                   : 1013
# cfs_rq[3]:/slice
#   .tg_load_avg_contrib           : 1540
#   .tg_load_avg                   : 1540
--- bash end
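
To make output like the above easier to read, the two values can be compared
per cfs_rq. A rough sketch, not part of the patches: the helper name
tg_load_diff is made up, and it assumes the debug file prints
.tg_load_avg_contrib before .tg_load_avg for each cfs_rq, as in the output
above.

```bash
# Hypothetical helper: read sched debug text on stdin and print, for each
# cfs_rq, the group-wide tg_load_avg, this cfs_rq's own contribution, and
# the remainder that must come from the group's cfs_rqs on other CPUs.
tg_load_diff() {
  awk '
    /^cfs_rq\[/                  { name = $1 }
    $1 == ".tg_load_avg_contrib" { contrib = $3 }
    $1 == ".tg_load_avg"         {
      printf "%s avg=%d contrib=%d other=%d\n", name, $3, contrib, $3 - contrib
    }
  '
}
```

Usage would be something like
`tg_load_diff < /sys/kernel/debug/sched/debug | grep ":/slice"`. When all
tasks of a group run on a single CPU, a large "other" value on that CPU
suggests load left behind on other CPUs that has not decayed.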


Reprod script for "sched/fair: Correctly insert cfs_rq's to list on unthrottle":
--- bash start
CGROUP=/sys/fs/cgroup/slice
TMP_CG=/sys/fs/cgroup/tmp
OLD_CG=/sys/fs/cgroup"$(cat /proc/self/cgroup | cut -c4-)"
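function run_sandbox {
  local CG="$1"      # cgroup to create
  local LCPU="$2"    # CPU to pin the task to
  local SHARES="$3"  # cpu.weight for the sub-cgroup
  local CMD="$4"     # workload to run

  # Start a shell that blocks on a FIFO, so the task can be placed in the
  # cgroup before the workload starts.
  local PIPE="$(mktemp -u)"
  mkfifo "$PIPE"
  sh -c "read < $PIPE ; exec $CMD" &
  local TASK="$!"
  mkdir -p "$CG/sub"
  tee "$CG"/cgroup.subtree_control <<< "+cpuset +cpu"
  tee "$CG"/sub/cpuset.cpus <<< "$LCPU"
  tee "$CG"/sub/cgroup.procs <<< "$TASK"
  tee "$CG"/sub/cpu.weight <<< "$SHARES"

  sleep .01
  # Release the blocked shell so it execs the workload.
  tee "$PIPE" <<< sandbox_done
  rm "$PIPE"
}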

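mkdir -p "$CGROUP"
mkdir -p "$TMP_CG"
tee "$CGROUP"/cgroup.subtree_control <<< "+cpuset +cpu"

# Move this shell into a temporary cgroup pinned to CPU 0.
echo $$ | tee "$TMP_CG"/cgroup.procs
tee "$TMP_CG"/cpuset.cpus <<< "0"
sleep .1

# Throttle the whole slice: 1 ms of CPU time per 4 ms period.
tee "$CGROUP"/cpu.max <<< "1000 4000"

run_sandbox "$CGROUP/cg-0" "0" 10000 "stress --cpu 1"
run_sandbox "$CGROUP/cg-3" "3" 1 "stress --cpu 1"

sleep 2
# Move the heavily weighted cg-0 task onto CPU 3, next to the cg-3 task.
tee "$CGROUP"/cg-0/sub/cpuset.cpus <<< "3"

# Lift the bandwidth limit.
tee "$CGROUP"/cpu.max <<< "max"

# Wait for Enter before cleaning up.
read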
killall stress
sleep .2
echo $$ | tee "$OLD_CG"/cgroup.procs
rmdir "$TMP_CG" /sys/fs/cgroup/slice/{cg-{0,3}{/sub,},}

# Often gives:
# cat /sys/kernel/debug/sched/debug | grep ":/slice" -A 28 | egrep "(:/slice)|tg_load_avg"
#
# cfs_rq[3]:/slice/cg-3/sub
#   .tg_load_avg_contrib           : 1039
#   .tg_load_avg                   : 2036
# cfs_rq[3]:/slice/cg-0/sub
#   .tg_load_avg_contrib           : 1023
#   .tg_load_avg                   : 1023
# cfs_rq[3]:/slice/cg-0
#   .tg_load_avg_contrib           : 102225
#   .tg_load_avg                   : 102225
# cfs_rq[3]:/slice/cg-3
#   .tg_load_avg_contrib           : 4
#   .tg_load_avg                   : 1001
# cfs_rq[3]:/slice
#   .tg_load_avg_contrib           : 1038
#   .tg_load_avg                   : 1038
--- bash end
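
Both scripts throttle through cpu.max, whose value is "$MAX $PERIOD" in
microseconds, with "max" meaning no limit. As a quick sanity check of how
tight the limits are, the value can be turned into a percentage of one CPU.
A small sketch; the helper name cpu_max_percent is made up for this example:

```bash
# Hypothetical helper: convert a cpu.max value ("$MAX $PERIOD", both in
# microseconds) into a percentage of one CPU. "max" means no limit.
cpu_max_percent() {
  # Split the argument on whitespace into $1 (quota) and $2 (period).
  set -- $1
  if [ "$1" = "max" ]; then
    echo "unlimited"
  else
    echo "$(( $1 * 100 / $2 ))%"
  fi
}

cpu_max_percent "10000 100000"   # limit used in the first script: 10%
cpu_max_percent "1000 4000"      # limit used in the second script: 25%
```

The second script uses a much shorter period (4 ms), so the group is
throttled and unthrottled far more often per second than in the first one.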

Thanks
Odin

Odin Ugedal (3):
  sched/fair: Add tg_load_contrib cfs_rq decay checking
  sched/fair: Correctly insert cfs_rq's to list on unthrottle
  sched/fair: Fix ascii art by relpacing tabs

 kernel/sched/fair.c  | 22 +++++++++++++---------
 kernel/sched/sched.h |  1 +
 2 files changed, 14 insertions(+), 9 deletions(-)

-- 
2.31.1



Thread overview: 24+ messages
2021-05-18 12:51 Odin Ugedal [this message]
2021-05-18 12:52 ` [PATCH 1/3] sched/fair: Add tg_load_contrib cfs_rq decay checking Odin Ugedal
2021-05-25  9:58   ` Vincent Guittot
2021-05-25 10:33     ` Odin Ugedal
2021-05-25 14:30       ` Vincent Guittot
2021-05-26 10:50         ` Vincent Guittot
2021-05-27  7:50           ` Odin Ugedal
2021-05-27  9:35             ` Vincent Guittot
2021-05-27  9:45               ` Odin Ugedal
2021-05-27 10:49                 ` Vincent Guittot
2021-05-27 11:04                   ` Odin Ugedal
2021-05-27 12:37                     ` Vincent Guittot
2021-05-27 12:37                   ` Odin Ugedal
2021-05-27 12:39                     ` Odin Ugedal
2021-05-27 12:49                     ` Vincent Guittot
2021-05-18 12:52 ` [PATCH 2/3] sched/fair: Correctly insert cfs_rq's to list on unthrottle Odin Ugedal
2021-05-28 14:24   ` Vincent Guittot
2021-05-28 15:06     ` Odin Ugedal
2021-05-28 15:27       ` Vincent Guittot
2021-05-29  9:33         ` Odin Ugedal
2021-05-31 12:14           ` Vincent Guittot
2021-05-18 12:52 ` [PATCH 3/3] sched/fair: Fix ascii art by relpacing tabs Odin Ugedal
2021-05-27 13:27   ` Vincent Guittot
2021-06-01 14:04   ` [tip: sched/core] " tip-bot2 for Odin Ugedal
