* [GIT PULL] Scheduler updates for v6.0
@ 2022-08-01 14:02 Ingo Molnar
From: Ingo Molnar @ 2022-08-01 14:02 UTC
  To: Linus Torvalds
  Cc: linux-kernel, Peter Zijlstra, Juri Lelli, Vincent Guittot,
	Dietmar Eggemann, Steven Rostedt, Ben Segall, Mel Gorman,
	Daniel Bristot de Oliveira, Valentin Schneider, Thomas Gleixner

Linus,

Please pull the latest sched/core git tree from:

   git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip.git sched-core-2022-08-01

   # HEAD: c17a6ff9321355487d7d5ccaa7d406a0ea06b6c4 rseq: Kill process when unknown flags are encountered in ABI structures

This cycle's scheduler updates for v6.0 are:

Load-balancing improvements:
============================

- Improve NUMA balancing on AMD Zen systems for affine workloads.

- Improve the handling of reduced-capacity CPUs in load-balancing.

- Energy Model improvements: fix & refine all the energy fairness metrics (PELT),
  and remove the conservative threshold requiring 6% energy savings to
  migrate a task. Doing this improves power efficiency for most workloads,
  and also increases the reliability of energy-efficiency scheduling.

- Optimize/tweak select_idle_cpu() to spend (much) less time searching
  for an idle CPU on overloaded systems. There are reports of several
  milliseconds spent there on large systems with large workloads ...

  [ Since the search logic changed, there might be behavioral side effects. ]

- Improve NUMA imbalance behavior. On certain systems
  with spare capacity, initial placement of tasks is non-deterministic,
  and such an artificial placement imbalance can persist for a long time,
  hurting (and sometimes helping) performance.

  The fix is to make fork-time task placement consistent with runtime
  NUMA balancing placement.

  Note that some performance regressions were reported against this,
  caused by workloads that are not memory bandwidth limited, which benefit
  from the artificial locality of the placement bug(s). Mel Gorman's
  conclusion, with which we concur, was that consistency is better than
  random workload benefits from non-deterministic bugs:

     "Given there is no crystal ball and it's a tradeoff, I think it's
      better to be consistent and use similar logic at both fork time
      and runtime even if it doesn't have universal benefit."

- Improve core scheduling by fixing a bug in sched_core_update_cookie() that
  caused unnecessary forced idling.

- Improve wakeup-balancing by allowing same-LLC wakeup of idle CPUs for newly
  woken tasks.

- Fix a newidle balancing bug that introduced unnecessary wakeup latencies.
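
  [ To illustrate the energy-margin removal listed above, here is a minimal
    user-space sketch, not the kernel code itself. The function names and
    ENERGY_MARGIN_SHIFT are hypothetical, and modeling the 6% threshold as
    a 1/16 shift is an assumption made for the example. It only contrasts
    "migrate when the saving exceeds a margin" with "migrate on any saving". ]

```c
#include <assert.h>	/* static_assert */
#include <stdbool.h>

/* Hypothetical constant: a shift of 4 means 1/16, i.e. roughly 6%. */
#define ENERGY_MARGIN_SHIFT 4

static_assert((100 >> ENERGY_MARGIN_SHIFT) == 6, "margin is about 6%");

/* Old-style rule (simplified): the saving must exceed ~6% of prev energy. */
static bool migrate_with_margin(unsigned long prev_energy,
				unsigned long best_energy)
{
	if (best_energy >= prev_energy)
		return false;

	return (prev_energy - best_energy) >
	       (prev_energy >> ENERGY_MARGIN_SHIFT);
}

/* New-style rule (simplified): any energy saving is enough to migrate. */
static bool migrate_without_margin(unsigned long prev_energy,
				   unsigned long best_energy)
{
	return best_energy < prev_energy;
}
```

  [ With prev_energy = 1000 and best_energy = 950, the margin-based rule
    keeps the task where it is, while the margin-free rule migrates it. ]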

ABI improvements/fixes:
=======================

- Do not check capabilities and do not issue capability check denial messages
  when a scheduler syscall doesn't require privileges. (Such as increasing niceness.)

- Add forced-idle accounting to cgroups too.

- Fix/improve the RSEQ ABI to not just silently accept unknown flags.
  (No existing tooling is known to have learned to rely on the previous behavior.)

- Deprecate the (unused) RSEQ_CS_FLAG_NO_RESTART_ON_* flags.
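
  [ The "do not silently accept unknown flags" rule above follows a common
    ABI pattern, sketched below in user-space C. The DEMO_* names and the
    demo_check_flags() helper are hypothetical and are not the RSEQ
    definitions. The sketch returns -EINVAL, whereas the rseq change in this
    tree kills the offending process. ]

```c
#include <assert.h>	/* static_assert */
#include <errno.h>
#include <stdint.h>

/* Hypothetical flag bits, for illustration only. */
#define DEMO_FLAG_A		(UINT32_C(1) << 0)
#define DEMO_FLAG_B		(UINT32_C(1) << 1)
#define DEMO_KNOWN_FLAGS	(DEMO_FLAG_A | DEMO_FLAG_B)

static_assert((DEMO_FLAG_A & DEMO_FLAG_B) == 0, "flags must be distinct bits");

/* Reject any bit outside the known set instead of ignoring it. */
static int demo_check_flags(uint32_t flags)
{
	if (flags & ~DEMO_KNOWN_FLAGS)
		return -EINVAL;

	return 0;
}
```

  [ Rejecting unknown bits up front keeps them available for future
    extensions, because no caller can have come to depend on them being
    ignored. ]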

Optimizations:
==============

- Optimize & simplify leaf_cfs_rq_list().

- Micro-optimize set_nr_{and_not,if}_polling() via try_cmpxchg().
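
  [ As a rough user-space analogue of the try_cmpxchg() pattern mentioned
    above, the sketch below uses C11 atomic_compare_exchange_weak(), which
    likewise writes the observed value back into the "expected" variable on
    failure, so the loop needs no separate re-read. The flag values and the
    function name are hypothetical, and this is not the kernel
    implementation. ]

```c
#include <assert.h>	/* static_assert */
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical flag bits standing in for the kernel's thread flags. */
#define FLAG_POLLING		(1u << 0)
#define FLAG_NEED_RESCHED	(1u << 1)

static_assert((FLAG_POLLING & FLAG_NEED_RESCHED) == 0, "distinct bits");

/*
 * Set FLAG_NEED_RESCHED only while FLAG_POLLING is set.
 * Returns false if the target is not polling, true otherwise.
 */
static bool set_need_resched_if_polling(atomic_uint *flags)
{
	unsigned int val = atomic_load(flags);

	do {
		if (!(val & FLAG_POLLING))
			return false;
		if (val & FLAG_NEED_RESCHED)
			return true;
		/* On failure, val is refreshed with the current value. */
	} while (!atomic_compare_exchange_weak(flags, &val,
					       val | FLAG_NEED_RESCHED));

	return true;
}
```

  [ Compared with a plain compare-and-swap that returns the old value, this
    form avoids an explicit comparison and reload on each retry. ]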

Misc fixes & cleanups:
======================

- Fix the RSEQ self-tests on RISC-V and Glibc 2.35 systems.

- Fix a full-NOHZ bug that can in some cases result in the tick not being
  re-enabled when the last SCHED_RT task is gone from a runqueue but there
  are still SCHED_OTHER tasks around.

- Various PREEMPT_RT related fixes.

- Misc cleanups & smaller fixes.

Signed-off-by: Ingo Molnar <mingo@kernel.org>
 Thanks,

	Ingo

------------------>
Chen Yu (1):
      sched/fair: Introduce SIS_UTIL to search idle CPU based on sum of util_avg

Chengming Zhou (1):
      sched/fair: Optimize and simplify rq leaf_cfs_rq_list

Christian Göttsche (1):
      sched: only perform capability check on privileged operation

Cruz Zhao (1):
      sched/core: Fix the bug that task won't enqueue into core tree when update cookie

Dietmar Eggemann (3):
      sched, drivers: Remove max param from effective_cpu_util()/sched_cpu_util()
      sched/fair: Rename select_idle_mask to select_rq_mask
      sched/fair: Use the same cpumask per-PD throughout find_energy_efficient_cpu()

John Keeping (1):
      sched/core: Always flush pending blk_plug

Josh Don (2):
      sched: Allow newidle balancing to bail out of load_balance
      sched/core: add forced idle accounting for cgroups

K Prateek Nayak (1):
      sched/fair: Consider CPU affinity when allowing NUMA imbalance in find_idlest_group()

Mathieu Desnoyers (2):
      rseq: Deprecate RSEQ_CS_FLAG_NO_RESTART_ON_* flags
      rseq: Kill process when unknown flags are encountered in ABI structures

Mel Gorman (4):
      sched/numa: Initialise numa_migrate_retry
      sched/numa: Do not swap tasks between nodes when spare capacity is available
      sched/numa: Apply imbalance limitations consistently
      sched/numa: Adjust imb_numa_nr to a better approximation of memory channels

Michael Jeanson (3):
      selftests/rseq: riscv: use rseq_get_abi() helper
      selftests/rseq: riscv: fix 'literal-suffix' warning
      selftests/rseq: check if libc rseq support is registered

Nicolas Saenz Julienne (1):
      nohz/full, sched/rt: Fix missed tick-reenabling bug in dequeue_task_rt()

Tianchen Ding (2):
      sched: Fix the check of nr_running at queue wakelist
      sched: Remove the limitation of WF_ON_CPU on wakelist if wakee cpu is idle

Uros Bizjak (1):
      sched/core: Use try_cmpxchg in set_nr_{and_not,if}_polling

Vincent Donnefort (4):
      sched/fair: Provide u64 read for 32-bits arch helper
      sched/fair: Decay task PELT values during wakeup migration
      sched/fair: Remove task_util from effective utilization in feec()
      sched/fair: Remove the energy margin in feec()

Vincent Guittot (1):
      sched/fair: fix case with reduced capacity CPU

Yajun Deng (1):
      sched/deadline: Use proc_douintvec_minmax() limit minimum value

Zhang Qiao (2):
      sched/fair: Remove redundant word " *"
      sched: Remove unused function group_first_cpu()


 drivers/powercap/dtpm_cpu.c               |  33 +-
 drivers/thermal/cpufreq_cooling.c         |   6 +-
 include/linux/cgroup-defs.h               |   4 +
 include/linux/kernel_stat.h               |   7 +
 include/linux/sched.h                     |   2 +-
 include/linux/sched/rt.h                  |   8 -
 include/linux/sched/topology.h            |   1 +
 kernel/cgroup/rstat.c                     |  44 +-
 kernel/rseq.c                             |  23 +-
 kernel/sched/core.c                       | 215 ++++----
 kernel/sched/core_sched.c                 |  15 +-
 kernel/sched/cpufreq_schedutil.c          |   5 +-
 kernel/sched/cputime.c                    |  15 +
 kernel/sched/deadline.c                   |   6 +-
 kernel/sched/fair.c                       | 818 +++++++++++++++++++-----------
 kernel/sched/features.h                   |   3 +-
 kernel/sched/pelt.h                       |  40 +-
 kernel/sched/rt.c                         |  15 +-
 kernel/sched/sched.h                      |  63 ++-
 kernel/sched/topology.c                   |  23 +-
 tools/testing/selftests/rseq/rseq-riscv.h |  50 +-
 tools/testing/selftests/rseq/rseq.c       |   3 +-
 22 files changed, 888 insertions(+), 511 deletions(-)
