All of lore.kernel.org
 help / color / mirror / Atom feed
* [RFC][PATCH 00/32] Nohz cpusets v2 (adaptive tickless kernel)
@ 2012-03-21 13:58 Frederic Weisbecker
  2012-03-21 13:58 ` Frederic Weisbecker
                   ` (34 more replies)
  0 siblings, 35 replies; 97+ messages in thread
From: Frederic Weisbecker @ 2012-03-21 13:58 UTC (permalink / raw)
  To: LKML, linaro-sched-sig
  Cc: Frederic Weisbecker, Alessio Igor Bogani, Andrew Morton,
	Avi Kivity, Chris Metcalf, Christoph Lameter, Daniel Lezcano,
	Geoff Levand, Gilad Ben Yossef, Ingo Molnar, Max Krasnyansky,
	Paul E. McKenney, Peter Zijlstra, Stephen Hemminger,
	Steven Rostedt, Sven-Thorsten Dietrich, Thomas Gleixner, Zen Lin

Hi all,

A summary of what this is about can be found here:
  https://lkml.org/lkml/2011/8/15/245

There are still a lot of things to handle. Especially about
what is done by scheduler_tick() but we also need to:

- completely handle cputime accounting (need to find every "reader"
of cputime and flush cputimes for all of them).
-handle  perf
- handle irqtime finegrained accounting
- handle ilb load balancing
- etc...

Nonetheless this is time to post a new iteration of the patchset
because the design has changed a bit, some bugs have been fixed,
more simplification, more unification with dynticks-idle code,
namespace fixes, various improvements here and there...

The git branch can be fetched from:

git://github.com/fweisbec/linux-dynticks.git
	nohz/cpuset-v2

Changelog since v1:

- Rebase against 3.3-rc7 + tip:timers/core branch targeted
for 3.4-rc1

- Refine some changelogs

- Adapt against latest rcu changes: introduce new APIs
  rcu_user_enter(), rcu_user_exit(), rcu_user_enter_irq()
  and rcu_user_exit_irq()

- Handle RCU idle mode with do_notify_resume() path

- Fix deadlock after double rq lock on schedule:
  schedule() -> rq_lock -> next is idle task ->
  tick_nohz_restart_sched_tick() -> wake up softirq ->
  rq lock

- Fix lockup while issuing flush times IPI on exit path:

  CPU 0	     	   	   	 CPU 1

  read_lock(tasklist_lock)
				write_lock_irq(tasklist_lock)
				smp_call_function(CPU 1)
				* deadlock *

- Many namespace renames (cpuset_* to tick_nohz_*) and code migration
from sched.c to tick-sched.c

- Seperate code that determine if we can stop the idle tick and don't
use it for adaptive tickless mode.

- Fix adaptive tickless mode set on idle incidentally. TIF_NOHZ was
then missing on the following task that ran tickless, issuing some
illegal uses of RCU

- Restart the tick anytime more than one task is on the runqueue. We were previously
only covering wake ups, now we also handle migration and any other source of task enqueuing

- Handle use of RCU in schedule() when called right before resuming userspace
(new schedule_user() API)

- Take the decision to stop the tick from irq exit instead of the middle of the timer
interrupt. This gives more opportunity to stop it and is one step more to unify idle
and adaptive tickless.

- Unify tickless idle and tickless user/system CPU time accounting infrastructures.

- If the tick is stopped adaptively and we are going to schedule the idle
task, don't restart the tick.

- Remove task_nohz_mode per cpu var and use ts->tick_stopped instead. This
leads to more unification between idle tickless and adaptive tickless.



Frederic Weisbecker (32):
  nohz: Separate idle sleeping time accounting from nohz logic
  nohz: Make nohz API agnostic against idle ticks cputime accounting
  nohz: Rename ts->idle_tick to ts->last_tick
  nohz: Move nohz load balancer selection into idle logic
  nohz: Move ts->idle_calls incrementation into strict idle logic
  nohz: Move next idle expiry time record into idle logic area
  cpuset: Set up interface for nohz flag
  nohz: Try not to give the timekeeping duty to an adaptive tickless
    cpu
  x86: New cpuset nohz irq vector
  nohz: Adaptive tick stop and restart on nohz cpuset
  nohz/cpuset: Don't turn off the tick if rcu needs it
  nohz/cpuset: Wake up adaptive nohz CPU when a timer gets enqueued
  nohz/cpuset: Don't stop the tick if posix cpu timers are running
  nohz/cpuset: Restart tick when nohz flag is cleared on cpuset
  nohz/cpuset: Restart the tick if printk needs it
  rcu: Restart the tick on non-responding adaptive nohz CPUs
  rcu: Restart tick if we enqueue a callback in a nohz/cpuset CPU
  nohz: Generalize tickless cpu time accounting
  nohz/cpuset: Account user and system times in adaptive nohz mode
  nohz/cpuset: New API to flush cputimes on nohz cpusets
  nohz/cpuset: Flush cputime on threads in nohz cpusets when waiting
    leader
  nohz/cpuset: Flush cputimes on procfs stat file read
  nohz/cpuset: Flush cputimes for getrusage() and times() syscalls
  x86: Syscall hooks for nohz cpusets
  x86: Exception hooks for nohz cpusets
  x86: Add adaptive tickless hooks on do_notify_resume()
  nohz: Don't restart the tick before scheduling to idle
  rcu: New rcu_user_enter() and rcu_user_exit() APIs
  rcu: New rcu_user_enter_irq() and rcu_user_exit_irq() APIs
  rcu: Switch to extended quiescent state in userspace from nohz cpuset
  nohz: Exit RCU idle mode when we schedule before resuming userspace
  nohz/cpuset: Disable under some configs

 arch/Kconfig                       |    3 +
 arch/x86/Kconfig                   |    1 +
 arch/x86/include/asm/entry_arch.h  |    3 +
 arch/x86/include/asm/hw_irq.h      |    7 +
 arch/x86/include/asm/irq_vectors.h |    2 +
 arch/x86/include/asm/smp.h         |   11 +
 arch/x86/include/asm/thread_info.h |   10 +-
 arch/x86/kernel/entry_64.S         |   12 +-
 arch/x86/kernel/irqinit.c          |    4 +
 arch/x86/kernel/ptrace.c           |   10 +
 arch/x86/kernel/signal.c           |    3 +
 arch/x86/kernel/smp.c              |   26 ++
 arch/x86/kernel/traps.c            |   20 +-
 arch/x86/mm/fault.c                |   13 +-
 fs/proc/array.c                    |    2 +
 include/linux/cpuset.h             |   29 ++
 include/linux/kernel_stat.h        |    2 +
 include/linux/posix-timers.h       |    1 +
 include/linux/rcupdate.h           |    8 +
 include/linux/sched.h              |   10 +-
 include/linux/tick.h               |   75 ++++--
 init/Kconfig                       |    8 +
 kernel/cpuset.c                    |  107 +++++++
 kernel/exit.c                      |    8 +
 kernel/posix-cpu-timers.c          |   12 +
 kernel/printk.c                    |   15 +-
 kernel/rcutree.c                   |  150 ++++++++--
 kernel/sched/core.c                |   83 ++++++-
 kernel/sched/sched.h               |   23 ++
 kernel/softirq.c                   |    6 +-
 kernel/sys.c                       |    6 +
 kernel/time/tick-sched.c           |  540 +++++++++++++++++++++++++++++-------
 kernel/time/timer_list.c           |    7 +-
 kernel/timer.c                     |    2 +-
 34 files changed, 1042 insertions(+), 177 deletions(-)

-- 
1.7.5.4


^ permalink raw reply	[flat|nested] 97+ messages in thread
* [PATCH 00/32] [RFC] nohz/cpuset: Start discussions on nohz CPUs
@ 2012-10-29 20:27 Steven Rostedt
  2012-10-29 20:27 ` [PATCH 30/32] rcu: Switch to extended quiescent state in userspace from nohz cpuset Steven Rostedt
  0 siblings, 1 reply; 97+ messages in thread
From: Steven Rostedt @ 2012-10-29 20:27 UTC (permalink / raw)
  To: linux-kernel
  Cc: Andrew Morton, Thomas Gleixner, Peter Zijlstra, Clark Williams,
	Frederic Weisbecker, Li Zefan, Ingo Molnar, Paul E. McKenney,
	Mike Galbraith

A while ago Frederic posted a series of patches to get an idea on
how to implement nohz cpusets. Where you can add a task to a cpuset
and mark the set to be 'nohz'. When the task runs on a CPU and is
the only task scheduled (nr_running == 1), the tick will stop.
The idea is to give the task the least amount of kernel interference
as possible. If the task doesn't do any system calls (and possibly
even if it does), no timer interrupt will bother it. By using
isocpus and nohz cpuset, a task would be able to achieve true cpu
isolation.

This has been long asked for by those in the RT community. If a task
requires uninterruptible CPU time, this would be able to give a task
that, even without the full PREEMPT-RT patch set.

This patch set is not for inclusion. It is just to get the topic
at the forefront again. The design requires more work and more
discussion.

I ported Frederic's work to v3.7-rc3 and I'm posting it here so that
people can comment on it. I just did the minimal to get it to compile
and boot. I haven't done any real tests with it yet. I may have screwed
some things up during the port, but that's OK, because the patch set
will most likely require a rewrite anyway.

Please have a look, and lets get this out the door.

-- Steve


Frederic Weisbecker (31):
      nohz: Move nohz load balancer selection into idle logic
      cpuset: Set up interface for nohz flag
      nohz: Try not to give the timekeeping duty to an adaptive tickless cpu
      x86: New cpuset nohz irq vector
      nohz: Adaptive tick stop and restart on nohz cpuset
      nohz/cpuset: Don't turn off the tick if rcu needs it
      nohz/cpuset: Wake up adaptive nohz CPU when a timer gets enqueued
      nohz/cpuset: Don't stop the tick if posix cpu timers are running
      nohz/cpuset: Restart tick when nohz flag is cleared on cpuset
      nohz/cpuset: Restart the tick if printk needs it
      rcu: Restart the tick on non-responding adaptive nohz CPUs
      rcu: Restart tick if we enqueue a callback in a nohz/cpuset CPU
      nohz: Generalize tickless cpu time accounting
      nohz/cpuset: Account user and system times in adaptive nohz mode
      nohz/cpuset: New API to flush cputimes on nohz cpusets
      nohz/cpuset: Flush cputime on threads in nohz cpusets when waiting leader
      nohz/cpuset: Flush cputimes on procfs stat file read
      nohz/cpuset: Flush cputimes for getrusage() and times() syscalls
      x86: Syscall hooks for nohz cpusets
      nohz: Don't restart the tick before scheduling to idle
      sched: Comment on rq->clock correctness in ttwu_do_wakeup() in nohz
      sched: Update rq clock on nohz CPU before migrating tasks
      sched: Update rq clock on nohz CPU before setting fair group shares
      sched: Update rq clock on tickless CPUs before calling check_preempt_curr()
      sched: Update rq clock earlier in unthrottle_cfs_rq
      sched: Update clock of nohz busiest rq before balancing
      sched: Update rq clock before idle balancing
      sched: Update nohz rq clock before searching busiest group on load balancing
      rcu: Switch to extended quiescent state in userspace from nohz cpuset
      nohz/cpuset: Disable under some configs
      nohz, not for merge: Add tickless tracing

Hakan Akkan (1):
      nohz/cpuset: enable addition&removal of cpus while in adaptive nohz mode

----
 arch/Kconfig                       |    3 +
 arch/x86/include/asm/entry_arch.h  |    3 +
 arch/x86/include/asm/hw_irq.h      |    7 +
 arch/x86/include/asm/irq_vectors.h |    2 +
 arch/x86/include/asm/smp.h         |   11 +-
 arch/x86/kernel/entry_64.S         |    4 +
 arch/x86/kernel/irqinit.c          |    4 +
 arch/x86/kernel/ptrace.c           |   11 +
 arch/x86/kernel/smp.c              |   28 +++
 fs/proc/array.c                    |    2 +
 include/linux/cpuset.h             |   35 ++++
 include/linux/kernel_stat.h        |    2 +
 include/linux/posix-timers.h       |    1 +
 include/linux/rcupdate.h           |    1 +
 include/linux/sched.h              |   10 +-
 include/linux/tick.h               |   72 +++++--
 init/Kconfig                       |    8 +
 kernel/cpuset.c                    |  144 ++++++++++++-
 kernel/exit.c                      |    8 +
 kernel/posix-cpu-timers.c          |   12 ++
 kernel/printk.c                    |   15 +-
 kernel/rcutree.c                   |   28 ++-
 kernel/sched/core.c                |   82 +++++++-
 kernel/sched/cputime.c             |   22 ++
 kernel/sched/fair.c                |   41 +++-
 kernel/sched/sched.h               |   18 ++
 kernel/softirq.c                   |    6 +-
 kernel/sys.c                       |    6 +
 kernel/time/tick-sched.c           |  398 ++++++++++++++++++++++++++++++++----
 kernel/time/timer_list.c           |    3 +-
 kernel/timer.c                     |    2 +-
 31 files changed, 912 insertions(+), 77 deletions(-)

^ permalink raw reply	[flat|nested] 97+ messages in thread

end of thread, other threads:[~2012-10-29 20:39 UTC | newest]

Thread overview: 97+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2012-03-21 13:58 [RFC][PATCH 00/32] Nohz cpusets v2 (adaptive tickless kernel) Frederic Weisbecker
2012-03-21 13:58 ` Frederic Weisbecker
2012-04-04 15:33   ` warning in tick_nohz_irq_exit Stephen Hemminger
2012-04-04 20:45     ` Frederic Weisbecker
2012-03-21 13:58 ` [PATCH 01/32] nohz: Separate idle sleeping time accounting from nohz logic Frederic Weisbecker
2012-03-21 13:58 ` [PATCH 02/32] nohz: Make nohz API agnostic against idle ticks cputime accounting Frederic Weisbecker
2012-03-21 13:58 ` [PATCH 03/32] nohz: Rename ts->idle_tick to ts->last_tick Frederic Weisbecker
2012-03-21 13:58 ` [PATCH 04/32] nohz: Move nohz load balancer selection into idle logic Frederic Weisbecker
2012-03-21 13:58 ` [PATCH 05/32] nohz: Move ts->idle_calls incrementation into strict " Frederic Weisbecker
2012-03-21 13:58 ` [PATCH 06/32] nohz: Move next idle expiry time record into idle logic area Frederic Weisbecker
2012-03-21 13:58 ` [PATCH 07/32] cpuset: Set up interface for nohz flag Frederic Weisbecker
2012-03-21 14:50   ` Christoph Lameter
2012-03-22  4:03     ` Mike Galbraith
2012-03-22 16:26       ` Christoph Lameter
2012-03-22 19:20         ` Mike Galbraith
2012-03-27 11:22       ` Frederic Weisbecker
2012-03-27 11:53         ` Mike Galbraith
2012-03-27 11:56           ` Frederic Weisbecker
2012-03-27 12:31             ` Mike Galbraith
2012-03-27 11:19     ` Frederic Weisbecker
2012-03-21 13:58 ` [PATCH 08/32] nohz: Try not to give the timekeeping duty to an adaptive tickless cpu Frederic Weisbecker
2012-03-21 14:52   ` Christoph Lameter
2012-03-27 10:50     ` Frederic Weisbecker
2012-03-27 16:08       ` Christoph Lameter
2012-03-27 16:47         ` Peter Zijlstra
2012-03-28  1:12           ` Christoph Lameter
2012-03-28  8:39             ` Peter Zijlstra
2012-03-28 13:11               ` Dimitri Sivanich
2012-03-28 15:51               ` Chris Metcalf
2012-03-30  1:34         ` Frederic Weisbecker
2012-03-21 13:58 ` [PATCH 09/32] x86: New cpuset nohz irq vector Frederic Weisbecker
2012-03-21 13:58 ` [PATCH 10/32] nohz: Adaptive tick stop and restart on nohz cpuset Frederic Weisbecker
2012-03-21 13:58 ` [PATCH 11/32] nohz/cpuset: Don't turn off the tick if rcu needs it Frederic Weisbecker
2012-03-21 14:54   ` Christoph Lameter
2012-03-22  7:38     ` Gilad Ben-Yossef
2012-03-22 16:18       ` Christoph Lameter
2012-03-27 15:21         ` Gilad Ben-Yossef
2012-03-28 12:39           ` Frederic Weisbecker
2012-03-28 12:57             ` Gilad Ben-Yossef
2012-03-28 13:38               ` Frederic Weisbecker
2012-03-22 17:18       ` Chris Metcalf
2012-03-27 15:31         ` Gilad Ben-Yossef
2012-03-27 15:43           ` Chris Metcalf
2012-03-28  8:36             ` Gilad Ben-Yossef
2012-03-27 12:13     ` Frederic Weisbecker
2012-03-27 16:13       ` Christoph Lameter
2012-03-27 16:24         ` Steven Rostedt
2012-03-28  0:42           ` Christoph Lameter
2012-03-28  1:06             ` Steven Rostedt
2012-03-28  1:19               ` Christoph Lameter
2012-03-28  1:35                 ` Steven Rostedt
2012-03-28  3:17                   ` Steven Rostedt
2012-03-28  7:55                     ` Gilad Ben-Yossef
2012-03-28 12:21                       ` Frederic Weisbecker
2012-03-28 12:41                         ` Gilad Ben-Yossef
2012-03-28 14:02                       ` Steven Rostedt
2012-03-28 11:53         ` Frederic Weisbecker
2012-03-21 13:58 ` [PATCH 12/32] nohz/cpuset: Wake up adaptive nohz CPU when a timer gets enqueued Frederic Weisbecker
2012-03-21 13:58 ` [PATCH 13/32] nohz/cpuset: Don't stop the tick if posix cpu timers are running Frederic Weisbecker
2012-03-21 13:58 ` [PATCH 14/32] nohz/cpuset: Restart tick when nohz flag is cleared on cpuset Frederic Weisbecker
2012-03-21 13:58 ` [PATCH 15/32] nohz/cpuset: Restart the tick if printk needs it Frederic Weisbecker
2012-03-21 13:58 ` [PATCH 16/32] rcu: Restart the tick on non-responding adaptive nohz CPUs Frederic Weisbecker
2012-03-21 13:58 ` [PATCH 17/32] rcu: Restart tick if we enqueue a callback in a nohz/cpuset CPU Frederic Weisbecker
2012-03-21 13:58 ` [PATCH 18/32] nohz: Generalize tickless cpu time accounting Frederic Weisbecker
2012-03-21 13:58 ` [PATCH 19/32] nohz/cpuset: Account user and system times in adaptive nohz mode Frederic Weisbecker
2012-03-21 13:58 ` [PATCH 20/32] nohz/cpuset: New API to flush cputimes on nohz cpusets Frederic Weisbecker
2012-03-21 13:58 ` [PATCH 21/32] nohz/cpuset: Flush cputime on threads in nohz cpusets when waiting leader Frederic Weisbecker
2012-03-27 14:10   ` Gilad Ben-Yossef
2012-03-27 14:23     ` Gilad Ben-Yossef
2012-03-28 11:20       ` Frederic Weisbecker
2012-03-21 13:58 ` [PATCH 22/32] nohz/cpuset: Flush cputimes on procfs stat file read Frederic Weisbecker
2012-03-21 13:58 ` [PATCH 23/32] nohz/cpuset: Flush cputimes for getrusage() and times() syscalls Frederic Weisbecker
2012-03-21 13:58 ` [PATCH 24/32] x86: Syscall hooks for nohz cpusets Frederic Weisbecker
2012-03-21 13:58 ` [PATCH 25/32] x86: Exception " Frederic Weisbecker
2012-03-21 13:58 ` [PATCH 26/32] x86: Add adaptive tickless hooks on do_notify_resume() Frederic Weisbecker
2012-03-21 13:58 ` [PATCH 27/32] nohz: Don't restart the tick before scheduling to idle Frederic Weisbecker
2012-03-21 13:58 ` [PATCH 28/32] rcu: New rcu_user_enter() and rcu_user_exit() APIs Frederic Weisbecker
2012-03-21 13:58 ` [PATCH 29/32] rcu: New rcu_user_enter_irq() and rcu_user_exit_irq() APIs Frederic Weisbecker
2012-03-21 13:58 ` [PATCH 30/32] rcu: Switch to extended quiescent state in userspace from nohz cpuset Frederic Weisbecker
2012-03-21 13:58 ` [PATCH 31/32] nohz: Exit RCU idle mode when we schedule before resuming userspace Frederic Weisbecker
2012-03-21 13:58 ` [PATCH 32/32] nohz/cpuset: Disable under some configs Frederic Weisbecker
2012-03-27 15:02 ` [RFC][PATCH 00/32] Nohz cpusets v2 (adaptive tickless kernel) Gilad Ben-Yossef
2012-03-27 15:04   ` Gilad Ben-Yossef
2012-03-27 15:05     ` Gilad Ben-Yossef
2012-03-27 16:22       ` Christoph Lameter
2012-03-28  6:47         ` Gilad Ben-Yossef
2012-03-27 15:10   ` Peter Zijlstra
2012-03-27 15:18     ` Gilad Ben-Yossef
2012-05-22 21:31     ` Thomas Gleixner
2012-05-22 21:50       ` Steven Rostedt
2012-05-22 22:22         ` Thomas Gleixner
2012-03-28 11:43   ` Frederic Weisbecker
2012-03-30  0:33 ` Kevin Hilman
2012-03-30  0:45   ` Frederic Weisbecker
2012-03-30  2:07     ` Geoff Levand
2012-03-30 14:10       ` Kevin Hilman
2012-10-29 20:27 [PATCH 00/32] [RFC] nohz/cpuset: Start discussions on nohz CPUs Steven Rostedt
2012-10-29 20:27 ` [PATCH 30/32] rcu: Switch to extended quiescent state in userspace from nohz cpuset Steven Rostedt

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.