* [ANNOUNCE] 3.7-nohz1
From: Frederic Weisbecker @ 2012-12-20 18:32 UTC
  To: LKML
  Cc: Frederic Weisbecker, Alessio Igor Bogani, Andrew Morton,
	Avi Kivity, Chris Metcalf, Christoph Lameter, Geoff Levand,
	Gilad Ben Yossef, Hakan Akkan, Ingo Molnar, Paul E. McKenney,
	Paul Gortmaker, Peter Zijlstra, Steven Rostedt, Thomas Gleixner,
	Li Zhong

Hi,

So this is a new version of the nohz cpusets work, based on 3.7, except it doesn't use
cpusets anymore and I actually based it on the middle of the 3.8 merge window
in order to pick up the latest upstream full dynticks preparatory work: cputime cleanups,
RCU user mode, the context tracking subsystem, nohz code consolidation, ...

So the big changes since the last nohz cpuset release are:

* printk now uses irq work, so it doesn't rely on the tick anymore (provided
your arch implements irq work with IPIs or the like). This chunk has been proposed
for the 3.8 merge window: https://lkml.org/lkml/2012/12/17/177
Maybe Linus will pull it, maybe not. We'll see. In any case I've included it in this tree,
but I'm not reposting that part of the patchset to avoid spamming you. (A rough sketch of
the irq_work approach follows this list.)

* cputime doesn't rely on IPIs anymore. The reader now does a special computation to
remotely get the tickless cputime. (A simplified sketch of the reader side follows this list.)

* No more cpusets interface. Paul McKenney suggested that I start with a boot time
kernel parameter to define the full dynticks cpumask, and he was totally right: it
makes the code much simpler. That's a good way to start and it makes mainlining
easier. We can still add a runtime configuration later if necessary.

* Now there is always a CPU handling the timekeeping. This can be further optimized
and made more power-friendly; for now I did something simple and stupid. I guess we'll try to get
that into better shape with Hakan. But at least the timekeeping now works.

* It uses the new RCU callback offloading feature. This way a full dynticks CPU doesn't
need to keep the tick to handle local callbacks. This is still very experimental though.

* No more specific IPI vector for full dynticks. We just use the scheduler IPI.
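
To illustrate the printk change above, here is a rough sketch of the idea (this is not
the actual patch; the *_example names are made up for illustration): the klogd wakeup is
deferred through the irq_work API so printk() no longer needs the tick for it:

	#include <linux/irq_work.h>
	#include <linux/percpu.h>
	#include <linux/wait.h>

	static DECLARE_WAIT_QUEUE_HEAD(log_wait_example);

	/*
	 * Runs from the irq_work self-IPI (or from the next tick on archs
	 * that can't raise a self-IPI), where it is safe to wake sleepers.
	 */
	static void wake_up_klogd_func(struct irq_work *work)
	{
		wake_up_interruptible(&log_wait_example);
	}

	static DEFINE_PER_CPU(struct irq_work, wake_up_klogd_work) = {
		.func = wake_up_klogd_func,
	};

	/* Called from the printk() path, with preemption disabled */
	static void wake_up_klogd_example(void)
	{
		if (waitqueue_active(&log_wait_example))
			irq_work_queue(&__get_cpu_var(wake_up_klogd_work));
	}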
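
And to illustrate the IPI-less cputime read above, a simplified sketch of the reader
side (field names such as vtime_seqlock, prev_jiffies and prev_jiffies_whence are taken
from the cputime patches below; this is not the exact code):

	/*
	 * Remote reader: take a consistent utime/stime snapshot and add the
	 * jiffies elapsed since the task last flushed its time on a
	 * kernel <-> user transition, instead of IPI'ing the target CPU.
	 */
	void task_cputime_sketch(struct task_struct *t,
				 cputime_t *utime, cputime_t *stime)
	{
		unsigned int seq;
		long delta;

		do {
			seq = read_seqbegin(&t->vtime_seqlock);

			*utime = t->utime;
			*stime = t->stime;

			delta = jiffies - t->prev_jiffies;
			if (t->prev_jiffies_whence == JIFFIES_USER)
				*utime += jiffies_to_cputime(delta);
			else if (t->prev_jiffies_whence == JIFFIES_SYS)
				*stime += jiffies_to_cputime(delta);
			/* JIFFIES_SLEEPING: nothing is accruing, add nothing */
		} while (read_seqretry(&t->vtime_seqlock, seq));
	}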

The branch is:

git://git.kernel.org/pub/scm/linux/kernel/git/frederic/linux-dynticks.git
	3.7-nohz1

There is still quite some work to do.

== How to use? ==

Select:
	CONFIG_NO_HZ
	CONFIG_RCU_USER_QS
	CONFIG_VIRT_CPU_ACCOUNTING_GEN
	CONFIG_RCU_NOCB_CPU
	CONFIG_NO_HZ_FULL

You always need at least one timekeeping CPU.

Let's imagine you have 4 CPUs. We keep CPU 0 as the CPU the RCU callbacks are offloaded to
and that handles the timekeeping. We set the rest as full dynticks. So you need the following
kernel parameters:

	rcu_nocbs=1-3 full_nohz=1-3

(Note the rcu_nocbs value must always be the same as the full_nohz one.)

Now if you want proper isolation you need to:

* Migrate your processes adequately
* Migrate your irqs to CPU 0
* Migrate the RCU nocb threads to CPU 0. Example with the above configuration:

	for p in $(ps -o pid= -C rcuo1,rcuo2,rcuo3)
	do
		taskset -cp 0 $p
	done

Then run what you want on the full dynticks CPUs. For best results, run one task
per CPU, mostly in userspace and mostly CPU bound (otherwise more IO = more kernel
mode execution = more chances to get IPIs, a restarted tick, workqueues, kthreads, etc...)

This page contains a good reminder for those interested in CPU isolation: https://github.com/gby/linux/wiki

But keep in mind that my tree is not yet ready for serious production.

Happy Christmas, new year or whatever end of the world.
---

Frederic Weisbecker (32):
      irq_work: Fix racy IRQ_WORK_BUSY flag setting
      irq_work: Fix racy check on work pending flag
      irq_work: Remove CONFIG_HAVE_IRQ_WORK
      nohz: Add API to check tick state
      irq_work: Don't stop the tick with pending works
      irq_work: Make self-IPIs optable
      printk: Wake up klogd using irq_work
      Merge branch 'nohz/printk-v8' into 3.7-nohz1-stage
      context_tracking: Add comments on interface and internals
      cputime: Generic on-demand virtual cputime accounting
      cputime: Allow dynamic switch between tick/virtual based cputime accounting
      cputime: Use accessors to read task cputime stats
      cputime: Safely read cputime of full dynticks CPUs
      nohz: Basic full dynticks interface
      nohz: Assign timekeeping duty to a non-full-nohz CPU
      nohz: Trace timekeeping update
      nohz: Wake up full dynticks CPUs when a timer gets enqueued
      rcu: Restart the tick on non-responding full dynticks CPUs
      sched: Comment on rq->clock correctness in ttwu_do_wakeup() in nohz
      sched: Update rq clock on nohz CPU before migrating tasks
      sched: Update rq clock on nohz CPU before setting fair group shares
      sched: Update rq clock on tickless CPUs before calling check_preempt_curr()
      sched: Update rq clock earlier in unthrottle_cfs_rq
      sched: Update clock of nohz busiest rq before balancing
      sched: Update rq clock before idle balancing
      sched: Update nohz rq clock before searching busiest group on load balancing
      nohz: Move nohz load balancer selection into idle logic
      nohz: Full dynticks mode
      nohz: Only stop the tick on RCU nocb CPUs
      nohz: Don't turn off the tick if rcu needs it
      nohz: Don't stop the tick if posix cpu timers are running
      nohz: Add some tracing

Steven Rostedt (2):
      irq_work: Flush work on CPU_DYING
      irq_work: Warn if there's still work on cpu_down

 arch/alpha/Kconfig                  |    1 -
 arch/alpha/kernel/osf_sys.c         |    6 +-
 arch/arm/Kconfig                    |    1 -
 arch/arm64/Kconfig                  |    1 -
 arch/blackfin/Kconfig               |    1 -
 arch/frv/Kconfig                    |    1 -
 arch/hexagon/Kconfig                |    1 -
 arch/mips/Kconfig                   |    1 -
 arch/parisc/Kconfig                 |    1 -
 arch/powerpc/Kconfig                |    1 -
 arch/s390/Kconfig                   |    1 -
 arch/s390/kernel/vtime.c            |    4 +-
 arch/sh/Kconfig                     |    1 -
 arch/sparc/Kconfig                  |    1 -
 arch/x86/Kconfig                    |    1 -
 arch/x86/kernel/apm_32.c            |   11 +-
 drivers/isdn/mISDN/stack.c          |    7 +-
 drivers/staging/iio/trigger/Kconfig |    1 -
 fs/binfmt_elf.c                     |    8 +-
 fs/binfmt_elf_fdpic.c               |    7 +-
 include/asm-generic/cputime.h       |    1 +
 include/linux/context_tracking.h    |   28 +++++
 include/linux/hardirq.h             |    4 +-
 include/linux/init_task.h           |    9 ++
 include/linux/irq_work.h            |   20 +++
 include/linux/kernel_stat.h         |    2 +-
 include/linux/posix-timers.h        |    1 +
 include/linux/printk.h              |    3 -
 include/linux/rcupdate.h            |    8 ++
 include/linux/sched.h               |   48 +++++++-
 include/linux/tick.h                |   26 ++++-
 include/linux/vtime.h               |   47 +++++---
 init/Kconfig                        |   22 +++-
 kernel/acct.c                       |    6 +-
 kernel/context_tracking.c           |   91 +++++++++++----
 kernel/cpu.c                        |    4 +-
 kernel/delayacct.c                  |    7 +-
 kernel/exit.c                       |    6 +-
 kernel/fork.c                       |    8 +-
 kernel/irq_work.c                   |  131 ++++++++++++++++-----
 kernel/posix-cpu-timers.c           |   39 +++++-
 kernel/printk.c                     |   36 +++---
 kernel/rcutree.c                    |   19 +++-
 kernel/rcutree_plugin.h             |   13 +--
 kernel/sched/core.c                 |   69 +++++++++++-
 kernel/sched/cputime.c              |  222 ++++++++++++++++++++++++++++++-----
 kernel/sched/fair.c                 |   42 +++++++-
 kernel/sched/sched.h                |   15 +++
 kernel/signal.c                     |   12 ++-
 kernel/softirq.c                    |   11 +-
 kernel/time/Kconfig                 |    9 ++
 kernel/time/tick-broadcast.c        |    3 +-
 kernel/time/tick-common.c           |    5 +-
 kernel/time/tick-sched.c            |  142 ++++++++++++++++++++---
 kernel/timer.c                      |    3 +-
 kernel/tsacct.c                     |   19 ++-
 56 files changed, 955 insertions(+), 233 deletions(-)


* [PATCH 01/24] context_tracking: Add comments on interface and internals
From: Frederic Weisbecker @ 2012-12-20 18:32 UTC
  To: LKML
  Cc: Frederic Weisbecker, Alessio Igor Bogani, Andrew Morton,
	Avi Kivity, Chris Metcalf, Christoph Lameter, Geoff Levand,
	Gilad Ben Yossef, Hakan Akkan, Ingo Molnar, Paul E. McKenney,
	Paul Gortmaker, Peter Zijlstra, Steven Rostedt, Thomas Gleixner,
	Li Zhong

This subsystem lacks explanations of its purpose and
design. Add the missing comments.

Reported-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Alessio Igor Bogani <abogani@kernel.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Avi Kivity <avi@redhat.com>
Cc: Chris Metcalf <cmetcalf@tilera.com>
Cc: Christoph Lameter <cl@linux.com>
Cc: Geoff Levand <geoff@infradead.org>
Cc: Gilad Ben Yossef <gilad@benyossef.com>
Cc: Hakan Akkan <hakanakkan@gmail.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Cc: Paul Gortmaker <paul.gortmaker@windriver.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Li Zhong <zhong@linux.vnet.ibm.com>
---
 kernel/context_tracking.c |   73 ++++++++++++++++++++++++++++++++++++++------
 1 files changed, 63 insertions(+), 10 deletions(-)

diff --git a/kernel/context_tracking.c b/kernel/context_tracking.c
index e0e07fd..9f6c38f 100644
--- a/kernel/context_tracking.c
+++ b/kernel/context_tracking.c
@@ -1,3 +1,19 @@
+/*
+ * Context tracking: Probe on high level context boundaries such as kernel
+ * and userspace. This includes syscalls and exceptions entry/exit.
+ *
+ * This is used by RCU to remove its dependency on the timer tick while a CPU
+ * runs in userspace.
+ *
+ *  Started by Frederic Weisbecker:
+ *
+ * Copyright (C) 2012 Red Hat, Inc., Frederic Weisbecker <fweisbec@redhat.com>
+ *
+ * Many thanks to Gilad Ben-Yossef, Paul McKenney, Ingo Molnar, Andrew Morton,
+ * Steven Rostedt, Peter Zijlstra for suggestions and improvements.
+ *
+ */
+
 #include <linux/context_tracking.h>
 #include <linux/rcupdate.h>
 #include <linux/sched.h>
@@ -6,8 +22,8 @@
 
 struct context_tracking {
 	/*
-	 * When active is false, hooks are not set to
-	 * minimize overhead: TIF flags are cleared
+	 * When active is false, hooks are unset in order
+	 * to minimize overhead: TIF flags are cleared
 	 * and calls to user_enter/exit are ignored. This
 	 * may be further optimized using static keys.
 	 */
@@ -24,6 +40,15 @@ static DEFINE_PER_CPU(struct context_tracking, context_tracking) = {
 #endif
 };
 
+/**
+ * user_enter - Inform the context tracking that the CPU is going to
+ *              enter userspace mode.
+ *
+ * This function must be called right before we switch from the kernel
+ * to userspace, when it's guaranteed the remaining kernel instructions
+ * to execute won't use any RCU read side critical section because this
+ * function sets RCU in extended quiescent state.
+ */
 void user_enter(void)
 {
 	unsigned long flags;
@@ -39,40 +64,68 @@ void user_enter(void)
 	if (in_interrupt())
 		return;
 
+	/* Kernel threads aren't supposed to go to userspace */
 	WARN_ON_ONCE(!current->mm);
 
 	local_irq_save(flags);
 	if (__this_cpu_read(context_tracking.active) &&
 	    __this_cpu_read(context_tracking.state) != IN_USER) {
 		__this_cpu_write(context_tracking.state, IN_USER);
+		/*
+		 * At this stage, only low level arch entry code remains and
+		 * then we'll run in userspace. We can assume there won't be
+		 * any RCU read-side critical section until the next call to
+		 * user_exit() or rcu_irq_enter(). Let's remove RCU's dependency
+		 * on the tick.
+		 */
 		rcu_user_enter();
 	}
 	local_irq_restore(flags);
 }
 
+
+/**
+ * user_exit - Inform the context tracking that the CPU is
+ *             exiting userspace mode and entering the kernel.
+ *
+ * This function must be called after we entered the kernel from userspace
+ * before any use of RCU read side critical section. This potentially include
+ * any high level kernel code like syscalls, exceptions, signal handling, etc...
+ *
+ * This call supports re-entrancy. This way it can be called from any exception
+ * handler without needing to know if we came from userspace or not.
+ */
 void user_exit(void)
 {
 	unsigned long flags;
 
-	/*
-	 * Some contexts may involve an exception occuring in an irq,
-	 * leading to that nesting:
-	 * rcu_irq_enter() rcu_user_exit() rcu_user_exit() rcu_irq_exit()
-	 * This would mess up the dyntick_nesting count though. And rcu_irq_*()
-	 * helpers are enough to protect RCU uses inside the exception. So
-	 * just return immediately if we detect we are in an IRQ.
-	 */
 	if (in_interrupt())
 		return;
 
 	local_irq_save(flags);
 	if (__this_cpu_read(context_tracking.state) == IN_USER) {
 		__this_cpu_write(context_tracking.state, IN_KERNEL);
+		/*
+		 * We are going to run code that may use RCU. Inform
+		 * RCU core about that (ie: we may need the tick again).
+		 */
 		rcu_user_exit();
 	}
 	local_irq_restore(flags);
 }
 
+
+/**
+ * context_tracking_task_switch - context switch the syscall hooks
+ *
+ * The context tracking uses the syscall slow path to implement its user-kernel
+ * boundaries hooks on syscalls. This way it doesn't impact the syscall fast
+ * path on CPUs that don't do context tracking.
+ *
+ * But we need to clear the flag on the previous task because it may later
+ * migrate to some CPU that doesn't do the context tracking. As such the TIF
+ * flag may not be desired there.
+ */
 void context_tracking_task_switch(struct task_struct *prev,
 			     struct task_struct *next)
 {
-- 
1.7.5.4



* [PATCH 02/24] cputime: Generic on-demand virtual cputime accounting
From: Frederic Weisbecker @ 2012-12-20 18:32 UTC
  To: LKML
  Cc: Frederic Weisbecker, Alessio Igor Bogani, Andrew Morton,
	Avi Kivity, Chris Metcalf, Christoph Lameter, Geoff Levand,
	Gilad Ben Yossef, Hakan Akkan, Ingo Molnar, Paul E. McKenney,
	Paul Gortmaker, Peter Zijlstra, Steven Rostedt, Thomas Gleixner

If we want to stop the tick beyond the idle case, we need to be
able to account the cputime without using the tick.

Virtual based cputime accounting solves that problem by
hooking into kernel/user boundaries.

However, implementing CONFIG_VIRT_CPU_ACCOUNTING requires
setting low level hooks and involves more overhead. But
we already have a generic context tracking subsystem
that is required anyway for RCU by archs that want to
shut down the tick outside idle.

This patch implements a generic virtual based cputime
accounting that relies on these generic kernel/user hooks.

There are some upsides to doing this:

- This requires no arch code to implement CONFIG_VIRT_CPU_ACCOUNTING
if context tracking is already built (already necessary for RCU in full
tickless mode).

- We can rely on the generic context tracking subsystem to dynamically
(de)activate the hooks, so that we can switch anytime between virtual
and tick based accounting. This way we don't have the overhead
of the virtual accounting when the tick is running periodically.

And a few downsides:

- It relies on jiffies and the hooks are set in high level code. This
results in less precise cputime accounting than with a true native
virtual based cputime accounting, which hooks into low level code and
uses a CPU hardware clock. Precision is not the goal of this though.

- There is probably more overhead than with a native virtual based cputime
accounting. But this relies on hooks that are already set anyway.

Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Alessio Igor Bogani <abogani@kernel.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Avi Kivity <avi@redhat.com>
Cc: Chris Metcalf <cmetcalf@tilera.com>
Cc: Christoph Lameter <cl@linux.com>
Cc: Geoff Levand <geoff@infradead.org>
Cc: Gilad Ben Yossef <gilad@benyossef.com>
Cc: Hakan Akkan <hakanakkan@gmail.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Cc: Paul Gortmaker <paul.gortmaker@windriver.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
---
 include/linux/context_tracking.h |   28 +++++++++++
 include/linux/vtime.h            |    4 ++
 init/Kconfig                     |   11 ++++-
 kernel/context_tracking.c        |   22 ++-------
 kernel/sched/cputime.c           |   93 +++++++++++++++++++++++++++++++++++--
 5 files changed, 135 insertions(+), 23 deletions(-)

diff --git a/include/linux/context_tracking.h b/include/linux/context_tracking.h
index e24339c..9f33fbc 100644
--- a/include/linux/context_tracking.h
+++ b/include/linux/context_tracking.h
@@ -3,12 +3,40 @@
 
 #ifdef CONFIG_CONTEXT_TRACKING
 #include <linux/sched.h>
+#include <linux/percpu.h>
+
+struct context_tracking {
+	/*
+	 * When active is false, hooks are unset in order
+	 * to minimize overhead: TIF flags are cleared
+	 * and calls to user_enter/exit are ignored. This
+	 * may be further optimized using static keys.
+	 */
+	bool active;
+	enum {
+		IN_KERNEL = 0,
+		IN_USER,
+	} state;
+};
+
+DECLARE_PER_CPU(struct context_tracking, context_tracking);
+
+static inline bool context_tracking_in_user(void)
+{
+	return __this_cpu_read(context_tracking.state) == IN_USER;
+}
+
+static inline bool context_tracking_active(void)
+{
+	return __this_cpu_read(context_tracking.active);
+}
 
 extern void user_enter(void);
 extern void user_exit(void);
 extern void context_tracking_task_switch(struct task_struct *prev,
 					 struct task_struct *next);
 #else
+static inline bool context_tracking_in_user(void) { return false; }
 static inline void user_enter(void) { }
 static inline void user_exit(void) { }
 static inline void context_tracking_task_switch(struct task_struct *prev,
diff --git a/include/linux/vtime.h b/include/linux/vtime.h
index ae30ab5..58392aa 100644
--- a/include/linux/vtime.h
+++ b/include/linux/vtime.h
@@ -17,6 +17,10 @@ static inline void vtime_account_system_irqsafe(struct task_struct *tsk) { }
 static inline void vtime_account(struct task_struct *tsk) { }
 #endif
 
+#ifdef CONFIG_VIRT_CPU_ACCOUNTING_GEN
+static inline void arch_vtime_task_switch(struct task_struct *tsk) { }
+#endif
+
 #ifdef CONFIG_IRQ_TIME_ACCOUNTING
 extern void irqtime_account_irq(struct task_struct *tsk);
 #else
diff --git a/init/Kconfig b/init/Kconfig
index 60579d6..a64b3e8 100644
--- a/init/Kconfig
+++ b/init/Kconfig
@@ -340,7 +340,9 @@ config TICK_CPU_ACCOUNTING
 
 config VIRT_CPU_ACCOUNTING
 	bool "Deterministic task and CPU time accounting"
-	depends on HAVE_VIRT_CPU_ACCOUNTING
+	depends on HAVE_VIRT_CPU_ACCOUNTING || HAVE_CONTEXT_TRACKING
+	select VIRT_CPU_ACCOUNTING_GEN if !HAVE_VIRT_CPU_ACCOUNTING
+	default y if PPC64
 	help
 	  Select this option to enable more accurate task and CPU time
 	  accounting.  This is done by reading a CPU counter on each
@@ -363,6 +365,13 @@ config IRQ_TIME_ACCOUNTING
 
 endchoice
 
+config VIRT_CPU_ACCOUNTING_GEN
+	select CONTEXT_TRACKING
+	bool
+	help
+	  Implement a generic virtual based cputime accounting by using
+	  the context tracking subsystem.
+
 config BSD_PROCESS_ACCT
 	bool "BSD Process Accounting"
 	help
diff --git a/kernel/context_tracking.c b/kernel/context_tracking.c
index 9f6c38f..ca1e073 100644
--- a/kernel/context_tracking.c
+++ b/kernel/context_tracking.c
@@ -17,24 +17,10 @@
 #include <linux/context_tracking.h>
 #include <linux/rcupdate.h>
 #include <linux/sched.h>
-#include <linux/percpu.h>
 #include <linux/hardirq.h>
 
-struct context_tracking {
-	/*
-	 * When active is false, hooks are unset in order
-	 * to minimize overhead: TIF flags are cleared
-	 * and calls to user_enter/exit are ignored. This
-	 * may be further optimized using static keys.
-	 */
-	bool active;
-	enum {
-		IN_KERNEL = 0,
-		IN_USER,
-	} state;
-};
 
-static DEFINE_PER_CPU(struct context_tracking, context_tracking) = {
+DEFINE_PER_CPU(struct context_tracking, context_tracking) = {
 #ifdef CONFIG_CONTEXT_TRACKING_FORCE
 	.active = true,
 #endif
@@ -70,7 +56,7 @@ void user_enter(void)
 	local_irq_save(flags);
 	if (__this_cpu_read(context_tracking.active) &&
 	    __this_cpu_read(context_tracking.state) != IN_USER) {
-		__this_cpu_write(context_tracking.state, IN_USER);
+		vtime_account_system(current);
 		/*
 		 * At this stage, only low level arch entry code remains and
 		 * then we'll run in userspace. We can assume there won't be
@@ -79,6 +65,7 @@ void user_enter(void)
 		 * on the tick.
 		 */
 		rcu_user_enter();
+		__this_cpu_write(context_tracking.state, IN_USER);
 	}
 	local_irq_restore(flags);
 }
@@ -104,12 +91,13 @@ void user_exit(void)
 
 	local_irq_save(flags);
 	if (__this_cpu_read(context_tracking.state) == IN_USER) {
-		__this_cpu_write(context_tracking.state, IN_KERNEL);
 		/*
 		 * We are going to run code that may use RCU. Inform
 		 * RCU core about that (ie: we may need the tick again).
 		 */
 		rcu_user_exit();
+		vtime_account_user(current);
+		__this_cpu_write(context_tracking.state, IN_KERNEL);
 	}
 	local_irq_restore(flags);
 }
diff --git a/kernel/sched/cputime.c b/kernel/sched/cputime.c
index 293b202..da0a9e7 100644
--- a/kernel/sched/cputime.c
+++ b/kernel/sched/cputime.c
@@ -3,6 +3,7 @@
 #include <linux/tsacct_kern.h>
 #include <linux/kernel_stat.h>
 #include <linux/static_key.h>
+#include <linux/context_tracking.h>
 #include "sched.h"
 
 
@@ -495,10 +496,24 @@ void vtime_task_switch(struct task_struct *prev)
 #ifndef __ARCH_HAS_VTIME_ACCOUNT
 void vtime_account(struct task_struct *tsk)
 {
-	if (in_interrupt() || !is_idle_task(tsk))
-		vtime_account_system(tsk);
-	else
-		vtime_account_idle(tsk);
+	if (!in_interrupt()) {
+		/*
+		 * If we interrupted user, context_tracking_in_user()
+		 * is 1 because the context tracking don't hook
+		 * on irq entry/exit. This way we know if
+		 * we need to flush user time on kernel entry.
+		 */
+		if (context_tracking_in_user()) {
+			vtime_account_user(tsk);
+			return;
+		}
+
+		if (is_idle_task(tsk)) {
+			vtime_account_idle(tsk);
+			return;
+		}
+	}
+	vtime_account_system(tsk);
 }
 EXPORT_SYMBOL_GPL(vtime_account);
 #endif /* __ARCH_HAS_VTIME_ACCOUNT */
@@ -586,4 +601,72 @@ void thread_group_cputime_adjusted(struct task_struct *p, cputime_t *ut, cputime
 	thread_group_cputime(p, &cputime);
 	cputime_adjust(&cputime, &p->signal->prev_cputime, ut, st);
 }
-#endif
+
+#ifdef CONFIG_VIRT_CPU_ACCOUNTING_GEN
+static DEFINE_PER_CPU(long, last_jiffies) = INITIAL_JIFFIES;
+
+static cputime_t get_vtime_delta(void)
+{
+	long delta;
+
+	delta = jiffies - __this_cpu_read(last_jiffies);
+	__this_cpu_add(last_jiffies, delta);
+
+	return jiffies_to_cputime(delta);
+}
+
+void vtime_account_system(struct task_struct *tsk)
+{
+	cputime_t delta_cpu = get_vtime_delta();
+
+	account_system_time(tsk, irq_count(), delta_cpu, cputime_to_scaled(delta_cpu));
+}
+
+void vtime_account_user(struct task_struct *tsk)
+{
+	cputime_t delta_cpu = get_vtime_delta();
+
+	/*
+	 * This is an unfortunate hack: if we flush user time only on
+	 * irq entry, we miss the jiffies update and the time is spuriously
+	 * accounted to system time.
+	 */
+	if (context_tracking_in_user())
+		account_user_time(tsk, delta_cpu, cputime_to_scaled(delta_cpu));
+}
+
+void vtime_account_idle(struct task_struct *tsk)
+{
+	cputime_t delta_cpu = get_vtime_delta();
+
+	account_idle_time(delta_cpu);
+}
+
+static int __cpuinit vtime_cpu_notify(struct notifier_block *self,
+				      unsigned long action, void *hcpu)
+{
+	long cpu = (long)hcpu;
+	long *last_jiffies_cpu = per_cpu_ptr(&last_jiffies, cpu);
+
+	switch (action) {
+	case CPU_UP_PREPARE:
+	case CPU_UP_PREPARE_FROZEN:
+		/*
+		 * CHECKME: ensure that's visible by the CPU
+		 * once it wakes up
+		 */
+		*last_jiffies_cpu = jiffies;
+	default:
+		break;
+	}
+
+	return NOTIFY_OK;
+}
+
+static int __init init_vtime(void)
+{
+	cpu_notifier(vtime_cpu_notify, 0);
+	return 0;
+}
+early_initcall(init_vtime);
+#endif /* CONFIG_VIRT_CPU_ACCOUNTING_GEN */
-- 
1.7.5.4



* [PATCH 03/24] cputime: Allow dynamic switch between tick/virtual based cputime accounting
From: Frederic Weisbecker @ 2012-12-20 18:32 UTC
  To: LKML
  Cc: Frederic Weisbecker, Alessio Igor Bogani, Andrew Morton,
	Avi Kivity, Chris Metcalf, Christoph Lameter, Geoff Levand,
	Gilad Ben Yossef, Hakan Akkan, Ingo Molnar, Paul E. McKenney,
	Paul Gortmaker, Peter Zijlstra, Steven Rostedt, Thomas Gleixner

Allow dynamically switching between tick and virtual based cputime accounting.
This way we can provide a kind of "on-demand" virtual based cputime
accounting. In this mode, the kernel relies on the user hooks
subsystem to dynamically hook on kernel/user boundaries.

This is in preparation for being able to stop the timer tick beyond
idle. Doing so will depend on CONFIG_VIRT_CPU_ACCOUNTING, which makes
it possible to account the cputime without the tick by hooking into
kernel/user boundaries.

Depending on whether the tick is stopped or not, we can switch between
tick and vtime based accounting anytime in order to minimize the
overhead associated with the user hooks.

Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Alessio Igor Bogani <abogani@kernel.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Avi Kivity <avi@redhat.com>
Cc: Chris Metcalf <cmetcalf@tilera.com>
Cc: Christoph Lameter <cl@linux.com>
Cc: Geoff Levand <geoff@infradead.org>
Cc: Gilad Ben Yossef <gilad@benyossef.com>
Cc: Hakan Akkan <hakanakkan@gmail.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Cc: Paul Gortmaker <paul.gortmaker@windriver.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
---
 include/linux/kernel_stat.h |    2 +-
 include/linux/sched.h       |    4 +-
 include/linux/vtime.h       |    9 +++++++
 init/Kconfig                |    6 ++++
 kernel/fork.c               |    2 +-
 kernel/sched/cputime.c      |   57 ++++++++++++++++++++++++++++---------------
 kernel/time/tick-sched.c    |    5 +++-
 7 files changed, 60 insertions(+), 25 deletions(-)

diff --git a/include/linux/kernel_stat.h b/include/linux/kernel_stat.h
index 66b7078..ed5f6ed 100644
--- a/include/linux/kernel_stat.h
+++ b/include/linux/kernel_stat.h
@@ -127,7 +127,7 @@ extern void account_system_time(struct task_struct *, int, cputime_t, cputime_t)
 extern void account_steal_time(cputime_t);
 extern void account_idle_time(cputime_t);
 
-#ifdef CONFIG_VIRT_CPU_ACCOUNTING
+#ifdef CONFIG_VIRT_CPU_ACCOUNTING_NATIVE
 static inline void account_process_tick(struct task_struct *tsk, int user)
 {
 	vtime_account_user(tsk);
diff --git a/include/linux/sched.h b/include/linux/sched.h
index 651b51a..547c1f0 100644
--- a/include/linux/sched.h
+++ b/include/linux/sched.h
@@ -597,7 +597,7 @@ struct signal_struct {
 	cputime_t utime, stime, cutime, cstime;
 	cputime_t gtime;
 	cputime_t cgtime;
-#ifndef CONFIG_VIRT_CPU_ACCOUNTING
+#ifndef CONFIG_VIRT_CPU_ACCOUNTING_NATIVE
 	struct cputime prev_cputime;
 #endif
 	unsigned long nvcsw, nivcsw, cnvcsw, cnivcsw;
@@ -1357,7 +1357,7 @@ struct task_struct {
 
 	cputime_t utime, stime, utimescaled, stimescaled;
 	cputime_t gtime;
-#ifndef CONFIG_VIRT_CPU_ACCOUNTING
+#ifndef CONFIG_VIRT_CPU_ACCOUNTING_NATIVE
 	struct cputime prev_cputime;
 #endif
 	unsigned long nvcsw, nivcsw; /* context switch counts */
diff --git a/include/linux/vtime.h b/include/linux/vtime.h
index 58392aa..e57020d 100644
--- a/include/linux/vtime.h
+++ b/include/linux/vtime.h
@@ -10,11 +10,20 @@ extern void vtime_account_system_irqsafe(struct task_struct *tsk);
 extern void vtime_account_idle(struct task_struct *tsk);
 extern void vtime_account_user(struct task_struct *tsk);
 extern void vtime_account(struct task_struct *tsk);
+
+#ifdef CONFIG_VIRT_CPU_ACCOUNTING_GEN
+extern bool vtime_accounting(void);
 #else
+static inline bool vtime_accounting(void) { return true; }
+#endif
+
+#else /* !CONFIG_VIRT_CPU_ACCOUNTING */
 static inline void vtime_task_switch(struct task_struct *prev) { }
 static inline void vtime_account_system(struct task_struct *tsk) { }
 static inline void vtime_account_system_irqsafe(struct task_struct *tsk) { }
+static inline void vtime_account_user(struct task_struct *tsk) { }
 static inline void vtime_account(struct task_struct *tsk) { }
+static inline bool vtime_accounting(void) { return false; }
 #endif
 
 #ifdef CONFIG_VIRT_CPU_ACCOUNTING_GEN
diff --git a/init/Kconfig b/init/Kconfig
index a64b3e8..9d7000a 100644
--- a/init/Kconfig
+++ b/init/Kconfig
@@ -342,6 +342,7 @@ config VIRT_CPU_ACCOUNTING
 	bool "Deterministic task and CPU time accounting"
 	depends on HAVE_VIRT_CPU_ACCOUNTING || HAVE_CONTEXT_TRACKING
 	select VIRT_CPU_ACCOUNTING_GEN if !HAVE_VIRT_CPU_ACCOUNTING
+	select VIRT_CPU_ACCOUNTING_NATIVE if HAVE_VIRT_CPU_ACCOUNTING
 	default y if PPC64
 	help
 	  Select this option to enable more accurate task and CPU time
@@ -367,11 +368,16 @@ endchoice
 
 config VIRT_CPU_ACCOUNTING_GEN
 	select CONTEXT_TRACKING
+	depends on VIRT_CPU_ACCOUNTING && HAVE_CONTEXT_TRACKING
 	bool
 	help
 	  Implement a generic virtual based cputime accounting by using
 	  the context tracking subsystem.
 
+config VIRT_CPU_ACCOUNTING_NATIVE
+	depends on VIRT_CPU_ACCOUNTING && HAVE_VIRT_CPU_ACCOUNTING
+	bool
+
 config BSD_PROCESS_ACCT
 	bool "BSD Process Accounting"
 	help
diff --git a/kernel/fork.c b/kernel/fork.c
index 3c31e87..a81efb8 100644
--- a/kernel/fork.c
+++ b/kernel/fork.c
@@ -1221,7 +1221,7 @@ static struct task_struct *copy_process(unsigned long clone_flags,
 
 	p->utime = p->stime = p->gtime = 0;
 	p->utimescaled = p->stimescaled = 0;
-#ifndef CONFIG_VIRT_CPU_ACCOUNTING
+#ifndef CONFIG_VIRT_CPU_ACCOUNTING_NATIVE
 	p->prev_cputime.utime = p->prev_cputime.stime = 0;
 #endif
 #if defined(SPLIT_RSS_COUNTING)
diff --git a/kernel/sched/cputime.c b/kernel/sched/cputime.c
index da0a9e7..e1fcab4 100644
--- a/kernel/sched/cputime.c
+++ b/kernel/sched/cputime.c
@@ -317,8 +317,6 @@ out:
 	rcu_read_unlock();
 }
 
-#ifndef CONFIG_VIRT_CPU_ACCOUNTING
-
 #ifdef CONFIG_IRQ_TIME_ACCOUNTING
 /*
  * Account a tick to a process and cpustat
@@ -388,6 +386,7 @@ static void irqtime_account_process_tick(struct task_struct *p, int user_tick,
 						struct rq *rq) {}
 #endif /* CONFIG_IRQ_TIME_ACCOUNTING */
 
+#ifndef CONFIG_VIRT_CPU_ACCOUNTING_NATIVE
 /*
  * Account a single tick of cpu time.
  * @p: the process that the cpu time gets accounted to
@@ -398,6 +397,11 @@ void account_process_tick(struct task_struct *p, int user_tick)
 	cputime_t one_jiffy_scaled = cputime_to_scaled(cputime_one_jiffy);
 	struct rq *rq = this_rq();
 
+	if (vtime_accounting()) {
+		vtime_account_user(p);
+		return;
+	}
+
 	if (sched_clock_irqtime) {
 		irqtime_account_process_tick(p, user_tick, rq);
 		return;
@@ -439,29 +443,13 @@ void account_idle_ticks(unsigned long ticks)
 
 	account_idle_time(jiffies_to_cputime(ticks));
 }
-
 #endif
 
+
 /*
  * Use precise platform statistics if available:
  */
 #ifdef CONFIG_VIRT_CPU_ACCOUNTING
-void task_cputime_adjusted(struct task_struct *p, cputime_t *ut, cputime_t *st)
-{
-	*ut = p->utime;
-	*st = p->stime;
-}
-
-void thread_group_cputime_adjusted(struct task_struct *p, cputime_t *ut, cputime_t *st)
-{
-	struct task_cputime cputime;
-
-	thread_group_cputime(p, &cputime);
-
-	*ut = cputime.utime;
-	*st = cputime.stime;
-}
-
 void vtime_account_system_irqsafe(struct task_struct *tsk)
 {
 	unsigned long flags;
@@ -517,8 +505,25 @@ void vtime_account(struct task_struct *tsk)
 }
 EXPORT_SYMBOL_GPL(vtime_account);
 #endif /* __ARCH_HAS_VTIME_ACCOUNT */
+#endif /* CONFIG_VIRT_CPU_ACCOUNTING */
 
-#else
+#ifdef CONFIG_VIRT_CPU_ACCOUNTING_NATIVE
+void task_cputime_adjusted(struct task_struct *p, cputime_t *ut, cputime_t *st)
+{
+	*ut = p->utime;
+	*st = p->stime;
+}
+
+void thread_group_cputime_adjusted(struct task_struct *p, cputime_t *ut, cputime_t *st)
+{
+	struct task_cputime cputime;
+
+	thread_group_cputime(p, &cputime);
+
+	*ut = cputime.utime;
+	*st = cputime.stime;
+}
+#else /* !CONFIG_VIRT_CPU_ACCOUNTING_NATIVE */
 
 #ifndef nsecs_to_cputime
 # define nsecs_to_cputime(__nsecs)	nsecs_to_jiffies(__nsecs)
@@ -548,6 +553,12 @@ static void cputime_adjust(struct task_cputime *curr,
 {
 	cputime_t rtime, utime, total;
 
+	if (vtime_accounting()) {
+		*ut = curr->utime;
+		*st = curr->stime;
+		return;
+	}
+
 	utime = curr->utime;
 	total = utime + curr->stime;
 
@@ -601,6 +612,7 @@ void thread_group_cputime_adjusted(struct task_struct *p, cputime_t *ut, cputime
 	thread_group_cputime(p, &cputime);
 	cputime_adjust(&cputime, &p->signal->prev_cputime, ut, st);
 }
+#endif /* !CONFIG_VIRT_CPU_ACCOUNTING_NATIVE */
 
 #ifdef CONFIG_VIRT_CPU_ACCOUNTING_GEN
 static DEFINE_PER_CPU(long, last_jiffies) = INITIAL_JIFFIES;
@@ -642,6 +654,11 @@ void vtime_account_idle(struct task_struct *tsk)
 	account_idle_time(delta_cpu);
 }
 
+bool vtime_accounting(void)
+{
+	return context_tracking_active();
+}
+
 static int __cpuinit vtime_cpu_notify(struct notifier_block *self,
 				      unsigned long action, void *hcpu)
 {
diff --git a/kernel/time/tick-sched.c b/kernel/time/tick-sched.c
index fb8e5e4..ad0e6fa 100644
--- a/kernel/time/tick-sched.c
+++ b/kernel/time/tick-sched.c
@@ -632,8 +632,11 @@ static void tick_nohz_restart_sched_tick(struct tick_sched *ts, ktime_t now)
 
 static void tick_nohz_account_idle_ticks(struct tick_sched *ts)
 {
-#ifndef CONFIG_VIRT_CPU_ACCOUNTING
+#ifndef CONFIG_VIRT_CPU_ACCOUNTING_NATIVE
 	unsigned long ticks;
+
+	if (vtime_accounting())
+		return;
 	/*
 	 * We stopped the tick in idle. Update process times would miss the
 	 * time we slept as update_process_times does only a 1 tick
-- 
1.7.5.4



* [PATCH 04/24] cputime: Use accessors to read task cputime stats
From: Frederic Weisbecker @ 2012-12-20 18:32 UTC
  To: LKML
  Cc: Frederic Weisbecker, Alessio Igor Bogani, Andrew Morton,
	Avi Kivity, Chris Metcalf, Christoph Lameter, Geoff Levand,
	Gilad Ben Yossef, Hakan Akkan, Ingo Molnar, Paul E. McKenney,
	Paul Gortmaker, Peter Zijlstra, Steven Rostedt, Thomas Gleixner

This is in preparation for the full dynticks feature. While
remotely reading the cputime of a task running on a full
dynticks CPU, we'll need to do some extra computation. This
way we can account the time it spent tickless in userspace
since its last cputime snapshot.

Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Alessio Igor Bogani <abogani@kernel.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Avi Kivity <avi@redhat.com>
Cc: Chris Metcalf <cmetcalf@tilera.com>
Cc: Christoph Lameter <cl@linux.com>
Cc: Geoff Levand <geoff@infradead.org>
Cc: Gilad Ben Yossef <gilad@benyossef.com>
Cc: Hakan Akkan <hakanakkan@gmail.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Cc: Paul Gortmaker <paul.gortmaker@windriver.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
---
 arch/alpha/kernel/osf_sys.c |    6 ++++--
 arch/x86/kernel/apm_32.c    |   11 ++++++-----
 drivers/isdn/mISDN/stack.c  |    7 ++++++-
 fs/binfmt_elf.c             |    8 ++++++--
 fs/binfmt_elf_fdpic.c       |    7 +++++--
 include/linux/sched.h       |   18 ++++++++++++++++++
 kernel/acct.c               |    6 ++++--
 kernel/cpu.c                |    4 +++-
 kernel/delayacct.c          |    7 +++++--
 kernel/exit.c               |    6 ++++--
 kernel/posix-cpu-timers.c   |   28 ++++++++++++++++++++++------
 kernel/sched/cputime.c      |    9 +++++----
 kernel/signal.c             |   12 ++++++++----
 kernel/tsacct.c             |   19 +++++++++++++------
 14 files changed, 109 insertions(+), 39 deletions(-)

diff --git a/arch/alpha/kernel/osf_sys.c b/arch/alpha/kernel/osf_sys.c
index 14db93e..dbc1760 100644
--- a/arch/alpha/kernel/osf_sys.c
+++ b/arch/alpha/kernel/osf_sys.c
@@ -1139,6 +1139,7 @@ struct rusage32 {
 SYSCALL_DEFINE2(osf_getrusage, int, who, struct rusage32 __user *, ru)
 {
 	struct rusage32 r;
+	cputime_t utime, stime;
 
 	if (who != RUSAGE_SELF && who != RUSAGE_CHILDREN)
 		return -EINVAL;
@@ -1146,8 +1147,9 @@ SYSCALL_DEFINE2(osf_getrusage, int, who, struct rusage32 __user *, ru)
 	memset(&r, 0, sizeof(r));
 	switch (who) {
 	case RUSAGE_SELF:
-		jiffies_to_timeval32(current->utime, &r.ru_utime);
-		jiffies_to_timeval32(current->stime, &r.ru_stime);
+		task_cputime(current, &utime, &stime);
+		jiffies_to_timeval32(utime, &r.ru_utime);
+		jiffies_to_timeval32(stime, &r.ru_stime);
 		r.ru_minflt = current->min_flt;
 		r.ru_majflt = current->maj_flt;
 		break;
diff --git a/arch/x86/kernel/apm_32.c b/arch/x86/kernel/apm_32.c
index d65464e..8d7012b 100644
--- a/arch/x86/kernel/apm_32.c
+++ b/arch/x86/kernel/apm_32.c
@@ -899,6 +899,7 @@ static void apm_cpu_idle(void)
 	static int use_apm_idle; /* = 0 */
 	static unsigned int last_jiffies; /* = 0 */
 	static unsigned int last_stime; /* = 0 */
+	cputime_t stime;
 
 	int apm_idle_done = 0;
 	unsigned int jiffies_since_last_check = jiffies - last_jiffies;
@@ -906,23 +907,23 @@ static void apm_cpu_idle(void)
 
 	WARN_ONCE(1, "deprecated apm_cpu_idle will be deleted in 2012");
 recalc:
+	task_cputime(current, NULL, &stime);
 	if (jiffies_since_last_check > IDLE_CALC_LIMIT) {
 		use_apm_idle = 0;
-		last_jiffies = jiffies;
-		last_stime = current->stime;
 	} else if (jiffies_since_last_check > idle_period) {
 		unsigned int idle_percentage;
 
-		idle_percentage = current->stime - last_stime;
+		idle_percentage = stime - last_stime;
 		idle_percentage *= 100;
 		idle_percentage /= jiffies_since_last_check;
 		use_apm_idle = (idle_percentage > idle_threshold);
 		if (apm_info.forbid_idle)
 			use_apm_idle = 0;
-		last_jiffies = jiffies;
-		last_stime = current->stime;
 	}
 
+	last_jiffies = jiffies;
+	last_stime = stime;
+
 	bucket = IDLE_LEAKY_MAX;
 
 	while (!need_resched()) {
diff --git a/drivers/isdn/mISDN/stack.c b/drivers/isdn/mISDN/stack.c
index 5f21f62..deda591 100644
--- a/drivers/isdn/mISDN/stack.c
+++ b/drivers/isdn/mISDN/stack.c
@@ -18,6 +18,7 @@
 #include <linux/slab.h>
 #include <linux/mISDNif.h>
 #include <linux/kthread.h>
+#include <linux/sched.h>
 #include "core.h"
 
 static u_int	*debug;
@@ -202,6 +203,9 @@ static int
 mISDNStackd(void *data)
 {
 	struct mISDNstack *st = data;
+#ifdef MISDN_MSG_STATS
+	cputime_t utime, stime;
+#endif
 	int err = 0;
 
 	sigfillset(&current->blocked);
@@ -303,9 +307,10 @@ mISDNStackd(void *data)
 	       "msg %d sleep %d stopped\n",
 	       dev_name(&st->dev->dev), st->msg_cnt, st->sleep_cnt,
 	       st->stopped_cnt);
+	task_cputime(st->thread, &utime, &stime);
 	printk(KERN_DEBUG
 	       "mISDNStackd daemon for %s utime(%ld) stime(%ld)\n",
-	       dev_name(&st->dev->dev), st->thread->utime, st->thread->stime);
+	       dev_name(&st->dev->dev), utime, stime);
 	printk(KERN_DEBUG
 	       "mISDNStackd daemon for %s nvcsw(%ld) nivcsw(%ld)\n",
 	       dev_name(&st->dev->dev), st->thread->nvcsw, st->thread->nivcsw);
diff --git a/fs/binfmt_elf.c b/fs/binfmt_elf.c
index 6d7d164..0766a2b 100644
--- a/fs/binfmt_elf.c
+++ b/fs/binfmt_elf.c
@@ -33,6 +33,7 @@
 #include <linux/elf.h>
 #include <linux/utsname.h>
 #include <linux/coredump.h>
+#include <linux/sched.h>
 #include <asm/uaccess.h>
 #include <asm/param.h>
 #include <asm/page.h>
@@ -1320,8 +1321,11 @@ static void fill_prstatus(struct elf_prstatus *prstatus,
 		cputime_to_timeval(cputime.utime, &prstatus->pr_utime);
 		cputime_to_timeval(cputime.stime, &prstatus->pr_stime);
 	} else {
-		cputime_to_timeval(p->utime, &prstatus->pr_utime);
-		cputime_to_timeval(p->stime, &prstatus->pr_stime);
+		cputime_t utime, stime;
+
+		task_cputime(p, &utime, &stime);
+		cputime_to_timeval(utime, &prstatus->pr_utime);
+		cputime_to_timeval(stime, &prstatus->pr_stime);
 	}
 	cputime_to_timeval(p->signal->cutime, &prstatus->pr_cutime);
 	cputime_to_timeval(p->signal->cstime, &prstatus->pr_cstime);
diff --git a/fs/binfmt_elf_fdpic.c b/fs/binfmt_elf_fdpic.c
index dc84732..cb240dd 100644
--- a/fs/binfmt_elf_fdpic.c
+++ b/fs/binfmt_elf_fdpic.c
@@ -1375,8 +1375,11 @@ static void fill_prstatus(struct elf_prstatus *prstatus,
 		cputime_to_timeval(cputime.utime, &prstatus->pr_utime);
 		cputime_to_timeval(cputime.stime, &prstatus->pr_stime);
 	} else {
-		cputime_to_timeval(p->utime, &prstatus->pr_utime);
-		cputime_to_timeval(p->stime, &prstatus->pr_stime);
+		cputime_t utime, stime;
+
+		task_cputime(p, &utime, &stime);
+		cputime_to_timeval(utime, &prstatus->pr_utime);
+		cputime_to_timeval(stime, &prstatus->pr_stime);
 	}
 	cputime_to_timeval(p->signal->cutime, &prstatus->pr_cutime);
 	cputime_to_timeval(p->signal->cstime, &prstatus->pr_cstime);
diff --git a/include/linux/sched.h b/include/linux/sched.h
index 547c1f0..031afd0 100644
--- a/include/linux/sched.h
+++ b/include/linux/sched.h
@@ -1769,6 +1769,24 @@ static inline void put_task_struct(struct task_struct *t)
 		__put_task_struct(t);
 }
 
+static inline void task_cputime(struct task_struct *t,
+				cputime_t *utime, cputime_t *stime)
+{
+	if (utime)
+		*utime = t->utime;
+	if (stime)
+		*stime = t->stime;
+}
+
+static inline void task_cputime_scaled(struct task_struct *t,
+				       cputime_t *utimescaled,
+				       cputime_t *stimescaled)
+{
+	if (utimescaled)
+		*utimescaled = t->utimescaled;
+	if (stimescaled)
+		*stimescaled = t->stimescaled;
+}
 extern void task_cputime_adjusted(struct task_struct *p, cputime_t *ut, cputime_t *st);
 extern void thread_group_cputime_adjusted(struct task_struct *p, cputime_t *ut, cputime_t *st);
 
diff --git a/kernel/acct.c b/kernel/acct.c
index 051e071..e8b1627 100644
--- a/kernel/acct.c
+++ b/kernel/acct.c
@@ -566,6 +566,7 @@ out:
 void acct_collect(long exitcode, int group_dead)
 {
 	struct pacct_struct *pacct = &current->signal->pacct;
+	cputime_t utime, stime;
 	unsigned long vsize = 0;
 
 	if (group_dead && current->mm) {
@@ -593,8 +594,9 @@ void acct_collect(long exitcode, int group_dead)
 		pacct->ac_flag |= ACORE;
 	if (current->flags & PF_SIGNALED)
 		pacct->ac_flag |= AXSIG;
-	pacct->ac_utime += current->utime;
-	pacct->ac_stime += current->stime;
+	task_cputime(current, &utime, &stime);
+	pacct->ac_utime += utime;
+	pacct->ac_stime += stime;
 	pacct->ac_minflt += current->min_flt;
 	pacct->ac_majflt += current->maj_flt;
 	spin_unlock_irq(&current->sighand->siglock);
diff --git a/kernel/cpu.c b/kernel/cpu.c
index 3046a50..e5d5e8e 100644
--- a/kernel/cpu.c
+++ b/kernel/cpu.c
@@ -224,11 +224,13 @@ void clear_tasks_mm_cpumask(int cpu)
 static inline void check_for_tasks(int cpu)
 {
 	struct task_struct *p;
+	cputime_t utime, stime;
 
 	write_lock_irq(&tasklist_lock);
 	for_each_process(p) {
+		task_cputime(p, &utime, &stime);
 		if (task_cpu(p) == cpu && p->state == TASK_RUNNING &&
-		    (p->utime || p->stime))
+		    (utime || stime))
 			printk(KERN_WARNING "Task %s (pid = %d) is on cpu %d "
 				"(state = %ld, flags = %x)\n",
 				p->comm, task_pid_nr(p), cpu,
diff --git a/kernel/delayacct.c b/kernel/delayacct.c
index 418b3f7..d473988 100644
--- a/kernel/delayacct.c
+++ b/kernel/delayacct.c
@@ -106,6 +106,7 @@ int __delayacct_add_tsk(struct taskstats *d, struct task_struct *tsk)
 	unsigned long long t2, t3;
 	unsigned long flags;
 	struct timespec ts;
+	cputime_t utime, stime, stimescaled, utimescaled;
 
 	/* Though tsk->delays accessed later, early exit avoids
 	 * unnecessary returning of other data
@@ -114,12 +115,14 @@ int __delayacct_add_tsk(struct taskstats *d, struct task_struct *tsk)
 		goto done;
 
 	tmp = (s64)d->cpu_run_real_total;
-	cputime_to_timespec(tsk->utime + tsk->stime, &ts);
+	task_cputime(tsk, &utime, &stime);
+	cputime_to_timespec(utime + stime, &ts);
 	tmp += timespec_to_ns(&ts);
 	d->cpu_run_real_total = (tmp < (s64)d->cpu_run_real_total) ? 0 : tmp;
 
 	tmp = (s64)d->cpu_scaled_run_real_total;
-	cputime_to_timespec(tsk->utimescaled + tsk->stimescaled, &ts);
+	task_cputime_scaled(tsk, &utimescaled, &stimescaled);
+	cputime_to_timespec(utimescaled + stimescaled, &ts);
 	tmp += timespec_to_ns(&ts);
 	d->cpu_scaled_run_real_total =
 		(tmp < (s64)d->cpu_scaled_run_real_total) ? 0 : tmp;
diff --git a/kernel/exit.c b/kernel/exit.c
index 50d2e93..46481c0 100644
--- a/kernel/exit.c
+++ b/kernel/exit.c
@@ -97,6 +97,7 @@ static void __exit_signal(struct task_struct *tsk)
 	bool group_dead = thread_group_leader(tsk);
 	struct sighand_struct *sighand;
 	struct tty_struct *uninitialized_var(tty);
+	cputime_t utime, stime;
 
 	sighand = rcu_dereference_check(tsk->sighand,
 					lockdep_tasklist_lock_is_held());
@@ -135,8 +136,9 @@ static void __exit_signal(struct task_struct *tsk)
 		 * We won't ever get here for the group leader, since it
 		 * will have been the last reference on the signal_struct.
 		 */
-		sig->utime += tsk->utime;
-		sig->stime += tsk->stime;
+		task_cputime(tsk, &utime, &stime);
+		sig->utime += utime;
+		sig->stime += stime;
 		sig->gtime += tsk->gtime;
 		sig->min_flt += tsk->min_flt;
 		sig->maj_flt += tsk->maj_flt;
diff --git a/kernel/posix-cpu-timers.c b/kernel/posix-cpu-timers.c
index d738402..3d58bd5 100644
--- a/kernel/posix-cpu-timers.c
+++ b/kernel/posix-cpu-timers.c
@@ -154,11 +154,19 @@ static void bump_cpu_timer(struct k_itimer *timer,
 
 static inline cputime_t prof_ticks(struct task_struct *p)
 {
-	return p->utime + p->stime;
+	cputime_t utime, stime;
+
+	task_cputime(p, &utime, &stime);
+
+	return utime + stime;
 }
 static inline cputime_t virt_ticks(struct task_struct *p)
 {
-	return p->utime;
+	cputime_t utime;
+
+	task_cputime(p, &utime, NULL);
+
+	return utime;
 }
 
 static int
@@ -470,16 +478,21 @@ static void cleanup_timers(struct list_head *head,
  */
 void posix_cpu_timers_exit(struct task_struct *tsk)
 {
+	cputime_t utime, stime;
+
+	task_cputime(tsk, &utime, &stime);
 	cleanup_timers(tsk->cpu_timers,
-		       tsk->utime, tsk->stime, tsk->se.sum_exec_runtime);
+		       utime, stime, tsk->se.sum_exec_runtime);
 
 }
 void posix_cpu_timers_exit_group(struct task_struct *tsk)
 {
 	struct signal_struct *const sig = tsk->signal;
+	cputime_t utime, stime;
 
+	task_cputime(tsk, &utime, &stime);
 	cleanup_timers(tsk->signal->cpu_timers,
-		       tsk->utime + sig->utime, tsk->stime + sig->stime,
+		       utime + sig->utime, stime + sig->stime,
 		       tsk->se.sum_exec_runtime + sig->sum_sched_runtime);
 }
 
@@ -1223,11 +1236,14 @@ static inline int task_cputime_expired(const struct task_cputime *sample,
 static inline int fastpath_timer_check(struct task_struct *tsk)
 {
 	struct signal_struct *sig;
+	cputime_t utime, stime;
+
+	task_cputime(tsk, &utime, &stime);
 
 	if (!task_cputime_zero(&tsk->cputime_expires)) {
 		struct task_cputime task_sample = {
-			.utime = tsk->utime,
-			.stime = tsk->stime,
+			.utime = utime,
+			.stime = stime,
 			.sum_exec_runtime = tsk->se.sum_exec_runtime
 		};
 
diff --git a/kernel/sched/cputime.c b/kernel/sched/cputime.c
index e1fcab4..0603671 100644
--- a/kernel/sched/cputime.c
+++ b/kernel/sched/cputime.c
@@ -296,6 +296,7 @@ static __always_inline bool steal_account_process_tick(void)
 void thread_group_cputime(struct task_struct *tsk, struct task_cputime *times)
 {
 	struct signal_struct *sig = tsk->signal;
+	cputime_t utime, stime;
 	struct task_struct *t;
 
 	times->utime = sig->utime;
@@ -309,8 +310,9 @@ void thread_group_cputime(struct task_struct *tsk, struct task_cputime *times)
 
 	t = tsk;
 	do {
-		times->utime += t->utime;
-		times->stime += t->stime;
+		task_cputime(tsk, &utime, &stime);
+		times->utime += utime;
+		times->stime += stime;
 		times->sum_exec_runtime += task_sched_runtime(t);
 	} while_each_thread(tsk, t);
 out:
@@ -594,11 +596,10 @@ static void cputime_adjust(struct task_cputime *curr,
 void task_cputime_adjusted(struct task_struct *p, cputime_t *ut, cputime_t *st)
 {
 	struct task_cputime cputime = {
-		.utime = p->utime,
-		.stime = p->stime,
 		.sum_exec_runtime = p->se.sum_exec_runtime,
 	};
 
+	task_cputime(p, &cputime.utime, &cputime.stime);
 	cputime_adjust(&cputime, &p->prev_cputime, ut, st);
 }
 
diff --git a/kernel/signal.c b/kernel/signal.c
index a49c7f3..bc9e5cd 100644
--- a/kernel/signal.c
+++ b/kernel/signal.c
@@ -1637,6 +1637,7 @@ bool do_notify_parent(struct task_struct *tsk, int sig)
 	unsigned long flags;
 	struct sighand_struct *psig;
 	bool autoreap = false;
+	cputime_t utime, stime;
 
 	BUG_ON(sig == -1);
 
@@ -1674,8 +1675,9 @@ bool do_notify_parent(struct task_struct *tsk, int sig)
 				       task_uid(tsk));
 	rcu_read_unlock();
 
-	info.si_utime = cputime_to_clock_t(tsk->utime + tsk->signal->utime);
-	info.si_stime = cputime_to_clock_t(tsk->stime + tsk->signal->stime);
+	task_cputime(tsk, &utime, &stime);
+	info.si_utime = cputime_to_clock_t(utime + tsk->signal->utime);
+	info.si_stime = cputime_to_clock_t(stime + tsk->signal->stime);
 
 	info.si_status = tsk->exit_code & 0x7f;
 	if (tsk->exit_code & 0x80)
@@ -1739,6 +1741,7 @@ static void do_notify_parent_cldstop(struct task_struct *tsk,
 	unsigned long flags;
 	struct task_struct *parent;
 	struct sighand_struct *sighand;
+	cputime_t utime, stime;
 
 	if (for_ptracer) {
 		parent = tsk->parent;
@@ -1757,8 +1760,9 @@ static void do_notify_parent_cldstop(struct task_struct *tsk,
 	info.si_uid = from_kuid_munged(task_cred_xxx(parent, user_ns), task_uid(tsk));
 	rcu_read_unlock();
 
-	info.si_utime = cputime_to_clock_t(tsk->utime);
-	info.si_stime = cputime_to_clock_t(tsk->stime);
+	task_cputime(tsk, &utime, &stime);
+	info.si_utime = cputime_to_clock_t(utime);
+	info.si_stime = cputime_to_clock_t(stime);
 
  	info.si_code = why;
  	switch (why) {
diff --git a/kernel/tsacct.c b/kernel/tsacct.c
index 625df0b..017181f 100644
--- a/kernel/tsacct.c
+++ b/kernel/tsacct.c
@@ -32,6 +32,7 @@ void bacct_add_tsk(struct user_namespace *user_ns,
 {
 	const struct cred *tcred;
 	struct timespec uptime, ts;
+	cputime_t utime, stime, utimescaled, stimescaled;
 	u64 ac_etime;
 
 	BUILD_BUG_ON(TS_COMM_LEN < TASK_COMM_LEN);
@@ -65,10 +66,15 @@ void bacct_add_tsk(struct user_namespace *user_ns,
 	stats->ac_ppid	 = pid_alive(tsk) ?
 		task_tgid_nr_ns(rcu_dereference(tsk->real_parent), pid_ns) : 0;
 	rcu_read_unlock();
-	stats->ac_utime = cputime_to_usecs(tsk->utime);
-	stats->ac_stime = cputime_to_usecs(tsk->stime);
-	stats->ac_utimescaled = cputime_to_usecs(tsk->utimescaled);
-	stats->ac_stimescaled = cputime_to_usecs(tsk->stimescaled);
+
+	task_cputime(tsk, &utime, &stime);
+	stats->ac_utime = cputime_to_usecs(utime);
+	stats->ac_stime = cputime_to_usecs(stime);
+
+	task_cputime_scaled(tsk, &utimescaled, &stimescaled);
+	stats->ac_utimescaled = cputime_to_usecs(utimescaled);
+	stats->ac_stimescaled = cputime_to_usecs(stimescaled);
+
 	stats->ac_minflt = tsk->min_flt;
 	stats->ac_majflt = tsk->maj_flt;
 
@@ -122,13 +128,14 @@ void xacct_add_tsk(struct taskstats *stats, struct task_struct *p)
 void acct_update_integrals(struct task_struct *tsk)
 {
 	if (likely(tsk->mm)) {
-		cputime_t time, dtime;
+		cputime_t time, dtime, stime, utime;
 		struct timeval value;
 		unsigned long flags;
 		u64 delta;
 
 		local_irq_save(flags);
-		time = tsk->stime + tsk->utime;
+		task_cputime(tsk, &utime, &stime);
+		time = stime + utime;
 		dtime = time - tsk->acct_timexpd;
 		jiffies_to_timeval(cputime_to_jiffies(dtime), &value);
 		delta = value.tv_sec;
-- 
1.7.5.4



* [PATCH 05/24] cputime: Safely read cputime of full dynticks CPUs
From: Frederic Weisbecker @ 2012-12-20 18:32 UTC
  To: LKML
  Cc: Frederic Weisbecker, Alessio Igor Bogani, Andrew Morton,
	Avi Kivity, Chris Metcalf, Christoph Lameter, Geoff Levand,
	Gilad Ben Yossef, Hakan Akkan, Ingo Molnar, Paul E. McKenney,
	Paul Gortmaker, Peter Zijlstra, Steven Rostedt, Thomas Gleixner

While remotely reading the cputime of a task running on a
full dynticks CPU, the values stored in the utime/stime fields
of struct task_struct may be stale. They may reflect the
last kernel <-> user transition snapshot, so we need to add
the tickless time spent since that snapshot.

To fix this, flush the cputime of the dynticks CPUs on
kernel <-> user transitions and record the time / context
where we did this. Then, on top of this snapshot and the current
time, perform the fixup on the reader side from the task_times()
accessors.

FIXME: do the same for idle and guest time.

Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Alessio Igor Bogani <abogani@kernel.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Avi Kivity <avi@redhat.com>
Cc: Chris Metcalf <cmetcalf@tilera.com>
Cc: Christoph Lameter <cl@linux.com>
Cc: Geoff Levand <geoff@infradead.org>
Cc: Gilad Ben Yossef <gilad@benyossef.com>
Cc: Hakan Akkan <hakanakkan@gmail.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Cc: Paul Gortmaker <paul.gortmaker@windriver.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
---
 arch/s390/kernel/vtime.c      |    4 +-
 include/asm-generic/cputime.h |    1 +
 include/linux/hardirq.h       |    4 +-
 include/linux/init_task.h     |    9 +++
 include/linux/sched.h         |   16 +++++
 include/linux/vtime.h         |   40 +++++++-------
 kernel/context_tracking.c     |    2 +-
 kernel/fork.c                 |    6 ++
 kernel/sched/cputime.c        |  123 ++++++++++++++++++++++++++++++-----------
 kernel/softirq.c              |    6 +-
 10 files changed, 151 insertions(+), 60 deletions(-)

diff --git a/arch/s390/kernel/vtime.c b/arch/s390/kernel/vtime.c
index e84b8b6..e1718fb 100644
--- a/arch/s390/kernel/vtime.c
+++ b/arch/s390/kernel/vtime.c
@@ -127,7 +127,7 @@ void vtime_account_user(struct task_struct *tsk)
  * Update process times based on virtual cpu times stored by entry.S
  * to the lowcore fields user_timer, system_timer & steal_clock.
  */
-void vtime_account(struct task_struct *tsk)
+void vtime_account_irq_enter(struct task_struct *tsk)
 {
 	struct thread_info *ti = task_thread_info(tsk);
 	u64 timer, system;
@@ -148,7 +148,7 @@ void vtime_account(struct task_struct *tsk)
 EXPORT_SYMBOL_GPL(vtime_account);
 
 void vtime_account_system(struct task_struct *tsk)
-__attribute__((alias("vtime_account")));
+__attribute__((alias("vtime_account_irq_enter")));
 EXPORT_SYMBOL_GPL(vtime_account_system);
 
 void __kprobes vtime_stop_cpu(void)
diff --git a/include/asm-generic/cputime.h b/include/asm-generic/cputime.h
index 9a62937..3e704d5 100644
--- a/include/asm-generic/cputime.h
+++ b/include/asm-generic/cputime.h
@@ -10,6 +10,7 @@ typedef unsigned long __nocast cputime_t;
 #define cputime_to_jiffies(__ct)	(__force unsigned long)(__ct)
 #define cputime_to_scaled(__ct)		(__ct)
 #define jiffies_to_cputime(__hz)	(__force cputime_t)(__hz)
+#define jiffies_to_scaled(__hz)		(__force cputime_t)(__hz)
 
 typedef u64 __nocast cputime64_t;
 
diff --git a/include/linux/hardirq.h b/include/linux/hardirq.h
index 624ef3f..7105d5c 100644
--- a/include/linux/hardirq.h
+++ b/include/linux/hardirq.h
@@ -153,7 +153,7 @@ extern void rcu_nmi_exit(void);
  */
 #define __irq_enter()					\
 	do {						\
-		vtime_account_irq_enter(current);	\
+		account_irq_enter_time(current);	\
 		add_preempt_count(HARDIRQ_OFFSET);	\
 		trace_hardirq_enter();			\
 	} while (0)
@@ -169,7 +169,7 @@ extern void irq_enter(void);
 #define __irq_exit()					\
 	do {						\
 		trace_hardirq_exit();			\
-		vtime_account_irq_exit(current);	\
+		account_irq_exit_time(current);		\
 		sub_preempt_count(HARDIRQ_OFFSET);	\
 	} while (0)
 
diff --git a/include/linux/init_task.h b/include/linux/init_task.h
index 6d087c5..870f13e 100644
--- a/include/linux/init_task.h
+++ b/include/linux/init_task.h
@@ -10,6 +10,7 @@
 #include <linux/pid_namespace.h>
 #include <linux/user_namespace.h>
 #include <linux/securebits.h>
+#include <linux/seqlock.h>
 #include <net/net_namespace.h>
 
 #ifdef CONFIG_SMP
@@ -141,6 +142,13 @@ extern struct task_group root_task_group;
 # define INIT_PERF_EVENTS(tsk)
 #endif
 
+#ifdef CONFIG_VIRT_CPU_ACCOUNTING_GEN
+#define INIT_VTIME(tsk)						\
+	.vtime_seqlock = __SEQLOCK_UNLOCKED(tsk.vtime_seqlock),	\
+	.prev_jiffies = INITIAL_JIFFIES, /* CHECKME */		\
+	.prev_jiffies_whence = JIFFIES_SYS,
+#endif
+
 #define INIT_TASK_COMM "swapper"
 
 /*
@@ -210,6 +218,7 @@ extern struct task_group root_task_group;
 	INIT_TRACE_RECURSION						\
 	INIT_TASK_RCU_PREEMPT(tsk)					\
 	INIT_CPUSET_SEQ							\
+	INIT_VTIME(tsk)							\
 }
 
 
diff --git a/include/linux/sched.h b/include/linux/sched.h
index 031afd0..727b988 100644
--- a/include/linux/sched.h
+++ b/include/linux/sched.h
@@ -1360,6 +1360,15 @@ struct task_struct {
 #ifndef CONFIG_VIRT_CPU_ACCOUNTING_NATIVE
 	struct cputime prev_cputime;
 #endif
+#ifdef CONFIG_VIRT_CPU_ACCOUNTING_GEN
+	seqlock_t vtime_seqlock;
+	long prev_jiffies;
+	enum {
+		JIFFIES_SLEEPING = 0,
+		JIFFIES_USER,
+		JIFFIES_SYS,
+	} prev_jiffies_whence;
+#endif
 	unsigned long nvcsw, nivcsw; /* context switch counts */
 	struct timespec start_time; 		/* monotonic time */
 	struct timespec real_start_time;	/* boot based time */
@@ -1769,6 +1778,12 @@ static inline void put_task_struct(struct task_struct *t)
 		__put_task_struct(t);
 }
 
+#ifdef CONFIG_VIRT_CPU_ACCOUNTING_GEN
+extern void task_cputime(struct task_struct *t,
+			 cputime_t *utime, cputime_t *stime);
+extern void task_cputime_scaled(struct task_struct *t,
+				cputime_t *utimescaled, cputime_t *stimescaled);
+#else
 static inline void task_cputime(struct task_struct *t,
 				cputime_t *utime, cputime_t *stime)
 {
@@ -1787,6 +1802,7 @@ static inline void task_cputime_scaled(struct task_struct *t,
 	if (stimescaled)
 		*stimescaled = t->stimescaled;
 }
+#endif
 extern void task_cputime_adjusted(struct task_struct *p, cputime_t *ut, cputime_t *st);
 extern void thread_group_cputime_adjusted(struct task_struct *p, cputime_t *ut, cputime_t *st);
 
diff --git a/include/linux/vtime.h b/include/linux/vtime.h
index e57020d..81c7d84 100644
--- a/include/linux/vtime.h
+++ b/include/linux/vtime.h
@@ -9,52 +9,52 @@ extern void vtime_account_system(struct task_struct *tsk);
 extern void vtime_account_system_irqsafe(struct task_struct *tsk);
 extern void vtime_account_idle(struct task_struct *tsk);
 extern void vtime_account_user(struct task_struct *tsk);
-extern void vtime_account(struct task_struct *tsk);
+extern void vtime_account_irq_enter(struct task_struct *tsk);
 
-#ifdef CONFIG_VIRT_CPU_ACCOUNTING_GEN
-extern bool vtime_accounting(void);
-#else
+#ifdef CONFIG_VIRT_CPU_ACCOUNTING_NATIVE
 static inline bool vtime_accounting(void) { return true; }
 #endif
 
 #else /* !CONFIG_VIRT_CPU_ACCOUNTING */
+
 static inline void vtime_task_switch(struct task_struct *prev) { }
 static inline void vtime_account_system(struct task_struct *tsk) { }
 static inline void vtime_account_system_irqsafe(struct task_struct *tsk) { }
 static inline void vtime_account_user(struct task_struct *tsk) { }
-static inline void vtime_account(struct task_struct *tsk) { }
+static inline void vtime_account_irq_enter(struct task_struct *tsk) { }
 static inline bool vtime_accounting(void) { return false; }
 #endif
 
 #ifdef CONFIG_VIRT_CPU_ACCOUNTING_GEN
-static inline void arch_vtime_task_switch(struct task_struct *tsk) { }
+extern void arch_vtime_task_switch(struct task_struct *tsk);
+extern void vtime_account_irq_exit(struct task_struct *tsk);
+extern void vtime_user_enter(struct task_struct *tsk);
+extern bool vtime_accounting(void);
+#else
+static inline void vtime_account_irq_exit(struct task_struct *tsk)
+{
+	/* On hard|softirq exit we always account to hard|softirq cputime */
+	vtime_account_system(tsk);
+}
+static inline void vtime_enter_user(struct task_struct *tsk) { }
 #endif
 
+
 #ifdef CONFIG_IRQ_TIME_ACCOUNTING
 extern void irqtime_account_irq(struct task_struct *tsk);
 #else
 static inline void irqtime_account_irq(struct task_struct *tsk) { }
 #endif
 
-static inline void vtime_account_irq_enter(struct task_struct *tsk)
+static inline void account_irq_enter_time(struct task_struct *tsk)
 {
-	/*
-	 * Hardirq can interrupt idle task anytime. So we need vtime_account()
-	 * that performs the idle check in CONFIG_VIRT_CPU_ACCOUNTING.
-	 * Softirq can also interrupt idle task directly if it calls
-	 * local_bh_enable(). Such case probably don't exist but we never know.
-	 * Ksoftirqd is not concerned because idle time is flushed on context
-	 * switch. Softirqs in the end of hardirqs are also not a problem because
-	 * the idle time is flushed on hardirq time already.
-	 */
-	vtime_account(tsk);
+	vtime_account_irq_enter(tsk);
 	irqtime_account_irq(tsk);
 }
 
-static inline void vtime_account_irq_exit(struct task_struct *tsk)
+static inline void account_irq_exit_time(struct task_struct *tsk)
 {
-	/* On hard|softirq exit we always account to hard|softirq cputime */
-	vtime_account_system(tsk);
+	vtime_account_irq_exit(tsk);
 	irqtime_account_irq(tsk);
 }
 
diff --git a/kernel/context_tracking.c b/kernel/context_tracking.c
index ca1e073..bd2f2fc 100644
--- a/kernel/context_tracking.c
+++ b/kernel/context_tracking.c
@@ -56,7 +56,7 @@ void user_enter(void)
 	local_irq_save(flags);
 	if (__this_cpu_read(context_tracking.active) &&
 	    __this_cpu_read(context_tracking.state) != IN_USER) {
-		vtime_account_system(current);
+		vtime_user_enter(current);
 		/*
 		 * At this stage, only low level arch entry code remains and
 		 * then we'll run in userspace. We can assume there won't be
diff --git a/kernel/fork.c b/kernel/fork.c
index a81efb8..efafcba 100644
--- a/kernel/fork.c
+++ b/kernel/fork.c
@@ -1224,6 +1224,12 @@ static struct task_struct *copy_process(unsigned long clone_flags,
 #ifndef CONFIG_VIRT_CPU_ACCOUNTING_NATIVE
 	p->prev_cputime.utime = p->prev_cputime.stime = 0;
 #endif
+#ifdef CONFIG_VIRT_CPU_ACCOUNTING_GEN
+	seqlock_init(&p->vtime_seqlock);
+	p->prev_jiffies_whence = JIFFIES_SLEEPING; /*CHECKME: idle tasks? */
+	p->prev_jiffies = jiffies;
+#endif
+
 #if defined(SPLIT_RSS_COUNTING)
 	memset(&p->rss_stat, 0, sizeof(p->rss_stat));
 #endif
diff --git a/kernel/sched/cputime.c b/kernel/sched/cputime.c
index 0603671..3f25e60 100644
--- a/kernel/sched/cputime.c
+++ b/kernel/sched/cputime.c
@@ -484,7 +484,7 @@ void vtime_task_switch(struct task_struct *prev)
  * vtime_account().
  */
 #ifndef __ARCH_HAS_VTIME_ACCOUNT
-void vtime_account(struct task_struct *tsk)
+void vtime_account_irq_enter(struct task_struct *tsk)
 {
 	if (!in_interrupt()) {
 		/*
@@ -505,7 +505,7 @@ void vtime_account(struct task_struct *tsk)
 	}
 	vtime_account_system(tsk);
 }
-EXPORT_SYMBOL_GPL(vtime_account);
+EXPORT_SYMBOL_GPL(vtime_account_irq_enter);
 #endif /* __ARCH_HAS_VTIME_ACCOUNT */
 #endif /* CONFIG_VIRT_CPU_ACCOUNTING */
 
@@ -616,41 +616,67 @@ void thread_group_cputime_adjusted(struct task_struct *p, cputime_t *ut, cputime
 #endif /* !CONFIG_VIRT_CPU_ACCOUNTING_NATIVE */
 
 #ifdef CONFIG_VIRT_CPU_ACCOUNTING_GEN
-static DEFINE_PER_CPU(long, last_jiffies) = INITIAL_JIFFIES;
-
-static cputime_t get_vtime_delta(void)
+static cputime_t get_vtime_delta(struct task_struct *tsk)
 {
 	long delta;
 
-	delta = jiffies - __this_cpu_read(last_jiffies);
-	__this_cpu_add(last_jiffies, delta);
+	delta = jiffies - tsk->prev_jiffies;
+	tsk->prev_jiffies += delta;
 
 	return jiffies_to_cputime(delta);
 }
 
-void vtime_account_system(struct task_struct *tsk)
+static void __vtime_account_system(struct task_struct *tsk)
 {
-	cputime_t delta_cpu = get_vtime_delta();
+	cputime_t delta_cpu = get_vtime_delta(tsk);
 
 	account_system_time(tsk, irq_count(), delta_cpu, cputime_to_scaled(delta_cpu));
 }
 
+void vtime_account_system(struct task_struct *tsk)
+{
+	write_seqlock(&tsk->vtime_seqlock);
+	__vtime_account_system(tsk);
+	write_sequnlock(&tsk->vtime_seqlock);
+}
+
+void vtime_account_irq_exit(struct task_struct *tsk)
+{
+	write_seqlock(&tsk->vtime_seqlock);
+	if (context_tracking_in_user())
+		tsk->prev_jiffies_whence = JIFFIES_USER;
+	__vtime_account_system(tsk);
+	write_sequnlock(&tsk->vtime_seqlock);
+}
+
 void vtime_account_user(struct task_struct *tsk)
 {
-	cputime_t delta_cpu = get_vtime_delta();
+	cputime_t delta_cpu = get_vtime_delta(tsk);
 
 	/*
 	 * This is an unfortunate hack: if we flush user time only on
 	 * irq entry, we miss the jiffies update and the time is spuriously
 	 * accounted to system time.
 	 */
-	if (context_tracking_in_user())
+	if (context_tracking_in_user()) {
+		write_seqlock(&tsk->vtime_seqlock);
+		tsk->prev_jiffies_whence = JIFFIES_SYS;
 		account_user_time(tsk, delta_cpu, cputime_to_scaled(delta_cpu));
+		write_sequnlock(&tsk->vtime_seqlock);
+	}
+}
+
+void vtime_user_enter(struct task_struct *tsk)
+{
+	write_seqlock(&tsk->vtime_seqlock);
+	tsk->prev_jiffies_whence = JIFFIES_USER;
+	__vtime_account_system(tsk);
+	write_sequnlock(&tsk->vtime_seqlock);
 }
 
 void vtime_account_idle(struct task_struct *tsk)
 {
-	cputime_t delta_cpu = get_vtime_delta();
+	cputime_t delta_cpu = get_vtime_delta(tsk);
 
 	account_idle_time(delta_cpu);
 }
@@ -660,31 +686,64 @@ bool vtime_accounting(void)
 	return context_tracking_active();
 }
 
-static int __cpuinit vtime_cpu_notify(struct notifier_block *self,
-				      unsigned long action, void *hcpu)
+void arch_vtime_task_switch(struct task_struct *prev)
 {
-	long cpu = (long)hcpu;
-	long *last_jiffies_cpu = per_cpu_ptr(&last_jiffies, cpu);
+	write_seqlock(&prev->vtime_seqlock);
+	prev->prev_jiffies_whence = JIFFIES_SLEEPING;
+	write_sequnlock(&prev->vtime_seqlock);
 
-	switch (action) {
-	case CPU_UP_PREPARE:
-	case CPU_UP_PREPARE_FROZEN:
-		/*
-		 * CHECKME: ensure that's visible by the CPU
-		 * once it wakes up
-		 */
-		*last_jiffies_cpu = jiffies;
-	default:
-		break;
-	}
+	write_seqlock(&current->vtime_seqlock);
+	current->prev_jiffies_whence = JIFFIES_SYS;
+	current->prev_jiffies = jiffies;
+	write_sequnlock(&current->vtime_seqlock);
+}
+
+void task_cputime(struct task_struct *t, cputime_t *utime, cputime_t *stime)
+{
+	unsigned int seq;
+	long delta;
+
+	do {
+		seq = read_seqbegin(&t->vtime_seqlock);
+
+		*utime = t->utime;
+		*stime = t->stime;
+
+		if (t->prev_jiffies_whence == JIFFIES_SLEEPING || 
+		    is_idle_task(t))
+			continue;
 
-	return NOTIFY_OK;
+		delta = jiffies - t->prev_jiffies;
+
+		if (t->prev_jiffies_whence == JIFFIES_USER)
+			*utime += delta;
+		else if (t->prev_jiffies_whence == JIFFIES_SYS)
+			*stime += delta;
+	} while (read_seqretry(&t->vtime_seqlock, seq));
 }
 
-static int __init init_vtime(void)
+void task_cputime_scaled(struct task_struct *t,
+			 cputime_t *utimescaled, cputime_t *stimescaled)
 {
-	cpu_notifier(vtime_cpu_notify, 0);
-	return 0;
+	unsigned int seq;
+	long delta;
+
+	do {
+		seq = read_seqbegin(&t->vtime_seqlock);
+
+		*utimescaled = t->utimescaled;
+		*stimescaled = t->stimescaled;
+
+		if (t->prev_jiffies_whence == JIFFIES_SLEEPING || 
+		    is_idle_task(t))
+			continue;
+
+		delta = jiffies - t->prev_jiffies;
+
+		if (t->prev_jiffies_whence == JIFFIES_USER)
+			*utimescaled += jiffies_to_scaled(delta);
+		else if (t->prev_jiffies_whence == JIFFIES_SYS)
+			*stimescaled += jiffies_to_scaled(delta);
+	} while (read_seqretry(&t->vtime_seqlock, seq));
 }
-early_initcall(init_vtime);
 #endif /* CONFIG_VIRT_CPU_ACCOUNTING_GEN */
diff --git a/kernel/softirq.c b/kernel/softirq.c
index ed567ba..f5cc25f 100644
--- a/kernel/softirq.c
+++ b/kernel/softirq.c
@@ -221,7 +221,7 @@ asmlinkage void __do_softirq(void)
 	current->flags &= ~PF_MEMALLOC;
 
 	pending = local_softirq_pending();
-	vtime_account_irq_enter(current);
+	account_irq_enter_time(current);
 
 	__local_bh_disable((unsigned long)__builtin_return_address(0),
 				SOFTIRQ_OFFSET);
@@ -272,7 +272,7 @@ restart:
 
 	lockdep_softirq_exit();
 
-	vtime_account_irq_exit(current);
+	account_irq_exit_time(current);
 	__local_bh_enable(SOFTIRQ_OFFSET);
 	tsk_restore_flags(current, old_flags, PF_MEMALLOC);
 }
@@ -341,7 +341,7 @@ static inline void invoke_softirq(void)
  */
 void irq_exit(void)
 {
-	vtime_account_irq_exit(current);
+	account_irq_exit_time(current);
 	trace_hardirq_exit();
 	sub_preempt_count(IRQ_EXIT_OFFSET);
 	if (!in_interrupt() && local_softirq_pending())
-- 
1.7.5.4


^ permalink raw reply related	[flat|nested] 44+ messages in thread

* [PATCH 06/24] nohz: Basic full dynticks interface
  2012-12-20 18:32 [ANNOUNCE] 3.7-nohz1 Frederic Weisbecker
                   ` (4 preceding siblings ...)
  2012-12-20 18:32 ` [PATCH 05/24] cputime: Safely read cputime of full dynticks CPUs Frederic Weisbecker
@ 2012-12-20 18:32 ` Frederic Weisbecker
  2012-12-20 18:32 ` [PATCH 07/24] nohz: Assign timekeeping duty to a non-full-nohz CPU Frederic Weisbecker
                   ` (19 subsequent siblings)
  25 siblings, 0 replies; 44+ messages in thread
From: Frederic Weisbecker @ 2012-12-20 18:32 UTC (permalink / raw)
  To: LKML
  Cc: Frederic Weisbecker, Alessio Igor Bogani, Andrew Morton,
	Avi Kivity, Chris Metcalf, Christoph Lameter, Geoff Levand,
	Gilad Ben Yossef, Hakan Akkan, Ingo Molnar, Paul E. McKenney,
	Paul Gortmaker, Peter Zijlstra, Steven Rostedt, Thomas Gleixner

Start with a very simple interface to define the full dynticks CPUs:
a boot-time cpumask passed through the "full_nohz="
kernel parameter.

Make sure you keep at least one CPU outside this range to handle
the timekeeping.

Also, the full_nohz= value must match the rcu_nocbs= value.
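
For illustration, here is a rough userspace approximation of the cpulist
parsing that the "full_nohz=" value goes through (the function name and
the plain bitmask are made up for the example; the kernel uses
cpulist_parse() and a cpumask):

	#include <stdio.h>
	#include <stdlib.h>
	#include <string.h>

	/* Parse a cpulist string such as "1-3" or "1,3,5" into a bitmask. */
	static unsigned long parse_cpulist(const char *str)
	{
		unsigned long mask = 0;
		char *s = strdup(str), *tok, *save = NULL;

		for (tok = strtok_r(s, ",", &save); tok; tok = strtok_r(NULL, ",", &save)) {
			int lo, hi;

			if (sscanf(tok, "%d-%d", &lo, &hi) != 2)
				lo = hi = atoi(tok);
			for (int cpu = lo; cpu <= hi; cpu++)
				mask |= 1UL << cpu;
		}
		free(s);
		return mask;
	}

	int main(void)
	{
		/* full_nohz=1-3 on a 4 CPU box: CPUs 1, 2 and 3 are full dynticks. */
		printf("mask = 0x%lx\n", parse_cpulist("1-3"));	/* prints 0xe */
		return 0;
	}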

Suggested-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Alessio Igor Bogani <abogani@kernel.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Avi Kivity <avi@redhat.com>
Cc: Chris Metcalf <cmetcalf@tilera.com>
Cc: Christoph Lameter <cl@linux.com>
Cc: Geoff Levand <geoff@infradead.org>
Cc: Gilad Ben Yossef <gilad@benyossef.com>
Cc: Hakan Akkan <hakanakkan@gmail.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Cc: Paul Gortmaker <paul.gortmaker@windriver.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
---
 include/linux/tick.h     |    7 +++++++
 kernel/time/Kconfig      |    9 +++++++++
 kernel/time/tick-sched.c |   23 +++++++++++++++++++++++
 3 files changed, 39 insertions(+), 0 deletions(-)

diff --git a/include/linux/tick.h b/include/linux/tick.h
index 553272e..2d4f6f0 100644
--- a/include/linux/tick.h
+++ b/include/linux/tick.h
@@ -157,6 +157,13 @@ static inline u64 get_cpu_idle_time_us(int cpu, u64 *unused) { return -1; }
 static inline u64 get_cpu_iowait_time_us(int cpu, u64 *unused) { return -1; }
 # endif /* !NO_HZ */
 
+#ifdef CONFIG_NO_HZ_FULL
+int tick_nohz_full_cpu(int cpu);
+#else
+static inline int tick_nohz_full_cpu(int cpu) { return 0; }
+#endif
+
+
 # ifdef CONFIG_CPU_IDLE_GOV_MENU
 extern void menu_hrtimer_cancel(void);
 # else
diff --git a/kernel/time/Kconfig b/kernel/time/Kconfig
index 8601f0d..0a1bc72 100644
--- a/kernel/time/Kconfig
+++ b/kernel/time/Kconfig
@@ -70,6 +70,15 @@ config NO_HZ
 	  only trigger on an as-needed basis both when the system is
 	  busy and when the system is idle.
 
+config NO_HZ_FULL
+       bool "Full tickless system"
+       depends on NO_HZ && RCU_USER_QS && VIRT_CPU_ACCOUNTING_GEN && RCU_NOCB_CPU
+       select CONTEXT_TRACKING_FORCE
+       help
+         Try to be tickless everywhere, not just in idle. (You need
+         to set the full_nohz= boot parameter.)
+
+
 config HIGH_RES_TIMERS
 	bool "High Resolution Timer Support"
 	depends on !ARCH_USES_GETTIMEOFFSET && GENERIC_CLOCKEVENTS
diff --git a/kernel/time/tick-sched.c b/kernel/time/tick-sched.c
index ad0e6fa..fac9ba4 100644
--- a/kernel/time/tick-sched.c
+++ b/kernel/time/tick-sched.c
@@ -142,6 +142,29 @@ static void tick_sched_handle(struct tick_sched *ts, struct pt_regs *regs)
 	profile_tick(CPU_PROFILING);
 }
 
+#ifdef CONFIG_NO_HZ_FULL
+static cpumask_var_t full_nohz_mask;
+bool have_full_nohz_mask;
+
+int tick_nohz_full_cpu(int cpu)
+{
+	if (!have_full_nohz_mask)
+		return 0;
+
+	return cpumask_test_cpu(cpu, full_nohz_mask);
+}
+
+/* Parse the boot-time nohz CPU list from the kernel parameters. */
+static int __init tick_nohz_full_setup(char *str)
+{
+	alloc_bootmem_cpumask_var(&full_nohz_mask);
+	have_full_nohz_mask = true;
+	cpulist_parse(str, full_nohz_mask);
+	return 1;
+}
+__setup("full_nohz=", tick_nohz_full_setup);
+#endif
+
 /*
  * NOHZ - aka dynamic tick functionality
  */
-- 
1.7.5.4


^ permalink raw reply related	[flat|nested] 44+ messages in thread

* [PATCH 07/24] nohz: Assign timekeeping duty to a non-full-nohz CPU
  2012-12-20 18:32 [ANNOUNCE] 3.7-nohz1 Frederic Weisbecker
                   ` (5 preceding siblings ...)
  2012-12-20 18:32 ` [PATCH 06/24] nohz: Basic full dynticks interface Frederic Weisbecker
@ 2012-12-20 18:32 ` Frederic Weisbecker
  2012-12-21 16:13   ` Steven Rostedt
  2012-12-20 18:32 ` [PATCH 08/24] nohz: Trace timekeeping update Frederic Weisbecker
                   ` (18 subsequent siblings)
  25 siblings, 1 reply; 44+ messages in thread
From: Frederic Weisbecker @ 2012-12-20 18:32 UTC (permalink / raw)
  To: LKML
  Cc: Frederic Weisbecker, Alessio Igor Bogani, Andrew Morton,
	Avi Kivity, Chris Metcalf, Christoph Lameter, Geoff Levand,
	Gilad Ben Yossef, Hakan Akkan, Ingo Molnar, Paul E. McKenney,
	Paul Gortmaker, Peter Zijlstra, Steven Rostedt, Thomas Gleixner

This way the full nohz CPUs can safely run with the tick
stopped, with a guarantee that somebody else is taking
care of the jiffies and GTOD progression.

NOTE: this doesn't handle CPU hotplug. Also, we could do something
more elaborate with respect to power saving if more than one non-full-nohz
CPU is running. But let's use this KISS solution for now.
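
To make the assignment rule concrete, here is a small standalone sketch
of the duty hand-off this patch implements. The helpers and the CPU walk
below are made up for the example; the real changes are in the diff that
follows:

	#include <stdio.h>

	#define TICK_DO_TIMER_NONE	-1

	/* Made-up stand-in: CPUs 1-3 are full dynticks, CPU 0 is not. */
	static int full_nohz_cpu(int cpu)
	{
		return cpu >= 1 && cpu <= 3;
	}

	/*
	 * Only a CPU outside the full_nohz range may grab the do_timer duty,
	 * so the timekeeping always ends up on a housekeeping CPU (CPU 0 here).
	 */
	static int take_do_timer_duty(int owner, int cpu)
	{
		if (owner != TICK_DO_TIMER_NONE)
			return owner;			/* duty already assigned */
		if (full_nohz_cpu(cpu))
			return TICK_DO_TIMER_NONE;	/* full nohz CPUs never take it */
		return cpu;
	}

	int main(void)
	{
		int owner = TICK_DO_TIMER_NONE;

		/* Walk the CPUs in an arbitrary order: only CPU 0 can take the duty. */
		for (int cpu = 3; cpu >= 0; cpu--) {
			owner = take_do_timer_duty(owner, cpu);
			printf("cpu%d checked, do_timer owner: %d\n", cpu, owner);
		}
		return 0;
	}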

Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Alessio Igor Bogani <abogani@kernel.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Avi Kivity <avi@redhat.com>
Cc: Chris Metcalf <cmetcalf@tilera.com>
Cc: Christoph Lameter <cl@linux.com>
Cc: Geoff Levand <geoff@infradead.org>
Cc: Gilad Ben Yossef <gilad@benyossef.com>
Cc: Hakan Akkan <hakanakkan@gmail.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Cc: Paul Gortmaker <paul.gortmaker@windriver.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
---
 kernel/time/tick-broadcast.c |    3 ++-
 kernel/time/tick-common.c    |    5 ++++-
 kernel/time/tick-sched.c     |    7 ++++++-
 3 files changed, 12 insertions(+), 3 deletions(-)

diff --git a/kernel/time/tick-broadcast.c b/kernel/time/tick-broadcast.c
index f113755..596c547 100644
--- a/kernel/time/tick-broadcast.c
+++ b/kernel/time/tick-broadcast.c
@@ -537,7 +537,8 @@ void tick_broadcast_setup_oneshot(struct clock_event_device *bc)
 		bc->event_handler = tick_handle_oneshot_broadcast;
 
 		/* Take the do_timer update */
-		tick_do_timer_cpu = cpu;
+		if (!tick_nohz_full_cpu(cpu))
+			tick_do_timer_cpu = cpu;
 
 		/*
 		 * We must be careful here. There might be other CPUs
diff --git a/kernel/time/tick-common.c b/kernel/time/tick-common.c
index b1600a6..83f2bd9 100644
--- a/kernel/time/tick-common.c
+++ b/kernel/time/tick-common.c
@@ -163,7 +163,10 @@ static void tick_setup_device(struct tick_device *td,
 		 * this cpu:
 		 */
 		if (tick_do_timer_cpu == TICK_DO_TIMER_BOOT) {
-			tick_do_timer_cpu = cpu;
+			if (!tick_nohz_full_cpu(cpu))
+				tick_do_timer_cpu = cpu;
+			else
+				tick_do_timer_cpu = TICK_DO_TIMER_NONE;
 			tick_next_period = ktime_get();
 			tick_period = ktime_set(0, NSEC_PER_SEC / HZ);
 		}
diff --git a/kernel/time/tick-sched.c b/kernel/time/tick-sched.c
index fac9ba4..4a68b50 100644
--- a/kernel/time/tick-sched.c
+++ b/kernel/time/tick-sched.c
@@ -112,7 +112,8 @@ static void tick_sched_do_timer(ktime_t now)
 	 * this duty, then the jiffies update is still serialized by
 	 * jiffies_lock.
 	 */
-	if (unlikely(tick_do_timer_cpu == TICK_DO_TIMER_NONE))
+	if (unlikely(tick_do_timer_cpu == TICK_DO_TIMER_NONE)
+	    && !tick_nohz_full_cpu(cpu))
 		tick_do_timer_cpu = cpu;
 #endif
 
@@ -512,6 +513,10 @@ static bool can_stop_idle_tick(int cpu, struct tick_sched *ts)
 		return false;
 	}
 
+	/* If there are full nohz CPUs around, we need to keep the timekeeping duty */
+	if (have_full_nohz_mask && tick_do_timer_cpu == cpu)
+		return false;
+
 	return true;
 }
 
-- 
1.7.5.4


^ permalink raw reply related	[flat|nested] 44+ messages in thread

* [PATCH 08/24] nohz: Trace timekeeping update
  2012-12-20 18:32 [ANNOUNCE] 3.7-nohz1 Frederic Weisbecker
                   ` (6 preceding siblings ...)
  2012-12-20 18:32 ` [PATCH 07/24] nohz: Assign timekeeping duty to a non-full-nohz CPU Frederic Weisbecker
@ 2012-12-20 18:32 ` Frederic Weisbecker
  2012-12-20 18:32 ` [PATCH 09/24] nohz: Wake up full dynticks CPUs when a timer gets enqueued Frederic Weisbecker
                   ` (17 subsequent siblings)
  25 siblings, 0 replies; 44+ messages in thread
From: Frederic Weisbecker @ 2012-12-20 18:32 UTC (permalink / raw)
  To: LKML
  Cc: Frederic Weisbecker, Alessio Igor Bogani, Andrew Morton,
	Avi Kivity, Chris Metcalf, Christoph Lameter, Geoff Levand,
	Gilad Ben Yossef, Hakan Akkan, Ingo Molnar, Paul E. McKenney,
	Paul Gortmaker, Peter Zijlstra, Steven Rostedt, Thomas Gleixner

Not for merge. This may become a real tracepoint.

Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Alessio Igor Bogani <abogani@kernel.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Avi Kivity <avi@redhat.com>
Cc: Chris Metcalf <cmetcalf@tilera.com>
Cc: Christoph Lameter <cl@linux.com>
Cc: Geoff Levand <geoff@infradead.org>
Cc: Gilad Ben Yossef <gilad@benyossef.com>
Cc: Hakan Akkan <hakanakkan@gmail.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Cc: Paul Gortmaker <paul.gortmaker@windriver.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
---
 kernel/time/tick-sched.c |    4 +++-
 1 files changed, 3 insertions(+), 1 deletions(-)

diff --git a/kernel/time/tick-sched.c b/kernel/time/tick-sched.c
index 4a68b50..73f339b 100644
--- a/kernel/time/tick-sched.c
+++ b/kernel/time/tick-sched.c
@@ -118,8 +118,10 @@ static void tick_sched_do_timer(ktime_t now)
 #endif
 
 	/* Check, if the jiffies need an update */
-	if (tick_do_timer_cpu == cpu)
+	if (tick_do_timer_cpu == cpu) {
+		trace_printk("do timekeeping\n");
 		tick_do_update_jiffies64(now);
+	}
 }
 
 static void tick_sched_handle(struct tick_sched *ts, struct pt_regs *regs)
-- 
1.7.5.4


^ permalink raw reply related	[flat|nested] 44+ messages in thread

* [PATCH 09/24] nohz: Wake up full dynticks CPUs when a timer gets enqueued
  2012-12-20 18:32 [ANNOUNCE] 3.7-nohz1 Frederic Weisbecker
                   ` (7 preceding siblings ...)
  2012-12-20 18:32 ` [PATCH 08/24] nohz: Trace timekeeping update Frederic Weisbecker
@ 2012-12-20 18:32 ` Frederic Weisbecker
  2012-12-20 18:32 ` [PATCH 10/24] rcu: Restart the tick on non-responding full dynticks CPUs Frederic Weisbecker
                   ` (16 subsequent siblings)
  25 siblings, 0 replies; 44+ messages in thread
From: Frederic Weisbecker @ 2012-12-20 18:32 UTC (permalink / raw)
  To: LKML
  Cc: Frederic Weisbecker, Alessio Igor Bogani, Andrew Morton,
	Avi Kivity, Chris Metcalf, Christoph Lameter, Geoff Levand,
	Gilad Ben Yossef, Hakan Akkan, Ingo Molnar, Paul E. McKenney,
	Paul Gortmaker, Peter Zijlstra, Steven Rostedt, Thomas Gleixner

Wake up a CPU when a timer list timer is enqueued there while
the CPU is in full dynticks mode. Sending it an IPI makes it
reconsider the next timer to program on top of the recent
updates.

Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Alessio Igor Bogani <abogani@kernel.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Avi Kivity <avi@redhat.com>
Cc: Chris Metcalf <cmetcalf@tilera.com>
Cc: Christoph Lameter <cl@linux.com>
Cc: Geoff Levand <geoff@infradead.org>
Cc: Gilad Ben Yossef <gilad@benyossef.com>
Cc: Hakan Akkan <hakanakkan@gmail.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Cc: Paul Gortmaker <paul.gortmaker@windriver.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
---
 include/linux/sched.h |    4 ++--
 kernel/sched/core.c   |   18 +++++++++++++++++-
 kernel/timer.c        |    2 +-
 3 files changed, 20 insertions(+), 4 deletions(-)

diff --git a/include/linux/sched.h b/include/linux/sched.h
index 727b988..8a89dc6 100644
--- a/include/linux/sched.h
+++ b/include/linux/sched.h
@@ -2038,9 +2038,9 @@ static inline void idle_task_exit(void) {}
 #endif
 
 #if defined(CONFIG_NO_HZ) && defined(CONFIG_SMP)
-extern void wake_up_idle_cpu(int cpu);
+extern void wake_up_nohz_cpu(int cpu);
 #else
-static inline void wake_up_idle_cpu(int cpu) { }
+static inline void wake_up_nohz_cpu(int cpu) { }
 #endif
 
 extern unsigned int sysctl_sched_latency;
diff --git a/kernel/sched/core.c b/kernel/sched/core.c
index 6271b89..1ca0a66 100644
--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -579,7 +579,7 @@ unlock:
  * account when the CPU goes back to idle and evaluates the timer
  * wheel for the next timer event.
  */
-void wake_up_idle_cpu(int cpu)
+static void wake_up_idle_cpu(int cpu)
 {
 	struct rq *rq = cpu_rq(cpu);
 
@@ -609,6 +609,22 @@ void wake_up_idle_cpu(int cpu)
 		smp_send_reschedule(cpu);
 }
 
+static bool wake_up_full_nohz_cpu(int cpu)
+{
+	if (tick_nohz_full_cpu(cpu)) {
+		smp_send_reschedule(cpu);
+		return true;
+	}
+
+	return false;
+}
+
+void wake_up_nohz_cpu(int cpu)
+{
+	if (!wake_up_full_nohz_cpu(cpu))
+		wake_up_idle_cpu(cpu);
+}
+
 static inline bool got_nohz_idle_kick(void)
 {
 	int cpu = smp_processor_id();
diff --git a/kernel/timer.c b/kernel/timer.c
index ff3b516..970b57d 100644
--- a/kernel/timer.c
+++ b/kernel/timer.c
@@ -936,7 +936,7 @@ void add_timer_on(struct timer_list *timer, int cpu)
 	 * makes sure that a CPU on the way to idle can not evaluate
 	 * the timer wheel.
 	 */
-	wake_up_idle_cpu(cpu);
+	wake_up_nohz_cpu(cpu);
 	spin_unlock_irqrestore(&base->lock, flags);
 }
 EXPORT_SYMBOL_GPL(add_timer_on);
-- 
1.7.5.4


^ permalink raw reply related	[flat|nested] 44+ messages in thread

* [PATCH 10/24] rcu: Restart the tick on non-responding full dynticks CPUs
  2012-12-20 18:32 [ANNOUNCE] 3.7-nohz1 Frederic Weisbecker
                   ` (8 preceding siblings ...)
  2012-12-20 18:32 ` [PATCH 09/24] nohz: Wake up full dynticks CPUs when a timer gets enqueued Frederic Weisbecker
@ 2012-12-20 18:32 ` Frederic Weisbecker
  2012-12-20 18:32 ` [PATCH 11/24] sched: Comment on rq->clock correctness in ttwu_do_wakeup() in nohz Frederic Weisbecker
                   ` (15 subsequent siblings)
  25 siblings, 0 replies; 44+ messages in thread
From: Frederic Weisbecker @ 2012-12-20 18:32 UTC (permalink / raw)
  To: LKML
  Cc: Frederic Weisbecker, Alessio Igor Bogani, Andrew Morton,
	Avi Kivity, Chris Metcalf, Christoph Lameter, Geoff Levand,
	Gilad Ben Yossef, Hakan Akkan, Ingo Molnar, Paul E. McKenney,
	Paul Gortmaker, Peter Zijlstra, Steven Rostedt, Thomas Gleixner

When a CPU in full dynticks mode doesn't respond in time to complete
a grace period, send it a specific IPI so that it restarts
the tick and chases a quiescent state.

Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Alessio Igor Bogani <abogani@kernel.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Avi Kivity <avi@redhat.com>
Cc: Chris Metcalf <cmetcalf@tilera.com>
Cc: Christoph Lameter <cl@linux.com>
Cc: Geoff Levand <geoff@infradead.org>
Cc: Gilad Ben Yossef <gilad@benyossef.com>
Cc: Hakan Akkan <hakanakkan@gmail.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Cc: Paul Gortmaker <paul.gortmaker@windriver.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
---
 kernel/rcutree.c |   10 ++++++++++
 1 files changed, 10 insertions(+), 0 deletions(-)

diff --git a/kernel/rcutree.c b/kernel/rcutree.c
index e441b77..302d360 100644
--- a/kernel/rcutree.c
+++ b/kernel/rcutree.c
@@ -53,6 +53,7 @@
 #include <linux/delay.h>
 #include <linux/stop_machine.h>
 #include <linux/random.h>
+#include <linux/tick.h>
 
 #include "rcutree.h"
 #include <trace/events/rcu.h>
@@ -743,6 +744,12 @@ static int dyntick_save_progress_counter(struct rcu_data *rdp)
 	return (rdp->dynticks_snap & 0x1) == 0;
 }
 
+static void rcu_kick_nohz_cpu(int cpu)
+{
+	if (tick_nohz_full_cpu(cpu))
+		smp_send_reschedule(cpu);
+}
+
 /*
  * Return true if the specified CPU has passed through a quiescent
  * state by virtue of being in or having passed through an dynticks
@@ -790,6 +797,9 @@ static int rcu_implicit_dynticks_qs(struct rcu_data *rdp)
 		rdp->offline_fqs++;
 		return 1;
 	}
+
+	rcu_kick_nohz_cpu(rdp->cpu);
+
 	return 0;
 }
 
-- 
1.7.5.4


^ permalink raw reply related	[flat|nested] 44+ messages in thread

* [PATCH 11/24] sched: Comment on rq->clock correctness in ttwu_do_wakeup() in nohz
  2012-12-20 18:32 [ANNOUNCE] 3.7-nohz1 Frederic Weisbecker
                   ` (9 preceding siblings ...)
  2012-12-20 18:32 ` [PATCH 10/24] rcu: Restart the tick on non-responding full dynticks CPUs Frederic Weisbecker
@ 2012-12-20 18:32 ` Frederic Weisbecker
  2012-12-20 18:32 ` [PATCH 12/24] sched: Update rq clock on nohz CPU before migrating tasks Frederic Weisbecker
                   ` (14 subsequent siblings)
  25 siblings, 0 replies; 44+ messages in thread
From: Frederic Weisbecker @ 2012-12-20 18:32 UTC (permalink / raw)
  To: LKML
  Cc: Frederic Weisbecker, Alessio Igor Bogani, Andrew Morton,
	Avi Kivity, Chris Metcalf, Christoph Lameter, Geoff Levand,
	Gilad Ben Yossef, Hakan Akkan, Ingo Molnar, Paul E. McKenney,
	Paul Gortmaker, Peter Zijlstra, Steven Rostedt, Thomas Gleixner

Just to avoid confusion.

Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Alessio Igor Bogani <abogani@kernel.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Avi Kivity <avi@redhat.com>
Cc: Chris Metcalf <cmetcalf@tilera.com>
Cc: Christoph Lameter <cl@linux.com>
Cc: Geoff Levand <geoff@infradead.org>
Cc: Gilad Ben Yossef <gilad@benyossef.com>
Cc: Hakan Akkan <hakanakkan@gmail.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Cc: Paul Gortmaker <paul.gortmaker@windriver.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
---
 kernel/sched/core.c |    6 ++++++
 1 files changed, 6 insertions(+), 0 deletions(-)

diff --git a/kernel/sched/core.c b/kernel/sched/core.c
index 1ca0a66..02bd005 100644
--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -1279,6 +1279,12 @@ ttwu_do_wakeup(struct rq *rq, struct task_struct *p, int wake_flags)
 	if (p->sched_class->task_woken)
 		p->sched_class->task_woken(rq, p);
 
+	/*
+	 * For the adaptive nohz case: we called ttwu_activate(),
+	 * which just updated the rq clock. There is an
+	 * exception when p->on_rq != 0, but in that case
+	 * we are not idle and rq->idle_stamp == 0.
+	 */
 	if (rq->idle_stamp) {
 		u64 delta = rq->clock - rq->idle_stamp;
 		u64 max = 2*sysctl_sched_migration_cost;
-- 
1.7.5.4


^ permalink raw reply related	[flat|nested] 44+ messages in thread

* [PATCH 12/24] sched: Update rq clock on nohz CPU before migrating tasks
  2012-12-20 18:32 [ANNOUNCE] 3.7-nohz1 Frederic Weisbecker
                   ` (10 preceding siblings ...)
  2012-12-20 18:32 ` [PATCH 11/24] sched: Comment on rq->clock correctness in ttwu_do_wakeup() in nohz Frederic Weisbecker
@ 2012-12-20 18:32 ` Frederic Weisbecker
  2012-12-20 18:33 ` [PATCH 13/24] sched: Update rq clock on nohz CPU before setting fair group shares Frederic Weisbecker
                   ` (13 subsequent siblings)
  25 siblings, 0 replies; 44+ messages in thread
From: Frederic Weisbecker @ 2012-12-20 18:32 UTC (permalink / raw)
  To: LKML
  Cc: Frederic Weisbecker, Alessio Igor Bogani, Andrew Morton,
	Avi Kivity, Chris Metcalf, Christoph Lameter, Geoff Levand,
	Gilad Ben Yossef, Hakan Akkan, Ingo Molnar, Paul E. McKenney,
	Paul Gortmaker, Peter Zijlstra, Steven Rostedt, Thomas Gleixner

The sched_class::put_prev_task() callbacks of the rt and fair
classes refer to the rq clock to update their runtime
statistics. A CPU running in tickless mode may carry a stale value,
so we need to update the clock there.

Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Alessio Igor Bogani <abogani@kernel.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Avi Kivity <avi@redhat.com>
Cc: Chris Metcalf <cmetcalf@tilera.com>
Cc: Christoph Lameter <cl@linux.com>
Cc: Geoff Levand <geoff@infradead.org>
Cc: Gilad Ben Yossef <gilad@benyossef.com>
Cc: Hakan Akkan <hakanakkan@gmail.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Cc: Paul Gortmaker <paul.gortmaker@windriver.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
---
 kernel/sched/core.c  |    6 ++++++
 kernel/sched/sched.h |    7 +++++++
 2 files changed, 13 insertions(+), 0 deletions(-)

diff --git a/kernel/sched/core.c b/kernel/sched/core.c
index 02bd005..5d9060a 100644
--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -4832,6 +4832,12 @@ static void migrate_tasks(unsigned int dead_cpu)
 	 */
 	rq->stop = NULL;
 
+	/*
+	 * ->put_prev_task() needs to have an up-to-date value
+	 * of rq->clock[_task]
+	 */
+	update_nohz_rq_clock(rq);
+
 	for ( ; ; ) {
 		/*
 		 * There's this thread running, bail when that's the only
diff --git a/kernel/sched/sched.h b/kernel/sched/sched.h
index 5eca173..db3d4df 100644
--- a/kernel/sched/sched.h
+++ b/kernel/sched/sched.h
@@ -3,6 +3,7 @@
 #include <linux/mutex.h>
 #include <linux/spinlock.h>
 #include <linux/stop_machine.h>
+#include <linux/tick.h>
 
 #include "cpupri.h"
 
@@ -951,6 +952,12 @@ static inline void dec_nr_running(struct rq *rq)
 
 extern void update_rq_clock(struct rq *rq);
 
+static inline void update_nohz_rq_clock(struct rq *rq)
+{
+	if (tick_nohz_full_cpu(cpu_of(rq)))
+		update_rq_clock(rq);
+}
+
 extern void activate_task(struct rq *rq, struct task_struct *p, int flags);
 extern void deactivate_task(struct rq *rq, struct task_struct *p, int flags);
 
-- 
1.7.5.4


^ permalink raw reply related	[flat|nested] 44+ messages in thread

* [PATCH 13/24] sched: Update rq clock on nohz CPU before setting fair group shares
  2012-12-20 18:32 [ANNOUNCE] 3.7-nohz1 Frederic Weisbecker
                   ` (11 preceding siblings ...)
  2012-12-20 18:32 ` [PATCH 12/24] sched: Update rq clock on nohz CPU before migrating tasks Frederic Weisbecker
@ 2012-12-20 18:33 ` Frederic Weisbecker
  2012-12-20 18:33 ` [PATCH 14/24] sched: Update rq clock on tickless CPUs before calling check_preempt_curr() Frederic Weisbecker
                   ` (12 subsequent siblings)
  25 siblings, 0 replies; 44+ messages in thread
From: Frederic Weisbecker @ 2012-12-20 18:33 UTC (permalink / raw)
  To: LKML
  Cc: Frederic Weisbecker, Alessio Igor Bogani, Andrew Morton,
	Avi Kivity, Chris Metcalf, Christoph Lameter, Geoff Levand,
	Gilad Ben Yossef, Hakan Akkan, Ingo Molnar, Paul E. McKenney,
	Paul Gortmaker, Peter Zijlstra, Steven Rostedt, Thomas Gleixner

We may update the execution time (sched_group_set_shares() ->
update_cfs_shares() -> reweight_entity() -> update_curr()) before
reweighting the entity after updating the group shares, and this requires
an up-to-date version of the runqueue clock. Let's update it on the target
CPU if it runs tickless, because scheduler_tick() is not there to maintain
it.

Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Alessio Igor Bogani <abogani@kernel.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Avi Kivity <avi@redhat.com>
Cc: Chris Metcalf <cmetcalf@tilera.com>
Cc: Christoph Lameter <cl@linux.com>
Cc: Geoff Levand <geoff@infradead.org>
Cc: Gilad Ben Yossef <gilad@benyossef.com>
Cc: Hakan Akkan <hakanakkan@gmail.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Cc: Paul Gortmaker <paul.gortmaker@windriver.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
---
 kernel/sched/fair.c |    6 ++++++
 1 files changed, 6 insertions(+), 0 deletions(-)

diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index 59e072b..a6ddc1e 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -5837,6 +5837,12 @@ int sched_group_set_shares(struct task_group *tg, unsigned long shares)
 		se = tg->se[i];
 		/* Propagate contribution to hierarchy */
 		raw_spin_lock_irqsave(&rq->lock, flags);
+		/*
+		 * We may call update_curr() which needs an up-to-date
+		 * version of rq clock if the CPU runs tickless.
+		 */
+		update_nohz_rq_clock(rq);
+
 		for_each_sched_entity(se) {
 			update_cfs_shares(group_cfs_rq(se));
 			/* update contribution to parent */
-- 
1.7.5.4


^ permalink raw reply related	[flat|nested] 44+ messages in thread

* [PATCH 14/24] sched: Update rq clock on tickless CPUs before calling check_preempt_curr()
  2012-12-20 18:32 [ANNOUNCE] 3.7-nohz1 Frederic Weisbecker
                   ` (12 preceding siblings ...)
  2012-12-20 18:33 ` [PATCH 13/24] sched: Update rq clock on nohz CPU before setting fair group shares Frederic Weisbecker
@ 2012-12-20 18:33 ` Frederic Weisbecker
  2012-12-20 18:33 ` [PATCH 15/24] sched: Update rq clock earlier in unthrottle_cfs_rq Frederic Weisbecker
                   ` (11 subsequent siblings)
  25 siblings, 0 replies; 44+ messages in thread
From: Frederic Weisbecker @ 2012-12-20 18:33 UTC (permalink / raw)
  To: LKML
  Cc: Frederic Weisbecker, Alessio Igor Bogani, Andrew Morton,
	Avi Kivity, Chris Metcalf, Christoph Lameter, Geoff Levand,
	Gilad Ben Yossef, Hakan Akkan, Ingo Molnar, Paul E. McKenney,
	Paul Gortmaker, Peter Zijlstra, Steven Rostedt, Thomas Gleixner

check_preempt_wakeup() in the fair class needs an up-to-date sched clock
value to update the runtime stats of the current task.

When a task is woken up, activate_task() is usually called right before
ttwu_do_wakeup(), unless the task is already on the runqueue. In that
case we need to update the rq clock manually, in case the CPU runs
tickless, because ttwu_do_wakeup() calls check_preempt_wakeup().

Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Alessio Igor Bogani <abogani@kernel.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Avi Kivity <avi@redhat.com>
Cc: Chris Metcalf <cmetcalf@tilera.com>
Cc: Christoph Lameter <cl@linux.com>
Cc: Geoff Levand <geoff@infradead.org>
Cc: Gilad Ben Yossef <gilad@benyossef.com>
Cc: Hakan Akkan <hakanakkan@gmail.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Cc: Paul Gortmaker <paul.gortmaker@windriver.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
---
 kernel/sched/core.c |   17 ++++++++++++++++-
 1 files changed, 16 insertions(+), 1 deletions(-)

diff --git a/kernel/sched/core.c b/kernel/sched/core.c
index 5d9060a..9cbace7 100644
--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -1323,6 +1323,12 @@ static int ttwu_remote(struct task_struct *p, int wake_flags)
 
 	rq = __task_rq_lock(p);
 	if (p->on_rq) {
+		/*
+		 * Ensure check_preempt_curr() won't deal with a stale value
+		 * of rq clock if the CPU is tickless. BTW do we actually need
+		 * check_preempt_curr() to be called here?
+		 */
+		update_nohz_rq_clock(rq);
 		ttwu_do_wakeup(rq, p, wake_flags);
 		ret = 1;
 	}
@@ -1500,8 +1506,17 @@ static void try_to_wake_up_local(struct task_struct *p)
 	if (!(p->state & TASK_NORMAL))
 		goto out;
 
-	if (!p->on_rq)
+	if (!p->on_rq) {
 		ttwu_activate(rq, p, ENQUEUE_WAKEUP);
+	} else {
+		/*
+		 * Even if the task is on the runqueue we still
+		 * need to ensure check_preempt_curr() won't
+		 * deal with a stale rq clock value on a tickless
+		 * CPU
+		 */
+		update_nohz_rq_clock(rq);
+	}
 
 	ttwu_do_wakeup(rq, p, 0);
 	ttwu_stat(p, smp_processor_id(), 0);
-- 
1.7.5.4


^ permalink raw reply related	[flat|nested] 44+ messages in thread

* [PATCH 15/24] sched: Update rq clock earlier in unthrottle_cfs_rq
  2012-12-20 18:32 [ANNOUNCE] 3.7-nohz1 Frederic Weisbecker
                   ` (13 preceding siblings ...)
  2012-12-20 18:33 ` [PATCH 14/24] sched: Update rq clock on tickless CPUs before calling check_preempt_curr() Frederic Weisbecker
@ 2012-12-20 18:33 ` Frederic Weisbecker
  2012-12-20 18:33 ` [PATCH 16/24] sched: Update clock of nohz busiest rq before balancing Frederic Weisbecker
                   ` (10 subsequent siblings)
  25 siblings, 0 replies; 44+ messages in thread
From: Frederic Weisbecker @ 2012-12-20 18:33 UTC (permalink / raw)
  To: LKML
  Cc: Frederic Weisbecker, Alessio Igor Bogani, Andrew Morton,
	Avi Kivity, Chris Metcalf, Christoph Lameter, Geoff Levand,
	Gilad Ben Yossef, Hakan Akkan, Ingo Molnar, Paul E. McKenney,
	Paul Gortmaker, Peter Zijlstra, Steven Rostedt, Thomas Gleixner

In this function we use rq->clock right before the rq clock gets
updated. Call update_rq_clock() earlier instead, so that we avoid
using a stale rq clock value.

Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Alessio Igor Bogani <abogani@kernel.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Avi Kivity <avi@redhat.com>
Cc: Chris Metcalf <cmetcalf@tilera.com>
Cc: Christoph Lameter <cl@linux.com>
Cc: Geoff Levand <geoff@infradead.org>
Cc: Gilad Ben Yossef <gilad@benyossef.com>
Cc: Hakan Akkan <hakanakkan@gmail.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Cc: Paul Gortmaker <paul.gortmaker@windriver.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
---
 kernel/sched/fair.c |    5 +++--
 1 files changed, 3 insertions(+), 2 deletions(-)

diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index a6ddc1e..b31f7ca 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -2051,14 +2051,15 @@ void unthrottle_cfs_rq(struct cfs_rq *cfs_rq)
 	long task_delta;
 
 	se = cfs_rq->tg->se[cpu_of(rq_of(cfs_rq))];
-
 	cfs_rq->throttled = 0;
+
+	update_rq_clock(rq);
+
 	raw_spin_lock(&cfs_b->lock);
 	cfs_b->throttled_time += rq->clock - cfs_rq->throttled_clock;
 	list_del_rcu(&cfs_rq->throttled_list);
 	raw_spin_unlock(&cfs_b->lock);
 
-	update_rq_clock(rq);
 	/* update hierarchical throttle state */
 	walk_tg_tree_from(cfs_rq->tg, tg_nop, tg_unthrottle_up, (void *)rq);
 
-- 
1.7.5.4


^ permalink raw reply related	[flat|nested] 44+ messages in thread

* [PATCH 16/24] sched: Update clock of nohz busiest rq before balancing
  2012-12-20 18:32 [ANNOUNCE] 3.7-nohz1 Frederic Weisbecker
                   ` (14 preceding siblings ...)
  2012-12-20 18:33 ` [PATCH 15/24] sched: Update rq clock earlier in unthrottle_cfs_rq Frederic Weisbecker
@ 2012-12-20 18:33 ` Frederic Weisbecker
  2012-12-20 18:33 ` [PATCH 17/24] sched: Update rq clock before idle balancing Frederic Weisbecker
                   ` (9 subsequent siblings)
  25 siblings, 0 replies; 44+ messages in thread
From: Frederic Weisbecker @ 2012-12-20 18:33 UTC (permalink / raw)
  To: LKML
  Cc: Frederic Weisbecker, Alessio Igor Bogani, Andrew Morton,
	Avi Kivity, Chris Metcalf, Christoph Lameter, Geoff Levand,
	Gilad Ben Yossef, Hakan Akkan, Ingo Molnar, Paul E. McKenney,
	Paul Gortmaker, Peter Zijlstra, Steven Rostedt, Thomas Gleixner

move_tasks() and active_load_balance_cpu_stop() both need
the busiest rq clock to be up to date, because they may end
up calling can_migrate_task(), which uses rq->clock_task
to determine whether the task running on the busiest runqueue
is cache hot.

Hence, if the busiest runqueue is tickless, update its clock
before reading it.

Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Alessio Igor Bogani <abogani@kernel.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Avi Kivity <avi@redhat.com>
Cc: Chris Metcalf <cmetcalf@tilera.com>
Cc: Christoph Lameter <cl@linux.com>
Cc: Geoff Levand <geoff@infradead.org>
Cc: Gilad Ben Yossef <gilad@benyossef.com>
Cc: Hakan Akkan <hakanakkan@gmail.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Cc: Paul Gortmaker <paul.gortmaker@windriver.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
[ Forward port conflicts ]
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
---
 kernel/sched/fair.c |   17 +++++++++++++++++
 1 files changed, 17 insertions(+), 0 deletions(-)

diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index b31f7ca..347dd3f 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -4774,6 +4774,7 @@ static int load_balance(int this_cpu, struct rq *this_rq,
 {
 	int ld_moved, cur_ld_moved, active_balance = 0;
 	int lb_iterations, max_lb_iterations;
+	int clock_updated;
 	struct sched_group *group;
 	struct rq *busiest;
 	unsigned long flags;
@@ -4817,6 +4818,7 @@ redo:
 
 	ld_moved = 0;
 	lb_iterations = 1;
+	clock_updated = 0;
 	if (busiest->nr_running > 1) {
 		/*
 		 * Attempt to move tasks. If find_busiest_group has found
@@ -4840,6 +4842,14 @@ more_balance:
 		 */
 		cur_ld_moved = move_tasks(&env);
 		ld_moved += cur_ld_moved;
+
+		/*
+		 * move_tasks() may end up calling can_migrate_task(), which
+		 * requires an up-to-date value of the rq clock.
+		 */
+		update_nohz_rq_clock(busiest);
+		clock_updated = 1;
+
 		double_rq_unlock(env.dst_rq, busiest);
 		local_irq_restore(flags);
 
@@ -4935,6 +4945,13 @@ more_balance:
 				busiest->active_balance = 1;
 				busiest->push_cpu = this_cpu;
 				active_balance = 1;
+				/*
+				 * active_load_balance_cpu_stop() may end up calling
+				 * can_migrate_task(), which requires an up-to-date
+				 * value of the rq clock.
+				 */
+				if (!clock_updated)
+					update_nohz_rq_clock(busiest);
 			}
 			raw_spin_unlock_irqrestore(&busiest->lock, flags);
 
-- 
1.7.5.4


^ permalink raw reply related	[flat|nested] 44+ messages in thread

* [PATCH 17/24] sched: Update rq clock before idle balancing
  2012-12-20 18:32 [ANNOUNCE] 3.7-nohz1 Frederic Weisbecker
                   ` (15 preceding siblings ...)
  2012-12-20 18:33 ` [PATCH 16/24] sched: Update clock of nohz busiest rq before balancing Frederic Weisbecker
@ 2012-12-20 18:33 ` Frederic Weisbecker
  2012-12-20 18:33 ` [PATCH 18/24] sched: Update nohz rq clock before searching busiest group on load balancing Frederic Weisbecker
                   ` (8 subsequent siblings)
  25 siblings, 0 replies; 44+ messages in thread
From: Frederic Weisbecker @ 2012-12-20 18:33 UTC (permalink / raw)
  To: LKML
  Cc: Frederic Weisbecker, Alessio Igor Bogani, Andrew Morton,
	Avi Kivity, Chris Metcalf, Christoph Lameter, Geoff Levand,
	Gilad Ben Yossef, Hakan Akkan, Ingo Molnar, Paul E. McKenney,
	Paul Gortmaker, Peter Zijlstra, Steven Rostedt, Thomas Gleixner

idle_balance() is called from schedule() right before we schedule the
idle task. It needs to record the idle timestamp at that time, and for
this the rq clock must be accurate. If the CPU is running tickless,
we need to update the rq clock manually.

Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Alessio Igor Bogani <abogani@kernel.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Avi Kivity <avi@redhat.com>
Cc: Chris Metcalf <cmetcalf@tilera.com>
Cc: Christoph Lameter <cl@linux.com>
Cc: Geoff Levand <geoff@infradead.org>
Cc: Gilad Ben Yossef <gilad@benyossef.com>
Cc: Hakan Akkan <hakanakkan@gmail.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Cc: Paul Gortmaker <paul.gortmaker@windriver.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
---
 kernel/sched/fair.c |    1 +
 1 files changed, 1 insertions(+), 0 deletions(-)

diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index 347dd3f..291e225 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -5013,6 +5013,7 @@ void idle_balance(int this_cpu, struct rq *this_rq)
 	int pulled_task = 0;
 	unsigned long next_balance = jiffies + HZ;
 
+	update_nohz_rq_clock(this_rq);
 	this_rq->idle_stamp = this_rq->clock;
 
 	if (this_rq->avg_idle < sysctl_sched_migration_cost)
-- 
1.7.5.4


^ permalink raw reply related	[flat|nested] 44+ messages in thread

* [PATCH 18/24] sched: Update nohz rq clock before searching busiest group on load balancing
  2012-12-20 18:32 [ANNOUNCE] 3.7-nohz1 Frederic Weisbecker
                   ` (16 preceding siblings ...)
  2012-12-20 18:33 ` [PATCH 17/24] sched: Update rq clock before idle balancing Frederic Weisbecker
@ 2012-12-20 18:33 ` Frederic Weisbecker
  2012-12-20 18:33 ` [PATCH 19/24] nohz: Move nohz load balancer selection into idle logic Frederic Weisbecker
                   ` (7 subsequent siblings)
  25 siblings, 0 replies; 44+ messages in thread
From: Frederic Weisbecker @ 2012-12-20 18:33 UTC (permalink / raw)
  To: LKML
  Cc: Frederic Weisbecker, Alessio Igor Bogani, Andrew Morton,
	Avi Kivity, Chris Metcalf, Christoph Lameter, Geoff Levand,
	Gilad Ben Yossef, Hakan Akkan, Ingo Molnar, Paul E. McKenney,
	Paul Gortmaker, Peter Zijlstra, Steven Rostedt, Thomas Gleixner

While load balancing an rq target, we look for the busiest group.
This operation may require an up-to-date rq clock if we end up calling
scale_rt_power(). To this end, update it manually if the target is
running tickless.

DOUBT: don't we actually also need this in the vanilla kernel, in case
this_cpu is in dyntick-idle mode?

Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Alessio Igor Bogani <abogani@kernel.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Avi Kivity <avi@redhat.com>
Cc: Chris Metcalf <cmetcalf@tilera.com>
Cc: Christoph Lameter <cl@linux.com>
Cc: Geoff Levand <geoff@infradead.org>
Cc: Gilad Ben Yossef <gilad@benyossef.com>
Cc: Hakan Akkan <hakanakkan@gmail.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Cc: Paul Gortmaker <paul.gortmaker@windriver.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
---
 kernel/sched/fair.c |   13 +++++++++++++
 1 files changed, 13 insertions(+), 0 deletions(-)

diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index 291e225..b1b791d 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -4795,6 +4795,19 @@ static int load_balance(int this_cpu, struct rq *this_rq,
 
 	schedstat_inc(sd, lb_count[idle]);
 
+	/*
+	 * find_busiest_group() may need an up-to-date cpu clock
+	 * (see scale_rt_power()) to pick the busiest group. If
+	 * the CPU is nohz, its clock may be stale.
+	 */
+	if (tick_nohz_full_cpu(this_cpu)) {
+		local_irq_save(flags);
+		raw_spin_lock(&this_rq->lock);
+		update_rq_clock(this_rq);
+		raw_spin_unlock(&this_rq->lock);
+		local_irq_restore(flags);
+	}
+
 redo:
 	group = find_busiest_group(&env, balance);
 
-- 
1.7.5.4


^ permalink raw reply related	[flat|nested] 44+ messages in thread

* [PATCH 19/24] nohz: Move nohz load balancer selection into idle logic
  2012-12-20 18:32 [ANNOUNCE] 3.7-nohz1 Frederic Weisbecker
                   ` (17 preceding siblings ...)
  2012-12-20 18:33 ` [PATCH 18/24] sched: Update nohz rq clock before searching busiest group on load balancing Frederic Weisbecker
@ 2012-12-20 18:33 ` Frederic Weisbecker
  2012-12-20 18:33 ` [PATCH 20/24] nohz: Full dynticks mode Frederic Weisbecker
                   ` (6 subsequent siblings)
  25 siblings, 0 replies; 44+ messages in thread
From: Frederic Weisbecker @ 2012-12-20 18:33 UTC (permalink / raw)
  To: LKML
  Cc: Frederic Weisbecker, Alessio Igor Bogani, Andrew Morton,
	Avi Kivity, Chris Metcalf, Christoph Lameter, Geoff Levand,
	Gilad Ben Yossef, Hakan Akkan, Ingo Molnar, Paul E. McKenney,
	Paul Gortmaker, Peter Zijlstra, Steven Rostedt, Thomas Gleixner

[ ** BUGGY PATCH: I need to put more thinking into this ** ]

We want the nohz load balancer to be an idle CPU, so
move that selection into the strict dyntick-idle logic.

Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Alessio Igor Bogani <abogani@kernel.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Avi Kivity <avi@redhat.com>
Cc: Chris Metcalf <cmetcalf@tilera.com>
Cc: Christoph Lameter <cl@linux.com>
Cc: Geoff Levand <geoff@infradead.org>
Cc: Gilad Ben Yossef <gilad@benyossef.com>
Cc: Hakan Akkan <hakanakkan@gmail.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Cc: Paul Gortmaker <paul.gortmaker@windriver.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
[ added movement of calc_load_exit_idle() ]
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
---
 kernel/time/tick-sched.c |   11 ++++++-----
 1 files changed, 6 insertions(+), 5 deletions(-)

diff --git a/kernel/time/tick-sched.c b/kernel/time/tick-sched.c
index 73f339b..1b607bce 100644
--- a/kernel/time/tick-sched.c
+++ b/kernel/time/tick-sched.c
@@ -442,9 +442,6 @@ static ktime_t tick_nohz_stop_sched_tick(struct tick_sched *ts,
 		 * the scheduler tick in nohz_restart_sched_tick.
 		 */
 		if (!ts->tick_stopped) {
-			nohz_balance_enter_idle(cpu);
-			calc_load_enter_idle();
-
 			ts->last_tick = hrtimer_get_expires(&ts->sched_timer);
 			ts->tick_stopped = 1;
 		}
@@ -540,8 +537,11 @@ static void __tick_nohz_idle_enter(struct tick_sched *ts)
 			ts->idle_expires = expires;
 		}
 
-		if (!was_stopped && ts->tick_stopped)
+		if (!was_stopped && ts->tick_stopped) {
 			ts->idle_jiffies = ts->last_jiffies;
+			nohz_balance_enter_idle(cpu);
+			calc_load_enter_idle();
+		}
 	}
 }
 
@@ -649,7 +649,6 @@ static void tick_nohz_restart_sched_tick(struct tick_sched *ts, ktime_t now)
 	tick_do_update_jiffies64(now);
 	update_cpu_load_nohz();
 
-	calc_load_exit_idle();
 	touch_softlockup_watchdog();
 	/*
 	 * Cancel the scheduled timer and restore the tick
@@ -709,6 +708,8 @@ void tick_nohz_idle_exit(void)
 		tick_nohz_stop_idle(cpu, now);
 
 	if (ts->tick_stopped) {
+		nohz_balance_enter_idle(cpu);
+		calc_load_exit_idle();
 		tick_nohz_restart_sched_tick(ts, now);
 		tick_nohz_account_idle_ticks(ts);
 	}
-- 
1.7.5.4


^ permalink raw reply related	[flat|nested] 44+ messages in thread

* [PATCH 20/24] nohz: Full dynticks mode
  2012-12-20 18:32 [ANNOUNCE] 3.7-nohz1 Frederic Weisbecker
                   ` (18 preceding siblings ...)
  2012-12-20 18:33 ` [PATCH 19/24] nohz: Move nohz load balancer selection into idle logic Frederic Weisbecker
@ 2012-12-20 18:33 ` Frederic Weisbecker
  2012-12-26  6:12   ` Namhyung Kim
  2012-12-20 18:33 ` [PATCH 21/24] nohz: Only stop the tick on RCU nocb CPUs Frederic Weisbecker
                   ` (5 subsequent siblings)
  25 siblings, 1 reply; 44+ messages in thread
From: Frederic Weisbecker @ 2012-12-20 18:33 UTC (permalink / raw)
  To: LKML
  Cc: Frederic Weisbecker, Alessio Igor Bogani, Andrew Morton,
	Avi Kivity, Chris Metcalf, Christoph Lameter, Geoff Levand,
	Gilad Ben Yossef, Hakan Akkan, Ingo Molnar, Paul E. McKenney,
	Paul Gortmaker, Peter Zijlstra, Steven Rostedt, Thomas Gleixner

When a CPU is in full dynticks mode, try to switch
it to nohz mode from the interrupt exit path if it is
running a single non-idle task.

Conversely, restart the tick if a second task gets enqueued
while the timer is stopped, so that the scheduler tick is
rearmed.

[TODO: Check remaining things to be done from scheduler_tick()]

[ Included build fix from Geoff Levand ]

Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Alessio Igor Bogani <abogani@kernel.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Avi Kivity <avi@redhat.com>
Cc: Chris Metcalf <cmetcalf@tilera.com>
Cc: Christoph Lameter <cl@linux.com>
Cc: Geoff Levand <geoff@infradead.org>
Cc: Gilad Ben Yossef <gilad@benyossef.com>
Cc: Hakan Akkan <hakanakkan@gmail.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Cc: Paul Gortmaker <paul.gortmaker@windriver.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
---
 include/linux/sched.h    |    6 +++++
 include/linux/tick.h     |    2 +
 kernel/sched/core.c      |   22 ++++++++++++++++++++-
 kernel/sched/sched.h     |    8 +++++++
 kernel/softirq.c         |    5 ++-
 kernel/time/tick-sched.c |   47 ++++++++++++++++++++++++++++++++++++++++-----
 6 files changed, 81 insertions(+), 9 deletions(-)

diff --git a/include/linux/sched.h b/include/linux/sched.h
index 8a89dc6..4ffac78 100644
--- a/include/linux/sched.h
+++ b/include/linux/sched.h
@@ -2818,6 +2818,12 @@ static inline void inc_syscw(struct task_struct *tsk)
 #define TASK_SIZE_OF(tsk)	TASK_SIZE
 #endif
 
+#ifdef CONFIG_NO_HZ_FULL
+extern bool sched_can_stop_tick(void);
+#else
+static inline bool sched_can_stop_tick(void) { return false; }
+#endif
+
 #ifdef CONFIG_MM_OWNER
 extern void mm_update_next_owner(struct mm_struct *mm);
 extern void mm_init_owner(struct mm_struct *mm, struct task_struct *p);
diff --git a/include/linux/tick.h b/include/linux/tick.h
index 2d4f6f0..dfb90ea 100644
--- a/include/linux/tick.h
+++ b/include/linux/tick.h
@@ -159,8 +159,10 @@ static inline u64 get_cpu_iowait_time_us(int cpu, u64 *unused) { return -1; }
 
 #ifdef CONFIG_NO_HZ_FULL
 int tick_nohz_full_cpu(int cpu);
+extern void tick_nohz_full_check(void);
 #else
 static inline int tick_nohz_full_cpu(int cpu) { return 0; }
+static inline void tick_nohz_full_check(void) { }
 #endif
 
 
diff --git a/kernel/sched/core.c b/kernel/sched/core.c
index 9cbace7..9d821a3 100644
--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -1215,6 +1215,24 @@ static void update_avg(u64 *avg, u64 sample)
 }
 #endif
 
+#ifdef CONFIG_NO_HZ_FULL
+bool sched_can_stop_tick(void)
+{
+	struct rq *rq;
+
+	rq = this_rq();
+
+	/* Make sure rq->nr_running update is visible after the IPI */
+	smp_rmb();
+
+	/* More than one running task need preemption */
+	if (rq->nr_running > 1)
+		return false;
+
+	return true;
+}
+#endif
+
 static void
 ttwu_stat(struct task_struct *p, int cpu, int wake_flags)
 {
@@ -1357,7 +1375,8 @@ static void sched_ttwu_pending(void)
 
 void scheduler_ipi(void)
 {
-	if (llist_empty(&this_rq()->wake_list) && !got_nohz_idle_kick())
+	if (llist_empty(&this_rq()->wake_list) && !got_nohz_idle_kick()
+	    && !tick_nohz_full_cpu(smp_processor_id()))
 		return;
 
 	/*
@@ -1374,6 +1393,7 @@ void scheduler_ipi(void)
 	 * somewhat pessimize the simple resched case.
 	 */
 	irq_enter();
+	tick_nohz_full_check();
 	sched_ttwu_pending();
 
 	/*
diff --git a/kernel/sched/sched.h b/kernel/sched/sched.h
index db3d4df..f3d8f4a 100644
--- a/kernel/sched/sched.h
+++ b/kernel/sched/sched.h
@@ -943,6 +943,14 @@ static inline u64 steal_ticks(u64 steal)
 static inline void inc_nr_running(struct rq *rq)
 {
 	rq->nr_running++;
+
+	if (rq->nr_running == 2) {
+		if (tick_nohz_full_cpu(rq->cpu)) {
+			/* Order rq->nr_running write against the IPI */
+			smp_wmb();
+			smp_send_reschedule(rq->cpu);
+		}
+	}
 }
 
 static inline void dec_nr_running(struct rq *rq)
diff --git a/kernel/softirq.c b/kernel/softirq.c
index f5cc25f..6342078 100644
--- a/kernel/softirq.c
+++ b/kernel/softirq.c
@@ -307,7 +307,8 @@ void irq_enter(void)
 	int cpu = smp_processor_id();
 
 	rcu_irq_enter();
-	if (is_idle_task(current) && !in_interrupt()) {
+
+	if ((is_idle_task(current) || tick_nohz_full_cpu(cpu)) && !in_interrupt()) {
 		/*
 		 * Prevent raise_softirq from needlessly waking up ksoftirqd
 		 * here, as softirq will be serviced on return from interrupt.
@@ -349,7 +350,7 @@ void irq_exit(void)
 
 #ifdef CONFIG_NO_HZ
 	/* Make sure that timer wheel updates are propagated */
-	if (idle_cpu(smp_processor_id()) && !in_interrupt() && !need_resched())
+	if (!in_interrupt())
 		tick_nohz_irq_exit();
 #endif
 	rcu_irq_exit();
diff --git a/kernel/time/tick-sched.c b/kernel/time/tick-sched.c
index 1b607bce..c057a7e 100644
--- a/kernel/time/tick-sched.c
+++ b/kernel/time/tick-sched.c
@@ -585,6 +585,24 @@ void tick_nohz_idle_enter(void)
 	local_irq_enable();
 }
 
+static void tick_nohz_full_stop_tick(struct tick_sched *ts)
+{
+#ifdef CONFIG_NO_HZ_FULL
+	int cpu = smp_processor_id();
+
+	if (!tick_nohz_full_cpu(cpu) || is_idle_task(current))
+		return;
+
+	if (!ts->tick_stopped && ts->nohz_mode == NOHZ_MODE_INACTIVE)
+		return;
+
+	if (!sched_can_stop_tick())
+		return;
+
+	tick_nohz_stop_sched_tick(ts, ktime_get(), cpu);
+#endif
+}
+
 /**
  * tick_nohz_irq_exit - update next tick event from interrupt exit
  *
@@ -597,12 +615,15 @@ void tick_nohz_irq_exit(void)
 {
 	struct tick_sched *ts = &__get_cpu_var(tick_cpu_sched);
 
-	if (!ts->inidle)
-		return;
-
-	/* Cancel the timer because CPU already waken up from the C-states*/
-	menu_hrtimer_cancel();
-	__tick_nohz_idle_enter(ts);
+	if (ts->inidle) {
+		if (!need_resched()) {
+			/* Cancel the timer because CPU already waken up from the C-states*/
+			menu_hrtimer_cancel();
+			__tick_nohz_idle_enter(ts);
+		}
+	} else {
+		tick_nohz_full_stop_tick(ts);
+	}
 }
 
 /**
@@ -833,6 +854,20 @@ static inline void tick_check_nohz(int cpu) { }
 
 #endif /* NO_HZ */
 
+#ifdef CONFIG_NO_HZ_FULL
+void tick_nohz_full_check(void)
+{
+	struct tick_sched *ts = &__get_cpu_var(tick_cpu_sched);
+
+	if (tick_nohz_full_cpu(smp_processor_id())) {
+		if (ts->tick_stopped && !is_idle_task(current)) {
+			if (!sched_can_stop_tick())
+				tick_nohz_restart_sched_tick(ts, ktime_get());
+		}
+	}
+}
+#endif /* CONFIG_NO_HZ_FULL */
+
 /*
  * Called from irq_enter to notify about the possible interruption of idle()
  */
-- 
1.7.5.4


^ permalink raw reply related	[flat|nested] 44+ messages in thread

* [PATCH 21/24] nohz: Only stop the tick on RCU nocb CPUs
  2012-12-20 18:32 [ANNOUNCE] 3.7-nohz1 Frederic Weisbecker
                   ` (19 preceding siblings ...)
  2012-12-20 18:33 ` [PATCH 20/24] nohz: Full dynticks mode Frederic Weisbecker
@ 2012-12-20 18:33 ` Frederic Weisbecker
  2012-12-20 18:33 ` [PATCH 22/24] nohz: Don't turn off the tick if rcu needs it Frederic Weisbecker
                   ` (4 subsequent siblings)
  25 siblings, 0 replies; 44+ messages in thread
From: Frederic Weisbecker @ 2012-12-20 18:33 UTC (permalink / raw)
  To: LKML
  Cc: Frederic Weisbecker, Alessio Igor Bogani, Andrew Morton,
	Avi Kivity, Chris Metcalf, Christoph Lameter, Geoff Levand,
	Gilad Ben Yossef, Hakan Akkan, Ingo Molnar, Paul E. McKenney,
	Paul Gortmaker, Peter Zijlstra, Steven Rostedt, Thomas Gleixner

On a full dynticks CPU, we want the RCU callbacks to be
offloaded to another CPU, otherwise we need to keep
the tick to wait for grace period completion.

Ensure the full dynticks CPU is also an rcu_nocb one.

Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Alessio Igor Bogani <abogani@kernel.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Avi Kivity <avi@redhat.com>
Cc: Chris Metcalf <cmetcalf@tilera.com>
Cc: Christoph Lameter <cl@linux.com>
Cc: Geoff Levand <geoff@infradead.org>
Cc: Gilad Ben Yossef <gilad@benyossef.com>
Cc: Hakan Akkan <hakanakkan@gmail.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Cc: Paul Gortmaker <paul.gortmaker@windriver.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
---
 include/linux/rcupdate.h |    7 +++++++
 kernel/rcutree.c         |    6 +++---
 kernel/rcutree_plugin.h  |   13 ++++---------
 kernel/time/tick-sched.c |   20 +++++++++++++++++---
 4 files changed, 31 insertions(+), 15 deletions(-)

diff --git a/include/linux/rcupdate.h b/include/linux/rcupdate.h
index 275aa3f..829312e 100644
--- a/include/linux/rcupdate.h
+++ b/include/linux/rcupdate.h
@@ -992,4 +992,11 @@ static inline notrace void rcu_read_unlock_sched_notrace(void)
 #define kfree_rcu(ptr, rcu_head)					\
 	__kfree_rcu(&((ptr)->rcu_head), offsetof(typeof(*(ptr)), rcu_head))
 
+#ifdef CONFIG_RCU_NOCB_CPU
+bool rcu_is_nocb_cpu(int cpu);
+#else
+static inline bool rcu_is_nocb_cpu(int cpu) { return false; };
+#endif
+
+
 #endif /* __LINUX_RCUPDATE_H */
diff --git a/kernel/rcutree.c b/kernel/rcutree.c
index 302d360..e9e0ffa 100644
--- a/kernel/rcutree.c
+++ b/kernel/rcutree.c
@@ -1589,7 +1589,7 @@ rcu_send_cbs_to_orphanage(int cpu, struct rcu_state *rsp,
 			  struct rcu_node *rnp, struct rcu_data *rdp)
 {
 	/* No-CBs CPUs do not have orphanable callbacks. */
-	if (is_nocb_cpu(rdp->cpu))
+	if (rcu_is_nocb_cpu(rdp->cpu))
 		return;
 
 	/*
@@ -2651,10 +2651,10 @@ static void _rcu_barrier(struct rcu_state *rsp)
 	 * corresponding CPU's preceding callbacks have been invoked.
 	 */
 	for_each_possible_cpu(cpu) {
-		if (!cpu_online(cpu) && !is_nocb_cpu(cpu))
+		if (!cpu_online(cpu) && !rcu_is_nocb_cpu(cpu))
 			continue;
 		rdp = per_cpu_ptr(rsp->rda, cpu);
-		if (is_nocb_cpu(cpu)) {
+		if (rcu_is_nocb_cpu(cpu)) {
 			_rcu_barrier_trace(rsp, "OnlineNoCB", cpu,
 					   rsp->n_barrier_done);
 			atomic_inc(&rsp->barrier_cpu_count);
diff --git a/kernel/rcutree_plugin.h b/kernel/rcutree_plugin.h
index f6e5ec2..625b327 100644
--- a/kernel/rcutree_plugin.h
+++ b/kernel/rcutree_plugin.h
@@ -2160,7 +2160,7 @@ static int __init rcu_nocb_setup(char *str)
 __setup("rcu_nocbs=", rcu_nocb_setup);
 
 /* Is the specified CPU a no-CPUs CPU? */
-static bool is_nocb_cpu(int cpu)
+bool rcu_is_nocb_cpu(int cpu)
 {
 	if (have_rcu_nocb_mask)
 		return cpumask_test_cpu(cpu, rcu_nocb_mask);
@@ -2218,7 +2218,7 @@ static bool __call_rcu_nocb(struct rcu_data *rdp, struct rcu_head *rhp,
 			    bool lazy)
 {
 
-	if (!is_nocb_cpu(rdp->cpu))
+	if (!rcu_is_nocb_cpu(rdp->cpu))
 		return 0;
 	__call_rcu_nocb_enqueue(rdp, rhp, &rhp->next, 1, lazy);
 	return 1;
@@ -2235,7 +2235,7 @@ static bool __maybe_unused rcu_nocb_adopt_orphan_cbs(struct rcu_state *rsp,
 	long qll = rsp->qlen_lazy;
 
 	/* If this is not a no-CBs CPU, tell the caller to do it the old way. */
-	if (!is_nocb_cpu(smp_processor_id()))
+	if (!rcu_is_nocb_cpu(smp_processor_id()))
 		return 0;
 	rsp->qlen = 0;
 	rsp->qlen_lazy = 0;
@@ -2275,7 +2275,7 @@ static bool nocb_cpu_expendable(int cpu)
 	 * If there are no no-CB CPUs or if this CPU is not a no-CB CPU,
 	 * then offlining this CPU is harmless.  Let it happen.
 	 */
-	if (!have_rcu_nocb_mask || is_nocb_cpu(cpu))
+	if (!have_rcu_nocb_mask || rcu_is_nocb_cpu(cpu))
 		return 1;
 
 	/* If no memory, play it safe and keep the CPU around. */
@@ -2456,11 +2456,6 @@ static void __init rcu_init_nocb(void)
 
 #else /* #ifdef CONFIG_RCU_NOCB_CPU */
 
-static bool is_nocb_cpu(int cpu)
-{
-	return false;
-}
-
 static bool __call_rcu_nocb(struct rcu_data *rdp, struct rcu_head *rhp,
 			    bool lazy)
 {
diff --git a/kernel/time/tick-sched.c b/kernel/time/tick-sched.c
index c057a7e..2808e02 100644
--- a/kernel/time/tick-sched.c
+++ b/kernel/time/tick-sched.c
@@ -585,6 +585,19 @@ void tick_nohz_idle_enter(void)
 	local_irq_enable();
 }
 
+#ifdef CONFIG_NO_HZ_FULL
+static bool can_stop_full_tick(int cpu)
+{
+	if (!sched_can_stop_tick())
+		return false;
+
+	if (!rcu_is_nocb_cpu(cpu))
+		return false;
+
+	return true;
+}
+#endif
+
 static void tick_nohz_full_stop_tick(struct tick_sched *ts)
 {
 #ifdef CONFIG_NO_HZ_FULL
@@ -596,7 +609,7 @@ static void tick_nohz_full_stop_tick(struct tick_sched *ts)
 	if (!ts->tick_stopped && ts->nohz_mode == NOHZ_MODE_INACTIVE)
 		return;
 
-	if (!sched_can_stop_tick())
+	if (!can_stop_full_tick(cpu))
 		return;
 
 	tick_nohz_stop_sched_tick(ts, ktime_get(), cpu);
@@ -858,10 +871,11 @@ static inline void tick_check_nohz(int cpu) { }
 void tick_nohz_full_check(void)
 {
 	struct tick_sched *ts = &__get_cpu_var(tick_cpu_sched);
+	int cpu = smp_processor_id();
 
-	if (tick_nohz_full_cpu(smp_processor_id())) {
+	if (tick_nohz_full_cpu(cpu)) {
 		if (ts->tick_stopped && !is_idle_task(current)) {
-			if (!sched_can_stop_tick())
+			if (!can_stop_full_tick(cpu))
 				tick_nohz_restart_sched_tick(ts, ktime_get());
 		}
 	}
-- 
1.7.5.4


^ permalink raw reply related	[flat|nested] 44+ messages in thread

* [PATCH 22/24] nohz: Don't turn off the tick if rcu needs it
  2012-12-20 18:32 [ANNOUNCE] 3.7-nohz1 Frederic Weisbecker
                   ` (20 preceding siblings ...)
  2012-12-20 18:33 ` [PATCH 21/24] nohz: Only stop the tick on RCU nocb CPUs Frederic Weisbecker
@ 2012-12-20 18:33 ` Frederic Weisbecker
  2012-12-20 18:33 ` [PATCH 23/24] nohz: Don't stop the tick if posix cpu timers are running Frederic Weisbecker
                   ` (3 subsequent siblings)
  25 siblings, 0 replies; 44+ messages in thread
From: Frederic Weisbecker @ 2012-12-20 18:33 UTC (permalink / raw)
  To: LKML
  Cc: Frederic Weisbecker, Alessio Igor Bogani, Andrew Morton,
	Avi Kivity, Chris Metcalf, Christoph Lameter, Geoff Levand,
	Gilad Ben Yossef, Hakan Akkan, Ingo Molnar, Paul E. McKenney,
	Paul Gortmaker, Peter Zijlstra, Steven Rostedt, Thomas Gleixner

If RCU is waiting for the current CPU to complete a grace
period, don't turn off the tick. Unlike dyntick-idle, we
are not necessarily going to enter the RCU extended quiescent
state, so we may need to keep the tick to note the current
CPU's quiescent states.

[added build fix from Zen Lin]

CHECKME: OTOH we don't want to handle a locally started
grace period; that should be offloaded on rcu_nocb CPUs.
What we want is to be kicked if we stay in dynticks mode in
the kernel for too long (ie: to report a quiescent state).
rcu_pending() is perhaps overkill just for that.

Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Alessio Igor Bogani <abogani@kernel.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Avi Kivity <avi@redhat.com>
Cc: Chris Metcalf <cmetcalf@tilera.com>
Cc: Christoph Lameter <cl@linux.com>
Cc: Geoff Levand <geoff@infradead.org>
Cc: Gilad Ben Yossef <gilad@benyossef.com>
Cc: Hakan Akkan <hakanakkan@gmail.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Cc: Paul Gortmaker <paul.gortmaker@windriver.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
---
 include/linux/rcupdate.h |    1 +
 kernel/rcutree.c         |    3 +--
 kernel/time/tick-sched.c |    3 +++
 3 files changed, 5 insertions(+), 2 deletions(-)

diff --git a/include/linux/rcupdate.h b/include/linux/rcupdate.h
index 829312e..2ebadac 100644
--- a/include/linux/rcupdate.h
+++ b/include/linux/rcupdate.h
@@ -211,6 +211,7 @@ static inline int rcu_preempt_depth(void)
 extern void rcu_sched_qs(int cpu);
 extern void rcu_bh_qs(int cpu);
 extern void rcu_check_callbacks(int cpu, int user);
+extern int rcu_pending(int cpu);
 struct notifier_block;
 extern void rcu_idle_enter(void);
 extern void rcu_idle_exit(void);
diff --git a/kernel/rcutree.c b/kernel/rcutree.c
index e9e0ffa..6ba3e02 100644
--- a/kernel/rcutree.c
+++ b/kernel/rcutree.c
@@ -232,7 +232,6 @@ module_param(jiffies_till_next_fqs, ulong, 0644);
 
 static void force_qs_rnp(struct rcu_state *rsp, int (*f)(struct rcu_data *));
 static void force_quiescent_state(struct rcu_state *rsp);
-static int rcu_pending(int cpu);
 
 /*
  * Return the number of RCU-sched batches processed thus far for debug & stats.
@@ -2521,7 +2520,7 @@ static int __rcu_pending(struct rcu_state *rsp, struct rcu_data *rdp)
  * by the current CPU, returning 1 if so.  This function is part of the
  * RCU implementation; it is -not- an exported member of the RCU API.
  */
-static int rcu_pending(int cpu)
+int rcu_pending(int cpu)
 {
 	struct rcu_state *rsp;
 
diff --git a/kernel/time/tick-sched.c b/kernel/time/tick-sched.c
index 2808e02..0c9281a 100644
--- a/kernel/time/tick-sched.c
+++ b/kernel/time/tick-sched.c
@@ -594,6 +594,9 @@ static bool can_stop_full_tick(int cpu)
 	if (!rcu_is_nocb_cpu(cpu))
 		return false;
 
+	if (rcu_pending(cpu))
+		return false;
+
 	return true;
 }
 #endif
-- 
1.7.5.4


^ permalink raw reply related	[flat|nested] 44+ messages in thread

* [PATCH 23/24] nohz: Don't stop the tick if posix cpu timers are running
  2012-12-20 18:32 [ANNOUNCE] 3.7-nohz1 Frederic Weisbecker
                   ` (21 preceding siblings ...)
  2012-12-20 18:33 ` [PATCH 22/24] nohz: Don't turn off the tick if rcu needs it Frederic Weisbecker
@ 2012-12-20 18:33 ` Frederic Weisbecker
  2012-12-20 18:33 ` [PATCH 24/24] nohz: Add some tracing Frederic Weisbecker
                   ` (2 subsequent siblings)
  25 siblings, 0 replies; 44+ messages in thread
From: Frederic Weisbecker @ 2012-12-20 18:33 UTC (permalink / raw)
  To: LKML
  Cc: Frederic Weisbecker, Alessio Igor Bogani, Andrew Morton,
	Avi Kivity, Chris Metcalf, Christoph Lameter, Geoff Levand,
	Gilad Ben Yossef, Hakan Akkan, Ingo Molnar, Paul E. McKenney,
	Paul Gortmaker, Peter Zijlstra, Steven Rostedt, Thomas Gleixner

If either a per thread or a per process posix cpu timer is running,
don't stop the tick.

TODO: restart the tick if it is stopped and a posix cpu timer gets
enqueued. Also check whether we need a memory barrier for the per
process posix timer, which can be enqueued from another task
of the group.

Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Alessio Igor Bogani <abogani@kernel.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Avi Kivity <avi@redhat.com>
Cc: Chris Metcalf <cmetcalf@tilera.com>
Cc: Christoph Lameter <cl@linux.com>
Cc: Geoff Levand <geoff@infradead.org>
Cc: Gilad Ben Yossef <gilad@benyossef.com>
Cc: Hakan Akkan <hakanakkan@gmail.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Cc: Paul Gortmaker <paul.gortmaker@windriver.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
---
 include/linux/posix-timers.h |    1 +
 kernel/posix-cpu-timers.c    |   11 +++++++++++
 kernel/time/tick-sched.c     |    4 ++++
 3 files changed, 16 insertions(+), 0 deletions(-)

diff --git a/include/linux/posix-timers.h b/include/linux/posix-timers.h
index 042058f..97480c2 100644
--- a/include/linux/posix-timers.h
+++ b/include/linux/posix-timers.h
@@ -119,6 +119,7 @@ int posix_timer_event(struct k_itimer *timr, int si_private);
 void posix_cpu_timer_schedule(struct k_itimer *timer);
 
 void run_posix_cpu_timers(struct task_struct *task);
+bool posix_cpu_timers_running(struct task_struct *tsk);
 void posix_cpu_timers_exit(struct task_struct *task);
 void posix_cpu_timers_exit_group(struct task_struct *task);
 
diff --git a/kernel/posix-cpu-timers.c b/kernel/posix-cpu-timers.c
index 3d58bd5..d3b46b3 100644
--- a/kernel/posix-cpu-timers.c
+++ b/kernel/posix-cpu-timers.c
@@ -1266,6 +1266,17 @@ static inline int fastpath_timer_check(struct task_struct *tsk)
 	return 0;
 }
 
+bool posix_cpu_timers_running(struct task_struct *tsk)
+{
+	if (!task_cputime_zero(&tsk->cputime_expires))
+		return true;
+
+	if (tsk->signal->cputimer.running)
+		return true;
+
+	return false;
+}
+
 /*
  * This is called from the timer interrupt handler.  The irq handler has
  * already updated our counts.  We need to check if any timers fire now.
diff --git a/kernel/time/tick-sched.c b/kernel/time/tick-sched.c
index 0c9281a..4f4fa13 100644
--- a/kernel/time/tick-sched.c
+++ b/kernel/time/tick-sched.c
@@ -21,6 +21,7 @@
 #include <linux/sched.h>
 #include <linux/module.h>
 #include <linux/irq_work.h>
+#include <linux/posix-timers.h>
 
 #include <asm/irq_regs.h>
 
@@ -597,6 +598,9 @@ static bool can_stop_full_tick(int cpu)
 	if (rcu_pending(cpu))
 		return false;
 
+	if (posix_cpu_timers_running(current))
+		return false;
+
 	return true;
 }
 #endif
-- 
1.7.5.4


^ permalink raw reply related	[flat|nested] 44+ messages in thread

* [PATCH 24/24] nohz: Add some tracing
  2012-12-20 18:32 [ANNOUNCE] 3.7-nohz1 Frederic Weisbecker
                   ` (22 preceding siblings ...)
  2012-12-20 18:33 ` [PATCH 23/24] nohz: Don't stop the tick if posix cpu timers are running Frederic Weisbecker
@ 2012-12-20 18:33 ` Frederic Weisbecker
  2012-12-21  2:35 ` [ANNOUNCE] 3.7-nohz1 Steven Rostedt
  2012-12-21  5:20 ` Hakan Akkan
  25 siblings, 0 replies; 44+ messages in thread
From: Frederic Weisbecker @ 2012-12-20 18:33 UTC (permalink / raw)
  To: LKML
  Cc: Frederic Weisbecker, Alessio Igor Bogani, Andrew Morton,
	Avi Kivity, Chris Metcalf, Christoph Lameter, Geoff Levand,
	Gilad Ben Yossef, Hakan Akkan, Ingo Molnar, Paul E. McKenney,
	Paul Gortmaker, Peter Zijlstra, Steven Rostedt, Thomas Gleixner

Not for merge, just for debugging.

Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Alessio Igor Bogani <abogani@kernel.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Avi Kivity <avi@redhat.com>
Cc: Chris Metcalf <cmetcalf@tilera.com>
Cc: Christoph Lameter <cl@linux.com>
Cc: Geoff Levand <geoff@infradead.org>
Cc: Gilad Ben Yossef <gilad@benyossef.com>
Cc: Hakan Akkan <hakanakkan@gmail.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Cc: Paul Gortmaker <paul.gortmaker@windriver.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
---
 kernel/time/tick-sched.c |   27 ++++++++++++++++++++++-----
 1 files changed, 22 insertions(+), 5 deletions(-)

diff --git a/kernel/time/tick-sched.c b/kernel/time/tick-sched.c
index 4f4fa13..f75a85f 100644
--- a/kernel/time/tick-sched.c
+++ b/kernel/time/tick-sched.c
@@ -142,6 +142,7 @@ static void tick_sched_handle(struct tick_sched *ts, struct pt_regs *regs)
 			ts->idle_jiffies++;
 	}
 #endif
+	trace_printk("tick\n");
 	update_process_times(user_mode(regs));
 	profile_tick(CPU_PROFILING);
 }
@@ -589,17 +590,30 @@ void tick_nohz_idle_enter(void)
 #ifdef CONFIG_NO_HZ_FULL
 static bool can_stop_full_tick(int cpu)
 {
-	if (!sched_can_stop_tick())
+	if (!sched_can_stop_tick()) {
+		trace_printk("Can't stop: sched\n");
 		return false;
+	}
 
-	if (!rcu_is_nocb_cpu(cpu))
+	if (!rcu_is_nocb_cpu(cpu)) {
+		trace_printk("Can't stop: not RCU nocb\n");
 		return false;
+	}
 
-	if (rcu_pending(cpu))
+	/*
+	 * Keep the tick if we are asked to report a quiescent state.
+	 * This must be further optimized (avoid checks for local callbacks,
+	 * ignore RCU in userspace, etc...
+	 */
+	if (rcu_pending(cpu)) {
+		trace_printk("Can't stop: RCU pending\n");
 		return false;
+	}
 
-	if (posix_cpu_timers_running(current))
+	if (posix_cpu_timers_running(current)) {
+		trace_printk("Can't stop: posix CPU timers running\n");
 		return false;
+	}
 
 	return true;
 }
@@ -613,12 +627,15 @@ static void tick_nohz_full_stop_tick(struct tick_sched *ts)
 	if (!tick_nohz_full_cpu(cpu) || is_idle_task(current))
 		return;
 
-	if (!ts->tick_stopped && ts->nohz_mode == NOHZ_MODE_INACTIVE)
+	if (!ts->tick_stopped && ts->nohz_mode == NOHZ_MODE_INACTIVE) {
+		trace_printk("Can't stop: NOHZ_MODE_INACTIVE\n");
 		return;
+	}
 
 	if (!can_stop_full_tick(cpu))
 		return;
 
+	trace_printk("Stop tick\n");
 	tick_nohz_stop_sched_tick(ts, ktime_get(), cpu);
 #endif
 }
-- 
1.7.5.4


^ permalink raw reply related	[flat|nested] 44+ messages in thread

* Re: [ANNOUNCE] 3.7-nohz1
  2012-12-20 18:32 [ANNOUNCE] 3.7-nohz1 Frederic Weisbecker
                   ` (23 preceding siblings ...)
  2012-12-20 18:33 ` [PATCH 24/24] nohz: Add some tracing Frederic Weisbecker
@ 2012-12-21  2:35 ` Steven Rostedt
  2012-12-23 23:43   ` Frederic Weisbecker
  2012-12-21  5:20 ` Hakan Akkan
  25 siblings, 1 reply; 44+ messages in thread
From: Steven Rostedt @ 2012-12-21  2:35 UTC (permalink / raw)
  To: Frederic Weisbecker
  Cc: LKML, Alessio Igor Bogani, Andrew Morton, Avi Kivity,
	Chris Metcalf, Christoph Lameter, Geoff Levand, Gilad Ben Yossef,
	Hakan Akkan, Ingo Molnar, Paul E. McKenney, Paul Gortmaker,
	Peter Zijlstra, Thomas Gleixner, Li Zhong

On Thu, 2012-12-20 at 19:32 +0100, Frederic Weisbecker wrote:
> Hi,
> 

Nice work Frederic!

> So this is a new version of the nohz cpusets based on 3.7, except it's not using
> cpusets anymore and I actually based it on the middle of the 3.8 merge window
> in order to get latest upstream full dynticks preparatory work: cputime cleanups,
> RCU user mode, context tracking subsystem, nohz code consolidation, ...
> 
> So the big changes since the last nohz cpuset release are:
> 
> * printk now uses irq work so it doesn't rely on the tick anymore (provided
> your arch implements irq work with IPIs or alike). This chunk has been proposed
> for the 3.8 merge window: https://lkml.org/lkml/2012/12/17/177
> May be Linus will pull, may be not. We'll see. In any case I've included it in this tree
> but I'm not reposting this part of the patchset to avoid spamming you.
> 
> * cputime doesn't rely on IPIs anymore. Now the reader does a special computation to
> remotely get the tickless cputime.
> 
> * No more cpusets interface. Paul McKenney suggested me to start with a boot time
> kernel parameter to define the full dynticks cpumask. And he was totally right, it
> makes the code much more simple. That's a good way to start and to make the mainlining
> easier. We can still add a runtime configuration later if necessary.
> 
> * Now there is always a CPU handling the timekeeping. This can be further optimized
> and more power-friendly, I really did something simple-stupid. I guess we'll try to get
> that into a better shape with Hakan. But at least the timekeeping now works.
> 
> * It uses the new RCU callbacks offlining feature. This way a full dynticks CPU doesn't
> need to keep the tick to handle local callbacks. This is still very experimental though.
> 
> * No more specific IPI vector for full dynticks. We just use the scheduler ipi.
> 
> The branch is:
> 
> git://git.kernel.org/pub/scm/linux/kernel/git/frederic/linux-dynticks.git
> 	3.7-nohz1
> 
> There is still quite some work to do.
> 
> == How to use? ==
> 
> Select:
> 	CONFIG_NO_HZ
> 	CONFIG_RCU_USER_QS
> 	CONFIG_VIRT_CPU_ACCOUNTING_GEN
> 	CONFIG_RCU_NOCB_CPU
> 	CONFIG_NO_HZ_FULL
> 
> You always need at least one timekeeping CPU.
> 
> Let's imagine you have 4 CPUs. We keep the CPU 0 to offline RCU callbacks there and to
> handle the timekeeping. We set the rest as full dynticks. So you need the following kernel
> parameters:
> 
> 	rcu_nocbs=1-3 full_nohz=1-3
> 
> (Note rcu_nocbs value must always be the same as full_nohz).

Why? You can't have: rcu_nocbs=1-4 full_nohz=1-3
  or: rcu_nocbs=1-3 full_nohz=1-4 ?

That needs to be fixed, either with a warning and/or by forcing the two
to be the same. That is, if they specify:

  rcu_nocbs=1-3 full_nohz=1-4

Then set rcu_nocbs=1-4 with a warning about it. Or simply set
 full_nohz=1-3.
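
Something along these lines would do (made-up names, just to show the
idea; it assumes both masks are reachable from a single init call, which
the current code doesn't provide):

	/*
	 * Hypothetical sanity check, not in the posted series: run once
	 * both "rcu_nocbs=" and "full_nohz=" have been parsed.
	 */
	static int __init full_nohz_verify_rcu_nocbs(void)
	{
		if (!cpumask_subset(full_nohz_mask, rcu_nocb_mask)) {
			pr_warn("NO_HZ: full_nohz CPUs not covered by rcu_nocbs, extending rcu_nocbs\n");
			cpumask_or(rcu_nocb_mask, rcu_nocb_mask, full_nohz_mask);
		}
		return 0;
	}
	early_initcall(full_nohz_verify_rcu_nocbs);

That way a typo on the command line degrades into a warning instead of a
CPU silently keeping its tick.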

-- Steve

> 
> Now if you want proper isolation you need to:
> 
> * Migrate your processes adequately
> * Migrate your irqs to CPU 0
> * Migrate the RCU nocb threads to CPU 0. Example with the above configuration:
> 
> 	for p in $(ps -o pid= -C rcuo1,rcuo2,rcuo3)
> 	do
> 		taskset -cp 0 $p
> 	done
> 
> Then run what you want on the full dynticks CPUs. For best results, run 1 task
> per CPU, mostly in userspace and mostly CPU bound (otherwise more IO = more kernel
> mode execution = more chances to get IPIs, tick restarted, workqueues, kthreads, etc...)
> 
> This page contains a good reminder for those interested in CPU isolation: https://github.com/gby/linux/wiki
> 
> But keep in mind that my tree is not yet ready for serious production.
> 



^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: [PATCH 02/24] cputime: Generic on-demand virtual cputime accounting
  2012-12-20 18:32 ` [PATCH 02/24] cputime: Generic on-demand virtual cputime accounting Frederic Weisbecker
@ 2012-12-21  5:11   ` Steven Rostedt
  2012-12-26  8:19   ` Li Zhong
  1 sibling, 0 replies; 44+ messages in thread
From: Steven Rostedt @ 2012-12-21  5:11 UTC (permalink / raw)
  To: Frederic Weisbecker
  Cc: LKML, Alessio Igor Bogani, Andrew Morton, Avi Kivity,
	Chris Metcalf, Christoph Lameter, Geoff Levand, Gilad Ben Yossef,
	Hakan Akkan, Ingo Molnar, Paul E. McKenney, Paul Gortmaker,
	Peter Zijlstra, Thomas Gleixner

On Thu, 2012-12-20 at 19:32 +0100, Frederic Weisbecker wrote:

> diff --git a/kernel/sched/cputime.c b/kernel/sched/cputime.c
> index 293b202..da0a9e7 100644
> --- a/kernel/sched/cputime.c
> +++ b/kernel/sched/cputime.c
> @@ -3,6 +3,7 @@
>  #include <linux/tsacct_kern.h>
>  #include <linux/kernel_stat.h>
>  #include <linux/static_key.h>
> +#include <linux/context_tracking.h>
>  #include "sched.h"
>  
> 
> @@ -495,10 +496,24 @@ void vtime_task_switch(struct task_struct *prev)
>  #ifndef __ARCH_HAS_VTIME_ACCOUNT
>  void vtime_account(struct task_struct *tsk)
>  {
> -	if (in_interrupt() || !is_idle_task(tsk))
> -		vtime_account_system(tsk);
> -	else
> -		vtime_account_idle(tsk);
> +	if (!in_interrupt()) {
> +		/*
> +		 * If we interrupted user, context_tracking_in_user()
> +		 * is 1 because the context tracking don't hook
> +		 * on irq entry/exit. This way we know if
> +		 * we need to flush user time on kernel entry.
> +		 */
> +		if (context_tracking_in_user()) {
> +			vtime_account_user(tsk);
> +			return;
> +		}
> +
> +		if (is_idle_task(tsk)) {
> +			vtime_account_idle(tsk);
> +			return;
> +		}
> +	}
> +	vtime_account_system(tsk);
>  }
>  EXPORT_SYMBOL_GPL(vtime_account);
>  #endif /* __ARCH_HAS_VTIME_ACCOUNT */
> @@ -586,4 +601,72 @@ void thread_group_cputime_adjusted(struct task_struct *p, cputime_t *ut, cputime
>  	thread_group_cputime(p, &cputime);
>  	cputime_adjust(&cputime, &p->signal->prev_cputime, ut, st);
>  }
> -#endif

Deleted #endif here.

> +
> +#ifdef CONFIG_VIRT_CPU_ACCOUNTING_GEN

Added #ifdef here.

> +static DEFINE_PER_CPU(long, last_jiffies) = INITIAL_JIFFIES;
> +
> +static cputime_t get_vtime_delta(void)
> +{
> +	long delta;
> +
> +	delta = jiffies - __this_cpu_read(last_jiffies);
> +	__this_cpu_add(last_jiffies, delta);
> +
> +	return jiffies_to_cputime(delta);
> +}
> +
> +void vtime_account_system(struct task_struct *tsk)
> +{
> +	cputime_t delta_cpu = get_vtime_delta();
> +
> +	account_system_time(tsk, irq_count(), delta_cpu, cputime_to_scaled(delta_cpu));
> +}
> +
> +void vtime_account_user(struct task_struct *tsk)
> +{
> +	cputime_t delta_cpu = get_vtime_delta();
> +
> +	/*
> +	 * This is an unfortunate hack: if we flush user time only on
> +	 * irq entry, we miss the jiffies update and the time is spuriously
> +	 * accounted to system time.
> +	 */
> +	if (context_tracking_in_user())
> +		account_user_time(tsk, delta_cpu, cputime_to_scaled(delta_cpu));
> +}
> +
> +void vtime_account_idle(struct task_struct *tsk)
> +{
> +	cputime_t delta_cpu = get_vtime_delta();
> +
> +	account_idle_time(delta_cpu);
> +}
> +
> +static int __cpuinit vtime_cpu_notify(struct notifier_block *self,
> +				      unsigned long action, void *hcpu)
> +{
> +	long cpu = (long)hcpu;
> +	long *last_jiffies_cpu = per_cpu_ptr(&last_jiffies, cpu);
> +
> +	switch (action) {
> +	case CPU_UP_PREPARE:
> +	case CPU_UP_PREPARE_FROZEN:
> +		/*
> +		 * CHECKME: ensure that's visible by the CPU
> +		 * once it wakes up
> +		 */
> +		*last_jiffies_cpu = jiffies;
> +	default:
> +		break;
> +	}
> +
> +	return NOTIFY_OK;
> +}
> +
> +static int __init init_vtime(void)
> +{
> +	cpu_notifier(vtime_cpu_notify, 0);
> +	return 0;
> +}
> +early_initcall(init_vtime);
> +#endif /* CONFIG_VIRT_CPU_ACCOUNTING_GEN */

Added #endif here

Hmm, there's a missing #endif somewhere, which must explain my error message:

  kernel/sched/cputime.c:448:0: error: unterminated #else

Looks like a possible mismerge.
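
My guess at the layout this is aiming for, once the missing #endif comes
back (simplified down to just the guards, so take it with a grain of salt):

	#ifdef CONFIG_VIRT_CPU_ACCOUNTING
		/* arch assisted precise (vtime) accounting */
	#else
		/* tick based accounting + cputime_adjust() */
	#endif /* CONFIG_VIRT_CPU_ACCOUNTING */

	#ifdef CONFIG_VIRT_CPU_ACCOUNTING_GEN
		/* generic, context tracking based vtime accounting */
	#endif /* CONFIG_VIRT_CPU_ACCOUNTING_GEN */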

-- Steve




^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: [ANNOUNCE] 3.7-nohz1
  2012-12-20 18:32 [ANNOUNCE] 3.7-nohz1 Frederic Weisbecker
                   ` (24 preceding siblings ...)
  2012-12-21  2:35 ` [ANNOUNCE] 3.7-nohz1 Steven Rostedt
@ 2012-12-21  5:20 ` Hakan Akkan
  25 siblings, 0 replies; 44+ messages in thread
From: Hakan Akkan @ 2012-12-21  5:20 UTC (permalink / raw)
  To: Frederic Weisbecker
  Cc: LKML, Alessio Igor Bogani, Andrew Morton, Avi Kivity,
	Chris Metcalf, Christoph Lameter, Geoff Levand, Gilad Ben Yossef,
	Ingo Molnar, Paul E. McKenney, Paul Gortmaker, Peter Zijlstra,
	Steven Rostedt, Thomas Gleixner, Li Zhong

Hi,

On Thu, Dec 20, 2012 at 11:32 AM, Frederic Weisbecker
<fweisbec@gmail.com> wrote:
> Hi,
>
> So this is a new version of the nohz cpusets based on 3.7, except it's not using
> cpusets anymore and I actually based it on the middle of the 3.8 merge window
> in order to get latest upstream full dynticks preparatory work: cputime cleanups,
> RCU user mode, context tracking subsystem, nohz code consolidation, ...
>
> So the big changes since the last nohz cpuset release are:
>
> * printk now uses irq work so it doesn't rely on the tick anymore (provided
> your arch implements irq work with IPIs or alike). This chunk has been proposed
> for the 3.8 merge window: https://lkml.org/lkml/2012/12/17/177
> May be Linus will pull, may be not. We'll see. In any case I've included it in this tree
> but I'm not reposting this part of the patchset to avoid spamming you.
>
> * cputime doesn't rely on IPIs anymore. Now the reader does a special computation to
> remotely get the tickless cputime.
>
> * No more cpusets interface. Paul McKenney suggested me to start with a boot time
> kernel parameter to define the full dynticks cpumask. And he was totally right, it
> makes the code much more simple. That's a good way to start and to make the mainlining
> easier. We can still add a runtime configuration later if necessary.

It would be nice to have the runtime configuration ability. A percpu control
file such as /sys/devices/system/cpu/cpuX/isol could configure that cpu with
different levels of isolation. Users could echo bitmasks where each bit is
associated with a level of isolation: echo 0 disables all isolation, bit 1
disables RCU callbacks on that CPU, bit 2 isolates the CPU from the general
scheduler just like the isolcpus boot argument does, bit 3 pushes all irqs
away, bit 4 turns off the tick, etc.

I always hoped that someone would make isolcpus a runtime option, so I guess
it is time to get my hands dirty. Any pointers for this?
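
Something like the sketch below is what I have in mind. It is completely
hypothetical: the attribute name, the bit layout and the
cpu_apply_isolation() helper don't exist anywhere, only the sysfs plumbing
is real kernel API:

	static DEFINE_PER_CPU(unsigned long, cpu_isol_bits);

	static ssize_t isol_show(struct device *dev,
				 struct device_attribute *attr, char *buf)
	{
		return sprintf(buf, "%#lx\n", per_cpu(cpu_isol_bits, dev->id));
	}

	static ssize_t isol_store(struct device *dev,
				  struct device_attribute *attr,
				  const char *buf, size_t count)
	{
		unsigned long bits;

		if (kstrtoul(buf, 0, &bits))
			return -EINVAL;

		per_cpu(cpu_isol_bits, dev->id) = bits;
		/* hypothetical helper that would apply the requested isolation */
		cpu_apply_isolation(dev->id, bits);
		return count;
	}
	static DEVICE_ATTR(isol, 0644, isol_show, isol_store);

Each attribute would be registered with
device_create_file(get_cpu_device(cpu), &dev_attr_isol), and then
"echo 0xf > /sys/devices/system/cpu/cpu3/isol" would request full
isolation of CPU 3.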

>
> * Now there is always a CPU handling the timekeeping. This can be further optimized
> and more power-friendly, I really did something simple-stupid. I guess we'll try to get
> that into a better shape with Hakan. But at least the timekeeping now works.

Will look into it.

>
> * It uses the new RCU callbacks offlining feature. This way a full dynticks CPU doesn't
> need to keep the tick to handle local callbacks. This is still very experimental though.
>
> * No more specific IPI vector for full dynticks. We just use the scheduler ipi.
>
> The branch is:
>
> git://git.kernel.org/pub/scm/linux/kernel/git/frederic/linux-dynticks.git
>         3.7-nohz1
>
> There is still quite some work to do.
>
> == How to use? ==
>
> Select:
>         CONFIG_NO_HZ
>         CONFIG_RCU_USER_QS
>         CONFIG_VIRT_CPU_ACCOUNTING_GEN
>         CONFIG_RCU_NOCB_CPU
>         CONFIG_NO_HZ_FULL
>
> You always need at least one timekeeping CPU.
>
> Let's imagine you have 4 CPUs. We keep the CPU 0 to offline RCU callbacks there and to
> handle the timekeeping. We set the rest as full dynticks. So you need the following kernel
> parameters:
>
>         rcu_nocbs=1-3 full_nohz=1-3
>
> (Note rcu_nocbs value must always be the same as full_nohz).
>
> Now if you want proper isolation you need to:
>
> * Migrate your processes adequately
> * Migrate your irqs to CPU 0
> * Migrate the RCU nocb threads to CPU 0. Example with the above configuration:
>
>         for p in $(ps -o pid= -C rcuo1,rcuo2,rcuo3)
>         do
>                 taskset -cp 0 $p
>         done
>
> Then run what you want on the full dynticks CPUs. For best results, run 1 task
> per CPU, mostly in userspace and mostly CPU bound (otherwise more IO = more kernel
> mode execution = more chances to get IPIs, tick restarted, workqueues, kthreads, etc...)
>
> This page contains a good reminder for those interested in CPU isolation: https://github.com/gby/linux/wiki
>
> But keep in mind that my tree is not yet ready for serious production.
>
> Happy Christmas, new year or whatever end of the world.
> ---
>
> Frederic Weisbecker (32):
>       irq_work: Fix racy IRQ_WORK_BUSY flag setting
>       irq_work: Fix racy check on work pending flag
>       irq_work: Remove CONFIG_HAVE_IRQ_WORK
>       nohz: Add API to check tick state
>       irq_work: Don't stop the tick with pending works
>       irq_work: Make self-IPIs optable
>       printk: Wake up klogd using irq_work
>       Merge branch 'nohz/printk-v8' into 3.7-nohz1-stage
>       context_tracking: Add comments on interface and internals
>       cputime: Generic on-demand virtual cputime accounting
>       cputime: Allow dynamic switch between tick/virtual based cputime accounting
>       cputime: Use accessors to read task cputime stats
>       cputime: Safely read cputime of full dynticks CPUs
>       nohz: Basic full dynticks interface
>       nohz: Assign timekeeping duty to a non-full-nohz CPU
>       nohz: Trace timekeeping update
>       nohz: Wake up full dynticks CPUs when a timer gets enqueued
>       rcu: Restart the tick on non-responding full dynticks CPUs
>       sched: Comment on rq->clock correctness in ttwu_do_wakeup() in nohz
>       sched: Update rq clock on nohz CPU before migrating tasks
>       sched: Update rq clock on nohz CPU before setting fair group shares
>       sched: Update rq clock on tickless CPUs before calling check_preempt_curr()
>       sched: Update rq clock earlier in unthrottle_cfs_rq
>       sched: Update clock of nohz busiest rq before balancing
>       sched: Update rq clock before idle balancing
>       sched: Update nohz rq clock before searching busiest group on load balancing
>       nohz: Move nohz load balancer selection into idle logic
>       nohz: Full dynticks mode
>       nohz: Only stop the tick on RCU nocb CPUs
>       nohz: Don't turn off the tick if rcu needs it
>       nohz: Don't stop the tick if posix cpu timers are running
>       nohz: Add some tracing
>
> Steven Rostedt (2):
>       irq_work: Flush work on CPU_DYING
>       irq_work: Warn if there's still work on cpu_down
>
>  arch/alpha/Kconfig                  |    1 -
>  arch/alpha/kernel/osf_sys.c         |    6 +-
>  arch/arm/Kconfig                    |    1 -
>  arch/arm64/Kconfig                  |    1 -
>  arch/blackfin/Kconfig               |    1 -
>  arch/frv/Kconfig                    |    1 -
>  arch/hexagon/Kconfig                |    1 -
>  arch/mips/Kconfig                   |    1 -
>  arch/parisc/Kconfig                 |    1 -
>  arch/powerpc/Kconfig                |    1 -
>  arch/s390/Kconfig                   |    1 -
>  arch/s390/kernel/vtime.c            |    4 +-
>  arch/sh/Kconfig                     |    1 -
>  arch/sparc/Kconfig                  |    1 -
>  arch/x86/Kconfig                    |    1 -
>  arch/x86/kernel/apm_32.c            |   11 +-
>  drivers/isdn/mISDN/stack.c          |    7 +-
>  drivers/staging/iio/trigger/Kconfig |    1 -
>  fs/binfmt_elf.c                     |    8 +-
>  fs/binfmt_elf_fdpic.c               |    7 +-
>  include/asm-generic/cputime.h       |    1 +
>  include/linux/context_tracking.h    |   28 +++++
>  include/linux/hardirq.h             |    4 +-
>  include/linux/init_task.h           |    9 ++
>  include/linux/irq_work.h            |   20 +++
>  include/linux/kernel_stat.h         |    2 +-
>  include/linux/posix-timers.h        |    1 +
>  include/linux/printk.h              |    3 -
>  include/linux/rcupdate.h            |    8 ++
>  include/linux/sched.h               |   48 +++++++-
>  include/linux/tick.h                |   26 ++++-
>  include/linux/vtime.h               |   47 +++++---
>  init/Kconfig                        |   22 +++-
>  kernel/acct.c                       |    6 +-
>  kernel/context_tracking.c           |   91 +++++++++++----
>  kernel/cpu.c                        |    4 +-
>  kernel/delayacct.c                  |    7 +-
>  kernel/exit.c                       |    6 +-
>  kernel/fork.c                       |    8 +-
>  kernel/irq_work.c                   |  131 ++++++++++++++++-----
>  kernel/posix-cpu-timers.c           |   39 +++++-
>  kernel/printk.c                     |   36 +++---
>  kernel/rcutree.c                    |   19 +++-
>  kernel/rcutree_plugin.h             |   13 +--
>  kernel/sched/core.c                 |   69 +++++++++++-
>  kernel/sched/cputime.c              |  222 ++++++++++++++++++++++++++++++-----
>  kernel/sched/fair.c                 |   42 +++++++-
>  kernel/sched/sched.h                |   15 +++
>  kernel/signal.c                     |   12 ++-
>  kernel/softirq.c                    |   11 +-
>  kernel/time/Kconfig                 |    9 ++
>  kernel/time/tick-broadcast.c        |    3 +-
>  kernel/time/tick-common.c           |    5 +-
>  kernel/time/tick-sched.c            |  142 ++++++++++++++++++++---
>  kernel/timer.c                      |    3 +-
>  kernel/tsacct.c                     |   19 ++-
>  56 files changed, 955 insertions(+), 233 deletions(-)

^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: [PATCH 03/24] cputime: Allow dynamic switch between tick/virtual based cputime accounting
  2012-12-20 18:32 ` [PATCH 03/24] cputime: Allow dynamic switch between tick/virtual based " Frederic Weisbecker
@ 2012-12-21 15:05   ` Steven Rostedt
  2012-12-22 17:43     ` Frederic Weisbecker
  0 siblings, 1 reply; 44+ messages in thread
From: Steven Rostedt @ 2012-12-21 15:05 UTC (permalink / raw)
  To: Frederic Weisbecker
  Cc: LKML, Alessio Igor Bogani, Andrew Morton, Avi Kivity,
	Chris Metcalf, Christoph Lameter, Geoff Levand, Gilad Ben Yossef,
	Hakan Akkan, Ingo Molnar, Paul E. McKenney, Paul Gortmaker,
	Peter Zijlstra, Thomas Gleixner

On Thu, 2012-12-20 at 19:32 +0100, Frederic Weisbecker wrote:

> diff --git a/kernel/sched/cputime.c b/kernel/sched/cputime.c
> index da0a9e7..e1fcab4 100644
> --- a/kernel/sched/cputime.c
> +++ b/kernel/sched/cputime.c
> @@ -317,8 +317,6 @@ out:
>  	rcu_read_unlock();
>  }
>  
> -#ifndef CONFIG_VIRT_CPU_ACCOUNTING
> -
>  #ifdef CONFIG_IRQ_TIME_ACCOUNTING
>  /*
>   * Account a tick to a process and cpustat
> @@ -388,6 +386,7 @@ static void irqtime_account_process_tick(struct task_struct *p, int user_tick,
>  						struct rq *rq) {}
>  #endif /* CONFIG_IRQ_TIME_ACCOUNTING */
>  
> +#ifndef CONFIG_VIRT_CPU_ACCOUNTING_NATIVE
>  /*
>   * Account a single tick of cpu time.
>   * @p: the process that the cpu time gets accounted to
> @@ -398,6 +397,11 @@ void account_process_tick(struct task_struct *p, int user_tick)
>  	cputime_t one_jiffy_scaled = cputime_to_scaled(cputime_one_jiffy);
>  	struct rq *rq = this_rq();
>  
> +	if (vtime_accounting()) {
> +		vtime_account_user(p);
> +		return;
> +	}
> +
>  	if (sched_clock_irqtime) {
>  		irqtime_account_process_tick(p, user_tick, rq);
>  		return;
> @@ -439,29 +443,13 @@ void account_idle_ticks(unsigned long ticks)
>  
>  	account_idle_time(jiffies_to_cputime(ticks));
>  }
> -
>  #endif
>  
> +
>  /*
>   * Use precise platform statistics if available:
>   */
>  #ifdef CONFIG_VIRT_CPU_ACCOUNTING
> -void task_cputime_adjusted(struct task_struct *p, cputime_t *ut, cputime_t *st)
> -{
> -	*ut = p->utime;
> -	*st = p->stime;
> -}
> -
> -void thread_group_cputime_adjusted(struct task_struct *p, cputime_t *ut, cputime_t *st)
> -{
> -	struct task_cputime cputime;
> -
> -	thread_group_cputime(p, &cputime);
> -
> -	*ut = cputime.utime;
> -	*st = cputime.stime;
> -}
> -
>  void vtime_account_system_irqsafe(struct task_struct *tsk)
>  {
>  	unsigned long flags;
> @@ -517,8 +505,25 @@ void vtime_account(struct task_struct *tsk)
>  }
>  EXPORT_SYMBOL_GPL(vtime_account);
>  #endif /* __ARCH_HAS_VTIME_ACCOUNT */
> +#endif /* CONFIG_VIRT_CPU_ACCOUNTING */
>  
> -#else
> +#ifdef CONFIG_VIRT_CPU_ACCOUNTING_NATIVE
> +void task_cputime_adjusted(struct task_struct *p, cputime_t *ut, cputime_t *st)
> +{
> +	*ut = p->utime;
> +	*st = p->stime;
> +}
> +
> +void thread_group_cputime_adjusted(struct task_struct *p, cputime_t *ut, cputime_t *st)
> +{
> +	struct task_cputime cputime;
> +
> +	thread_group_cputime(p, &cputime);
> +
> +	*ut = cputime.utime;
> +	*st = cputime.stime;
> +}
> +#else /* !CONFIG_VIRT_CPU_ACCOUNTING_NATIVE */
>  
>  #ifndef nsecs_to_cputime
>  # define nsecs_to_cputime(__nsecs)	nsecs_to_jiffies(__nsecs)
> @@ -548,6 +553,12 @@ static void cputime_adjust(struct task_cputime *curr,
>  {
>  	cputime_t rtime, utime, total;
>  
> +	if (vtime_accounting()) {
> +		*ut = curr->utime;
> +		*st = curr->stime;
> +		return;
> +	}
> +
>  	utime = curr->utime;
>  	total = utime + curr->stime;
>  
> @@ -601,6 +612,7 @@ void thread_group_cputime_adjusted(struct task_struct *p, cputime_t *ut, cputime
>  	thread_group_cputime(p, &cputime);
>  	cputime_adjust(&cputime, &p->signal->prev_cputime, ut, st);
>  }
> +#endif /* !CONFIG_VIRT_CPU_ACCOUNTING_NATIVE */

Ah, the missing #endif gets added back here.

-- Steve

>  
>  #ifdef CONFIG_VIRT_CPU_ACCOUNTING_GEN
>  static DEFINE_PER_CPU(long, last_jiffies) = INITIAL_JIFFIES;
> @@ -642,6 +654,11 @@ void vtime_account_idle(struct task_struct *tsk)
>  	account_idle_time(delta_cpu);
>  }
>  
> +bool vtime_accounting(void)
> +{
> +	return context_tracking_active();
> +}
> +
>  static int __cpuinit vtime_cpu_notify(struct notifier_block *self,
>  				      unsigned long action, void *hcpu)
>  {
> diff --git a/kernel/time/tick-sched.c b/kernel/time/tick-sched.c
> index fb8e5e4..ad0e6fa 100644
> --- a/kernel/time/tick-sched.c
> +++ b/kernel/time/tick-sched.c
> @@ -632,8 +632,11 @@ static void tick_nohz_restart_sched_tick(struct tick_sched *ts, ktime_t now)
>  
>  static void tick_nohz_account_idle_ticks(struct tick_sched *ts)
>  {
> -#ifndef CONFIG_VIRT_CPU_ACCOUNTING
> +#ifndef CONFIG_VIRT_CPU_ACCOUNTING_NATIVE
>  	unsigned long ticks;
> +
> +	if (vtime_accounting())
> +		return;
>  	/*
>  	 * We stopped the tick in idle. Update process times would miss the
>  	 * time we slept as update_process_times does only a 1 tick



^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: [PATCH 05/24] cputime: Safely read cputime of full dynticks CPUs
  2012-12-20 18:32 ` [PATCH 05/24] cputime: Safely read cputime of full dynticks CPUs Frederic Weisbecker
@ 2012-12-21 15:09   ` Steven Rostedt
  2012-12-22 17:51     ` Frederic Weisbecker
  0 siblings, 1 reply; 44+ messages in thread
From: Steven Rostedt @ 2012-12-21 15:09 UTC (permalink / raw)
  To: Frederic Weisbecker
  Cc: LKML, Alessio Igor Bogani, Andrew Morton, Avi Kivity,
	Chris Metcalf, Christoph Lameter, Geoff Levand, Gilad Ben Yossef,
	Hakan Akkan, Ingo Molnar, Paul E. McKenney, Paul Gortmaker,
	Peter Zijlstra, Thomas Gleixner

On Thu, 2012-12-20 at 19:32 +0100, Frederic Weisbecker wrote:

> --- a/include/linux/init_task.h
> +++ b/include/linux/init_task.h
> @@ -10,6 +10,7 @@
>  #include <linux/pid_namespace.h>
>  #include <linux/user_namespace.h>
>  #include <linux/securebits.h>
> +#include <linux/seqlock.h>
>  #include <net/net_namespace.h>
>  
>  #ifdef CONFIG_SMP
> @@ -141,6 +142,13 @@ extern struct task_group root_task_group;
>  # define INIT_PERF_EVENTS(tsk)
>  #endif
>  
> +#ifdef CONFIG_VIRT_CPU_ACCOUNTING_GEN
> +#define INIT_VTIME(tsk)						\
> +	.vtime_seqlock = __SEQLOCK_UNLOCKED(tsk.vtime_seqlock),	\
> +	.prev_jiffies = INITIAL_JIFFIES, /* CHECKME */		\
> +	.prev_jiffies_whence = JIFFIES_SYS,

#else
# define INIT_VTIME(tsk)
#endif

Otherwise it fails to compile when CONFIG_VIRT_CPU_ACCOUNTING_GEN is not
set.

-- Steve

> +#endif
> +
>  #define INIT_TASK_COMM "swapper"
>  
>  /*
> @@ -210,6 +218,7 @@ extern struct task_group root_task_group;
>  	INIT_TRACE_RECURSION						\
>  	INIT_TASK_RCU_PREEMPT(tsk)					\
>  	INIT_CPUSET_SEQ							\
> +	INIT_VTIME(tsk)							\
>  }
>  
> 
> diff --git a/include/linux/sched.h b/include/linux/sched.h
> index 031afd0..727b988 100644
> --- a/include/linux/sched.h
> +++ b/include/linux/sched.h
> @@ -1360,6 +1360,15 @@ struct task_struct {
>  #ifndef CONFIG_VIRT_CPU_ACCOUNTING_NATIVE
>  	struct cputime prev_cputime;
>  #endif
> +#ifdef CONFIG_VIRT_CPU_ACCOUNTING_GEN
> +	seqlock_t vtime_seqlock;
> +	long prev_jiffies;
> +	enum {
> +		JIFFIES_SLEEPING = 0,
> +		JIFFIES_USER,
> +		JIFFIES_SYS,
> +	} prev_jiffies_whence;
> +#endif
>  	unsigned long nvcsw, nivcsw; /* context switch counts */
>  	struct timespec start_time; 		/* monotonic time */
>  	struct timespec real_start_time;	/* boot based time */
> @@ -1769,6 +1778,12 @@ static inline void put_task_struct(struct task_struct *t)
>  		__put_task_struct(t);
>  }
>  
> +#ifdef CONFIG_VIRT_CPU_ACCOUNTING_GEN
> +extern void task_cputime(struct task_struct *t,
> +			 cputime_t *utime, cputime_t *stime);
> +extern void task_cputime_scaled(struct task_struct *t,
> +				cputime_t *utimescaled, cputime_t *stimescaled);
> +#else
>  static inline void task_cputime(struct task_struct *t,
>  				cputime_t *utime, cputime_t *stime)
>  {
> @@ -1787,6 +1802,7 @@ static inline void task_cputime_scaled(struct task_struct *t,
>  	if (stimescaled)
>  		*stimescaled = t->stimescaled;
>  }
> +#endif
>  extern void task_cputime_adjusted(struct task_struct *p, cputime_t *ut, cputime_t *st);
>  extern void thread_group_cputime_adjusted(struct task_struct *p, cputime_t *ut, cputime_t *st);
>  
> diff --git a/include/linux/vtime.h b/include/linux/vtime.h
> index e57020d..81c7d84 100644
> --- a/include/linux/vtime.h
> +++ b/include/linux/vtime.h
> @@ -9,52 +9,52 @@ extern void vtime_account_system(struct task_struct *tsk);
>  extern void vtime_account_system_irqsafe(struct task_struct *tsk);
>  extern void vtime_account_idle(struct task_struct *tsk);
>  extern void vtime_account_user(struct task_struct *tsk);
> -extern void vtime_account(struct task_struct *tsk);
> +extern void vtime_account_irq_enter(struct task_struct *tsk);
>  
> -#ifdef CONFIG_VIRT_CPU_ACCOUNTING_GEN
> -extern bool vtime_accounting(void);
> -#else
> +#ifdef CONFIG_VIRT_CPU_ACCOUNTING_NATIVE
>  static inline bool vtime_accounting(void) { return true; }
>  #endif
>  
>  #else /* !CONFIG_VIRT_CPU_ACCOUNTING */
> +
>  static inline void vtime_task_switch(struct task_struct *prev) { }
>  static inline void vtime_account_system(struct task_struct *tsk) { }
>  static inline void vtime_account_system_irqsafe(struct task_struct *tsk) { }
>  static inline void vtime_account_user(struct task_struct *tsk) { }
> -static inline void vtime_account(struct task_struct *tsk) { }
> +static inline void vtime_account_irq_enter(struct task_struct *tsk) { }
>  static inline bool vtime_accounting(void) { return false; }
>  #endif
>  
>  #ifdef CONFIG_VIRT_CPU_ACCOUNTING_GEN
> -static inline void arch_vtime_task_switch(struct task_struct *tsk) { }
> +extern void arch_vtime_task_switch(struct task_struct *tsk);
> +extern void vtime_account_irq_exit(struct task_struct *tsk);
> +extern void vtime_user_enter(struct task_struct *tsk);
> +extern bool vtime_accounting(void);
> +#else
> +static inline void vtime_account_irq_exit(struct task_struct *tsk)
> +{
> +	/* On hard|softirq exit we always account to hard|softirq cputime */
> +	vtime_account_system(tsk);
> +}
> +static inline void vtime_enter_user(struct task_struct *tsk) { }
>  #endif
>  
> +
>  #ifdef CONFIG_IRQ_TIME_ACCOUNTING
>  extern void irqtime_account_irq(struct task_struct *tsk);
>  #else
>  static inline void irqtime_account_irq(struct task_struct *tsk) { }
>  #endif
>  
> -static inline void vtime_account_irq_enter(struct task_struct *tsk)
> +static inline void account_irq_enter_time(struct task_struct *tsk)
>  {
> -	/*
> -	 * Hardirq can interrupt idle task anytime. So we need vtime_account()
> -	 * that performs the idle check in CONFIG_VIRT_CPU_ACCOUNTING.
> -	 * Softirq can also interrupt idle task directly if it calls
> -	 * local_bh_enable(). Such case probably don't exist but we never know.
> -	 * Ksoftirqd is not concerned because idle time is flushed on context
> -	 * switch. Softirqs in the end of hardirqs are also not a problem because
> -	 * the idle time is flushed on hardirq time already.
> -	 */
> -	vtime_account(tsk);
> +	vtime_account_irq_enter(tsk);
>  	irqtime_account_irq(tsk);
>  }
>  
> -static inline void vtime_account_irq_exit(struct task_struct *tsk)
> +static inline void account_irq_exit_time(struct task_struct *tsk)
>  {
> -	/* On hard|softirq exit we always account to hard|softirq cputime */
> -	vtime_account_system(tsk);
> +	vtime_account_irq_exit(tsk);
>  	irqtime_account_irq(tsk);
>  }
>  
> diff --git a/kernel/context_tracking.c b/kernel/context_tracking.c
> index ca1e073..bd2f2fc 100644
> --- a/kernel/context_tracking.c
> +++ b/kernel/context_tracking.c
> @@ -56,7 +56,7 @@ void user_enter(void)
>  	local_irq_save(flags);
>  	if (__this_cpu_read(context_tracking.active) &&
>  	    __this_cpu_read(context_tracking.state) != IN_USER) {
> -		vtime_account_system(current);
> +		vtime_user_enter(current);
>  		/*
>  		 * At this stage, only low level arch entry code remains and
>  		 * then we'll run in userspace. We can assume there won't be
> diff --git a/kernel/fork.c b/kernel/fork.c
> index a81efb8..efafcba 100644
> --- a/kernel/fork.c
> +++ b/kernel/fork.c
> @@ -1224,6 +1224,12 @@ static struct task_struct *copy_process(unsigned long clone_flags,
>  #ifndef CONFIG_VIRT_CPU_ACCOUNTING_NATIVE
>  	p->prev_cputime.utime = p->prev_cputime.stime = 0;
>  #endif
> +#ifdef CONFIG_VIRT_CPU_ACCOUNTING_GEN
> +	seqlock_init(&p->vtime_seqlock);
> +	p->prev_jiffies_whence = JIFFIES_SLEEPING; /*CHECKME: idle tasks? */
> +	p->prev_jiffies = jiffies;
> +#endif
> +
>  #if defined(SPLIT_RSS_COUNTING)
>  	memset(&p->rss_stat, 0, sizeof(p->rss_stat));
>  #endif
> diff --git a/kernel/sched/cputime.c b/kernel/sched/cputime.c
> index 0603671..3f25e60 100644
> --- a/kernel/sched/cputime.c
> +++ b/kernel/sched/cputime.c
> @@ -484,7 +484,7 @@ void vtime_task_switch(struct task_struct *prev)
>   * vtime_account().
>   */
>  #ifndef __ARCH_HAS_VTIME_ACCOUNT
> -void vtime_account(struct task_struct *tsk)
> +void vtime_account_irq_enter(struct task_struct *tsk)
>  {
>  	if (!in_interrupt()) {
>  		/*
> @@ -505,7 +505,7 @@ void vtime_account(struct task_struct *tsk)
>  	}
>  	vtime_account_system(tsk);
>  }
> -EXPORT_SYMBOL_GPL(vtime_account);
> +EXPORT_SYMBOL_GPL(vtime_account_irq_enter);
>  #endif /* __ARCH_HAS_VTIME_ACCOUNT */
>  #endif /* CONFIG_VIRT_CPU_ACCOUNTING */
>  
> @@ -616,41 +616,67 @@ void thread_group_cputime_adjusted(struct task_struct *p, cputime_t *ut, cputime
>  #endif /* !CONFIG_VIRT_CPU_ACCOUNTING_NATIVE */
>  
>  #ifdef CONFIG_VIRT_CPU_ACCOUNTING_GEN
> -static DEFINE_PER_CPU(long, last_jiffies) = INITIAL_JIFFIES;
> -
> -static cputime_t get_vtime_delta(void)
> +static cputime_t get_vtime_delta(struct task_struct *tsk)
>  {
>  	long delta;
>  
> -	delta = jiffies - __this_cpu_read(last_jiffies);
> -	__this_cpu_add(last_jiffies, delta);
> +	delta = jiffies - tsk->prev_jiffies;
> +	tsk->prev_jiffies += delta;
>  
>  	return jiffies_to_cputime(delta);
>  }
>  
> -void vtime_account_system(struct task_struct *tsk)
> +static void __vtime_account_system(struct task_struct *tsk)
>  {
> -	cputime_t delta_cpu = get_vtime_delta();
> +	cputime_t delta_cpu = get_vtime_delta(tsk);
>  
>  	account_system_time(tsk, irq_count(), delta_cpu, cputime_to_scaled(delta_cpu));
>  }
>  
> +void vtime_account_system(struct task_struct *tsk)
> +{
> +	write_seqlock(&tsk->vtime_seqlock);
> +	__vtime_account_system(tsk);
> +	write_sequnlock(&tsk->vtime_seqlock);
> +}
> +
> +void vtime_account_irq_exit(struct task_struct *tsk)
> +{
> +	write_seqlock(&tsk->vtime_seqlock);
> +	if (context_tracking_in_user())
> +		tsk->prev_jiffies_whence = JIFFIES_USER;
> +	__vtime_account_system(tsk);
> +	write_sequnlock(&tsk->vtime_seqlock);
> +}
> +
>  void vtime_account_user(struct task_struct *tsk)
>  {
> -	cputime_t delta_cpu = get_vtime_delta();
> +	cputime_t delta_cpu = get_vtime_delta(tsk);
>  
>  	/*
>  	 * This is an unfortunate hack: if we flush user time only on
>  	 * irq entry, we miss the jiffies update and the time is spuriously
>  	 * accounted to system time.
>  	 */
> -	if (context_tracking_in_user())
> +	if (context_tracking_in_user()) {
> +		write_seqlock(&tsk->vtime_seqlock);
> +		tsk->prev_jiffies_whence = JIFFIES_SYS;
>  		account_user_time(tsk, delta_cpu, cputime_to_scaled(delta_cpu));
> +		write_sequnlock(&tsk->vtime_seqlock);
> +	}
> +}
> +
> +void vtime_user_enter(struct task_struct *tsk)
> +{
> +	write_seqlock(&tsk->vtime_seqlock);
> +	tsk->prev_jiffies_whence = JIFFIES_USER;
> +	__vtime_account_system(tsk);
> +	write_sequnlock(&tsk->vtime_seqlock);
>  }
>  
>  void vtime_account_idle(struct task_struct *tsk)
>  {
> -	cputime_t delta_cpu = get_vtime_delta();
> +	cputime_t delta_cpu = get_vtime_delta(tsk);
>  
>  	account_idle_time(delta_cpu);
>  }
> @@ -660,31 +686,64 @@ bool vtime_accounting(void)
>  	return context_tracking_active();
>  }
>  
> -static int __cpuinit vtime_cpu_notify(struct notifier_block *self,
> -				      unsigned long action, void *hcpu)
> +void arch_vtime_task_switch(struct task_struct *prev)
>  {
> -	long cpu = (long)hcpu;
> -	long *last_jiffies_cpu = per_cpu_ptr(&last_jiffies, cpu);
> +	write_seqlock(&prev->vtime_seqlock);
> +	prev->prev_jiffies_whence = JIFFIES_SLEEPING;
> +	write_sequnlock(&prev->vtime_seqlock);
>  
> -	switch (action) {
> -	case CPU_UP_PREPARE:
> -	case CPU_UP_PREPARE_FROZEN:
> -		/*
> -		 * CHECKME: ensure that's visible by the CPU
> -		 * once it wakes up
> -		 */
> -		*last_jiffies_cpu = jiffies;
> -	default:
> -		break;
> -	}
> +	write_seqlock(&current->vtime_seqlock);
> +	current->prev_jiffies_whence = JIFFIES_SYS;
> +	current->prev_jiffies = jiffies;
> +	write_sequnlock(&current->vtime_seqlock);
> +}
> +
> +void task_cputime(struct task_struct *t, cputime_t *utime, cputime_t *stime)
> +{
> +	unsigned int seq;
> +	long delta;
> +
> +	do {
> +		seq = read_seqbegin(&t->vtime_seqlock);
> +
> +		*utime = t->utime;
> +		*stime = t->stime;
> +
> +		if (t->prev_jiffies_whence == JIFFIES_SLEEPING || 
> +		    is_idle_task(t))
> +			continue;
>  
> -	return NOTIFY_OK;
> +		delta = jiffies - t->prev_jiffies;
> +
> +		if (t->prev_jiffies_whence == JIFFIES_USER)
> +			*utime += delta;
> +		else if (t->prev_jiffies_whence == JIFFIES_SYS)
> +			*stime += delta;
> +	} while (read_seqretry(&t->vtime_seqlock, seq));
>  }
>  
> -static int __init init_vtime(void)
> +void task_cputime_scaled(struct task_struct *t,
> +			 cputime_t *utimescaled, cputime_t *stimescaled)
>  {
> -	cpu_notifier(vtime_cpu_notify, 0);
> -	return 0;
> +	unsigned int seq;
> +	long delta;
> +
> +	do {
> +		seq = read_seqbegin(&t->vtime_seqlock);
> +
> +		*utimescaled = t->utimescaled;
> +		*stimescaled = t->stimescaled;
> +
> +		if (t->prev_jiffies_whence == JIFFIES_SLEEPING || 
> +		    is_idle_task(t))
> +			continue;
> +
> +		delta = jiffies - t->prev_jiffies;
> +
> +		if (t->prev_jiffies_whence == JIFFIES_USER)
> +			*utimescaled += jiffies_to_scaled(delta);
> +		else if (t->prev_jiffies_whence == JIFFIES_SYS)
> +			*stimescaled += jiffies_to_scaled(delta);
> +	} while (read_seqretry(&t->vtime_seqlock, seq));
>  }
> -early_initcall(init_vtime);
>  #endif /* CONFIG_VIRT_CPU_ACCOUNTING_GEN */
> diff --git a/kernel/softirq.c b/kernel/softirq.c
> index ed567ba..f5cc25f 100644
> --- a/kernel/softirq.c
> +++ b/kernel/softirq.c
> @@ -221,7 +221,7 @@ asmlinkage void __do_softirq(void)
>  	current->flags &= ~PF_MEMALLOC;
>  
>  	pending = local_softirq_pending();
> -	vtime_account_irq_enter(current);
> +	account_irq_enter_time(current);
>  
>  	__local_bh_disable((unsigned long)__builtin_return_address(0),
>  				SOFTIRQ_OFFSET);
> @@ -272,7 +272,7 @@ restart:
>  
>  	lockdep_softirq_exit();
>  
> -	vtime_account_irq_exit(current);
> +	account_irq_exit_time(current);
>  	__local_bh_enable(SOFTIRQ_OFFSET);
>  	tsk_restore_flags(current, old_flags, PF_MEMALLOC);
>  }
> @@ -341,7 +341,7 @@ static inline void invoke_softirq(void)
>   */
>  void irq_exit(void)
>  {
> -	vtime_account_irq_exit(current);
> +	account_irq_exit_time(current);
>  	trace_hardirq_exit();
>  	sub_preempt_count(IRQ_EXIT_OFFSET);
>  	if (!in_interrupt() && local_softirq_pending())




* Re: [PATCH 07/24] nohz: Assign timekeeping duty to a non-full-nohz CPU
  2012-12-20 18:32 ` [PATCH 07/24] nohz: Assign timekeeping duty to a non-full-nohz CPU Frederic Weisbecker
@ 2012-12-21 16:13   ` Steven Rostedt
  2012-12-22 16:39     ` Frederic Weisbecker
  0 siblings, 1 reply; 44+ messages in thread
From: Steven Rostedt @ 2012-12-21 16:13 UTC (permalink / raw)
  To: Frederic Weisbecker
  Cc: LKML, Alessio Igor Bogani, Andrew Morton, Avi Kivity,
	Chris Metcalf, Christoph Lameter, Geoff Levand, Gilad Ben Yossef,
	Hakan Akkan, Ingo Molnar, Paul E. McKenney, Paul Gortmaker,
	Peter Zijlstra, Thomas Gleixner

On Thu, 2012-12-20 at 19:32 +0100, Frederic Weisbecker wrote:

kernel/time/tick-sched.c:517:6: error: have_full_nohz_mask undeclared
(first use in this function)


> --- a/kernel/time/tick-sched.c
> +++ b/kernel/time/tick-sched.c
> @@ -112,7 +112,8 @@ static void tick_sched_do_timer(ktime_t now)
>  	 * this duty, then the jiffies update is still serialized by
>  	 * jiffies_lock.
>  	 */
> -	if (unlikely(tick_do_timer_cpu == TICK_DO_TIMER_NONE))
> +	if (unlikely(tick_do_timer_cpu == TICK_DO_TIMER_NONE)
> +	    && !tick_nohz_full_cpu(cpu))
>  		tick_do_timer_cpu = cpu;
>  #endif
>  
> @@ -512,6 +513,10 @@ static bool can_stop_idle_tick(int cpu, struct tick_sched *ts)
>  		return false;
>  	}
>  
> +	/* If there are full nohz CPUs around, we need to keep the timekeeping duty */
> +	if (have_full_nohz_mask && tick_do_timer_cpu == cpu)
> +		return false;
> +
>  	return true;
>  }
>  

Fold in:

diff --git a/kernel/time/tick-sched.c b/kernel/time/tick-sched.c
index 4a68b50..00e5682 100644
--- a/kernel/time/tick-sched.c
+++ b/kernel/time/tick-sched.c
@@ -164,6 +164,8 @@ static int __init tick_nohz_full_setup(char *str)
 	return 1;
 }
 __setup("full_nohz=", tick_nohz_full_setup);
+#else
+#define have_full_nohz_mask 0
 #endif
 
 /*


-- Steve




* Re: [PATCH 07/24] nohz: Assign timekeeping duty to a non-full-nohz CPU
  2012-12-21 16:13   ` Steven Rostedt
@ 2012-12-22 16:39     ` Frederic Weisbecker
  2012-12-22 17:05       ` Steven Rostedt
  0 siblings, 1 reply; 44+ messages in thread
From: Frederic Weisbecker @ 2012-12-22 16:39 UTC (permalink / raw)
  To: Steven Rostedt
  Cc: LKML, Alessio Igor Bogani, Andrew Morton, Avi Kivity,
	Chris Metcalf, Christoph Lameter, Geoff Levand, Gilad Ben Yossef,
	Hakan Akkan, Ingo Molnar, Paul E. McKenney, Paul Gortmaker,
	Peter Zijlstra, Thomas Gleixner

2012/12/21 Steven Rostedt <rostedt@goodmis.org>:
> On Thu, 2012-12-20 at 19:32 +0100, Frederic Weisbecker wrote:
>
> kernel/time/tick-sched.c:517:6: error: have_full_nohz_mask undeclared
> (first use in this function)
>
>
>> --- a/kernel/time/tick-sched.c
>> +++ b/kernel/time/tick-sched.c
>> @@ -112,7 +112,8 @@ static void tick_sched_do_timer(ktime_t now)
>>        * this duty, then the jiffies update is still serialized by
>>        * jiffies_lock.
>>        */
>> -     if (unlikely(tick_do_timer_cpu == TICK_DO_TIMER_NONE))
>> +     if (unlikely(tick_do_timer_cpu == TICK_DO_TIMER_NONE)
>> +         && !tick_nohz_full_cpu(cpu))
>>               tick_do_timer_cpu = cpu;
>>  #endif
>>
>> @@ -512,6 +513,10 @@ static bool can_stop_idle_tick(int cpu, struct tick_sched *ts)
>>               return false;
>>       }
>>
>> +     /* If there are full nohz CPUs around, we need to keep the timekeeping duty */
>> +     if (have_full_nohz_mask && tick_do_timer_cpu == cpu)
>> +             return false;
>> +
>>       return true;
>>  }
>>
>
> Fold in:
>
> diff --git a/kernel/time/tick-sched.c b/kernel/time/tick-sched.c
> index 4a68b50..00e5682 100644
> --- a/kernel/time/tick-sched.c
> +++ b/kernel/time/tick-sched.c
> @@ -164,6 +164,8 @@ static int __init tick_nohz_full_setup(char *str)
>         return 1;
>  }
>  __setup("full_nohz=", tick_nohz_full_setup);
> +#else
> +#define have_full_nohz_mask 0
>  #endif

Ah thanks! I'm folding your patch.
I can add your SOB, right?


* Re: [PATCH 07/24] nohz: Assign timekeeping duty to a non-full-nohz CPU
  2012-12-22 16:39     ` Frederic Weisbecker
@ 2012-12-22 17:05       ` Steven Rostedt
  0 siblings, 0 replies; 44+ messages in thread
From: Steven Rostedt @ 2012-12-22 17:05 UTC (permalink / raw)
  To: Frederic Weisbecker
  Cc: LKML, Alessio Igor Bogani, Andrew Morton, Avi Kivity,
	Chris Metcalf, Christoph Lameter, Geoff Levand, Gilad Ben Yossef,
	Hakan Akkan, Ingo Molnar, Paul E. McKenney, Paul Gortmaker,
	Peter Zijlstra, Thomas Gleixner

On Sat, 2012-12-22 at 17:39 +0100, Frederic Weisbecker wrote:

> Ah thanks! I'm folding your patch.
> I can add your SOB, right?

Sure, feel free to add my SOB on any of the changes I sent to you.

-- Steve




* Re: [PATCH 03/24] cputime: Allow dynamic switch between tick/virtual based cputime accounting
  2012-12-21 15:05   ` Steven Rostedt
@ 2012-12-22 17:43     ` Frederic Weisbecker
  0 siblings, 0 replies; 44+ messages in thread
From: Frederic Weisbecker @ 2012-12-22 17:43 UTC (permalink / raw)
  To: Steven Rostedt
  Cc: LKML, Alessio Igor Bogani, Andrew Morton, Avi Kivity,
	Chris Metcalf, Christoph Lameter, Geoff Levand, Gilad Ben Yossef,
	Hakan Akkan, Ingo Molnar, Paul E. McKenney, Paul Gortmaker,
	Peter Zijlstra, Thomas Gleixner

2012/12/21 Steven Rostedt <rostedt@goodmis.org>:
>>@@ -601,6 +612,7 @@ void thread_group_cputime_adjusted(struct task_struct *p, cputime_t *ut, cputime
>>       thread_group_cputime(p, &cputime);
>>       cputime_adjust(&cputime, &p->signal->prev_cputime, ut, st);
>>  }
>> +#endif /* !CONFIG_VIRT_CPU_ACCOUNTING_NATIVE */
>
> Ah, the missing #endif gets added back here.

Right, I fixed the mid-state breakage for the next version.

Thanks for the report!


* Re: [PATCH 05/24] cputime: Safely read cputime of full dynticks CPUs
  2012-12-21 15:09   ` Steven Rostedt
@ 2012-12-22 17:51     ` Frederic Weisbecker
  0 siblings, 0 replies; 44+ messages in thread
From: Frederic Weisbecker @ 2012-12-22 17:51 UTC (permalink / raw)
  To: Steven Rostedt
  Cc: LKML, Alessio Igor Bogani, Andrew Morton, Avi Kivity,
	Chris Metcalf, Christoph Lameter, Geoff Levand, Gilad Ben Yossef,
	Hakan Akkan, Ingo Molnar, Paul E. McKenney, Paul Gortmaker,
	Peter Zijlstra, Thomas Gleixner

2012/12/21 Steven Rostedt <rostedt@goodmis.org>:
> On Thu, 2012-12-20 at 19:32 +0100, Frederic Weisbecker wrote:
>
>> --- a/include/linux/init_task.h
>> +++ b/include/linux/init_task.h
>> @@ -10,6 +10,7 @@
>>  #include <linux/pid_namespace.h>
>>  #include <linux/user_namespace.h>
>>  #include <linux/securebits.h>
>> +#include <linux/seqlock.h>
>>  #include <net/net_namespace.h>
>>
>>  #ifdef CONFIG_SMP
>> @@ -141,6 +142,13 @@ extern struct task_group root_task_group;
>>  # define INIT_PERF_EVENTS(tsk)
>>  #endif
>>
>> +#ifdef CONFIG_VIRT_CPU_ACCOUNTING_GEN
>> +#define INIT_VTIME(tsk)                                              \
>> +     .vtime_seqlock = __SEQLOCK_UNLOCKED(tsk.vtime_seqlock), \
>> +     .prev_jiffies = INITIAL_JIFFIES, /* CHECKME */          \
>> +     .prev_jiffies_whence = JIFFIES_SYS,
>
> #else
> # define INIT_VTIME(tsk)
> #endif
>
> Otherwise it fails to compile when CONFIG_VIRT_CPU_ACCOUNTING_GEN is not
> set.

Fixed for the next version, thanks!


* Re: [ANNOUNCE] 3.7-nohz1
  2012-12-21  2:35 ` [ANNOUNCE] 3.7-nohz1 Steven Rostedt
@ 2012-12-23 23:43   ` Frederic Weisbecker
  2012-12-30  3:56     ` Paul E. McKenney
  0 siblings, 1 reply; 44+ messages in thread
From: Frederic Weisbecker @ 2012-12-23 23:43 UTC (permalink / raw)
  To: Steven Rostedt
  Cc: LKML, Alessio Igor Bogani, Andrew Morton, Avi Kivity,
	Chris Metcalf, Christoph Lameter, Geoff Levand, Gilad Ben Yossef,
	Hakan Akkan, Ingo Molnar, Paul E. McKenney, Paul Gortmaker,
	Peter Zijlstra, Thomas Gleixner, Li Zhong

2012/12/21 Steven Rostedt <rostedt@goodmis.org>:
> On Thu, 2012-12-20 at 19:32 +0100, Frederic Weisbecker wrote:
>> Let's imagine you have 4 CPUs. We keep the CPU 0 to offline RCU callbacks there and to
>> handle the timekeeping. We set the rest as full dynticks. So you need the following kernel
>> parameters:
>>
>>       rcu_nocbs=1-3 full_nohz=1-3
>>
>> (Note rcu_nocbs value must always be the same as full_nohz).
>
> Why? You can't have: rcu_nocbs=1-4 full_nohz=1-3

That should be allowed.

>   or: rcu_nocbs=1-3 full_nohz=1-4 ?

But not that.

You need to have: rcu_nocbs & full_nohz == full_nohz. This is because
the tick is not there to maintain the local RCU callbacks anymore. So
this must be offloaded to the rcu_nocb threads.

I just have a doubt with rcu_nocb. Do we still need the tick to
complete the grace period for local rcu callbacks? I need to discuss
that with Paul.

>
> That needs to be fixed. Either with a warning, and/or to force the two
> to be the same. That is, if they specify:
>
>   rcu_nocbs=1-3 full_nohz=1-4
>
> Then set rcu_nocbs=1-4 with a warning about it. Or simply set
>  full_nohz=1-3.

Yep, will do.
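
Probably something along these lines, just to give the idea (a sketch
only -- the mask names full_nohz_mask and rcu_nocb_mask are assumptions
here, not necessarily what the tree ends up using):

	/*
	 * Warn and shrink full_nohz if it is not a subset of rcu_nocbs,
	 * ie. enforce rcu_nocbs & full_nohz == full_nohz at boot.
	 */
	static int __init full_nohz_check_rcu_nocbs(void)
	{
		if (!cpumask_subset(full_nohz_mask, rcu_nocb_mask)) {
			pr_warn("NOHZ: full_nohz must be a subset of rcu_nocbs, "
				"restricting full_nohz to the intersection\n");
			cpumask_and(full_nohz_mask, full_nohz_mask, rcu_nocb_mask);
		}
		return 0;
	}
	early_initcall(full_nohz_check_rcu_nocbs);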

Thanks!

>
> -- Steve
>
>>
>> Now if you want proper isolation you need to:
>>
>> * Migrate your processes adequately
>> * Migrate your irqs to CPU 0
>> * Migrate the RCU nocb threads to CPU 0. Example with the above configuration:
>>
>>       for p in $(ps -o pid= -C rcuo1,rcuo2,rcuo3)
>>       do
>>               taskset -cp 0 $p
>>       done
>>
>> Then run what you want on the full dynticks CPUs. For best results, run 1 task
>> per CPU, mostly in userspace and mostly CPU bound (otherwise more IO = more kernel
>> mode execution = more chances to get IPIs, tick restarted, workqueues, kthreads, etc...)
>>
>> This page contains a good reminder for those interested in CPU isolation: https://github.com/gby/linux/wiki
>>
>> But keep in mind that my tree is not yet ready for serious production.
>>
>
>


* Re: [PATCH 20/24] nohz: Full dynticks mode
  2012-12-20 18:33 ` [PATCH 20/24] nohz: Full dynticks mode Frederic Weisbecker
@ 2012-12-26  6:12   ` Namhyung Kim
  2012-12-26  7:02     ` Namhyung Kim
  2012-12-29 13:21     ` Frederic Weisbecker
  0 siblings, 2 replies; 44+ messages in thread
From: Namhyung Kim @ 2012-12-26  6:12 UTC (permalink / raw)
  To: Frederic Weisbecker
  Cc: LKML, Alessio Igor Bogani, Andrew Morton, Avi Kivity,
	Chris Metcalf, Christoph Lameter, Geoff Levand, Gilad Ben Yossef,
	Hakan Akkan, Ingo Molnar, Paul E. McKenney, Paul Gortmaker,
	Peter Zijlstra, Steven Rostedt, Thomas Gleixner

Hi Frederic,

On Thu, 20 Dec 2012 19:33:07 +0100, Frederic Weisbecker wrote:
> When a CPU is in full dynticks mode, try to switch
> it to nohz mode from the interrupt exit path if it is
> running a single non-idle task.
>
> Then restart the tick if necessary if we are enqueuing a
> second task while the timer is stopped, so that the scheduler
> tick is rearmed.
>
> [TODO: Check remaining things to be done from scheduler_tick()]
>
> [ Included build fix from Geoff Levand ]
[snip]
> diff --git a/kernel/sched/sched.h b/kernel/sched/sched.h
> index db3d4df..f3d8f4a 100644
> --- a/kernel/sched/sched.h
> +++ b/kernel/sched/sched.h
> @@ -943,6 +943,14 @@ static inline u64 steal_ticks(u64 steal)
>  static inline void inc_nr_running(struct rq *rq)
>  {
>  	rq->nr_running++;
> +
> +	if (rq->nr_running == 2) {
> +		if (tick_nohz_full_cpu(rq->cpu)) {
> +			/* Order rq->nr_running write against the IPI */
> +			smp_wmb();
> +			smp_send_reschedule(rq->cpu);
> +		}
> +	}

This block should be guarded with #ifdef CONFIG_SMP, otherwise:

  CC      kernel/sched/core.o
In file included from /home/namhyung/project/linux/kernel/sched/core.c:85:0:
/home/namhyung/project/linux/kernel/sched/sched.h: In function ‘inc_nr_running’:
/home/namhyung/project/linux/kernel/sched/sched.h:960:28: error: ‘struct rq’ has no member named ‘cpu’
/home/namhyung/project/linux/kernel/sched/sched.h:963:26: error: ‘struct rq’ has no member named ‘cpu’


Thanks,
Namhyung

>  }


* Re: [PATCH 20/24] nohz: Full dynticks mode
  2012-12-26  6:12   ` Namhyung Kim
@ 2012-12-26  7:02     ` Namhyung Kim
  2012-12-29 13:21     ` Frederic Weisbecker
  1 sibling, 0 replies; 44+ messages in thread
From: Namhyung Kim @ 2012-12-26  7:02 UTC (permalink / raw)
  To: Frederic Weisbecker
  Cc: LKML, Alessio Igor Bogani, Andrew Morton, Avi Kivity,
	Chris Metcalf, Christoph Lameter, Geoff Levand, Gilad Ben Yossef,
	Hakan Akkan, Ingo Molnar, Paul E. McKenney, Paul Gortmaker,
	Peter Zijlstra, Steven Rostedt, Thomas Gleixner

On Wed, 26 Dec 2012 15:12:22 +0900, Namhyung Kim wrote:
> Hi Frederic,
>
> On Thu, 20 Dec 2012 19:33:07 +0100, Frederic Weisbecker wrote:
>> When a CPU is in full dynticks mode, try to switch
>> it to nohz mode from the interrupt exit path if it is
>> running a single non-idle task.
>>
>> Then restart the tick if necessary if we are enqueuing a
>> second task while the timer is stopped, so that the scheduler
>> tick is rearmed.
>>
>> [TODO: Check remaining things to be done from scheduler_tick()]
>>
>> [ Included build fix from Geoff Levand ]
> [snip]
>> diff --git a/kernel/sched/sched.h b/kernel/sched/sched.h
>> index db3d4df..f3d8f4a 100644
>> --- a/kernel/sched/sched.h
>> +++ b/kernel/sched/sched.h
>> @@ -943,6 +943,14 @@ static inline u64 steal_ticks(u64 steal)
>>  static inline void inc_nr_running(struct rq *rq)
>>  {
>>  	rq->nr_running++;
>> +
>> +	if (rq->nr_running == 2) {
>> +		if (tick_nohz_full_cpu(rq->cpu)) {
>> +			/* Order rq->nr_running write against the IPI */
>> +			smp_wmb();
>> +			smp_send_reschedule(rq->cpu);
>> +		}
>> +	}
>
> This block should be guarded with #ifdef CONFIG_SMP, otherwise:

Or apply something like this:


diff --git a/kernel/sched/sched.h b/kernel/sched/sched.h
index 0cd913ed29de..8de13ca88a17 100644
--- a/kernel/sched/sched.h
+++ b/kernel/sched/sched.h
@@ -957,10 +957,10 @@ static inline void inc_nr_running(struct rq *rq)
 	rq->nr_running++;
 
 	if (rq->nr_running == 2) {
-		if (tick_nohz_full_cpu(rq->cpu)) {
+		if (tick_nohz_full_cpu(cpu_of(rq))) {
 			/* Order rq->nr_running write against the IPI */
 			smp_wmb();
-			smp_send_reschedule(rq->cpu);
+			smp_send_reschedule(cpu_of(rq));
 		}
 	}
 }
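
The rq->cpu access is the only thing cpu_of() changes here: it already
hides the SMP dependency, roughly like this (from kernel/sched/sched.h,
quoting from memory):

	static inline int cpu_of(struct rq *rq)
	{
	#ifdef CONFIG_SMP
		return rq->cpu;
	#else
		return 0;
	#endif
	}

so on UP the check collapses to tick_nohz_full_cpu(0). Whether
smp_send_reschedule() is even visible on UP builds is a separate
question, so an #ifdef or a config dependency may still be needed.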


Thanks,
Namhyung


* Re: [PATCH 02/24] cputime: Generic on-demand virtual cputime accounting
  2012-12-20 18:32 ` [PATCH 02/24] cputime: Generic on-demand virtual cputime accounting Frederic Weisbecker
  2012-12-21  5:11   ` Steven Rostedt
@ 2012-12-26  8:19   ` Li Zhong
  2012-12-29 13:15     ` Frederic Weisbecker
  1 sibling, 1 reply; 44+ messages in thread
From: Li Zhong @ 2012-12-26  8:19 UTC (permalink / raw)
  To: Frederic Weisbecker
  Cc: LKML, Alessio Igor Bogani, Andrew Morton, Avi Kivity,
	Chris Metcalf, Christoph Lameter, Geoff Levand, Gilad Ben Yossef,
	Hakan Akkan, Ingo Molnar, Paul E. McKenney, Paul Gortmaker,
	Peter Zijlstra, Steven Rostedt, Thomas Gleixner

On Thu, 2012-12-20 at 19:32 +0100, Frederic Weisbecker wrote:
> If we want to stop the tick further than idle, we need to be
> able to account the cputime without using the tick.
> 
> Virtual based cputime accounting solves that problem by
> hooking into kernel/user boundaries.
> 
> However, implementing CONFIG_VIRT_CPU_ACCOUNTING requires
> setting low level hooks and involves more overhead. But
> we already have a generic context tracking subsystem
> that is required anyway, for RCU, by archs which want to
> shut down the tick outside idle.
> 
> This patch implements a generic virtual based cputime
> accounting that relies on these generic kernel/user hooks.
> 
> There are some upsides of doing this:
> 
> - This requires no arch code to implement CONFIG_VIRT_CPU_ACCOUNTING
> if context tracking is already built (already necessary for RCU in full
> tickless mode).
> 
> - We can rely on the generic context tracking subsystem to dynamically
> (de)activate the hooks, so that we can switch anytime between virtual
> and tick based accounting. This way we don't have the overhead
> of the virtual accounting when the tick is running periodically.
> 
> And a few downsides:
> 
> - It relies on jiffies and the hooks are set in high level code. This
> results in less precise cputime accounting than with a true native
> virtual based cputime accounting which hooks on low level code and uses
> a cpu hardware clock. Precision is not the goal of this though.
> 
> - There is probably more overhead than a native virtual based cputime
> accounting. But this relies on hooks that are already set anyway.
> 
> Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
> Cc: Alessio Igor Bogani <abogani@kernel.org>
> Cc: Andrew Morton <akpm@linux-foundation.org>
> Cc: Avi Kivity <avi@redhat.com>
> Cc: Chris Metcalf <cmetcalf@tilera.com>
> Cc: Christoph Lameter <cl@linux.com>
> Cc: Geoff Levand <geoff@infradead.org>
> Cc: Gilad Ben Yossef <gilad@benyossef.com>
> Cc: Hakan Akkan <hakanakkan@gmail.com>
> Cc: Ingo Molnar <mingo@kernel.org>
> Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
> Cc: Paul Gortmaker <paul.gortmaker@windriver.com>
> Cc: Peter Zijlstra <peterz@infradead.org>
> Cc: Steven Rostedt <rostedt@goodmis.org>
> Cc: Thomas Gleixner <tglx@linutronix.de>
> ---
>  include/linux/context_tracking.h |   28 +++++++++++
>  include/linux/vtime.h            |    4 ++
>  init/Kconfig                     |   11 ++++-
>  kernel/context_tracking.c        |   22 ++-------
>  kernel/sched/cputime.c           |   93 +++++++++++++++++++++++++++++++++++--
>  5 files changed, 135 insertions(+), 23 deletions(-)
> 
> diff --git a/include/linux/context_tracking.h b/include/linux/context_tracking.h
> index e24339c..9f33fbc 100644
> --- a/include/linux/context_tracking.h
> +++ b/include/linux/context_tracking.h
> @@ -3,12 +3,40 @@
> 
>  #ifdef CONFIG_CONTEXT_TRACKING
>  #include <linux/sched.h>
> +#include <linux/percpu.h>
> +
> +struct context_tracking {
> +	/*
> +	 * When active is false, hooks are unset in order
> +	 * to minimize overhead: TIF flags are cleared
> +	 * and calls to user_enter/exit are ignored. This
> +	 * may be further optimized using static keys.
> +	 */
> +	bool active;
> +	enum {
> +		IN_KERNEL = 0,
> +		IN_USER,
> +	} state;
> +};
> +
> +DECLARE_PER_CPU(struct context_tracking, context_tracking);
> +
> +static inline bool context_tracking_in_user(void)
> +{
> +	return __this_cpu_read(context_tracking.state) == IN_USER;
> +}
> +
> +static inline bool context_tracking_active(void)
> +{
> +	return __this_cpu_read(context_tracking.active);
> +}
> 
>  extern void user_enter(void);
>  extern void user_exit(void);
>  extern void context_tracking_task_switch(struct task_struct *prev,
>  					 struct task_struct *next);
>  #else
> +static inline bool context_tracking_in_user(void) { return false; }
>  static inline void user_enter(void) { }
>  static inline void user_exit(void) { }
>  static inline void context_tracking_task_switch(struct task_struct *prev,
> diff --git a/include/linux/vtime.h b/include/linux/vtime.h
> index ae30ab5..58392aa 100644
> --- a/include/linux/vtime.h
> +++ b/include/linux/vtime.h
> @@ -17,6 +17,10 @@ static inline void vtime_account_system_irqsafe(struct task_struct *tsk) { }
>  static inline void vtime_account(struct task_struct *tsk) { }
>  #endif
> 
> +#ifdef CONFIG_VIRT_CPU_ACCOUNTING_GEN
> +static inline void arch_vtime_task_switch(struct task_struct *tsk) { }
> +#endif
> +
>  #ifdef CONFIG_IRQ_TIME_ACCOUNTING
>  extern void irqtime_account_irq(struct task_struct *tsk);
>  #else
> diff --git a/init/Kconfig b/init/Kconfig
> index 60579d6..a64b3e8 100644
> --- a/init/Kconfig
> +++ b/init/Kconfig
> @@ -340,7 +340,9 @@ config TICK_CPU_ACCOUNTING
> 
>  config VIRT_CPU_ACCOUNTING
>  	bool "Deterministic task and CPU time accounting"
> -	depends on HAVE_VIRT_CPU_ACCOUNTING
> +	depends on HAVE_VIRT_CPU_ACCOUNTING || HAVE_CONTEXT_TRACKING
> +	select VIRT_CPU_ACCOUNTING_GEN if !HAVE_VIRT_CPU_ACCOUNTING
> +	default y if PPC64

I saw 
"init/Kconfig:346:warning: defaults for choice values not supported"
on this line. So maybe we don't need it. And we already have 
        "default VIRT_CPU_ACCOUNTING if PPC64"

Thanks, Zhong

>  	help
>  	  Select this option to enable more accurate task and CPU time
>  	  accounting.  This is done by reading a CPU counter on each
> @@ -363,6 +365,13 @@ config IRQ_TIME_ACCOUNTING
> 
>  endchoice
> 
> +config VIRT_CPU_ACCOUNTING_GEN
> +	select CONTEXT_TRACKING
> +	bool
> +	help
> +	  Implement a generic virtual based cputime accounting by using
> +	  the context tracking subsystem.
> +
>  config BSD_PROCESS_ACCT
>  	bool "BSD Process Accounting"
>  	help
> diff --git a/kernel/context_tracking.c b/kernel/context_tracking.c
> index 9f6c38f..ca1e073 100644
> --- a/kernel/context_tracking.c
> +++ b/kernel/context_tracking.c
> @@ -17,24 +17,10 @@
>  #include <linux/context_tracking.h>
>  #include <linux/rcupdate.h>
>  #include <linux/sched.h>
> -#include <linux/percpu.h>
>  #include <linux/hardirq.h>
> 
> -struct context_tracking {
> -	/*
> -	 * When active is false, hooks are unset in order
> -	 * to minimize overhead: TIF flags are cleared
> -	 * and calls to user_enter/exit are ignored. This
> -	 * may be further optimized using static keys.
> -	 */
> -	bool active;
> -	enum {
> -		IN_KERNEL = 0,
> -		IN_USER,
> -	} state;
> -};
> 
> -static DEFINE_PER_CPU(struct context_tracking, context_tracking) = {
> +DEFINE_PER_CPU(struct context_tracking, context_tracking) = {
>  #ifdef CONFIG_CONTEXT_TRACKING_FORCE
>  	.active = true,
>  #endif
> @@ -70,7 +56,7 @@ void user_enter(void)
>  	local_irq_save(flags);
>  	if (__this_cpu_read(context_tracking.active) &&
>  	    __this_cpu_read(context_tracking.state) != IN_USER) {
> -		__this_cpu_write(context_tracking.state, IN_USER);
> +		vtime_account_system(current);
>  		/*
>  		 * At this stage, only low level arch entry code remains and
>  		 * then we'll run in userspace. We can assume there won't be
> @@ -79,6 +65,7 @@ void user_enter(void)
>  		 * on the tick.
>  		 */
>  		rcu_user_enter();
> +		__this_cpu_write(context_tracking.state, IN_USER);
>  	}
>  	local_irq_restore(flags);
>  }
> @@ -104,12 +91,13 @@ void user_exit(void)
> 
>  	local_irq_save(flags);
>  	if (__this_cpu_read(context_tracking.state) == IN_USER) {
> -		__this_cpu_write(context_tracking.state, IN_KERNEL);
>  		/*
>  		 * We are going to run code that may use RCU. Inform
>  		 * RCU core about that (ie: we may need the tick again).
>  		 */
>  		rcu_user_exit();
> +		vtime_account_user(current);
> +		__this_cpu_write(context_tracking.state, IN_KERNEL);
>  	}
>  	local_irq_restore(flags);
>  }
> diff --git a/kernel/sched/cputime.c b/kernel/sched/cputime.c
> index 293b202..da0a9e7 100644
> --- a/kernel/sched/cputime.c
> +++ b/kernel/sched/cputime.c
> @@ -3,6 +3,7 @@
>  #include <linux/tsacct_kern.h>
>  #include <linux/kernel_stat.h>
>  #include <linux/static_key.h>
> +#include <linux/context_tracking.h>
>  #include "sched.h"
> 
> 
> @@ -495,10 +496,24 @@ void vtime_task_switch(struct task_struct *prev)
>  #ifndef __ARCH_HAS_VTIME_ACCOUNT
>  void vtime_account(struct task_struct *tsk)
>  {
> -	if (in_interrupt() || !is_idle_task(tsk))
> -		vtime_account_system(tsk);
> -	else
> -		vtime_account_idle(tsk);
> +	if (!in_interrupt()) {
> +		/*
> +		 * If we interrupted user, context_tracking_in_user()
> +		 * is 1 because the context tracking doesn't hook
> +		 * on irq entry/exit. This way we know if
> +		 * we need to flush user time on kernel entry.
> +		 */
> +		if (context_tracking_in_user()) {
> +			vtime_account_user(tsk);
> +			return;
> +		}
> +
> +		if (is_idle_task(tsk)) {
> +			vtime_account_idle(tsk);
> +			return;
> +		}
> +	}
> +	vtime_account_system(tsk);
>  }
>  EXPORT_SYMBOL_GPL(vtime_account);
>  #endif /* __ARCH_HAS_VTIME_ACCOUNT */
> @@ -586,4 +601,72 @@ void thread_group_cputime_adjusted(struct task_struct *p, cputime_t *ut, cputime
>  	thread_group_cputime(p, &cputime);
>  	cputime_adjust(&cputime, &p->signal->prev_cputime, ut, st);
>  }
> -#endif
> +
> +#ifdef CONFIG_VIRT_CPU_ACCOUNTING_GEN
> +static DEFINE_PER_CPU(long, last_jiffies) = INITIAL_JIFFIES;
> +
> +static cputime_t get_vtime_delta(void)
> +{
> +	long delta;
> +
> +	delta = jiffies - __this_cpu_read(last_jiffies);
> +	__this_cpu_add(last_jiffies, delta);
> +
> +	return jiffies_to_cputime(delta);
> +}
> +
> +void vtime_account_system(struct task_struct *tsk)
> +{
> +	cputime_t delta_cpu = get_vtime_delta();
> +
> +	account_system_time(tsk, irq_count(), delta_cpu, cputime_to_scaled(delta_cpu));
> +}
> +
> +void vtime_account_user(struct task_struct *tsk)
> +{
> +	cputime_t delta_cpu = get_vtime_delta();
> +
> +	/*
> +	 * This is an unfortunate hack: if we flush user time only on
> +	 * irq entry, we miss the jiffies update and the time is spuriously
> +	 * accounted to system time.
> +	 */
> +	if (context_tracking_in_user())
> +		account_user_time(tsk, delta_cpu, cputime_to_scaled(delta_cpu));
> +}
> +
> +void vtime_account_idle(struct task_struct *tsk)
> +{
> +	cputime_t delta_cpu = get_vtime_delta();
> +
> +	account_idle_time(delta_cpu);
> +}
> +
> +static int __cpuinit vtime_cpu_notify(struct notifier_block *self,
> +				      unsigned long action, void *hcpu)
> +{
> +	long cpu = (long)hcpu;
> +	long *last_jiffies_cpu = per_cpu_ptr(&last_jiffies, cpu);
> +
> +	switch (action) {
> +	case CPU_UP_PREPARE:
> +	case CPU_UP_PREPARE_FROZEN:
> +		/*
> +		 * CHECKME: ensure that's visible by the CPU
> +		 * once it wakes up
> +		 */
> +		*last_jiffies_cpu = jiffies;
> +	default:
> +		break;
> +	}
> +
> +	return NOTIFY_OK;
> +}
> +
> +static int __init init_vtime(void)
> +{
> +	cpu_notifier(vtime_cpu_notify, 0);
> +	return 0;
> +}
> +early_initcall(init_vtime);
> +#endif /* CONFIG_VIRT_CPU_ACCOUNTING_GEN */




* Re: [PATCH 02/24] cputime: Generic on-demand virtual cputime accounting
  2012-12-26  8:19   ` Li Zhong
@ 2012-12-29 13:15     ` Frederic Weisbecker
  0 siblings, 0 replies; 44+ messages in thread
From: Frederic Weisbecker @ 2012-12-29 13:15 UTC (permalink / raw)
  To: Li Zhong
  Cc: LKML, Alessio Igor Bogani, Andrew Morton, Chris Metcalf,
	Christoph Lameter, Geoff Levand, Gilad Ben Yossef, Hakan Akkan,
	Ingo Molnar, Paul E. McKenney, Paul Gortmaker, Peter Zijlstra,
	Steven Rostedt, Thomas Gleixner

2012/12/26 Li Zhong <zhong@linux.vnet.ibm.com>:
> On Thu, 2012-12-20 at 19:32 +0100, Frederic Weisbecker wrote:
>> diff --git a/init/Kconfig b/init/Kconfig
>> index 60579d6..a64b3e8 100644
>> --- a/init/Kconfig
>> +++ b/init/Kconfig
>> @@ -340,7 +340,9 @@ config TICK_CPU_ACCOUNTING
>>
>>  config VIRT_CPU_ACCOUNTING
>>       bool "Deterministic task and CPU time accounting"
>> -     depends on HAVE_VIRT_CPU_ACCOUNTING
>> +     depends on HAVE_VIRT_CPU_ACCOUNTING || HAVE_CONTEXT_TRACKING
>> +     select VIRT_CPU_ACCOUNTING_GEN if !HAVE_VIRT_CPU_ACCOUNTING
>> +     default y if PPC64
>
> I saw
> "init/Kconfig:346:warning: defaults for choice values not supported"
> on this line. So maybe we don't need it. And we already have
>         "default VIRT_CPU_ACCOUNTING if PPC64"
>
> Thanks, Zhong

Fixed for the next version, thanks!


* Re: [PATCH 20/24] nohz: Full dynticks mode
  2012-12-26  6:12   ` Namhyung Kim
  2012-12-26  7:02     ` Namhyung Kim
@ 2012-12-29 13:21     ` Frederic Weisbecker
  1 sibling, 0 replies; 44+ messages in thread
From: Frederic Weisbecker @ 2012-12-29 13:21 UTC (permalink / raw)
  To: Namhyung Kim
  Cc: LKML, Alessio Igor Bogani, Andrew Morton, Avi Kivity,
	Chris Metcalf, Christoph Lameter, Geoff Levand, Gilad Ben Yossef,
	Hakan Akkan, Ingo Molnar, Paul E. McKenney, Paul Gortmaker,
	Peter Zijlstra, Steven Rostedt, Thomas Gleixner

2012/12/26 Namhyung Kim <namhyung@kernel.org>:
> Hi Frederic,
>
> On Thu, 20 Dec 2012 19:33:07 +0100, Frederic Weisbecker wrote:
>> When a CPU is in full dynticks mode, try to switch
>> it to nohz mode from the interrupt exit path if it is
>> running a single non-idle task.
>>
>> Then restart the tick if necessary if we are enqueuing a
>> second task while the timer is stopped, so that the scheduler
>> tick is rearmed.
>>
>> [TODO: Check remaining things to be done from scheduler_tick()]
>>
>> [ Included build fix from Geoff Levand ]
> [snip]
>> diff --git a/kernel/sched/sched.h b/kernel/sched/sched.h
>> index db3d4df..f3d8f4a 100644
>> --- a/kernel/sched/sched.h
>> +++ b/kernel/sched/sched.h
>> @@ -943,6 +943,14 @@ static inline u64 steal_ticks(u64 steal)
>>  static inline void inc_nr_running(struct rq *rq)
>>  {
>>       rq->nr_running++;
>> +
>> +     if (rq->nr_running == 2) {
>> +             if (tick_nohz_full_cpu(rq->cpu)) {
>> +                     /* Order rq->nr_running write against the IPI */
>> +                     smp_wmb();
>> +                     smp_send_reschedule(rq->cpu);
>> +             }
>> +     }
>
> This block should be guarded with #ifdef CONFIG_SMP, otherwise:
>
>   CC      kernel/sched/core.o
> In file included from /home/namhyung/project/linux/kernel/sched/core.c:85:0:
> /home/namhyung/project/linux/kernel/sched/sched.h: In function ‘inc_nr_running’:
> /home/namhyung/project/linux/kernel/sched/sched.h:960:28: error: ‘struct rq’ has no member named ‘cpu’
> /home/namhyung/project/linux/kernel/sched/sched.h:963:26: error: ‘struct rq’ has no member named ‘cpu’

Right, to fix this, I'm making it depend on CONFIG_NO_HZ_FULL and making
CONFIG_NO_HZ_FULL depend on SMP for the next version.
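
Roughly, the hunk would then become something like this (a sketch of
that plan, not the actual follow-up patch):

	static inline void inc_nr_running(struct rq *rq)
	{
		rq->nr_running++;

	#ifdef CONFIG_NO_HZ_FULL
		/* rq->cpu is safe here: NO_HZ_FULL will depend on SMP */
		if (rq->nr_running == 2 && tick_nohz_full_cpu(rq->cpu)) {
			/* Order rq->nr_running write against the IPI */
			smp_wmb();
			smp_send_reschedule(rq->cpu);
		}
	#endif
	}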

Thanks!


* Re: [ANNOUNCE] 3.7-nohz1
  2012-12-23 23:43   ` Frederic Weisbecker
@ 2012-12-30  3:56     ` Paul E. McKenney
  2013-01-04 23:42       ` Frederic Weisbecker
  0 siblings, 1 reply; 44+ messages in thread
From: Paul E. McKenney @ 2012-12-30  3:56 UTC (permalink / raw)
  To: Frederic Weisbecker
  Cc: Steven Rostedt, LKML, Alessio Igor Bogani, Andrew Morton,
	Avi Kivity, Chris Metcalf, Christoph Lameter, Geoff Levand,
	Gilad Ben Yossef, Hakan Akkan, Ingo Molnar, Paul Gortmaker,
	Peter Zijlstra, Thomas Gleixner, Li Zhong

On Mon, Dec 24, 2012 at 12:43:25AM +0100, Frederic Weisbecker wrote:
> 2012/12/21 Steven Rostedt <rostedt@goodmis.org>:
> > On Thu, 2012-12-20 at 19:32 +0100, Frederic Weisbecker wrote:
> >> Let's imagine you have 4 CPUs. We keep the CPU 0 to offline RCU callbacks there and to
> >> handle the timekeeping. We set the rest as full dynticks. So you need the following kernel
> >> parameters:
> >>
> >>       rcu_nocbs=1-3 full_nohz=1-3
> >>
> >> (Note rcu_nocbs value must always be the same as full_nohz).
> >
> > Why? You can't have: rcu_nocbs=1-4 full_nohz=1-3
> 
> That should be allowed.
> 
> >   or: rcu_nocbs=1-3 full_nohz=1-4 ?
> 
> But not that.
> 
> You need to have: rcu_nocbs & full_nohz == full_nohz. This is because
> the tick is not there to maintain the local RCU callbacks anymore. So
> this must be offloaded to the rcu_nocb threads.
> 
> I just have a doubt with rcu_nocb. Do we still need the tick to
> complete the grace period for local rcu callbacks? I need to discuss
> that with Paul.

The tick is only needed if rcu_needs_cpu() returns false.  Of course,
this means that if you don't invoke rcu_needs_cpu() before returning to
adaptive-idle usermode execution, you are correct that a full_nohz CPU
would also have to be a rcu_nocbs CPU.

That said, I am getting close to having an rcu_needs_cpu() that only
returns false if there are callbacks immediately ready to invoke, at
least if RCU_FAST_NO_HZ=y.
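
Concretely, whatever path stops the tick on a full_nohz CPU would keep
asking RCU first, much like the idle path already does.  A rough sketch
(the helper name is made up, and this assumes the current
rcu_needs_cpu(cpu, &delta) signature):

	static bool tick_nohz_full_can_stop_tick(int cpu)
	{
		unsigned long rcu_delta_jiffies;

		/* RCU still needs this CPU's tick to push callbacks along */
		if (rcu_needs_cpu(cpu, &rcu_delta_jiffies))
			return false;

		return true;
	}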

							Thanx, Paul

> > That needs to be fixed. Either with a warning, and/or to force the two
> > to be the same. That is, if they specify:
> >
> >   rcu_nocbs=1-3 full_nohz=1-4
> >
> > Then set rcu_nocbs=1-4 with a warning about it. Or simply set
> >  full_nohz=1-3.
> 
> Yep, will do.
> 
> Thanks!
> 
> >
> > -- Steve
> >
> >>
> >> Now if you want proper isolation you need to:
> >>
> >> * Migrate your processes adequately
> >> * Migrate your irqs to CPU 0
> >> * Migrate the RCU nocb threads to CPU 0. Example with the above configuration:
> >>
> >>       for p in $(ps -o pid= -C rcuo1,rcuo2,rcuo3)
> >>       do
> >>               taskset -cp 0 $p
> >>       done
> >>
> >> Then run what you want on the full dynticks CPUs. For best results, run 1 task
> >> per CPU, mostly in userspace and mostly CPU bound (otherwise more IO = more kernel
> >> mode execution = more chances to get IPIs, tick restarted, workqueues, kthreads, etc...)
> >>
> >> This page contains a good reminder for those interested in CPU isolation: https://github.com/gby/linux/wiki
> >>
> >> But keep in mind that my tree is not yet ready for serious production.
> >>
> >
> >
> 



* Re: [ANNOUNCE] 3.7-nohz1
  2012-12-30  3:56     ` Paul E. McKenney
@ 2013-01-04 23:42       ` Frederic Weisbecker
  2013-01-07 13:06         ` Paul E. McKenney
  0 siblings, 1 reply; 44+ messages in thread
From: Frederic Weisbecker @ 2013-01-04 23:42 UTC (permalink / raw)
  To: paulmck
  Cc: Steven Rostedt, LKML, Alessio Igor Bogani, Andrew Morton,
	Avi Kivity, Chris Metcalf, Christoph Lameter, Geoff Levand,
	Gilad Ben Yossef, Hakan Akkan, Ingo Molnar, Paul Gortmaker,
	Peter Zijlstra, Thomas Gleixner, Li Zhong

2012/12/30 Paul E. McKenney <paulmck@linux.vnet.ibm.com>:
> On Mon, Dec 24, 2012 at 12:43:25AM +0100, Frederic Weisbecker wrote:
>> 2012/12/21 Steven Rostedt <rostedt@goodmis.org>:
>> > On Thu, 2012-12-20 at 19:32 +0100, Frederic Weisbecker wrote:
>> >> Let's imagine you have 4 CPUs. We keep the CPU 0 to offline RCU callbacks there and to
>> >> handle the timekeeping. We set the rest as full dynticks. So you need the following kernel
>> >> parameters:
>> >>
>> >>       rcu_nocbs=1-3 full_nohz=1-3
>> >>
>> >> (Note rcu_nocbs value must always be the same as full_nohz).
>> >
>> > Why? You can't have: rcu_nocbs=1-4 full_nohz=1-3
>>
>> That should be allowed.
>>
>> >   or: rcu_nocbs=1-3 full_nohz=1-4 ?
>>
>> But not that.
>>
>> You need to have: rcu_nocbs & full_nohz == full_nohz. This is because
>> the tick is not there to maintain the local RCU callbacks anymore. So
>> this must be offloaded to the rcu_nocb threads.
>>
>> I just have a doubt with rcu_nocb. Do we still need the tick to
>> complete the grace period for local rcu callbacks? I need to discuss
>> that with Paul.
>
> The tick is only needed if rcu_needs_cpu() returns false.  Of course,
> this means that if you don't invoke rcu_needs_cpu() before returning to
> adaptive-idle usermode execution, you are correct that a full_nohz CPU
> would also have to be a rcu_nocbs CPU.
>
> That said, I am getting close to having an rcu_needs_cpu() that only
> returns false if there are callbacks immediately ready to invoke, at
> least if RCU_FAST_NO_HZ=y.

Ok. Also when a CPU enqueues a callback and starts a grace period, the
tick polls on the grace period completion. How is it handled with
rcu_nocbs CPUs? Does rcu_needs_cpu() return false until the grace
period is completed? If so I still need to restart the local tick
whenever a new callback is enqueued.

Thanks.


* Re: [ANNOUNCE] 3.7-nohz1
  2013-01-04 23:42       ` Frederic Weisbecker
@ 2013-01-07 13:06         ` Paul E. McKenney
  0 siblings, 0 replies; 44+ messages in thread
From: Paul E. McKenney @ 2013-01-07 13:06 UTC (permalink / raw)
  To: Frederic Weisbecker
  Cc: Steven Rostedt, LKML, Alessio Igor Bogani, Andrew Morton,
	Avi Kivity, Chris Metcalf, Christoph Lameter, Geoff Levand,
	Gilad Ben Yossef, Hakan Akkan, Ingo Molnar, Paul Gortmaker,
	Peter Zijlstra, Thomas Gleixner, Li Zhong

On Sat, Jan 05, 2013 at 12:42:53AM +0100, Frederic Weisbecker wrote:
> 2012/12/30 Paul E. McKenney <paulmck@linux.vnet.ibm.com>:
> > On Mon, Dec 24, 2012 at 12:43:25AM +0100, Frederic Weisbecker wrote:
> >> 2012/12/21 Steven Rostedt <rostedt@goodmis.org>:
> >> > On Thu, 2012-12-20 at 19:32 +0100, Frederic Weisbecker wrote:
> >> >> Let's imagine you have 4 CPUs. We keep the CPU 0 to offline RCU callbacks there and to
> >> >> handle the timekeeping. We set the rest as full dynticks. So you need the following kernel
> >> >> parameters:
> >> >>
> >> >>       rcu_nocbs=1-3 full_nohz=1-3
> >> >>
> >> >> (Note rcu_nocbs value must always be the same as full_nohz).
> >> >
> >> > Why? You can't have: rcu_nocbs=1-4 full_nohz=1-3
> >>
> >> That should be allowed.
> >>
> >> >   or: rcu_nocbs=1-3 full_nohz=1-4 ?
> >>
> >> But not that.
> >>
> >> You need to have: rcu_nocbs & full_nohz == full_nohz. This is because
> >> the tick is not there to maintain the local RCU callbacks anymore. So
> >> this must be offloaded to the rcu_nocb threads.
> >>
> >> I just have a doubt with rcu_nocb. Do we still need the tick to
> >> complete the grace period for local rcu callbacks? I need to discuss
> >> that with Paul.
> >
> > The tick is only needed if rcu_needs_cpu() returns false.  Of course,
> > this means that if you don't invoke rcu_needs_cpu() before returning to
> > adaptive-idle usermode execution, you are correct that a full_nohz CPU
> > would also have to be a rcu_nocbs CPU.
> >
> > That said, I am getting close to having an rcu_needs_cpu() that only
> > returns false if there are callbacks immediately ready to invoke, at
> > least if RCU_FAST_NO_HZ=y.
> 
> Ok. Also when a CPU enqueues a callback and starts a grace period, the
> tick polls on the grace period completion.

If RCU_FAST_NO_HZ=n, then yes, this is the case, but only for !rcu_nocbs
CPUs.

>                                            How is it handled with
> rcu_nocbs CPUs? Does rcu_needs_cpu() return false until the grace
> period is completed? If so I still need to restart the local tick
> whenever a new callback is enqueued.

Each rcu_nocbs CPU has a kthread, and that kthread is responsible for
making sure that any needed grace periods move forward.  In mainline, this
is done via CPU 0, which is required to be a !rcu_nocbs CPU.  In -rcu,
the no-CBs kthreads communicate with the grace-period kthread via the
rcu_node tree, so that if all CPUs are rcu_nocbs CPUs, rcu_needs_cpu()
will always return false, even if RCU_FAST_NO_HZ=n.

							Thanx, Paul



Thread overview: 44+ messages
2012-12-20 18:32 [ANNOUNCE] 3.7-nohz1 Frederic Weisbecker
2012-12-20 18:32 ` [PATCH 01/24] context_tracking: Add comments on interface and internals Frederic Weisbecker
2012-12-20 18:32 ` [PATCH 02/24] cputime: Generic on-demand virtual cputime accounting Frederic Weisbecker
2012-12-21  5:11   ` Steven Rostedt
2012-12-26  8:19   ` Li Zhong
2012-12-29 13:15     ` Frederic Weisbecker
2012-12-20 18:32 ` [PATCH 03/24] cputime: Allow dynamic switch between tick/virtual based " Frederic Weisbecker
2012-12-21 15:05   ` Steven Rostedt
2012-12-22 17:43     ` Frederic Weisbecker
2012-12-20 18:32 ` [PATCH 04/24] cputime: Use accessors to read task cputime stats Frederic Weisbecker
2012-12-20 18:32 ` [PATCH 05/24] cputime: Safely read cputime of full dynticks CPUs Frederic Weisbecker
2012-12-21 15:09   ` Steven Rostedt
2012-12-22 17:51     ` Frederic Weisbecker
2012-12-20 18:32 ` [PATCH 06/24] nohz: Basic full dynticks interface Frederic Weisbecker
2012-12-20 18:32 ` [PATCH 07/24] nohz: Assign timekeeping duty to a non-full-nohz CPU Frederic Weisbecker
2012-12-21 16:13   ` Steven Rostedt
2012-12-22 16:39     ` Frederic Weisbecker
2012-12-22 17:05       ` Steven Rostedt
2012-12-20 18:32 ` [PATCH 08/24] nohz: Trace timekeeping update Frederic Weisbecker
2012-12-20 18:32 ` [PATCH 09/24] nohz: Wake up full dynticks CPUs when a timer gets enqueued Frederic Weisbecker
2012-12-20 18:32 ` [PATCH 10/24] rcu: Restart the tick on non-responding full dynticks CPUs Frederic Weisbecker
2012-12-20 18:32 ` [PATCH 11/24] sched: Comment on rq->clock correctness in ttwu_do_wakeup() in nohz Frederic Weisbecker
2012-12-20 18:32 ` [PATCH 12/24] sched: Update rq clock on nohz CPU before migrating tasks Frederic Weisbecker
2012-12-20 18:33 ` [PATCH 13/24] sched: Update rq clock on nohz CPU before setting fair group shares Frederic Weisbecker
2012-12-20 18:33 ` [PATCH 14/24] sched: Update rq clock on tickless CPUs before calling check_preempt_curr() Frederic Weisbecker
2012-12-20 18:33 ` [PATCH 15/24] sched: Update rq clock earlier in unthrottle_cfs_rq Frederic Weisbecker
2012-12-20 18:33 ` [PATCH 16/24] sched: Update clock of nohz busiest rq before balancing Frederic Weisbecker
2012-12-20 18:33 ` [PATCH 17/24] sched: Update rq clock before idle balancing Frederic Weisbecker
2012-12-20 18:33 ` [PATCH 18/24] sched: Update nohz rq clock before searching busiest group on load balancing Frederic Weisbecker
2012-12-20 18:33 ` [PATCH 19/24] nohz: Move nohz load balancer selection into idle logic Frederic Weisbecker
2012-12-20 18:33 ` [PATCH 20/24] nohz: Full dynticks mode Frederic Weisbecker
2012-12-26  6:12   ` Namhyung Kim
2012-12-26  7:02     ` Namhyung Kim
2012-12-29 13:21     ` Frederic Weisbecker
2012-12-20 18:33 ` [PATCH 21/24] nohz: Only stop the tick on RCU nocb CPUs Frederic Weisbecker
2012-12-20 18:33 ` [PATCH 22/24] nohz: Don't turn off the tick if rcu needs it Frederic Weisbecker
2012-12-20 18:33 ` [PATCH 23/24] nohz: Don't stop the tick if posix cpu timers are running Frederic Weisbecker
2012-12-20 18:33 ` [PATCH 24/24] nohz: Add some tracing Frederic Weisbecker
2012-12-21  2:35 ` [ANNOUNCE] 3.7-nohz1 Steven Rostedt
2012-12-23 23:43   ` Frederic Weisbecker
2012-12-30  3:56     ` Paul E. McKenney
2013-01-04 23:42       ` Frederic Weisbecker
2013-01-07 13:06         ` Paul E. McKenney
2012-12-21  5:20 ` Hakan Akkan
