linux-kernel.vger.kernel.org archive mirror
* [patch V2 00/20] timer: Refactor the timer wheel
@ 2016-06-17 13:26 Thomas Gleixner
  2016-06-17 13:26 ` [patch V2 01/20] timer: Make pinned a timer property Thomas Gleixner
                   ` (23 more replies)
  0 siblings, 24 replies; 52+ messages in thread
From: Thomas Gleixner @ 2016-06-17 13:26 UTC (permalink / raw)
  To: LKML
  Cc: Ingo Molnar, Peter Zijlstra, Paul E. McKenney, Eric Dumazet,
	Frederic Weisbecker, Chris Mason, Arjan van de Ven, rt,
	Rik van Riel, Linus Torvalds, George Spelvin, Len Brown

This is the second version of the timer wheel rework series. The first series
can be found here:

   http://lkml.kernel.org/r/20160613070440.950649741@linutronix.de

The series is also available in git:

   git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip.git WIP.timers

Changes vs. V1:

 - Addressed the review comments of V1

     - Fixed the fallout in tty/metag (noticed by Arjan)
     - Renamed the hlist helper (noticed by Paolo/George)
     - Used the proper mask in get_timer_base() (noticed by Richard)
     - Fixed the inverse state check in internal_add_timer() (noticed by Richard)
     - Simplified the macro maze, removed wrapper (noticed by George)
     - Reordered data retrieval in run_timer() (noticed by George)

 - Removed cascading completely

   Expiry times now have a hard cutoff at the capacity of the last wheel
   level. Timers which request a timeout beyond that capacity, i.e. ~6 days,
   will simply expire at the cutoff. From our data gathering, the largest
   observed timeouts are 5 days (networking conntrack), which is well within
   the capacity.

   To achieve this capacity with HZ=1000 without increasing the storage size
   by another level, we reduced the granularity of the first wheel level from
   1ms to 4ms. According to our data, no user relies on that 1ms
   granularity, and 99% of those timers are canceled before expiry.

   As a side effect, batching in the first wheel level improves, which helps
   networking avoid rearming timers in the hot path.
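   The quoted ~6 day figure can be sanity-checked with a small userspace
   sketch. Only the 4ms base granularity comes from the text above; the
   per-level granularity step of 8x and the wheel depth of 9 are assumptions
   chosen for illustration (they happen to reproduce a ~6.2 day cutoff):

```c
#include <assert.h>
#include <stdint.h>

/* Assumed wheel geometry; the real constants live in kernel/time/timer.c. */
#define BASE_GRAN_MS	4ULL	/* first-level granularity per the text above */
#define LVL_CLK_SHIFT	3	/* assumption: each level is 8x coarser */
#define LVL_DEPTH	9	/* assumption: number of wheel levels */

/* Granularity (in ms) of wheel level 'lvl'. */
static uint64_t level_gran_ms(unsigned int lvl)
{
	return BASE_GRAN_MS << (LVL_CLK_SHIFT * lvl);
}

/* Hard expiry cutoff in ms: the range covered by all levels together. */
static uint64_t wheel_capacity_ms(void)
{
	return BASE_GRAN_MS << (LVL_CLK_SHIFT * LVL_DEPTH);
}
```

   With these assumed constants, level 0 ticks every 4ms and the cutoff
   works out to 536870912 ms, i.e. a bit over 6 days.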

We gathered more data about performance and batching. Compared to mainline,
the following changes have been observed:

   - The bad outliers in mainline when the timer wheel needs to be forwarded
     after a long idle sleep are completely gone.

   - The total CPU time used for timer softirq processing is significantly
     reduced, by a factor of 2 to 6 depending on the HZ setting and workload.

   - The average invocation period of the timer softirq on an idle system
     increases significantly, by a factor of 1.5 to 5 depending on the HZ
     setting and workload. This should improve residency in deep C-states,
     but we have not yet had time to verify it with the power tools.

Thanks,

	tglx

---
 arch/x86/kernel/apic/x2apic_uv_x.c  |    4 
 arch/x86/kernel/cpu/mcheck/mce.c    |    4 
 block/genhd.c                       |    5 
 drivers/cpufreq/powernv-cpufreq.c   |    5 
 drivers/mmc/host/jz4740_mmc.c       |    2 
 drivers/net/ethernet/tile/tilepro.c |    4 
 drivers/power/bq27xxx_battery.c     |    5 
 drivers/tty/metag_da.c              |    4 
 drivers/tty/mips_ejtag_fdc.c        |    4 
 drivers/usb/host/ohci-hcd.c         |    1 
 drivers/usb/host/xhci.c             |    2 
 include/linux/list.h                |   10 
 include/linux/timer.h               |   30 
 kernel/time/tick-internal.h         |    1 
 kernel/time/tick-sched.c            |   46 -
 kernel/time/timer.c                 | 1099 +++++++++++++++++++++---------------
 lib/random32.c                      |    1 
 net/ipv4/inet_connection_sock.c     |    7 
 net/ipv4/inet_timewait_sock.c       |    5 
 19 files changed, 725 insertions(+), 514 deletions(-)

^ permalink raw reply	[flat|nested] 52+ messages in thread

* [patch V2 01/20] timer: Make pinned a timer property
  2016-06-17 13:26 [patch V2 00/20] timer: Refactor the timer wheel Thomas Gleixner
@ 2016-06-17 13:26 ` Thomas Gleixner
  2016-06-17 13:26 ` [patch V2 02/20] x86/apic/uv: Initialize timer as pinned Thomas Gleixner
                   ` (22 subsequent siblings)
  23 siblings, 0 replies; 52+ messages in thread
From: Thomas Gleixner @ 2016-06-17 13:26 UTC (permalink / raw)
  To: LKML
  Cc: Ingo Molnar, Peter Zijlstra, Paul E. McKenney, Eric Dumazet,
	Frederic Weisbecker, Chris Mason, Arjan van de Ven, rt,
	Rik van Riel, Linus Torvalds, George Spelvin, Len Brown

[-- Attachment #1: timer_Make_pinned_a_timer_property.patch --]
[-- Type: text/plain, Size: 5238 bytes --]

We want to move timer migration from a push to a pull model. This requires
storing the pinned attribute of a timer in the timer itself, which must
happen at initialization time.

Add the helper macros for this.
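The mechanism can be illustrated with a small userspace sketch (the struct
and helper names here are made up for illustration; the flag values match
this patch): the pinned attribute is a flag bit OR'ed into the timer's flags
word at initialization time, so later code can derive the target CPU from
the timer alone.

```c
#include <assert.h>
#include <stdint.h>

/* Flag values as introduced by this patch. */
#define TIMER_PINNED	0x00200000u
#define TIMER_IRQSAFE	0x00400000u

/* Userspace stand-in for struct timer_list (name and layout illustrative). */
struct demo_timer {
	uint32_t flags;
};

/* Sketch of an initializer that bakes attributes into the timer,
 * analogous to __TIMER_INITIALIZER()/__init_timer(). */
#define DEMO_TIMER_INIT(_flags)	{ .flags = (_flags) }

/* Later code can decide pinned vs. migratable from the timer alone. */
static int demo_timer_pinned(const struct demo_timer *t)
{
	return (t->flags & TIMER_PINNED) != 0;
}
```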

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Chris Mason <clm@fb.com>
Cc: Eric Dumazet <edumazet@google.com>
Cc: rt@linutronix.de
Cc: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>
Cc: Arjan van de Ven <arjan@infradead.org>


---
 include/linux/timer.h |   25 ++++++++++++++++++++++---
 kernel/time/timer.c   |   10 +++++-----
 2 files changed, 27 insertions(+), 8 deletions(-)

--- a/include/linux/timer.h
+++ b/include/linux/timer.h
@@ -62,7 +62,8 @@ struct timer_list {
 #define TIMER_MIGRATING		0x00080000
 #define TIMER_BASEMASK		(TIMER_CPUMASK | TIMER_MIGRATING)
 #define TIMER_DEFERRABLE	0x00100000
-#define TIMER_IRQSAFE		0x00200000
+#define TIMER_PINNED		0x00200000
+#define TIMER_IRQSAFE		0x00400000
 
 #define __TIMER_INITIALIZER(_function, _expires, _data, _flags) { \
 		.entry = { .next = TIMER_ENTRY_STATIC },	\
@@ -78,9 +79,15 @@ struct timer_list {
 #define TIMER_INITIALIZER(_function, _expires, _data)		\
 	__TIMER_INITIALIZER((_function), (_expires), (_data), 0)
 
+#define TIMER_PINNED_INITIALIZER(_function, _expires, _data)	\
+	__TIMER_INITIALIZER((_function), (_expires), (_data), TIMER_PINNED)
+
 #define TIMER_DEFERRED_INITIALIZER(_function, _expires, _data)	\
 	__TIMER_INITIALIZER((_function), (_expires), (_data), TIMER_DEFERRABLE)
 
+#define TIMER_PINNED_DEFERRED_INITIALIZER(_function, _expires, _data)	\
+	__TIMER_INITIALIZER((_function), (_expires), (_data), TIMER_DEFERRABLE | TIMER_PINNED)
+
 #define DEFINE_TIMER(_name, _function, _expires, _data)		\
 	struct timer_list _name =				\
 		TIMER_INITIALIZER(_function, _expires, _data)
@@ -124,8 +131,12 @@ static inline void init_timer_on_stack_k
 
 #define init_timer(timer)						\
 	__init_timer((timer), 0)
+#define init_timer_pinned(timer)					\
+	__init_timer((timer), TIMER_PINNED)
 #define init_timer_deferrable(timer)					\
 	__init_timer((timer), TIMER_DEFERRABLE)
+#define init_timer_pinned_deferrable(timer)				\
+	__init_timer((timer), TIMER_DEFERRABLE | TIMER_PINNED)
 #define init_timer_on_stack(timer)					\
 	__init_timer_on_stack((timer), 0)
 
@@ -145,12 +156,20 @@ static inline void init_timer_on_stack_k
 
 #define setup_timer(timer, fn, data)					\
 	__setup_timer((timer), (fn), (data), 0)
+#define setup_pinned_timer(timer, fn, data)				\
+	__setup_timer((timer), (fn), (data), TIMER_PINNED)
 #define setup_deferrable_timer(timer, fn, data)				\
 	__setup_timer((timer), (fn), (data), TIMER_DEFERRABLE)
+#define setup_pinned_deferrable_timer(timer, fn, data)			\
+	__setup_timer((timer), (fn), (data), TIMER_DEFERRABLE | TIMER_PINNED)
 #define setup_timer_on_stack(timer, fn, data)				\
 	__setup_timer_on_stack((timer), (fn), (data), 0)
+#define setup_pinned_timer_on_stack(timer, fn, data)			\
+	__setup_timer_on_stack((timer), (fn), (data), TIMER_PINNED)
 #define setup_deferrable_timer_on_stack(timer, fn, data)		\
 	__setup_timer_on_stack((timer), (fn), (data), TIMER_DEFERRABLE)
+#define setup_pinned_deferrable_timer_on_stack(timer, fn, data)		\
+	__setup_timer_on_stack((timer), (fn), (data), TIMER_DEFERRABLE | TIMER_PINNED)
 
 /**
  * timer_pending - is a timer pending?
@@ -175,8 +194,8 @@ extern int mod_timer_pinned(struct timer
 
 extern void set_timer_slack(struct timer_list *time, int slack_hz);
 
-#define TIMER_NOT_PINNED	0
-#define TIMER_PINNED		1
+#define MOD_TIMER_NOT_PINNED	0
+#define MOD_TIMER_PINNED	1
 /*
  * The jiffies value which is added to now, when there is no timer
  * in the timer wheel:
--- a/kernel/time/timer.c
+++ b/kernel/time/timer.c
@@ -782,7 +782,7 @@ static inline int
 
 	debug_activate(timer, expires);
 
-	new_base = get_target_base(base, pinned);
+	new_base = get_target_base(base, pinned || timer->flags & TIMER_PINNED);
 
 	if (base != new_base) {
 		/*
@@ -825,7 +825,7 @@ static inline int
  */
 int mod_timer_pending(struct timer_list *timer, unsigned long expires)
 {
-	return __mod_timer(timer, expires, true, TIMER_NOT_PINNED);
+	return __mod_timer(timer, expires, true, MOD_TIMER_NOT_PINNED);
 }
 EXPORT_SYMBOL(mod_timer_pending);
 
@@ -900,7 +900,7 @@ int mod_timer(struct timer_list *timer,
 	if (timer_pending(timer) && timer->expires == expires)
 		return 1;
 
-	return __mod_timer(timer, expires, false, TIMER_NOT_PINNED);
+	return __mod_timer(timer, expires, false, MOD_TIMER_NOT_PINNED);
 }
 EXPORT_SYMBOL(mod_timer);
 
@@ -928,7 +928,7 @@ int mod_timer_pinned(struct timer_list *
 	if (timer->expires == expires && timer_pending(timer))
 		return 1;
 
-	return __mod_timer(timer, expires, false, TIMER_PINNED);
+	return __mod_timer(timer, expires, false, MOD_TIMER_PINNED);
 }
 EXPORT_SYMBOL(mod_timer_pinned);
 
@@ -1512,7 +1512,7 @@ signed long __sched schedule_timeout(sig
 	expire = timeout + jiffies;
 
 	setup_timer_on_stack(&timer, process_timeout, (unsigned long)current);
-	__mod_timer(&timer, expire, false, TIMER_NOT_PINNED);
+	__mod_timer(&timer, expire, false, MOD_TIMER_NOT_PINNED);
 	schedule();
 	del_singleshot_timer_sync(&timer);
 


* [patch V2 02/20] x86/apic/uv: Initialize timer as pinned
  2016-06-17 13:26 [patch V2 00/20] timer: Refactor the timer wheel Thomas Gleixner
  2016-06-17 13:26 ` [patch V2 01/20] timer: Make pinned a timer property Thomas Gleixner
@ 2016-06-17 13:26 ` Thomas Gleixner
  2016-06-17 13:26 ` [patch V2 03/20] x86/mce: " Thomas Gleixner
                   ` (21 subsequent siblings)
  23 siblings, 0 replies; 52+ messages in thread
From: Thomas Gleixner @ 2016-06-17 13:26 UTC (permalink / raw)
  To: LKML
  Cc: Ingo Molnar, Peter Zijlstra, Paul E. McKenney, Eric Dumazet,
	Frederic Weisbecker, Chris Mason, Arjan van de Ven, rt,
	Rik van Riel, Linus Torvalds, George Spelvin, Len Brown

[-- Attachment #1: x86apicuv_Initialize_timer_as_pinned.patch --]
[-- Type: text/plain, Size: 1276 bytes --]

Pinned timers must carry that attribute in the timer itself. No functional
change.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Chris Mason <clm@fb.com>
Cc: Eric Dumazet <edumazet@google.com>
Cc: rt@linutronix.de
Cc: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>
Cc: Arjan van de Ven <arjan@infradead.org>


---
 arch/x86/kernel/apic/x2apic_uv_x.c |    4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

--- a/arch/x86/kernel/apic/x2apic_uv_x.c
+++ b/arch/x86/kernel/apic/x2apic_uv_x.c
@@ -919,7 +919,7 @@ static void uv_heartbeat(unsigned long i
 	uv_set_scir_bits(bits);
 
 	/* enable next timer period */
-	mod_timer_pinned(timer, jiffies + SCIR_CPU_HB_INTERVAL);
+	mod_timer(timer, jiffies + SCIR_CPU_HB_INTERVAL);
 }
 
 static void uv_heartbeat_enable(int cpu)
@@ -928,7 +928,7 @@ static void uv_heartbeat_enable(int cpu)
 		struct timer_list *timer = &uv_cpu_scir_info(cpu)->timer;
 
 		uv_set_cpu_scir_bits(cpu, SCIR_CPU_HEARTBEAT|SCIR_CPU_ACTIVITY);
-		setup_timer(timer, uv_heartbeat, cpu);
+		setup_pinned_timer(timer, uv_heartbeat, cpu);
 		timer->expires = jiffies + SCIR_CPU_HB_INTERVAL;
 		add_timer_on(timer, cpu);
 		uv_cpu_scir_info(cpu)->enabled = 1;


* [patch V2 03/20] x86/mce: Initialize timer as pinned
  2016-06-17 13:26 [patch V2 00/20] timer: Refactor the timer wheel Thomas Gleixner
  2016-06-17 13:26 ` [patch V2 01/20] timer: Make pinned a timer property Thomas Gleixner
  2016-06-17 13:26 ` [patch V2 02/20] x86/apic/uv: Initialize timer as pinned Thomas Gleixner
@ 2016-06-17 13:26 ` Thomas Gleixner
  2016-06-17 13:26 ` [patch V2 05/20] driver/net/ethernet/tile: " Thomas Gleixner
                   ` (20 subsequent siblings)
  23 siblings, 0 replies; 52+ messages in thread
From: Thomas Gleixner @ 2016-06-17 13:26 UTC (permalink / raw)
  To: LKML
  Cc: Ingo Molnar, Peter Zijlstra, Paul E. McKenney, Eric Dumazet,
	Frederic Weisbecker, Chris Mason, Arjan van de Ven, rt,
	Rik van Riel, Linus Torvalds, George Spelvin, Len Brown

[-- Attachment #1: x86mce_Initialize_timer_as_pinned.patch --]
[-- Type: text/plain, Size: 1128 bytes --]

Pinned timers must carry that attribute in the timer itself. No functional
change.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Chris Mason <clm@fb.com>
Cc: Eric Dumazet <edumazet@google.com>
Cc: rt@linutronix.de
Cc: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>
Cc: Arjan van de Ven <arjan@infradead.org>

---
 arch/x86/kernel/cpu/mcheck/mce.c |    4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

--- a/arch/x86/kernel/cpu/mcheck/mce.c
+++ b/arch/x86/kernel/cpu/mcheck/mce.c
@@ -1309,7 +1309,7 @@ static void __restart_timer(struct timer
 
 	if (timer_pending(t)) {
 		if (time_before(when, t->expires))
-			mod_timer_pinned(t, when);
+			mod_timer(t, when);
 	} else {
 		t->expires = round_jiffies(when);
 		add_timer_on(t, smp_processor_id());
@@ -1735,7 +1735,7 @@ static void __mcheck_cpu_init_timer(void
 	struct timer_list *t = this_cpu_ptr(&mce_timer);
 	unsigned int cpu = smp_processor_id();
 
-	setup_timer(t, mce_timer_fn, cpu);
+	setup_pinned_timer(t, mce_timer_fn, cpu);
 	mce_start_timer(cpu, t);
 }
 


* [patch V2 05/20] driver/net/ethernet/tile: Initialize timer as pinned
  2016-06-17 13:26 [patch V2 00/20] timer: Refactor the timer wheel Thomas Gleixner
                   ` (2 preceding siblings ...)
  2016-06-17 13:26 ` [patch V2 03/20] x86/mce: " Thomas Gleixner
@ 2016-06-17 13:26 ` Thomas Gleixner
  2016-06-21 18:14   ` Peter Zijlstra
  2016-06-17 13:26 ` [patch V2 04/20] cpufreq/powernv: " Thomas Gleixner
                   ` (19 subsequent siblings)
  23 siblings, 1 reply; 52+ messages in thread
From: Thomas Gleixner @ 2016-06-17 13:26 UTC (permalink / raw)
  To: LKML
  Cc: Ingo Molnar, Peter Zijlstra, Paul E. McKenney, Eric Dumazet,
	Frederic Weisbecker, Chris Mason, Arjan van de Ven, rt,
	Rik van Riel, Linus Torvalds, George Spelvin, Len Brown

[-- Attachment #1: drivernetethernettile_Initialize_timer_as_pinned.patch --]
[-- Type: text/plain, Size: 1212 bytes --]

Pinned timers must carry that attribute in the timer itself. No functional
change.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Chris Mason <clm@fb.com>
Cc: Eric Dumazet <edumazet@google.com>
Cc: rt@linutronix.de
Cc: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>
Cc: Arjan van de Ven <arjan@infradead.org>

---
 drivers/net/ethernet/tile/tilepro.c |    4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

--- a/drivers/net/ethernet/tile/tilepro.c
+++ b/drivers/net/ethernet/tile/tilepro.c
@@ -588,7 +588,7 @@ static bool tile_net_lepp_free_comps(str
 static void tile_net_schedule_egress_timer(struct tile_net_cpu *info)
 {
 	if (!info->egress_timer_scheduled) {
-		mod_timer_pinned(&info->egress_timer, jiffies + 1);
+		mod_timer(&info->egress_timer, jiffies + 1);
 		info->egress_timer_scheduled = true;
 	}
 }
@@ -1004,7 +1004,7 @@ static void tile_net_register(void *dev_
 		BUG();
 
 	/* Initialize the egress timer. */
-	init_timer(&info->egress_timer);
+	init_timer_pinned(&info->egress_timer);
 	info->egress_timer.data = (long)info;
 	info->egress_timer.function = tile_net_handle_egress_timer;
 


* [patch V2 04/20] cpufreq/powernv: Initialize timer as pinned
  2016-06-17 13:26 [patch V2 00/20] timer: Refactor the timer wheel Thomas Gleixner
                   ` (3 preceding siblings ...)
  2016-06-17 13:26 ` [patch V2 05/20] driver/net/ethernet/tile: " Thomas Gleixner
@ 2016-06-17 13:26 ` Thomas Gleixner
  2016-06-17 13:26 ` [patch V2 06/20] drivers/tty/metag_da: " Thomas Gleixner
                   ` (18 subsequent siblings)
  23 siblings, 0 replies; 52+ messages in thread
From: Thomas Gleixner @ 2016-06-17 13:26 UTC (permalink / raw)
  To: LKML
  Cc: Ingo Molnar, Peter Zijlstra, Paul E. McKenney, Eric Dumazet,
	Frederic Weisbecker, Chris Mason, Arjan van de Ven, rt,
	Rik van Riel, Linus Torvalds, George Spelvin, Len Brown

[-- Attachment #1: cpufreqpowernv_Initialize_timer_as_pinned.patch --]
[-- Type: text/plain, Size: 1228 bytes --]

Pinned timers must carry that attribute in the timer itself. No functional
change.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Chris Mason <clm@fb.com>
Cc: Eric Dumazet <edumazet@google.com>
Cc: rt@linutronix.de
Cc: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>
Cc: Arjan van de Ven <arjan@infradead.org>

---
 drivers/cpufreq/powernv-cpufreq.c |    5 ++---
 1 file changed, 2 insertions(+), 3 deletions(-)

--- a/drivers/cpufreq/powernv-cpufreq.c
+++ b/drivers/cpufreq/powernv-cpufreq.c
@@ -530,8 +530,7 @@ static inline void  queue_gpstate_timer(
 	else
 		timer_interval = GPSTATE_TIMER_INTERVAL;
 
-	mod_timer_pinned(&gpstates->timer, jiffies +
-			msecs_to_jiffies(timer_interval));
+	mod_timer(&gpstates->timer, jiffies + msecs_to_jiffies(timer_interval));
 }
 
 /**
@@ -699,7 +698,7 @@ static int powernv_cpufreq_cpu_init(stru
 	policy->driver_data = gpstates;
 
 	/* initialize timer */
-	init_timer_deferrable(&gpstates->timer);
+	init_timer_pinned_deferrable(&gpstates->timer);
 	gpstates->timer.data = (unsigned long)policy;
 	gpstates->timer.function = gpstate_timer_handler;
 	gpstates->timer.expires = jiffies +


* [patch V2 06/20] drivers/tty/metag_da: Initialize timer as pinned
  2016-06-17 13:26 [patch V2 00/20] timer: Refactor the timer wheel Thomas Gleixner
                   ` (4 preceding siblings ...)
  2016-06-17 13:26 ` [patch V2 04/20] cpufreq/powernv: " Thomas Gleixner
@ 2016-06-17 13:26 ` Thomas Gleixner
  2016-06-17 13:26 ` [patch V2 07/20] drivers/tty/mips_ejtag: " Thomas Gleixner
                   ` (17 subsequent siblings)
  23 siblings, 0 replies; 52+ messages in thread
From: Thomas Gleixner @ 2016-06-17 13:26 UTC (permalink / raw)
  To: LKML
  Cc: Ingo Molnar, Peter Zijlstra, Paul E. McKenney, Eric Dumazet,
	Frederic Weisbecker, Chris Mason, Arjan van de Ven, rt,
	Rik van Riel, Linus Torvalds, George Spelvin, Len Brown

[-- Attachment #1: driversttymetag_da_Initialize_timer_as_pinned.patch --]
[-- Type: text/plain, Size: 987 bytes --]

Pinned timers must carry that attribute in the timer itself. No functional
change.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Chris Mason <clm@fb.com>
Cc: Eric Dumazet <edumazet@google.com>
Cc: rt@linutronix.de
Cc: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>
Cc: Arjan van de Ven <arjan@infradead.org>

---
 drivers/tty/metag_da.c |    4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

--- a/drivers/tty/metag_da.c
+++ b/drivers/tty/metag_da.c
@@ -323,12 +323,12 @@ static void dashtty_timer(unsigned long
 	if (channel >= 0)
 		fetch_data(channel);
 
-	mod_timer_pinned(&poll_timer, jiffies + DA_TTY_POLL);
+	mod_timer(&poll_timer, jiffies + DA_TTY_POLL);
 }
 
 static void add_poll_timer(struct timer_list *poll_timer)
 {
-	setup_timer(poll_timer, dashtty_timer, 0);
+	setup_pinned_timer(poll_timer, dashtty_timer, 0);
 	poll_timer->expires = jiffies + DA_TTY_POLL;
 
 	/*


* [patch V2 07/20] drivers/tty/mips_ejtag: Initialize timer as pinned
  2016-06-17 13:26 [patch V2 00/20] timer: Refactor the timer wheel Thomas Gleixner
                   ` (5 preceding siblings ...)
  2016-06-17 13:26 ` [patch V2 06/20] drivers/tty/metag_da: " Thomas Gleixner
@ 2016-06-17 13:26 ` Thomas Gleixner
  2016-06-17 13:26 ` [patch V2 08/20] net/ipv4/inet: Initialize timers " Thomas Gleixner
                   ` (16 subsequent siblings)
  23 siblings, 0 replies; 52+ messages in thread
From: Thomas Gleixner @ 2016-06-17 13:26 UTC (permalink / raw)
  To: LKML
  Cc: Ingo Molnar, Peter Zijlstra, Paul E. McKenney, Eric Dumazet,
	Frederic Weisbecker, Chris Mason, Arjan van de Ven, rt,
	Rik van Riel, Linus Torvalds, George Spelvin, Len Brown

[-- Attachment #1: driversttymips_ejtag_Initialize_timer_as_pinned.patch --]
[-- Type: text/plain, Size: 1226 bytes --]

Pinned timers must carry that attribute in the timer itself. No functional
change.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Chris Mason <clm@fb.com>
Cc: Eric Dumazet <edumazet@google.com>
Cc: rt@linutronix.de
Cc: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>
Cc: Arjan van de Ven <arjan@infradead.org>

---
 drivers/tty/mips_ejtag_fdc.c |    4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

--- a/drivers/tty/mips_ejtag_fdc.c
+++ b/drivers/tty/mips_ejtag_fdc.c
@@ -689,7 +689,7 @@ static void mips_ejtag_fdc_tty_timer(uns
 
 	mips_ejtag_fdc_handle(priv);
 	if (!priv->removing)
-		mod_timer_pinned(&priv->poll_timer, jiffies + FDC_TTY_POLL);
+		mod_timer(&priv->poll_timer, jiffies + FDC_TTY_POLL);
 }
 
 /* TTY Port operations */
@@ -1002,7 +1002,7 @@ static int mips_ejtag_fdc_tty_probe(stru
 		raw_spin_unlock_irq(&priv->lock);
 	} else {
 		/* If we didn't get an usable IRQ, poll instead */
-		setup_timer(&priv->poll_timer, mips_ejtag_fdc_tty_timer,
+		setup_pinned_timer(&priv->poll_timer, mips_ejtag_fdc_tty_timer,
 			    (unsigned long)priv);
 		priv->poll_timer.expires = jiffies + FDC_TTY_POLL;
 		/*


* [patch V2 08/20] net/ipv4/inet: Initialize timers as pinned
  2016-06-17 13:26 [patch V2 00/20] timer: Refactor the timer wheel Thomas Gleixner
                   ` (6 preceding siblings ...)
  2016-06-17 13:26 ` [patch V2 07/20] drivers/tty/mips_ejtag: " Thomas Gleixner
@ 2016-06-17 13:26 ` Thomas Gleixner
  2016-06-17 13:26 ` [patch V2 09/20] timer: Remove mod_timer_pinned Thomas Gleixner
                   ` (15 subsequent siblings)
  23 siblings, 0 replies; 52+ messages in thread
From: Thomas Gleixner @ 2016-06-17 13:26 UTC (permalink / raw)
  To: LKML
  Cc: Ingo Molnar, Peter Zijlstra, Paul E. McKenney, Eric Dumazet,
	Frederic Weisbecker, Chris Mason, Arjan van de Ven, rt,
	Rik van Riel, Linus Torvalds, George Spelvin, Len Brown

[-- Attachment #1: netipv4inet_Initialize_timers_as_pinned.patch --]
[-- Type: text/plain, Size: 2308 bytes --]

Pinned timers must carry that attribute in the timer itself. No functional
change.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Chris Mason <clm@fb.com>
Cc: Eric Dumazet <edumazet@google.com>
Cc: rt@linutronix.de
Cc: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>
Cc: Arjan van de Ven <arjan@infradead.org>

---
 net/ipv4/inet_connection_sock.c |    7 ++++---
 net/ipv4/inet_timewait_sock.c   |    5 +++--
 2 files changed, 7 insertions(+), 5 deletions(-)

--- a/net/ipv4/inet_connection_sock.c
+++ b/net/ipv4/inet_connection_sock.c
@@ -603,7 +603,7 @@ static void reqsk_timer_handler(unsigned
 		if (req->num_timeout++ == 0)
 			atomic_dec(&queue->young);
 		timeo = min(TCP_TIMEOUT_INIT << req->num_timeout, TCP_RTO_MAX);
-		mod_timer_pinned(&req->rsk_timer, jiffies + timeo);
+		mod_timer(&req->rsk_timer, jiffies + timeo);
 		return;
 	}
 drop:
@@ -617,8 +617,9 @@ static void reqsk_queue_hash_req(struct
 	req->num_timeout = 0;
 	req->sk = NULL;
 
-	setup_timer(&req->rsk_timer, reqsk_timer_handler, (unsigned long)req);
-	mod_timer_pinned(&req->rsk_timer, jiffies + timeout);
+	setup_pinned_timer(&req->rsk_timer, reqsk_timer_handler,
+			    (unsigned long)req);
+	mod_timer(&req->rsk_timer, jiffies + timeout);
 
 	inet_ehash_insert(req_to_sk(req), NULL);
 	/* before letting lookups find us, make sure all req fields
--- a/net/ipv4/inet_timewait_sock.c
+++ b/net/ipv4/inet_timewait_sock.c
@@ -188,7 +188,8 @@ struct inet_timewait_sock *inet_twsk_all
 		tw->tw_prot	    = sk->sk_prot_creator;
 		atomic64_set(&tw->tw_cookie, atomic64_read(&sk->sk_cookie));
 		twsk_net_set(tw, sock_net(sk));
-		setup_timer(&tw->tw_timer, tw_timer_handler, (unsigned long)tw);
+		setup_pinned_timer(&tw->tw_timer, tw_timer_handler,
+				   (unsigned long)tw);
 		/*
 		 * Because we use RCU lookups, we should not set tw_refcnt
 		 * to a non null value before everything is setup for this
@@ -248,7 +249,7 @@ void __inet_twsk_schedule(struct inet_ti
 
 	tw->tw_kill = timeo <= 4*HZ;
 	if (!rearm) {
-		BUG_ON(mod_timer_pinned(&tw->tw_timer, jiffies + timeo));
+		BUG_ON(mod_timer(&tw->tw_timer, jiffies + timeo));
 		atomic_inc(&tw->tw_dr->tw_count);
 	} else {
 		mod_timer_pending(&tw->tw_timer, jiffies + timeo);


* [patch V2 09/20] timer: Remove mod_timer_pinned
  2016-06-17 13:26 [patch V2 00/20] timer: Refactor the timer wheel Thomas Gleixner
                   ` (7 preceding siblings ...)
  2016-06-17 13:26 ` [patch V2 08/20] net/ipv4/inet: Initialize timers " Thomas Gleixner
@ 2016-06-17 13:26 ` Thomas Gleixner
  2016-06-17 13:26 ` [patch V2 10/20] hlist: Add hlist_is_singular_node() helper Thomas Gleixner
                   ` (14 subsequent siblings)
  23 siblings, 0 replies; 52+ messages in thread
From: Thomas Gleixner @ 2016-06-17 13:26 UTC (permalink / raw)
  To: LKML
  Cc: Ingo Molnar, Peter Zijlstra, Paul E. McKenney, Eric Dumazet,
	Frederic Weisbecker, Chris Mason, Arjan van de Ven, rt,
	Rik van Riel, Linus Torvalds, George Spelvin, Len Brown

[-- Attachment #1: timer_Remove_mod_timer_pinned.patch --]
[-- Type: text/plain, Size: 3781 bytes --]

We switched all users to initialize the timers as pinned and call
mod_timer(). Remove the now unused function.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Chris Mason <clm@fb.com>
Cc: Eric Dumazet <edumazet@google.com>
Cc: rt@linutronix.de
Cc: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>
Cc: Arjan van de Ven <arjan@infradead.org>
---
 include/linux/timer.h |    3 ---
 kernel/time/timer.c   |   39 +++++----------------------------------
 2 files changed, 5 insertions(+), 37 deletions(-)

--- a/include/linux/timer.h
+++ b/include/linux/timer.h
@@ -190,12 +190,9 @@ extern void add_timer_on(struct timer_li
 extern int del_timer(struct timer_list * timer);
 extern int mod_timer(struct timer_list *timer, unsigned long expires);
 extern int mod_timer_pending(struct timer_list *timer, unsigned long expires);
-extern int mod_timer_pinned(struct timer_list *timer, unsigned long expires);
 
 extern void set_timer_slack(struct timer_list *time, int slack_hz);
 
-#define MOD_TIMER_NOT_PINNED	0
-#define MOD_TIMER_PINNED	1
 /*
  * The jiffies value which is added to now, when there is no timer
  * in the timer wheel:
--- a/kernel/time/timer.c
+++ b/kernel/time/timer.c
@@ -764,8 +764,7 @@ static struct tvec_base *lock_timer_base
 }
 
 static inline int
-__mod_timer(struct timer_list *timer, unsigned long expires,
-	    bool pending_only, int pinned)
+__mod_timer(struct timer_list *timer, unsigned long expires, bool pending_only)
 {
 	struct tvec_base *base, *new_base;
 	unsigned long flags;
@@ -782,7 +781,7 @@ static inline int
 
 	debug_activate(timer, expires);
 
-	new_base = get_target_base(base, pinned || timer->flags & TIMER_PINNED);
+	new_base = get_target_base(base, timer->flags & TIMER_PINNED);
 
 	if (base != new_base) {
 		/*
@@ -825,7 +824,7 @@ static inline int
  */
 int mod_timer_pending(struct timer_list *timer, unsigned long expires)
 {
-	return __mod_timer(timer, expires, true, MOD_TIMER_NOT_PINNED);
+	return __mod_timer(timer, expires, true);
 }
 EXPORT_SYMBOL(mod_timer_pending);
 
@@ -900,39 +899,11 @@ int mod_timer(struct timer_list *timer,
 	if (timer_pending(timer) && timer->expires == expires)
 		return 1;
 
-	return __mod_timer(timer, expires, false, MOD_TIMER_NOT_PINNED);
+	return __mod_timer(timer, expires, false);
 }
 EXPORT_SYMBOL(mod_timer);
 
 /**
- * mod_timer_pinned - modify a timer's timeout
- * @timer: the timer to be modified
- * @expires: new timeout in jiffies
- *
- * mod_timer_pinned() is a way to update the expire field of an
- * active timer (if the timer is inactive it will be activated)
- * and to ensure that the timer is scheduled on the current CPU.
- *
- * Note that this does not prevent the timer from being migrated
- * when the current CPU goes offline.  If this is a problem for
- * you, use CPU-hotplug notifiers to handle it correctly, for
- * example, cancelling the timer when the corresponding CPU goes
- * offline.
- *
- * mod_timer_pinned(timer, expires) is equivalent to:
- *
- *     del_timer(timer); timer->expires = expires; add_timer(timer);
- */
-int mod_timer_pinned(struct timer_list *timer, unsigned long expires)
-{
-	if (timer->expires == expires && timer_pending(timer))
-		return 1;
-
-	return __mod_timer(timer, expires, false, MOD_TIMER_PINNED);
-}
-EXPORT_SYMBOL(mod_timer_pinned);
-
-/**
  * add_timer - start a timer
  * @timer: the timer to be added
  *
@@ -1512,7 +1483,7 @@ signed long __sched schedule_timeout(sig
 	expire = timeout + jiffies;
 
 	setup_timer_on_stack(&timer, process_timeout, (unsigned long)current);
-	__mod_timer(&timer, expire, false, MOD_TIMER_NOT_PINNED);
+	__mod_timer(&timer, expire, false);
 	schedule();
 	del_singleshot_timer_sync(&timer);
 


* [patch V2 10/20] hlist: Add hlist_is_singular_node() helper
  2016-06-17 13:26 [patch V2 00/20] timer: Refactor the timer wheel Thomas Gleixner
                   ` (8 preceding siblings ...)
  2016-06-17 13:26 ` [patch V2 09/20] timer: Remove mod_timer_pinned Thomas Gleixner
@ 2016-06-17 13:26 ` Thomas Gleixner
  2016-06-17 13:26 ` [patch V2 11/20] timer: Give a few structs and members proper names Thomas Gleixner
                   ` (13 subsequent siblings)
  23 siblings, 0 replies; 52+ messages in thread
From: Thomas Gleixner @ 2016-06-17 13:26 UTC (permalink / raw)
  To: LKML
  Cc: Ingo Molnar, Peter Zijlstra, Paul E. McKenney, Eric Dumazet,
	Frederic Weisbecker, Chris Mason, Arjan van de Ven, rt,
	Rik van Riel, Linus Torvalds, George Spelvin, Len Brown

[-- Attachment #1: hlist_Add_hlist_is_last_node_helper.patch --]
[-- Type: text/plain, Size: 955 bytes --]

Required to figure out whether the entry is the only one in the hlist.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Chris Mason <clm@fb.com>
Cc: Eric Dumazet <edumazet@google.com>
Cc: rt@linutronix.de
Cc: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>
Cc: Arjan van de Ven <arjan@infradead.org>

---
 include/linux/list.h |   10 ++++++++++
 1 file changed, 10 insertions(+)

--- a/include/linux/list.h
+++ b/include/linux/list.h
@@ -679,6 +679,16 @@ static inline bool hlist_fake(struct hli
 }
 
 /*
+ * Check whether the node is the only node of the head without
+ * accessing head.
+ */
+static inline bool hlist_is_singular_node(struct hlist_node *n,
+					  struct hlist_head *h)
+{
+	return !n->next && n->pprev == &h->first;
+}
+
+/*
  * Move a list from one list head to another. Fixup the pprev
  * reference of the first entry if it exists.
  */

^ permalink raw reply	[flat|nested] 52+ messages in thread

* [patch V2 11/20] timer: Give a few structs and members proper names
  2016-06-17 13:26 [patch V2 00/20] timer: Refactor the timer wheel Thomas Gleixner
                   ` (9 preceding siblings ...)
  2016-06-17 13:26 ` [patch V2 10/20] hlist: Add hlist_is_singular_node() helper Thomas Gleixner
@ 2016-06-17 13:26 ` Thomas Gleixner
  2016-06-17 13:26 ` [patch V2 12/20] timer: Switch to a non cascading wheel Thomas Gleixner
                   ` (12 subsequent siblings)
  23 siblings, 0 replies; 52+ messages in thread
From: Thomas Gleixner @ 2016-06-17 13:26 UTC (permalink / raw)
  To: LKML
  Cc: Ingo Molnar, Peter Zijlstra, Paul E. McKenney, Eric Dumazet,
	Frederic Weisbecker, Chris Mason, Arjan van de Ven, rt,
	Rik van Riel, Linus Torvalds, George Spelvin, Len Brown

[-- Attachment #1: timer_Give_a_few_structs_and_members_proper_names.patch --]
[-- Type: text/plain, Size: 13258 bytes --]

Some of the names are no longer correct and others are simply too long to
type. Clean them up before we switch the wheel implementation over to the new
scheme.

No functional change.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Chris Mason <clm@fb.com>
Cc: Eric Dumazet <edumazet@google.com>
Cc: rt@linutronix.de
Cc: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>
Cc: Arjan van de Ven <arjan@infradead.org>


---
 kernel/time/timer.c |  118 ++++++++++++++++++++++++++--------------------------
 1 file changed, 59 insertions(+), 59 deletions(-)

--- a/kernel/time/timer.c
+++ b/kernel/time/timer.c
@@ -77,10 +77,10 @@ struct tvec_root {
 	struct hlist_head vec[TVR_SIZE];
 };
 
-struct tvec_base {
+struct timer_base {
 	spinlock_t lock;
 	struct timer_list *running_timer;
-	unsigned long timer_jiffies;
+	unsigned long clk;
 	unsigned long next_timer;
 	unsigned long active_timers;
 	unsigned long all_timers;
@@ -95,7 +95,7 @@ struct tvec_base {
 } ____cacheline_aligned;
 
 
-static DEFINE_PER_CPU(struct tvec_base, tvec_bases);
+static DEFINE_PER_CPU(struct timer_base, timer_bases);
 
 #if defined(CONFIG_SMP) && defined(CONFIG_NO_HZ_COMMON)
 unsigned int sysctl_timer_migration = 1;
@@ -106,15 +106,15 @@ void timers_update_migration(bool update
 	unsigned int cpu;
 
 	/* Avoid the loop, if nothing to update */
-	if (this_cpu_read(tvec_bases.migration_enabled) == on)
+	if (this_cpu_read(timer_bases.migration_enabled) == on)
 		return;
 
 	for_each_possible_cpu(cpu) {
-		per_cpu(tvec_bases.migration_enabled, cpu) = on;
+		per_cpu(timer_bases.migration_enabled, cpu) = on;
 		per_cpu(hrtimer_bases.migration_enabled, cpu) = on;
 		if (!update_nohz)
 			continue;
-		per_cpu(tvec_bases.nohz_active, cpu) = true;
+		per_cpu(timer_bases.nohz_active, cpu) = true;
 		per_cpu(hrtimer_bases.nohz_active, cpu) = true;
 	}
 }
@@ -134,18 +134,18 @@ int timer_migration_handler(struct ctl_t
 	return ret;
 }
 
-static inline struct tvec_base *get_target_base(struct tvec_base *base,
+static inline struct timer_base *get_target_base(struct timer_base *base,
 						int pinned)
 {
 	if (pinned || !base->migration_enabled)
-		return this_cpu_ptr(&tvec_bases);
-	return per_cpu_ptr(&tvec_bases, get_nohz_timer_target());
+		return this_cpu_ptr(&timer_bases);
+	return per_cpu_ptr(&timer_bases, get_nohz_timer_target());
 }
 #else
-static inline struct tvec_base *get_target_base(struct tvec_base *base,
+static inline struct timer_base *get_target_base(struct timer_base *base,
 						int pinned)
 {
-	return this_cpu_ptr(&tvec_bases);
+	return this_cpu_ptr(&timer_bases);
 }
 #endif
 
@@ -371,10 +371,10 @@ void set_timer_slack(struct timer_list *
 EXPORT_SYMBOL_GPL(set_timer_slack);
 
 static void
-__internal_add_timer(struct tvec_base *base, struct timer_list *timer)
+__internal_add_timer(struct timer_base *base, struct timer_list *timer)
 {
 	unsigned long expires = timer->expires;
-	unsigned long idx = expires - base->timer_jiffies;
+	unsigned long idx = expires - base->clk;
 	struct hlist_head *vec;
 
 	if (idx < TVR_SIZE) {
@@ -394,7 +394,7 @@ static void
 		 * Can happen if you add a timer with expires == jiffies,
 		 * or you set a timer to go off in the past
 		 */
-		vec = base->tv1.vec + (base->timer_jiffies & TVR_MASK);
+		vec = base->tv1.vec + (base->clk & TVR_MASK);
 	} else {
 		int i;
 		/* If the timeout is larger than MAX_TVAL (on 64-bit
@@ -403,7 +403,7 @@ static void
 		 */
 		if (idx > MAX_TVAL) {
 			idx = MAX_TVAL;
-			expires = idx + base->timer_jiffies;
+			expires = idx + base->clk;
 		}
 		i = (expires >> (TVR_BITS + 3 * TVN_BITS)) & TVN_MASK;
 		vec = base->tv5.vec + i;
@@ -412,11 +412,11 @@ static void
 	hlist_add_head(&timer->entry, vec);
 }
 
-static void internal_add_timer(struct tvec_base *base, struct timer_list *timer)
+static void internal_add_timer(struct timer_base *base, struct timer_list *timer)
 {
 	/* Advance base->jiffies, if the base is empty */
 	if (!base->all_timers++)
-		base->timer_jiffies = jiffies;
+		base->clk = jiffies;
 
 	__internal_add_timer(base, timer);
 	/*
@@ -707,7 +707,7 @@ static inline void detach_timer(struct t
 }
 
 static inline void
-detach_expired_timer(struct timer_list *timer, struct tvec_base *base)
+detach_expired_timer(struct timer_list *timer, struct timer_base *base)
 {
 	detach_timer(timer, true);
 	if (!(timer->flags & TIMER_DEFERRABLE))
@@ -715,7 +715,7 @@ detach_expired_timer(struct timer_list *
 	base->all_timers--;
 }
 
-static int detach_if_pending(struct timer_list *timer, struct tvec_base *base,
+static int detach_if_pending(struct timer_list *timer, struct timer_base *base,
 			     bool clear_pending)
 {
 	if (!timer_pending(timer))
@@ -725,16 +725,16 @@ static int detach_if_pending(struct time
 	if (!(timer->flags & TIMER_DEFERRABLE)) {
 		base->active_timers--;
 		if (timer->expires == base->next_timer)
-			base->next_timer = base->timer_jiffies;
+			base->next_timer = base->clk;
 	}
 	/* If this was the last timer, advance base->jiffies */
 	if (!--base->all_timers)
-		base->timer_jiffies = jiffies;
+		base->clk = jiffies;
 	return 1;
 }
 
 /*
- * We are using hashed locking: holding per_cpu(tvec_bases).lock
+ * We are using hashed locking: holding per_cpu(timer_bases).lock
  * means that all timers which are tied to this base via timer->base are
  * locked, and the base itself is locked too.
  *
@@ -744,16 +744,16 @@ static int detach_if_pending(struct time
  * When the timer's base is locked and removed from the list, the
  * TIMER_MIGRATING flag is set, FIXME
  */
-static struct tvec_base *lock_timer_base(struct timer_list *timer,
+static struct timer_base *lock_timer_base(struct timer_list *timer,
 					unsigned long *flags)
 	__acquires(timer->base->lock)
 {
 	for (;;) {
 		u32 tf = timer->flags;
-		struct tvec_base *base;
+		struct timer_base *base;
 
 		if (!(tf & TIMER_MIGRATING)) {
-			base = per_cpu_ptr(&tvec_bases, tf & TIMER_CPUMASK);
+			base = per_cpu_ptr(&timer_bases, tf & TIMER_CPUMASK);
 			spin_lock_irqsave(&base->lock, *flags);
 			if (timer->flags == tf)
 				return base;
@@ -766,7 +766,7 @@ static struct tvec_base *lock_timer_base
 static inline int
 __mod_timer(struct timer_list *timer, unsigned long expires, bool pending_only)
 {
-	struct tvec_base *base, *new_base;
+	struct timer_base *base, *new_base;
 	unsigned long flags;
 	int ret = 0;
 
@@ -933,8 +933,8 @@ EXPORT_SYMBOL(add_timer);
  */
 void add_timer_on(struct timer_list *timer, int cpu)
 {
-	struct tvec_base *new_base = per_cpu_ptr(&tvec_bases, cpu);
-	struct tvec_base *base;
+	struct timer_base *new_base = per_cpu_ptr(&timer_bases, cpu);
+	struct timer_base *base;
 	unsigned long flags;
 
 	timer_stats_timer_set_start_info(timer);
@@ -975,7 +975,7 @@ EXPORT_SYMBOL_GPL(add_timer_on);
  */
 int del_timer(struct timer_list *timer)
 {
-	struct tvec_base *base;
+	struct timer_base *base;
 	unsigned long flags;
 	int ret = 0;
 
@@ -1001,7 +1001,7 @@ EXPORT_SYMBOL(del_timer);
  */
 int try_to_del_timer_sync(struct timer_list *timer)
 {
-	struct tvec_base *base;
+	struct timer_base *base;
 	unsigned long flags;
 	int ret = -1;
 
@@ -1085,7 +1085,7 @@ int del_timer_sync(struct timer_list *ti
 EXPORT_SYMBOL(del_timer_sync);
 #endif
 
-static int cascade(struct tvec_base *base, struct tvec *tv, int index)
+static int cascade(struct timer_base *base, struct tvec *tv, int index)
 {
 	/* cascade all the timers from tv up one level */
 	struct timer_list *timer;
@@ -1149,7 +1149,7 @@ static void call_timer_fn(struct timer_l
 	}
 }
 
-#define INDEX(N) ((base->timer_jiffies >> (TVR_BITS + (N) * TVN_BITS)) & TVN_MASK)
+#define INDEX(N) ((base->clk >> (TVR_BITS + (N) * TVN_BITS)) & TVN_MASK)
 
 /**
  * __run_timers - run all expired timers (if any) on this CPU.
@@ -1158,23 +1158,23 @@ static void call_timer_fn(struct timer_l
  * This function cascades all vectors and executes all expired timer
  * vectors.
  */
-static inline void __run_timers(struct tvec_base *base)
+static inline void __run_timers(struct timer_base *base)
 {
 	struct timer_list *timer;
 
 	spin_lock_irq(&base->lock);
 
-	while (time_after_eq(jiffies, base->timer_jiffies)) {
+	while (time_after_eq(jiffies, base->clk)) {
 		struct hlist_head work_list;
 		struct hlist_head *head = &work_list;
 		int index;
 
 		if (!base->all_timers) {
-			base->timer_jiffies = jiffies;
+			base->clk = jiffies;
 			break;
 		}
 
-		index = base->timer_jiffies & TVR_MASK;
+		index = base->clk & TVR_MASK;
 
 		/*
 		 * Cascade timers:
@@ -1184,7 +1184,7 @@ static inline void __run_timers(struct t
 				(!cascade(base, &base->tv3, INDEX(1))) &&
 					!cascade(base, &base->tv4, INDEX(2)))
 			cascade(base, &base->tv5, INDEX(3));
-		++base->timer_jiffies;
+		++base->clk;
 		hlist_move_list(base->tv1.vec + index, head);
 		while (!hlist_empty(head)) {
 			void (*fn)(unsigned long);
@@ -1222,16 +1222,16 @@ static inline void __run_timers(struct t
  * is used on S/390 to stop all activity when a CPU is idle.
  * This function needs to be called with interrupts disabled.
  */
-static unsigned long __next_timer_interrupt(struct tvec_base *base)
+static unsigned long __next_timer_interrupt(struct timer_base *base)
 {
-	unsigned long timer_jiffies = base->timer_jiffies;
-	unsigned long expires = timer_jiffies + NEXT_TIMER_MAX_DELTA;
+	unsigned long clk = base->clk;
+	unsigned long expires = clk + NEXT_TIMER_MAX_DELTA;
 	int index, slot, array, found = 0;
 	struct timer_list *nte;
 	struct tvec *varray[4];
 
 	/* Look for timer events in tv1. */
-	index = slot = timer_jiffies & TVR_MASK;
+	index = slot = clk & TVR_MASK;
 	do {
 		hlist_for_each_entry(nte, base->tv1.vec + slot, entry) {
 			if (nte->flags & TIMER_DEFERRABLE)
@@ -1250,8 +1250,8 @@ static unsigned long __next_timer_interr
 cascade:
 	/* Calculate the next cascade event */
 	if (index)
-		timer_jiffies += TVR_SIZE - index;
-	timer_jiffies >>= TVR_BITS;
+		clk += TVR_SIZE - index;
+	clk >>= TVR_BITS;
 
 	/* Check tv2-tv5. */
 	varray[0] = &base->tv2;
@@ -1262,7 +1262,7 @@ static unsigned long __next_timer_interr
 	for (array = 0; array < 4; array++) {
 		struct tvec *varp = varray[array];
 
-		index = slot = timer_jiffies & TVN_MASK;
+		index = slot = clk & TVN_MASK;
 		do {
 			hlist_for_each_entry(nte, varp->vec + slot, entry) {
 				if (nte->flags & TIMER_DEFERRABLE)
@@ -1286,8 +1286,8 @@ static unsigned long __next_timer_interr
 		} while (slot != index);
 
 		if (index)
-			timer_jiffies += TVN_SIZE - index;
-		timer_jiffies >>= TVN_BITS;
+			clk += TVN_SIZE - index;
+		clk >>= TVN_BITS;
 	}
 	return expires;
 }
@@ -1335,7 +1335,7 @@ static u64 cmp_next_hrtimer_event(u64 ba
  */
 u64 get_next_timer_interrupt(unsigned long basej, u64 basem)
 {
-	struct tvec_base *base = this_cpu_ptr(&tvec_bases);
+	struct timer_base *base = this_cpu_ptr(&timer_bases);
 	u64 expires = KTIME_MAX;
 	unsigned long nextevt;
 
@@ -1348,7 +1348,7 @@ u64 get_next_timer_interrupt(unsigned lo
 
 	spin_lock(&base->lock);
 	if (base->active_timers) {
-		if (time_before_eq(base->next_timer, base->timer_jiffies))
+		if (time_before_eq(base->next_timer, base->clk))
 			base->next_timer = __next_timer_interrupt(base);
 		nextevt = base->next_timer;
 		if (time_before_eq(nextevt, basej))
@@ -1387,9 +1387,9 @@ void update_process_times(int user_tick)
  */
 static void run_timer_softirq(struct softirq_action *h)
 {
-	struct tvec_base *base = this_cpu_ptr(&tvec_bases);
+	struct timer_base *base = this_cpu_ptr(&timer_bases);
 
-	if (time_after_eq(jiffies, base->timer_jiffies))
+	if (time_after_eq(jiffies, base->clk))
 		__run_timers(base);
 }
 
@@ -1534,7 +1534,7 @@ signed long __sched schedule_timeout_idl
 EXPORT_SYMBOL(schedule_timeout_idle);
 
 #ifdef CONFIG_HOTPLUG_CPU
-static void migrate_timer_list(struct tvec_base *new_base, struct hlist_head *head)
+static void migrate_timer_list(struct timer_base *new_base, struct hlist_head *head)
 {
 	struct timer_list *timer;
 	int cpu = new_base->cpu;
@@ -1550,13 +1550,13 @@ static void migrate_timer_list(struct tv
 
 static void migrate_timers(int cpu)
 {
-	struct tvec_base *old_base;
-	struct tvec_base *new_base;
+	struct timer_base *old_base;
+	struct timer_base *new_base;
 	int i;
 
 	BUG_ON(cpu_online(cpu));
-	old_base = per_cpu_ptr(&tvec_bases, cpu);
-	new_base = get_cpu_ptr(&tvec_bases);
+	old_base = per_cpu_ptr(&timer_bases, cpu);
+	new_base = get_cpu_ptr(&timer_bases);
 	/*
 	 * The caller is globally serialized and nobody else
 	 * takes two locks at once, deadlock is not possible.
@@ -1580,7 +1580,7 @@ static void migrate_timers(int cpu)
 
 	spin_unlock(&old_base->lock);
 	spin_unlock_irq(&new_base->lock);
-	put_cpu_ptr(&tvec_bases);
+	put_cpu_ptr(&timer_bases);
 }
 
 static int timer_cpu_notify(struct notifier_block *self,
@@ -1608,13 +1608,13 @@ static inline void timer_register_cpu_no
 
 static void __init init_timer_cpu(int cpu)
 {
-	struct tvec_base *base = per_cpu_ptr(&tvec_bases, cpu);
+	struct timer_base *base = per_cpu_ptr(&timer_bases, cpu);
 
 	base->cpu = cpu;
 	spin_lock_init(&base->lock);
 
-	base->timer_jiffies = jiffies;
-	base->next_timer = base->timer_jiffies;
+	base->clk = jiffies;
+	base->next_timer = base->clk;
 }
 
 static void __init init_timer_cpus(void)

^ permalink raw reply	[flat|nested] 52+ messages in thread

* [patch V2 12/20] timer: Switch to a non cascading wheel
  2016-06-17 13:26 [patch V2 00/20] timer: Refactor the timer wheel Thomas Gleixner
                   ` (10 preceding siblings ...)
  2016-06-17 13:26 ` [patch V2 11/20] timer: Give a few structs and members proper names Thomas Gleixner
@ 2016-06-17 13:26 ` Thomas Gleixner
  2016-06-18  9:55   ` George Spelvin
  2016-06-17 13:26 ` [patch V2 13/20] timer: Remove slack leftovers Thomas Gleixner
                   ` (11 subsequent siblings)
  23 siblings, 1 reply; 52+ messages in thread
From: Thomas Gleixner @ 2016-06-17 13:26 UTC (permalink / raw)
  To: LKML
  Cc: Ingo Molnar, Peter Zijlstra, Paul E. McKenney, Eric Dumazet,
	Frederic Weisbecker, Chris Mason, Arjan van de Ven, rt,
	Rik van Riel, Linus Torvalds, George Spelvin, Len Brown

[-- Attachment #1: timer_Switch_to_a_non_cascading_wheel.patch --]
[-- Type: text/plain, Size: 38302 bytes --]

The current timer wheel has some drawbacks:

1) Cascading

   Cascading can be an unbounded operation and is completely pointless in most
   cases because the vast majority of the timer wheel timers are canceled or
   rearmed before expiration.

2) No fast lookup of the next expiring timer

   In NOHZ scenarios the first timer soft interrupt after a long NOHZ period
   must fast forward the base time to current jiffies. As we have no way to
   find the next expiring timer fast, the code loops and increments the base
   time by one and checks for expired timers in each step.

After a thorough analysis of real world data gathered on laptops,
workstations, webservers and other machines (thanks Chris!) I came to the
conclusion that the current 'classic' timer wheel implementation can be
modified to address the above issues.

The vast majority of timer wheel timers is canceled or rearmed before
expiry. Most of them are timeouts for networking and other I/O tasks. The
nature of timeouts is to catch the exception from normal operation (TCP ack
timed out, disk does not respond, etc.). For these kinds of timeouts the
accuracy is not really a concern. In case the timeout fires, performance is
down the drain already.

The few timers which actually expire can be split into two categories:

 1) Short expiry times which expect halfway accurate expiry

 2) Long term expiry times are inaccurate today already due to the batching
    which is done for NOHZ.

So for long term expiry timers we can avoid the cascading property and just
leave them in the less granular outer wheels until expiry or
cancellation. Timers which are armed with a timeout larger than the wheel
capacity are no longer cascaded. We expire them with the longest possible
timeout (6+ days). We have not observed such timeouts in our data collection,
but at least we handle them with the least surprising effect.

To avoid extending the wheel levels for HZ=1000 so we can accommodate the
longest observed timeouts (5 days in the network conntrack code) we reduce the
first level granularity on HZ=1000 to 4ms, which effectively is the same as
the HZ=250 behaviour. From our data analysis there is nothing which relies on
that 1ms granularity and as a side effect we get better batching and timer
locality for the networking code as well.

Contrary to the classic wheel, the granularity of a level is not the capacity
of the previous level. In the currently chosen setting the granularity of each
level is 8 times the granularity of the previous level. So for HZ=250 we end
up with the following granularity levels:

Level Offset  Granularity            Range
  0   0          4 ms                 0 ms -        252 ms
  1  64         32 ms               256 ms -       2044 ms (256ms - ~2s)
  2 128        256 ms              2048 ms -      16380 ms (~2s - ~16s)
  3 192       2048 ms (~2s)       16384 ms -     131068 ms (~16s - ~2m)
  4 256      16384 ms (~16s)     131072 ms -    1048572 ms (~2m - ~17m)
  5 320     131072 ms (~2m)     1048576 ms -    8388604 ms (~17m - ~2h)
  6 384    1048576 ms (~17m)    8388608 ms -   67108863 ms (~2h - ~18h)
  7 448    8388608 ms (~2h)    67108864 ms -  536870911 ms (~18h - ~6d)

That's a worst-case inaccuracy of 12.5% for the timers which are queued at the
beginning of a level.

So the new wheel concept addresses the old issues:

1) Cascading is avoided (except for extreme long time timers)

2) By keeping the timers in the bucket until expiry/cancellation we can track
   the buckets which have timers enqueued in a bucket bitmap and therefore can
   look up the next expiring timer in fast and bounded time.

A further benefit of the concept is that the slack calculation, which is done
on every timer start, is no longer necessary because the granularity levels
provide natural batching already.

Our extensive testing with various loads did not show any performance
degradation vs. the current wheel implementation.

This patch does not address the 'fast lookup' issue as we wanted to make sure
that there is no regression introduced by the wheel redesign. The
optimizations are in follow up patches.

[ Contains fixes from Anna-Maria Gleixner and Richard Cochran ]

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Chris Mason <clm@fb.com>
Cc: Eric Dumazet <edumazet@google.com>
Cc: rt@linutronix.de
Cc: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>
Cc: Arjan van de Ven <arjan@infradead.org>

---
v4: Simplify wheel constants handling as pointed out by George Spelvin. Switch
    to a non cascading wheel and let HZ=1000 have reduced granularity to
    accommodate the 5-day timeouts of the networking code.

v3: fix return value of __next_timer_interrupt()
v2: change HASH_SIZE to TOT_HASH_SIZE (as Richard mentioned)

 include/linux/timer.h |    2 
 kernel/time/timer.c   |  825 ++++++++++++++++++++++++++++----------------------
 2 files changed, 467 insertions(+), 360 deletions(-)

--- a/include/linux/timer.h
+++ b/include/linux/timer.h
@@ -64,6 +64,8 @@ struct timer_list {
 #define TIMER_DEFERRABLE	0x00100000
 #define TIMER_PINNED		0x00200000
 #define TIMER_IRQSAFE		0x00400000
+#define TIMER_ARRAYSHIFT	23
+#define TIMER_ARRAYMASK		0xFF800000
 
 #define __TIMER_INITIALIZER(_function, _expires, _data, _flags) { \
 		.entry = { .next = TIMER_ENTRY_STATIC },	\
--- a/kernel/time/timer.c
+++ b/kernel/time/timer.c
@@ -59,43 +59,147 @@
 EXPORT_SYMBOL(jiffies_64);
 
 /*
- * per-CPU timer vector definitions:
+ * The timer wheel has LVL_DEPTH array levels. Each level provides an array of
+ * LVL_SIZE buckets. Each level is driven by its own clock and therefore each
+ * level has a different granularity.
+ *
+ * The level granularity is:		LVL_CLK_DIV ^ lvl
+ * The level clock frequency is:	HZ / (LVL_CLK_DIV ^ level)
+ *
+ * The array level of a newly armed timer depends on the relative expiry
+ * time. The farther away the expiry time is, the higher the array level and
+ * therefore the coarser the granularity becomes.
+ *
+ * Contrary to the original timer wheel implementation, which aims for 'exact'
+ * expiry of the timers, this implementation mostly removes the need for
+ * recascading the timers into the lower array levels. The previous 'classic'
+ * timer wheel implementation of the kernel already violated the 'exact'
+ * expiry by adding slack to the expiry time to provide batched
+ * expiration. The granularity levels provide implicit batching.
+ *
+ * This is an optimization of the original timer wheel implementation for the
+ * majority of the timer wheel use cases: timeouts. The vast majority of
+ * timeout timers (networking, disk I/O ...) are canceled before expiry. If
+ * the timeout expires it indicates that normal operation is disturbed, so it
+ * does not matter much whether the timeout comes with a slight delay.
+ *
+ * We don't have cascading anymore. Timers with an expiry time above the
+ * capacity of the last wheel level are force expired at the maximum timeout
+ * value of the last wheel level. From data sampling we know that the maximum
+ * value observed is 5 days (network connection tracking), so this should not
+ * be an issue.
+ *
+ * The currently chosen array constants values are a good compromise between
+ * array size and granularity.
+ *
+ * For HZ=1000 we use a trick to fit the 5+ days into the wheel. We reduce the
+ * granularity of the first level to 4ms - same as HZ=250 - and degrade from
+ * there. That has the nice side effect that we get better batching and
+ * residence of short lived networking timeouts in the first level even on
+ * HZ=1000 which greatly reduces the requirement to acquire base->lock in the
+ * networking hotpath.
+ *
+ * This results in the following granularity and range levels:
+ *
+ * HZ 1000 and 250
+ * Level Offset  Granularity            Range
+ *  0	   0          4 ms               0 ms -        255 ms
+ *  1	  64         32 ms             256 ms -       2047 ms (256ms - ~2s)
+ *  2	 128        256 ms            2048 ms -      16383 ms (~2s - ~16s)
+ *  3	 192       2048 ms (~2s)     16384 ms -     131071 ms (~16s - ~2m)
+ *  4	 256      16384 ms (~16s)   131072 ms -    1048575 ms (~2m - ~17m)
+ *  5	 320     131072 ms (~2m)   1048576 ms -    8388607 ms (~17m - ~2h)
+ *  6	 384    1048576 ms (~17m)  8388608 ms -   67108863 ms (~2h - ~18h)
+ *  7	 448    8388608 ms (~2h)  67108864 ms -  536870911 ms (~18h - ~6d)
+ *
+ * HZ  300
+ * Level Offset  Granularity            Range
+ *  0	   0          3 ms               0 ms -        210 ms
+ *  1	  64         26 ms             213 ms -       1703 ms (213ms - ~1s)
+ *  2	 128        213 ms            1706 ms -      13650 ms (~1s - ~13s)
+ *  3	 192       1706 ms (~1s)     13653 ms -     109223 ms (~13s - ~1m)
+ *  4	 256      13653 ms (~13s)   109226 ms -     873810 ms (~1m - ~14m)
+ *  5	 320     109226 ms (~1m)    873813 ms -    6990503 ms (~14m - ~1h)
+ *  6	 384     873813 ms (~14m)  6990506 ms -   55924050 ms (~1h - ~15h)
+ *  7	 448    6990506 ms (~1h)  55924053 ms -  447392423 ms (~15h - ~5d)
+ *
+ * HZ  100
+ * Level Offset  Granularity            Range
+ *  0	   0         10 ms               0 ms -        630 ms
+ *  1	  64         80 ms             640 ms -       5110 ms (640ms - ~5s)
+ *  2	 128        640 ms            5120 ms -      40950 ms (~5s - ~40s)
+ *  3	 192       5120 ms (~5s)     40960 ms -     327670 ms (~40s - ~5m)
+ *  4	 256      40960 ms (~40s)   327680 ms -    2621430 ms (~5m - ~43m)
+ *  5	 320     327680 ms (~5m)   2621440 ms -   20971510 ms (~43m - ~5h)
+ *  6	 384    2621440 ms (~43m) 20971520 ms -  167772150 ms (~5h - ~1d)
+ *  7	 448   20971520 ms (~5h) 167772160 ms - 1342177270 ms (~1d - ~15d)
+ */
+
+/* Base clock shift */
+#if HZ == 1000
+# define BASE_CLK_SHIFT	2
+#else
+# define BASE_CLK_SHIFT	0
+#endif
+# define BASE_INCR	(1UL << BASE_CLK_SHIFT)
+# define BASE_MASK	(BASE_INCR - 1)
+# define BASE_RND_DN(n)	((n) & ~BASE_MASK)
+# define BASE_RND_UP(n)	(BASE_RND_DN(n) + BASE_INCR)
+
+/* Clock divisor for the next level */
+#define LVL_CLK_SHIFT	3
+#define LVL_CLK_DIV	(1UL << LVL_CLK_SHIFT)
+#define LVL_CLK_MASK	(LVL_CLK_DIV - 1)
+#define LVL_SHIFT(n)	(BASE_CLK_SHIFT + (n) * LVL_CLK_SHIFT)
+#define LVL_GRAN(n)	(1UL << LVL_SHIFT(n))
+
+/*
+ * The time start value for each level to select the bucket at enqueue
+ * time.
  */
-#define TVN_BITS (CONFIG_BASE_SMALL ? 4 : 6)
-#define TVR_BITS (CONFIG_BASE_SMALL ? 6 : 8)
-#define TVN_SIZE (1 << TVN_BITS)
-#define TVR_SIZE (1 << TVR_BITS)
-#define TVN_MASK (TVN_SIZE - 1)
-#define TVR_MASK (TVR_SIZE - 1)
-#define MAX_TVAL ((unsigned long)((1ULL << (TVR_BITS + 4*TVN_BITS)) - 1))
+#define LVL_START(n)	((LVL_SIZE - 1) << (((n) - 1) * LVL_CLK_SHIFT))
 
-struct tvec {
-	struct hlist_head vec[TVN_SIZE];
-};
+/* Size of each clock level */
+#define LVL_BITS	6
+#define LVL_SIZE	(1UL << LVL_BITS)
+#define LVL_MASK	(LVL_SIZE - 1)
+#define LVL_OFFS(n)	((n) * LVL_SIZE)
+
+/* Level depth */
+#define LVL_DEPTH	8
+
+/* The cutoff (max. capacity of the wheel) */
+#define WHEEL_TIMEOUT_CUTOFF	(LVL_START(LVL_DEPTH))
+#define WHEEL_TIMEOUT_MAX	(WHEEL_TIMEOUT_CUTOFF - LVL_GRAN(LVL_DEPTH - 1))
 
-struct tvec_root {
-	struct hlist_head vec[TVR_SIZE];
-};
+/*
+ * The resulting wheel size. If NOHZ is configured we allocate two
+ * wheels so we have a separate storage for the deferrable timers.
+ */
+#define WHEEL_SIZE	(LVL_SIZE * LVL_DEPTH)
+
+#ifdef CONFIG_NO_HZ_COMMON
+# define NR_BASES	2
+# define BASE_STD	0
+# define BASE_DEF	1
+#else
+# define NR_BASES	1
+# define BASE_STD	0
+# define BASE_DEF	0
+#endif
 
 struct timer_base {
-	spinlock_t lock;
-	struct timer_list *running_timer;
-	unsigned long clk;
-	unsigned long next_timer;
-	unsigned long active_timers;
-	unsigned long all_timers;
-	int cpu;
-	bool migration_enabled;
-	bool nohz_active;
-	struct tvec_root tv1;
-	struct tvec tv2;
-	struct tvec tv3;
-	struct tvec tv4;
-	struct tvec tv5;
+	spinlock_t		lock;
+	struct timer_list	*running_timer;
+	unsigned long		clk;
+	unsigned int		cpu;
+	bool			migration_enabled;
+	bool			nohz_active;
+	DECLARE_BITMAP(pending_map, WHEEL_SIZE);
+	struct hlist_head	vectors[WHEEL_SIZE];
 } ____cacheline_aligned;
 
-
-static DEFINE_PER_CPU(struct timer_base, timer_bases);
+static DEFINE_PER_CPU(struct timer_base, timer_bases[NR_BASES]);
 
 #if defined(CONFIG_SMP) && defined(CONFIG_NO_HZ_COMMON)
 unsigned int sysctl_timer_migration = 1;
@@ -106,15 +210,17 @@ void timers_update_migration(bool update
 	unsigned int cpu;
 
 	/* Avoid the loop, if nothing to update */
-	if (this_cpu_read(timer_bases.migration_enabled) == on)
+	if (this_cpu_read(timer_bases[BASE_STD].migration_enabled) == on)
 		return;
 
 	for_each_possible_cpu(cpu) {
-		per_cpu(timer_bases.migration_enabled, cpu) = on;
+		per_cpu(timer_bases[BASE_STD].migration_enabled, cpu) = on;
+		per_cpu(timer_bases[BASE_DEF].migration_enabled, cpu) = on;
 		per_cpu(hrtimer_bases.migration_enabled, cpu) = on;
 		if (!update_nohz)
 			continue;
-		per_cpu(timer_bases.nohz_active, cpu) = true;
+		per_cpu(timer_bases[BASE_STD].nohz_active, cpu) = true;
+		per_cpu(timer_bases[BASE_DEF].nohz_active, cpu) = true;
 		per_cpu(hrtimer_bases.nohz_active, cpu) = true;
 	}
 }
@@ -133,20 +239,6 @@ int timer_migration_handler(struct ctl_t
 	mutex_unlock(&mutex);
 	return ret;
 }
-
-static inline struct timer_base *get_target_base(struct timer_base *base,
-						int pinned)
-{
-	if (pinned || !base->migration_enabled)
-		return this_cpu_ptr(&timer_bases);
-	return per_cpu_ptr(&timer_bases, get_nohz_timer_target());
-}
-#else
-static inline struct timer_base *get_target_base(struct timer_base *base,
-						int pinned)
-{
-	return this_cpu_ptr(&timer_bases);
-}
 #endif
 
 static unsigned long round_jiffies_common(unsigned long j, int cpu,
@@ -370,78 +462,89 @@ void set_timer_slack(struct timer_list *
 }
 EXPORT_SYMBOL_GPL(set_timer_slack);
 
+static inline unsigned int timer_get_idx(struct timer_list *timer)
+{
+	return (timer->flags & TIMER_ARRAYMASK) >> TIMER_ARRAYSHIFT;
+}
+
+static inline void timer_set_idx(struct timer_list *timer, unsigned int idx)
+{
+	timer->flags = (timer->flags & ~TIMER_ARRAYMASK) |
+			idx << TIMER_ARRAYSHIFT;
+}
+
+/*
+ * Helper function to calculate the array index for a given expiry
+ * time.
+ */
+static inline unsigned calc_index(unsigned expires, unsigned lvl)
+{
+	expires = (expires + LVL_GRAN(lvl)) >> LVL_SHIFT(lvl);
+	return LVL_OFFS(lvl) + (expires & LVL_MASK);
+}
+
 static void
 __internal_add_timer(struct timer_base *base, struct timer_list *timer)
 {
 	unsigned long expires = timer->expires;
-	unsigned long idx = expires - base->clk;
+	unsigned long delta = expires - base->clk;
 	struct hlist_head *vec;
+	unsigned int idx;
 
-	if (idx < TVR_SIZE) {
-		int i = expires & TVR_MASK;
-		vec = base->tv1.vec + i;
-	} else if (idx < 1 << (TVR_BITS + TVN_BITS)) {
-		int i = (expires >> TVR_BITS) & TVN_MASK;
-		vec = base->tv2.vec + i;
-	} else if (idx < 1 << (TVR_BITS + 2 * TVN_BITS)) {
-		int i = (expires >> (TVR_BITS + TVN_BITS)) & TVN_MASK;
-		vec = base->tv3.vec + i;
-	} else if (idx < 1 << (TVR_BITS + 3 * TVN_BITS)) {
-		int i = (expires >> (TVR_BITS + 2 * TVN_BITS)) & TVN_MASK;
-		vec = base->tv4.vec + i;
-	} else if ((signed long) idx < 0) {
-		/*
-		 * Can happen if you add a timer with expires == jiffies,
-		 * or you set a timer to go off in the past
-		 */
-		vec = base->tv1.vec + (base->clk & TVR_MASK);
+	if (delta < LVL_START(1)) {
+		idx = calc_index(expires, 0);
+	} else if (delta < LVL_START(2)) {
+		idx = calc_index(expires, 1);
+	} else if (delta < LVL_START(3)) {
+		idx = calc_index(expires, 2);
+	} else if (delta < LVL_START(4)) {
+		idx = calc_index(expires, 3);
+	} else if (delta < LVL_START(5)) {
+		idx = calc_index(expires, 4);
+	} else if (delta < LVL_START(6)) {
+		idx = calc_index(expires, 5);
+	} else if (delta < LVL_START(7)) {
+		idx = calc_index(expires, 6);
+	} else if ((long) delta < 0) {
+		idx = (base->clk >> BASE_CLK_SHIFT) & LVL_MASK;
 	} else {
-		int i;
-		/* If the timeout is larger than MAX_TVAL (on 64-bit
-		 * architectures or with CONFIG_BASE_SMALL=1) then we
-		 * use the maximum timeout.
+		/*
+		 * Force expire obscene large timeouts at the capacity limit
+		 * of the wheel.
 		 */
-		if (idx > MAX_TVAL) {
-			idx = MAX_TVAL;
-			expires = idx + base->clk;
-		}
-		i = (expires >> (TVR_BITS + 3 * TVN_BITS)) & TVN_MASK;
-		vec = base->tv5.vec + i;
-	}
+		if (expires >= WHEEL_TIMEOUT_CUTOFF)
+			expires = WHEEL_TIMEOUT_MAX;
 
+		idx = calc_index(expires, 7);
+	}
+	/*
+	 * Enqueue the timer into the array bucket, mark it pending in
+	 * the bitmap and store the index in the timer flags.
+	 */
+	vec = base->vectors + idx;
 	hlist_add_head(&timer->entry, vec);
+	__set_bit(idx, base->pending_map);
+	timer_set_idx(timer, idx);
 }
 
 static void internal_add_timer(struct timer_base *base, struct timer_list *timer)
 {
-	/* Advance base->jiffies, if the base is empty */
-	if (!base->all_timers++)
-		base->clk = jiffies;
-
 	__internal_add_timer(base, timer);
-	/*
-	 * Update base->active_timers and base->next_timer
-	 */
-	if (!(timer->flags & TIMER_DEFERRABLE)) {
-		if (!base->active_timers++ ||
-		    time_before(timer->expires, base->next_timer))
-			base->next_timer = timer->expires;
-	}
 
 	/*
 	 * Check whether the other CPU is in dynticks mode and needs
-	 * to be triggered to reevaluate the timer wheel.
-	 * We are protected against the other CPU fiddling
-	 * with the timer by holding the timer base lock. This also
-	 * makes sure that a CPU on the way to stop its tick can not
-	 * evaluate the timer wheel.
+	 * to be triggered to reevaluate the timer wheel.  We are
+	 * protected against the other CPU fiddling with the timer by
+	 * holding the timer base lock. This also makes sure that a
+	 * CPU on the way to stop its tick can not evaluate the timer
+	 * wheel.
 	 *
 	 * Spare the IPI for deferrable timers on idle targets though.
 	 * The next busy ticks will take care of it. Except full dynticks
 	 * require special care against races with idle_cpu(), lets deal
 	 * with that later.
 	 */
-	if (base->nohz_active) {
+	if (IS_ENABLED(CONFIG_NO_HZ_COMMON) && base->nohz_active) {
 		if (!(timer->flags & TIMER_DEFERRABLE) ||
 		    tick_nohz_full_cpu(base->cpu))
 			wake_up_nohz_cpu(base->cpu);
@@ -706,54 +809,87 @@ static inline void detach_timer(struct t
 	entry->next = LIST_POISON2;
 }
 
-static inline void
-detach_expired_timer(struct timer_list *timer, struct timer_base *base)
-{
-	detach_timer(timer, true);
-	if (!(timer->flags & TIMER_DEFERRABLE))
-		base->active_timers--;
-	base->all_timers--;
-}
-
 static int detach_if_pending(struct timer_list *timer, struct timer_base *base,
 			     bool clear_pending)
 {
+	unsigned idx = timer_get_idx(timer);
+
 	if (!timer_pending(timer))
 		return 0;
 
+	if (hlist_is_singular_node(&timer->entry, base->vectors + idx))
+		__clear_bit(idx, base->pending_map);
+
 	detach_timer(timer, clear_pending);
-	if (!(timer->flags & TIMER_DEFERRABLE)) {
-		base->active_timers--;
-		if (timer->expires == base->next_timer)
-			base->next_timer = base->clk;
-	}
-	/* If this was the last timer, advance base->jiffies */
-	if (!--base->all_timers)
-		base->clk = jiffies;
 	return 1;
 }
 
+static inline struct timer_base *get_timer_cpu_base(u32 tflags, u32 cpu)
+{
+	struct timer_base *base = per_cpu_ptr(&timer_bases[BASE_STD], cpu);
+
+	/*
+	 * If the timer is deferrable and nohz is active then we need to use
+	 * the deferrable base.
+	 */
+	if (IS_ENABLED(CONFIG_NO_HZ_COMMON) && base->nohz_active &&
+	    (tflags & TIMER_DEFERRABLE))
+		base = per_cpu_ptr(&timer_bases[BASE_DEF], cpu);
+	return base;
+}
+
+static inline struct timer_base *get_timer_this_cpu_base(u32 tflags)
+{
+	struct timer_base *base = this_cpu_ptr(&timer_bases[BASE_STD]);
+
+	/*
+	 * If the timer is deferrable and nohz is active then we need to use
+	 * the deferrable base.
+	 */
+	if (IS_ENABLED(CONFIG_NO_HZ_COMMON) && base->nohz_active &&
+	    (tflags & TIMER_DEFERRABLE))
+		base = this_cpu_ptr(&timer_bases[BASE_DEF]);
+	return base;
+}
+
+static inline struct timer_base *get_timer_base(u32 tflags)
+{
+	return get_timer_cpu_base(tflags, tflags & TIMER_CPUMASK);
+}
+
+static inline struct timer_base *get_target_base(struct timer_base *base,
+						 unsigned tflags)
+{
+#if defined(CONFIG_NO_HZ_COMMON) && defined(CONFIG_SMP)
+	if ((tflags & TIMER_PINNED) || !base->migration_enabled)
+		return get_timer_this_cpu_base(tflags);
+	return get_timer_cpu_base(tflags, get_nohz_timer_target());
+#else
+	return get_timer_this_cpu_base(tflags);
+#endif
+}
+
 /*
- * We are using hashed locking: holding per_cpu(timer_bases).lock
- * means that all timers which are tied to this base via timer->base are
- * locked, and the base itself is locked too.
+ * We are using hashed locking: Holding per_cpu(timer_bases[x]).lock means
+ * that all timers which are tied to this base are locked, and the base itself
+ * is locked too.
  *
  * So __run_timers/migrate_timers can safely modify all timers which could
- * be found on ->tvX lists.
+ * be found in the base->vectors array.
  *
- * When the timer's base is locked and removed from the list, the
- * TIMER_MIGRATING flag is set, FIXME
+ * When a timer is migrating then the TIMER_MIGRATING flag is set and we need
+ * to wait until the migration is done.
  */
 static struct timer_base *lock_timer_base(struct timer_list *timer,
-					unsigned long *flags)
+					  unsigned long *flags)
 	__acquires(timer->base->lock)
 {
 	for (;;) {
-		u32 tf = timer->flags;
 		struct timer_base *base;
+		u32 tf = timer->flags;
 
 		if (!(tf & TIMER_MIGRATING)) {
-			base = per_cpu_ptr(&timer_bases, tf & TIMER_CPUMASK);
+			base = get_timer_base(tf);
 			spin_lock_irqsave(&base->lock, *flags);
 			if (timer->flags == tf)
 				return base;
@@ -770,6 +906,27 @@ static inline int
 	unsigned long flags;
 	int ret = 0;
 
+	/*
+	 * TODO: Calculate the array bucket of the timer right here w/o
+	 * holding the base lock. This allows us to check not only
+	 * timer->expires == expires below, but also whether the timer
+	 * ends up in the same bucket. If we really need to requeue
+	 * the timer, then we check whether base->clk has
+	 * advanced between here and locking the timer base. If
+	 * jiffies has advanced, we have to recalculate the array
+	 * bucket with the lock held.
+	 */
+
+	/*
+	 * This is a common optimization triggered by the
+	 * networking code - if the timer is re-modified
+	 * to be the same thing then just return:
+	 */
+	if (timer_pending(timer)) {
+		if (timer->expires == expires)
+			return 1;
+	}
+
 	timer_stats_timer_set_start_info(timer);
 	BUG_ON(!timer->function);
 
@@ -781,15 +938,15 @@ static inline int
 
 	debug_activate(timer, expires);
 
-	new_base = get_target_base(base, timer->flags & TIMER_PINNED);
+	new_base = get_target_base(base, timer->flags);
 
 	if (base != new_base) {
 		/*
-		 * We are trying to schedule the timer on the local CPU.
+		 * We are trying to schedule the timer on the new base.
 		 * However we can't change timer's base while it is running,
 		 * otherwise del_timer_sync() can't detect that the timer's
-		 * handler yet has not finished. This also guarantees that
-		 * the timer is serialized wrt itself.
+		 * handler yet has not finished. This also guarantees that the
+		 * timer is serialized wrt itself.
 		 */
 		if (likely(base->running_timer != timer)) {
 			/* See the comment in lock_timer_base() */
@@ -828,45 +985,6 @@ int mod_timer_pending(struct timer_list
 }
 EXPORT_SYMBOL(mod_timer_pending);
 
-/*
- * Decide where to put the timer while taking the slack into account
- *
- * Algorithm:
- *   1) calculate the maximum (absolute) time
- *   2) calculate the highest bit where the expires and new max are different
- *   3) use this bit to make a mask
- *   4) use the bitmask to round down the maximum time, so that all last
- *      bits are zeros
- */
-static inline
-unsigned long apply_slack(struct timer_list *timer, unsigned long expires)
-{
-	unsigned long expires_limit, mask;
-	int bit;
-
-	if (timer->slack >= 0) {
-		expires_limit = expires + timer->slack;
-	} else {
-		long delta = expires - jiffies;
-
-		if (delta < 256)
-			return expires;
-
-		expires_limit = expires + delta / 256;
-	}
-	mask = expires ^ expires_limit;
-	if (mask == 0)
-		return expires;
-
-	bit = __fls(mask);
-
-	mask = (1UL << bit) - 1;
-
-	expires_limit = expires_limit & ~(mask);
-
-	return expires_limit;
-}
-
 /**
  * mod_timer - modify a timer's timeout
  * @timer: the timer to be modified
@@ -889,16 +1007,6 @@ unsigned long apply_slack(struct timer_l
  */
 int mod_timer(struct timer_list *timer, unsigned long expires)
 {
-	expires = apply_slack(timer, expires);
-
-	/*
-	 * This is a common optimization triggered by the
-	 * networking code - if the timer is re-modified
-	 * to be the same thing then just return:
-	 */
-	if (timer_pending(timer) && timer->expires == expires)
-		return 1;
-
 	return __mod_timer(timer, expires, false);
 }
 EXPORT_SYMBOL(mod_timer);
@@ -933,13 +1041,14 @@ EXPORT_SYMBOL(add_timer);
  */
 void add_timer_on(struct timer_list *timer, int cpu)
 {
-	struct timer_base *new_base = per_cpu_ptr(&timer_bases, cpu);
-	struct timer_base *base;
+	struct timer_base *new_base, *base;
 	unsigned long flags;
 
 	timer_stats_timer_set_start_info(timer);
 	BUG_ON(timer_pending(timer) || !timer->function);
 
+	new_base = get_timer_cpu_base(timer->flags, cpu);
+
 	/*
 	 * If @timer was on a different CPU, it should be migrated with the
 	 * old base locked to prevent other operations proceeding with the
@@ -1085,27 +1194,6 @@ int del_timer_sync(struct timer_list *ti
 EXPORT_SYMBOL(del_timer_sync);
 #endif
 
-static int cascade(struct timer_base *base, struct tvec *tv, int index)
-{
-	/* cascade all the timers from tv up one level */
-	struct timer_list *timer;
-	struct hlist_node *tmp;
-	struct hlist_head tv_list;
-
-	hlist_move_list(tv->vec + index, &tv_list);
-
-	/*
-	 * We are removing _all_ timers from the list, so we
-	 * don't have to detach them individually.
-	 */
-	hlist_for_each_entry_safe(timer, tmp, &tv_list, entry) {
-		/* No accounting, while moving them */
-		__internal_add_timer(base, timer);
-	}
-
-	return index;
-}
-
 static void call_timer_fn(struct timer_list *timer, void (*fn)(unsigned long),
 			  unsigned long data)
 {
@@ -1149,68 +1237,80 @@ static void call_timer_fn(struct timer_l
 	}
 }
 
-#define INDEX(N) ((base->clk >> (TVR_BITS + (N) * TVN_BITS)) & TVN_MASK)
+static void expire_timers(struct timer_base *base, struct hlist_head *head)
+{
+	while (!hlist_empty(head)) {
+		struct timer_list *timer;
+		void (*fn)(unsigned long);
+		unsigned long data;
+
+		timer = hlist_entry(head->first, struct timer_list, entry);
+		timer_stats_account_timer(timer);
+
+		base->running_timer = timer;
+		detach_timer(timer, true);
+
+		fn = timer->function;
+		data = timer->data;
+
+		if (timer->flags & TIMER_IRQSAFE) {
+			spin_unlock(&base->lock);
+			call_timer_fn(timer, fn, data);
+			spin_lock(&base->lock);
+		} else {
+			spin_unlock_irq(&base->lock);
+			call_timer_fn(timer, fn, data);
+			spin_lock_irq(&base->lock);
+		}
+	}
+}
+
+static int collect_expired_timers(struct timer_base *base,
+				  struct hlist_head *heads)
+{
+	unsigned long clk = base->clk >> BASE_CLK_SHIFT;
+	struct hlist_head *vec;
+	int i, levels = 0;
+	unsigned int idx;
+
+	for (i = 0; i < LVL_DEPTH; i++) {
+		idx = (clk & LVL_MASK) + i * LVL_SIZE;
+
+		if (__test_and_clear_bit(idx, base->pending_map)) {
+			vec = base->vectors + idx;
+			hlist_move_list(vec, heads++);
+			levels++;
+		}
+		/* Is it time to look at the next level? */
+		if (clk & LVL_CLK_MASK)
+			break;
+		/* Shift clock for the next level granularity */
+		clk >>= LVL_CLK_SHIFT;
+	}
+	return levels;
+}
 
 /**
  * __run_timers - run all expired timers (if any) on this CPU.
  * @base: the timer vector to be processed.
- *
- * This function cascades all vectors and executes all expired timer
- * vectors.
  */
 static inline void __run_timers(struct timer_base *base)
 {
-	struct timer_list *timer;
+	struct hlist_head heads[LVL_DEPTH];
+	int levels;
+
+	if (!time_after_eq(jiffies, base->clk))
+		return;
 
 	spin_lock_irq(&base->lock);
 
 	while (time_after_eq(jiffies, base->clk)) {
-		struct hlist_head work_list;
-		struct hlist_head *head = &work_list;
-		int index;
-
-		if (!base->all_timers) {
-			base->clk = jiffies;
-			break;
-		}
 
-		index = base->clk & TVR_MASK;
+		levels = collect_expired_timers(base, heads);
+		base->clk += BASE_INCR;
 
-		/*
-		 * Cascade timers:
-		 */
-		if (!index &&
-			(!cascade(base, &base->tv2, INDEX(0))) &&
-				(!cascade(base, &base->tv3, INDEX(1))) &&
-					!cascade(base, &base->tv4, INDEX(2)))
-			cascade(base, &base->tv5, INDEX(3));
-		++base->clk;
-		hlist_move_list(base->tv1.vec + index, head);
-		while (!hlist_empty(head)) {
-			void (*fn)(unsigned long);
-			unsigned long data;
-			bool irqsafe;
-
-			timer = hlist_entry(head->first, struct timer_list, entry);
-			fn = timer->function;
-			data = timer->data;
-			irqsafe = timer->flags & TIMER_IRQSAFE;
-
-			timer_stats_account_timer(timer);
-
-			base->running_timer = timer;
-			detach_expired_timer(timer, base);
-
-			if (irqsafe) {
-				spin_unlock(&base->lock);
-				call_timer_fn(timer, fn, data);
-				spin_lock(&base->lock);
-			} else {
-				spin_unlock_irq(&base->lock);
-				call_timer_fn(timer, fn, data);
-				spin_lock_irq(&base->lock);
-			}
-		}
+		while (levels--)
+			expire_timers(base, heads + levels);
 	}
 	base->running_timer = NULL;
 	spin_unlock_irq(&base->lock);
@@ -1218,78 +1318,93 @@ static inline void __run_timers(struct t
 
 #ifdef CONFIG_NO_HZ_COMMON
 /*
- * Find out when the next timer event is due to happen. This
- * is used on S/390 to stop all activity when a CPU is idle.
- * This function needs to be called with interrupts disabled.
+ * Find the next pending bucket of a level. Search from @offset + @clk upwards
+ * and if nothing there, search from start of the level (@offset) up to
+ * @offset + clk.
+ */
+static int next_pending_bucket(struct timer_base *base, unsigned offset,
+			       unsigned clk)
+{
+	unsigned pos, start = offset + clk;
+	unsigned end = offset + LVL_SIZE;
+
+	pos = find_next_bit(base->pending_map, end, start);
+	if (pos < end)
+		return pos - start;
+
+	pos = find_next_bit(base->pending_map, start, offset);
+	return pos < start ? pos + LVL_SIZE - start : -1;
+}
+
+/*
+ * Search the first expiring timer in the various clock levels.
+ *
+ * Note: This implementation might be suboptimal vs. timers enqueued in the
+ *	 cascade level because we do not look at the timers to figure out when
+ *	 they really expire. So for now, we just treat the cascading timers
+ *	 like any other timer. If each cascading bucket has a timer, we wake
+ *	 up with the granularity of the last level.
  */
 static unsigned long __next_timer_interrupt(struct timer_base *base)
 {
-	unsigned long clk = base->clk;
-	unsigned long expires = clk + NEXT_TIMER_MAX_DELTA;
-	int index, slot, array, found = 0;
-	struct timer_list *nte;
-	struct tvec *varray[4];
-
-	/* Look for timer events in tv1. */
-	index = slot = clk & TVR_MASK;
-	do {
-		hlist_for_each_entry(nte, base->tv1.vec + slot, entry) {
-			if (nte->flags & TIMER_DEFERRABLE)
-				continue;
-
-			found = 1;
-			expires = nte->expires;
-			/* Look at the cascade bucket(s)? */
-			if (!index || slot < index)
-				goto cascade;
-			return expires;
-		}
-		slot = (slot + 1) & TVR_MASK;
-	} while (slot != index);
+	unsigned long clk, next, adj;
+	unsigned lvl, offset = 0;
 
-cascade:
-	/* Calculate the next cascade event */
-	if (index)
-		clk += TVR_SIZE - index;
-	clk >>= TVR_BITS;
-
-	/* Check tv2-tv5. */
-	varray[0] = &base->tv2;
-	varray[1] = &base->tv3;
-	varray[2] = &base->tv4;
-	varray[3] = &base->tv5;
-
-	for (array = 0; array < 4; array++) {
-		struct tvec *varp = varray[array];
-
-		index = slot = clk & TVN_MASK;
-		do {
-			hlist_for_each_entry(nte, varp->vec + slot, entry) {
-				if (nte->flags & TIMER_DEFERRABLE)
-					continue;
-
-				found = 1;
-				if (time_before(nte->expires, expires))
-					expires = nte->expires;
-			}
-			/*
-			 * Do we still search for the first timer or are
-			 * we looking up the cascade buckets ?
-			 */
-			if (found) {
-				/* Look at the cascade bucket(s)? */
-				if (!index || slot < index)
-					break;
-				return expires;
-			}
-			slot = (slot + 1) & TVN_MASK;
-		} while (slot != index);
-
-		if (index)
-			clk += TVN_SIZE - index;
-		clk >>= TVN_BITS;
+	spin_lock(&base->lock);
+	next = BASE_RND_UP(base->clk + NEXT_TIMER_MAX_DELTA);
+	clk = base->clk >> BASE_CLK_SHIFT;
+	for (lvl = 0; lvl < LVL_DEPTH; lvl++, offset += LVL_SIZE) {
+		int pos = next_pending_bucket(base, offset, clk & LVL_MASK);
+
+		if (pos >= 0) {
+			unsigned long tmp = clk + (unsigned long) pos;
+
+			tmp <<= LVL_SHIFT(lvl);
+			if (time_before(tmp, next))
+				next = tmp;
+		}
+		/*
+		 * Clock for the next level. If the current level clock lower
+		 * bits are zero, we look at the next level as is. If not, we
+		 * need to advance it by one because that's going to be the
+		 * next expiring bucket in that level. base->clk is the next
+		 * expiring jiffie. So in case of:
+		 *
+		 * LVL5 LVL4 LVL3 LVL2 LVL1 LVL0
+		 *  0    0    0    0    0    0
+		 *
+		 * we have to look at all levels @index 0. With
+		 *
+		 * LVL5 LVL4 LVL3 LVL2 LVL1 LVL0
+		 *  0    0    0    0    0    2
+		 *
+		 * LVL0 has the next expiring bucket @index 2. The upper
+		 * levels have the next expiring bucket @index 1.
+		 *
+		 * In case that the propagation wraps the next level the same
+		 * rules apply:
+		 *
+		 * LVL5 LVL4 LVL3 LVL2 LVL1 LVL0
+		 *  0    0    0    0    F    2
+		 *
+		 * So after looking at LVL0 we get:
+		 *
+		 * LVL5 LVL4 LVL3 LVL2 LVL1
+		 *  0    0    0    1    0
+		 *
+		 * So no propagation from LVL1 to LVL2 because that happened
+		 * with the add already, but then we need to propagate further
+		 * from LVL2 to LVL3.
+		 *
+		 * So the simple check whether the lower bits of the current
+		 * level are 0 or not is sufficient for all cases.
+		 */
+		adj = clk & LVL_CLK_MASK ? 1 : 0;
+		clk >>= LVL_CLK_SHIFT;
+		clk += adj;
 	}
-	return expires;
+	spin_unlock(&base->lock);
+	return next;
 }
 
 /*
@@ -1335,7 +1450,7 @@ static u64 cmp_next_hrtimer_event(u64 ba
  */
 u64 get_next_timer_interrupt(unsigned long basej, u64 basem)
 {
-	struct timer_base *base = this_cpu_ptr(&timer_bases);
+	struct timer_base *base = this_cpu_ptr(&timer_bases[BASE_STD]);
 	u64 expires = KTIME_MAX;
 	unsigned long nextevt;
 
@@ -1346,17 +1461,11 @@ u64 get_next_timer_interrupt(unsigned lo
 	if (cpu_is_offline(smp_processor_id()))
 		return expires;
 
-	spin_lock(&base->lock);
-	if (base->active_timers) {
-		if (time_before_eq(base->next_timer, base->clk))
-			base->next_timer = __next_timer_interrupt(base);
-		nextevt = base->next_timer;
-		if (time_before_eq(nextevt, basej))
-			expires = basem;
-		else
-			expires = basem + (nextevt - basej) * TICK_NSEC;
-	}
-	spin_unlock(&base->lock);
+	nextevt = __next_timer_interrupt(base);
+	if (time_before_eq(nextevt, basej))
+		expires = basem;
+	else
+		expires = basem + (nextevt - basej) * TICK_NSEC;
 
 	return cmp_next_hrtimer_event(basem, expires);
 }
@@ -1387,10 +1496,11 @@ void update_process_times(int user_tick)
  */
 static void run_timer_softirq(struct softirq_action *h)
 {
-	struct timer_base *base = this_cpu_ptr(&timer_bases);
+	struct timer_base *base = this_cpu_ptr(&timer_bases[BASE_STD]);
 
-	if (time_after_eq(jiffies, base->clk))
-		__run_timers(base);
+	__run_timers(base);
+	if (IS_ENABLED(CONFIG_NO_HZ_COMMON) && base->nohz_active)
+		__run_timers(this_cpu_ptr(&timer_bases[BASE_DEF]));
 }
 
 /*
@@ -1541,7 +1651,6 @@ static void migrate_timer_list(struct ti
 
 	while (!hlist_empty(head)) {
 		timer = hlist_entry(head->first, struct timer_list, entry);
-		/* We ignore the accounting on the dying cpu */
 		detach_timer(timer, false);
 		timer->flags = (timer->flags & ~TIMER_BASEMASK) | cpu;
 		internal_add_timer(new_base, timer);
@@ -1552,35 +1661,29 @@ static void migrate_timers(int cpu)
 {
 	struct timer_base *old_base;
 	struct timer_base *new_base;
-	int i;
+	int b, i;
 
 	BUG_ON(cpu_online(cpu));
-	old_base = per_cpu_ptr(&timer_bases, cpu);
-	new_base = get_cpu_ptr(&timer_bases);
-	/*
-	 * The caller is globally serialized and nobody else
-	 * takes two locks at once, deadlock is not possible.
-	 */
-	spin_lock_irq(&new_base->lock);
-	spin_lock_nested(&old_base->lock, SINGLE_DEPTH_NESTING);
 
-	BUG_ON(old_base->running_timer);
+	for (b = 0; b < NR_BASES; b++) {
+		old_base = per_cpu_ptr(&timer_bases[b], cpu);
+		new_base = get_cpu_ptr(&timer_bases[b]);
+		/*
+		 * The caller is globally serialized and nobody else
+		 * takes two locks at once, deadlock is not possible.
+		 */
+		spin_lock_irq(&new_base->lock);
+		spin_lock_nested(&old_base->lock, SINGLE_DEPTH_NESTING);
+
+		BUG_ON(old_base->running_timer);
+
+		for (i = 0; i < WHEEL_SIZE; i++)
+			migrate_timer_list(new_base, old_base->vectors + i);
 
-	for (i = 0; i < TVR_SIZE; i++)
-		migrate_timer_list(new_base, old_base->tv1.vec + i);
-	for (i = 0; i < TVN_SIZE; i++) {
-		migrate_timer_list(new_base, old_base->tv2.vec + i);
-		migrate_timer_list(new_base, old_base->tv3.vec + i);
-		migrate_timer_list(new_base, old_base->tv4.vec + i);
-		migrate_timer_list(new_base, old_base->tv5.vec + i);
-	}
-
-	old_base->active_timers = 0;
-	old_base->all_timers = 0;
-
-	spin_unlock(&old_base->lock);
-	spin_unlock_irq(&new_base->lock);
-	put_cpu_ptr(&timer_bases);
+		spin_unlock(&old_base->lock);
+		spin_unlock_irq(&new_base->lock);
+		put_cpu_ptr(&timer_bases);
+	}
 }
 
 static int timer_cpu_notify(struct notifier_block *self,
@@ -1608,13 +1711,15 @@ static inline void timer_register_cpu_no
 
 static void __init init_timer_cpu(int cpu)
 {
-	struct timer_base *base = per_cpu_ptr(&timer_bases, cpu);
-
-	base->cpu = cpu;
-	spin_lock_init(&base->lock);
+	struct timer_base *base;
+	int i;
 
-	base->clk = jiffies;
-	base->next_timer = base->clk;
+	for (i = 0; i < NR_BASES; i++) {
+		base = per_cpu_ptr(&timer_bases[i], cpu);
+		base->cpu = cpu;
+		spin_lock_init(&base->lock);
+		base->clk = BASE_RND_UP(jiffies);
+	}
 }
 
 static void __init init_timer_cpus(void)

^ permalink raw reply	[flat|nested] 52+ messages in thread

* [patch V2 13/20] timer: Remove slack leftovers
  2016-06-17 13:26 [patch V2 00/20] timer: Refactor the timer wheel Thomas Gleixner
                   ` (11 preceding siblings ...)
  2016-06-17 13:26 ` [patch V2 12/20] timer: Switch to a non cascading wheel Thomas Gleixner
@ 2016-06-17 13:26 ` Thomas Gleixner
  2016-06-17 13:26 ` [patch V2 14/20] timer: Move __run_timers() function Thomas Gleixner
                   ` (10 subsequent siblings)
  23 siblings, 0 replies; 52+ messages in thread
From: Thomas Gleixner @ 2016-06-17 13:26 UTC (permalink / raw)
  To: LKML
  Cc: Ingo Molnar, Peter Zijlstra, Paul E. McKenney, Eric Dumazet,
	Frederic Weisbecker, Chris Mason, Arjan van de Ven, rt,
	Rik van Riel, Linus Torvalds, George Spelvin, Len Brown

[-- Attachment #1: timer_Remove_slack_leftovers.patch --]
[-- Type: text/plain, Size: 5252 bytes --]

We now have implicit batching in the timer wheel. The slack is no longer
used. Remove it.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Chris Mason <clm@fb.com>
Cc: Eric Dumazet <edumazet@google.com>
Cc: rt@linutronix.de
Cc: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>
Cc: Arjan van de Ven <arjan@infradead.org>


---
 block/genhd.c                   |    5 -----
 drivers/mmc/host/jz4740_mmc.c   |    2 --
 drivers/power/bq27xxx_battery.c |    5 +----
 drivers/usb/host/ohci-hcd.c     |    1 -
 drivers/usb/host/xhci.c         |    2 --
 include/linux/timer.h           |    4 ----
 kernel/time/timer.c             |   19 -------------------
 lib/random32.c                  |    1 -
 8 files changed, 1 insertion(+), 38 deletions(-)

--- a/block/genhd.c
+++ b/block/genhd.c
@@ -1523,12 +1523,7 @@ static void __disk_unblock_events(struct
 	if (--ev->block)
 		goto out_unlock;
 
-	/*
-	 * Not exactly a latency critical operation, set poll timer
-	 * slack to 25% and kick event check.
-	 */
 	intv = disk_events_poll_jiffies(disk);
-	set_timer_slack(&ev->dwork.timer, intv / 4);
 	if (check_now)
 		queue_delayed_work(system_freezable_power_efficient_wq,
 				&ev->dwork, 0);
--- a/drivers/mmc/host/jz4740_mmc.c
+++ b/drivers/mmc/host/jz4740_mmc.c
@@ -1068,8 +1068,6 @@ static int jz4740_mmc_probe(struct platf
 	jz4740_mmc_clock_disable(host);
 	setup_timer(&host->timeout_timer, jz4740_mmc_timeout,
 			(unsigned long)host);
-	/* It is not important when it times out, it just needs to timeout. */
-	set_timer_slack(&host->timeout_timer, HZ);
 
 	host->use_dma = true;
 	if (host->use_dma && jz4740_mmc_acquire_dma_channels(host) != 0)
--- a/drivers/power/bq27xxx_battery.c
+++ b/drivers/power/bq27xxx_battery.c
@@ -735,11 +735,8 @@ static void bq27xxx_battery_poll(struct
 
 	bq27xxx_battery_update(di);
 
-	if (poll_interval > 0) {
-		/* The timer does not have to be accurate. */
-		set_timer_slack(&di->work.timer, poll_interval * HZ / 4);
+	if (poll_interval > 0)
 		schedule_delayed_work(&di->work, poll_interval * HZ);
-	}
 }
 
 /*
--- a/drivers/usb/host/ohci-hcd.c
+++ b/drivers/usb/host/ohci-hcd.c
@@ -500,7 +500,6 @@ static int ohci_init (struct ohci_hcd *o
 
 	setup_timer(&ohci->io_watchdog, io_watchdog_func,
 			(unsigned long) ohci);
-	set_timer_slack(&ohci->io_watchdog, msecs_to_jiffies(20));
 
 	ohci->hcca = dma_alloc_coherent (hcd->self.controller,
 			sizeof(*ohci->hcca), &ohci->hcca_dma, GFP_KERNEL);
--- a/drivers/usb/host/xhci.c
+++ b/drivers/usb/host/xhci.c
@@ -490,8 +490,6 @@ static void compliance_mode_recovery_tim
 	xhci->comp_mode_recovery_timer.expires = jiffies +
 			msecs_to_jiffies(COMP_MODE_RCVRY_MSECS);
 
-	set_timer_slack(&xhci->comp_mode_recovery_timer,
-			msecs_to_jiffies(COMP_MODE_RCVRY_MSECS));
 	add_timer(&xhci->comp_mode_recovery_timer);
 	xhci_dbg_trace(xhci, trace_xhci_dbg_quirks,
 			"Compliance mode recovery timer initialized");
--- a/include/linux/timer.h
+++ b/include/linux/timer.h
@@ -19,7 +19,6 @@ struct timer_list {
 	void			(*function)(unsigned long);
 	unsigned long		data;
 	u32			flags;
-	int			slack;
 
 #ifdef CONFIG_TIMER_STATS
 	int			start_pid;
@@ -73,7 +72,6 @@ struct timer_list {
 		.expires = (_expires),				\
 		.data = (_data),				\
 		.flags = (_flags),				\
-		.slack = -1,					\
 		__TIMER_LOCKDEP_MAP_INITIALIZER(		\
 			__FILE__ ":" __stringify(__LINE__))	\
 	}
@@ -193,8 +191,6 @@ extern int del_timer(struct timer_list *
 extern int mod_timer(struct timer_list *timer, unsigned long expires);
 extern int mod_timer_pending(struct timer_list *timer, unsigned long expires);
 
-extern void set_timer_slack(struct timer_list *time, int slack_hz);
-
 /*
  * The jiffies value which is added to now, when there is no timer
  * in the timer wheel:
--- a/kernel/time/timer.c
+++ b/kernel/time/timer.c
@@ -443,24 +443,6 @@ unsigned long round_jiffies_up_relative(
 }
 EXPORT_SYMBOL_GPL(round_jiffies_up_relative);
 
-/**
- * set_timer_slack - set the allowed slack for a timer
- * @timer: the timer to be modified
- * @slack_hz: the amount of time (in jiffies) allowed for rounding
- *
- * Set the amount of time, in jiffies, that a certain timer has
- * in terms of slack. By setting this value, the timer subsystem
- * will schedule the actual timer somewhere between
- * the time mod_timer() asks for, and that time plus the slack.
- *
- * By setting the slack to -1, a percentage of the delay is used
- * instead.
- */
-void set_timer_slack(struct timer_list *timer, int slack_hz)
-{
-	timer->slack = slack_hz;
-}
-EXPORT_SYMBOL_GPL(set_timer_slack);
 
 static inline unsigned int timer_get_idx(struct timer_list *timer)
 {
@@ -769,7 +751,6 @@ static void do_init_timer(struct timer_l
 {
 	timer->entry.pprev = NULL;
 	timer->flags = flags | raw_smp_processor_id();
-	timer->slack = -1;
 #ifdef CONFIG_TIMER_STATS
 	timer->start_site = NULL;
 	timer->start_pid = -1;
--- a/lib/random32.c
+++ b/lib/random32.c
@@ -233,7 +233,6 @@ static void __prandom_timer(unsigned lon
 
 static void __init __prandom_start_seed_timer(void)
 {
-	set_timer_slack(&seed_timer, HZ);
 	seed_timer.expires = jiffies + msecs_to_jiffies(40 * MSEC_PER_SEC);
 	add_timer(&seed_timer);
 }


* [patch V2 14/20] timer: Move __run_timers() function
  2016-06-17 13:26 [patch V2 00/20] timer: Refactor the timer wheel Thomas Gleixner
                   ` (12 preceding siblings ...)
  2016-06-17 13:26 ` [patch V2 13/20] timer: Remove slack leftovers Thomas Gleixner
@ 2016-06-17 13:26 ` Thomas Gleixner
  2016-06-17 13:26 ` [patch V2 15/20] timer: Optimize collect timers for NOHZ Thomas Gleixner
                   ` (9 subsequent siblings)
  23 siblings, 0 replies; 52+ messages in thread
From: Thomas Gleixner @ 2016-06-17 13:26 UTC (permalink / raw)
  To: LKML
  Cc: Ingo Molnar, Peter Zijlstra, Paul E. McKenney, Eric Dumazet,
	Frederic Weisbecker, Chris Mason, Arjan van de Ven, rt,
	Rik van Riel, Linus Torvalds, George Spelvin, Len Brown,
	Anna-Maria Gleixner

[-- Attachment #1: timer_Move___run_timers_function.patch --]
[-- Type: text/plain, Size: 2315 bytes --]

From: Anna-Maria Gleixner <anna-maria@linutronix.de>

Move __run_timers() below __next_timer_interrupt() and next_pending_bucket()
in preparation for __run_timers() NOHZ optimization.

No functional change.

Signed-off-by: Anna-Maria Gleixner <anna-maria@linutronix.de>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Chris Mason <clm@fb.com>
Cc: Eric Dumazet <edumazet@google.com>
Cc: rt@linutronix.de
Cc: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>
Cc: Arjan van de Ven <arjan@infradead.org>

---
 kernel/time/timer.c |   52 ++++++++++++++++++++++++++--------------------------
 1 file changed, 26 insertions(+), 26 deletions(-)

--- a/kernel/time/timer.c
+++ b/kernel/time/timer.c
@@ -1271,32 +1271,6 @@ static int collect_expired_timers(struct
 	return levels;
 }
 
-/**
- * __run_timers - run all expired timers (if any) on this CPU.
- * @base: the timer vector to be processed.
- */
-static inline void __run_timers(struct timer_base *base)
-{
-	struct hlist_head heads[LVL_DEPTH];
-	int levels;
-
-	if (!time_after_eq(jiffies, base->clk))
-		return;
-
-	spin_lock_irq(&base->lock);
-
-	while (time_after_eq(jiffies, base->clk)) {
-
-		levels = collect_expired_timers(base, heads);
-		base->clk += BASE_INCR;
-
-		while (levels--)
-			expire_timers(base, heads + levels);
-	}
-	base->running_timer = NULL;
-	spin_unlock_irq(&base->lock);
-}
-
 #ifdef CONFIG_NO_HZ_COMMON
 /*
  * Find the next pending bucket of a level. Search from @offset + @clk upwards
@@ -1472,6 +1446,32 @@ void update_process_times(int user_tick)
 	run_posix_cpu_timers(p);
 }
 
+/**
+ * __run_timers - run all expired timers (if any) on this CPU.
+ * @base: the timer vector to be processed.
+ */
+static inline void __run_timers(struct timer_base *base)
+{
+	struct hlist_head heads[LVL_DEPTH];
+	int levels;
+
+	if (!time_after_eq(jiffies, base->clk))
+		return;
+
+	spin_lock_irq(&base->lock);
+
+	while (time_after_eq(jiffies, base->clk)) {
+
+		levels = collect_expired_timers(base, heads);
+		base->clk += BASE_INCR;
+
+		while (levels--)
+			expire_timers(base, heads + levels);
+	}
+	base->running_timer = NULL;
+	spin_unlock_irq(&base->lock);
+}
+
 /*
  * This function runs timers and the timer-tq in bottom half context.
  */


* [patch V2 15/20] timer: Optimize collect timers for NOHZ
  2016-06-17 13:26 [patch V2 00/20] timer: Refactor the timer wheel Thomas Gleixner
                   ` (13 preceding siblings ...)
  2016-06-17 13:26 ` [patch V2 14/20] timer: Move __run_timers() function Thomas Gleixner
@ 2016-06-17 13:26 ` Thomas Gleixner
  2016-06-17 13:26 ` [patch V2 16/20] tick/sched: Remove pointless empty function Thomas Gleixner
                   ` (8 subsequent siblings)
  23 siblings, 0 replies; 52+ messages in thread
From: Thomas Gleixner @ 2016-06-17 13:26 UTC (permalink / raw)
  To: LKML
  Cc: Ingo Molnar, Peter Zijlstra, Paul E. McKenney, Eric Dumazet,
	Frederic Weisbecker, Chris Mason, Arjan van de Ven, rt,
	Rik van Riel, Linus Torvalds, George Spelvin, Len Brown,
	Anna-Maria Gleixner

[-- Attachment #1: timer_Optimize_collect_timers_for_NOHZ.patch --]
[-- Type: text/plain, Size: 4255 bytes --]

From: Anna-Maria Gleixner <anna-maria@linutronix.de>

After a NOHZ idle sleep the wheel must be forwarded to current jiffies. There
might be expired timers, so the current code loops and checks the expired
buckets for timers. This can take quite some time for long NOHZ idle periods.

The pending bitmask in the timer base allows us to do a quick search for the
next expiring timer and therefore a fast forward of the base time, which
prevents pointless long-lasting loops.

For a 3 second idle sleep this reduces the catchup time from ~1ms to 5us.

Signed-off-by: Anna-Maria Gleixner <anna-maria@linutronix.de>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Chris Mason <clm@fb.com>
Cc: Eric Dumazet <edumazet@google.com>
Cc: rt@linutronix.de
Cc: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>
Cc: Arjan van de Ven <arjan@infradead.org>

---
 kernel/time/timer.c |   52 ++++++++++++++++++++++++++++++++++++++++++++--------
 1 file changed, 44 insertions(+), 8 deletions(-)

--- a/kernel/time/timer.c
+++ b/kernel/time/timer.c
@@ -1246,8 +1246,8 @@ static void expire_timers(struct timer_b
 	}
 }
 
-static int collect_expired_timers(struct timer_base *base,
-				  struct hlist_head *heads)
+static int __collect_expired_timers(struct timer_base *base,
+				    struct hlist_head *heads)
 {
 	unsigned long clk = base->clk >> BASE_CLK_SHIFT;
 	struct hlist_head *vec;
@@ -1273,9 +1273,9 @@ static int collect_expired_timers(struct
 
 #ifdef CONFIG_NO_HZ_COMMON
 /*
- * Find the next pending bucket of a level. Search from @offset + @clk upwards
- * and if nothing there, search from start of the level (@offset) up to
- * @offset + clk.
+ * Find the next pending bucket of a level. Search from level start (@offset)
+ * + @clk upwards and if nothing there, search from start of the level
+ * (@offset) up to @offset + clk.
  */
 static int next_pending_bucket(struct timer_base *base, unsigned offset,
 			       unsigned clk)
@@ -1292,7 +1292,8 @@ static int next_pending_bucket(struct ti
 }
 
 /*
- * Search the first expiring timer in the various clock levels.
+ * Search the first expiring timer in the various clock levels. Caller must
+ * hold base->lock.
  *
  * Note: This implementation might be suboptimal vs. timers enqueued in the
  *	 cascade level because we do not look at the timers to figure out when
@@ -1305,7 +1306,6 @@ static unsigned long __next_timer_interr
 	unsigned long clk, next, adj;
 	unsigned lvl, offset = 0;
 
-	spin_lock(&base->lock);
 	next = BASE_RND_UP(base->clk + NEXT_TIMER_MAX_DELTA);
 	clk = base->clk >> BASE_CLK_SHIFT;
 	for (lvl = 0; lvl < LVL_DEPTH; lvl++, offset += LVL_SIZE) {
@@ -1358,7 +1358,6 @@ static unsigned long __next_timer_interr
 		clk >>= LVL_CLK_SHIFT;
 		clk += adj;
 	}
-	spin_unlock(&base->lock);
 	return next;
 }
 
@@ -1416,7 +1415,10 @@ u64 get_next_timer_interrupt(unsigned lo
 	if (cpu_is_offline(smp_processor_id()))
 		return expires;
 
+	spin_lock(&base->lock);
 	nextevt = __next_timer_interrupt(base);
+	spin_unlock(&base->lock);
+
 	if (time_before_eq(nextevt, basej))
 		expires = basem;
 	else
@@ -1424,6 +1426,40 @@ u64 get_next_timer_interrupt(unsigned lo
 
 	return cmp_next_hrtimer_event(basem, expires);
 }
+
+static int collect_expired_timers(struct timer_base *base,
+				  struct hlist_head *heads)
+{
+	/*
+	 * NOHZ optimization. After a long idle sleep we need to forward the
+	 * base to current jiffies. Avoid a loop by searching the bitfield for
+	 * the next expiring timer.
+	 */
+	if ((long)(jiffies - base->clk) > 2 * BASE_INCR) {
+		unsigned long next = __next_timer_interrupt(base);
+
+		/*
+		 * If the next timer is ahead of time forward to current
+		 * jiffies, otherwise forward to the next expiry time.
+		 */
+		if (time_after(next, jiffies)) {
+			/*
+			 * We need to round down here as the call site will
+			 * increment clock once more.
+			 */
+			base->clk = BASE_RND_DN(jiffies);
+			return 0;
+		}
+		base->clk = next;
+	}
+	return __collect_expired_timers(base, heads);
+}
+#else
+static inline int collect_expired_timers(struct timer_base *base,
+					 struct hlist_head *heads)
+{
+	return __collect_expired_timers(base, heads);
+}
 #endif
 
 /*


* [patch V2 16/20] tick/sched: Remove pointless empty function
  2016-06-17 13:26 [patch V2 00/20] timer: Refactor the timer wheel Thomas Gleixner
                   ` (14 preceding siblings ...)
  2016-06-17 13:26 ` [patch V2 15/20] timer: Optimize collect timers for NOHZ Thomas Gleixner
@ 2016-06-17 13:26 ` Thomas Gleixner
  2016-06-17 13:26 ` [patch V2 17/20] timer: Forward wheel clock whenever possible Thomas Gleixner
                   ` (7 subsequent siblings)
  23 siblings, 0 replies; 52+ messages in thread
From: Thomas Gleixner @ 2016-06-17 13:26 UTC (permalink / raw)
  To: LKML
  Cc: Ingo Molnar, Peter Zijlstra, Paul E. McKenney, Eric Dumazet,
	Frederic Weisbecker, Chris Mason, Arjan van de Ven, rt,
	Rik van Riel, Linus Torvalds, George Spelvin, Len Brown

[-- Attachment #1: ticksched_Remove_pointless_empty_function.patch --]
[-- Type: text/plain, Size: 2021 bytes --]

This was a failed attempt to optimize the timer expiry in idle, which was
disabled and never revisited. Remove the cruft.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Chris Mason <clm@fb.com>
Cc: Eric Dumazet <edumazet@google.com>
Cc: rt@linutronix.de
Cc: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>
Cc: Arjan van de Ven <arjan@infradead.org>

---
 kernel/time/tick-sched.c |   33 +--------------------------------
 1 file changed, 1 insertion(+), 32 deletions(-)

--- a/kernel/time/tick-sched.c
+++ b/kernel/time/tick-sched.c
@@ -1092,35 +1092,6 @@ static void tick_nohz_switch_to_nohz(voi
 	tick_nohz_activate(ts, NOHZ_MODE_LOWRES);
 }
 
-/*
- * When NOHZ is enabled and the tick is stopped, we need to kick the
- * tick timer from irq_enter() so that the jiffies update is kept
- * alive during long running softirqs. That's ugly as hell, but
- * correctness is key even if we need to fix the offending softirq in
- * the first place.
- *
- * Note, this is different to tick_nohz_restart. We just kick the
- * timer and do not touch the other magic bits which need to be done
- * when idle is left.
- */
-static void tick_nohz_kick_tick(struct tick_sched *ts, ktime_t now)
-{
-#if 0
-	/* Switch back to 2.6.27 behaviour */
-	ktime_t delta;
-
-	/*
-	 * Do not touch the tick device, when the next expiry is either
-	 * already reached or less/equal than the tick period.
-	 */
-	delta =	ktime_sub(hrtimer_get_expires(&ts->sched_timer), now);
-	if (delta.tv64 <= tick_period.tv64)
-		return;
-
-	tick_nohz_restart(ts, now);
-#endif
-}
-
 static inline void tick_nohz_irq_enter(void)
 {
 	struct tick_sched *ts = this_cpu_ptr(&tick_cpu_sched);
@@ -1131,10 +1102,8 @@ static inline void tick_nohz_irq_enter(v
 	now = ktime_get();
 	if (ts->idle_active)
 		tick_nohz_stop_idle(ts, now);
-	if (ts->tick_stopped) {
+	if (ts->tick_stopped)
 		tick_nohz_update_jiffies(now);
-		tick_nohz_kick_tick(ts, now);
-	}
 }
 
 #else


* [patch V2 17/20] timer: Forward wheel clock whenever possible
  2016-06-17 13:26 [patch V2 00/20] timer: Refactor the timer wheel Thomas Gleixner
                   ` (15 preceding siblings ...)
  2016-06-17 13:26 ` [patch V2 16/20] tick/sched: Remove pointless empty function Thomas Gleixner
@ 2016-06-17 13:26 ` Thomas Gleixner
  2016-06-17 13:26 ` [patch V2 18/20] timer: Only wake softirq if necessary Thomas Gleixner
                   ` (6 subsequent siblings)
  23 siblings, 0 replies; 52+ messages in thread
From: Thomas Gleixner @ 2016-06-17 13:26 UTC (permalink / raw)
  To: LKML
  Cc: Ingo Molnar, Peter Zijlstra, Paul E. McKenney, Eric Dumazet,
	Frederic Weisbecker, Chris Mason, Arjan van de Ven, rt,
	Rik van Riel, Linus Torvalds, George Spelvin, Len Brown

[-- Attachment #1: timer_Forward_wheel_clock_whenever_possible.patch --]
[-- Type: text/plain, Size: 7536 bytes --]

The wheel clock is stale when a CPU goes into a long idle sleep. This has the
side effect that timers which are queued afterwards end up in the outer wheel
levels, which results in coarser granularity.

To solve this, we keep track of the idle state and forward the wheel clock
whenever possible.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Chris Mason <clm@fb.com>
Cc: Eric Dumazet <edumazet@google.com>
Cc: rt@linutronix.de
Cc: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>
Cc: Arjan van de Ven <arjan@infradead.org>

---
 kernel/time/tick-internal.h |    1 
 kernel/time/tick-sched.c    |   13 ++++
 kernel/time/timer.c         |  123 +++++++++++++++++++++++++++++++++++---------
 3 files changed, 114 insertions(+), 23 deletions(-)

--- a/kernel/time/tick-internal.h
+++ b/kernel/time/tick-internal.h
@@ -164,3 +164,4 @@ static inline void timers_update_migrati
 DECLARE_PER_CPU(struct hrtimer_cpu_base, hrtimer_bases);
 
 extern u64 get_next_timer_interrupt(unsigned long basej, u64 basem);
+void timer_clear_idle(void);
--- a/kernel/time/tick-sched.c
+++ b/kernel/time/tick-sched.c
@@ -700,6 +700,12 @@ static ktime_t tick_nohz_stop_sched_tick
 	delta = next_tick - basemono;
 	if (delta <= (u64)TICK_NSEC) {
 		tick.tv64 = 0;
+
+		/*
+		 * Tell the timer code that the base is not idle, i.e. undo
+		 * the effect of get_next_timer_interrupt().
+		 */
+		timer_clear_idle();
 		/*
 		 * We've not stopped the tick yet, and there's a timer in the
 		 * next period, so no point in stopping it either, bail.
@@ -809,6 +815,12 @@ static void tick_nohz_restart_sched_tick
 	tick_do_update_jiffies64(now);
 	cpu_load_update_nohz_stop();
 
+	/*
+	 * Clear the timer idle flag, so we avoid IPIs on remote queueing and
+	 * the clock forward checks in the enqueue path.
+	 */
+	timer_clear_idle();
+
 	calc_load_exit_idle();
 	touch_softlockup_watchdog_sched();
 	/*
@@ -1025,6 +1037,7 @@ void tick_nohz_idle_exit(void)
 		tick_nohz_stop_idle(ts, now);
 
 	if (ts->tick_stopped) {
+		timer_clear_idle();
 		tick_nohz_restart_sched_tick(ts, now);
 		tick_nohz_account_idle_ticks(ts);
 	}
--- a/kernel/time/timer.c
+++ b/kernel/time/timer.c
@@ -192,9 +192,11 @@ struct timer_base {
 	spinlock_t		lock;
 	struct timer_list	*running_timer;
 	unsigned long		clk;
+	unsigned long		next_expiry;
 	unsigned int		cpu;
 	bool			migration_enabled;
 	bool			nohz_active;
+	bool			is_idle;
 	DECLARE_BITMAP(pending_map, WHEEL_SIZE);
 	struct hlist_head	vectors[WHEEL_SIZE];
 } ____cacheline_aligned;
@@ -514,23 +516,27 @@ static void internal_add_timer(struct ti
 	__internal_add_timer(base, timer);
 
 	/*
-	 * Check whether the other CPU is in dynticks mode and needs
-	 * to be triggered to reevaluate the timer wheel.  We are
-	 * protected against the other CPU fiddling with the timer by
-	 * holding the timer base lock. This also makes sure that a
-	 * CPU on the way to stop its tick can not evaluate the timer
-	 * wheel.
-	 *
-	 * Spare the IPI for deferrable timers on idle targets though.
-	 * The next busy ticks will take care of it. Except full dynticks
-	 * require special care against races with idle_cpu(), lets deal
-	 * with that later.
-	 */
-	if (IS_ENABLED(CONFIG_NO_HZ_COMMON) && base->nohz_active) {
-		if (!(timer->flags & TIMER_DEFERRABLE) ||
-		    tick_nohz_full_cpu(base->cpu))
-			wake_up_nohz_cpu(base->cpu);
-	}
+	 * We might have to IPI the remote CPU if the base is idle and the
+	 * timer is not deferrable. If the other cpu is on the way to idle
+	 * then it can't set base->is_idle as we hold base lock.
+	 */
+	if (!IS_ENABLED(CONFIG_NO_HZ_COMMON) || !base->is_idle ||
+	    (timer->flags & TIMER_DEFERRABLE))
+		return;
+
+	/* Check whether this is the new first expiring timer */
+	if (time_after_eq(timer->expires, base->next_expiry))
+		return;
+	base->next_expiry = BASE_RND_UP(timer->expires);
+
+	/*
+	 * Check whether the other CPU is in dynticks mode and needs to be
+	 * triggered to reevaluate the timer wheel.  We are protected against
+	 * the other CPU fiddling with the timer by holding the timer base
+	 * lock.
+	 */
+	if (tick_nohz_full_cpu(base->cpu))
+		wake_up_nohz_cpu(base->cpu);
 }
 
 #ifdef CONFIG_TIMER_STATS
@@ -838,10 +844,11 @@ static inline struct timer_base *get_tim
 	return get_timer_cpu_base(tflags, tflags & TIMER_CPUMASK);
 }
 
-static inline struct timer_base *get_target_base(struct timer_base *base,
-						 unsigned tflags)
+#ifdef CONFIG_NO_HZ_COMMON
+static inline struct timer_base *__get_target_base(struct timer_base *base,
+						   unsigned tflags)
 {
-#if defined(CONFIG_NO_HZ_COMMON) && defined(CONFIG_SMP)
+#ifdef CONFIG_SMP
 	if ((tflags & TIMER_PINNED) || !base->migration_enabled)
 		return get_timer_this_cpu_base(tflags);
 	return get_timer_cpu_base(tflags, get_nohz_timer_target());
@@ -850,6 +857,43 @@ static inline struct timer_base *get_tar
 #endif
 }
 
+static inline void forward_timer_base(struct timer_base *base)
+{
+	/*
+	 * We only forward the base when it's idle and we have a delta between
+	 * base clock and jiffies.
+	 */
+	if (!base->is_idle || (long) (jiffies - base->clk) < 2 * BASE_INCR)
+		return;
+
+	/*
+	 * If the next expiry value is > jiffies, then we fast forward to
+	 * jiffies otherwise we forward to the next expiry value.
+	 */
+	if (time_after(base->next_expiry, jiffies))
+		base->clk = BASE_RND_UP(jiffies);
+	else
+		base->clk = base->next_expiry;
+}
+#else
+static inline struct timer_base *__get_target_base(struct timer_base *base,
+						   unsigned tflags)
+{
+	return get_timer_this_cpu_base(tflags);
+}
+
+static inline void forward_timer_base(struct timer_base *base) { }
+#endif
+
+static inline struct timer_base *get_target_base(struct timer_base *base,
+						 unsigned tflags)
+{
+	struct timer_base *target = __get_target_base(base, tflags);
+
+	forward_timer_base(target);
+	return target;
+}
+
 /*
  * We are using hashed locking: Holding per_cpu(timer_bases[x]).lock means
  * that all timers which are tied to this base are locked, and the base itself
@@ -1417,16 +1461,49 @@ u64 get_next_timer_interrupt(unsigned lo
 
 	spin_lock(&base->lock);
 	nextevt = __next_timer_interrupt(base);
-	spin_unlock(&base->lock);
+	base->next_expiry = nextevt;
+	/*
+	 * We have a fresh next event. Check whether we can forward the base.
+	 */
+	if (time_after(nextevt, jiffies))
+		base->clk = BASE_RND_UP(jiffies);
+	else if (time_after(nextevt, base->clk))
+		base->clk = nextevt;
 
-	if (time_before_eq(nextevt, basej))
+	if (time_before_eq(nextevt, basej)) {
 		expires = basem;
-	else
+		base->is_idle = false;
+	} else {
 		expires = basem + (nextevt - basej) * TICK_NSEC;
+		/*
+		 * If we expect to sleep more than a tick, mark the base idle.
+		 */
+		if ((expires - basem) > TICK_NSEC)
+			base->is_idle = true;
+	}
+	spin_unlock(&base->lock);
 
 	return cmp_next_hrtimer_event(basem, expires);
 }
 
+/**
+ * timer_clear_idle - Clear the idle state of the timer base
+ *
+ * Called with interrupts disabled
+ */
+void timer_clear_idle(void)
+{
+	struct timer_base *base = this_cpu_ptr(&timer_bases[BASE_STD]);
+
+	/*
+	 * We do this unlocked. The worst outcome is a remote enqueue sending
+	 * a pointless IPI, but taking the lock would just make the window for
+	 * sending the IPI a few instructions smaller for the cost of taking
+	 * the lock in the exit from idle path.
+	 */
+	base->is_idle = false;
+}
+
 static int collect_expired_timers(struct timer_base *base,
 				  struct hlist_head *heads)
 {


* [patch V2 18/20] timer: Only wake softirq if necessary
  2016-06-17 13:26 [patch V2 00/20] timer: Refactor the timer wheel Thomas Gleixner
                   ` (16 preceding siblings ...)
  2016-06-17 13:26 ` [patch V2 17/20] timer: Forward wheel clock whenever possible Thomas Gleixner
@ 2016-06-17 13:26 ` Thomas Gleixner
  2016-06-17 13:26 ` [patch V2 19/20] timer: Split out index calculation Thomas Gleixner
                   ` (5 subsequent siblings)
  23 siblings, 0 replies; 52+ messages in thread
From: Thomas Gleixner @ 2016-06-17 13:26 UTC (permalink / raw)
  To: LKML
  Cc: Ingo Molnar, Peter Zijlstra, Paul E. McKenney, Eric Dumazet,
	Frederic Weisbecker, Chris Mason, Arjan van de Ven, rt,
	Rik van Riel, Linus Torvalds, George Spelvin, Len Brown

[-- Attachment #1: timer--Only-wake-softirq-if-necessary.patch --]
[-- Type: text/plain, Size: 802 bytes --]

With the wheel forwarding in place and with the HZ=1000 4ms folding, we can
avoid running the softirq at all when no timer is due.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
---
 kernel/time/timer.c |   11 +++++++++++
 1 file changed, 11 insertions(+)

--- a/kernel/time/timer.c
+++ b/kernel/time/timer.c
@@ -1602,7 +1602,18 @@ static void run_timer_softirq(struct sof
  */
 void run_local_timers(void)
 {
+	struct timer_base *base = this_cpu_ptr(&timer_bases[BASE_STD]);
+
 	hrtimer_run_queues();
+	/* Raise the softirq only if required. */
+	if (time_before(jiffies, base->clk)) {
+		if (!IS_ENABLED(CONFIG_NO_HZ_COMMON) || !base->nohz_active)
+			return;
+		/* CPU is awake, so check the deferrable base. */
+		base++;
+		if (time_before(jiffies, base->clk))
+			return;
+	}
 	raise_softirq(TIMER_SOFTIRQ);
 }
 


* [patch V2 19/20] timer: Split out index calculation
  2016-06-17 13:26 [patch V2 00/20] timer: Refactor the timer wheel Thomas Gleixner
                   ` (17 preceding siblings ...)
  2016-06-17 13:26 ` [patch V2 18/20] timer: Only wake softirq if necessary Thomas Gleixner
@ 2016-06-17 13:26 ` Thomas Gleixner
  2016-06-17 13:26 ` [patch V2 20/20] timer: Optimization for same expiry time in mod_timer() Thomas Gleixner
                   ` (4 subsequent siblings)
  23 siblings, 0 replies; 52+ messages in thread
From: Thomas Gleixner @ 2016-06-17 13:26 UTC (permalink / raw)
  To: LKML
  Cc: Ingo Molnar, Peter Zijlstra, Paul E. McKenney, Eric Dumazet,
	Frederic Weisbecker, Chris Mason, Arjan van de Ven, rt,
	Rik van Riel, Linus Torvalds, George Spelvin, Len Brown,
	Anna-Maria Gleixner

[-- Attachment #1: timer_Split_out_index_calculation.patch --]
[-- Type: text/plain, Size: 3089 bytes --]

From: Anna-Maria Gleixner <anna-maria@linutronix.de>

For further optimizations we need to separate the index calculation from the
queueing. No functional change.

Signed-off-by: Anna-Maria Gleixner <anna-maria@linutronix.de>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Chris Mason <clm@fb.com>
Cc: Eric Dumazet <edumazet@google.com>
Cc: rt@linutronix.de
Cc: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>
Cc: Arjan van de Ven <arjan@infradead.org>

---
 kernel/time/timer.c |   45 +++++++++++++++++++++++++++++++--------------
 1 file changed, 31 insertions(+), 14 deletions(-)

--- a/kernel/time/timer.c
+++ b/kernel/time/timer.c
@@ -467,12 +467,9 @@ static inline unsigned calc_index(unsign
 	return LVL_OFFS(lvl) + (expires & LVL_MASK);
 }
 
-static void
-__internal_add_timer(struct timer_base *base, struct timer_list *timer)
+static int calc_wheel_index(unsigned long expires, unsigned long clk)
 {
-	unsigned long expires = timer->expires;
-	unsigned long delta = expires - base->clk;
-	struct hlist_head *vec;
+	unsigned long delta = expires - clk;
 	unsigned int idx;
 
 	if (delta < LVL_START(1)) {
@@ -490,7 +487,7 @@ static void
 	} else if (delta < LVL_START(7)) {
 		idx = calc_index(expires, 6);
 	} else if ((long) delta < 0) {
-		idx = (base->clk >> BASE_CLK_SHIFT) & LVL_MASK;
+		idx = (clk >> BASE_CLK_SHIFT) & LVL_MASK;
 	} else {
 		/*
 		 * Force expire obscene large timeouts at the capacity limit
@@ -501,20 +498,33 @@ static void
 
 		idx = calc_index(expires, 7);
 	}
-	/*
-	 * Enqueue the timer into the array bucket, mark it pending in
-	 * the bitmap and store the index in the timer flags.
-	 */
-	vec = base->vectors + idx;
-	hlist_add_head(&timer->entry, vec);
+	return idx;
+}
+
+/*
+ * Enqueue the timer into the hash bucket, mark it pending in
+ * the bitmap and store the index in the timer flags.
+ */
+static void enqueue_timer(struct timer_base *base, struct timer_list *timer,
+			  unsigned int idx)
+{
+	hlist_add_head(&timer->entry, base->vectors + idx);
 	__set_bit(idx, base->pending_map);
 	timer_set_idx(timer, idx);
 }
 
-static void internal_add_timer(struct timer_base *base, struct timer_list *timer)
+static void
+__internal_add_timer(struct timer_base *base, struct timer_list *timer)
 {
-	__internal_add_timer(base, timer);
+	unsigned int idx;
+
+	idx = calc_wheel_index(timer->expires, base->clk);
+	enqueue_timer(base, timer, idx);
+}
 
+static void
+trigger_dyntick_cpu(struct timer_base *base, struct timer_list *timer)
+{
 	/*
 	 * We might have to IPI the remote CPU if the base is idle and the
 	 * timer is not deferrable. If the other cpu is on the way to idle
@@ -539,6 +549,13 @@ static void internal_add_timer(struct ti
 		wake_up_nohz_cpu(base->cpu);
 }
 
+static void
+internal_add_timer(struct timer_base *base, struct timer_list *timer)
+{
+	__internal_add_timer(base, timer);
+	trigger_dyntick_cpu(base, timer);
+}
+
 #ifdef CONFIG_TIMER_STATS
 void __timer_stats_timer_set_start_info(struct timer_list *timer, void *addr)
 {


* [patch V2 20/20] timer: Optimization for same expiry time in mod_timer()
  2016-06-17 13:26 [patch V2 00/20] timer: Refactor the timer wheel Thomas Gleixner
                   ` (18 preceding siblings ...)
  2016-06-17 13:26 ` [patch V2 19/20] timer: Split out index calculation Thomas Gleixner
@ 2016-06-17 13:26 ` Thomas Gleixner
  2016-06-17 13:48 ` [patch V2 00/20] timer: Refactor the timer wheel Eric Dumazet
                   ` (3 subsequent siblings)
  23 siblings, 0 replies; 52+ messages in thread
From: Thomas Gleixner @ 2016-06-17 13:26 UTC (permalink / raw)
  To: LKML
  Cc: Ingo Molnar, Peter Zijlstra, Paul E. McKenney, Eric Dumazet,
	Frederic Weisbecker, Chris Mason, Arjan van de Ven, rt,
	Rik van Riel, Linus Torvalds, George Spelvin, Len Brown,
	Anna-Maria Gleixner

[-- Attachment #1: timer_Optimization_for_same_expiry_time_in_mod_timer.patch --]
[-- Type: text/plain, Size: 4535 bytes --]

From: Anna-Maria Gleixner <anna-maria@linutronix.de>

The existing optimization for same expiry time in mod_timer() checks whether
the timer expiry time is the same as the new requested expiry time. In the old
timer wheel implementation this did not take the slack batching into account,
nor does the new implementation evaluate whether the new expiry time will
requeue the timer to the same bucket.

To optimize that, we can calculate the resulting bucket and check whether it
differs from the timer's current bucket. This calculation happens outside the
region where the base lock is held. If the resulting bucket is the same, we
can avoid taking the base lock and requeueing the timer.

If the timer needs to be requeued then we have to check under the base lock
whether the base time has changed between the lockless calculation and taking
the lock. If it has changed we need to recalculate under the lock.

This optimization takes effect for timers which are enqueued into the less
granular wheel levels (1 and above). With a simple test case the functionality
has been verified:

    	    Before	After
Match:	     5.5%	86.6%
Requeue:    94.5%	13.4%
Recalc:  		<0.01%

In the non optimized case the timer is requeued in 94.5% of the cases. With
the index optimization in place the requeue rate drops to 13.4%. The case
where the lockless index calculation has to be redone is less than 0.01%.

With a real world test case (networking) we observed the following changes:

    	    Before	After
Match:	    97.8%	99.7%
Requeue:     2.2%	 0.3%
Recalc:  		<0.001%

That means two percent less lock/requeue/unlock operations in one of the hot
path use cases of timers.


Signed-off-by: Anna-Maria Gleixner <anna-maria@linutronix.de>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Chris Mason <clm@fb.com>
Cc: Eric Dumazet <edumazet@google.com>
Cc: rt@linutronix.de
Cc: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>
Cc: Arjan van de Ven <arjan@infradead.org>

---
 kernel/time/timer.c |   51 +++++++++++++++++++++++++++++++++++----------------
 1 file changed, 35 insertions(+), 16 deletions(-)

--- a/kernel/time/timer.c
+++ b/kernel/time/timer.c
@@ -945,28 +945,36 @@ static inline int
 __mod_timer(struct timer_list *timer, unsigned long expires, bool pending_only)
 {
 	struct timer_base *base, *new_base;
-	unsigned long flags;
+	unsigned int idx = UINT_MAX;
+	unsigned long clk = 0, flags;
 	int ret = 0;
 
 	/*
-	 * TODO: Calculate the array bucket of the timer right here w/o
-	 * holding the base lock. This allows to check not only
-	 * timer->expires == expires below, but also whether the timer
-	 * ends up in the same bucket. If we really need to requeue
-	 * the timer then we check whether base->clk have
-	 * advanced between here and locking the timer base. If
-	 * jiffies advanced we have to recalc the array bucket with the
-	 * lock held.
-	 */
-
-	/*
-	 * This is a common optimization triggered by the
-	 * networking code - if the timer is re-modified
-	 * to be the same thing then just return:
+	 * This is a common optimization triggered by the networking code - if
+	 * the timer is re-modified to be the same thing or ends up in the
+	 * same array bucket then just return:
 	 */
 	if (timer_pending(timer)) {
 		if (timer->expires == expires)
 			return 1;
+		/*
+		 * Take the current timer_jiffies of base, but without holding
+		 * the lock!
+		 */
+		base = get_timer_base(timer->flags);
+		clk = base->clk;
+
+		idx = calc_wheel_index(expires, clk);
+
+		/*
+		 * Retrieve and compare the array index of the pending
+		 * timer. If it matches set the expiry to the new value so a
+		 * subsequent call will exit in the expires check above.
+		 */
+		if (idx == timer_get_idx(timer)) {
+			timer->expires = expires;
+			return 1;
+		}
 	}
 
 	timer_stats_timer_set_start_info(timer);
@@ -1003,7 +1011,18 @@ static inline int
 	}
 
 	timer->expires = expires;
-	internal_add_timer(base, timer);
+	/*
+	 * If idx was calculated above and the base time did not advance
+	 * between calculating idx and taking the lock, only enqueue_timer()
+	 * and trigger_dyntick_cpu() is required. Otherwise we need to
+	 * (re)calculate the wheel index via internal_add_timer().
+	 */
+	if (idx != UINT_MAX && clk == base->clk) {
+		enqueue_timer(base, timer, idx);
+		trigger_dyntick_cpu(base, timer);
+	} else {
+		internal_add_timer(base, timer);
+	}
 
 out_unlock:
 	spin_unlock_irqrestore(&base->lock, flags);


* Re: [patch V2 00/20] timer: Refactor the timer wheel
  2016-06-17 13:26 [patch V2 00/20] timer: Refactor the timer wheel Thomas Gleixner
                   ` (19 preceding siblings ...)
  2016-06-17 13:26 ` [patch V2 20/20] timer: Optimization for same expiry time in mod_timer() Thomas Gleixner
@ 2016-06-17 13:48 ` Eric Dumazet
  2016-06-17 13:57   ` Thomas Gleixner
  2016-06-17 14:26 ` Arjan van de Ven
                   ` (2 subsequent siblings)
  23 siblings, 1 reply; 52+ messages in thread
From: Eric Dumazet @ 2016-06-17 13:48 UTC (permalink / raw)
  To: Thomas Gleixner
  Cc: LKML, Ingo Molnar, Peter Zijlstra, Paul E. McKenney,
	Frederic Weisbecker, Chris Mason, Arjan van de Ven, rt,
	Rik van Riel, Linus Torvalds, George Spelvin, Len Brown

>
>    To achieve this capacity with HZ=1000 without increasing the storage size
>    by another level, we reduced the granularity of the first wheel level from
>    1ms to 4ms. According to our data, there is no user which relies on that
>    1ms granularity and 99% of those timers are canceled before expiry.
>

Ah... This might be a problem for people using small TCP RTO timers in
datacenters (order of 5 ms)
(and small delay ack timers as well, in the order of 4 ms)

TCP/pacing uses high resolution timer in sch_fq.c so no problem there.

If we arm a timer for 5 ms, what are the exact consequences ?

I fear we might trigger lot more of spurious retransmits.

Or maybe I should read the patch series. I'll take some time today.

Thanks !


* Re: [patch V2 00/20] timer: Refactor the timer wheel
  2016-06-17 13:48 ` [patch V2 00/20] timer: Refactor the timer wheel Eric Dumazet
@ 2016-06-17 13:57   ` Thomas Gleixner
  2016-06-17 14:25     ` Eric Dumazet
  0 siblings, 1 reply; 52+ messages in thread
From: Thomas Gleixner @ 2016-06-17 13:57 UTC (permalink / raw)
  To: Eric Dumazet
  Cc: LKML, Ingo Molnar, Peter Zijlstra, Paul E. McKenney,
	Frederic Weisbecker, Chris Mason, Arjan van de Ven, rt,
	Rik van Riel, Linus Torvalds, George Spelvin, Len Brown

On Fri, 17 Jun 2016, Eric Dumazet wrote:
> >
> >    To achieve this capacity with HZ=1000 without increasing the storage size
> >    by another level, we reduced the granularity of the first wheel level from
> >    1ms to 4ms. According to our data, there is no user which relies on that
> >    1ms granularity and 99% of those timers are canceled before expiry.
> >
> 
> Ah... This might be a problem for people using small TCP RTO timers in
> datacenters (order of 5 ms)
> (and small delay ack timers as well, in the order of 4 ms)
> 
> TCP/pacing uses high resolution timer in sch_fq.c so no problem there.
> 
> If we arm a timer for 5 ms, what are the exact consequences ?

The worst case expiry time is 8ms on HZ=1000 as it is on HZ=250

> I fear we might trigger lot more of spurious retransmits.
> 
> Or maybe I should read the patch series. I'll take some time today.

Maybe just throw it at such a workload and see what happens :)

Thanks,

	tglx


* Re: [patch V2 00/20] timer: Refactor the timer wheel
  2016-06-17 13:57   ` Thomas Gleixner
@ 2016-06-17 14:25     ` Eric Dumazet
  2016-06-20 13:56       ` Thomas Gleixner
  0 siblings, 1 reply; 52+ messages in thread
From: Eric Dumazet @ 2016-06-17 14:25 UTC (permalink / raw)
  To: Thomas Gleixner
  Cc: LKML, Ingo Molnar, Peter Zijlstra, Paul E. McKenney,
	Frederic Weisbecker, Chris Mason, Arjan van de Ven, rt,
	Rik van Riel, Linus Torvalds, George Spelvin, Len Brown

On Fri, Jun 17, 2016 at 6:57 AM, Thomas Gleixner <tglx@linutronix.de> wrote:
> On Fri, 17 Jun 2016, Eric Dumazet wrote:
>> >
>> >    To achieve this capacity with HZ=1000 without increasing the storage size
>> >    by another level, we reduced the granularity of the first wheel level from
>> >    1ms to 4ms. According to our data, there is no user which relies on that
>> >    1ms granularity and 99% of those timers are canceled before expiry.
>> >
>>
>> Ah... This might be a problem for people using small TCP RTO timers in
>> datacenters (order of 5 ms)
>> (and small delay ack timers as well, in the order of 4 ms)
>>
>> TCP/pacing uses high resolution timer in sch_fq.c so no problem there.
>>
>> If we arm a timer for 5 ms, what are the exact consequences ?
>
> The worst case expiry time is 8ms on HZ=1000 as it is on HZ=250
>
>> I fear we might trigger lot more of spurious retransmits.
>>
>> Or maybe I should read the patch series. I'll take some time today.
>
> Maybe just throw it at such a workload and see what happens :)

Well, when network congestion happens in a cluster and hundreds of
millions of RTO timers fire,
adding fuel to the fire, it is a nightmare already ;)

To avoid increasing probability of such events we would need to have
at least 4 ms difference between the RTO timer and delack timer.

Meaning we have to increase both of them and increase P99 latencies of
RPC workloads.

Maybe a switch to hrtimer would be less risky.
But I do not know yet if it is doable without big performance penalty.

* Re: [patch V2 00/20] timer: Refactor the timer wheel
  2016-06-17 13:26 [patch V2 00/20] timer: Refactor the timer wheel Thomas Gleixner
                   ` (20 preceding siblings ...)
  2016-06-17 13:48 ` [patch V2 00/20] timer: Refactor the timer wheel Eric Dumazet
@ 2016-06-17 14:26 ` Arjan van de Ven
  2016-06-20 15:05 ` Paul E. McKenney
  2016-06-22  7:37 ` Mike Galbraith
  23 siblings, 0 replies; 52+ messages in thread
From: Arjan van de Ven @ 2016-06-17 14:26 UTC (permalink / raw)
  To: Thomas Gleixner
  Cc: LKML, Ingo Molnar, Peter Zijlstra, Paul E. McKenney,
	Eric Dumazet, Frederic Weisbecker, Chris Mason, Arjan van de Ven,
	rt, Rik van Riel, Linus Torvalds, George Spelvin, Len Brown

>    To achieve this capacity with HZ=1000 without increasing the storage size
>    by another level, we reduced the granularity of the first wheel level from
>    1ms to 4ms. According to our data, there is no user which relies on that
>    1ms granularity and 99% of those timers are canceled before expiry.


the only likely problem cases are msleep(1) uses... but we could just
map those to usleep_range(1000, 2000)

(imo we should anyway)

* Re: [patch V2 12/20] timer: Switch to a non cascading wheel
  2016-06-17 13:26 ` [patch V2 12/20] timer: Switch to a non cascading wheel Thomas Gleixner
@ 2016-06-18  9:55   ` George Spelvin
  2016-06-24 10:06     ` Thomas Gleixner
  0 siblings, 1 reply; 52+ messages in thread
From: George Spelvin @ 2016-06-18  9:55 UTC (permalink / raw)
  To: linux-kernel, tglx
  Cc: arjan, clm, edumazet, fweisbec, lenb, linux, mingo, paulmck,
	peterz, riel, rt, torvalds

I want to read this even more, but here's a dump of my comments so far...

> 1) Cascading is avoided (except for extreme long time timers)

> + * Note: This implementation might be suboptimal vs. timers enqueued in the
> + *	 cascade level because we do not look at the timers to figure out when
> + *	 they really expire. So for now, we just treat the cascading timers
> + *	 like any other timer. If each cascading bucket has a timer, we wake
> + *	 up with the granularity of the last level.

You've eliminated cascading entirely, so these comments are stale, no?

> +# define BASE_RND_DN(n)        ((n) & ~BASE_MASK)
> +# define BASE_RND_UP(n)        (BASE_RND_DN(n) + BASE_INCR)

Er... is this correct?  Usually I'd expect the result of rounding up
to occasionally be equal to the original (e.g. BASE_RND_UP(0) == 0), but
this doesn't have that property.

Given that you don't use BASE_RND_DN anywhere, maybe shrink this to one definition?


Looking at the __next_timer_interrupt function, it seems that it does
a lot more work than necessary.  Once a timeout has been found in the
current level, the range which must be searched in the following level
is limited to 1/LVL_CLK_DIV of the range in the current level.

That quickly tapers off to zero and the search can stop.

In particular, if a timeout is found at level 0 between the immediately
next bucket and the next bucket which is a multiple of LVL_CLK_DIV,
inclusive (1 <= x <= 8 buckets depending on the low bits of base->clk),
then the search can stop immediately.


This is hairy code and the following untested code is probably buggy,
but the basic idea is:

/*
 * Search span bits beginning at (offset + clk) for a set bit, wrapping
 * at the end of the level.  Return the position of the bit relative to
 * (offset + clk), or >= span if none.
 */
static unsigned next_pending_bucket(struct timer_base *base, unsigned offset,
	unsigned clk, unsigned span)
{
	unsigned pos;

	if (clk + span <= LVL_SIZE) {
		/* No wrap, simple search */
		clk += offset;
		pos = find_next_bit(base->pending_map, clk + span, clk);
		return pos - clk;
	} else {
		/* Search wraps */
		clk += offset;
		pos = find_next_bit(base->pending_map, offset + LVL_SIZE, clk);
		if (pos < offset + LVL_SIZE)
			return pos - clk;
		clk -= LVL_SIZE;
		pos = find_next_bit(base->pending_map, clk + span, offset);
		return pos - clk;
	}
}

/* Find the next expiring timer list >= base->clk */
static unsigned long __next_timer_interrupt(struct timer_base *base)
{
	unsigned long clk, end, next;
	unsigned lvl, offset, bit;

	/* Phase 1: Find the starting level */
	bit = find_first_bit(base->pending_map, WHEEL_SIZE);
	if (unlikely(bit >= WHEEL_SIZE)) {
		/* No pending timers */
		next = base->clk + NEXT_TIMER_MAX_DELTA;
		goto done;
	}
	lvl = (unsigned)bit / LVL_SIZE;
	clk = (base->clk + LVL_GRAN(lvl) - 1) >> LVL_SHIFT(lvl);
	offset = (bit | LVL_MASK) + 1;	/* End of the current level */

	/* Phase 2: Find the next-expiring list in this level */
	if ((clk & LVL_MASK) > (bit & LVL_MASK)) {
		unsigned b = offset - LVL_SIZE + (clk & LVL_MASK);

		b = find_next_bit(base->pending_map, offset, b);
		if (b < offset)
			bit = b;
	}
	end = clk + ((bit - clk) & LVL_MASK);	/* The next expiration time */
	next = end << LVL_SHIFT(lvl);

	/*
	 * At this point, clk is the current time, in units of the current
	 * level's granularity, and rounded up.  end is the time of the
	 * earliest expiration found so far, in the same units and rounded
	 * down.  next is the unrounded expiration time in jiffies.
	 *
	 * Phase 3: Search higher levels for expirations in [clk, end).
	 */
	while (++lvl < LVL_DEPTH) {
		unsigned b;

		clk = (clk + LVL_CLK_MASK) >> LVL_CLK_SHIFT;
		end >>= LVL_CLK_SHIFT;
		if (clk >= end)
			break;
		b = next_pending_bucket(base, offset, clk & LVL_MASK, end-clk);
		if (b < end - clk) {
			end = clk + b;
			next = end << LVL_SHIFT(lvl);
		}
		offset += LVL_SIZE;
	}
done:
	spin_unlock(&base->lock);
	return next;
}

* Re: [patch V2 00/20] timer: Refactor the timer wheel
  2016-06-17 14:25     ` Eric Dumazet
@ 2016-06-20 13:56       ` Thomas Gleixner
  2016-06-20 14:46         ` Arjan van de Ven
  2016-06-20 19:03         ` Rik van Riel
  0 siblings, 2 replies; 52+ messages in thread
From: Thomas Gleixner @ 2016-06-20 13:56 UTC (permalink / raw)
  To: Eric Dumazet
  Cc: LKML, Ingo Molnar, Peter Zijlstra, Paul E. McKenney,
	Frederic Weisbecker, Chris Mason, Arjan van de Ven, rt,
	Rik van Riel, Linus Torvalds, George Spelvin, Len Brown

On Fri, 17 Jun 2016, Eric Dumazet wrote:
> To avoid increasing probability of such events we would need to have
> at least 4 ms difference between the RTO timer and delack timer.
> 
> Meaning we have to increase both of them and increase P99 latencies of
> RPC workloads.
> 
> Maybe a switch to hrtimer would be less risky.
> But I do not know yet if it is doable without big performance penalty.

That will be a big performance issue. So we have the following choices:

1) Increase the wheel size for HZ=1000. Doable, but utter waste of space and
   obviously more pointless work when collecting expired timers.

2) Cut off at 37hrs for HZ=1000. We could make this configurable as a 1000HZ
   option so datacenter folks can use this and people who don't care and want
   better batching for power can use the 4ms thingy.

3) Split the wheel granularities. That would leave the first wheel with tick
   granularity and the next 3 with 12.5% worst case and then for the further
   out timers we'd switch to 25%.

Thoughts?

Thanks,

	tglx

* Re: [patch V2 00/20] timer: Refactor the timer wheel
  2016-06-20 13:56       ` Thomas Gleixner
@ 2016-06-20 14:46         ` Arjan van de Ven
  2016-06-20 14:46           ` Thomas Gleixner
  2016-06-20 19:03         ` Rik van Riel
  1 sibling, 1 reply; 52+ messages in thread
From: Arjan van de Ven @ 2016-06-20 14:46 UTC (permalink / raw)
  To: Thomas Gleixner
  Cc: Eric Dumazet, LKML, Ingo Molnar, Peter Zijlstra,
	Paul E. McKenney, Frederic Weisbecker, Chris Mason,
	Arjan van de Ven, rt, Rik van Riel, Linus Torvalds,
	George Spelvin, Len Brown

On Mon, Jun 20, 2016 at 6:56 AM, Thomas Gleixner <tglx@linutronix.de> wrote:
>
> 2) Cut off at 37hrs for HZ=1000. We could make this configurable as a 1000HZ
>    option so datacenter folks can use this and people who don't care and want
>    better batching for power can use the 4ms thingy.


if there really is one user of such long timers... could we possibly
make that one robust against early fire of the timer?

eg rule is: if you set timers > 37 hours, you need to cope with early timer fire

* Re: [patch V2 00/20] timer: Refactor the timer wheel
  2016-06-20 14:46         ` Arjan van de Ven
@ 2016-06-20 14:46           ` Thomas Gleixner
  2016-06-20 14:49             ` Arjan van de Ven
  0 siblings, 1 reply; 52+ messages in thread
From: Thomas Gleixner @ 2016-06-20 14:46 UTC (permalink / raw)
  To: Arjan van de Ven
  Cc: Eric Dumazet, LKML, Ingo Molnar, Peter Zijlstra,
	Paul E. McKenney, Frederic Weisbecker, Chris Mason,
	Arjan van de Ven, rt, Rik van Riel, Linus Torvalds,
	George Spelvin, Len Brown

On Mon, 20 Jun 2016, Arjan van de Ven wrote:
> On Mon, Jun 20, 2016 at 6:56 AM, Thomas Gleixner <tglx@linutronix.de> wrote:
> >
> > 2) Cut off at 37hrs for HZ=1000. We could make this configurable as a 1000HZ
> >    option so datacenter folks can use this and people who don't care and want
> >    better batching for power can use the 4ms thingy.
> 
> 
> if there really is one user of such long timers... could we possibly
> make that one robust against early fire of the timer?
> 
> eg rule is: if you set timers > 37 hours, you need to cope with early timer fire

The only user I found is networking conntrack (5 days). Eric thought it's not a
big problem if it fires earlier.

Thanks,

	tglx

* Re: [patch V2 00/20] timer: Refactor the timer wheel
  2016-06-20 14:46           ` Thomas Gleixner
@ 2016-06-20 14:49             ` Arjan van de Ven
  0 siblings, 0 replies; 52+ messages in thread
From: Arjan van de Ven @ 2016-06-20 14:49 UTC (permalink / raw)
  To: Thomas Gleixner
  Cc: Eric Dumazet, LKML, Ingo Molnar, Peter Zijlstra,
	Paul E. McKenney, Frederic Weisbecker, Chris Mason,
	Arjan van de Ven, rt, Rik van Riel, Linus Torvalds,
	George Spelvin, Len Brown

so is there really an issue? sounds like KISS principle can apply

On Mon, Jun 20, 2016 at 7:46 AM, Thomas Gleixner <tglx@linutronix.de> wrote:
> On Mon, 20 Jun 2016, Arjan van de Ven wrote:
>> On Mon, Jun 20, 2016 at 6:56 AM, Thomas Gleixner <tglx@linutronix.de> wrote:
>> >
>> > 2) Cut off at 37hrs for HZ=1000. We could make this configurable as a 1000HZ
>> >    option so datacenter folks can use this and people who don't care and want
>> >    better batching for power can use the 4ms thingy.
>>
>>
>> if there really is one user of such long timers... could we possibly
>> make that one robust against early fire of the timer?
>>
>> eg rule is: if you set timers > 37 hours, you need to cope with early timer fire
>
> The only user I found is networking conntrack (5 days). Eric thought it's not a
> big problem if it fires earlier.
>
> Thanks,
>
>         tglx
>

* Re: [patch V2 00/20] timer: Refactor the timer wheel
  2016-06-17 13:26 [patch V2 00/20] timer: Refactor the timer wheel Thomas Gleixner
                   ` (21 preceding siblings ...)
  2016-06-17 14:26 ` Arjan van de Ven
@ 2016-06-20 15:05 ` Paul E. McKenney
  2016-06-20 15:13   ` Thomas Gleixner
  2016-06-22  7:37 ` Mike Galbraith
  23 siblings, 1 reply; 52+ messages in thread
From: Paul E. McKenney @ 2016-06-20 15:05 UTC (permalink / raw)
  To: Thomas Gleixner
  Cc: LKML, Ingo Molnar, Peter Zijlstra, Eric Dumazet,
	Frederic Weisbecker, Chris Mason, Arjan van de Ven, rt,
	Rik van Riel, Linus Torvalds, George Spelvin, Len Brown

On Fri, Jun 17, 2016 at 01:26:28PM -0000, Thomas Gleixner wrote:
> This is the second version of the timer wheel rework series. The first series
> can be found here:
> 
>    http://lkml.kernel.org/r/20160613070440.950649741@linutronix.de
> 
> The series is also available in git:
> 
>    git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip.git WIP.timers

Ran some longer rcutorture tests, and the scripting complained about
hangs.  This turned out to be due to the 12.5% uncertainty, so I fixed
this by switching the rcutorture stop-test timer to hrtimers.  Things are
now working as well as before, with the exception of SRCU, for which I
am getting lots of grace-period stall complaints.  This came as a bit
of a surprise.  Anyway, I will be reviewing SRCU for timing dependencies.

							Thanx, Paul

> Changes vs. V1:
> 
>  - Addressed the review comments of V1
> 
>      - Fixed the fallout in tty/metag (noticed by Arjan)
>      - Renamed the hlist helper (noticed by Paolo/George)
>      - Used the proper mask in get_timer_base() (noticed by Richard)
>      - Fixed the inverse state check in internal_add_timer() (noticed by Richard)
>      - Simplified the macro maze, removed wrapper (noticed by George)
>      - Reordered data retrieval in run_timer() (noticed by George)
> 
>  - Removed cascading completely
> 
>    We have a hard cutoff of expiry times at the capacity of the last wheel
>    level now. Timers which insist on timeouts longer than that, i.e. ~6days,
>    will expire at the cutoff, i.e. ~6 days. From our data gathering the
>    largest timeouts are 5 days (networking conntrack), which are well within the
>    capacity.
> 
>    To achieve this capacity with HZ=1000 without increasing the storage size
>    by another level, we reduced the granularity of the first wheel level from
>    1ms to 4ms. According to our data, there is no user which relies on that
>    1ms granularity and 99% of those timers are canceled before expiry.
> 
>    As a side effect there is the benefit of better batching in the first level
>    which helps networking to avoid rearming timers in the hotpath.
> 
> We gathered more data about performance and batching. Compared to mainline the
> following changes have been observed:
> 
>    - The bad outliers in mainline when the timer wheel needs to be forwarded
>      after a long idle sleep are completely gone.
> 
>    - The total cpu time used for timer softirq processing is significantly
>      reduced. Depending on the HZ setting and workload this ranges from factor
>      2 to 6.
> 
>    - The average invocation period of the timer softirq on an idle system
>      increases significantly. Depending on the HZ settings and workload this
>      ranges from factor 1.5 to 5. That means that the residency in deep
>      c-states should be improved. Have not yet had time to verify this with
>      the power tools.
> 
> Thanks,
> 
> 	tglx
> 
> ---
>  arch/x86/kernel/apic/x2apic_uv_x.c  |    4 
>  arch/x86/kernel/cpu/mcheck/mce.c    |    4 
>  block/genhd.c                       |    5 
>  drivers/cpufreq/powernv-cpufreq.c   |    5 
>  drivers/mmc/host/jz4740_mmc.c       |    2 
>  drivers/net/ethernet/tile/tilepro.c |    4 
>  drivers/power/bq27xxx_battery.c     |    5 
>  drivers/tty/metag_da.c              |    4 
>  drivers/tty/mips_ejtag_fdc.c        |    4 
>  drivers/usb/host/ohci-hcd.c         |    1 
>  drivers/usb/host/xhci.c             |    2 
>  include/linux/list.h                |   10 
>  include/linux/timer.h               |   30 
>  kernel/time/tick-internal.h         |    1 
>  kernel/time/tick-sched.c            |   46 -
>  kernel/time/timer.c                 | 1099 +++++++++++++++++++++---------------
>  lib/random32.c                      |    1 
>  net/ipv4/inet_connection_sock.c     |    7 
>  net/ipv4/inet_timewait_sock.c       |    5 
>  19 files changed, 725 insertions(+), 514 deletions(-)
> 
> 

* Re: [patch V2 00/20] timer: Refactor the timer wheel
  2016-06-20 15:05 ` Paul E. McKenney
@ 2016-06-20 15:13   ` Thomas Gleixner
  2016-06-20 15:41     ` Paul E. McKenney
  0 siblings, 1 reply; 52+ messages in thread
From: Thomas Gleixner @ 2016-06-20 15:13 UTC (permalink / raw)
  To: Paul E. McKenney
  Cc: LKML, Ingo Molnar, Peter Zijlstra, Eric Dumazet,
	Frederic Weisbecker, Chris Mason, Arjan van de Ven, rt,
	Rik van Riel, Linus Torvalds, George Spelvin, Len Brown

On Mon, 20 Jun 2016, Paul E. McKenney wrote:

> On Fri, Jun 17, 2016 at 01:26:28PM -0000, Thomas Gleixner wrote:
> > This is the second version of the timer wheel rework series. The first series
> > can be found here:
> > 
> >    http://lkml.kernel.org/r/20160613070440.950649741@linutronix.de
> > 
> > The series is also available in git:
> > 
> >    git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip.git WIP.timers
> 
> Ran some longer rcutorture tests, and the scripting complained about
> hangs.  This turned out to be due to the 12.5% uncertainty, so I fixed

Is that stuff so sensitive? I'm surprised, because the old slack stuff got you
6.25% already.

Thanks,

	tglx

* Re: [patch V2 00/20] timer: Refactor the timer wheel
  2016-06-20 15:13   ` Thomas Gleixner
@ 2016-06-20 15:41     ` Paul E. McKenney
  0 siblings, 0 replies; 52+ messages in thread
From: Paul E. McKenney @ 2016-06-20 15:41 UTC (permalink / raw)
  To: Thomas Gleixner
  Cc: LKML, Ingo Molnar, Peter Zijlstra, Eric Dumazet,
	Frederic Weisbecker, Chris Mason, Arjan van de Ven, rt,
	Rik van Riel, Linus Torvalds, George Spelvin, Len Brown

On Mon, Jun 20, 2016 at 05:13:41PM +0200, Thomas Gleixner wrote:
> On Mon, 20 Jun 2016, Paul E. McKenney wrote:
> 
> > On Fri, Jun 17, 2016 at 01:26:28PM -0000, Thomas Gleixner wrote:
> > > This is the second version of the timer wheel rework series. The first series
> > > can be found here:
> > > 
> > >    http://lkml.kernel.org/r/20160613070440.950649741@linutronix.de
> > > 
> > > The series is also available in git:
> > > 
> > >    git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip.git WIP.timers
> > 
> > Ran some longer rcutorture tests, and the scripting complained about
> > hangs.  This turned out to be due to the 12.5% uncertainty, so I fixed
> 
> Is that stuff so sensitive? I'm surprised, because the old slack stuff got you
> 6.25% already.

But didn't you have to ask for slack?

Anyway, rcutorture allows three minutes longer than the duration, and
then kills the test (unless it is actively dumping the ftrace buffer).
A 30-minute test does fine either way, but a 60-minute test gets killed
with high probability.  Changing to hrtimers makes things work nicely
(other than SRCU), even for 60-minute runs.  I have run ten-hour
rcutorture runs with normal completion with the old timers.

Might well be that this switch to hrtimer is needed in some situations
for the old setup.  Given that it happens only once per run, it clearly
has little or no performance downside, so I am queueing it regardless.
Well, I will do so once I take care of the arithmetic limitations that
are causing link-time errors on 32-bit systems.

							Thanx, Paul

* Re: [patch V2 00/20] timer: Refactor the timer wheel
  2016-06-20 13:56       ` Thomas Gleixner
  2016-06-20 14:46         ` Arjan van de Ven
@ 2016-06-20 19:03         ` Rik van Riel
  2016-06-21  2:48           ` Eric Dumazet
  1 sibling, 1 reply; 52+ messages in thread
From: Rik van Riel @ 2016-06-20 19:03 UTC (permalink / raw)
  To: Thomas Gleixner, Eric Dumazet
  Cc: LKML, Ingo Molnar, Peter Zijlstra, Paul E. McKenney,
	Frederic Weisbecker, Chris Mason, Arjan van de Ven, rt,
	Linus Torvalds, George Spelvin, Len Brown

On Mon, 2016-06-20 at 15:56 +0200, Thomas Gleixner wrote:
> 
> 2) Cut off at 37hrs for HZ=1000. We could make this configurable as a
> 1000HZ
>    option so datacenter folks can use this and people who don't care
> and want
>    better batching for power can use the 4ms thingy.
> 

It might be easy enough to simply re-queue a timer that
has not expired yet after 37 hours.

How many 37 hour timers will there be outstanding at any
one time, that expire around the same time?

Chances are, not many at all. In fact, the vast majority
of them are likely to be deleted long before they ever
expire.

Timers lasting longer than 37 hours do not seem like
something worth optimizing for.

-- 
All Rights Reversed.



* Re: [patch V2 00/20] timer: Refactor the timer wheel
  2016-06-20 19:03         ` Rik van Riel
@ 2016-06-21  2:48           ` Eric Dumazet
  0 siblings, 0 replies; 52+ messages in thread
From: Eric Dumazet @ 2016-06-21  2:48 UTC (permalink / raw)
  To: Rik van Riel
  Cc: Thomas Gleixner, LKML, Ingo Molnar, Peter Zijlstra,
	Paul E. McKenney, Frederic Weisbecker, Chris Mason,
	Arjan van de Ven, rt, Linus Torvalds, George Spelvin, Len Brown

On Mon, Jun 20, 2016 at 12:03 PM, Rik van Riel <riel@redhat.com> wrote:
> On Mon, 2016-06-20 at 15:56 +0200, Thomas Gleixner wrote:
>>
>> 2) Cut off at 37hrs for HZ=1000. We could make this configurable as a
>> 1000HZ
>>    option so datacenter folks can use this and people who don't care
>> and want
>>    better batching for power can use the 4ms thingy.
>>
>
> It might be easy enough to simply re-queue a timer that
> has not expired yet after 37 hours.
>
> How many 37 hour timers will there be outstanding at any
> one time, that expire around the same time?
>
> Chances are, not many at all. In fact, the vast majority
> of them are likely to be deleted long before they ever
> expire.
>
> Timers lasting longer than 37 hours do not seem like
> something worth optimizing for.
>

I totally agree that these long timers (if someone really needs them)
should probably be handled using an additional set of helpers able to
rearm the timer if it expires 'too soon'.

* Re: [patch V2 05/20] driver/net/ethernet/tile: Initialize timer as pinned
  2016-06-17 13:26 ` [patch V2 05/20] driver/net/ethernet/tile: " Thomas Gleixner
@ 2016-06-21 18:14   ` Peter Zijlstra
  0 siblings, 0 replies; 52+ messages in thread
From: Peter Zijlstra @ 2016-06-21 18:14 UTC (permalink / raw)
  To: Thomas Gleixner
  Cc: LKML, Ingo Molnar, Paul E. McKenney, Eric Dumazet,
	Frederic Weisbecker, Chris Mason, Arjan van de Ven, rt,
	Rik van Riel, Linus Torvalds, George Spelvin, Len Brown

On Fri, Jun 17, 2016 at 01:26:32PM -0000, Thomas Gleixner wrote:
> @@ -1004,7 +1004,7 @@ static void tile_net_register(void *dev_
>  		BUG();
>  
>  	/* Initialize the egress timer. */
> -	init_timer(&info->egress_timer);
> +	init_pinned_timer(&info->egress_timer);

init_timer_pinned() works loads better

>  	info->egress_timer.data = (long)info;
>  	info->egress_timer.function = tile_net_handle_egress_timer;
>  
> 
> 

* Re: [patch V2 00/20] timer: Refactor the timer wheel
  2016-06-17 13:26 [patch V2 00/20] timer: Refactor the timer wheel Thomas Gleixner
                   ` (22 preceding siblings ...)
  2016-06-20 15:05 ` Paul E. McKenney
@ 2016-06-22  7:37 ` Mike Galbraith
  2016-06-22  8:44   ` Thomas Gleixner
  23 siblings, 1 reply; 52+ messages in thread
From: Mike Galbraith @ 2016-06-22  7:37 UTC (permalink / raw)
  To: Thomas Gleixner, LKML
  Cc: Ingo Molnar, Peter Zijlstra, Paul E. McKenney, Eric Dumazet,
	Frederic Weisbecker, Chris Mason, Arjan van de Ven, rt,
	Rik van Riel, Linus Torvalds, George Spelvin, Len Brown

On Fri, 2016-06-17 at 13:26 +0000, Thomas Gleixner wrote:
> This is the second version of the timer wheel rework series. The first series
> can be found here:
> 
>    http://lkml.kernel.org/r/20160613070440.950649741@linutronix.de
> 
> The series is also available in git:
> 
>    git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip.git WIP.timers

FWIW, testing with ltp, I noticed a new failure in logs.  It turns out
to be intermittent, but the testcase mostly fails.

rtbox:~ # /usr/local/ltp/conformance/interfaces/sigtimedwait/sigtimedwait_1-1.run-test
Test FAILED: sigtimedwait() did not return in the required time
time_elapsed: 1.197057
...come on, you can do it...
rtbox:~ # /usr/local/ltp/conformance/interfaces/sigtimedwait/sigtimedwait_1-1.run-test
Test PASSED

#define ERRORMARGIN 0.1
...
        if ((time_elapsed > SIGTIMEDWAITSEC + ERRORMARGIN)
            || (time_elapsed < SIGTIMEDWAITSEC - ERRORMARGIN)) {
                printf("Test FAILED: sigtimedwait() did not return in "
                        "the required time\n");
                printf("time_elapsed: %lf\n", time_elapsed);
                return PTS_FAIL;
        }

Looks hohum to me, but gripe did arrive with patch set, so you get a note.

	-Mike

* Re: [patch V2 00/20] timer: Refactor the timer wheel
  2016-06-22  7:37 ` Mike Galbraith
@ 2016-06-22  8:44   ` Thomas Gleixner
  2016-06-22  9:06     ` Mike Galbraith
                       ` (2 more replies)
  0 siblings, 3 replies; 52+ messages in thread
From: Thomas Gleixner @ 2016-06-22  8:44 UTC (permalink / raw)
  To: Mike Galbraith
  Cc: LKML, Ingo Molnar, Peter Zijlstra, Paul E. McKenney,
	Eric Dumazet, Frederic Weisbecker, Chris Mason, Arjan van de Ven,
	rt, Rik van Riel, Linus Torvalds, George Spelvin, Len Brown, ltp

On Wed, 22 Jun 2016, Mike Galbraith wrote:
> FWIW, testing with ltp, I noticed a new failure in logs.  It turns out
> to be intermittent, but the testcase mostly fails.

You forgot to cc the LTP folks ...
 
> rtbox:~ # /usr/local/ltp/conformance/interfaces/sigtimedwait/sigtimedwait_1-1.run-test
> Test FAILED: sigtimedwait() did not return in the required time
> time_elapsed: 1.197057
> ...come on, you can do it...
> rtbox:~ # /usr/local/ltp/conformance/interfaces/sigtimedwait/sigtimedwait_1-1.run-test
> Test PASSED
> 
> #define ERRORMARGIN 0.1
> ...
>         if ((time_elapsed > SIGTIMEDWAITSEC + ERRORMARGIN)
>             || (time_elapsed < SIGTIMEDWAITSEC - ERRORMARGIN)) {
>                 printf("Test FAILED: sigtimedwait() did not return in "
>                         "the required time\n");
>                 printf("time_elapsed: %lf\n", time_elapsed);
>                 return PTS_FAIL;
>         }
> 
> Looks hohum to me, but gripe did arrive with patch set, so you get a note.

hohum is a euphemism. That's completely bogus.

The only guarantee a syscall with timers has is: timer does not fire early.

Thanks,

	tglx

* Re: [patch V2 00/20] timer: Refactor the timer wheel
  2016-06-22  8:44   ` Thomas Gleixner
@ 2016-06-22  9:06     ` Mike Galbraith
  2016-06-22 13:37       ` Mike Galbraith
  2016-06-22 10:28     ` [LTP] " Cyril Hrubis
  2016-06-26 19:00     ` Pavel Machek
  2 siblings, 1 reply; 52+ messages in thread
From: Mike Galbraith @ 2016-06-22  9:06 UTC (permalink / raw)
  To: Thomas Gleixner
  Cc: LKML, Ingo Molnar, Peter Zijlstra, Paul E. McKenney,
	Eric Dumazet, Frederic Weisbecker, Chris Mason, Arjan van de Ven,
	rt, Rik van Riel, Linus Torvalds, George Spelvin, Len Brown, ltp

On Wed, 2016-06-22 at 10:44 +0200, Thomas Gleixner wrote:
> On Wed, 22 Jun 2016, Mike Galbraith wrote:
> > FWIW, testing with ltp, I noticed a new failure in logs.  It turns out
> > to be intermittent, but the testcase mostly fails.
> 
> You forgot to cc the LTP folks ...

This ain't the only one, it's just new.  I'll mention it.
 
File under FYI/FWIW: I also plugged the set into RT, and nothing fell
out of local boxen.  The below is falling out of my 8 socket box
though.. maybe a portage booboo.

[ 1503.988863] clocksource: timekeeping watchdog on CPU42: Marking clocksource 'tsc' as unstable because the skew is too large:
[ 1504.203800] clocksource:                       'hpet' wd_now: 38b55bb wd_last: 8303f269 mask: ffffffff
[ 1504.296111] clocksource:                       'tsc' cs_now: 3a3aa717794 cs_last: 354624eea7b mask: ffffffffffffffff
[ 1504.402329] clocksource: Switched to clocksource hpet

* Re: [LTP] [patch V2 00/20] timer: Refactor the timer wheel
  2016-06-22  8:44   ` Thomas Gleixner
  2016-06-22  9:06     ` Mike Galbraith
@ 2016-06-22 10:28     ` Cyril Hrubis
  2016-06-23  8:27       ` Thomas Gleixner
  2016-06-26 19:00     ` Pavel Machek
  2 siblings, 1 reply; 52+ messages in thread
From: Cyril Hrubis @ 2016-06-22 10:28 UTC (permalink / raw)
  To: Thomas Gleixner
  Cc: Mike Galbraith, Rik van Riel, Len Brown, Peter Zijlstra,
	Frederic Weisbecker, LKML, George Spelvin, Chris Mason,
	Eric Dumazet, rt, Paul E. McKenney, Linus Torvalds, Ingo Molnar,
	ltp, Arjan van de Ven

Hi!
> > rtbox:~ # /usr/local/ltp/conformance/interfaces/sigtimedwait/sigtimedwait_1-1.run-test
> > Test FAILED: sigtimedwait() did not return in the required time
> > time_elapsed: 1.197057
> > ...come on, you can do it...
> > rtbox:~ # /usr/local/ltp/conformance/interfaces/sigtimedwait/sigtimedwait_1-1.run-test
> > Test PASSED
> > 
> > #define ERRORMARGIN 0.1
> > ...
> >         if ((time_elapsed > SIGTIMEDWAITSEC + ERRORMARGIN)
> >             || (time_elapsed < SIGTIMEDWAITSEC - ERRORMARGIN)) {
> >                 printf("Test FAILED: sigtimedwait() did not return in "
> >                         "the required time\n");
> >                 printf("time_elapsed: %lf\n", time_elapsed);
> >                 return PTS_FAIL;
> >         }
> > 
> > Looks hohum to me, but gripe did arrive with patch set, so you get a note.
> 
> hohum is a euphemism. That's completely bogus.
> 
> The only guarantee a syscall with timers has is: timer does not fire early.

While this is true, checking with a reasonable error margin works just
fine 99% of the time. You cannot really test that a timer expires
without setting an arbitrary margin.

Looking at POSIX, the sigtimedwait() timer should run on CLOCK_MONOTONIC,
so we can call clock_getres(CLOCK_MONOTONIC, ...), double or triple the
value, and use that as the error margin. And also fix the test to use
the CLOCK_MONOTONIC timer.

And of course the error margin must not be used when we check that the
elapsed time wasn't shorter than we expected.

Does that sound reasonable?

-- 
Cyril Hrubis
chrubis@suse.cz

* Re: [patch V2 00/20] timer: Refactor the timer wheel
  2016-06-22  9:06     ` Mike Galbraith
@ 2016-06-22 13:37       ` Mike Galbraith
  0 siblings, 0 replies; 52+ messages in thread
From: Mike Galbraith @ 2016-06-22 13:37 UTC (permalink / raw)
  To: Thomas Gleixner
  Cc: LKML, Ingo Molnar, Peter Zijlstra, Paul E. McKenney,
	Eric Dumazet, Frederic Weisbecker, Chris Mason, Arjan van de Ven,
	rt, Rik van Riel, Linus Torvalds, George Spelvin, Len Brown, ltp

On Wed, 2016-06-22 at 11:06 +0200, Mike Galbraith wrote:
> On Wed, 2016-06-22 at 10:44 +0200, Thomas Gleixner wrote:
> > On Wed, 22 Jun 2016, Mike Galbraith wrote:
> > > FWIW, testing with ltp, I noticed a new failure in logs.  It turns out
> > > to be intermittent, but the testcase mostly fails.
> > 
> > You forgot to cc the LTP folks ...
> 
> This ain't the only one, it's just new.  I'll mention it.
>  
> File under FYI/FWIW: I also plugged the set into RT, and nothing fell
> out of local boxen.  The below is falling out of my 8 socket box
> though.. maybe a portage booboo.
> 
> [ 1503.988863] clocksource: timekeeping watchdog on CPU42: Marking clocksource 'tsc' as unstable because the skew is too large:
> [ 1504.203800] clocksource:                       'hpet' wd_now: 38b55bb wd_last: 8303f269 mask: ffffffff
> [ 1504.296111] clocksource:                       'tsc' cs_now: 3a3aa717794 cs_last: 354624eea7b mask: ffffffffffffffff
> [ 1504.402329] clocksource: Switched to clocksource hpet

Nope, not RT portage booboo.  Virgin x86-tip/WIP.timers..

vogelweide:~/:[130]# dmesg|grep clocksource:
[    0.000000] clocksource: refined-jiffies: mask: 0xffffffff max_cycles: 0xffffffff, max_idle_ns: 1910969940391419 ns
[    0.000000] clocksource: hpet: mask: 0xffffffff max_cycles: 0xffffffff, max_idle_ns: 133484882848 ns
[    5.608205] clocksource: jiffies: mask: 0xffffffff max_cycles: 0xffffffff, max_idle_ns: 1911260446275000 ns
[    9.151208] clocksource: Switched to clocksource hpet
[    9.485907] clocksource: acpi_pm: mask: 0xffffff max_cycles: 0xffffff, max_idle_ns: 2085701024 ns
[   11.947226] clocksource: tsc: mask: 0xffffffffffffffff max_cycles: 0x20974986637, max_idle_ns: 440795286310 ns
[   13.012145] clocksource: Switched to clocksource tsc
[  434.868215] clocksource: timekeeping watchdog on CPU59: Marking clocksource 'tsc' as unstable because the skew is too large:
[  434.982251] clocksource:                       'hpet' wd_now: 732cdf37 wd_last: df3d99d8 mask: ffffffff
[  435.085875] clocksource:                       'tsc' cs_now: 16e6780d1eb cs_last: 11326fa576e mask: ffffffffffffffff
[  435.211249] clocksource: Switched to clocksource hpet
vogelweide:~/:[0]#

^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: [LTP] [patch V2 00/20] timer: Refactor the timer wheel
  2016-06-22 10:28     ` [LTP] " Cyril Hrubis
@ 2016-06-23  8:27       ` Thomas Gleixner
  2016-06-23 11:47         ` Cyril Hrubis
  0 siblings, 1 reply; 52+ messages in thread
From: Thomas Gleixner @ 2016-06-23  8:27 UTC (permalink / raw)
  To: Cyril Hrubis
  Cc: Mike Galbraith, Rik van Riel, Len Brown, Peter Zijlstra,
	Frederic Weisbecker, LKML, George Spelvin, Chris Mason,
	Eric Dumazet, rt, Paul E. McKenney, Linus Torvalds, Ingo Molnar,
	ltp, Arjan van de Ven

On Wed, 22 Jun 2016, Cyril Hrubis wrote:
> Hi!
> > > rtbox:~ # /usr/local/ltp/conformance/interfaces/sigtimedwait/sigtimedwait_1-1.run-test
> > > Test FAILED: sigtimedwait() did not return in the required time
> > > time_elapsed: 1.197057
> > > ...come on, you can do it...
> > > rtbox:~ # /usr/local/ltp/conformance/interfaces/sigtimedwait/sigtimedwait_1-1.run-test
> > > Test PASSED
> > > 
> > > #define ERRORMARGIN 0.1
> > > ...
> > >         if ((time_elapsed > SIGTIMEDWAITSEC + ERRORMARGIN)
> > >             || (time_elapsed < SIGTIMEDWAITSEC - ERRORMARGIN)) {
> > >                 printf("Test FAILED: sigtimedwait() did not return in "
> > >                         "the required time\n");
> > >                 printf("time_elapsed: %lf\n", time_elapsed);
> > >                 return PTS_FAIL;
> > >         }
> > > 
> > > Looks hohum to me, but gripe did arrive with patch set, so you get a note.
> > 
> > hohum is a euphemism. That's completely bogus.
> > 
> > The only guarantee a syscall with timers has is: timer does not fire early.
> 
> While this is true, checking with a reasonable error margin works just
> fine 99% of the time. You cannot really test that the timer expires
> without setting an arbitrary margin.

Err. You know that the timer expired because sigtimedwait() returns
EAGAIN. And the only thing you can reliably check for is that the timer did
not expire too early. Anything else is guesswork and voodoo programming.

Thanks,

	tglx

^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: [LTP] [patch V2 00/20] timer: Refactor the timer wheel
  2016-06-23  8:27       ` Thomas Gleixner
@ 2016-06-23 11:47         ` Cyril Hrubis
  2016-06-23 13:58           ` George Spelvin
  0 siblings, 1 reply; 52+ messages in thread
From: Cyril Hrubis @ 2016-06-23 11:47 UTC (permalink / raw)
  To: Thomas Gleixner
  Cc: Mike Galbraith, Rik van Riel, Len Brown, Peter Zijlstra,
	Frederic Weisbecker, LKML, George Spelvin, Chris Mason,
	Eric Dumazet, rt, Paul E. McKenney, Linus Torvalds, Ingo Molnar,
	ltp, Arjan van de Ven

Hi!
> > While this is true, checking with a reasonable error margin works just
> > fine 99% of the time. You cannot really test that the timer expires
> > without setting an arbitrary margin.
> 
> Err. You know that the timer expired because sigtimedwait() returns
> EAGAIN. And the only thing you can reliably check for is that the timer did
> not expire too early. Anything else is guesswork and voodoo programming.

There are quite a lot of things that can happen on a multitasking OS, and
there are even NMIs in hardware, etc. But seriously, is there a reason
why an OS that is not under heavy load cannot expire timers with reasonable
overruns? I.e. if I ask for a second of sleep, can I expect to be woken
up not much more than half a second later?

If we stick only to the guarantees that are defined in POSIX, playing music
with mplayer would not be possible, since it sleeps in futex() and if it
wakes too late it will fail to fill its buffers. In practice this has worked
fine for me for years.

-- 
Cyril Hrubis
chrubis@suse.cz

^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: [LTP] [patch V2 00/20] timer: Refactor the timer wheel
  2016-06-23 11:47         ` Cyril Hrubis
@ 2016-06-23 13:58           ` George Spelvin
  2016-06-23 14:10             ` Thomas Gleixner
  2016-06-23 15:11             ` Cyril Hrubis
  0 siblings, 2 replies; 52+ messages in thread
From: George Spelvin @ 2016-06-23 13:58 UTC (permalink / raw)
  To: chrubis, tglx
  Cc: arjan, clm, edumazet, fweisbec, lenb, linux-kernel, linux, ltp,
	mingo, paulmck, peterz, riel, rt, torvalds, umgwanakikbuti

Cyril Hrubis wrote:
> Thomas Gleixner wrote:
>> Err. You know that the timer expired because sigtimedwait() returns
>> EAGAIN. And the only thing you can reliably check for is that the timer did
>> not expire too early. Anything else is guesswork and voodoo programming.

> But seriously, is there a reason
> why an OS that is not under heavy load cannot expire timers with reasonable
> overruns? I.e. if I ask for a second of sleep, can I expect to be woken
> up not much more than half a second later?

> If we stick only to the guarantees that are defined in POSIX, playing music
> with mplayer would not be possible, since it sleeps in futex() and if it
> wakes too late it will fail to fill its buffers. In practice this has worked
> fine for me for years.

Two points:
1) sigtimedwait() is unusual in that it uses the jiffies timer.  Most
   system call timeouts (including specifically the one in FUTEX_WAIT)
   use the high-resolution timer subsystem, which is a whole different
   animal with tighter guarantees, and
2) The worst-case error in tglx's proposal is 1/8 of the requested
   timeout: the wakeup is after 112.5% of the requested time, plus
   one tick.  This is well within your requested accuracy.  (For very
   short timeouts, the "plus one tick" can dominate the percentage error.)

^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: [LTP] [patch V2 00/20] timer: Refactor the timer wheel
  2016-06-23 13:58           ` George Spelvin
@ 2016-06-23 14:10             ` Thomas Gleixner
  2016-06-23 15:11             ` Cyril Hrubis
  1 sibling, 0 replies; 52+ messages in thread
From: Thomas Gleixner @ 2016-06-23 14:10 UTC (permalink / raw)
  To: George Spelvin
  Cc: chrubis, arjan, clm, edumazet, fweisbec, lenb, linux-kernel, ltp,
	mingo, paulmck, peterz, riel, rt, torvalds, umgwanakikbuti

On Thu, 23 Jun 2016, George Spelvin wrote:
> Cyril Hrubis wrote:
> > Thomas Gleixner wrote:
> >> Err. You know that the timer expired because sigtimedwait() returns
> >> EAGAIN. And the only thing you can reliably check for is that the timer did
> >> not expire too early. Anything else is guesswork and voodoo programming.
> 
> > But seriously, is there a reason
> > why an OS that is not under heavy load cannot expire timers with reasonable
> > overruns? I.e. if I ask for a second of sleep, can I expect to be woken
> > up not much more than half a second later?
> 
> > If we stick only to the guarantees that are defined in POSIX, playing music
> > with mplayer would not be possible, since it sleeps in futex() and if it
> > wakes too late it will fail to fill its buffers. In practice this has worked
> > fine for me for years.
> 
> Two points:
> 1) sigtimedwait() is unusual in that it uses the jiffies timer.  Most
>    system call timeouts (including specifically the one in FUTEX_WAIT)
>    use the high-resolution timer subsystem, which is a whole different
>    animal with tighter guarantees, and

As Peter said, we want to convert sigtimedwait() to use hrtimers as well. We
converted almost all syscalls with timeouts (futex, poll, select ...) to
hrtimers years ago, but somehow we missed doing the same for sigtimedwait().

Thanks,

	tglx

^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: [LTP] [patch V2 00/20] timer: Refactor the timer wheel
  2016-06-23 13:58           ` George Spelvin
  2016-06-23 14:10             ` Thomas Gleixner
@ 2016-06-23 15:11             ` Cyril Hrubis
  2016-06-23 15:21               ` Thomas Gleixner
  1 sibling, 1 reply; 52+ messages in thread
From: Cyril Hrubis @ 2016-06-23 15:11 UTC (permalink / raw)
  To: George Spelvin
  Cc: tglx, arjan, clm, edumazet, fweisbec, lenb, linux-kernel, ltp,
	mingo, paulmck, peterz, riel, rt, torvalds, umgwanakikbuti

Hi!
> Two points:
> 1) sigtimedwait() is unusual in that it uses the jiffies timer.  Most
>    system call timeouts (including specifically the one in FUTEX_WAIT)
>    use the high-resolution timer subsystem, which is a whole different
>    animal with tighter guarantees, and

That is likely a POSIX conformance bug, since POSIX explicitly states that
sigtimedwait() shall use CLOCK_MONOTONIC to measure the timeout.

"If the Monotonic Clock option is supported, the CLOCK_MONOTONIC clock
shall be used to measure the time interval specified by the timeout
argument."

> 2) The worst-case error in tglx's proposal is 1/8 of the requested
>    timeout: the wakeup is after 112.5% of the requested time, plus
>    one tick.  This is well within your requested accuracy.  (For very
>    short timeouts, the "plus one tick" can dominate the percentage error.)

Hmm, that still does not add up to the number in the original email,
which says time_elapsed: 1.197057. As far as I can tell the worst
case for a tick is CONFIG_HZ=100, so one tick is 0.01s, and even after
subtracting that we get 118.7% of the requested 1s. But that may be caused
by the fact that the test uses gettimeofday() to measure the elapsed time;
it should use CLOCK_MONOTONIC instead.

-- 
Cyril Hrubis
chrubis@suse.cz

^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: [LTP] [patch V2 00/20] timer: Refactor the timer wheel
  2016-06-23 15:11             ` Cyril Hrubis
@ 2016-06-23 15:21               ` Thomas Gleixner
  2016-06-23 16:31                 ` Cyril Hrubis
  0 siblings, 1 reply; 52+ messages in thread
From: Thomas Gleixner @ 2016-06-23 15:21 UTC (permalink / raw)
  To: Cyril Hrubis
  Cc: George Spelvin, arjan, clm, edumazet, fweisbec, lenb,
	linux-kernel, ltp, mingo, paulmck, peterz, riel, rt, torvalds,
	umgwanakikbuti

On Thu, 23 Jun 2016, Cyril Hrubis wrote:
> > 1) sigtimedwait() is unusual in that it uses the jiffies timer.  Most
> >    system call timeouts (including specifically the one in FUTEX_WAIT)
> >    use the high-resolution timer subsystem, which is a whole different
> >    animal with tighter guarantees, and
> 
> That is likely a POSIX conformance bug, since POSIX explicitly states that
> sigtimedwait() shall use CLOCK_MONOTONIC to measure the timeout.
> 
> "If the Monotonic Clock option is supported, the CLOCK_MONOTONIC clock
> shall be used to measure the time interval specified by the timeout
> argument."

That's fine because jiffies is a less granular form of CLOCK_MONOTONIC.
 
Thanks,

	tglx

^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: [LTP] [patch V2 00/20] timer: Refactor the timer wheel
  2016-06-23 15:21               ` Thomas Gleixner
@ 2016-06-23 16:31                 ` Cyril Hrubis
  0 siblings, 0 replies; 52+ messages in thread
From: Cyril Hrubis @ 2016-06-23 16:31 UTC (permalink / raw)
  To: Thomas Gleixner
  Cc: George Spelvin, arjan, clm, edumazet, fweisbec, lenb,
	linux-kernel, ltp, mingo, paulmck, peterz, riel, rt, torvalds,
	umgwanakikbuti

Hi!
> > That is likely a POSIX conformance bug, since POSIX explicitly states that
> > sigtimedwait() shall use CLOCK_MONOTONIC to measure the timeout.
> > 
> > "If the Monotonic Clock option is supported, the CLOCK_MONOTONIC clock
> > shall be used to measure the time interval specified by the timeout
> > argument."
> 
> That's fine because jiffies is a less granular form of CLOCK_MONOTONIC.

Looking into POSIX Realtime Clocks and Timers, it seems to allow a time
service based on CLOCK_* clocks to have a different resolution, if it is
less than or equal to 20ms and if this fact is documented. If we wanted to
be pedantic about this, the man page should be patched...

Also, this gives us a reasonably safe upper bound on timer expiration of
something like:

sleep_time * 1.125 + 20ms

Does this sound reasonable now?

-- 
Cyril Hrubis
chrubis@suse.cz

^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: [patch V2 12/20] timer: Switch to a non cascading wheel
  2016-06-18  9:55   ` George Spelvin
@ 2016-06-24 10:06     ` Thomas Gleixner
  0 siblings, 0 replies; 52+ messages in thread
From: Thomas Gleixner @ 2016-06-24 10:06 UTC (permalink / raw)
  To: George Spelvin
  Cc: linux-kernel, arjan, clm, edumazet, fweisbec, lenb, mingo,
	paulmck, peterz, riel, rt, torvalds

On Sat, 18 Jun 2016, George Spelvin wrote:
> Looking at the __next_timer_interrupt function, it seems that it does
> a lot more work than necessary.  Once a timeout has been found in the
> current level, the range which must be searched in the following level
> is limited to 1/LVL_CLK_DIV of the range in the current level.
> 
> That quickly tapers off to zero and the search can stop.
> 
> In particular, if a timeout is found at level 0 between the immediately
> next bucket and the next bucket which is a multiple of LEVEL_SHIFT_DIV,
> inclusive (1 <= x <= 8 buckets depending on the sbits of base->clk),
> then the search can stop immediately.

Correct. I thought about that, but never got around to implementing it.
 
> This is hairy code and the following untested code is probably buggy,
> but the basic idea is:

It's buggy, but yes, the idea is sane. We can do this as an incremental change
once we agree on the general idea.
 
Thanks,

	tglx

^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: [patch V2 00/20] timer: Refactor the timer wheel
  2016-06-22  8:44   ` Thomas Gleixner
  2016-06-22  9:06     ` Mike Galbraith
  2016-06-22 10:28     ` [LTP] " Cyril Hrubis
@ 2016-06-26 19:00     ` Pavel Machek
  2016-06-26 19:21       ` Arjan van de Ven
  2 siblings, 1 reply; 52+ messages in thread
From: Pavel Machek @ 2016-06-26 19:00 UTC (permalink / raw)
  To: Thomas Gleixner
  Cc: Mike Galbraith, LKML, Ingo Molnar, Peter Zijlstra,
	Paul E. McKenney, Eric Dumazet, Frederic Weisbecker, Chris Mason,
	Arjan van de Ven, rt, Rik van Riel, Linus Torvalds,
	George Spelvin, Len Brown, ltp

Hi!


> > FWIW, testing with ltp, I noticed a new failure in logs.  It turns out
> > to be intermittent, but the testcase mostly fails.
> 
> You forgot to cc the LTP folks ...
>  
> > rtbox:~ # /usr/local/ltp/conformance/interfaces/sigtimedwait/sigtimedwait_1-1.run-test
> > Test FAILED: sigtimedwait() did not return in the required time
> > time_elapsed: 1.197057
> > ...come on, you can do it...
> > rtbox:~ # /usr/local/ltp/conformance/interfaces/sigtimedwait/sigtimedwait_1-1.run-test
> > Test PASSED
> > 
> > #define ERRORMARGIN 0.1
> > ...
> >         if ((time_elapsed > SIGTIMEDWAITSEC + ERRORMARGIN)
> >             || (time_elapsed < SIGTIMEDWAITSEC - ERRORMARGIN)) {
> >                 printf("Test FAILED: sigtimedwait() did not return in "
> >                         "the required time\n");
> >                 printf("time_elapsed: %lf\n", time_elapsed);
> >                 return PTS_FAIL;
> >         }
> > 
> > Looks hohum to me, but gripe did arrive with patch set, so you get a note.
> 
> hohum is a euphemism. That's completely bogus.
> 
> The only guarantee a syscall with timers has is: timer does not fire
> early.

Umm. I'm not sure if you should be designing kernel...

I have an alarm clock application. It does sleep(60) many times until it's
time to wake me up. I'll be very angry if sleep(60) takes 65 seconds
without some very, very good reason.

So yes, the man page says this is the only requirement (and didn't you
break it earlier in the patch set, with the sleep for 5 days? :-) ), but
no, it is not really the only requirement you have.

You may argue LTP's ERRORMARGIN is too strict. But you can't argue LTP
is completely bogus, and I'd say an error margin of 0.1 second is
completely reasonable (*).

If I update my alarm clock application to display seconds, I really
want them with better precision than 0.1 second (*), because 0.1 seconds
is already visible to the naked eye.

Best regards,
								Pavel

(*) on reasonably idle system.
-- 
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html

^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: [patch V2 00/20] timer: Refactor the timer wheel
  2016-06-26 19:00     ` Pavel Machek
@ 2016-06-26 19:21       ` Arjan van de Ven
  2016-06-26 20:02         ` Pavel Machek
  0 siblings, 1 reply; 52+ messages in thread
From: Arjan van de Ven @ 2016-06-26 19:21 UTC (permalink / raw)
  To: Pavel Machek
  Cc: Thomas Gleixner, Mike Galbraith, LKML, Ingo Molnar,
	Peter Zijlstra, Paul E. McKenney, Eric Dumazet,
	Frederic Weisbecker, Chris Mason, Arjan van de Ven, rt,
	Rik van Riel, Linus Torvalds, George Spelvin, Len Brown, ltp

On Sun, Jun 26, 2016 at 12:00 PM, Pavel Machek <pavel@ucw.cz> wrote:
>
> Umm. I'm not sure if you should be designing kernel...
>
> I have alarm clock application. It does sleep(60) many times till its
> time to wake me up. I'll be very angry if sleep(60) takes 65 seconds
> without some very, very good reason.

I'm fairly sure you shouldn't be designing alarm clock applications!
Because on busy systems you get random (scheduler) delays added to your timer.

Having said that, your example is completely crooked here: sleep()
does not use these kernel timers, it uses hrtimers instead.
(hrtimers also have slack, but an alarm clock application that is this
broken would have the choice to set such slack to 0.)

What happened here is that sigtimedwait() was actually not great;
it is just about the only application-visible interface that's still
in jiffies/HZ. And in the follow-on patch set, Thomas converted it
properly to hrtimers as well, to make it both accurate and CONFIG_HZ
independent.

^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: [patch V2 00/20] timer: Refactor the timer wheel
  2016-06-26 19:21       ` Arjan van de Ven
@ 2016-06-26 20:02         ` Pavel Machek
  0 siblings, 0 replies; 52+ messages in thread
From: Pavel Machek @ 2016-06-26 20:02 UTC (permalink / raw)
  To: Arjan van de Ven
  Cc: Thomas Gleixner, Mike Galbraith, LKML, Ingo Molnar,
	Peter Zijlstra, Paul E. McKenney, Eric Dumazet,
	Frederic Weisbecker, Chris Mason, Arjan van de Ven, rt,
	Rik van Riel, Linus Torvalds, George Spelvin, Len Brown, ltp

Hi!

On Sun 2016-06-26 12:21:46, Arjan van de Ven wrote:
> On Sun, Jun 26, 2016 at 12:00 PM, Pavel Machek <pavel@ucw.cz> wrote:
> >
> > Umm. I'm not sure if you should be designing kernel...
> >
> > I have alarm clock application. It does sleep(60) many times till its
> > time to wake me up. I'll be very angry if sleep(60) takes 65 seconds
> > without some very, very good reason.
> 
> I'm fairly sure you shouldn't be designing alarm clock applications!
> Because on busy systems you get random (scheduler) delays added to
> your timer.

I'm pretty sure I should not be designing alarm clock applications,
after looking at the timezone stuff. But the alarm clock from MATE eats 3%
CPU on my cellphone, so I kind of had to.

And yes, I'm aware that scheduler delays would add up. But if it is 79
seconds before the alarm, I do sleep(79), and it would be strange to have
the alarm fire 5 seconds too late.

> Having said that, your example is completely crooked here: sleep()
> does not use these kernel timers, it uses hrtimers instead.
> (hrtimers also have slack, but an alarm clock application that is this
> broken would have the choice to set such slack to 0.)
> 
> What happened here is that sigtimedwait() was actually not great;
> it is just about the only application-visible interface that's still
> in jiffies/HZ. And in the follow-on patch set, Thomas converted it
> properly to hrtimers as well, to make it both accurate and CONFIG_HZ
> independent.

So it is going to be fixed, good.

Best regards,
									Pavel
-- 
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html

^ permalink raw reply	[flat|nested] 52+ messages in thread

end of thread, other threads:[~2016-06-26 20:02 UTC | newest]

Thread overview: 52+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2016-06-17 13:26 [patch V2 00/20] timer: Refactor the timer wheel Thomas Gleixner
2016-06-17 13:26 ` [patch V2 01/20] timer: Make pinned a timer property Thomas Gleixner
2016-06-17 13:26 ` [patch V2 02/20] x86/apic/uv: Initialize timer as pinned Thomas Gleixner
2016-06-17 13:26 ` [patch V2 03/20] x86/mce: " Thomas Gleixner
2016-06-17 13:26 ` [patch V2 05/20] driver/net/ethernet/tile: " Thomas Gleixner
2016-06-21 18:14   ` Peter Zijlstra
2016-06-17 13:26 ` [patch V2 04/20] cpufreq/powernv: " Thomas Gleixner
2016-06-17 13:26 ` [patch V2 06/20] drivers/tty/metag_da: " Thomas Gleixner
2016-06-17 13:26 ` [patch V2 07/20] drivers/tty/mips_ejtag: " Thomas Gleixner
2016-06-17 13:26 ` [patch V2 08/20] net/ipv4/inet: Initialize timers " Thomas Gleixner
2016-06-17 13:26 ` [patch V2 09/20] timer: Remove mod_timer_pinned Thomas Gleixner
2016-06-17 13:26 ` [patch V2 10/20] hlist: Add hlist_is_singular_node() helper Thomas Gleixner
2016-06-17 13:26 ` [patch V2 11/20] timer: Give a few structs and members proper names Thomas Gleixner
2016-06-17 13:26 ` [patch V2 12/20] timer: Switch to a non cascading wheel Thomas Gleixner
2016-06-18  9:55   ` George Spelvin
2016-06-24 10:06     ` Thomas Gleixner
2016-06-17 13:26 ` [patch V2 13/20] timer: Remove slack leftovers Thomas Gleixner
2016-06-17 13:26 ` [patch V2 14/20] timer: Move __run_timers() function Thomas Gleixner
2016-06-17 13:26 ` [patch V2 15/20] timer: Optimize collect timers for NOHZ Thomas Gleixner
2016-06-17 13:26 ` [patch V2 16/20] tick/sched: Remove pointless empty function Thomas Gleixner
2016-06-17 13:26 ` [patch V2 17/20] timer: Forward wheel clock whenever possible Thomas Gleixner
2016-06-17 13:26 ` [patch V2 18/20] timer: Only wake softirq if necessary Thomas Gleixner
2016-06-17 13:26 ` [patch V2 19/20] timer: Split out index calculation Thomas Gleixner
2016-06-17 13:26 ` [patch V2 20/20] timer: Optimization for same expiry time in mod_timer() Thomas Gleixner
2016-06-17 13:48 ` [patch V2 00/20] timer: Refactor the timer wheel Eric Dumazet
2016-06-17 13:57   ` Thomas Gleixner
2016-06-17 14:25     ` Eric Dumazet
2016-06-20 13:56       ` Thomas Gleixner
2016-06-20 14:46         ` Arjan van de Ven
2016-06-20 14:46           ` Thomas Gleixner
2016-06-20 14:49             ` Arjan van de Ven
2016-06-20 19:03         ` Rik van Riel
2016-06-21  2:48           ` Eric Dumazet
2016-06-17 14:26 ` Arjan van de Ven
2016-06-20 15:05 ` Paul E. McKenney
2016-06-20 15:13   ` Thomas Gleixner
2016-06-20 15:41     ` Paul E. McKenney
2016-06-22  7:37 ` Mike Galbraith
2016-06-22  8:44   ` Thomas Gleixner
2016-06-22  9:06     ` Mike Galbraith
2016-06-22 13:37       ` Mike Galbraith
2016-06-22 10:28     ` [LTP] " Cyril Hrubis
2016-06-23  8:27       ` Thomas Gleixner
2016-06-23 11:47         ` Cyril Hrubis
2016-06-23 13:58           ` George Spelvin
2016-06-23 14:10             ` Thomas Gleixner
2016-06-23 15:11             ` Cyril Hrubis
2016-06-23 15:21               ` Thomas Gleixner
2016-06-23 16:31                 ` Cyril Hrubis
2016-06-26 19:00     ` Pavel Machek
2016-06-26 19:21       ` Arjan van de Ven
2016-06-26 20:02         ` Pavel Machek

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).