Message-Id: <20150414203303.702062272@linutronix.de>
User-Agent: quilt/0.63-1
Date: Tue, 14 Apr 2015 21:08:23 -0000
From: Thomas Gleixner
To: LKML
Cc: Peter Zijlstra, Ingo Molnar, Preeti U Murthy, Viresh Kumar,
    Marcelo Tosatti, Frederic Weisbecker
Subject: [patch 00/39] hrtimer/tick: Optimizations, cleanups and solutions for various issues

When I returned from my break I got offended by a pile of patches
which kill the patient with the cure.

The issues at hand:

  - NOHZ: Get rid of the softirq invocation

  - hrtimer: Use the active_bases field in order to avoid evaluating
    inactive bases

  - hrtimer: Cache footprint issues

Aside from that, Peter and I have been discussing for a long time how
to get rid of the hrtimer softirq.

After staring at all of it for quite some time it occurred to me that
all issues are related in one way or the other. So I sat down and
reworked the code in various ways:

  - Reduce the data size, so the hrtimer clock bases can be made cache
    line aligned.

  - Consolidate everything on the high resolution timer implementation
    and get rid of dubious optimizations for the non highres case
    which bloat code and data.

  - Implement the active_bases mechanism properly and avoid touching
    inactive hrtimer clock bases.
    This includes a conditional update mechanism for the hrtimer clock
    offsets, which updates them only when they have changed. They
    change seldom enough, so there is no reason to pollute 4 cache
    lines in every tick/hrtimer interrupt.

  - Get rid of the softirq deferment and simply enforce a hrtimer
    interrupt when the timer has already expired. This allows removing
    the ugly __hrtimer_start_range_ns() interface and cleaning up the
    usage sites (sched/perf). As a consequence this also gets rid of
    the forward loops in the tick nohz code.

  - Analogous to the hrtimer enforcement, force a tick interrupt for
    NOHZ non highres systems when the forwarding code tries to fire an
    expired tick. This allows getting rid of the softirq invocation in
    the NOHZ code.

  - A cleanup of the code which evaluates the next timer event: Use
    nsec based calculations instead of the jiffy magic. That makes it
    actually readable by some definition of readable.

  - While doing the above I had to audit quite a few usage sites of
    various hrtimer interfaces, which revealed some entertaining
    bugs. The fixes have been posted in a separate series
    already. Some other bogosities have been removed as part of this
    series.

The total change size of this overhaul is:

   37 files changed, 515 insertions(+), 794 deletions(-)

The resulting text size of hrtimer.o shrinks in the range of 8-10%
depending on the architecture.

The cache footprint of the hrtimer per cpu data shrinks as well.

          x86_64   i386    ARM  ARM64  power64
Before:      328    248    280    328      328   Bytes
               6      4      5      6        6   cache lines (64byte)
After:       320    192    192    320      320   Bytes
               5      3      3      5        5   cache lines (64byte)

Note that the new code avoids touching the inactive clock bases, which
are now cache line aligned, and therefore reduces the cache footprint
in normal usage scenarios significantly.

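To illustrate the active_bases idea for readers who have not looked at
the patches yet, here is a minimal userspace sketch. It is not the
kernel code. The struct and function names (cpu_base, clock_base,
base_set_active, next_event, cache_lines) and the base count of 4 are
made up for the example, and __builtin_ctz() assumes GCC or Clang. The
sketch walks only the bases whose bit is set in the mask, keeps each
base on its own 64 byte cache line, and shows the round-up arithmetic
behind the cache line numbers in the table above.

```c
#include <stdint.h>

#define CACHELINE	64
#define NUM_BASES	4	/* assumption for this example only */

/* Hypothetical stand-in for a per clock base, one cache line each. */
struct clock_base {
	int64_t next_expiry;	/* earliest queued expiry in nsec */
	unsigned int index;
} __attribute__((aligned(CACHELINE)));

/* Hypothetical stand-in for the per cpu data. */
struct cpu_base {
	unsigned int active_bases;	/* bit n set: bases[n] has timers */
	struct clock_base bases[NUM_BASES];
};

/* Queue an expiry on a base and mark that base active. */
static void base_set_active(struct cpu_base *cpu, unsigned int idx,
			    int64_t expiry)
{
	cpu->bases[idx].next_expiry = expiry;
	cpu->active_bases |= 1U << idx;
}

/*
 * Return the earliest expiry of all active bases, or INT64_MAX when
 * no base is active. Inactive bases are never dereferenced. *touched
 * reports how many bases were actually read.
 */
static int64_t next_event(const struct cpu_base *cpu, unsigned int *touched)
{
	unsigned int active = cpu->active_bases;
	int64_t earliest = INT64_MAX;

	*touched = 0;
	while (active) {
		unsigned int idx = (unsigned int)__builtin_ctz(active);

		active &= active - 1;	/* clear the lowest set bit */
		(*touched)++;
		if (cpu->bases[idx].next_expiry < earliest)
			earliest = cpu->bases[idx].next_expiry;
	}
	return earliest;
}

/* Number of 64 byte cache lines needed for a given size in bytes. */
static unsigned int cache_lines(unsigned int bytes)
{
	return (bytes + CACHELINE - 1) / CACHELINE;
}
```

With timers queued on bases 0 and 2 only, next_event() reads two of
the four bases and leaves the other two cache lines alone.
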
I did some perf measurements on an isolated core running

  - hrtimer centric workloads

  - idle scenarios with periodic wakeups of various lengths

The patches reduce the number of instructions executed during the
test runs by between 2.5 and 6% depending on the scenario, and the
cache misses by between 3 and 8%.

For your convenience this series is also available at:

   git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip.git timers/wip

Note: The branch is temporary and not meant to base other work on it.

Thanks,

	tglx
---
 arch/x86/kernel/cpu/perf_event_intel_rapl.c   |   5
 arch/x86/kernel/cpu/perf_event_intel_uncore.c |   5
 drivers/power/reset/ltc2952-poweroff.c        |  18
 drivers/staging/ozwpan/ozpd.c                 |   8
 include/linux/alarmtimer.h                    |   4
 include/linux/hrtimer.h                       | 101 ++---
 include/linux/interrupt.h                     |   7
 include/linux/rcupdate.h                      |   6
 include/linux/rcutree.h                       |   2
 include/linux/timekeeper_internal.h           |   2
 include/linux/timer.h                         |   7
 include/linux/timerqueue.h                    |   8
 include/trace/events/irq.h                    |   1
 kernel/events/core.c                          |   9
 kernel/futex.c                                |   5
 kernel/locking/rtmutex.c                      |   5
 kernel/rcu/tree_plugin.h                      |  14
 kernel/sched/core.c                           |  28 -
 kernel/sched/deadline.c                       |  12
 kernel/sched/fair.c                           |   2
 kernel/softirq.c                              |   2
 kernel/time/alarmtimer.c                      |  17
 kernel/time/hrtimer.c                         | 525 +++++++++----------------
 kernel/time/posix-timers.c                    |  17
 kernel/time/tick-broadcast-hrtimer.c          |   8
 kernel/time/tick-internal.h                   |   2
 kernel/time/tick-sched.c                      | 288 +++++---------
 kernel/time/tick-sched.h                      |   2
 kernel/time/timekeeping.c                     |  55 --
 kernel/time/timekeeping.h                     |  10
 kernel/time/timer.c                           |  79 +--
 kernel/time/timer_list.c                      |  14
 lib/timerqueue.c                              |  10
 net/core/pktgen.c                             |   2
 net/sched/sch_api.c                           |   5
 sound/core/hrtimer.c                          |   9
 sound/drivers/pcsp/pcsp.c                     |  15
 37 files changed, 515 insertions(+), 794 deletions(-)