From mboxrd@z Thu Jan  1 00:00:00 1970
Date: Wed, 8 Apr 2015 22:11:05 +0200 (CEST)
From: Thomas Gleixner
To: Viresh Kumar
Cc: Ingo Molnar, Peter Zijlstra, linaro-kernel@lists.linaro.org,
    linux-kernel@vger.kernel.org, Preeti U Murthy
Subject: Re: [PATCH V2 2/2] hrtimer: Iterate only over active clock-bases

On Tue, 7 Apr 2015, Viresh Kumar wrote:
> At several instances we iterate over all possible clock-bases for a
> particular cpu-base. Whereas, we only need to iterate over active bases.
>
> We already have per cpu-base 'active_bases' field, which is updated on
> addition/removal of hrtimers.
>
> This patch creates for_each_active_base(), which uses 'active_bases' to
> iterate only over active bases.

I'm really not too excited about this incomprehensible macro mess and
especially not about the code it generates.

           x86_64    i386     ARM    power
Mainline     7668    6942    8077    10253
+ Patch      8068    7294    8313    10861
             +400    +352    +236     +608

That's insane. What's wrong with just adding

	if (!(cpu_base->active_bases & (1 << i)))
		continue;

to the iterating sites?
Index: linux/kernel/time/hrtimer.c
===================================================================
--- linux.orig/kernel/time/hrtimer.c
+++ linux/kernel/time/hrtimer.c
@@ -451,6 +451,9 @@ static ktime_t __hrtimer_get_next_event(
 		struct timerqueue_node *next;
 		struct hrtimer *timer;
 
+		if (!(cpu_base->active_bases & (1 << i)))
+			continue;
+
 		next = timerqueue_getnext(&base->active);
 		if (!next)
 			continue;
@@ -1441,6 +1444,9 @@ void hrtimer_run_queues(void)
 		return;
 
 	for (index = 0; index < HRTIMER_MAX_CLOCK_BASES; index++) {
+		if (!(cpu_base->active_bases & (1 << index)))
+			continue;
+
 		base = &cpu_base->clock_base[index];
 		if (!timerqueue_getnext(&base->active))
 			continue;
@@ -1681,6 +1687,8 @@ static void migrate_hrtimers(int scpu)
 	raw_spin_lock_nested(&old_base->lock, SINGLE_DEPTH_NESTING);
 
 	for (i = 0; i < HRTIMER_MAX_CLOCK_BASES; i++) {
+		if (!(old_base->active_bases & (1 << i)))
+			continue;
 		migrate_hrtimer_list(&old_base->clock_base[i],
 				     &new_base->clock_base[i]);
 	}

Now the code size increase is in a sane range:

           x86_64    i386     ARM    power
Mainline     7668    6942    8077    10253
+ patch      7748    6990    8113    10365
              +80     +48     +36     +112

So your variant is at least 5 times larger than the simple and obvious
solution. I told you before to look at the resulting binary code changes
and sanity check whether they are in the right ball park.

Thanks,

	tglx