Date: Tue, 14 Jun 2016 12:20:22 +0200
From: Peter Zijlstra
To: George Spelvin
Cc: tglx@linutronix.de, edumazet@google.com, linux-kernel@vger.kernel.org,
	richardcochran@gmail.com
Subject: Re: [patch 13/20] timer: Switch to a non cascading wheel
Message-ID: <20160614102022.GE30909@twins.programming.kicks-ass.net>
References: <20160614081602.10755.qmail@ns.sciencehorizons.net>
In-Reply-To: <20160614081602.10755.qmail@ns.sciencehorizons.net>
User-Agent: Mutt/1.5.23.1 (2014-03-12)

On Tue, Jun 14, 2016 at 04:16:02AM -0400, George Spelvin wrote:
> While I like the cleanup of just limiting long-term resolution, if
> it turns out to be necessary, it's not too hard to add exact timers
> back in if a need is found in the future.  All you need is a second
> __internal_add_timer function that rounds down rather than up, and to
> teach expire_timers() to cascade in the unlikely situation that a timer
> does have an expiry time in the future.

That did occur to me as well; however I think it would be best to
eradicate all forms of cascading entirely -- if at all possible.

If not, then I agree, that would clean things up.

> Wouldn't this all be so much simpler as
>
> #define LVL_BITS	6	/* Renamed previous LVL_SHIFT */
> #define LVL_SIZE	(1 << LVL_BITS)
> #define LVL_MASK	(LVL_SIZE - 1)
> #define LVL_OFFS(n)	((n) * LVL_SIZE)
> #define LVL_SHIFT(n)	((n) * LVL_CLK_SHIFT)
> #define LVL_GRAN(n)	(1 << LVL_SHIFT(n))
>
> Then you could do
>
> +static inline unsigned calc_index(unsigned expires, unsigned level)
> +{
> +	/* Round up to the next bin */
> +	expires = ((expires - 1) >> LVL_SHIFT(level)) + 1;
> +	return LVL_OFFS(level) + (expires & LVL_MASK);
> +}

I like.

> to be replaced with __builtin_clz or similar:

Problem is the archs that don't have that: the 5-level branch is
trivial for all arches, while a software clz/fls is far more expensive.

> > +		timer = hlist_entry(head->first, struct timer_list, entry);
> > +		fn = timer->function;
> > +		data = timer->data;
> > +
> > +		timer_stats_account_timer(timer);
> > +
> > +		base->running_timer = timer;
> > +		detach_expired_timer(timer);
>
> Is there some non-obvious reason that you have to fetch fn and data
> so early?  It seems like a register pressure pessimization, if the
> compiler can't figure out that the timer_stats code can't change them.
>
> The cache line containing this timer was already prefetched when you
> updated its entry.pprev as part of removing the previous entry from
> the list.
>
> I see why you want to fetch them with the lock held in case there's some
> freaky race, but I'd do it all after detach_timer().

Good point; ideally the compiler can move those loads around inside the
lock, but it's unlikely to be _that_ clever.  We could indeed lower
those loads manually to just before the unlock.
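
For reference, the rounding in George's calc_index() can be exercised in
isolation.  The sketch below is standalone and assumes LVL_CLK_SHIFT is
3 (as in the patch series); it is not the kernel code itself:

#include <stdio.h>

#define LVL_CLK_SHIFT	3
#define LVL_BITS	6
#define LVL_SIZE	(1 << LVL_BITS)
#define LVL_MASK	(LVL_SIZE - 1)
#define LVL_OFFS(n)	((n) * LVL_SIZE)
#define LVL_SHIFT(n)	((n) * LVL_CLK_SHIFT)
#define LVL_GRAN(n)	(1 << LVL_SHIFT(n))

static inline unsigned calc_index(unsigned expires, unsigned level)
{
	/* Round up to the next bin of this level's granularity. */
	expires = ((expires - 1) >> LVL_SHIFT(level)) + 1;
	return LVL_OFFS(level) + (expires & LVL_MASK);
}

int main(void)
{
	/*
	 * An expiry of 100 ticks placed at level 1 (granularity 8)
	 * rounds up to slot 13 of that level, i.e. it fires at tick
	 * 104, and the array index is LVL_OFFS(1) + 13 = 77.
	 */
	printf("index = %u\n", calc_index(100, 1));
	return 0;
}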
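
On the clz/fls point, the trade-off can be seen with a standalone
comparison.  The helper names below are made up for this sketch, and
LVL_DEPTH/LVL_START() are modelled on the patch rather than copied from
it:

#include <assert.h>
#include <stdio.h>

#define LVL_CLK_SHIFT	3
#define LVL_BITS	6
#define LVL_DEPTH	8
#define LVL_START(n)	((1UL << LVL_BITS) << (((n) - 1) * LVL_CLK_SHIFT))

/* Level selection as a short branch chain: cheap on every arch. */
static unsigned int level_by_branch(unsigned long delta)
{
	if (delta < LVL_START(1)) return 0;
	if (delta < LVL_START(2)) return 1;
	if (delta < LVL_START(3)) return 2;
	if (delta < LVL_START(4)) return 3;
	if (delta < LVL_START(5)) return 4;
	if (delta < LVL_START(6)) return 5;
	if (delta < LVL_START(7)) return 6;
	return 7;
}

/*
 * Level selection via the highest set bit: one instruction where the
 * hardware has clz/fls, but a software fallback where it does not.
 */
static unsigned int level_by_fls(unsigned long delta)
{
	int bit = delta ? (int)(8 * sizeof(long)) - 1 - __builtin_clzl(delta) : 0;

	if (bit < LVL_BITS)
		return 0;
	bit = (bit - LVL_BITS) / LVL_CLK_SHIFT + 1;
	return bit < LVL_DEPTH ? bit : LVL_DEPTH - 1;
}

int main(void)
{
	unsigned long delta;

	/* Both selectors pick the same level for every delta tested. */
	for (delta = 0; delta < (1UL << 22); delta++)
		assert(level_by_branch(delta) == level_by_fls(delta));
	printf("branch chain and clz variant agree\n");
	return 0;
}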
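
Concretely, lowering the loads as suggested would look roughly like the
fragment below.  It reuses only the identifiers visible in the quoted
hunk; the surrounding expire_timers() loop, locking and the call of the
handler are assumed from the patch and not reproduced here:

		timer = hlist_entry(head->first, struct timer_list, entry);

		timer_stats_account_timer(timer);

		base->running_timer = timer;
		detach_expired_timer(timer);

		/*
		 * Fetch the handler and its argument only now, with the
		 * timer's cache line already hot from the detach and the
		 * lock still held; the unlock and the call follow as in
		 * the patch.
		 */
		fn = timer->function;
		data = timer->data;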