Date: Wed, 22 Feb 2012 08:48:39 +0100
From: Ingo Molnar
To: "H. Peter Anvin"
Cc: Jason Baron, a.p.zijlstra@chello.nl, rostedt@goodmis.org,
	mathieu.desnoyers@efficios.com, davem@davemloft.net,
	ddaney.cavm@gmail.com, akpm@linux-foundation.org,
	linux-kernel@vger.kernel.org, Linus Torvalds
Subject: Re: [PATCH 00/10] jump label: introduce very_[un]likely + cleanups + docs
Message-ID: <20120222074839.GA24890@elte.hu>
References: <4F43F9F0.4000605@zytor.com> <20120222065016.GA16923@elte.hu>
	<4F44934B.2000808@zytor.com> <20120222072538.GA17291@elte.hu>
	<4F449ACF.3040807@zytor.com>
In-Reply-To: <4F449ACF.3040807@zytor.com>

* H. Peter Anvin wrote:

> On 02/21/2012 11:25 PM, Ingo Molnar wrote:
> >
> > There is a fundamental asymmetry, and intentionally so. You
> > *really* have to think what the common case is, and make
> > sure the build defaults to that. It's not the end of the
> > world to have it flipped over, but there's costs and those
> > costs are higher even in the branch path than a regular
> > likely()/unlikely().
> >
> No, not really -- it's still an unconditional branch, which
> means you will not tax the branch predictor in any way and
> which can be followed by the front end without taking a
> speculation hit. [...]

You are talking about CPU level costs, I am also talking about
costs introduced at build time.

Fact is, jump-label unlikely branches are moved *out of line*:
they are often in unlikely portions of the function (near other
unlikely branches), with instruction cache granularity costs and
potentially higher instruction-cache miss costs attached, etc.

You are missing three important aspects:

Firstly, instead of:

	ins1
	ins2
	ins3
	ins4
	ins5
	ins-compare
	ins-branch
	ins6
	ins7
	ins8
	ins9
	ins10

We have:

	ins1
	ins2
	ins3
	ins4
	ins5
	ins-jump

	[ hole ]

	ins6
	ins7
	ins8
	ins9
	ins10
	ins-jump back

Where the 'hole' fragments the instruction cache layout. Given
that most of kernel execution is instruction-cache-cold, the
'straightness' of kernel code matters quite a bit.

Secondly, there's build time instruction scheduling costs as
well: GCC will prefer the likely branch over the unlikely one,
so we might see extra instructions in the out-of-line code:

	ins1
	ins2
	ins3
	ins4
	ins5
	ins-jump

	[ hole ]

	ins-extra-1
	ins-extra-2
	ins6
	ins7
	ins8
	ins9
	ins10
	ins-jump back

In that sense jump labels are unlikely() branches combined with
a patching mechanism.

Thus *both* aspects are important: if a branch is *truly* 50/50
then it's quite possibly *NOT* a correct optimization to use
jump-labels as the 'uncommon' code goes through extra hoops and
fragments out of the fastpath, which in quite many real life
cases can outstrip the advantage of the avoidance of a single
branch ...
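
A minimal userspace sketch of the layout argument above, assuming GCC
or Clang. It is not the kernel's jump-label code: a plain flag stands
in for the patched jump/NOP site, and `tracing_enabled`, `slow_path()`,
`process()` and `demo()` are hypothetical names chosen for illustration.
It only shows how marking a branch unlikely() and its target cold asks
the compiler to keep the common path straight and push the uncommon
code out of line. Whether the compiler actually does so depends on the
optimization level and target.

```c
#include <assert.h>
#include <stdbool.h>

/* Same shape as the kernel's hint macros, built on __builtin_expect. */
#define likely(x)   __builtin_expect(!!(x), 1)
#define unlikely(x) __builtin_expect(!!(x), 0)

/* Hypothetical stand-in for a jump-label key. The real mechanism
 * patches the instruction stream instead of testing a variable. */
static bool tracing_enabled;

/* Hypothetical uncommon code. noinline + cold ask the compiler to
 * place it away from the hot path (the "out of line" part). */
static __attribute__((noinline, cold)) long slow_path(long v)
{
	return v + 1000;
}

long process(long v)
{
	v = v * 2;			/* ins1..ins5: straight-line code */
	if (unlikely(tracing_enabled))	/* the branch site */
		v = slow_path(v);	/* uncommon, out-of-line work */
	return v + 1;			/* ins6..ins10 */
}

/* Exercise both directions of the flag. */
void demo(void)
{
	tracing_enabled = false;
	assert(process(3) == 7);	/* 3*2 + 1 */
	tracing_enabled = true;
	assert(process(3) == 1007);	/* 3*2 + 1000 + 1 */
}
```

Comparing the output of `gcc -O2 -S` with and without the unlikely()
and cold annotations is one way to see where the uncommon code ends up.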
Thirdly, even if it's a correct optimization and both branches
happen to outperform the pre-jump-label version, regardless of
the direction of the jump label flag, it's *STILL* fundamentally
asymmetric: due to the hole and due to the possible extra
instructions the out of line code will be slower by a few
instructions and the NOP fall-through will be faster.

This is fundamentally so, and any naming that tries to *hide*
that asymmetry and the associated micro-costs is confused.

Thanks,

	Ingo