Date: Fri, 22 Jun 2012 21:06:32 +0200
From: Hagen Paul Pfeifer
To: Linus Torvalds
Cc: Ingo Molnar, Steven Rostedt, linux-kernel@vger.kernel.org, Peter Zijlstra, Arnaldo Carvalho de Melo, Thomas Gleixner, Andrew Morton
Subject: Re: [GIT PULL] perf fixes
Message-ID: <20120622190632.GB8014@virgo.local>
References: <20120622133650.GA24136@gmail.com> <20120622183827.GA8014@virgo.local>
X-Mailing-List: linux-kernel@vger.kernel.org

* Linus Torvalds | 2012-06-22 11:52:43 [-0700]:

>So even stubbed out, it's quite noticeable. The call causes the
>function prologue to change quite a bit.
>
>That's actually especially true with newer versions of gcc that
>*finally* seem to have done the "don't always generate the full
>prologue if some case doesn't need it" optimization. So functions that
>have early-out conditions (quite common) will exit before even having
>done the prologue, and without doing the whole frame pointer setup
>etc.
>
>Except if mcount generation is on. Then gcc will always do the
>prologue and frame pointer setup before doing the mcount, because
>mcount wants it.
>
>So it really isn't just the extra call instruction.
>
>I may be more sensitive to this than most, because I look at profiles
>and the function prologue just looks very ugly with the call mcount
>thing. Ugh.

Yes, ugh. Even Steven's -mfentry replacement does not change things here.
Maybe this performance problem should be addressed on another level:
distributors should build their kernels with ftrace support disabled. Kprobes
and function tracing should be sufficient for 98% of their customers.

I see no compiler (gcc) way around this performance issue. Maybe you/Steven
should restart a G+ poll ... ;)

Hagen
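
The prologue effect described in the quoted text can be checked on a small
test case. What follows is only a minimal sketch and not kernel code: the
names scale and sum_scaled are hypothetical, and the assembly you get depends
on the gcc version and target. Compile it to assembly twice, once with
"gcc -O2 -S" and once with "gcc -O2 -pg -S", and compare what is emitted in
sum_scaled before the early return.

```c
#include <stddef.h>

/*
 * Hypothetical test case, not taken from the kernel.
 *
 * Suggested comparison (output varies by gcc version and target):
 *   gcc -O2 -S early_out.c        (no profiling call)
 *   gcc -O2 -pg -S early_out.c    (profiling call added to each function)
 */

/* Kept out of line so the slow path below contains a real call. */
__attribute__((noinline)) long scale(long x)
{
    return x * 3 + 1;
}

/*
 * Function with an early-out condition: the NULL/zero-length case
 * needs no saved registers and no stack frame of its own, while the
 * loop does, because it keeps values live across the call to scale().
 */
__attribute__((noinline)) long sum_scaled(const long *v, size_t n)
{
    size_t i;
    long total = 0;

    if (v == NULL || n == 0)
        return 0;               /* early out */

    for (i = 0; i < n; i++)
        total += scale(v[i]);

    return total;
}
```

On x86, -mfentry can be added to the -pg build for a third comparison. The
point of looking at all of them is to see how much code runs before the
early return in each case, not just whether the call instruction is there.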