Date: Wed, 9 Apr 2014 18:48:53 +0200
From: Andi Kleen
To: Peter Zijlstra
Cc: Andy Lutomirski, David Ahern, Andi Kleen, Stephane Eranian, "Yan, Zheng", LKML, Ingo Molnar, Arnaldo Carvalho de Melo
Subject: Re: [PATCH v3 00/14] perf, x86: Haswell LBR call stack support
Message-ID: <20140409164852.GA22728@two.firstfloor.org>
References: <530D53EF.9090706@amacapital.net> <20140226185513.GL22728@two.firstfloor.org> <530E3E47.8010205@gmail.com> <530E4B42.5090401@gmail.com> <20140409114857.GT11096@twins.programming.kicks-ass.net>
In-Reply-To: <20140409114857.GT11096@twins.programming.kicks-ass.net>
X-Mailing-List: linux-kernel@vger.kernel.org

On Wed, Apr 09, 2014 at 01:48:57PM +0200, Peter Zijlstra wrote:
> On Wed, Feb 26, 2014 at 12:26:43PM -0800, Andy Lutomirski wrote:
> > Speed. FPO saves one register (a big deal on x86_32; not so important
> > on x86_64) but also saves a few cycles on function entry and exit,
> > which is a bigger deal for small functions.
>
> So I thought that LTO was supposed to get rid of a lot of the small
> function and inline them.

It does that when it can (no indirect calls), when it thinks inlining is
profitable, and when it won't increase code size too much.

>
> I've also heard that in practise this is very 'hard', and thus we're
> still stuck with a gazillion small functions (mostly C++ people suffer
> from this).

They need devirtualization, which we cannot currently do in the kernel.
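
To make the indirect-call point concrete, here is a minimal sketch in plain
C. All names (add_one, counter_ops, bump_direct, bump_indirect,
get_default_ops) are hypothetical and not taken from the kernel; whether a
given compiler actually inlines either call depends on its heuristics and
options.

```c
/* Hypothetical names throughout; this is an illustration, not kernel code. */

/* Small helper: a direct call to this is an easy inlining candidate. */
static int add_one(int x)
{
	return x + 1;
}

/* Kernel-style "ops" table: callers reach the helper through a pointer. */
struct counter_ops {
	int (*bump)(int);
};

static const struct counter_ops default_ops = { .bump = add_one };

/* Direct call: the callee is known at compile (or link) time. */
int bump_direct(int x)
{
	return add_one(x);
}

/*
 * Indirect call: the compiler only sees "some function with this
 * signature". Unless it can prove which function ops->bump points to
 * (devirtualization), it has to emit a real call.
 */
int bump_indirect(const struct counter_ops *ops, int x)
{
	return ops->bump(x);
}

/* Accessor so that callers can obtain the example table. */
const struct counter_ops *get_default_ops(void)
{
	return &default_ops;
}
```

Both bump_direct(41) and bump_indirect(get_default_ops(), 41) compute the
same value; the difference is only in what the optimizer is able to see at
the call site.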
>
> Can anybody give a concise explanation on why LTO doesn't rid us of
> these small functions or point to a web resource that describes the
> problem?

It depends on the code, of course. On one of my LTO builds I have ~10% fewer
functions in System.map. Actual results will of course vary with the config.

-Andi

-- 
ak@linux.intel.com -- Speaking for myself only.