From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
	id S1751202AbdFAMd0 (ORCPT ); Thu, 1 Jun 2017 08:33:26 -0400
Received: from mail-wm0-f66.google.com ([74.125.82.66]:34303 "EHLO
	mail-wm0-f66.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org
	with ESMTP id S1751089AbdFAMdY (ORCPT ); Thu, 1 Jun 2017 08:33:24 -0400
Subject: Re: [RFC PATCH 00/10] x86: undwarf unwinder
To: Peter Zijlstra, Josh Poimboeuf
Cc: Ingo Molnar, x86@kernel.org, linux-kernel@vger.kernel.org,
	live-patching@vger.kernel.org, Linus Torvalds, Andy Lutomirski,
	"H. Peter Anvin"
References: <20170601060824.wv2go3adbvx5ptmt@gmail.com>
	<20170601115819.3twoowcnvtrfzjzr@treble>
	<20170601121721.lezoecnyah3aic6a@hirez.programming.kicks-ass.net>
From: Jiri Slaby
Message-ID:
Date: Thu, 1 Jun 2017 14:33:20 +0200
User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:52.0) Gecko/20100101
	Thunderbird/52.1.1
MIME-Version: 1.0
In-Reply-To: <20170601121721.lezoecnyah3aic6a@hirez.programming.kicks-ass.net>
Content-Type: text/plain; charset=iso-8859-2
Content-Language: en-GB
Content-Transfer-Encoding: 7bit
Sender: linux-kernel-owner@vger.kernel.org
List-ID:
X-Mailing-List: linux-kernel@vger.kernel.org

On 06/01/2017, 02:17 PM, Peter Zijlstra wrote:
> On Thu, Jun 01, 2017 at 06:58:20AM -0500, Josh Poimboeuf wrote:
>>> Being able to generate more optimal code in the hottest code paths of the kernel
>>> is the _real_, primary upstream kernel benefit of a different debuginfo method -
>>> which has to be weighed against the pain of introducing a new unwinder. But this
>>> submission does not talk about that aspect at all, which should be fixed I think.
>>
>> Actually I devoted an entire one-sentence paragraph to performance in
>> the documentation:
>>
>>   The simpler debuginfo format also enables the unwinder to be relatively
>>   fast, which is important for perf and lockdep.
>>
>> But I'll try to highlight that a little more.
>
> That's relative to a DWARF unwinder. It doesn't appear to be possible to
> get anywhere near a frame-pointer unwinder due to having to do this
> log(n) lookup for every single frame.

This is ~ 20 times faster than my DWARF unwinder by a quick measurement
(20000 calls to save_stack_trace via single vfs_write).

perf profile, if you care:
__save_stack_trace
|
|--65.89%--unwind_next_frame
|          |
|          |--53.64%--__undwarf_lookup
|          |
|           --5.30%--deref_stack_reg
|                     |
|                      --2.32%--stack_access_ok
|
|--24.17%--__unwind_start
|          |
|          |--21.52%--unwind_next_frame
|          |          |
|          |          |--14.24%--__undwarf_lookup
|          |          |
|          |           --2.98%--deref_stack_reg
|          |                     |
|          |                      --1.32%--stack_access_ok
|          |
|           --1.32%--get_stack_info
|                     |
|                      --0.66%--in_task_stack
|
|--3.31%--unwind_get_return_address
|          __kernel_text_address
|          |
|          |--0.99%--is_ftrace_trampoline
|          |
|          |--0.99%--__is_insn_slot_addr
|          |          |
|          |           --0.66%--__rcu_read_unlock
|          |
|           --0.66%--is_bpf_text_address
|
 --1.66%--save_stack_address

-- 
js
suse labs
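
For context on the log(n) lookup discussed above: the per-frame cost comes
from finding the table entry that covers the current instruction address.
Below is a minimal sketch of such a lookup, assuming a table sorted by
ascending address. The entry layout and the names sketch_entry and
sketch_lookup are hypothetical and are not taken from the patch set; the
actual __undwarf_lookup may be organised differently.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical table entry; NOT the layout used by the patch set. */
struct sketch_entry {
	uint64_t ip;       /* first instruction address this entry covers */
	int16_t  cfa_off;  /* illustrative payload: CFA offset from the stack pointer */
};

/*
 * Return the last entry whose ip is <= addr, or NULL if addr lies before
 * the first entry (or the table is empty). The table must be sorted by
 * ascending ip. Each call halves the search window, so it takes O(log n)
 * steps.
 */
static const struct sketch_entry *
sketch_lookup(const struct sketch_entry *table, size_t n, uint64_t addr)
{
	const struct sketch_entry *found = NULL;
	size_t lo = 0, hi = n;  /* search window is [lo, hi) */

	while (lo < hi) {
		size_t mid = lo + (hi - lo) / 2;

		if (table[mid].ip <= addr) {
			found = &table[mid];  /* candidate; try to find a later one */
			lo = mid + 1;
		} else {
			hi = mid;
		}
	}
	return found;
}
```

An unwinder built this way repeats the search once per frame, whereas a
frame-pointer unwinder only follows the saved frame-pointer chain, which is
the gap Peter points out.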