From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path: <linux-kernel-owner@vger.kernel.org>
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
        id S1751202AbdFAMd0 (ORCPT <rfc822;w@1wt.eu>);
        Thu, 1 Jun 2017 08:33:26 -0400
Received: from mail-wm0-f66.google.com ([74.125.82.66]:34303 "EHLO
        mail-wm0-f66.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org
        with ESMTP id S1751089AbdFAMdY (ORCPT
        <rfc822;linux-kernel@vger.kernel.org>);
        Thu, 1 Jun 2017 08:33:24 -0400
Subject: Re: [RFC PATCH 00/10] x86: undwarf unwinder
To: Peter Zijlstra <peterz@infradead.org>,
        Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Ingo Molnar <mingo@kernel.org>, x86@kernel.org,
        linux-kernel@vger.kernel.org, live-patching@vger.kernel.org,
        Linus Torvalds <torvalds@linux-foundation.org>,
        Andy Lutomirski <luto@kernel.org>, "H. Peter Anvin" <hpa@zytor.com>
References: <cover.1496293620.git.jpoimboe@redhat.com>
 <20170601060824.wv2go3adbvx5ptmt@gmail.com>
 <20170601115819.3twoowcnvtrfzjzr@treble>
 <20170601121721.lezoecnyah3aic6a@hirez.programming.kicks-ass.net>
From: Jiri Slaby <jslaby@suse.cz>
Message-ID: <d2ca5435-6386-29b8-db87-7f227c2b713a@suse.cz>
Date: Thu, 1 Jun 2017 14:33:20 +0200
User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:52.0) Gecko/20100101
 Thunderbird/52.1.1
MIME-Version: 1.0
In-Reply-To: <20170601121721.lezoecnyah3aic6a@hirez.programming.kicks-ass.net>
Content-Type: text/plain; charset=iso-8859-2
Content-Language: en-GB
Content-Transfer-Encoding: 7bit
Sender: linux-kernel-owner@vger.kernel.org
List-ID: <linux-kernel.vger.kernel.org>
X-Mailing-List: linux-kernel@vger.kernel.org

On 06/01/2017, 02:17 PM, Peter Zijlstra wrote:
> On Thu, Jun 01, 2017 at 06:58:20AM -0500, Josh Poimboeuf wrote:
>>> Being able to generate more optimal code in the hottest code paths of the kernel 
>>> is the _real_, primary upstream kernel benefit of a different debuginfo method - 
>>> which has to be weighed against the pain of introducing a new unwinder. But this 
>>> submission does not talk about that aspect at all, which should be fixed I think.
>>
>> Actually I devoted an entire one-sentence paragraph to performance in
>> the documentation:
>>
>>   The simpler debuginfo format also enables the unwinder to be relatively
>>   fast, which is important for perf and lockdep.
>>
>> But I'll try to highlight that a little more.
> 
> That's relative to a DWARF unwinder. It doesn't appear to be possible to
> get anywhere near a frame-pointer unwinder due to having to do this
> log(n) lookup for every single frame.

By a quick measurement (20000 calls to save_stack_trace via a single
vfs_write), this is ~20 times faster than my DWARF unwinder.

perf profile, if you care:

__save_stack_trace
|
|--65.89%--unwind_next_frame
|          |
|          |--53.64%--__undwarf_lookup
|          |
|           --5.30%--deref_stack_reg
|                     |
|                      --2.32%--stack_access_ok
|
|--24.17%--__unwind_start
|          |
|          |--21.52%--unwind_next_frame
|          |          |
|          |          |--14.24%--__undwarf_lookup
|          |          |
|          |           --2.98%--deref_stack_reg
|          |                     |
|          |                      --1.32%--stack_access_ok
|          |
|           --1.32%--get_stack_info
|                     |
|                      --0.66%--in_task_stack
|
|--3.31%--unwind_get_return_address
|          __kernel_text_address
|          |
|          |--0.99%--is_ftrace_trampoline
|          |
|          |--0.99%--__is_insn_slot_addr
|          |          |
|          |           --0.66%--__rcu_read_unlock
|          |
|           --0.66%--is_bpf_text_address
|
 --1.66%--save_stack_address
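
For reference, the log(n) lookup per frame that Peter mentions, and that
shows up above as __undwarf_lookup, is essentially a binary search over a
table sorted by instruction address. A minimal userspace sketch, assuming
a made-up entry layout (struct unwind_entry and lookup_entry are
hypothetical names for illustration, not the code from the patch set):

```c
#include <stddef.h>

/*
 * Hypothetical table entry: the start address of a code region plus the
 * information needed to find the caller's frame from inside that region.
 * This is NOT the layout used by the patch set.
 */
struct unwind_entry {
	unsigned long start;	/* first instruction address covered */
	int cfa_offset;		/* offset from the stack pointer to the frame base */
};

/*
 * Return the last entry whose start address is <= ip, or NULL if ip lies
 * before the first entry. The table must be sorted by start address.
 * Each call costs O(log n) comparisons, and it runs once per frame.
 */
static const struct unwind_entry *
lookup_entry(const struct unwind_entry *table, size_t n, unsigned long ip)
{
	const struct unwind_entry *found = NULL;
	size_t lo = 0, hi = n;

	while (lo < hi) {
		size_t mid = lo + (hi - lo) / 2;

		if (table[mid].start <= ip) {
			/* Candidate match; a later entry may still cover ip. */
			found = &table[mid];
			lo = mid + 1;
		} else {
			hi = mid;
		}
	}
	return found;
}
```

A frame-pointer unwinder skips this step entirely and just follows the
saved frame pointer, which is why the lookup dominates the profile here.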


-- 
js
suse labs