All of lore.kernel.org
 help / color / mirror / Atom feed
From: Andy Lutomirski <luto@kernel.org>
To: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>,
	Ingo Molnar <mingo@kernel.org>, X86 ML <x86@kernel.org>,
	"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>,
	live-patching@vger.kernel.org,
	Linus Torvalds <torvalds@linux-foundation.org>,
	Andy Lutomirski <luto@kernel.org>, Jiri Slaby <jslaby@suse.cz>,
	"H. Peter Anvin" <hpa@zytor.com>
Subject: Re: [RFC PATCH 00/10] x86: undwarf unwinder
Date: Thu, 1 Jun 2017 06:50:08 -0700	[thread overview]
Message-ID: <CALCETrWziRb2+rEPAMSBne+ao6CbFSmoBumgMy0GTULko5FrHw@mail.gmail.com> (raw)
In-Reply-To: <20170601124705.gw5snmcsetsrhw24@treble>

On Thu, Jun 1, 2017 at 5:47 AM, Josh Poimboeuf <jpoimboe@redhat.com> wrote:
> On Thu, Jun 01, 2017 at 02:17:21PM +0200, Peter Zijlstra wrote:
>> On Thu, Jun 01, 2017 at 06:58:20AM -0500, Josh Poimboeuf wrote:
>> > > Being able to generate more optimal code in the hottest code paths of the kernel
>> > > is the _real_, primary upstream kernel benefit of a different debuginfo method -
>> > > which has to be weighed against the pain of introducing a new unwinder. But this
>> > > submission does not talk about that aspect at all, which should be fixed I think.
>> >
>> > Actually I devoted an entire one-sentence paragraph to performance in
>> > the documentation:
>> >
>> >   The simpler debuginfo format also enables the unwinder to be relatively
>> >   fast, which is important for perf and lockdep.
>> >
>> > But I'll try to highlight that a little more.
>>
>> That's relative to a DWARF unwinder.
>
> Yes.
>
>> It doesn't appear to be possible to get anywhere near a frame-pointer
>> unwinder due to having to do this log(n) lookup for every single
>> frame.
>
> Hm, is there something faster, yet not substantially bigger?  Hash?
> Trie?

You have, roughly, a set of (key_start, value) pairs where, for any
given key, you want to find the (key_start, value) with the largest
key_start that doesn't exceed key.  Binary search gives you log_2(n)
queries, but its locality of reference sucks.  Here are two
suggestions for improving it:

1. Change the data layout.  Instead of having an array of undwarf
entries, have two parallel arrays, one with the ip addresses and one
with everything else.  This has no effect on the amount of space used,
but it makes the part used during search more compact.

2. Your key space is fairly small and your table entries should be
reasonably uniformly distributed.  Let the first IP you have unwind
data for be IP0.  Make an array mapping (IP - IP0) / B to the index of
the unwind entry for that IP for some suitable block size B.  Then, to
look up an IP, you'd find the indices of the unwind entries for (IP -
IP0) / B and (IP - IP0) / B + 1 and binary search between them.  With
constant B, this gives you O(1) performance instead of O(log(n)).
With B = 1, it's very fast, but the table is huge.  With B = 64k or
so, maybe you'd get a nice tradeoff of speedup vs size.  (With
modules, you'd presumably first search an rbtree to find which
instance of this data structure you're using and then do the lookup.)

3. Use a B-tree.  B-trees are simple if you don't need to deal with
insertion and deletion.  Presumably you'd choose your internal node
size so each internal node is exactly 64 or 128 bytes for good cache
performance.  This is still O(log(n)) and it uses more comparisons
than binary search, but you touch many fewer cache lines.

I expect that, if you do #1 and #2, you'd get excellent performance at
very little cost to the complexity of the code.

  parent reply	other threads:[~2017-06-01 13:50 UTC|newest]

Thread overview: 55+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2017-06-01  5:44 [RFC PATCH 00/10] x86: undwarf unwinder Josh Poimboeuf
2017-06-01  5:44 ` [RFC PATCH 01/10] objtool: move checking code to check.c Josh Poimboeuf
2017-06-14  7:22   ` Jiri Slaby
2017-06-01  5:44 ` [RFC PATCH 02/10] objtool, x86: add several functions and files to the objtool whitelist Josh Poimboeuf
2017-06-14  7:24   ` Jiri Slaby
2017-06-14 13:03     ` Josh Poimboeuf
2017-06-01  5:44 ` [RFC PATCH 03/10] objtool: stack validation 2.0 Josh Poimboeuf
2017-06-01  5:44 ` [RFC PATCH 04/10] objtool: add undwarf debuginfo generation Josh Poimboeuf
2017-06-14  8:42   ` Jiri Slaby
2017-06-14 13:27     ` Josh Poimboeuf
2017-06-22  7:47       ` Jiri Slaby
2017-06-22 12:49         ` Josh Poimboeuf
2017-06-01  5:44 ` [RFC PATCH 05/10] objtool, x86: add facility for asm code to provide CFI hints Josh Poimboeuf
2017-06-01 13:57   ` Andy Lutomirski
2017-06-01 14:16     ` Josh Poimboeuf
2017-06-01 14:40       ` Andy Lutomirski
2017-06-01 15:02         ` Josh Poimboeuf
2017-06-01  5:44 ` [RFC PATCH 06/10] x86/entry: add CFI hint undwarf annotations Josh Poimboeuf
2017-06-01 14:03   ` Andy Lutomirski
2017-06-01 14:23     ` Josh Poimboeuf
2017-06-01 14:28       ` Josh Poimboeuf
2017-06-01 14:39         ` Andy Lutomirski
2017-06-01 15:01           ` Josh Poimboeuf
2017-06-01  5:44 ` [RFC PATCH 07/10] x86/asm: add CFI hint annotations to sync_core() Josh Poimboeuf
2017-06-01  5:44 ` [RFC PATCH 08/10] extable: rename 'sortextable' script to 'sorttable' Josh Poimboeuf
2017-06-01  5:44 ` [RFC PATCH 09/10] extable: add undwarf table sorting ability to sorttable script Josh Poimboeuf
2017-06-01  5:44 ` [RFC PATCH 10/10] x86/unwind: add undwarf unwinder Josh Poimboeuf
2017-06-01 11:05   ` Peter Zijlstra
2017-06-01 12:26     ` Josh Poimboeuf
2017-06-01 12:47       ` Jiri Slaby
2017-06-01 13:02         ` Josh Poimboeuf
2017-06-01 13:42         ` Peter Zijlstra
2017-06-01 13:10       ` Peter Zijlstra
2017-06-01 12:13   ` Peter Zijlstra
2017-06-01 12:36     ` Josh Poimboeuf
2017-06-01 13:12       ` Peter Zijlstra
2017-06-01 15:03         ` Josh Poimboeuf
2017-06-14 11:45   ` Jiri Slaby
2017-06-14 13:44     ` Josh Poimboeuf
2017-06-01  6:08 ` [RFC PATCH 00/10] x86: " Ingo Molnar
2017-06-01 11:58   ` Josh Poimboeuf
2017-06-01 12:17     ` Peter Zijlstra
2017-06-01 12:33       ` Jiri Slaby
2017-06-01 12:52         ` Josh Poimboeuf
2017-06-01 12:57           ` Jiri Slaby
2017-06-01 12:47       ` Josh Poimboeuf
2017-06-01 13:25         ` Peter Zijlstra
2017-06-06 14:14           ` Sergey Senozhatsky
2017-06-01 13:50         ` Andy Lutomirski [this message]
2017-06-01 13:50     ` Ingo Molnar
2017-06-01 13:58       ` Jiri Slaby
2017-06-02  8:30         ` Jiri Slaby
2017-06-01 14:05       ` Josh Poimboeuf
2017-06-01 14:08       ` Jiri Slaby
2017-06-02 10:40         ` Mel Gorman

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=CALCETrWziRb2+rEPAMSBne+ao6CbFSmoBumgMy0GTULko5FrHw@mail.gmail.com \
    --to=luto@kernel.org \
    --cc=hpa@zytor.com \
    --cc=jpoimboe@redhat.com \
    --cc=jslaby@suse.cz \
    --cc=linux-kernel@vger.kernel.org \
    --cc=live-patching@vger.kernel.org \
    --cc=mingo@kernel.org \
    --cc=peterz@infradead.org \
    --cc=torvalds@linux-foundation.org \
    --cc=x86@kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.