speedup in tag lookup using hash tables

dwarves.vger.kernel.org archive mirror

* speedup in tag lookup using hash tables
@ 2008-02-12  0:03 Arnaldo Carvalho de Melo
From: Arnaldo Carvalho de Melo @ 2008-02-12  0:03 UTC (permalink / raw)
  To: Ilpo Järvinen; +Cc: dwarves-u79uwXL29TY76Z2rM5mHXA

Hi Ilpo,

	Today I optimized the dwarves a bit by using a per object file
hash table for tag lookup; it yielded almost 50% speedup on pahole when
running on a vmlinux file :-)
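
For readers who want to see the idea, here is a minimal sketch of a
per-object-file hash table for tag lookup. It is not the dwarves code:
the names Tag, TagTable, HASH_BITS and find_linear are hypothetical, and
keying the table by the tag's offset is an assumption made for
illustration only.

```python
from dataclasses import dataclass

HASH_BITS = 4  # assumption: 16 buckets, just enough for a demonstration


@dataclass
class Tag:
    offset: int  # assumed key: the tag's offset inside its object file
    name: str


class TagTable:
    """One table per object file, mapping a tag offset to its Tag."""

    def __init__(self, bits=HASH_BITS):
        self.mask = (1 << bits) - 1
        self.buckets = [[] for _ in range(1 << bits)]

    def _bucket(self, offset):
        # The low bits of the offset select the bucket.
        return self.buckets[offset & self.mask]

    def add(self, tag):
        self._bucket(tag.offset).append(tag)

    def find(self, offset):
        # Only the tags that share a bucket are scanned.
        for tag in self._bucket(offset):
            if tag.offset == offset:
                return tag
        return None


def find_linear(tags, offset):
    """The slow path being replaced: scan every tag of the object file."""
    for tag in tags:
        if tag.offset == offset:
            return tag
    return None
```

In Python a plain dict would do the same job; the explicit buckets are
only there to show why a lookup no longer has to walk the whole tag list.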

	codiff doesn't make that many tag lookups, so we didn't get
much improvement there, but I'm doing experiments on dead tag elimination
that will probably help a lot there, but for that I first have to grok
Ulrich Drepper's libdisasm to find out which tags are really
used by looking at accesses to register indexed memory areas that use as
a base pointer what is in local/global variables and function
parameters.

	This also will provide the basis for detecting access patterns
that will ultimately allow libdwarves_reorganize to do struct
reorganizations to improve locality of reference, etc.

	Please take a look at v1.6 that I pushed today and tell me your
impressions.

Regards,

- Arnaldo


* Re: speedup in tag lookup using hash tables
@ 2008-02-12 11:24   ` Ilpo Järvinen
From: Ilpo Järvinen @ 2008-02-12 11:24 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo; +Cc: dwarves-u79uwXL29TY76Z2rM5mHXA

On Mon, 11 Feb 2008, Arnaldo Carvalho de Melo wrote:

> 	Today I optimized the dwarves a bit by using a per object file
> hash table for tag lookup; it yielded almost 50% speedup on pahole when
> running on a vmlinux file :-)
>
> 	codiff doesn't make that many tag lookups, so we didn't get
> much improvement there,

Because I had to do these file by file anyway, I ended up using modified 
timestamps in my scripts to avoid loading .o files altogether when one 
wasn't recompiled. It helped some.
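
A rough sketch of that timestamp check, for illustration only. The
function name changed_objects and the last_seen dictionary are
hypothetical; the actual scripts are not shown in this thread.

```python
import os


def changed_objects(paths, last_seen):
    """Return the paths whose modification time differs from last_seen.

    last_seen maps a path to the st_mtime_ns recorded on the previous
    run and is updated in place, so unchanged files are skipped next time.
    """
    changed = []
    for path in paths:
        mtime = os.stat(path).st_mtime_ns
        if last_seen.get(path) != mtime:
            changed.append(path)
            last_seen[path] = mtime
    return changed
```

Only the files returned by changed_objects() would then be handed to the
expensive step; persisting last_seen between runs is left out here.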

> but I'm doing experiments on dead tag elimination
> that will probably help a lot there, but for that I first have to grok
> Ulrich Drepper's libdisasm to find out which tags are really
> used by looking at accesses to register indexed memory areas that use as
> a base pointer what is in local/global variables and function
> parameters.
>
> 	This also will provide the basis for detecting access patterns
> that will ultimately allow libdwarves_reorganize to do struct
> reorganizations to improve locality of reference, etc.

This sounds nice, I was thinking something similar earlier but would have 
tried to do that with some kind of source analysis.

> 	Please take a look at v1.6 that I pushed today and tell me your
> impressions.

Thanks, I'll check that later on. My bruteforce inline remover run over 
all include/ stuff just finished yesterday after ~1.5 days (ie. ~5800 
inlines compile tested on ~90 slaves, next time I'll have to plug ccache 
into that site-specific distribute thing I use to speed it up a lot :-)), 
I'll post the results later on to lkml+netdev with some patches and 
thoughts, the winner seems to be this beauty :-) :
   -110805  869 funcs, 198 +, 111003 -, diff: -110805 --- skb_put
...And 22 others are in the 10000+ category and 235 in 1kB+ (some overlap 
exists due to __-funcs). What still remains to do are the non-include/ 
headers which might reveal some candies too.
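
As a reading aid for the result line above, here is a small sketch that
parses it and checks the arithmetic. Reading the fields as total delta,
number of functions, bytes added, bytes removed, diff and symbol name is
an assumption, and parse_result is a hypothetical helper, not part of any
tool mentioned here.

```python
import re

# Assumed layout: "<total>  <n> funcs, <added> +, <removed> -, diff: <diff> --- <name>"
LINE_RE = re.compile(
    r"\s*(-?\d+)\s+(\d+) funcs, (\d+) \+, (\d+) -, diff: (-?\d+) --- (\S+)"
)


def parse_result(line):
    """Split one result line into named integer fields plus the symbol name."""
    m = LINE_RE.match(line)
    if m is None:
        raise ValueError(f"unrecognized result line: {line!r}")
    total, funcs, added, removed, diff = (int(g) for g in m.groups()[:5])
    return {
        "total": total,
        "funcs": funcs,
        "added": added,
        "removed": removed,
        "diff": diff,
        "name": m.group(6),
    }
```

Under that reading, added minus removed gives the diff: 198 - 111003 =
-110805 for skb_put.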


-- 
 i.


* Re: speedup in tag lookup using hash tables
@ 2008-02-12 12:50       ` Arnaldo Carvalho de Melo
From: Arnaldo Carvalho de Melo @ 2008-02-12 12:50 UTC (permalink / raw)
  To: Ilpo Järvinen; +Cc: dwarves-u79uwXL29TY76Z2rM5mHXA

On Tue, Feb 12, 2008 at 01:24:44PM +0200, Ilpo Järvinen wrote:
> On Mon, 11 Feb 2008, Arnaldo Carvalho de Melo wrote:
> 
> > 	Today I optimized the dwarves a bit by using a per object file
> > hash table for tag lookup; it yielded almost 50% speedup on pahole when
> > running on a vmlinux file :-)
> >
> > 	codiff doesn't make that many tag lookups, so we didn't get
> > much improvement there,
> 
> Because I had to do these file by file anyway, I ended up using modified 
> timestamps in my scripts to avoid loading .o files altogether when one 
> wasn't recompiled. It helped some.

This reminds me that I have to add support for comparing build trees, so
that we can use 'make O=old' to build a baseline and then 'make O=new',
using ccache to reuse the old (or simply copying old to new) and then
process file after file, combining the results for tree-wide
comparisons/statistics.
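
A sketch of how such a tree-wide comparison could be driven, assuming two
build directories produced with 'make O=old' and 'make O=new'. The
functions object_files and compare_trees, and the diff_one callback, are
hypothetical; this support does not exist yet and the sketch does not
call any dwarves tool.

```python
import os


def object_files(root):
    """Map each .o path, relative to root, to its full path."""
    found = {}
    for dirpath, _dirs, files in os.walk(root):
        for name in files:
            if name.endswith(".o"):
                full = os.path.join(dirpath, name)
                found[os.path.relpath(full, root)] = full
    return found


def compare_trees(old_root, new_root, diff_one):
    """Run diff_one(old_path, new_path) on every object file present in
    both trees and sum the per-file results into one tree-wide number."""
    old = object_files(old_root)
    new = object_files(new_root)
    total = 0
    for rel in sorted(old.keys() & new.keys()):
        total += diff_one(old[rel], new[rel])
    return total
```

Here diff_one stands in for whatever per-file comparison is wanted, for
example a wrapper that runs codiff on the pair and extracts a size delta.
Files that exist in only one tree are ignored in this sketch.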
 
> > but I'm doing experiments on dead tag elimination
> > that will probably help a lot there, but for that I first have to grok
> > Ulrich Drepper's libdisasm to find out which tags are really
> > used by looking at accesses to register indexed memory areas that use as
> > a base pointer what is in local/global variables and function
> > parameters.
> >
> > 	This also will provide the basis for detecting access patterns
> > that will ultimately allow libdwarves_reorganize to do struct
> > reorganizations to improve locality of reference, etc.
> 
> This sounds nice, I was thinking something similar earlier but would have 
> tried to do that with some kind of source analysis.

That is possible too, and indeed I did experiments in the past with
sparse, the library used by the kernel checker (make C=[12]), but I
think that tapping into the readily available debuginfo packages, not
just for the kernel, and looking at the end result, the object file, is
better.
 
> > 	Please take a look at v1.6 that I pushed today and tell me your
> > impressions.
> 
> Thanks, I'll check that later on. My bruteforce inline remover run over 
> all include/ stuff just finished yesterday after ~1.5 days (ie. ~5800 
> inlines compile tested on ~90 slaves, next time I'll have to plug ccache 
> into that site-specific distribute thing I use to speed it up a lot :-)), 
> I'll post the results later on to lkml+netdev with some patches and 
> thoughts, the winner seems to be this beauty :-) :
>    -110805  869 funcs, 198 +, 111003 -, diff: -110805 --- skb_put
> ...And 22 others are in the 10000+ category and 235 in 1kB+ (some overlap 
> exists due to __-funcs). What still remains to do are the non-include/ 
> headers which might reveal some candies too.

Excellent! Some people at some point noted that while the dwarves are
useful for showing the problems, people actually had to follow up with
patches to solve the problems, I'm very happy that you are doing just
that, thanks a lot for that.

- Arnaldo


* Re: speedup in tag lookup using hash tables
@ 2008-02-12 13:24           ` Ilpo Järvinen
From: Ilpo Järvinen @ 2008-02-12 13:24 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo; +Cc: dwarves-u79uwXL29TY76Z2rM5mHXA

On Tue, 12 Feb 2008, Arnaldo Carvalho de Melo wrote:

> On Tue, Feb 12, 2008 at 01:24:44PM +0200, Ilpo Järvinen wrote:
> > On Mon, 11 Feb 2008, Arnaldo Carvalho de Melo wrote:
> > >
> > > 	codiff doesn't make that many tag lookups, so we didn't get
> > > much improvement there,
> > 
> > Because I had to do these file by file anyway, I ended up using modified 
> > timestamps in my scripts to avoid loading .o files altogether when one 
> > wasn't recompiled. It helped some.
> 
> This reminds me that I have to add support for comparing build trees, so
> that we can use 'make O=old' to build a baseline and then 'make O=new',
> using ccache to reuse the old (or simply copying old to new) and then
> process file after file, combining the results for tree-wide
> comparisons/statistics.

Hmm, maybe I could use the copying too to cut the times down a bit more. 
That hadn't occurred to me before, thanks for the tip. :-)

> > > but I'm doing experiments on dead tag elimination
> > > that will probably help a lot there, but for that I first have to grok
> > > Ulrich Drepper's libdisasm to find out which tags are really
> > > used by looking at accesses to register indexed memory areas that use as
> > > a base pointer what is in local/global variables and function
> > > parameters.
> > >
> > > 	This also will provide the basis for detecting access patterns
> > > that will ultimately allow libdwarves_reorganize to do struct
> > > reorganizations to improve locality of reference, etc.
> > 
> > This sounds nice, I was thinking something similar earlier but would have 
> > tried to do that with some kind of source analysis.
> 
> That is possible too, and indeed I did experiments in the past with
> sparse, the library used by the kernel checker (make C=[12]), but I
> think that tapping into the readily available debuginfo packages, not
> just for the kernel, and looking at the end result, the object file, is
> better.

I also briefly read some sparse code but never really got
started... :-) Besides, with source it easily gets nasty with all 
the complexity of C; machine code is much more straightforward. In one 
respect it's also much better to avoid using source code: the compiled 
form represents the end result of the thing we're trying to optimize, 
while source code lacks the details added by the optimizations (mainly 
reordering & dead code elimination that is possible on some paths).

> Excellent! Some people at some point noted that while the dwarves are
> useful for showing the problems, people actually had to follow up with
> patches to solve the problems, I'm very happy that you are doing just
> that, thanks a lot for that.

I even wrote a simple shell script to automate the dirty work in a simple
static inline from .h -> .c move; in the long run it definitely pays off 
when you don't need to open all of them in an editor and do it by 
hand but can just do git-diff and check that the result is what one 
expects... :-) Of course it's not always that trivial a task, e.g., 
libification of jhash is not a pure fire-and-forget type of move but 
needs some thought, and multiple options need to be compared.

-- 
 i.
