* speedup in tag lookup using hash tables
From: Arnaldo Carvalho de Melo @ 2008-02-12 0:03 UTC
To: Ilpo Järvinen; +Cc: dwarves-u79uwXL29TY76Z2rM5mHXA

Hi Ilpo,

Today I optimized the dwarves a bit by using a per object file hash table for tag lookup, it yielded almost 50% speedup on pahole when running on a vmlinux file :-)

codiff doesn't make that many tag lookups, so we didn't get much improvement there, but I'm doing experiments on dead tag elimination that probably will help a lot there, but for that I have first to grok Ulrich Drepper's libdisasm to find out what are the tags that are really used by looking at accesses to register indexed memory areas that use as a base pointer what is in local/global variables and function parameters.

This also will provide the basis for detecting access patterns that will ultimately allow libdwarves_reorganize to do struct reorganizations to improve locality of reference, etc.

Please take a look at v1.6 that I pushed today and tell me your impressions.

Regards,

- Arnaldo
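The per object file hash table mentioned in this message can be illustrated with a minimal sketch. It is written in Python for brevity rather than in the C that the dwarves use, and every name in it (`Tag`, `ObjectFile`, `HASHTAGS_BITS`, `find_linear`, `find_hashed`) is hypothetical. The message does not say how tags are keyed, so using an integer id as the key is an assumption.

```python
from dataclasses import dataclass, field

HASHTAGS_BITS = 8                   # hypothetical: 2**8 buckets per object file
HASHTAGS_SIZE = 1 << HASHTAGS_BITS


@dataclass
class Tag:
    id: int      # assumed key, standing in for whatever identifies a tag
    name: str


@dataclass
class ObjectFile:
    """Holds the tags of one object file plus a bucketed index over them."""
    tags: list = field(default_factory=list)
    buckets: list = field(
        default_factory=lambda: [[] for _ in range(HASHTAGS_SIZE)]
    )

    def add(self, tag):
        self.tags.append(tag)
        self.buckets[tag.id % HASHTAGS_SIZE].append(tag)

    def find_linear(self, tag_id):
        # Baseline lookup: walk every tag in the object file.
        for tag in self.tags:
            if tag.id == tag_id:
                return tag
        return None

    def find_hashed(self, tag_id):
        # Hashed lookup: walk only the tags that share a bucket with tag_id.
        for tag in self.buckets[tag_id % HASHTAGS_SIZE]:
            if tag.id == tag_id:
                return tag
        return None
```

Both lookups return the same tag; the hashed one inspects only one bucket instead of the whole list, which is where a speedup of the kind described would come from when a tool performs many lookups.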
* Re: speedup in tag lookup using hash tables
From: Ilpo Järvinen @ 2008-02-12 11:24 UTC
To: Arnaldo Carvalho de Melo; +Cc: dwarves-u79uwXL29TY76Z2rM5mHXA

On Mon, 11 Feb 2008, Arnaldo Carvalho de Melo wrote:

> Today I optimized the dwarves a bit by using a per object file
> hash table for tag lookup, it yielded almost 50% speedup on pahole when
> running on a vmlinux file :-)
>
> codiff doesn't make that many tag lookups, so we didn't get
> much improvement there,

Because I had to do these file by file anyway, I ended up using modified timestamps in my scripts to avoid loading .o files altogether when one wasn't recompiled. It helped some.

> but I'm doing experiments on dead tag elimination
> that probably will help a lot there, but for that I have first to grok
> Ulrich Drepper's libdisasm to find out what are the tags that are really
> used by looking at accesses to register indexed memory areas that use as
> a base pointer what is in local/global variables and function
> parameters.
>
> This also will provide the basis for detecting access patterns
> that will ultimately allow libdwarves_reorganize to do struct
> reorganizations to improve locality of reference, etc.

This sounds nice, I was thinking something similar earlier but would have tried to do that with some kind of source analysis.

> Please take a look at v1.6 that I pushed today and tell me your
> impressions.

Thanks, I'll check that later on. My bruteforce inline remover run over all include/ stuff just finished yesterday after ~1.5 days (i.e. ~5800 inlines compile tested on ~90 slaves, next time I'll have to plug ccache into that site-specific distribute thing I use to speed it up a lot :-)), I'll post the results later on to lkml+netdev with some patches and thoughts, the winner seems to be this beauty :-) :

-110805 869 funcs, 198 +, 111003 -, diff: -110805 --- skb_put

...And 22 others are in the 10000+ category and 235 in 1kB+ (some overlap exists due to __-funcs). What still remains to do are the non-include/ headers which might reveal some candies too.

--
 i.
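The timestamp trick described in this message, skipping .o files that were not recompiled, could look roughly like the following. This is a sketch only: the original scripts are not shown in the thread, and the function name `changed_objects` and the `seen_mtimes` dictionary are hypothetical.

```python
import os


def changed_objects(paths, seen_mtimes):
    """Return the paths whose modification time differs from the recorded one.

    seen_mtimes maps path -> mtime in nanoseconds from the previous run and is
    updated in place, so a second call with no rebuild returns an empty list.
    """
    changed = []
    for path in paths:
        mtime = os.stat(path).st_mtime_ns
        if seen_mtimes.get(path) != mtime:
            changed.append(path)
            seen_mtimes[path] = mtime
    return changed
```

Only the returned paths would then be handed to the expensive per-file analysis. Persisting `seen_mtimes` between runs (for example in a small file next to the build tree) is left out here.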
* Re: speedup in tag lookup using hash tables
From: Arnaldo Carvalho de Melo @ 2008-02-12 12:50 UTC
To: Ilpo Järvinen; +Cc: dwarves-u79uwXL29TY76Z2rM5mHXA

On Tue, Feb 12, 2008 at 01:24:44PM +0200, Ilpo Järvinen wrote:
> On Mon, 11 Feb 2008, Arnaldo Carvalho de Melo wrote:
>
> > Today I optimized the dwarves a bit by using a per object file
> > hash table for tag lookup, it yielded almost 50% speedup on pahole when
> > running on a vmlinux file :-)
> >
> > codiff doesn't make that many tag lookups, so we didn't get
> > much improvement there,
>
> Because I had to do these file by file anyway, I ended up using modified
> timestamps in my scripts to avoid loading .o files altogether when one
> wasn't recompiled. It helped some.

This reminds me that I have to add support for comparing build trees, so that we can use 'make O=old' to build a baseline and then 'make O=new', using ccache to reuse the old (or simply copying old to new) and then process file after file, combining the results for tree wide comparisons/statistics.

> > but I'm doing experiments on dead tag elimination
> > that probably will help a lot there, but for that I have first to grok
> > Ulrich Drepper's libdisasm to find out what are the tags that are really
> > used by looking at accesses to register indexed memory areas that use as
> > a base pointer what is in local/global variables and function
> > parameters.
> >
> > This also will provide the basis for detecting access patterns
> > that will ultimately allow libdwarves_reorganize to do struct
> > reorganizations to improve locality of reference, etc.
>
> This sounds nice, I was thinking something similar earlier but would have
> tried to do that with some kind of source analysis.

That is possible too, and indeed I did experiments in the past with sparse, the library used by the kernel checker (make C=[12]), but I think that tapping into the readily available debuginfo packages, not just for the kernel, and looking at the end result, the object file, is better.

> > Please take a look at v1.6 that I pushed today and tell me your
> > impressions.
>
> Thanks, I'll check that later on. My bruteforce inline remover run over
> all include/ stuff just finished yesterday after ~1.5 days (i.e. ~5800
> inlines compile tested on ~90 slaves, next time I'll have to plug ccache
> into that site-specific distribute thing I use to speed it up a lot :-)),
> I'll post the results later on to lkml+netdev with some patches and
> thoughts, the winner seems to be this beauty :-) :
> -110805 869 funcs, 198 +, 111003 -, diff: -110805 --- skb_put
> ...And 22 others are in the 10000+ category and 235 in 1kB+ (some overlap
> exists due to __-funcs). What still remains to do are the non-include/
> headers which might reveal some candies too.

Excellent! Some people at some point noted that while the dwarves are useful for showing the problems, people actually had to follow up with patches to solve the problems, I'm very happy that you are doing just that, thanks a lot for that.

- Arnaldo
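The build tree comparison proposed in this message (an 'old' and a 'new' output directory, processed file after file, with the results combined) can be sketched as below. This is an illustration under assumptions, not the feature as it was later implemented: `object_files`, `size_delta` and `compare_trees` are hypothetical names, and the per-file metric is plain on-disk size, a stand-in for what a tool like codiff would report.

```python
import os


def object_files(root):
    """Map each .o path, relative to root, to its full path."""
    found = {}
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            if name.endswith(".o"):
                full = os.path.join(dirpath, name)
                found[os.path.relpath(full, root)] = full
    return found


def size_delta(old_path, new_path):
    # Stand-in metric (assumption): difference in file size on disk.
    return os.path.getsize(new_path) - os.path.getsize(old_path)


def compare_trees(old_root, new_root, delta=size_delta):
    """Pair up object files present in both trees and combine their deltas.

    Returns (per_file, total): per_file maps relative path -> nonzero delta,
    total is the tree wide sum.
    """
    old, new = object_files(old_root), object_files(new_root)
    per_file = {}
    for rel in sorted(old.keys() & new.keys()):
        d = delta(old[rel], new[rel])
        if d:
            per_file[rel] = d
    return per_file, sum(per_file.values())
```

Passing a different `delta` callable is the hook where a real per-file comparison would go; files that exist in only one tree are ignored in this sketch.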
* Re: speedup in tag lookup using hash tables
From: Ilpo Järvinen @ 2008-02-12 13:24 UTC
To: Arnaldo Carvalho de Melo; +Cc: dwarves-u79uwXL29TY76Z2rM5mHXA

On Tue, 12 Feb 2008, Arnaldo Carvalho de Melo wrote:

> On Tue, Feb 12, 2008 at 01:24:44PM +0200, Ilpo Järvinen wrote:
> > On Mon, 11 Feb 2008, Arnaldo Carvalho de Melo wrote:
> >
> > > codiff doesn't make that many tag lookups, so we didn't get
> > > much improvement there,
> >
> > Because I had to do these file by file anyway, I ended up using modified
> > timestamps in my scripts to avoid loading .o files altogether when one
> > wasn't recompiled. It helped some.
>
> This reminds me that I have to add support comparing build trees, so
> that we can use 'make O=old' to build a baseline and then 'make O=new',
> using ccache to reuse the old (or simply copying old to new) and then
> process file after file, combining the results for tree wide
> comparisons/statistics.

Hmm, maybe I could use the copying too to cut the times down a bit more. That didn't occur to me before, thanks for the tip. :-)

> > > but I'm doing experiments on dead tag elimination
> > > that probably will help a lot there, but for that I have first to grok
> > > Ulrich Drepper's libdisasm to find out what are the tags that are really
> > > used by looking at accesses to register indexed memory areas that use as
> > > a base pointer what is in local/global variables and function
> > > parameters.
> > >
> > > This also will provide the basis for detecting access patterns
> > > that will ultimately allow libdwarves_reorganize to do struct
> > > reorganizations to improve locality of reference, etc.
> >
> > This sounds nice, I was thinking something similar earlier but would have
> > tried to do that with some kind of source analysis.
>
> That is possible too, and indeed I did experiments in the past with
> sparse, the library used by the kernel checker (make C=[12]), but I
> think that tapping into the readily available debuginfo packages, not
> just for the kernel, looking at the end, object file, result is better.

I also briefly read some sparse code but never really got started... :-) Besides, with source it easily gets nasty with all the complexity of C; machine code is much more straightforward. Also, in one respect it's much better to avoid using source code: the compiled form represents the end result of the thing we're trying to optimize, while source code lacks the details added by the optimizations (mainly reordering & dead code elimination that is possible on some paths).

> Excellent! Some people at some point noted that while the dwarves are
> useful for showing the problems, people actually had to follow up with
> patches to solve the problems, I'm very happy that you are doing just
> that, thanks a lot for that.

I even wrote a simple shell script to automate the dirty work in a simple static inline from .h -> .c move; in the long run it definitely pays off when you don't need to open all of them in an editor and do it by hand but can just do git-diff and check that the result is what one expects... :-) Of course it's not always that trivial a task, e.g., libification of jhash is not a pure fire-and-forget type of move but has to be thought through some, and multiple options need to be compared.

--
 i.