From: Johannes Schindelin <Johannes.Schindelin@gmx.de>
To: Jeff King <peff@peff.net>
Cc: Linus Torvalds <torvalds@linux-foundation.org>, git@vger.kernel.org
Subject: pcre performance, was Re: git log filtering
Date: Wed, 7 Mar 2007 18:37:23 +0100 (CET)
Message-ID: <Pine.LNX.4.63.0703071807250.22628@wbgn013.biozentrum.uni-wuerzburg.de>
In-Reply-To: <20070208061654.GA8813@coredump.intra.peff.net>
Hi,
On Thu, 8 Feb 2007, Jeff King wrote:
> In every case there, pcre has either comparable performance, or simply
> blows away glibc.
So I tested this against external grep. For completeness' sake, I compared
five builds: GNU regex-0.12, Git _without_ external grep (which relies on
glibc's regex), Git _with_ external grep ("original"), pcre, and for good
measure, pcre with NO_MMAP=1 (to test whether disk access is the problem).
Here are the numbers:
grep-gnu-regex:
21.41user 1.08system 0:22.52elapsed 99%CPU (0avgtext+0avgdata 0maxresident)k
0inputs+0outputs (0major+7210minor)pagefaults 0swaps
21.40user 1.06system 0:22.47elapsed 99%CPU (0avgtext+0avgdata 0maxresident)k
0inputs+0outputs (0major+7209minor)pagefaults 0swaps
21.61user 1.06system 0:22.68elapsed 99%CPU (0avgtext+0avgdata 0maxresident)k
0inputs+0outputs (0major+7209minor)pagefaults 0swaps
21.30user 1.10system 0:22.48elapsed 99%CPU (0avgtext+0avgdata 0maxresident)k
0inputs+0outputs (0major+7210minor)pagefaults 0swaps
21.30user 1.08system 0:22.43elapsed 99%CPU (0avgtext+0avgdata 0maxresident)k
0inputs+0outputs (0major+7209minor)pagefaults 0swaps
grep-no-external-grep:
6.98user 1.17system 0:08.16elapsed 99%CPU (0avgtext+0avgdata 0maxresident)k
0inputs+0outputs (0major+7120minor)pagefaults 0swaps
7.07user 1.16system 0:08.27elapsed 99%CPU (0avgtext+0avgdata 0maxresident)k
0inputs+0outputs (0major+7121minor)pagefaults 0swaps
6.98user 1.12system 0:08.11elapsed 100%CPU (0avgtext+0avgdata 0maxresident)k
0inputs+0outputs (0major+7121minor)pagefaults 0swaps
7.00user 1.18system 0:08.20elapsed 99%CPU (0avgtext+0avgdata 0maxresident)k
0inputs+0outputs (0major+7121minor)pagefaults 0swaps
grep-original:
0.82user 1.15system 0:01.97elapsed 100%CPU (0avgtext+0avgdata 0maxresident)k
0inputs+0outputs (0major+7090minor)pagefaults 0swaps
0.94user 1.03system 0:01.97elapsed 99%CPU (0avgtext+0avgdata 0maxresident)k
0inputs+0outputs (0major+7099minor)pagefaults 0swaps
0.89user 1.07system 0:01.96elapsed 100%CPU (0avgtext+0avgdata 0maxresident)k
0inputs+0outputs (0major+7092minor)pagefaults 0swaps
0.81user 1.15system 0:01.97elapsed 99%CPU (0avgtext+0avgdata 0maxresident)k
0inputs+0outputs (0major+7092minor)pagefaults 0swaps
grep-pcre:
4.04user 1.18system 0:05.24elapsed 99%CPU (0avgtext+0avgdata 0maxresident)k
0inputs+0outputs (0major+7205minor)pagefaults 0swaps
4.16user 1.08system 0:05.25elapsed 99%CPU (0avgtext+0avgdata 0maxresident)k
0inputs+0outputs (0major+7206minor)pagefaults 0swaps
4.24user 0.98system 0:05.23elapsed 99%CPU (0avgtext+0avgdata 0maxresident)k
0inputs+0outputs (0major+7206minor)pagefaults 0swaps
4.08user 1.14system 0:05.23elapsed 99%CPU (0avgtext+0avgdata 0maxresident)k
0inputs+0outputs (0major+7206minor)pagefaults 0swaps
grep-pcre-no-mmap:
4.15user 1.07system 0:05.22elapsed 99%CPU (0avgtext+0avgdata 0maxresident)k
0inputs+0outputs (0major+7210minor)pagefaults 0swaps
4.01user 1.14system 0:05.17elapsed 99%CPU (0avgtext+0avgdata 0maxresident)k
0inputs+0outputs (0major+7209minor)pagefaults 0swaps
3.94user 1.18system 0:05.14elapsed 99%CPU (0avgtext+0avgdata 0maxresident)k
0inputs+0outputs (0major+7210minor)pagefaults 0swaps
4.11user 1.06system 0:05.18elapsed 99%CPU (0avgtext+0avgdata 0maxresident)k
0inputs+0outputs (0major+7210minor)pagefaults 0swaps
BTW this was "git grep Lin.*valds" on linux-2.6, just updated.
The first test was run 5 times instead of 4 to make sure the cache was hot.
This is on a dual 1.2GHz 2GB machine.
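For anyone who wants to repeat the measurement, here is a minimal sketch of a timing harness in Python. It is not what produced the numbers above (those look like plain GNU time output); the function name time_command and the default of 4 runs are assumptions made for illustration. It reads the children's CPU time from os.times(), so it only works on Unix-like systems.

```python
import os
import subprocess
import sys
import time


def time_command(argv, runs=4):
    """Run argv `runs` times; return a list of (user, system, elapsed) in seconds.

    user/system are the CPU seconds consumed by the child process, taken as the
    difference of os.times() children counters before and after each run.
    """
    results = []
    for _ in range(runs):
        before = os.times()
        start = time.perf_counter()
        # check=False: "git grep" exits with 1 when nothing matches,
        # which should not abort the benchmark.
        subprocess.run(argv, stdout=subprocess.DEVNULL, check=False)
        elapsed = time.perf_counter() - start
        after = os.times()
        results.append((
            after.children_user - before.children_user,
            after.children_system - before.children_system,
            elapsed,
        ))
    return results
```

Inside a linux-2.6 checkout, something like time_command(["git", "grep", "Lin.*valds"], runs=5) would give one tuple per run. Discarding the first run is one way to make sure only hot-cache runs are compared.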
I cannot really say anything about the pagefaults, so I'll leave that to
the wizards.
Result: external grep wins hands-down. GNU regex loses hands-down. pcre
seems to be better than glibc's regex engine, and gains ever so slightly
when using NO_MMAP.
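To put rough numbers on that ranking, the following sketch parses output in the format shown above and averages the user time per build. The names TIME_LINE, parse_runs and mean_user are made up for this example, and the regex assumes the elapsed field is always in m:ss.ss form, as it is in these runs. SAMPLE holds only the timing lines copied from above; the pagefault lines are left out because the parser ignores them anyway.

```python
import re
from statistics import mean

# First line of GNU time's default output, e.g.
# "4.04user 1.18system 0:05.24elapsed 99%CPU ..."
TIME_LINE = re.compile(
    r"^(?P<user>\d+\.\d+)user\s+(?P<system>\d+\.\d+)system\s+"
    r"(?P<min>\d+):(?P<sec>\d+\.\d+)elapsed"
)

SAMPLE = """\
grep-gnu-regex:
21.41user 1.08system 0:22.52elapsed 99%CPU
21.40user 1.06system 0:22.47elapsed 99%CPU
21.61user 1.06system 0:22.68elapsed 99%CPU
21.30user 1.10system 0:22.48elapsed 99%CPU
21.30user 1.08system 0:22.43elapsed 99%CPU
grep-no-external-grep:
6.98user 1.17system 0:08.16elapsed 99%CPU
7.07user 1.16system 0:08.27elapsed 99%CPU
6.98user 1.12system 0:08.11elapsed 100%CPU
7.00user 1.18system 0:08.20elapsed 99%CPU
grep-original:
0.82user 1.15system 0:01.97elapsed 100%CPU
0.94user 1.03system 0:01.97elapsed 99%CPU
0.89user 1.07system 0:01.96elapsed 100%CPU
0.81user 1.15system 0:01.97elapsed 99%CPU
grep-pcre:
4.04user 1.18system 0:05.24elapsed 99%CPU
4.16user 1.08system 0:05.25elapsed 99%CPU
4.24user 0.98system 0:05.23elapsed 99%CPU
4.08user 1.14system 0:05.23elapsed 99%CPU
grep-pcre-no-mmap:
4.15user 1.07system 0:05.22elapsed 99%CPU
4.01user 1.14system 0:05.17elapsed 99%CPU
3.94user 1.18system 0:05.14elapsed 99%CPU
4.11user 1.06system 0:05.18elapsed 99%CPU
"""


def parse_runs(text):
    """Group (user, system, elapsed) tuples, in seconds, under the preceding 'label:' line."""
    runs = {}
    label = None
    for line in text.splitlines():
        line = line.strip()
        m = TIME_LINE.match(line)
        if m:
            if label is None:
                continue  # timing line before any label: skip it
            elapsed = int(m["min"]) * 60 + float(m["sec"])
            runs[label].append((float(m["user"]), float(m["system"]), elapsed))
        elif line.endswith(":"):
            label = line[:-1]
            runs[label] = []
        # anything else (e.g. the pagefault lines) is ignored
    return runs


def mean_user(runs):
    """Average user CPU seconds per label, skipping labels without samples."""
    return {
        label: mean(sample[0] for sample in samples)
        for label, samples in runs.items()
        if samples
    }


if __name__ == "__main__":
    for label, value in mean_user(parse_runs(SAMPLE)).items():
        print(f"{label}: {value:.3f}s user")
```

The means include the extra fifth run of the first build. Since all five of those samples lie within about 0.3 seconds of each other, this should not change the ranking.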
I ran the same test on an overloaded 1GHz 256MB machine, and in that case,
GNU regex is still worst (~55 sec), while glibc and pcre are roughly equal
(glibc slightly slower with ~35 sec, pcre ~34 sec), and external grep wins
(~29 sec). Of course, this is I/O-bound, but it shows that pcre uses more
memory than glibc.
Ciao,
Dscho