From: Johannes Schindelin <Johannes.Schindelin@gmx.de>
To: Jeff King <peff@peff.net>
Cc: Linus Torvalds <torvalds@linux-foundation.org>, git@vger.kernel.org
Subject: pcre performance, was Re: git log filtering
Date: Wed, 7 Mar 2007 18:37:23 +0100 (CET)
Message-ID: <Pine.LNX.4.63.0703071807250.22628@wbgn013.biozentrum.uni-wuerzburg.de>
In-Reply-To: <20070208061654.GA8813@coredump.intra.peff.net>

Hi,

On Thu, 8 Feb 2007, Jeff King wrote:

> In every case there, pcre has either comparable performance, or simply 
> blows away glibc.

So I tested this against external grep. For completeness' sake, I 
compared five configurations: GNU regex-0.12, Git _without_ external grep 
(which relies on glibc's regex), Git _with_ external grep ("original"), 
pcre, and for good measure, pcre with NO_MMAP=1 (to test whether disk 
access is the problem).

Here are the numbers:

grep-gnu-regex:

21.41user 1.08system 0:22.52elapsed 99%CPU (0avgtext+0avgdata 0maxresident)k
0inputs+0outputs (0major+7210minor)pagefaults 0swaps
21.40user 1.06system 0:22.47elapsed 99%CPU (0avgtext+0avgdata 0maxresident)k
0inputs+0outputs (0major+7209minor)pagefaults 0swaps
21.61user 1.06system 0:22.68elapsed 99%CPU (0avgtext+0avgdata 0maxresident)k
0inputs+0outputs (0major+7209minor)pagefaults 0swaps
21.30user 1.10system 0:22.48elapsed 99%CPU (0avgtext+0avgdata 0maxresident)k
0inputs+0outputs (0major+7210minor)pagefaults 0swaps
21.30user 1.08system 0:22.43elapsed 99%CPU (0avgtext+0avgdata 0maxresident)k
0inputs+0outputs (0major+7209minor)pagefaults 0swaps

grep-no-external-grep:

6.98user 1.17system 0:08.16elapsed 99%CPU (0avgtext+0avgdata 0maxresident)k
0inputs+0outputs (0major+7120minor)pagefaults 0swaps
7.07user 1.16system 0:08.27elapsed 99%CPU (0avgtext+0avgdata 0maxresident)k
0inputs+0outputs (0major+7121minor)pagefaults 0swaps
6.98user 1.12system 0:08.11elapsed 100%CPU (0avgtext+0avgdata 0maxresident)k
0inputs+0outputs (0major+7121minor)pagefaults 0swaps
7.00user 1.18system 0:08.20elapsed 99%CPU (0avgtext+0avgdata 0maxresident)k
0inputs+0outputs (0major+7121minor)pagefaults 0swaps

grep-original:

0.82user 1.15system 0:01.97elapsed 100%CPU (0avgtext+0avgdata 0maxresident)k
0inputs+0outputs (0major+7090minor)pagefaults 0swaps
0.94user 1.03system 0:01.97elapsed 99%CPU (0avgtext+0avgdata 0maxresident)k
0inputs+0outputs (0major+7099minor)pagefaults 0swaps
0.89user 1.07system 0:01.96elapsed 100%CPU (0avgtext+0avgdata 0maxresident)k
0inputs+0outputs (0major+7092minor)pagefaults 0swaps
0.81user 1.15system 0:01.97elapsed 99%CPU (0avgtext+0avgdata 0maxresident)k
0inputs+0outputs (0major+7092minor)pagefaults 0swaps

grep-pcre:

4.04user 1.18system 0:05.24elapsed 99%CPU (0avgtext+0avgdata 0maxresident)k
0inputs+0outputs (0major+7205minor)pagefaults 0swaps
4.16user 1.08system 0:05.25elapsed 99%CPU (0avgtext+0avgdata 0maxresident)k
0inputs+0outputs (0major+7206minor)pagefaults 0swaps
4.24user 0.98system 0:05.23elapsed 99%CPU (0avgtext+0avgdata 0maxresident)k
0inputs+0outputs (0major+7206minor)pagefaults 0swaps
4.08user 1.14system 0:05.23elapsed 99%CPU (0avgtext+0avgdata 0maxresident)k
0inputs+0outputs (0major+7206minor)pagefaults 0swaps

grep-pcre-no-mmap:

4.15user 1.07system 0:05.22elapsed 99%CPU (0avgtext+0avgdata 0maxresident)k
0inputs+0outputs (0major+7210minor)pagefaults 0swaps
4.01user 1.14system 0:05.17elapsed 99%CPU (0avgtext+0avgdata 0maxresident)k
0inputs+0outputs (0major+7209minor)pagefaults 0swaps
3.94user 1.18system 0:05.14elapsed 99%CPU (0avgtext+0avgdata 0maxresident)k
0inputs+0outputs (0major+7210minor)pagefaults 0swaps
4.11user 1.06system 0:05.18elapsed 99%CPU (0avgtext+0avgdata 0maxresident)k
0inputs+0outputs (0major+7210minor)pagefaults 0swaps

BTW this was "git grep Lin.*valds" on linux-2.6, just updated.

The first test was run 5 times instead of 4 to make sure the cache was 
hot. This is on a dual 1.2GHz machine with 2GB of RAM.

I cannot really say anything about the pagefaults, so I'll leave that to 
the wizards.

Result: external grep wins hands-down. GNU regex loses hands-down. pcre 
seems to be better than glibc's regex engine, and gains ever so slightly 
when using NO_MMAP.
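
To reduce the raw time(1) lines above to one figure per configuration, a 
minimal Python sketch follows. It was not part of the test setup; the 
names TIME_RE, RAW, parse_runs and summarize are made up for 
illustration, and the regex assumes the elapsed field is always in 
minutes:seconds form, as it is in every line above.

```python
import re
from statistics import mean

# Matches the leading fields of a GNU time(1) default-format line, e.g.
# "4.04user 1.18system 0:05.24elapsed 99%CPU ...".
# Assumption: elapsed is always m:ss.ss (true for the runs in this message).
TIME_RE = re.compile(
    r"(?P<user>\d+\.\d+)user\s+(?P<system>\d+\.\d+)system\s+"
    r"(?P<min>\d+):(?P<sec>\d+\.\d+)elapsed"
)

# Timings copied from the message (leading fields of each line only).
RAW = {
    "grep-gnu-regex": """
        21.41user 1.08system 0:22.52elapsed
        21.40user 1.06system 0:22.47elapsed
        21.61user 1.06system 0:22.68elapsed
        21.30user 1.10system 0:22.48elapsed
        21.30user 1.08system 0:22.43elapsed
    """,
    "grep-no-external-grep": """
        6.98user 1.17system 0:08.16elapsed
        7.07user 1.16system 0:08.27elapsed
        6.98user 1.12system 0:08.11elapsed
        7.00user 1.18system 0:08.20elapsed
    """,
    "grep-original": """
        0.82user 1.15system 0:01.97elapsed
        0.94user 1.03system 0:01.97elapsed
        0.89user 1.07system 0:01.96elapsed
        0.81user 1.15system 0:01.97elapsed
    """,
    "grep-pcre": """
        4.04user 1.18system 0:05.24elapsed
        4.16user 1.08system 0:05.25elapsed
        4.24user 0.98system 0:05.23elapsed
        4.08user 1.14system 0:05.23elapsed
    """,
    "grep-pcre-no-mmap": """
        4.15user 1.07system 0:05.22elapsed
        4.01user 1.14system 0:05.17elapsed
        3.94user 1.18system 0:05.14elapsed
        4.11user 1.06system 0:05.18elapsed
    """,
}


def parse_runs(text):
    """Return a list of (user, system, elapsed) tuples, all in seconds."""
    runs = []
    for m in TIME_RE.finditer(text):
        elapsed = 60 * int(m["min"]) + float(m["sec"])
        runs.append((float(m["user"]), float(m["system"]), elapsed))
    return runs


def summarize(raw):
    """Map each configuration name to (mean user, mean elapsed) in seconds."""
    out = {}
    for name, text in raw.items():
        runs = parse_runs(text)
        out[name] = (mean(r[0] for r in runs), mean(r[2] for r in runs))
    return out


if __name__ == "__main__":
    for name, (user, elapsed) in summarize(RAW).items():
        print(f"{name:24s} user {user:6.2f}s  elapsed {elapsed:6.2f}s")
```

The system time is nearly the same in every configuration (roughly 1.0 to 
1.2 sec), so the differences show up almost entirely in the user column.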

I ran the same test on an overloaded 1GHz machine with 256MB of RAM, and 
in that case, GNU regex is still worst (~55 sec), while glibc and pcre are 
roughly equal (glibc slightly slower with ~35 sec, pcre ~34 sec), and 
external grep wins (~29 sec). Of course, this is I/O-bound, but it shows 
that pcre uses more memory than glibc.

Ciao,
Dscho