From: Paolo Bonzini <paolo.bonzini@lu.unisi.ch>
To: Johannes Schindelin <Johannes.Schindelin@gmx.de>
Cc: Jeff King <peff@peff.net>,
	Linus Torvalds <torvalds@linux-foundation.org>,
	git@vger.kernel.org
Subject: Re: pcre performance, was Re: git log filtering
Date: Wed, 07 Mar 2007 19:03:32 +0100
Message-ID: <45EEFE74.1090309@lu.unisi.ch>
In-Reply-To: <Pine.LNX.4.63.0703071807250.22628@wbgn013.biozentrum.uni-wuerzburg.de>


> Result: external grep wins hands-down. GNU regex loses hands-down. pcre 
> seems to be better than glibc's regex engine, and gains ever so slightly 
> when using NO_MMAP.

Indeed GNU regex 0.12 loses, and that's why it was rewritten for (IIRC)
glibc 2.3.  Older glibc versions use code derived from GNU regex 0.12,
but the old GNU regex code is dead in general (maybe it survives in
Emacs, but I don't remember), and the current glibc regex code can be
used by external programs via gnulib.

glibc's regex is slower than PCRE mostly because it is internationalized.
So for example it supports things like stra[[.ss.]]e matching both
strasse and straße in a German locale, or [[=a=]] matching aàáäâ and
possibly more variations.  In theory.  In practice I couldn't make it
work while writing this message...
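
To make the idea of an equivalence class concrete, here is a rough
sketch in Python.  It is only an approximation: real POSIX equivalence
classes such as [[=a=]] come from the locale's collation tables, whereas
this sketch assumes that "same base letter after Unicode NFD
decomposition" is close enough.  The helper names base_letter and
in_equivalence_class are made up for illustration and are not part of
glibc or any regex library.

```python
import unicodedata


def base_letter(ch: str) -> str:
    """Return the first code point of the NFD form of a single character.

    NFD splits a precomposed accented letter into its base letter followed
    by combining marks, so the first code point is the base letter.
    """
    return unicodedata.normalize("NFD", ch)[0]


def in_equivalence_class(ch: str, base: str) -> bool:
    """Approximate [[=base=]]: does ch reduce to the given base letter?

    Assumption: accent-stripping stands in for the locale's primary
    collation weight, which is what a real implementation would consult.
    """
    return base_letter(ch) == base


# a, a-grave, a-acute, a-diaeresis, a-circumflex, then b and sharp s
candidates = "a\u00e0\u00e1\u00e4\u00e2b\u00df"
matches = [c for c in candidates if in_equivalence_class(c, "a")]
print(matches)
```

Even in this toy form, every comparison goes through a table lookup
instead of a plain byte compare, which hints at where the extra cost of
an internationalized matcher comes from.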

External grep wins hands-down because it's a DFA engine.  If the regex
uses backreferences (or the above esoteric constructs), however, external
grep will not be able to give a definite answer using the fast engine,
and will fall back to glibc regex.
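
The fallback decision described above can be sketched as a scan over the
pattern.  This is a hypothetical illustration in Python, not GNU grep's
actual code: the function names needs_fallback and choose_engine are
invented here, and the scanner assumes POSIX-style syntax where a
backreference is written \1 to \9 and collating symbols and equivalence
classes appear inside a bracket expression as [[.x.]] and [[=x=]].

```python
def needs_fallback(pattern: str) -> bool:
    """Return True if the pattern uses a construct a plain DFA cannot decide.

    Flags backreferences (\\1..\\9) outside bracket expressions, and
    collating symbols ([.x.]) or equivalence classes ([=x=]) inside them.
    """
    i = 0
    in_bracket = False
    while i < len(pattern):
        c = pattern[i]
        if in_bracket:
            if pattern.startswith(("[.", "[="), i):
                return True
            if pattern.startswith("[:", i):
                # Skip a character class name such as [:alpha:] as a unit.
                end = pattern.find(":]", i + 2)
                if end != -1:
                    i = end + 2
                    continue
            if c == "]":
                in_bracket = False
            i += 1
        elif c == "\\" and i + 1 < len(pattern):
            if pattern[i + 1] in "123456789":
                return True
            i += 2  # skip the escaped character
        else:
            if c == "[":
                in_bracket = True
                # A ']' directly after '[' or '[^' is a literal member.
                if pattern.startswith("^", i + 1):
                    i += 1
                if pattern.startswith("]", i + 1):
                    i += 1
            i += 1
    return False


def choose_engine(pattern: str) -> str:
    """Pick the slow general engine only when the fast one cannot answer."""
    return "regex-fallback" if needs_fallback(pattern) else "dfa"


for p in [r"foo.*bar", r"\(a*\)b\1", r"stra[[.ss.]]e", r"[[=a=]]", r"[.]"]:
    print(p, "->", choose_engine(p))
```

Note that a backslash inside a bracket expression is treated as a
literal, so [\1] does not count as a backreference, and [.] is just a
bracket expression containing a dot.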

Paolo
