From: Paul Eggert <eggert@cs.ucla.edu>
To: "René Scharfe" <l.s.r@web.de>
Cc: "Carlo Arenas" <carenas@gmail.com>,
	git@vger.kernel.org, "Ævar Arnfjörð Bjarmason" <avarab@gmail.com>,
	"GNU grep developers" <grep-devel@gnu.org>
Subject: Re: improve performance of PCRE2 bug 2642 bug workaround
Date: Tue, 22 Mar 2022 14:12:43 -0700
Message-ID: <325b7ba6-04a8-0010-a288-a118a820f3c3@cs.ucla.edu>
In-Reply-To: <99b0adb6-26ba-293c-3a8f-679f59e7cb4d@web.de>

[-- Attachment #1: Type: text/plain, Size: 263 bytes --]

On 3/22/22 13:26, René Scharfe wrote:

> However, the looser check works around another bug, if only by accident.

Thanks for letting me know. In that case, GNU grep should use a looser 
check too, like Git grep does. I installed the attached into GNU grep.

[-- Attachment #2: 0001-grep-work-around-another-potential-PCRE2-bug.patch --]
[-- Type: text/x-patch, Size: 1678 bytes --]

From ff2d24b08223e6c7b704a91127bac4391a9b8adb Mon Sep 17 00:00:00 2001
From: Paul Eggert <eggert@cs.ucla.edu>
Date: Tue, 22 Mar 2022 14:09:05 -0700
Subject: [PATCH] grep: work around another potential PCRE2 bug
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

Potential problem reported by René Scharfe in:
https://lore.kernel.org/git/99b0adb6-26ba-293c-3a8f-679f59e7cb4d@web.de/T
* src/pcresearch.c (Pcompile): Mimic git grep’s workarounds
for PCRE2 bugs more closely; this is more conservative.
---
 src/pcresearch.c | 13 +++++++------
 1 file changed, 7 insertions(+), 6 deletions(-)

diff --git a/src/pcresearch.c b/src/pcresearch.c
index 0cf804d..6947838 100644
--- a/src/pcresearch.c
+++ b/src/pcresearch.c
@@ -154,15 +154,16 @@ Pcompile (char *pattern, idx_t size, reg_syntax_t ignored, bool exact)
 #ifdef PCRE2_MATCH_INVALID_UTF
       /* Consider invalid UTF-8 as a barrier, instead of error.  */
       flags |= PCRE2_MATCH_INVALID_UTF;
-
-# if ! (10 < PCRE2_MAJOR + (36 <= PCRE2_MINOR))
-      /* Work around PCRE2 bug 2642.  */
-      if (flags & PCRE2_CASELESS)
-        flags |= PCRE2_NO_START_OPTIMIZE;
-# endif
 #endif
     }
 
+#if defined PCRE2_MATCH_INVALID_UTF && !(10 < PCRE2_MAJOR + (36 <= PCRE2_MINOR))
+  /* Work around PCRE2 bug 2642, and another bug reportedly fixed in
+     PCRE2 commit e0c6029a62db9c2161941ecdf459205382d4d379.  */
+  if (flags & (PCRE2_UTF | PCRE2_CASELESS))
+    flags |= PCRE2_NO_START_OPTIMIZE;
+#endif
+
   /* FIXME: Remove this restriction.  */
   if (rawmemchr (pattern, '\n') != patlim)
     die (EXIT_TROUBLE, 0, _("the -P option only supports a single pattern"));
-- 
2.32.0
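
The version test in the patch packs two comparisons into one sum, which is easy to misread. The following is a minimal sketch of the same logic written as ordinary C functions, not code from GNU grep. The names version_is_fixed and apply_workaround and the DEMO_* constants are hypothetical stand-ins invented for illustration; the real option bits come from <pcre2.h>. The sketch also leaves out the `defined PCRE2_MATCH_INVALID_UTF` half of the patch's condition.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins for the PCRE2 option bits.  The real values
   are defined in <pcre2.h> and are not reproduced here.  */
enum
{
  DEMO_CASELESS = 1 << 0,
  DEMO_UTF = 1 << 1,
  DEMO_NO_START_OPTIMIZE = 1 << 2
};

/* Same arithmetic as the patch's preprocessor test.  (36 <= minor)
   evaluates to 0 or 1, so the sum exceeds 10 when major >= 11, or
   when major == 10 and minor >= 36.  */
static bool
version_is_fixed (int major, int minor)
{
  return 10 < major + (36 <= minor);
}

/* Mirror the patched condition: on versions the test treats as
   unfixed, disable start optimizations if either UTF or caseless
   matching is requested.  */
static unsigned
apply_workaround (unsigned flags, int major, int minor)
{
  if (!version_is_fixed (major, minor)
      && (flags & (DEMO_UTF | DEMO_CASELESS)))
    flags |= DEMO_NO_START_OPTIMIZE;
  return flags;
}

/* Small self-check covering the boundary between 10.35 and 10.36.  */
static void
demo_self_check (void)
{
  assert (!version_is_fixed (10, 35));
  assert (version_is_fixed (10, 36));
  assert (apply_workaround (DEMO_UTF, 10, 35)
          == (DEMO_UTF | DEMO_NO_START_OPTIMIZE));
  assert (apply_workaround (DEMO_UTF, 10, 36) == DEMO_UTF);
}
```

Before the patch, only the caseless bit triggered the workaround, and only inside the block that sets PCRE2_MATCH_INVALID_UTF. The patched test checks either bit, which corresponds to the looser check discussed in the message above.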


Thread overview: 7+ messages
2022-03-22 16:38 improve performance of PCRE2 bug 2642 bug workaround Paul Eggert
2022-03-22 20:26 ` René Scharfe
2022-03-22 21:12   ` Paul Eggert [this message]
2022-03-23  1:09   ` Carlo Marcelo Arenas Belón
2022-03-23  4:06     ` Paul Eggert
2022-03-23 18:37     ` René Scharfe
2022-03-23 20:24       ` Carlo Arenas