git.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH 0/3] Revive 'pcre2-chartables-leakfix'
@ 2019-10-16 12:06 Johannes Schindelin via GitGitGadget
  2019-10-16 12:06 ` [PATCH 1/3] grep: make PCRE1 aware of custom allocator Carlo Marcelo Arenas Belón via GitGitGadget
                   ` (3 more replies)
  0 siblings, 4 replies; 42+ messages in thread
From: Johannes Schindelin via GitGitGadget @ 2019-10-16 12:06 UTC (permalink / raw)
  To: git; +Cc: Carlo Marcelo Arenas Belón, Johannes Schindelin, Junio C Hamano

As I had stated several times, I was really unhappy how the original fix
harped on nedmalloc and totally ignored runtime-configured custom
allocators.

So this is, at long last, my attempt to give this a new life. It is based
off of maint and needs trivial merge conflict resolutions relative to master
.

Range-diff relative to cb/pcre2-chartables-leakfix:

1:  770584d697e ! 1:  4feb8cc83a6 grep: make PCRE1 aware of custom allocator
    @@ Commit message
         to override the system alocator, and so it is incompatible with
         USE_NED_ALLOCATOR as reported by Dscho[1] (in similar code from PCRE2)

    +    Note that nedmalloc, as well as other custom allocators like jemalloc
    +    and mi-malloc, can be configured at runtime (via `LD_PRELOAD`),
    +    therefore we cannot know at compile time whether a custom allocator is
    +    used or not.
    +
         Make the minimum change possible to ensure this combination is supported
    -    by extending grep_init to set the PCRE1 specific functions to the NED
    -    versions (using the aliased names though) and therefore making sure all
    -    alocations are done inside PCRE1 with the same allocator than the rest
    -    of git.
    +    by extending `grep_init()` to set the PCRE1 specific functions to Git's
    +    idea of `malloc()` and `free()` and therefore making sure all
    +    allocations are done inside PCRE1 with the same allocator than the rest
    +    of Git.

    -    This change might have performance impact (hopefully for the best) and
    -    so will be good idea to test it in a platform where NED might have a
    -    positive impact (ex: Windows 7)
    +    This change has negligible performance impact: PCRE needs to allocate
    +    memory once per program run for the character table and for each pattern
    +    compilation. These are both rare events compared to matching patterns
    +    against lines. Actual measurements[2] show that the impact is lost in
    +    the noise.

         [1] https://public-inbox.org/git/pull.306.git.gitgitgadget@gmail.com
    +    [2] https://public-inbox.org/git/7f42007f-911b-c570-17f6-1c6af0429586@web.de

         Signed-off-by: Carlo Marcelo Arenas Bel├│n <carenas@gmail.com>
    -    Signed-off-by: Junio C Hamano <gitster@pobox.com>
    -
    - ## Makefile ##
    -@@ Makefile: ifdef NATIVE_CRLF
    - endif
    - 
    - ifdef USE_NED_ALLOCATOR
    --    COMPAT_CFLAGS += -Icompat/nedmalloc
    -+    COMPAT_CFLAGS += -DUSE_NED_ALLOCATOR -Icompat/nedmalloc
    -     COMPAT_OBJS += compat/nedmalloc/nedmalloc.o
    -     OVERRIDE_STRDUP = YesPlease
    - endif
    +    Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>

      ## grep.c ##
     @@ grep.c: int grep_config(const char *var, const char *value, void *cb)
    @@ grep.c: int grep_config(const char *var, const char *value, void *cb)
       * default values from the template we read the configuration
       * information in an earlier call to git_config(grep_config).
     + *
    -+ * If using PCRE make sure that the library is configured
    -+ * to use the right allocator (ex: NED)
    ++ * If using PCRE, make sure that the library is configured
    ++ * to use the same allocator as Git (e.g. nedmalloc on Windows).
       */
      void grep_init(struct grep_opt *opt, struct repository *repo, const char *prefix)
      {
          struct grep_opt *def = &grep_defaults;
          int i;

    -+#ifdef USE_NED_ALLOCATOR
     +#ifdef USE_LIBPCRE1
     +    pcre_malloc = malloc;
     +    pcre_free = free;
     +#endif
    -+#endif
     +
          memset(opt, 0, sizeof(*opt));
          opt->repo = repo;
2:  cf8d36f34a2 ! 2:  39ad220c18b grep: make PCRE2 aware of custom allocator
    @@
      ## Metadata ##
    -Author: Carlo Marcelo Arenas Bel├│n <carenas@gmail.com>
    +Author: Johannes Schindelin <Johannes.Schindelin@gmx.de>

      ## Commit message ##
         grep: make PCRE2 aware of custom allocator

         94da9193a6 (grep: add support for PCRE v2, 2017-06-01) didn't include
         a way to override the system allocator, and so it is incompatible with
    -    USE_NED_ALLOCATOR.  The problem was made visible when an attempt to
    -    avoid a leak in a data structure that is created by the library was
    -    passed to NED's free for disposal triggering a segfault in Windows.
    +    custom allocators (e.g. nedmalloc). This problem became obvious when we
    +    tried to plug a memory leak by `free()`ing a data structure allocated by
    +    PCRE2, triggering a segfault in Windows (where we use nedmalloc by
    +    default).

         PCRE2 requires the use of a general context to override the allocator
         and therefore, there is a lot more code needed than in PCRE1, including
    @@ Commit message
         Extend the grep API with a "destructor" that could be called to cleanup
         any objects that were created and used globally.

    -    Update builtin/grep to use that new API, but any other future
    -    users should make sure to have matching grep_init/grep_destroy calls
    -    if they are using the pattern matching functionality (currently only
    -    relevant when using both NED and PCRE2)
    +    Update `builtin/grep.c` to use that new API, but any other future users
    +    should make sure to have matching `grep_init()`/`grep_destroy()` calls
    +    if they are using the pattern matching functionality.

         Move some of the logic that was before done per thread (in the workers)
    -    into an earlier phase to avoid degrading performance, but as the use
    -    of PCRE2 with NED is better understood it is expected more of its
    -    functions will be instructed to use the custom allocator as well as
    +    into an earlier phase to avoid degrading performance, but as the use of
    +    PCRE2 with custom allocators is better understood it is expected more of
    +    its functions will be instructed to use the custom allocator as well as
         was done in the original code[1] this work was based on.

         [1] https://public-inbox.org/git/3397e6797f872aedd18c6d795f4976e1c579514b.1565005867.git.gitgitgadget@gmail.com/

         Reported-by: Johannes Schindelin <johannes.schindelin@gmx.de>
         Signed-off-by: Carlo Marcelo Arenas Bel├│n <carenas@gmail.com>
    -    Signed-off-by: Junio C Hamano <gitster@pobox.com>
    +    Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>

      ## builtin/grep.c ##
     @@ builtin/grep.c: int cmd_grep(int argc, const char **argv, const char *prefix)
    @@ grep.c: static int grep_source_is_binary(struct grep_source *gs,
     +#ifdef USE_LIBPCRE2
     +static pcre2_general_context *pcre2_global_context;
     +
    -+#ifdef USE_NED_ALLOCATOR
     +static void *pcre2_malloc(PCRE2_SIZE size, MAYBE_UNUSED void *memory_data)
     +{
     +    return malloc(size);
    @@ grep.c: static int grep_source_is_binary(struct grep_source *gs,
     +    return free(pointer);
     +}
     +#endif
    -+#endif
     +
      static const char *color_grep_slots[] = {
          [GREP_COLOR_CONTEXT]        = "context",
          [GREP_COLOR_FILENAME]        = "filename",
     @@ grep.c: int grep_config(const char *var, const char *value, void *cb)
       *
    -  * If using PCRE make sure that the library is configured
    -  * to use the right allocator (ex: NED)
    -+ * if any object is created it should be cleaned up in grep_destroy()
    +  * If using PCRE, make sure that the library is configured
    +  * to use the same allocator as Git (e.g. nedmalloc on Windows).
    ++ *
    ++ * Any allocated memory needs to be released in grep_destroy().
       */
      void grep_init(struct grep_opt *opt, struct repository *repo, const char *prefix)
      {
    +     struct grep_opt *def = &grep_defaults;
    +     int i;
    + 
    ++#if defined(USE_LIBPCRE2)
    ++    if (!pcre2_global_context)
    ++        pcre2_global_context = pcre2_general_context_create(
    ++                    pcre2_malloc, pcre2_free, NULL);
    ++#endif
    ++
    + #ifdef USE_LIBPCRE1
    +     pcre_malloc = malloc;
    +     pcre_free = free;
     @@ grep.c: void grep_init(struct grep_opt *opt, struct repository *repo, const char *prefix
              color_set(opt->colors[i], def->colors[i]);
      }
    @@ grep.c: void grep_init(struct grep_opt *opt, struct repository *repo, const char
      static void grep_set_pattern_type_option(enum grep_pattern_type pattern_type, struct grep_opt *opt)
      {
          /*
    -@@ grep.c: void append_header_grep_pattern(struct grep_opt *opt,
    - void append_grep_pattern(struct grep_opt *opt, const char *pat,
    -              const char *origin, int no, enum grep_pat_token t)
    - {
    -+#if defined(USE_LIBPCRE2) && defined(USE_NED_ALLOCATOR)
    -+    if (!pcre2_global_context && opt->ignore_case && has_non_ascii(pat))
    -+        pcre2_global_context = pcre2_general_context_create(
    -+                    pcre2_malloc, pcre2_free, NULL);
    -+#endif
    -     append_grep_pat(opt, pat, strlen(pat), origin, no, t);
    - }
    - 
     @@ grep.c: static void compile_pcre2_pattern(struct grep_pat *p, const struct grep_opt *opt

          p->pcre2_compile_context = NULL;
    @@ grep.c: static void compile_pcre2_pattern(struct grep_pat *p, const struct grep_
          if (opt->ignore_case) {
              if (has_non_ascii(p->pattern)) {
     -            character_tables = pcre2_maketables(NULL);
    -+#ifdef USE_NED_ALLOCATOR
     +            if (!pcre2_global_context)
     +                BUG("pcre2_global_context uninitialized");
    -+#endif
     +            character_tables = pcre2_maketables(pcre2_global_context);
                  p->pcre2_compile_context = pcre2_compile_context_create(NULL);
                  pcre2_set_character_tables(p->pcre2_compile_context, character_tables);
3:  54aef142151 ! 3:  3ced7c386a0 grep: avoid leak of chartables in PCRE2
    @@
      ## Metadata ##
    -Author: Carlo Marcelo Arenas Bel├│n <carenas@gmail.com>
    +Author: Johannes Schindelin <Johannes.Schindelin@gmx.de>

      ## Commit message ##
         grep: avoid leak of chartables in PCRE2
    @@ Commit message
         for PCRE1.

         Signed-off-by: Carlo Marcelo Arenas Bel├│n <carenas@gmail.com>
    -    Signed-off-by: Junio C Hamano <gitster@pobox.com>
    +    Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>

      ## grep.c ##
     @@ grep.c: static void compile_pcre2_pattern(struct grep_pat *p, const struct grep_opt *opt
    @@ grep.c: static void compile_pcre2_pattern(struct grep_pat *p, const struct grep_
          int patinforet;
          size_t jitsizearg;
     @@ grep.c: static void compile_pcre2_pattern(struct grep_pat *p, const struct grep_opt *opt
    +         if (has_non_ascii(p->pattern)) {
                  if (!pcre2_global_context)
                      BUG("pcre2_global_context uninitialized");
    - #endif
     -            character_tables = pcre2_maketables(pcre2_global_context);
     +            p->pcre2_tables = pcre2_maketables(pcre2_global_context);
                  p->pcre2_compile_context = pcre2_compile_context_create(NULL);

My merge conflict resolution with master
[https://github.com/dscho/git/commit/pcre2-chartables-leakfix-merge-with-master]
:

diff --cc grep.c
index 9556d13dc1d,0bb4cbd3d82..b7ae5a442a6
--- a/grep.c
+++ b/grep.c
@@@ -533,15 -470,11 +506,15 @@@ static void compile_pcre2_pattern(struc

      p->pcre2_compile_context = NULL;

 +    /* pcre2_global_context is initialized in append_grep_pattern */
      if (opt->ignore_case) {
-         if (has_non_ascii(p->pattern)) {
+         if (!opt->ignore_locale && has_non_ascii(p->pattern)) {
 -            character_tables = pcre2_maketables(NULL);
 +            if (!pcre2_global_context)
 +                BUG("pcre2_global_context uninitialized");
 +            p->pcre2_tables = pcre2_maketables(pcre2_global_context);
              p->pcre2_compile_context = pcre2_compile_context_create(NULL);
 -            pcre2_set_character_tables(p->pcre2_compile_context, character_tables);
 +            pcre2_set_character_tables(p->pcre2_compile_context,
 +                            p->pcre2_tables);
          }
          options |= PCRE2_CASELESS;
      }
@@@ -643,9 -571,6 +611,7 @@@ static void free_pcre2_pattern(struct g
      pcre2_compile_context_free(p->pcre2_compile_context);
      pcre2_code_free(p->pcre2_pattern);
      pcre2_match_data_free(p->pcre2_match_data);
-     pcre2_jit_stack_free(p->pcre2_jit_stack);
-     pcre2_match_context_free(p->pcre2_match_context);
 +    free((void *)p->pcre2_tables);
  }
  #else /* !USE_LIBPCRE2 */
  static void compile_pcre2_pattern(struct grep_pat *p, const struct grep_opt *opt)
diff --cc grep.h
index 533165c21a5,05dc1bb98e5..811fd274c95
--- a/grep.h
+++ b/grep.h
@@@ -94,12 -78,9 +78,10 @@@ struct grep_pat 
      pcre2_code *pcre2_pattern;
      pcre2_match_data *pcre2_match_data;
      pcre2_compile_context *pcre2_compile_context;
-     pcre2_match_context *pcre2_match_context;
-     pcre2_jit_stack *pcre2_jit_stack;
 +    const uint8_t *pcre2_tables;
      uint32_t pcre2_jit_on;
-     kwset_t kws;
      unsigned fixed:1;
+     unsigned is_fixed:1;
      unsigned ignore_case:1;
      unsigned word_regexp:1;
  };

Carlo Marcelo Arenas Belón (2):
  grep: make PCRE1 aware of custom allocator
  grep: make PCRE2 aware of custom allocator

Johannes Schindelin (1):
  grep: avoid leak of chartables in PCRE2

 builtin/grep.c |  1 +
 grep.c         | 47 ++++++++++++++++++++++++++++++++++++++++++++---
 grep.h         |  2 ++
 3 files changed, 47 insertions(+), 3 deletions(-)


base-commit: 51cf315870bbb7254ddf06c84fe03b41bc48eebd
Published-As: https://github.com/gitgitgadget/git/releases/tag/pr-402%2Fdscho%2Fpcre2-chartables-leakfix-v1
Fetch-It-Via: git fetch https://github.com/gitgitgadget/git pr-402/dscho/pcre2-chartables-leakfix-v1
Pull-Request: https://github.com/gitgitgadget/git/pull/402
-- 
gitgitgadget

^ permalink raw reply	[flat|nested] 42+ messages in thread

* [PATCH 1/3] grep: make PCRE1 aware of custom allocator
  2019-10-16 12:06 [PATCH 0/3] Revive 'pcre2-chartables-leakfix' Johannes Schindelin via GitGitGadget
@ 2019-10-16 12:06 ` Carlo Marcelo Arenas Belón via GitGitGadget
  2019-10-16 12:06 ` [PATCH 2/3] grep: make PCRE2 " Carlo Marcelo Arenas Belón via GitGitGadget
                   ` (2 subsequent siblings)
  3 siblings, 0 replies; 42+ messages in thread
From: Carlo Marcelo Arenas Belón via GitGitGadget @ 2019-10-16 12:06 UTC (permalink / raw)
  To: git
  Cc: Carlo Marcelo Arenas Belón, Johannes Schindelin,
	Junio C Hamano, Carlo Marcelo Arenas Belón

From: =?UTF-8?q?Carlo=20Marcelo=20Arenas=20Bel=C3=B3n?= <carenas@gmail.com>

63e7e9d8b6 ("git-grep: Learn PCRE", 2011-05-09) didn't include a way
to override the system alocator, and so it is incompatible with
USE_NED_ALLOCATOR as reported by Dscho[1] (in similar code from PCRE2)

Note that nedmalloc, as well as other custom allocators like jemalloc
and mi-malloc, can be configured at runtime (via `LD_PRELOAD`),
therefore we cannot know at compile time whether a custom allocator is
used or not.

Make the minimum change possible to ensure this combination is supported
by extending `grep_init()` to set the PCRE1 specific functions to Git's
idea of `malloc()` and `free()` and therefore making sure all
allocations are done inside PCRE1 with the same allocator than the rest
of Git.

This change has negligible performance impact: PCRE needs to allocate
memory once per program run for the character table and for each pattern
compilation. These are both rare events compared to matching patterns
against lines. Actual measurements[2] show that the impact is lost in
the noise.

[1] https://public-inbox.org/git/pull.306.git.gitgitgadget@gmail.com
[2] https://public-inbox.org/git/7f42007f-911b-c570-17f6-1c6af0429586@web.de

Signed-off-by: Carlo Marcelo Arenas Belón <carenas@gmail.com>
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
---
 grep.c | 8 ++++++++
 1 file changed, 8 insertions(+)

diff --git a/grep.c b/grep.c
index cd952ef5d3..db6e0d895f 100644
--- a/grep.c
+++ b/grep.c
@@ -150,12 +150,20 @@ int grep_config(const char *var, const char *value, void *cb)
  * Initialize one instance of grep_opt and copy the
  * default values from the template we read the configuration
  * information in an earlier call to git_config(grep_config).
+ *
+ * If using PCRE, make sure that the library is configured
+ * to use the same allocator as Git (e.g. nedmalloc on Windows).
  */
 void grep_init(struct grep_opt *opt, struct repository *repo, const char *prefix)
 {
 	struct grep_opt *def = &grep_defaults;
 	int i;
 
+#ifdef USE_LIBPCRE1
+	pcre_malloc = malloc;
+	pcre_free = free;
+#endif
+
 	memset(opt, 0, sizeof(*opt));
 	opt->repo = repo;
 	opt->prefix = prefix;
-- 
gitgitgadget


^ permalink raw reply related	[flat|nested] 42+ messages in thread

* [PATCH 2/3] grep: make PCRE2 aware of custom allocator
  2019-10-16 12:06 [PATCH 0/3] Revive 'pcre2-chartables-leakfix' Johannes Schindelin via GitGitGadget
  2019-10-16 12:06 ` [PATCH 1/3] grep: make PCRE1 aware of custom allocator Carlo Marcelo Arenas Belón via GitGitGadget
@ 2019-10-16 12:06 ` Carlo Marcelo Arenas Belón via GitGitGadget
  2019-10-16 12:06 ` [PATCH 3/3] grep: avoid leak of chartables in PCRE2 Johannes Schindelin via GitGitGadget
  2019-10-16 12:10 ` [PATCH v2 0/3] Revive 'pcre2-chartables-leakfix' Johannes Schindelin via GitGitGadget
  3 siblings, 0 replies; 42+ messages in thread
From: Carlo Marcelo Arenas Belón via GitGitGadget @ 2019-10-16 12:06 UTC (permalink / raw)
  To: git
  Cc: Carlo Marcelo Arenas Belón, Johannes Schindelin,
	Junio C Hamano, Carlo Marcelo Arenas Belón

From: =?UTF-8?q?Carlo=20Marcelo=20Arenas=20Bel=C3=B3n?= <carenas@gmail.com>

94da9193a6 (grep: add support for PCRE v2, 2017-06-01) didn't include
a way to override the system allocator, and so it is incompatible with
custom allocators (e.g. nedmalloc). This problem became obvious when we
tried to plug a memory leak by `free()`ing a data structure allocated by
PCRE2, triggering a segfault in Windows (where we use nedmalloc by
default).

PCRE2 requires the use of a general context to override the allocator
and therefore, there is a lot more code needed than in PCRE1, including
a couple of wrapper functions.

Extend the grep API with a "destructor" that could be called to cleanup
any objects that were created and used globally.

Update `builtin/grep.c` to use that new API, but any other future users
should make sure to have matching `grep_init()`/`grep_destroy()` calls
if they are using the pattern matching functionality.

Move some of the logic that was before done per thread (in the workers)
into an earlier phase to avoid degrading performance, but as the use of
PCRE2 with custom allocators is better understood it is expected more of
its functions will be instructed to use the custom allocator as well as
was done in the original code[1] this work was based on.

[1] https://public-inbox.org/git/3397e6797f872aedd18c6d795f4976e1c579514b.1565005867.git.gitgitgadget@gmail.com/

Reported-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Carlo Marcelo Arenas Belón <carenas@gmail.com>
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
---
 builtin/grep.c |  1 +
 grep.c         | 34 +++++++++++++++++++++++++++++++++-
 grep.h         |  1 +
 3 files changed, 35 insertions(+), 1 deletion(-)

diff --git a/builtin/grep.c b/builtin/grep.c
index 560051784e..e49c20df60 100644
--- a/builtin/grep.c
+++ b/builtin/grep.c
@@ -1145,5 +1145,6 @@ int cmd_grep(int argc, const char **argv, const char *prefix)
 		run_pager(&opt, prefix);
 	clear_pathspec(&pathspec);
 	free_grep_patterns(&opt);
+	grep_destroy();
 	return !hit;
 }
diff --git a/grep.c b/grep.c
index db6e0d895f..296edbb56f 100644
--- a/grep.c
+++ b/grep.c
@@ -16,6 +16,20 @@ static int grep_source_is_binary(struct grep_source *gs,
 
 static struct grep_opt grep_defaults;
 
+#ifdef USE_LIBPCRE2
+static pcre2_general_context *pcre2_global_context;
+
+static void *pcre2_malloc(PCRE2_SIZE size, MAYBE_UNUSED void *memory_data)
+{
+	return malloc(size);
+}
+
+static void pcre2_free(void *pointer, MAYBE_UNUSED void *memory_data)
+{
+	return free(pointer);
+}
+#endif
+
 static const char *color_grep_slots[] = {
 	[GREP_COLOR_CONTEXT]	    = "context",
 	[GREP_COLOR_FILENAME]	    = "filename",
@@ -153,12 +167,20 @@ int grep_config(const char *var, const char *value, void *cb)
  *
  * If using PCRE, make sure that the library is configured
  * to use the same allocator as Git (e.g. nedmalloc on Windows).
+ *
+ * Any allocated memory needs to be released in grep_destroy().
  */
 void grep_init(struct grep_opt *opt, struct repository *repo, const char *prefix)
 {
 	struct grep_opt *def = &grep_defaults;
 	int i;
 
+#if defined(USE_LIBPCRE2)
+	if (!pcre2_global_context)
+		pcre2_global_context = pcre2_general_context_create(
+					pcre2_malloc, pcre2_free, NULL);
+#endif
+
 #ifdef USE_LIBPCRE1
 	pcre_malloc = malloc;
 	pcre_free = free;
@@ -186,6 +208,13 @@ void grep_init(struct grep_opt *opt, struct repository *repo, const char *prefix
 		color_set(opt->colors[i], def->colors[i]);
 }
 
+void grep_destroy(void)
+{
+#ifdef USE_LIBPCRE2
+	pcre2_general_context_free(pcre2_global_context);
+#endif
+}
+
 static void grep_set_pattern_type_option(enum grep_pattern_type pattern_type, struct grep_opt *opt)
 {
 	/*
@@ -505,9 +534,12 @@ static void compile_pcre2_pattern(struct grep_pat *p, const struct grep_opt *opt
 
 	p->pcre2_compile_context = NULL;
 
+	/* pcre2_global_context is initialized in append_grep_pattern */
 	if (opt->ignore_case) {
 		if (has_non_ascii(p->pattern)) {
-			character_tables = pcre2_maketables(NULL);
+			if (!pcre2_global_context)
+				BUG("pcre2_global_context uninitialized");
+			character_tables = pcre2_maketables(pcre2_global_context);
 			p->pcre2_compile_context = pcre2_compile_context_create(NULL);
 			pcre2_set_character_tables(p->pcre2_compile_context, character_tables);
 		}
diff --git a/grep.h b/grep.h
index 1875880f37..526c2db9ef 100644
--- a/grep.h
+++ b/grep.h
@@ -189,6 +189,7 @@ struct grep_opt {
 void init_grep_defaults(struct repository *);
 int grep_config(const char *var, const char *value, void *);
 void grep_init(struct grep_opt *, struct repository *repo, const char *prefix);
+void grep_destroy(void);
 void grep_commit_pattern_type(enum grep_pattern_type, struct grep_opt *opt);
 
 void append_grep_pat(struct grep_opt *opt, const char *pat, size_t patlen, const char *origin, int no, enum grep_pat_token t);
-- 
gitgitgadget


^ permalink raw reply related	[flat|nested] 42+ messages in thread

* [PATCH 3/3] grep: avoid leak of chartables in PCRE2
  2019-10-16 12:06 [PATCH 0/3] Revive 'pcre2-chartables-leakfix' Johannes Schindelin via GitGitGadget
  2019-10-16 12:06 ` [PATCH 1/3] grep: make PCRE1 aware of custom allocator Carlo Marcelo Arenas Belón via GitGitGadget
  2019-10-16 12:06 ` [PATCH 2/3] grep: make PCRE2 " Carlo Marcelo Arenas Belón via GitGitGadget
@ 2019-10-16 12:06 ` Johannes Schindelin via GitGitGadget
  2019-10-16 12:10 ` [PATCH v2 0/3] Revive 'pcre2-chartables-leakfix' Johannes Schindelin via GitGitGadget
  3 siblings, 0 replies; 42+ messages in thread
From: Johannes Schindelin via GitGitGadget @ 2019-10-16 12:06 UTC (permalink / raw)
  To: git
  Cc: Carlo Marcelo Arenas Belón, Johannes Schindelin,
	Junio C Hamano, Johannes Schindelin

From: Johannes Schindelin <johannes.schindelin@gmx.de>

94da9193a6 ("grep: add support for PCRE v2", 2017-06-01) introduced
a small memory leak visible with valgrind in t7813.

Complete the creation of a PCRE2 specific variable that was missing from
the original change and free the generated table just like it is done
for PCRE1.

Signed-off-by: Carlo Marcelo Arenas Belón <carenas@gmail.com>
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
---
 grep.c | 7 ++++---
 grep.h | 1 +
 2 files changed, 5 insertions(+), 3 deletions(-)

diff --git a/grep.c b/grep.c
index 296edbb56f..9556d13dc1 100644
--- a/grep.c
+++ b/grep.c
@@ -525,7 +525,6 @@ static void compile_pcre2_pattern(struct grep_pat *p, const struct grep_opt *opt
 	PCRE2_UCHAR errbuf[256];
 	PCRE2_SIZE erroffset;
 	int options = PCRE2_MULTILINE;
-	const uint8_t *character_tables = NULL;
 	int jitret;
 	int patinforet;
 	size_t jitsizearg;
@@ -539,9 +538,10 @@ static void compile_pcre2_pattern(struct grep_pat *p, const struct grep_opt *opt
 		if (has_non_ascii(p->pattern)) {
 			if (!pcre2_global_context)
 				BUG("pcre2_global_context uninitialized");
-			character_tables = pcre2_maketables(pcre2_global_context);
+			p->pcre2_tables = pcre2_maketables(pcre2_global_context);
 			p->pcre2_compile_context = pcre2_compile_context_create(NULL);
-			pcre2_set_character_tables(p->pcre2_compile_context, character_tables);
+			pcre2_set_character_tables(p->pcre2_compile_context,
+							p->pcre2_tables);
 		}
 		options |= PCRE2_CASELESS;
 	}
@@ -645,6 +645,7 @@ static void free_pcre2_pattern(struct grep_pat *p)
 	pcre2_match_data_free(p->pcre2_match_data);
 	pcre2_jit_stack_free(p->pcre2_jit_stack);
 	pcre2_match_context_free(p->pcre2_match_context);
+	free((void *)p->pcre2_tables);
 }
 #else /* !USE_LIBPCRE2 */
 static void compile_pcre2_pattern(struct grep_pat *p, const struct grep_opt *opt)
diff --git a/grep.h b/grep.h
index 526c2db9ef..533165c21a 100644
--- a/grep.h
+++ b/grep.h
@@ -96,6 +96,7 @@ struct grep_pat {
 	pcre2_compile_context *pcre2_compile_context;
 	pcre2_match_context *pcre2_match_context;
 	pcre2_jit_stack *pcre2_jit_stack;
+	const uint8_t *pcre2_tables;
 	uint32_t pcre2_jit_on;
 	kwset_t kws;
 	unsigned fixed:1;
-- 
gitgitgadget

^ permalink raw reply related	[flat|nested] 42+ messages in thread

* [PATCH v2 0/3] Revive 'pcre2-chartables-leakfix'
  2019-10-16 12:06 [PATCH 0/3] Revive 'pcre2-chartables-leakfix' Johannes Schindelin via GitGitGadget
                   ` (2 preceding siblings ...)
  2019-10-16 12:06 ` [PATCH 3/3] grep: avoid leak of chartables in PCRE2 Johannes Schindelin via GitGitGadget
@ 2019-10-16 12:10 ` Johannes Schindelin via GitGitGadget
  2019-10-16 12:10   ` [PATCH v2 1/3] grep: make PCRE1 aware of custom allocator Carlo Marcelo Arenas Belón via GitGitGadget
                     ` (2 more replies)
  3 siblings, 3 replies; 42+ messages in thread
From: Johannes Schindelin via GitGitGadget @ 2019-10-16 12:10 UTC (permalink / raw)
  To: git; +Cc: Carlo Marcelo Arenas Belón, Johannes Schindelin, Junio C Hamano

As I had stated several times, I was really unhappy how the original fix
harped on nedmalloc and totally ignored runtime-configured custom
allocators.

So this is, at long last, my attempt to give this a new life. It is based
off of maint and needs trivial merge conflict resolutions relative to master
.

Changes since v1:

 * I managed to mess up the authorship of 3/3. Sorry for that. I fixed it,
   so that Carlo is shown as author again.

Carlo Marcelo Arenas Belón (3):
  grep: make PCRE1 aware of custom allocator
  grep: make PCRE2 aware of custom allocator
  grep: avoid leak of chartables in PCRE2

 builtin/grep.c |  1 +
 grep.c         | 47 ++++++++++++++++++++++++++++++++++++++++++++---
 grep.h         |  2 ++
 3 files changed, 47 insertions(+), 3 deletions(-)


base-commit: 51cf315870bbb7254ddf06c84fe03b41bc48eebd
Published-As: https://github.com/gitgitgadget/git/releases/tag/pr-402%2Fdscho%2Fpcre2-chartables-leakfix-v2
Fetch-It-Via: git fetch https://github.com/gitgitgadget/git pr-402/dscho/pcre2-chartables-leakfix-v2
Pull-Request: https://github.com/gitgitgadget/git/pull/402

Range-diff vs v1:

 1:  4feb8cc83a = 1:  4feb8cc83a grep: make PCRE1 aware of custom allocator
 2:  191d3a2280 = 2:  191d3a2280 grep: make PCRE2 aware of custom allocator
 3:  f21b2c9eb5 ! 3:  f8724fb267 grep: avoid leak of chartables in PCRE2
     @@ -1,4 +1,4 @@
     -Author: Johannes Schindelin <johannes.schindelin@gmx.de>
     +Author: Carlo Marcelo Arenas Belón <carenas@gmail.com>
      
          grep: avoid leak of chartables in PCRE2
      

-- 
gitgitgadget

^ permalink raw reply	[flat|nested] 42+ messages in thread

* [PATCH v2 1/3] grep: make PCRE1 aware of custom allocator
  2019-10-16 12:10 ` [PATCH v2 0/3] Revive 'pcre2-chartables-leakfix' Johannes Schindelin via GitGitGadget
@ 2019-10-16 12:10   ` Carlo Marcelo Arenas Belón via GitGitGadget
  2019-10-16 12:10   ` [PATCH v2 2/3] grep: make PCRE2 " Carlo Marcelo Arenas Belón via GitGitGadget
  2019-10-16 12:10   ` [PATCH v2 3/3] grep: avoid leak of chartables in PCRE2 Carlo Marcelo Arenas Belón via GitGitGadget
  2 siblings, 0 replies; 42+ messages in thread
From: Carlo Marcelo Arenas Belón via GitGitGadget @ 2019-10-16 12:10 UTC (permalink / raw)
  To: git
  Cc: Carlo Marcelo Arenas Belón, Johannes Schindelin,
	Junio C Hamano, Carlo Marcelo Arenas Belón

From: =?UTF-8?q?Carlo=20Marcelo=20Arenas=20Bel=C3=B3n?= <carenas@gmail.com>

63e7e9d8b6 ("git-grep: Learn PCRE", 2011-05-09) didn't include a way
to override the system alocator, and so it is incompatible with
USE_NED_ALLOCATOR as reported by Dscho[1] (in similar code from PCRE2)

Note that nedmalloc, as well as other custom allocators like jemalloc
and mi-malloc, can be configured at runtime (via `LD_PRELOAD`),
therefore we cannot know at compile time whether a custom allocator is
used or not.

Make the minimum change possible to ensure this combination is supported
by extending `grep_init()` to set the PCRE1 specific functions to Git's
idea of `malloc()` and `free()` and therefore making sure all
allocations are done inside PCRE1 with the same allocator than the rest
of Git.

This change has negligible performance impact: PCRE needs to allocate
memory once per program run for the character table and for each pattern
compilation. These are both rare events compared to matching patterns
against lines. Actual measurements[2] show that the impact is lost in
the noise.

[1] https://public-inbox.org/git/pull.306.git.gitgitgadget@gmail.com
[2] https://public-inbox.org/git/7f42007f-911b-c570-17f6-1c6af0429586@web.de

Signed-off-by: Carlo Marcelo Arenas Belón <carenas@gmail.com>
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
---
 grep.c | 8 ++++++++
 1 file changed, 8 insertions(+)

diff --git a/grep.c b/grep.c
index cd952ef5d3..db6e0d895f 100644
--- a/grep.c
+++ b/grep.c
@@ -150,12 +150,20 @@ int grep_config(const char *var, const char *value, void *cb)
  * Initialize one instance of grep_opt and copy the
  * default values from the template we read the configuration
  * information in an earlier call to git_config(grep_config).
+ *
+ * If using PCRE, make sure that the library is configured
+ * to use the same allocator as Git (e.g. nedmalloc on Windows).
  */
 void grep_init(struct grep_opt *opt, struct repository *repo, const char *prefix)
 {
 	struct grep_opt *def = &grep_defaults;
 	int i;
 
+#ifdef USE_LIBPCRE1
+	pcre_malloc = malloc;
+	pcre_free = free;
+#endif
+
 	memset(opt, 0, sizeof(*opt));
 	opt->repo = repo;
 	opt->prefix = prefix;
-- 
gitgitgadget


^ permalink raw reply related	[flat|nested] 42+ messages in thread

* [PATCH v2 2/3] grep: make PCRE2 aware of custom allocator
  2019-10-16 12:10 ` [PATCH v2 0/3] Revive 'pcre2-chartables-leakfix' Johannes Schindelin via GitGitGadget
  2019-10-16 12:10   ` [PATCH v2 1/3] grep: make PCRE1 aware of custom allocator Carlo Marcelo Arenas Belón via GitGitGadget
@ 2019-10-16 12:10   ` Carlo Marcelo Arenas Belón via GitGitGadget
  2019-10-18  1:38     ` Junio C Hamano
                       ` (11 more replies)
  2019-10-16 12:10   ` [PATCH v2 3/3] grep: avoid leak of chartables in PCRE2 Carlo Marcelo Arenas Belón via GitGitGadget
  2 siblings, 12 replies; 42+ messages in thread
From: Carlo Marcelo Arenas Belón via GitGitGadget @ 2019-10-16 12:10 UTC (permalink / raw)
  To: git
  Cc: Carlo Marcelo Arenas Belón, Johannes Schindelin,
	Junio C Hamano, Carlo Marcelo Arenas Belón

From: =?UTF-8?q?Carlo=20Marcelo=20Arenas=20Bel=C3=B3n?= <carenas@gmail.com>

94da9193a6 (grep: add support for PCRE v2, 2017-06-01) didn't include
a way to override the system allocator, and so it is incompatible with
custom allocators (e.g. nedmalloc). This problem became obvious when we
tried to plug a memory leak by `free()`ing a data structure allocated by
PCRE2, triggering a segfault in Windows (where we use nedmalloc by
default).

PCRE2 requires the use of a general context to override the allocator
and therefore, there is a lot more code needed than in PCRE1, including
a couple of wrapper functions.

Extend the grep API with a "destructor" that could be called to cleanup
any objects that were created and used globally.

Update `builtin/grep.c` to use that new API, but any other future users
should make sure to have matching `grep_init()`/`grep_destroy()` calls
if they are using the pattern matching functionality.

Move some of the logic that was before done per thread (in the workers)
into an earlier phase to avoid degrading performance, but as the use of
PCRE2 with custom allocators is better understood it is expected more of
its functions will be instructed to use the custom allocator as well as
was done in the original code[1] this work was based on.

[1] https://public-inbox.org/git/3397e6797f872aedd18c6d795f4976e1c579514b.1565005867.git.gitgitgadget@gmail.com/

Reported-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Carlo Marcelo Arenas Belón <carenas@gmail.com>
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
---
 builtin/grep.c |  1 +
 grep.c         | 34 +++++++++++++++++++++++++++++++++-
 grep.h         |  1 +
 3 files changed, 35 insertions(+), 1 deletion(-)

diff --git a/builtin/grep.c b/builtin/grep.c
index 560051784e..e49c20df60 100644
--- a/builtin/grep.c
+++ b/builtin/grep.c
@@ -1145,5 +1145,6 @@ int cmd_grep(int argc, const char **argv, const char *prefix)
 		run_pager(&opt, prefix);
 	clear_pathspec(&pathspec);
 	free_grep_patterns(&opt);
+	grep_destroy();
 	return !hit;
 }
diff --git a/grep.c b/grep.c
index db6e0d895f..296edbb56f 100644
--- a/grep.c
+++ b/grep.c
@@ -16,6 +16,20 @@ static int grep_source_is_binary(struct grep_source *gs,
 
 static struct grep_opt grep_defaults;
 
+#ifdef USE_LIBPCRE2
+static pcre2_general_context *pcre2_global_context;
+
+static void *pcre2_malloc(PCRE2_SIZE size, MAYBE_UNUSED void *memory_data)
+{
+	return malloc(size);
+}
+
+static void pcre2_free(void *pointer, MAYBE_UNUSED void *memory_data)
+{
+	return free(pointer);
+}
+#endif
+
 static const char *color_grep_slots[] = {
 	[GREP_COLOR_CONTEXT]	    = "context",
 	[GREP_COLOR_FILENAME]	    = "filename",
@@ -153,12 +167,20 @@ int grep_config(const char *var, const char *value, void *cb)
  *
  * If using PCRE, make sure that the library is configured
  * to use the same allocator as Git (e.g. nedmalloc on Windows).
+ *
+ * Any allocated memory needs to be released in grep_destroy().
  */
 void grep_init(struct grep_opt *opt, struct repository *repo, const char *prefix)
 {
 	struct grep_opt *def = &grep_defaults;
 	int i;
 
+#if defined(USE_LIBPCRE2)
+	if (!pcre2_global_context)
+		pcre2_global_context = pcre2_general_context_create(
+					pcre2_malloc, pcre2_free, NULL);
+#endif
+
 #ifdef USE_LIBPCRE1
 	pcre_malloc = malloc;
 	pcre_free = free;
@@ -186,6 +208,13 @@ void grep_init(struct grep_opt *opt, struct repository *repo, const char *prefix
 		color_set(opt->colors[i], def->colors[i]);
 }
 
+void grep_destroy(void)
+{
+#ifdef USE_LIBPCRE2
+	pcre2_general_context_free(pcre2_global_context);
+#endif
+}
+
 static void grep_set_pattern_type_option(enum grep_pattern_type pattern_type, struct grep_opt *opt)
 {
 	/*
@@ -505,9 +534,12 @@ static void compile_pcre2_pattern(struct grep_pat *p, const struct grep_opt *opt
 
 	p->pcre2_compile_context = NULL;
 
+	/* pcre2_global_context is initialized in append_grep_pattern */
 	if (opt->ignore_case) {
 		if (has_non_ascii(p->pattern)) {
-			character_tables = pcre2_maketables(NULL);
+			if (!pcre2_global_context)
+				BUG("pcre2_global_context uninitialized");
+			character_tables = pcre2_maketables(pcre2_global_context);
 			p->pcre2_compile_context = pcre2_compile_context_create(NULL);
 			pcre2_set_character_tables(p->pcre2_compile_context, character_tables);
 		}
diff --git a/grep.h b/grep.h
index 1875880f37..526c2db9ef 100644
--- a/grep.h
+++ b/grep.h
@@ -189,6 +189,7 @@ struct grep_opt {
 void init_grep_defaults(struct repository *);
 int grep_config(const char *var, const char *value, void *);
 void grep_init(struct grep_opt *, struct repository *repo, const char *prefix);
+void grep_destroy(void);
 void grep_commit_pattern_type(enum grep_pattern_type, struct grep_opt *opt);
 
 void append_grep_pat(struct grep_opt *opt, const char *pat, size_t patlen, const char *origin, int no, enum grep_pat_token t);
-- 
gitgitgadget


^ permalink raw reply related	[flat|nested] 42+ messages in thread

* [PATCH v2 3/3] grep: avoid leak of chartables in PCRE2
  2019-10-16 12:10 ` [PATCH v2 0/3] Revive 'pcre2-chartables-leakfix' Johannes Schindelin via GitGitGadget
  2019-10-16 12:10   ` [PATCH v2 1/3] grep: make PCRE1 aware of custom allocator Carlo Marcelo Arenas Belón via GitGitGadget
  2019-10-16 12:10   ` [PATCH v2 2/3] grep: make PCRE2 " Carlo Marcelo Arenas Belón via GitGitGadget
@ 2019-10-16 12:10   ` Carlo Marcelo Arenas Belón via GitGitGadget
  2 siblings, 0 replies; 42+ messages in thread
From: Carlo Marcelo Arenas Belón via GitGitGadget @ 2019-10-16 12:10 UTC (permalink / raw)
  To: git
  Cc: Carlo Marcelo Arenas Belón, Johannes Schindelin,
	Junio C Hamano, Carlo Marcelo Arenas Belón

From: =?UTF-8?q?Carlo=20Marcelo=20Arenas=20Bel=C3=B3n?= <carenas@gmail.com>

94da9193a6 ("grep: add support for PCRE v2", 2017-06-01) introduced
a small memory leak visible with valgrind in t7813.

Complete the creation of a PCRE2 specific variable that was missing from
the original change and free the generated table just like it is done
for PCRE1.

Signed-off-by: Carlo Marcelo Arenas Belón <carenas@gmail.com>
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
---
 grep.c | 7 ++++---
 grep.h | 1 +
 2 files changed, 5 insertions(+), 3 deletions(-)

diff --git a/grep.c b/grep.c
index 296edbb56f..9556d13dc1 100644
--- a/grep.c
+++ b/grep.c
@@ -525,7 +525,6 @@ static void compile_pcre2_pattern(struct grep_pat *p, const struct grep_opt *opt
 	PCRE2_UCHAR errbuf[256];
 	PCRE2_SIZE erroffset;
 	int options = PCRE2_MULTILINE;
-	const uint8_t *character_tables = NULL;
 	int jitret;
 	int patinforet;
 	size_t jitsizearg;
@@ -539,9 +538,10 @@ static void compile_pcre2_pattern(struct grep_pat *p, const struct grep_opt *opt
 		if (has_non_ascii(p->pattern)) {
 			if (!pcre2_global_context)
 				BUG("pcre2_global_context uninitialized");
-			character_tables = pcre2_maketables(pcre2_global_context);
+			p->pcre2_tables = pcre2_maketables(pcre2_global_context);
 			p->pcre2_compile_context = pcre2_compile_context_create(NULL);
-			pcre2_set_character_tables(p->pcre2_compile_context, character_tables);
+			pcre2_set_character_tables(p->pcre2_compile_context,
+							p->pcre2_tables);
 		}
 		options |= PCRE2_CASELESS;
 	}
@@ -645,6 +645,7 @@ static void free_pcre2_pattern(struct grep_pat *p)
 	pcre2_match_data_free(p->pcre2_match_data);
 	pcre2_jit_stack_free(p->pcre2_jit_stack);
 	pcre2_match_context_free(p->pcre2_match_context);
+	free((void *)p->pcre2_tables);
 }
 #else /* !USE_LIBPCRE2 */
 static void compile_pcre2_pattern(struct grep_pat *p, const struct grep_opt *opt)
diff --git a/grep.h b/grep.h
index 526c2db9ef..533165c21a 100644
--- a/grep.h
+++ b/grep.h
@@ -96,6 +96,7 @@ struct grep_pat {
 	pcre2_compile_context *pcre2_compile_context;
 	pcre2_match_context *pcre2_match_context;
 	pcre2_jit_stack *pcre2_jit_stack;
+	const uint8_t *pcre2_tables;
 	uint32_t pcre2_jit_on;
 	kwset_t kws;
 	unsigned fixed:1;
-- 
gitgitgadget

^ permalink raw reply related	[flat|nested] 42+ messages in thread

* Re: [PATCH v2 2/3] grep: make PCRE2 aware of custom allocator
  2019-10-16 12:10   ` [PATCH v2 2/3] grep: make PCRE2 " Carlo Marcelo Arenas Belón via GitGitGadget
@ 2019-10-18  1:38     ` Junio C Hamano
  2021-02-04 21:05     ` [PATCH 00/10] grep/pcre2: memory allocation fixes Ævar Arnfjörð Bjarmason
                       ` (10 subsequent siblings)
  11 siblings, 0 replies; 42+ messages in thread
From: Junio C Hamano @ 2019-10-18  1:38 UTC (permalink / raw)
  To: Carlo Marcelo Arenas Belón via GitGitGadget
  Cc: git, Carlo Marcelo Arenas Belón, Johannes Schindelin

"Carlo Marcelo Arenas Belón via GitGitGadget"
<gitgitgadget@gmail.com> writes:

>  builtin/grep.c |  1 +
>  grep.c         | 34 +++++++++++++++++++++++++++++++++-
>  grep.h         |  1 +
>  3 files changed, 35 insertions(+), 1 deletion(-)

>  
> +#if defined(USE_LIBPCRE2)
> +	if (!pcre2_global_context)
> +		pcre2_global_context = pcre2_general_context_create(
> +					pcre2_malloc, pcre2_free, NULL);
> +#endif

This part should use the same "#ifdef" as the other one which is a
minor nit, for consistency.  I do not care too deeply which way we
unify, but we should stick to one.

> +
>  #ifdef USE_LIBPCRE1
>  	pcre_malloc = malloc;
>  	pcre_free = free;

Other than that, all 3 patches look sensible, and they certainly got
simplified by dropping the conditional ;-).


^ permalink raw reply	[flat|nested] 42+ messages in thread

* [PATCH 00/10] grep/pcre2: memory allocation fixes
  2019-10-16 12:10   ` [PATCH v2 2/3] grep: make PCRE2 " Carlo Marcelo Arenas Belón via GitGitGadget
  2019-10-18  1:38     ` Junio C Hamano
@ 2021-02-04 21:05     ` Ævar Arnfjörð Bjarmason
  2021-02-10 21:34       ` Junio C Hamano
                         ` (11 more replies)
  2021-02-04 21:05     ` [PATCH 01/10] grep/pcre2: drop needless assignment + assert() on opt->pcre2 Ævar Arnfjörð Bjarmason
                       ` (9 subsequent siblings)
  11 siblings, 12 replies; 42+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2021-02-04 21:05 UTC (permalink / raw)
  To: git
  Cc: Junio C Hamano, Jeff King, Johannes Schindelin,
	Carlo Marcelo Arenas Belón,
	Ævar Arnfjörð Bjarmason

This series fixes up bugs in our PCRE v2 wrapper code and how it
handles malloc()/free().

Junio: I'm splitting this off my recently sent 25 patch series, which
I should probably have sent as an RFC:
https://lore.kernel.org/git/20210203032811.14979-1-avarab@gmail.com/

It's on top of "next", as it would otherwise conflict with my
in-flight ab/grep-pcre-invalid-utf8, ab/lose-grep-debug and ab/retire-pcre1.

06/10 is a follow-up improvement (not a fix, the in-flight works fine
too) for ab/grep-pcre-invalid-utf8. The latter two just touch adjacent
lines of code.

There's no notable new behavior here, just cleanup of existing
functionality. In mid-2019 there was a lot of discussion around the
code being fixed here:

    https://lore.kernel.org/git/pull.306.git.gitgitgadget@gmail.com/#t
    https://lore.kernel.org/git/pull.402.git.1571227613.gitgitgadget@gmail.com/

As discussed in 08/10 I believe that fix was so difficult to get right
because it was starting out with a fundamentally incorrect assumption
about how PCRE v2's memory handling works. With 08-10/10 we end up
with a much easier to reason about end-state, as the API itself is
actually quite simple.

Ævar Arnfjörð Bjarmason (10):
  grep/pcre2: drop needless assignment + assert() on opt->pcre2
  grep/pcre2: drop needless assignment to NULL
  grep/pcre2: correct reference to grep_init() in comment
  grep/pcre2: prepare to add debugging to pcre2_malloc()
  grep/pcre2: add GREP_PCRE2_DEBUG_MALLOC debug mode
  grep/pcre2: use compile-time PCREv2 version test
  grep/pcre2: use pcre2_maketables_free() function
  grep/pcre2: actually make pcre2 use custom allocator
  grep/pcre2: move back to thread-only PCREv2 structures
  grep/pcre2: move definitions of pcre2_{malloc,free}

 builtin/grep.c |  1 -
 grep.c         | 99 ++++++++++++++++++++++----------------------------
 grep.h         |  9 ++++-
 3 files changed, 51 insertions(+), 58 deletions(-)

-- 
2.30.0.284.gd98b1dd5eaa7


^ permalink raw reply	[flat|nested] 42+ messages in thread

* [PATCH 01/10] grep/pcre2: drop needless assignment + assert() on opt->pcre2
  2019-10-16 12:10   ` [PATCH v2 2/3] grep: make PCRE2 " Carlo Marcelo Arenas Belón via GitGitGadget
  2019-10-18  1:38     ` Junio C Hamano
  2021-02-04 21:05     ` [PATCH 00/10] grep/pcre2: memory allocation fixes Ævar Arnfjörð Bjarmason
@ 2021-02-04 21:05     ` Ævar Arnfjörð Bjarmason
  2021-02-04 21:05     ` [PATCH 02/10] grep/pcre2: drop needless assignment to NULL Ævar Arnfjörð Bjarmason
                       ` (8 subsequent siblings)
  11 siblings, 0 replies; 42+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2021-02-04 21:05 UTC (permalink / raw)
  To: git
  Cc: Junio C Hamano, Jeff King, Johannes Schindelin,
	Carlo Marcelo Arenas Belón,
	Ævar Arnfjörð Bjarmason

Drop an assignment added in b65abcafc7a (grep: use PCRE v2 for
optimized fixed-string search, 2019-07-01) and the overly cautious
assert() I added in 94da9193a6e (grep: add support for PCRE v2,
2017-06-01).

There was never a good reason for this, it's just a relic from when I
initially wrote the PCREv2 support. We're not going to have confusion
about compile_pcre2_pattern() being called when it shouldn't just
because we forgot to cargo-cult this opt->pcre2 option, and "opt"
is (mostly) used for the options the user supplied, let's avoid the
pattern of needlessly assigning to it.

With my in-flight removal of PCREv1 [1] ("Remove support for v1 of the
PCRE library", 2021-01-24) there'll be even less confusion around what
we call where in these codepaths, which is one more reason to remove
this.

1. https://lore.kernel.org/git/xmqqmtwy29x8.fsf@gitster.c.googlers.com/

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
---
 grep.c | 3 ---
 1 file changed, 3 deletions(-)

diff --git a/grep.c b/grep.c
index aabfaaa4c3..816e23f17e 100644
--- a/grep.c
+++ b/grep.c
@@ -373,8 +373,6 @@ static void compile_pcre2_pattern(struct grep_pat *p, const struct grep_opt *opt
 	int patinforet;
 	size_t jitsizearg;
 
-	assert(opt->pcre2);
-
 	p->pcre2_compile_context = NULL;
 
 	/* pcre2_global_context is initialized in append_grep_pattern */
@@ -555,7 +553,6 @@ static void compile_regexp(struct grep_pat *p, struct grep_opt *opt)
 #endif
 	if (p->fixed || p->is_fixed) {
 #ifdef USE_LIBPCRE2
-		opt->pcre2 = 1;
 		if (p->is_fixed) {
 			compile_pcre2_pattern(p, opt);
 		} else {
-- 
2.30.0.284.gd98b1dd5eaa7


^ permalink raw reply related	[flat|nested] 42+ messages in thread

* [PATCH 02/10] grep/pcre2: drop needless assignment to NULL
  2019-10-16 12:10   ` [PATCH v2 2/3] grep: make PCRE2 " Carlo Marcelo Arenas Belón via GitGitGadget
                       ` (2 preceding siblings ...)
  2021-02-04 21:05     ` [PATCH 01/10] grep/pcre2: drop needless assignment + assert() on opt->pcre2 Ævar Arnfjörð Bjarmason
@ 2021-02-04 21:05     ` Ævar Arnfjörð Bjarmason
  2021-02-04 21:05     ` [PATCH 03/10] grep/pcre2: correct reference to grep_init() in comment Ævar Arnfjörð Bjarmason
                       ` (7 subsequent siblings)
  11 siblings, 0 replies; 42+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2021-02-04 21:05 UTC (permalink / raw)
  To: git
  Cc: Junio C Hamano, Jeff King, Johannes Schindelin,
	Carlo Marcelo Arenas Belón,
	Ævar Arnfjörð Bjarmason

Remove a redundant assignment of pcre2_compile_context dating back to
my 94da9193a6e (grep: add support for PCRE v2, 2017-06-01). In
create_grep_pat() we xcalloc() the "grep_pat" struct, so there's no
need to NULL out individual members here.

I think this was probably something left over from an earlier
development version of mine.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
---
 grep.c | 2 --
 1 file changed, 2 deletions(-)

diff --git a/grep.c b/grep.c
index 816e23f17e..f27c5de7f5 100644
--- a/grep.c
+++ b/grep.c
@@ -373,8 +373,6 @@ static void compile_pcre2_pattern(struct grep_pat *p, const struct grep_opt *opt
 	int patinforet;
 	size_t jitsizearg;
 
-	p->pcre2_compile_context = NULL;
-
 	/* pcre2_global_context is initialized in append_grep_pattern */
 	if (opt->ignore_case) {
 		if (!opt->ignore_locale && has_non_ascii(p->pattern)) {
-- 
2.30.0.284.gd98b1dd5eaa7


^ permalink raw reply related	[flat|nested] 42+ messages in thread

* [PATCH 03/10] grep/pcre2: correct reference to grep_init() in comment
  2019-10-16 12:10   ` [PATCH v2 2/3] grep: make PCRE2 " Carlo Marcelo Arenas Belón via GitGitGadget
                       ` (3 preceding siblings ...)
  2021-02-04 21:05     ` [PATCH 02/10] grep/pcre2: drop needless assignment to NULL Ævar Arnfjörð Bjarmason
@ 2021-02-04 21:05     ` Ævar Arnfjörð Bjarmason
  2021-02-04 21:05     ` [PATCH 04/10] grep/pcre2: prepare to add debugging to pcre2_malloc() Ævar Arnfjörð Bjarmason
                       ` (6 subsequent siblings)
  11 siblings, 0 replies; 42+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2021-02-04 21:05 UTC (permalink / raw)
  To: git
  Cc: Junio C Hamano, Jeff King, Johannes Schindelin,
	Carlo Marcelo Arenas Belón,
	Ævar Arnfjörð Bjarmason

Correct a comment added in 513f2b0bbd4 (grep: make PCRE2 aware of
custom allocator, 2019-10-16). This comment was never correct in
git.git, but was consistent with an older version of the patch[1].

1. https://lore.kernel.org/git/20190806163658.66932-3-carenas@gmail.com/

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
---
 grep.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/grep.c b/grep.c
index f27c5de7f5..b9adcd83e7 100644
--- a/grep.c
+++ b/grep.c
@@ -373,7 +373,7 @@ static void compile_pcre2_pattern(struct grep_pat *p, const struct grep_opt *opt
 	int patinforet;
 	size_t jitsizearg;
 
-	/* pcre2_global_context is initialized in append_grep_pattern */
+	/* pcre2_global_context is initialized in grep_init */
 	if (opt->ignore_case) {
 		if (!opt->ignore_locale && has_non_ascii(p->pattern)) {
 			if (!pcre2_global_context)
-- 
2.30.0.284.gd98b1dd5eaa7


^ permalink raw reply related	[flat|nested] 42+ messages in thread

* [PATCH 04/10] grep/pcre2: prepare to add debugging to pcre2_malloc()
  2019-10-16 12:10   ` [PATCH v2 2/3] grep: make PCRE2 " Carlo Marcelo Arenas Belón via GitGitGadget
                       ` (4 preceding siblings ...)
  2021-02-04 21:05     ` [PATCH 03/10] grep/pcre2: correct reference to grep_init() in comment Ævar Arnfjörð Bjarmason
@ 2021-02-04 21:05     ` Ævar Arnfjörð Bjarmason
  2021-02-04 21:05     ` [PATCH 05/10] grep/pcre2: add GREP_PCRE2_DEBUG_MALLOC debug mode Ævar Arnfjörð Bjarmason
                       ` (5 subsequent siblings)
  11 siblings, 0 replies; 42+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2021-02-04 21:05 UTC (permalink / raw)
  To: git
  Cc: Junio C Hamano, Jeff King, Johannes Schindelin,
	Carlo Marcelo Arenas Belón,
	Ævar Arnfjörð Bjarmason

Change pcre2_malloc() in a way that'll make it easier for a debugging
fprintf() to spew out the allocated pointer. This doesn't introduce
any functional change, it just makes a subsequent commit's diff easier
to read. Changes code added in 513f2b0bbd4 (grep: make PCRE2 aware of
custom allocator, 2019-10-16).

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
---
 grep.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/grep.c b/grep.c
index b9adcd83e7..f96d86c929 100644
--- a/grep.c
+++ b/grep.c
@@ -45,7 +45,8 @@ static pcre2_general_context *pcre2_global_context;
 
 static void *pcre2_malloc(PCRE2_SIZE size, MAYBE_UNUSED void *memory_data)
 {
-	return malloc(size);
+	void *pointer = malloc(size);
+	return pointer;
 }
 
 static void pcre2_free(void *pointer, MAYBE_UNUSED void *memory_data)
-- 
2.30.0.284.gd98b1dd5eaa7


^ permalink raw reply related	[flat|nested] 42+ messages in thread

* [PATCH 05/10] grep/pcre2: add GREP_PCRE2_DEBUG_MALLOC debug mode
  2019-10-16 12:10   ` [PATCH v2 2/3] grep: make PCRE2 " Carlo Marcelo Arenas Belón via GitGitGadget
                       ` (5 preceding siblings ...)
  2021-02-04 21:05     ` [PATCH 04/10] grep/pcre2: prepare to add debugging to pcre2_malloc() Ævar Arnfjörð Bjarmason
@ 2021-02-04 21:05     ` Ævar Arnfjörð Bjarmason
  2021-02-10 10:38       ` Johannes Schindelin
  2021-02-04 21:05     ` [PATCH 06/10] grep/pcre2: use compile-time PCREv2 version test Ævar Arnfjörð Bjarmason
                       ` (4 subsequent siblings)
  11 siblings, 1 reply; 42+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2021-02-04 21:05 UTC (permalink / raw)
  To: git
  Cc: Junio C Hamano, Jeff King, Johannes Schindelin,
	Carlo Marcelo Arenas Belón,
	Ævar Arnfjörð Bjarmason

Add optional printing of PCREv2 allocations to stderr for a developer
who manually changes the GREP_PCRE2_DEBUG_MALLOC definition to
"1".

This will be referenced a subsequent commit, and is generally useful
to manually see what's going on with PCREv2 allocations while working
on that code.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
---
 grep.c | 10 ++++++++++
 1 file changed, 10 insertions(+)

diff --git a/grep.c b/grep.c
index f96d86c929..7d262a23d8 100644
--- a/grep.c
+++ b/grep.c
@@ -42,15 +42,25 @@ static struct grep_opt grep_defaults = {
 
 #ifdef USE_LIBPCRE2
 static pcre2_general_context *pcre2_global_context;
+#define GREP_PCRE2_DEBUG_MALLOC 0
 
 static void *pcre2_malloc(PCRE2_SIZE size, MAYBE_UNUSED void *memory_data)
 {
 	void *pointer = malloc(size);
+#if GREP_PCRE2_DEBUG_MALLOC
+	static int count = 1;
+	fprintf(stderr, "PCRE2:%p -> #%02d: alloc(%lu)\n", pointer, count++, size);
+#endif
 	return pointer;
 }
 
 static void pcre2_free(void *pointer, MAYBE_UNUSED void *memory_data)
 {
+#if GREP_PCRE2_DEBUG_MALLOC
+	static int count = 1;
+	if (pointer)
+		fprintf(stderr, "PCRE2:%p -> #%02d: free()\n", pointer, count++);
+#endif
 	free(pointer);
 }
 #endif
-- 
2.30.0.284.gd98b1dd5eaa7


^ permalink raw reply related	[flat|nested] 42+ messages in thread

* [PATCH 06/10] grep/pcre2: use compile-time PCREv2 version test
  2019-10-16 12:10   ` [PATCH v2 2/3] grep: make PCRE2 " Carlo Marcelo Arenas Belón via GitGitGadget
                       ` (6 preceding siblings ...)
  2021-02-04 21:05     ` [PATCH 05/10] grep/pcre2: add GREP_PCRE2_DEBUG_MALLOC debug mode Ævar Arnfjörð Bjarmason
@ 2021-02-04 21:05     ` Ævar Arnfjörð Bjarmason
  2021-02-04 21:05     ` [PATCH 07/10] grep/pcre2: use pcre2_maketables_free() function Ævar Arnfjörð Bjarmason
                       ` (3 subsequent siblings)
  11 siblings, 0 replies; 42+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2021-02-04 21:05 UTC (permalink / raw)
  To: git
  Cc: Junio C Hamano, Jeff King, Johannes Schindelin,
	Carlo Marcelo Arenas Belón,
	Ævar Arnfjörð Bjarmason

Replace a use of pcre2_config(PCRE2_CONFIG_VERSION, ...) which I added
in 95ca1f987ed (grep/pcre2: better support invalid UTF-8 haystacks,
2021-01-24) with the same test done at compile-time.

It might be cuter to do this at runtime since we don't have to do the
"major >= 11 || (major >= 10 && ...)" test. But in the next commit
we'll add another version comparison that absolutely needs to be done
at compile-time, so we're better of being consistent across the board.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
---
 grep.c | 18 ++++--------------
 grep.h |  3 +++
 2 files changed, 7 insertions(+), 14 deletions(-)

diff --git a/grep.c b/grep.c
index 7d262a23d8..e58044474d 100644
--- a/grep.c
+++ b/grep.c
@@ -400,21 +400,11 @@ static void compile_pcre2_pattern(struct grep_pat *p, const struct grep_opt *opt
 	    !(!opt->ignore_case && (p->fixed || p->is_fixed)))
 		options |= (PCRE2_UTF | PCRE2_MATCH_INVALID_UTF);
 
+#ifdef GIT_PCRE2_VERSION_10_36_OR_HIGHER
 	/* Work around https://bugs.exim.org/show_bug.cgi?id=2642 fixed in 10.36 */
-	if (PCRE2_MATCH_INVALID_UTF && options & (PCRE2_UTF | PCRE2_CASELESS)) {
-		struct strbuf buf;
-		int len;
-		int err;
-
-		if ((len = pcre2_config(PCRE2_CONFIG_VERSION, NULL)) < 0)
-			BUG("pcre2_config(..., NULL) failed: %d", len);
-		strbuf_init(&buf, len + 1);
-		if ((err = pcre2_config(PCRE2_CONFIG_VERSION, buf.buf)) < 0)
-			BUG("pcre2_config(..., buf.buf) failed: %d", err);
-		if (versioncmp(buf.buf, "10.36") < 0)
-			options |= PCRE2_NO_START_OPTIMIZE;
-		strbuf_release(&buf);
-	}
+	if (PCRE2_MATCH_INVALID_UTF && options & (PCRE2_UTF | PCRE2_CASELESS))
+		options |= PCRE2_NO_START_OPTIMIZE;
+#endif
 
 	p->pcre2_pattern = pcre2_compile((PCRE2_SPTR)p->pattern,
 					 p->patternlen, options, &error, &erroffset,
diff --git a/grep.h b/grep.h
index ae89d6254b..54e52042cb 100644
--- a/grep.h
+++ b/grep.h
@@ -4,6 +4,9 @@
 #ifdef USE_LIBPCRE2
 #define PCRE2_CODE_UNIT_WIDTH 8
 #include <pcre2.h>
+#if (PCRE2_MAJOR >= 10 && PCRE2_MINOR >= 36) || PCRE2_MAJOR >= 11
+#define GIT_PCRE2_VERSION_10_36_OR_HIGHER
+#endif
 #else
 typedef int pcre2_code;
 typedef int pcre2_match_data;
-- 
2.30.0.284.gd98b1dd5eaa7


^ permalink raw reply related	[flat|nested] 42+ messages in thread

* [PATCH 07/10] grep/pcre2: use pcre2_maketables_free() function
  2019-10-16 12:10   ` [PATCH v2 2/3] grep: make PCRE2 " Carlo Marcelo Arenas Belón via GitGitGadget
                       ` (7 preceding siblings ...)
  2021-02-04 21:05     ` [PATCH 06/10] grep/pcre2: use compile-time PCREv2 version test Ævar Arnfjörð Bjarmason
@ 2021-02-04 21:05     ` Ævar Arnfjörð Bjarmason
  2021-02-10 10:43       ` Johannes Schindelin
  2021-03-04  0:16       ` Junio C Hamano
  2021-02-04 21:05     ` [PATCH 08/10] grep/pcre2: actually make pcre2 use custom allocator Ævar Arnfjörð Bjarmason
                       ` (2 subsequent siblings)
  11 siblings, 2 replies; 42+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2021-02-04 21:05 UTC (permalink / raw)
  To: git
  Cc: Junio C Hamano, Jeff King, Johannes Schindelin,
	Carlo Marcelo Arenas Belón,
	Ævar Arnfjörð Bjarmason

Make use of the pcre2_maketables_free() function to free the memory
allocated by pcre2_maketables(). At first sight it's strange that
10da030ab75 (grep: avoid leak of chartables in PCRE2, 2019-10-16)
which added the free() call here doesn't make use of the pcre2_free()
the author introduced in the preceding commit in 513f2b0bbd4 (grep:
make PCRE2 aware of custom allocator, 2019-10-16).

The reason is that at the time the function didn't exist. It was first
introduced in PCREv2 version 10.34, released on 2019-11-21.

Let's make use of it behind a macro. I don't think this matters for
anything to do with custom allocators, but it makes our use of PCREv2
more discoverable. At some distant point in the future we'll be able
to drop the version guard, as nobody will be running a version older
than 10.34.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
---
 grep.c | 4 ++++
 grep.h | 3 +++
 2 files changed, 7 insertions(+)

diff --git a/grep.c b/grep.c
index e58044474d..c63dbff4b2 100644
--- a/grep.c
+++ b/grep.c
@@ -490,7 +490,11 @@ static void free_pcre2_pattern(struct grep_pat *p)
 	pcre2_compile_context_free(p->pcre2_compile_context);
 	pcre2_code_free(p->pcre2_pattern);
 	pcre2_match_data_free(p->pcre2_match_data);
+#ifdef GIT_PCRE2_VERSION_10_34_OR_HIGHER
+	pcre2_maketables_free(pcre2_global_context, p->pcre2_tables);
+#else
 	free((void *)p->pcre2_tables);
+#endif
 }
 #else /* !USE_LIBPCRE2 */
 static void compile_pcre2_pattern(struct grep_pat *p, const struct grep_opt *opt)
diff --git a/grep.h b/grep.h
index 54e52042cb..64666e9204 100644
--- a/grep.h
+++ b/grep.h
@@ -7,6 +7,9 @@
 #if (PCRE2_MAJOR >= 10 && PCRE2_MINOR >= 36) || PCRE2_MAJOR >= 11
 #define GIT_PCRE2_VERSION_10_36_OR_HIGHER
 #endif
+#if (PCRE2_MAJOR >= 10 && PCRE2_MINOR >= 34) || PCRE2_MAJOR >= 11
+#define GIT_PCRE2_VERSION_10_34_OR_HIGHER
+#endif
 #else
 typedef int pcre2_code;
 typedef int pcre2_match_data;
-- 
2.30.0.284.gd98b1dd5eaa7


^ permalink raw reply related	[flat|nested] 42+ messages in thread

* [PATCH 08/10] grep/pcre2: actually make pcre2 use custom allocator
  2019-10-16 12:10   ` [PATCH v2 2/3] grep: make PCRE2 " Carlo Marcelo Arenas Belón via GitGitGadget
                       ` (8 preceding siblings ...)
  2021-02-04 21:05     ` [PATCH 07/10] grep/pcre2: use pcre2_maketables_free() function Ævar Arnfjörð Bjarmason
@ 2021-02-04 21:05     ` Ævar Arnfjörð Bjarmason
  2021-02-10 12:38       ` Johannes Schindelin
  2021-02-04 21:05     ` [PATCH 09/10] grep/pcre2: move back to thread-only PCREv2 structures Ævar Arnfjörð Bjarmason
  2021-02-04 21:05     ` [PATCH 10/10] grep/pcre2: move definitions of pcre2_{malloc,free} Ævar Arnfjörð Bjarmason
  11 siblings, 1 reply; 42+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2021-02-04 21:05 UTC (permalink / raw)
  To: git
  Cc: Junio C Hamano, Jeff King, Johannes Schindelin,
	Carlo Marcelo Arenas Belón,
	Ævar Arnfjörð Bjarmason

Continue work started in 513f2b0bbd4 (grep: make PCRE2 aware of custom
allocator, 2019-10-16) and make PCREv2 use our pcre2_{malloc,free}().
functions for allocation. We'll now use it for all PCREv2 allocations.

The reason 513f2b0bbd4 worked as a bugfix for the USE_NED_ALLOCATOR
issue is because it managed to target pretty much the allocation freed
via free(), as opposed to by a pcre2_*free() function. I.e. the
pcre2_maketables() and pcre2_maketables_free() pair. For most of the
rest we continued allocating with stock malloc() inside PCREv2 itself,
but didn't segfault because we'd use its corresponding free().

In a preceding commit of mine I changed the free() to
pcre2_maketables_free() on versions of PCREv2 10.34 and newer. So as
far as fixing the segfault goes we could revert 513f2b0bbd4. But then
we wouldn't use the desired allocator, let's just use it instead.

Before this patch we'd on e.g.:

    grep --threads=1 -iP æ.*var.*xyz

Only use pcre2_{malloc,free}() for 2 malloc() calls and 2
corresponding free() call. Now it's 12 calls to each. This can be
observed with the GREP_PCRE2_DEBUG_MALLOC debug mode.

Reading the history of how this bug got introduced it wasn't present
in Johannes's original patch[1] to fix the issue.

My reading of that thread is that the approach the follow-up patches
to Johannes's original pursued were based on misunderstanding of how
the PCREv2 API works. In particular this part of [2]:

    "most of the time (like when using UTF-8) the chartable (and
    therefore the global context) is not needed (even when using
    alternate allocators)"

That's simply not how PCREv2 memory allocation works. It's easy to see
how the misunderstanding came about. It's because (as noted above) the
issue was noticed because of our use of free() in our own grep.c for
freeing the memory allocated by pcre2_maketables().

Thus the misunderstanding that PCREv2's compile context is something
only needed for pcre2_maketables(), and e.g. an aborted earlier
attempt[3] to only set it up when we ourselves called
pcre2_maketables().

That's not what PCREv2's compile context is. To quote PCREv2's
documentation:

    "This context just contains pointers to (and data for) external
    memory management functions that are called from several places in
    the PCRE2 library."

Thus the failed attempts to go down the route of only creating the
general context in cases where we ourselves call pcre2_maketables(),
before finally settling on the approach 513f2b0bbd4 took of always
creating it.

Instead we should always create it, and then pass the general context
to those functions that accept it, so that they'll consistently use
our preferred memory allocation functions.

1. https://public-inbox.org/git/3397e6797f872aedd18c6d795f4976e1c579514b.1565005867.git.gitgitgadget@gmail.com/
2. https://lore.kernel.org/git/CAPUEsphMh_ZqcH3M7PXC9jHTfEdQN3mhTAK2JDkdvKBp53YBoA@mail.gmail.com/
3. https://lore.kernel.org/git/20190806085014.47776-3-carenas@gmail.com/

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
---
 grep.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/grep.c b/grep.c
index c63dbff4b2..0116ff5f09 100644
--- a/grep.c
+++ b/grep.c
@@ -390,7 +390,7 @@ static void compile_pcre2_pattern(struct grep_pat *p, const struct grep_opt *opt
 			if (!pcre2_global_context)
 				BUG("pcre2_global_context uninitialized");
 			p->pcre2_tables = pcre2_maketables(pcre2_global_context);
-			p->pcre2_compile_context = pcre2_compile_context_create(NULL);
+			p->pcre2_compile_context = pcre2_compile_context_create(pcre2_global_context);
 			pcre2_set_character_tables(p->pcre2_compile_context,
 							p->pcre2_tables);
 		}
@@ -411,7 +411,7 @@ static void compile_pcre2_pattern(struct grep_pat *p, const struct grep_opt *opt
 					 p->pcre2_compile_context);
 
 	if (p->pcre2_pattern) {
-		p->pcre2_match_data = pcre2_match_data_create_from_pattern(p->pcre2_pattern, NULL);
+		p->pcre2_match_data = pcre2_match_data_create_from_pattern(p->pcre2_pattern, pcre2_global_context);
 		if (!p->pcre2_match_data)
 			die("Couldn't allocate PCRE2 match data");
 	} else {
-- 
2.30.0.284.gd98b1dd5eaa7


^ permalink raw reply related	[flat|nested] 42+ messages in thread

* [PATCH 09/10] grep/pcre2: move back to thread-only PCREv2 structures
  2019-10-16 12:10   ` [PATCH v2 2/3] grep: make PCRE2 " Carlo Marcelo Arenas Belón via GitGitGadget
                       ` (9 preceding siblings ...)
  2021-02-04 21:05     ` [PATCH 08/10] grep/pcre2: actually make pcre2 use custom allocator Ævar Arnfjörð Bjarmason
@ 2021-02-04 21:05     ` Ævar Arnfjörð Bjarmason
  2021-02-04 21:05     ` [PATCH 10/10] grep/pcre2: move definitions of pcre2_{malloc,free} Ævar Arnfjörð Bjarmason
  11 siblings, 0 replies; 42+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2021-02-04 21:05 UTC (permalink / raw)
  To: git
  Cc: Junio C Hamano, Jeff King, Johannes Schindelin,
	Carlo Marcelo Arenas Belón,
	Ævar Arnfjörð Bjarmason

Change the setup of the "pcre2_general_context" to happen per-thread
in compile_pcre2_pattern() instead of in grep_init(), as happens with
all the rest of the pcre2_* members of the grep_pat structure.

As noted in the preceding commit the approach 513f2b0bbd4 (grep: make
PCRE2 aware of custom allocator, 2019-10-16) took to allocate the
pcre2_general_context seems to have been initially based on a
misunderstanding of how PCREv2 memory allocation works.

This approach of creating a global context is just added complexity
for almost zero gain. On my system it's 24 bytes saved per-thread, for
context PCREv2 will then go on to some kilobytes for its own
thread-local state.

As noted in 6d423dd542f (grep: don't redundantly compile throwaway
patterns under threading, 2017-05-25) the grep code is intentionally
not trying to micro-optimize allocations by e.g. sharing some PCREv2
structures globally, while making others thread-local.

So let's remove this special case and make all of them thread-local
for simplicity again.

See also the discussion in 94da9193a6 (grep: add support for PCRE v2,
2017-06-01) about thread safety, and Johannes's comments[1] to the
effect that we should be doing what this patch is doing.

1. https://lore.kernel.org/git/nycvar.QRO.7.76.6.1908052120302.46@tvgsbejvaqbjf.bet/

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
---
 builtin/grep.c |  1 -
 grep.c         | 41 +++++++++++++++--------------------------
 grep.h         |  3 ++-
 3 files changed, 17 insertions(+), 28 deletions(-)

diff --git a/builtin/grep.c b/builtin/grep.c
index 55d06c9513..c69fe99340 100644
--- a/builtin/grep.c
+++ b/builtin/grep.c
@@ -1175,6 +1175,5 @@ int cmd_grep(int argc, const char **argv, const char *prefix)
 		run_pager(&opt, prefix);
 	clear_pathspec(&pathspec);
 	free_grep_patterns(&opt);
-	grep_destroy();
 	return !hit;
 }
diff --git a/grep.c b/grep.c
index 0116ff5f09..2599f329cd 100644
--- a/grep.c
+++ b/grep.c
@@ -41,7 +41,6 @@ static struct grep_opt grep_defaults = {
 };
 
 #ifdef USE_LIBPCRE2
-static pcre2_general_context *pcre2_global_context;
 #define GREP_PCRE2_DEBUG_MALLOC 0
 
 static void *pcre2_malloc(PCRE2_SIZE size, MAYBE_UNUSED void *memory_data)
@@ -163,20 +162,9 @@ int grep_config(const char *var, const char *value, void *cb)
  * Initialize one instance of grep_opt and copy the
  * default values from the template we read the configuration
  * information in an earlier call to git_config(grep_config).
- *
- * If using PCRE, make sure that the library is configured
- * to use the same allocator as Git (e.g. nedmalloc on Windows).
- *
- * Any allocated memory needs to be released in grep_destroy().
  */
 void grep_init(struct grep_opt *opt, struct repository *repo, const char *prefix)
 {
-#if defined(USE_LIBPCRE2)
-	if (!pcre2_global_context)
-		pcre2_global_context = pcre2_general_context_create(
-					pcre2_malloc, pcre2_free, NULL);
-#endif
-
 	*opt = grep_defaults;
 
 	opt->repo = repo;
@@ -186,13 +174,6 @@ void grep_init(struct grep_opt *opt, struct repository *repo, const char *prefix
 	opt->header_tail = &opt->header_list;
 }
 
-void grep_destroy(void)
-{
-#ifdef USE_LIBPCRE2
-	pcre2_general_context_free(pcre2_global_context);
-#endif
-}
-
 static void grep_set_pattern_type_option(enum grep_pattern_type pattern_type, struct grep_opt *opt)
 {
 	/*
@@ -384,13 +365,20 @@ static void compile_pcre2_pattern(struct grep_pat *p, const struct grep_opt *opt
 	int patinforet;
 	size_t jitsizearg;
 
-	/* pcre2_global_context is initialized in grep_init */
+	/*
+	 * Call pcre2_general_context_create() before calling any
+	 * other pcre2_*(). It sets up our malloc()/free() functions
+	 * with which everything else is allocated.
+	 */
+	p->pcre2_general_context = pcre2_general_context_create(
+		pcre2_malloc, pcre2_free, NULL);
+	if (!p->pcre2_general_context)
+		die("Couldn't allocate PCRE2 general context");
+
 	if (opt->ignore_case) {
 		if (!opt->ignore_locale && has_non_ascii(p->pattern)) {
-			if (!pcre2_global_context)
-				BUG("pcre2_global_context uninitialized");
-			p->pcre2_tables = pcre2_maketables(pcre2_global_context);
-			p->pcre2_compile_context = pcre2_compile_context_create(pcre2_global_context);
+			p->pcre2_tables = pcre2_maketables(p->pcre2_general_context);
+			p->pcre2_compile_context = pcre2_compile_context_create(p->pcre2_general_context);
 			pcre2_set_character_tables(p->pcre2_compile_context,
 							p->pcre2_tables);
 		}
@@ -411,7 +399,7 @@ static void compile_pcre2_pattern(struct grep_pat *p, const struct grep_opt *opt
 					 p->pcre2_compile_context);
 
 	if (p->pcre2_pattern) {
-		p->pcre2_match_data = pcre2_match_data_create_from_pattern(p->pcre2_pattern, pcre2_global_context);
+		p->pcre2_match_data = pcre2_match_data_create_from_pattern(p->pcre2_pattern, p->pcre2_general_context);
 		if (!p->pcre2_match_data)
 			die("Couldn't allocate PCRE2 match data");
 	} else {
@@ -491,10 +479,11 @@ static void free_pcre2_pattern(struct grep_pat *p)
 	pcre2_code_free(p->pcre2_pattern);
 	pcre2_match_data_free(p->pcre2_match_data);
 #ifdef GIT_PCRE2_VERSION_10_34_OR_HIGHER
-	pcre2_maketables_free(pcre2_global_context, p->pcre2_tables);
+	pcre2_maketables_free(p->pcre2_general_context, p->pcre2_tables);
 #else
 	free((void *)p->pcre2_tables);
 #endif
+	pcre2_general_context_free(p->pcre2_general_context);
 }
 #else /* !USE_LIBPCRE2 */
 static void compile_pcre2_pattern(struct grep_pat *p, const struct grep_opt *opt)
diff --git a/grep.h b/grep.h
index 64666e9204..72f82b1e30 100644
--- a/grep.h
+++ b/grep.h
@@ -14,6 +14,7 @@
 typedef int pcre2_code;
 typedef int pcre2_match_data;
 typedef int pcre2_compile_context;
+typedef int pcre2_general_context;
 #endif
 #ifndef PCRE2_MATCH_INVALID_UTF
 /* PCRE2_MATCH_* dummy also with !USE_LIBPCRE2, for test-pcre2-config.c */
@@ -75,6 +76,7 @@ struct grep_pat {
 	pcre2_code *pcre2_pattern;
 	pcre2_match_data *pcre2_match_data;
 	pcre2_compile_context *pcre2_compile_context;
+	pcre2_general_context *pcre2_general_context;
 	const uint8_t *pcre2_tables;
 	uint32_t pcre2_jit_on;
 	unsigned fixed:1;
@@ -167,7 +169,6 @@ struct grep_opt {
 
 int grep_config(const char *var, const char *value, void *);
 void grep_init(struct grep_opt *, struct repository *repo, const char *prefix);
-void grep_destroy(void);
 void grep_commit_pattern_type(enum grep_pattern_type, struct grep_opt *opt);
 
 void append_grep_pat(struct grep_opt *opt, const char *pat, size_t patlen, const char *origin, int no, enum grep_pat_token t);
-- 
2.30.0.284.gd98b1dd5eaa7


^ permalink raw reply related	[flat|nested] 42+ messages in thread

* [PATCH 10/10] grep/pcre2: move definitions of pcre2_{malloc,free}
  2019-10-16 12:10   ` [PATCH v2 2/3] grep: make PCRE2 " Carlo Marcelo Arenas Belón via GitGitGadget
                       ` (10 preceding siblings ...)
  2021-02-04 21:05     ` [PATCH 09/10] grep/pcre2: move back to thread-only PCREv2 structures Ævar Arnfjörð Bjarmason
@ 2021-02-04 21:05     ` Ævar Arnfjörð Bjarmason
  2021-02-10 12:40       ` Johannes Schindelin
  11 siblings, 1 reply; 42+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2021-02-04 21:05 UTC (permalink / raw)
  To: git
  Cc: Junio C Hamano, Jeff King, Johannes Schindelin,
	Carlo Marcelo Arenas Belón,
	Ævar Arnfjörð Bjarmason

Move the definitions of the pcre2_{malloc,free} functions above the
compile_pcre2_pattern() function they're used it. Before the preceding
commit they used to be needed earlier, but now we can move them to be
adjacent to the other PCREv2 functions.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
---
 grep.c | 46 ++++++++++++++++++++++------------------------
 1 file changed, 22 insertions(+), 24 deletions(-)

diff --git a/grep.c b/grep.c
index 2599f329cd..636ac48bf0 100644
--- a/grep.c
+++ b/grep.c
@@ -40,30 +40,6 @@ static struct grep_opt grep_defaults = {
 	.output = std_output,
 };
 
-#ifdef USE_LIBPCRE2
-#define GREP_PCRE2_DEBUG_MALLOC 0
-
-static void *pcre2_malloc(PCRE2_SIZE size, MAYBE_UNUSED void *memory_data)
-{
-	void *pointer = malloc(size);
-#if GREP_PCRE2_DEBUG_MALLOC
-	static int count = 1;
-	fprintf(stderr, "PCRE2:%p -> #%02d: alloc(%lu)\n", pointer, count++, size);
-#endif
-	return pointer;
-}
-
-static void pcre2_free(void *pointer, MAYBE_UNUSED void *memory_data)
-{
-#if GREP_PCRE2_DEBUG_MALLOC
-	static int count = 1;
-	if (pointer)
-		fprintf(stderr, "PCRE2:%p -> #%02d: free()\n", pointer, count++);
-#endif
-	free(pointer);
-}
-#endif
-
 static const char *color_grep_slots[] = {
 	[GREP_COLOR_CONTEXT]	    = "context",
 	[GREP_COLOR_FILENAME]	    = "filename",
@@ -355,6 +331,28 @@ static int is_fixed(const char *s, size_t len)
 }
 
 #ifdef USE_LIBPCRE2
+#define GREP_PCRE2_DEBUG_MALLOC 0
+
+static void *pcre2_malloc(PCRE2_SIZE size, MAYBE_UNUSED void *memory_data)
+{
+	void *pointer = malloc(size);
+#if GREP_PCRE2_DEBUG_MALLOC
+	static int count = 1;
+	fprintf(stderr, "PCRE2:%p -> #%02d: alloc(%lu)\n", pointer, count++, size);
+#endif
+	return pointer;
+}
+
+static void pcre2_free(void *pointer, MAYBE_UNUSED void *memory_data)
+{
+#if GREP_PCRE2_DEBUG_MALLOC
+	static int count = 1;
+	if (pointer)
+		fprintf(stderr, "PCRE2:%p -> #%02d: free()\n", pointer, count++);
+#endif
+	free(pointer);
+}
+
 static void compile_pcre2_pattern(struct grep_pat *p, const struct grep_opt *opt)
 {
 	int error;
-- 
2.30.0.284.gd98b1dd5eaa7


^ permalink raw reply related	[flat|nested] 42+ messages in thread

* Re: [PATCH 05/10] grep/pcre2: add GREP_PCRE2_DEBUG_MALLOC debug mode
  2021-02-04 21:05     ` [PATCH 05/10] grep/pcre2: add GREP_PCRE2_DEBUG_MALLOC debug mode Ævar Arnfjörð Bjarmason
@ 2021-02-10 10:38       ` Johannes Schindelin
  0 siblings, 0 replies; 42+ messages in thread
From: Johannes Schindelin @ 2021-02-10 10:38 UTC (permalink / raw)
  To: Ævar Arnfjörð Bjarmason
  Cc: git, Junio C Hamano, Jeff King, Carlo Marcelo Arenas Belón

[-- Attachment #1: Type: text/plain, Size: 1677 bytes --]

Hi Ævar,

On Thu, 4 Feb 2021, Ævar Arnfjörð Bjarmason wrote:

> Add optional printing of PCREv2 allocations to stderr for a developer
> who manually changes the GREP_PCRE2_DEBUG_MALLOC definition to
> "1".

Maybe clarify in the oneline that this is not an environment variable, but
a Makefile knob? I had to read all the way to the diff to understand that
aspect.

Otherwise, the patch series so far looks really fine to me.

Thanks,
Dscho

>
> This will be referenced a subsequent commit, and is generally useful
> to manually see what's going on with PCREv2 allocations while working
> on that code.
>
> Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
> ---
>  grep.c | 10 ++++++++++
>  1 file changed, 10 insertions(+)
>
> diff --git a/grep.c b/grep.c
> index f96d86c929..7d262a23d8 100644
> --- a/grep.c
> +++ b/grep.c
> @@ -42,15 +42,25 @@ static struct grep_opt grep_defaults = {
>
>  #ifdef USE_LIBPCRE2
>  static pcre2_general_context *pcre2_global_context;
> +#define GREP_PCRE2_DEBUG_MALLOC 0
>
>  static void *pcre2_malloc(PCRE2_SIZE size, MAYBE_UNUSED void *memory_data)
>  {
>  	void *pointer = malloc(size);
> +#if GREP_PCRE2_DEBUG_MALLOC
> +	static int count = 1;
> +	fprintf(stderr, "PCRE2:%p -> #%02d: alloc(%lu)\n", pointer, count++, size);
> +#endif
>  	return pointer;
>  }
>
>  static void pcre2_free(void *pointer, MAYBE_UNUSED void *memory_data)
>  {
> +#if GREP_PCRE2_DEBUG_MALLOC
> +	static int count = 1;
> +	if (pointer)
> +		fprintf(stderr, "PCRE2:%p -> #%02d: free()\n", pointer, count++);
> +#endif
>  	free(pointer);
>  }
>  #endif
> --
> 2.30.0.284.gd98b1dd5eaa7
>
>

^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: [PATCH 07/10] grep/pcre2: use pcre2_maketables_free() function
  2021-02-04 21:05     ` [PATCH 07/10] grep/pcre2: use pcre2_maketables_free() function Ævar Arnfjörð Bjarmason
@ 2021-02-10 10:43       ` Johannes Schindelin
  2021-03-04  0:16       ` Junio C Hamano
  1 sibling, 0 replies; 42+ messages in thread
From: Johannes Schindelin @ 2021-02-10 10:43 UTC (permalink / raw)
  To: Ævar Arnfjörð Bjarmason
  Cc: git, Junio C Hamano, Jeff King, Carlo Marcelo Arenas Belón

[-- Attachment #1: Type: text/plain, Size: 2286 bytes --]

Hi Ævar,

On Thu, 4 Feb 2021, Ævar Arnfjörð Bjarmason wrote:

> Make use of the pcre2_maketables_free() function to free the memory
> allocated by pcre2_maketables(). At first sight it's strange that
> 10da030ab75 (grep: avoid leak of chartables in PCRE2, 2019-10-16)
> which added the free() call here doesn't make use of the pcre2_free()
> the author introduced in the preceding commit in 513f2b0bbd4 (grep:
> make PCRE2 aware of custom allocator, 2019-10-16).
>
> The reason is that at the time the function didn't exist. It was first
> introduced in PCREv2 version 10.34, released on 2019-11-21.

Git for Windows still uses v10.33. Thanks for the prod, I will update the
library.

Ciao,
Dscho

>
> Let's make use of it behind a macro. I don't think this matters for
> anything to do with custom allocators, but it makes our use of PCREv2
> more discoverable. At some distant point in the future we'll be able
> to drop the version guard, as nobody will be running a version older
> than 10.34.
>
> Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
> ---
>  grep.c | 4 ++++
>  grep.h | 3 +++
>  2 files changed, 7 insertions(+)
>
> diff --git a/grep.c b/grep.c
> index e58044474d..c63dbff4b2 100644
> --- a/grep.c
> +++ b/grep.c
> @@ -490,7 +490,11 @@ static void free_pcre2_pattern(struct grep_pat *p)
>  	pcre2_compile_context_free(p->pcre2_compile_context);
>  	pcre2_code_free(p->pcre2_pattern);
>  	pcre2_match_data_free(p->pcre2_match_data);
> +#ifdef GIT_PCRE2_VERSION_10_34_OR_HIGHER
> +	pcre2_maketables_free(pcre2_global_context, p->pcre2_tables);
> +#else
>  	free((void *)p->pcre2_tables);
> +#endif
>  }
>  #else /* !USE_LIBPCRE2 */
>  static void compile_pcre2_pattern(struct grep_pat *p, const struct grep_opt *opt)
> diff --git a/grep.h b/grep.h
> index 54e52042cb..64666e9204 100644
> --- a/grep.h
> +++ b/grep.h
> @@ -7,6 +7,9 @@
>  #if (PCRE2_MAJOR >= 10 && PCRE2_MINOR >= 36) || PCRE2_MAJOR >= 11
>  #define GIT_PCRE2_VERSION_10_36_OR_HIGHER
>  #endif
> +#if (PCRE2_MAJOR >= 10 && PCRE2_MINOR >= 34) || PCRE2_MAJOR >= 11
> +#define GIT_PCRE2_VERSION_10_34_OR_HIGHER
> +#endif
>  #else
>  typedef int pcre2_code;
>  typedef int pcre2_match_data;
> --
> 2.30.0.284.gd98b1dd5eaa7
>
>

^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: [PATCH 08/10] grep/pcre2: actually make pcre2 use custom allocator
  2021-02-04 21:05     ` [PATCH 08/10] grep/pcre2: actually make pcre2 use custom allocator Ævar Arnfjörð Bjarmason
@ 2021-02-10 12:38       ` Johannes Schindelin
  0 siblings, 0 replies; 42+ messages in thread
From: Johannes Schindelin @ 2021-02-10 12:38 UTC (permalink / raw)
  To: Ævar Arnfjörð Bjarmason
  Cc: git, Junio C Hamano, Jeff King, Carlo Marcelo Arenas Belón

[-- Attachment #1: Type: text/plain, Size: 4559 bytes --]

Hi Ævar,

ACK!

And thank you for this patch,
Dscho

On Thu, 4 Feb 2021, Ævar Arnfjörð Bjarmason wrote:

> Continue work started in 513f2b0bbd4 (grep: make PCRE2 aware of custom
> allocator, 2019-10-16) and make PCREv2 use our pcre2_{malloc,free}().
> functions for allocation. We'll now use it for all PCREv2 allocations.
>
> The reason 513f2b0bbd4 worked as a bugfix for the USE_NED_ALLOCATOR
> issue is because it managed to target pretty much the allocation freed
> via free(), as opposed to by a pcre2_*free() function. I.e. the
> pcre2_maketables() and pcre2_maketables_free() pair. For most of the
> rest we continued allocating with stock malloc() inside PCREv2 itself,
> but didn't segfault because we'd use its corresponding free().
>
> In a preceding commit of mine I changed the free() to
> pcre2_maketables_free() on versions of PCREv2 10.34 and newer. So as
> far as fixing the segfault goes we could revert 513f2b0bbd4. But then
> we wouldn't use the desired allocator, let's just use it instead.
>
> Before this patch we'd on e.g.:
>
>     grep --threads=1 -iP æ.*var.*xyz
>
> Only use pcre2_{malloc,free}() for 2 malloc() calls and 2
> corresponding free() call. Now it's 12 calls to each. This can be
> observed with the GREP_PCRE2_DEBUG_MALLOC debug mode.
>
> Reading the history of how this bug got introduced it wasn't present
> in Johannes's original patch[1] to fix the issue.
>
> My reading of that thread is that the approach the follow-up patches
> to Johannes's original pursued were based on misunderstanding of how
> the PCREv2 API works. In particular this part of [2]:
>
>     "most of the time (like when using UTF-8) the chartable (and
>     therefore the global context) is not needed (even when using
>     alternate allocators)"
>
> That's simply not how PCREv2 memory allocation works. It's easy to see
> how the misunderstanding came about. It's because (as noted above) the
> issue was noticed because of our use of free() in our own grep.c for
> freeing the memory allocated by pcre2_maketables().
>
> Thus the misunderstanding that PCREv2's compile context is something
> only needed for pcre2_maketables(), and e.g. an aborted earlier
> attempt[3] to only set it up when we ourselves called
> pcre2_maketables().
>
> That's not what PCREv2's compile context is. To quote PCREv2's
> documentation:
>
>     "This context just contains pointers to (and data for) external
>     memory management functions that are called from several places in
>     the PCRE2 library."
>
> Thus the failed attempts to go down the route of only creating the
> general context in cases where we ourselves call pcre2_maketables(),
> before finally settling on the approach 513f2b0bbd4 took of always
> creating it.
>
> Instead we should always create it, and then pass the general context
> to those functions that accept it, so that they'll consistently use
> our preferred memory allocation functions.
>
> 1. https://public-inbox.org/git/3397e6797f872aedd18c6d795f4976e1c579514b.1565005867.git.gitgitgadget@gmail.com/
> 2. https://lore.kernel.org/git/CAPUEsphMh_ZqcH3M7PXC9jHTfEdQN3mhTAK2JDkdvKBp53YBoA@mail.gmail.com/
> 3. https://lore.kernel.org/git/20190806085014.47776-3-carenas@gmail.com/
>
> Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
> ---
>  grep.c | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
>
> diff --git a/grep.c b/grep.c
> index c63dbff4b2..0116ff5f09 100644
> --- a/grep.c
> +++ b/grep.c
> @@ -390,7 +390,7 @@ static void compile_pcre2_pattern(struct grep_pat *p, const struct grep_opt *opt
>  			if (!pcre2_global_context)
>  				BUG("pcre2_global_context uninitialized");
>  			p->pcre2_tables = pcre2_maketables(pcre2_global_context);
> -			p->pcre2_compile_context = pcre2_compile_context_create(NULL);
> +			p->pcre2_compile_context = pcre2_compile_context_create(pcre2_global_context);
>  			pcre2_set_character_tables(p->pcre2_compile_context,
>  							p->pcre2_tables);
>  		}
> @@ -411,7 +411,7 @@ static void compile_pcre2_pattern(struct grep_pat *p, const struct grep_opt *opt
>  					 p->pcre2_compile_context);
>
>  	if (p->pcre2_pattern) {
> -		p->pcre2_match_data = pcre2_match_data_create_from_pattern(p->pcre2_pattern, NULL);
> +		p->pcre2_match_data = pcre2_match_data_create_from_pattern(p->pcre2_pattern, pcre2_global_context);
>  		if (!p->pcre2_match_data)
>  			die("Couldn't allocate PCRE2 match data");
>  	} else {
> --
> 2.30.0.284.gd98b1dd5eaa7
>
>

^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: [PATCH 10/10] grep/pcre2: move definitions of pcre2_{malloc,free}
  2021-02-04 21:05     ` [PATCH 10/10] grep/pcre2: move definitions of pcre2_{malloc,free} Ævar Arnfjörð Bjarmason
@ 2021-02-10 12:40       ` Johannes Schindelin
  0 siblings, 0 replies; 42+ messages in thread
From: Johannes Schindelin @ 2021-02-10 12:40 UTC (permalink / raw)
  To: Ævar Arnfjörð Bjarmason
  Cc: git, Junio C Hamano, Jeff King, Carlo Marcelo Arenas Belón

[-- Attachment #1: Type: text/plain, Size: 2497 bytes --]

Hi Ævar,

On Thu, 4 Feb 2021, Ævar Arnfjörð Bjarmason wrote:

> Move the definitions of the pcre2_{malloc,free} functions above the
> compile_pcre2_pattern() function they're used it. Before the preceding

s/it/in/

Thank you for this entire patch series. I like its intention and its
execution.

Ciao,
Dscho

> commit they used to be needed earlier, but now we can move them to be
> adjacent to the other PCREv2 functions.
>
> Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
> ---
>  grep.c | 46 ++++++++++++++++++++++------------------------
>  1 file changed, 22 insertions(+), 24 deletions(-)
>
> diff --git a/grep.c b/grep.c
> index 2599f329cd..636ac48bf0 100644
> --- a/grep.c
> +++ b/grep.c
> @@ -40,30 +40,6 @@ static struct grep_opt grep_defaults = {
>  	.output = std_output,
>  };
>
> -#ifdef USE_LIBPCRE2
> -#define GREP_PCRE2_DEBUG_MALLOC 0
> -
> -static void *pcre2_malloc(PCRE2_SIZE size, MAYBE_UNUSED void *memory_data)
> -{
> -	void *pointer = malloc(size);
> -#if GREP_PCRE2_DEBUG_MALLOC
> -	static int count = 1;
> -	fprintf(stderr, "PCRE2:%p -> #%02d: alloc(%lu)\n", pointer, count++, size);
> -#endif
> -	return pointer;
> -}
> -
> -static void pcre2_free(void *pointer, MAYBE_UNUSED void *memory_data)
> -{
> -#if GREP_PCRE2_DEBUG_MALLOC
> -	static int count = 1;
> -	if (pointer)
> -		fprintf(stderr, "PCRE2:%p -> #%02d: free()\n", pointer, count++);
> -#endif
> -	free(pointer);
> -}
> -#endif
> -
>  static const char *color_grep_slots[] = {
>  	[GREP_COLOR_CONTEXT]	    = "context",
>  	[GREP_COLOR_FILENAME]	    = "filename",
> @@ -355,6 +331,28 @@ static int is_fixed(const char *s, size_t len)
>  }
>
>  #ifdef USE_LIBPCRE2
> +#define GREP_PCRE2_DEBUG_MALLOC 0
> +
> +static void *pcre2_malloc(PCRE2_SIZE size, MAYBE_UNUSED void *memory_data)
> +{
> +	void *pointer = malloc(size);
> +#if GREP_PCRE2_DEBUG_MALLOC
> +	static int count = 1;
> +	fprintf(stderr, "PCRE2:%p -> #%02d: alloc(%lu)\n", pointer, count++, size);
> +#endif
> +	return pointer;
> +}
> +
> +static void pcre2_free(void *pointer, MAYBE_UNUSED void *memory_data)
> +{
> +#if GREP_PCRE2_DEBUG_MALLOC
> +	static int count = 1;
> +	if (pointer)
> +		fprintf(stderr, "PCRE2:%p -> #%02d: free()\n", pointer, count++);
> +#endif
> +	free(pointer);
> +}
> +
>  static void compile_pcre2_pattern(struct grep_pat *p, const struct grep_opt *opt)
>  {
>  	int error;
> --
> 2.30.0.284.gd98b1dd5eaa7
>
>

^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: [PATCH 00/10] grep/pcre2: memory allocation fixes
  2021-02-04 21:05     ` [PATCH 00/10] grep/pcre2: memory allocation fixes Ævar Arnfjörð Bjarmason
@ 2021-02-10 21:34       ` Junio C Hamano
  2021-02-18  0:07       ` [PATCH v2 " Ævar Arnfjörð Bjarmason
                         ` (10 subsequent siblings)
  11 siblings, 0 replies; 42+ messages in thread
From: Junio C Hamano @ 2021-02-10 21:34 UTC (permalink / raw)
  To: Ævar Arnfjörð Bjarmason
  Cc: git, Jeff King, Johannes Schindelin, Carlo Marcelo Arenas Belón

Ævar Arnfjörð Bjarmason  <avarab@gmail.com> writes:

> This series fixes up bugs in our PCRE v2 wrapper code and how it
> handles malloc()/free().
>
> Junio: I'm splitting this off my recently sent 25 patch series, which
> I should probably have sent as an RFC:
> https://lore.kernel.org/git/20210203032811.14979-1-avarab@gmail.com/
>
> It's on top of "next", as it would otherwise conflict with my
> in-flight ab/grep-pcre-invalid-utf8, ab/lose-grep-debug and ab/retire-pcre1.

These three seem to be solid and I am planning to merge them down to
'master' soonish.  I was hoping that the series would get enough
reviews by the time it happens so that I can expect an update that
cleanly applies on top of 'master', and the plan seems to be working
well ;-)


^ permalink raw reply	[flat|nested] 42+ messages in thread

* [PATCH v2 00/10] grep/pcre2: memory allocation fixes
  2021-02-04 21:05     ` [PATCH 00/10] grep/pcre2: memory allocation fixes Ævar Arnfjörð Bjarmason
  2021-02-10 21:34       ` Junio C Hamano
@ 2021-02-18  0:07       ` Ævar Arnfjörð Bjarmason
  2021-03-04  0:34         ` Junio C Hamano
  2021-02-18  0:07       ` [PATCH v2 01/10] grep/pcre2: drop needless assignment + assert() on opt->pcre2 Ævar Arnfjörð Bjarmason
                         ` (9 subsequent siblings)
  11 siblings, 1 reply; 42+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2021-02-18  0:07 UTC (permalink / raw)
  To: git
  Cc: Junio C Hamano, Jeff King, Johannes Schindelin,
	Carlo Marcelo Arenas Belón,
	Ævar Arnfjörð Bjarmason

Now based on "master", when I sent v1[1] it dependen on other
in-flight topics of mine.

Addresses minor issues with the commit messages raised by Johannes on
v1, and other commit message issues I noticed myself on re-reading
this.

1. https://lore.kernel.org/git/20210204210556.25242-1-avarab@gmail.com/

Ævar Arnfjörð Bjarmason (10):
  grep/pcre2: drop needless assignment + assert() on opt->pcre2
  grep/pcre2: drop needless assignment to NULL
  grep/pcre2: correct reference to grep_init() in comment
  grep/pcre2: prepare to add debugging to pcre2_malloc()
  grep/pcre2: add GREP_PCRE2_DEBUG_MALLOC debug mode
  grep/pcre2: use compile-time PCREv2 version test
  grep/pcre2: use pcre2_maketables_free() function
  grep/pcre2: actually make pcre2 use custom allocator
  grep/pcre2: move back to thread-only PCREv2 structures
  grep/pcre2: move definitions of pcre2_{malloc,free}

 builtin/grep.c |  1 -
 grep.c         | 99 ++++++++++++++++++++++----------------------------
 grep.h         |  9 ++++-
 3 files changed, 51 insertions(+), 58 deletions(-)

Range-diff:
 1:  a11f1e91552 !  1:  40d2e47c540 grep/pcre2: drop needless assignment + assert() on opt->pcre2
    @@ Commit message
         There was never a good reason for this, it's just a relic from when I
         initially wrote the PCREv2 support. We're not going to have confusion
         about compile_pcre2_pattern() being called when it shouldn't just
    -    because we forgot to cargo-cult this opt->pcre2 option, and "opt"
    -    is (mostly) used for the options the user supplied, let's avoid the
    -    pattern of needlessly assigning to it.
    +    because we forgot to cargo-cult this opt->pcre2 option.
     
    -    With my in-flight removal of PCREv1 [1] ("Remove support for v1 of the
    -    PCRE library", 2021-01-24) there'll be even less confusion around what
    -    we call where in these codepaths, which is one more reason to remove
    -    this.
    +    Furthermore the "struct grep_opt" is (mostly) used for the options the
    +    user supplied, let's avoid the pattern of needlessly assigning to it.
     
    -    1. https://lore.kernel.org/git/xmqqmtwy29x8.fsf@gitster.c.googlers.com/
    +    With my recent removal of the PCREv1 backend in 7599730b7e2 (Remove
    +    support for v1 of the PCRE library, 2021-01-24) there's even less
    +    confusion around what we call where in these codepaths, which is one
    +    more reason to remove this.
     
         Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
     
 2:  db0ef9189e3 !  2:  e02f9b5ab50 grep/pcre2: drop needless assignment to NULL
    @@ Commit message
         grep/pcre2: drop needless assignment to NULL
     
         Remove a redundant assignment of pcre2_compile_context dating back to
    -    my 94da9193a6e (grep: add support for PCRE v2, 2017-06-01). In
    -    create_grep_pat() we xcalloc() the "grep_pat" struct, so there's no
    +    my 94da9193a6e (grep: add support for PCRE v2, 2017-06-01).
    +
    +    In create_grep_pat() we xcalloc() the "grep_pat" struct, so there's no
         need to NULL out individual members here.
     
         I think this was probably something left over from an earlier
 3:  9c5429f4d96 =  3:  2bcb6c53589 grep/pcre2: correct reference to grep_init() in comment
 4:  e5558f5f0f1 !  4:  78477193cd4 grep/pcre2: prepare to add debugging to pcre2_malloc()
    @@ Commit message
         grep/pcre2: prepare to add debugging to pcre2_malloc()
     
         Change pcre2_malloc() in a way that'll make it easier for a debugging
    -    fprintf() to spew out the allocated pointer. This doesn't introduce
    -    any functional change, it just makes a subsequent commit's diff easier
    -    to read. Changes code added in 513f2b0bbd4 (grep: make PCRE2 aware of
    -    custom allocator, 2019-10-16).
    +    fprintf() to spew out the allocated pointer.
    +
    +    This doesn't introduce any functional change, it just makes a
    +    subsequent commit's diff easier to read. Changes code added in
    +    513f2b0bbd4 (grep: make PCRE2 aware of custom allocator, 2019-10-16).
     
         Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
     
 5:  7968effaa8a !  5:  cbeb521bd65 grep/pcre2: add GREP_PCRE2_DEBUG_MALLOC debug mode
    @@ Commit message
         grep/pcre2: add GREP_PCRE2_DEBUG_MALLOC debug mode
     
         Add optional printing of PCREv2 allocations to stderr for a developer
    -    who manually changes the GREP_PCRE2_DEBUG_MALLOC definition to
    -    "1".
    +    who manually changes the GREP_PCRE2_DEBUG_MALLOC definition to "1".
    +
    +    You need to manually change the definition in the source file similar
    +    to the DEBUG_MAILMAP, there's no Makefile knob for this.
     
         This will be referenced a subsequent commit, and is generally useful
         to manually see what's going on with PCREv2 allocations while working
 6:  604e183c224 =  6:  788929c3de6 grep/pcre2: use compile-time PCREv2 version test
 7:  f864a9aac4c !  7:  6a4552c6d4e grep/pcre2: use pcre2_maketables_free() function
    @@ Commit message
         grep/pcre2: use pcre2_maketables_free() function
     
         Make use of the pcre2_maketables_free() function to free the memory
    -    allocated by pcre2_maketables(). At first sight it's strange that
    -    10da030ab75 (grep: avoid leak of chartables in PCRE2, 2019-10-16)
    -    which added the free() call here doesn't make use of the pcre2_free()
    -    the author introduced in the preceding commit in 513f2b0bbd4 (grep:
    -    make PCRE2 aware of custom allocator, 2019-10-16).
    +    allocated by pcre2_maketables().
    +
    +    At first sight it's strange that 10da030ab75 (grep: avoid leak of
    +    chartables in PCRE2, 2019-10-16) which added the free() call here
    +    doesn't make use of the pcre2_free() the author introduced in the
    +    preceding commit in 513f2b0bbd4 (grep: make PCRE2 aware of custom
    +    allocator, 2019-10-16).
     
         The reason is that at the time the function didn't exist. It was first
         introduced in PCREv2 version 10.34, released on 2019-11-21.
     
         Let's make use of it behind a macro. I don't think this matters for
         anything to do with custom allocators, but it makes our use of PCREv2
    -    more discoverable. At some distant point in the future we'll be able
    -    to drop the version guard, as nobody will be running a version older
    -    than 10.34.
    +    more discoverable.
    +
    +    At some distant point in the future we'll be able to drop the version
    +    guard, as nobody will be running a version older than 10.34.
     
         Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
     
 8:  cea9e066951 !  8:  bf58d597a4b grep/pcre2: actually make pcre2 use custom allocator
    @@ Commit message
         functions for allocation. We'll now use it for all PCREv2 allocations.
     
         The reason 513f2b0bbd4 worked as a bugfix for the USE_NED_ALLOCATOR
    -    issue is because it managed to target pretty much the allocation freed
    -    via free(), as opposed to by a pcre2_*free() function. I.e. the
    -    pcre2_maketables() and pcre2_maketables_free() pair. For most of the
    -    rest we continued allocating with stock malloc() inside PCREv2 itself,
    -    but didn't segfault because we'd use its corresponding free().
    +    issue is because it targeted the allocation freed via free(), as
    +    opposed to by a pcre2_*free() function. I.e. the pcre2_maketables()
    +    and pcre2_maketables_free() pair.
    +
    +    For most of the rest we continued allocating with stock malloc()
    +    inside PCREv2 itself, but didn't segfault because we'd use its
    +    corresponding free().
     
         In a preceding commit of mine I changed the free() to
         pcre2_maketables_free() on versions of PCREv2 10.34 and newer. So as
    @@ Commit message
             grep --threads=1 -iP æ.*var.*xyz
     
         Only use pcre2_{malloc,free}() for 2 malloc() calls and 2
    -    corresponding free() call. Now it's 12 calls to each. This can be
    +    corresponding free() calls. Now it's 12 calls to each. This can be
         observed with the GREP_PCRE2_DEBUG_MALLOC debug mode.
     
         Reading the history of how this bug got introduced it wasn't present
    @@ Commit message
         Thus the failed attempts to go down the route of only creating the
         general context in cases where we ourselves call pcre2_maketables(),
         before finally settling on the approach 513f2b0bbd4 took of always
    -    creating it.
    +    creating it, but then mostly not using it.
     
         Instead we should always create it, and then pass the general context
         to those functions that accept it, so that they'll consistently use
         our preferred memory allocation functions.
     
    -    1. https://public-inbox.org/git/3397e6797f872aedd18c6d795f4976e1c579514b.1565005867.git.gitgitgadget@gmail.com/
    +    1. https://lore.kernel.org/git/3397e6797f872aedd18c6d795f4976e1c579514b.1565005867.git.gitgitgadget@gmail.com/
         2. https://lore.kernel.org/git/CAPUEsphMh_ZqcH3M7PXC9jHTfEdQN3mhTAK2JDkdvKBp53YBoA@mail.gmail.com/
         3. https://lore.kernel.org/git/20190806085014.47776-3-carenas@gmail.com/
     
 9:  99eaa918261 !  9:  129adf872fb grep/pcre2: move back to thread-only PCREv2 structures
    @@ Commit message
         grep/pcre2: move back to thread-only PCREv2 structures
     
         Change the setup of the "pcre2_general_context" to happen per-thread
    -    in compile_pcre2_pattern() instead of in grep_init(), as happens with
    -    all the rest of the pcre2_* members of the grep_pat structure.
    +    in compile_pcre2_pattern() instead of in grep_init().
    +
    +    This change brings it in line with how the rest of the pcre2_* members
    +    in the grep_pat structure are set up.
     
         As noted in the preceding commit the approach 513f2b0bbd4 (grep: make
         PCRE2 aware of custom allocator, 2019-10-16) took to allocate the
         pcre2_general_context seems to have been initially based on a
         misunderstanding of how PCREv2 memory allocation works.
     
    -    This approach of creating a global context is just added complexity
    -    for almost zero gain. On my system it's 24 bytes saved per-thread, for
    -    context PCREv2 will then go on to some kilobytes for its own
    -    thread-local state.
    +    The approach of creating a global context in grep_init() is just added
    +    complexity for almost zero gain. On my system it's 24 bytes saved
    +    per-thread. For comparison PCREv2 will then go on to allocate at least
    +    a kilobyte for its own thread-local state.
     
         As noted in 6d423dd542f (grep: don't redundantly compile throwaway
         patterns under threading, 2017-05-25) the grep code is intentionally
    @@ Commit message
         structures globally, while making others thread-local.
     
         So let's remove this special case and make all of them thread-local
    -    for simplicity again.
    +    again for simplicity. With this change we could move the
    +    pcre2_{malloc,free} functions around to live closer to their current
    +    use. I'm not doing that here to keep this change small, that cleanup
    +    will be done in a follow-up commit.
     
         See also the discussion in 94da9193a6 (grep: add support for PCRE v2,
         2017-06-01) about thread safety, and Johannes's comments[1] to the
10:  462f12126c8 ! 10:  688d1b00d5d grep/pcre2: move definitions of pcre2_{malloc,free}
    @@ Commit message
         grep/pcre2: move definitions of pcre2_{malloc,free}
     
         Move the definitions of the pcre2_{malloc,free} functions above the
    -    compile_pcre2_pattern() function they're used it. Before the preceding
    -    commit they used to be needed earlier, but now we can move them to be
    -    adjacent to the other PCREv2 functions.
    +    compile_pcre2_pattern() function they're used in.
    +
    +    Before the preceding commit they used to be needed earlier, but now we
    +    can move them to be adjacent to the other PCREv2 functions.
     
         Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
     
-- 
2.30.0.284.gd98b1dd5eaa7


^ permalink raw reply	[flat|nested] 42+ messages in thread

* [PATCH v2 01/10] grep/pcre2: drop needless assignment + assert() on opt->pcre2
  2021-02-04 21:05     ` [PATCH 00/10] grep/pcre2: memory allocation fixes Ævar Arnfjörð Bjarmason
  2021-02-10 21:34       ` Junio C Hamano
  2021-02-18  0:07       ` [PATCH v2 " Ævar Arnfjörð Bjarmason
@ 2021-02-18  0:07       ` Ævar Arnfjörð Bjarmason
  2021-02-18  0:07       ` [PATCH v2 02/10] grep/pcre2: drop needless assignment to NULL Ævar Arnfjörð Bjarmason
                         ` (8 subsequent siblings)
  11 siblings, 0 replies; 42+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2021-02-18  0:07 UTC (permalink / raw)
  To: git
  Cc: Junio C Hamano, Jeff King, Johannes Schindelin,
	Carlo Marcelo Arenas Belón,
	Ævar Arnfjörð Bjarmason

Drop an assignment added in b65abcafc7a (grep: use PCRE v2 for
optimized fixed-string search, 2019-07-01) and the overly cautious
assert() I added in 94da9193a6e (grep: add support for PCRE v2,
2017-06-01).

There was never a good reason for this, it's just a relic from when I
initially wrote the PCREv2 support. We're not going to have confusion
about compile_pcre2_pattern() being called when it shouldn't just
because we forgot to cargo-cult this opt->pcre2 option.

Furthermore the "struct grep_opt" is (mostly) used for the options the
user supplied, let's avoid the pattern of needlessly assigning to it.

With my recent removal of the PCREv1 backend in 7599730b7e2 (Remove
support for v1 of the PCRE library, 2021-01-24) there's even less
confusion around what we call where in these codepaths, which is one
more reason to remove this.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
---
 grep.c | 3 ---
 1 file changed, 3 deletions(-)

diff --git a/grep.c b/grep.c
index aabfaaa4c32..816e23f17ef 100644
--- a/grep.c
+++ b/grep.c
@@ -373,8 +373,6 @@ static void compile_pcre2_pattern(struct grep_pat *p, const struct grep_opt *opt
 	int patinforet;
 	size_t jitsizearg;
 
-	assert(opt->pcre2);
-
 	p->pcre2_compile_context = NULL;
 
 	/* pcre2_global_context is initialized in append_grep_pattern */
@@ -555,7 +553,6 @@ static void compile_regexp(struct grep_pat *p, struct grep_opt *opt)
 #endif
 	if (p->fixed || p->is_fixed) {
 #ifdef USE_LIBPCRE2
-		opt->pcre2 = 1;
 		if (p->is_fixed) {
 			compile_pcre2_pattern(p, opt);
 		} else {
-- 
2.30.0.284.gd98b1dd5eaa7


^ permalink raw reply related	[flat|nested] 42+ messages in thread

* [PATCH v2 02/10] grep/pcre2: drop needless assignment to NULL
  2021-02-04 21:05     ` [PATCH 00/10] grep/pcre2: memory allocation fixes Ævar Arnfjörð Bjarmason
                         ` (2 preceding siblings ...)
  2021-02-18  0:07       ` [PATCH v2 01/10] grep/pcre2: drop needless assignment + assert() on opt->pcre2 Ævar Arnfjörð Bjarmason
@ 2021-02-18  0:07       ` Ævar Arnfjörð Bjarmason
  2021-02-18  0:07       ` [PATCH v2 03/10] grep/pcre2: correct reference to grep_init() in comment Ævar Arnfjörð Bjarmason
                         ` (7 subsequent siblings)
  11 siblings, 0 replies; 42+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2021-02-18  0:07 UTC (permalink / raw)
  To: git
  Cc: Junio C Hamano, Jeff King, Johannes Schindelin,
	Carlo Marcelo Arenas Belón,
	Ævar Arnfjörð Bjarmason

Remove a redundant assignment of pcre2_compile_context dating back to
my 94da9193a6e (grep: add support for PCRE v2, 2017-06-01).

In create_grep_pat() we xcalloc() the "grep_pat" struct, so there's no
need to NULL out individual members here.

I think this was probably something left over from an earlier
development version of mine.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
---
 grep.c | 2 --
 1 file changed, 2 deletions(-)

diff --git a/grep.c b/grep.c
index 816e23f17ef..f27c5de7f56 100644
--- a/grep.c
+++ b/grep.c
@@ -373,8 +373,6 @@ static void compile_pcre2_pattern(struct grep_pat *p, const struct grep_opt *opt
 	int patinforet;
 	size_t jitsizearg;
 
-	p->pcre2_compile_context = NULL;
-
 	/* pcre2_global_context is initialized in append_grep_pattern */
 	if (opt->ignore_case) {
 		if (!opt->ignore_locale && has_non_ascii(p->pattern)) {
-- 
2.30.0.284.gd98b1dd5eaa7


^ permalink raw reply related	[flat|nested] 42+ messages in thread

* [PATCH v2 03/10] grep/pcre2: correct reference to grep_init() in comment
  2021-02-04 21:05     ` [PATCH 00/10] grep/pcre2: memory allocation fixes Ævar Arnfjörð Bjarmason
                         ` (3 preceding siblings ...)
  2021-02-18  0:07       ` [PATCH v2 02/10] grep/pcre2: drop needless assignment to NULL Ævar Arnfjörð Bjarmason
@ 2021-02-18  0:07       ` Ævar Arnfjörð Bjarmason
  2021-02-18  0:07       ` [PATCH v2 04/10] grep/pcre2: prepare to add debugging to pcre2_malloc() Ævar Arnfjörð Bjarmason
                         ` (6 subsequent siblings)
  11 siblings, 0 replies; 42+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2021-02-18  0:07 UTC (permalink / raw)
  To: git
  Cc: Junio C Hamano, Jeff King, Johannes Schindelin,
	Carlo Marcelo Arenas Belón,
	Ævar Arnfjörð Bjarmason

Correct a comment added in 513f2b0bbd4 (grep: make PCRE2 aware of
custom allocator, 2019-10-16). This comment was never correct in
git.git, but was consistent with an older version of the patch[1].

1. https://lore.kernel.org/git/20190806163658.66932-3-carenas@gmail.com/

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
---
 grep.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/grep.c b/grep.c
index f27c5de7f56..b9adcd83e7a 100644
--- a/grep.c
+++ b/grep.c
@@ -373,7 +373,7 @@ static void compile_pcre2_pattern(struct grep_pat *p, const struct grep_opt *opt
 	int patinforet;
 	size_t jitsizearg;
 
-	/* pcre2_global_context is initialized in append_grep_pattern */
+	/* pcre2_global_context is initialized in grep_init */
 	if (opt->ignore_case) {
 		if (!opt->ignore_locale && has_non_ascii(p->pattern)) {
 			if (!pcre2_global_context)
-- 
2.30.0.284.gd98b1dd5eaa7


^ permalink raw reply related	[flat|nested] 42+ messages in thread

* [PATCH v2 04/10] grep/pcre2: prepare to add debugging to pcre2_malloc()
  2021-02-04 21:05     ` [PATCH 00/10] grep/pcre2: memory allocation fixes Ævar Arnfjörð Bjarmason
                         ` (4 preceding siblings ...)
  2021-02-18  0:07       ` [PATCH v2 03/10] grep/pcre2: correct reference to grep_init() in comment Ævar Arnfjörð Bjarmason
@ 2021-02-18  0:07       ` Ævar Arnfjörð Bjarmason
  2021-02-18  0:07       ` [PATCH v2 05/10] grep/pcre2: add GREP_PCRE2_DEBUG_MALLOC debug mode Ævar Arnfjörð Bjarmason
                         ` (5 subsequent siblings)
  11 siblings, 0 replies; 42+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2021-02-18  0:07 UTC (permalink / raw)
  To: git
  Cc: Junio C Hamano, Jeff King, Johannes Schindelin,
	Carlo Marcelo Arenas Belón,
	Ævar Arnfjörð Bjarmason

Change pcre2_malloc() in a way that'll make it easier for a debugging
fprintf() to spew out the allocated pointer.

This doesn't introduce any functional change, it just makes a
subsequent commit's diff easier to read. Changes code added in
513f2b0bbd4 (grep: make PCRE2 aware of custom allocator, 2019-10-16).

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
---
 grep.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/grep.c b/grep.c
index b9adcd83e7a..f96d86c9293 100644
--- a/grep.c
+++ b/grep.c
@@ -45,7 +45,8 @@ static pcre2_general_context *pcre2_global_context;
 
 static void *pcre2_malloc(PCRE2_SIZE size, MAYBE_UNUSED void *memory_data)
 {
-	return malloc(size);
+	void *pointer = malloc(size);
+	return pointer;
 }
 
 static void pcre2_free(void *pointer, MAYBE_UNUSED void *memory_data)
-- 
2.30.0.284.gd98b1dd5eaa7


^ permalink raw reply related	[flat|nested] 42+ messages in thread

* [PATCH v2 05/10] grep/pcre2: add GREP_PCRE2_DEBUG_MALLOC debug mode
  2021-02-04 21:05     ` [PATCH 00/10] grep/pcre2: memory allocation fixes Ævar Arnfjörð Bjarmason
                         ` (5 preceding siblings ...)
  2021-02-18  0:07       ` [PATCH v2 04/10] grep/pcre2: prepare to add debugging to pcre2_malloc() Ævar Arnfjörð Bjarmason
@ 2021-02-18  0:07       ` Ævar Arnfjörð Bjarmason
  2021-02-18  0:07       ` [PATCH v2 06/10] grep/pcre2: use compile-time PCREv2 version test Ævar Arnfjörð Bjarmason
                         ` (4 subsequent siblings)
  11 siblings, 0 replies; 42+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2021-02-18  0:07 UTC (permalink / raw)
  To: git
  Cc: Junio C Hamano, Jeff King, Johannes Schindelin,
	Carlo Marcelo Arenas Belón,
	Ævar Arnfjörð Bjarmason

Add optional printing of PCREv2 allocations to stderr for a developer
who manually changes the GREP_PCRE2_DEBUG_MALLOC definition to "1".

You need to manually change the definition in the source file similar
to the DEBUG_MAILMAP, there's no Makefile knob for this.

This will be referenced a subsequent commit, and is generally useful
to manually see what's going on with PCREv2 allocations while working
on that code.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
---
 grep.c | 10 ++++++++++
 1 file changed, 10 insertions(+)

diff --git a/grep.c b/grep.c
index f96d86c9293..7d262a23d88 100644
--- a/grep.c
+++ b/grep.c
@@ -42,15 +42,25 @@ static struct grep_opt grep_defaults = {
 
 #ifdef USE_LIBPCRE2
 static pcre2_general_context *pcre2_global_context;
+#define GREP_PCRE2_DEBUG_MALLOC 0
 
 static void *pcre2_malloc(PCRE2_SIZE size, MAYBE_UNUSED void *memory_data)
 {
 	void *pointer = malloc(size);
+#if GREP_PCRE2_DEBUG_MALLOC
+	static int count = 1;
+	fprintf(stderr, "PCRE2:%p -> #%02d: alloc(%lu)\n", pointer, count++, size);
+#endif
 	return pointer;
 }
 
 static void pcre2_free(void *pointer, MAYBE_UNUSED void *memory_data)
 {
+#if GREP_PCRE2_DEBUG_MALLOC
+	static int count = 1;
+	if (pointer)
+		fprintf(stderr, "PCRE2:%p -> #%02d: free()\n", pointer, count++);
+#endif
 	free(pointer);
 }
 #endif
-- 
2.30.0.284.gd98b1dd5eaa7


^ permalink raw reply related	[flat|nested] 42+ messages in thread

* [PATCH v2 06/10] grep/pcre2: use compile-time PCREv2 version test
  2021-02-04 21:05     ` [PATCH 00/10] grep/pcre2: memory allocation fixes Ævar Arnfjörð Bjarmason
                         ` (6 preceding siblings ...)
  2021-02-18  0:07       ` [PATCH v2 05/10] grep/pcre2: add GREP_PCRE2_DEBUG_MALLOC debug mode Ævar Arnfjörð Bjarmason
@ 2021-02-18  0:07       ` Ævar Arnfjörð Bjarmason
  2021-03-04  0:14         ` Junio C Hamano
  2021-02-18  0:07       ` [PATCH v2 07/10] grep/pcre2: use pcre2_maketables_free() function Ævar Arnfjörð Bjarmason
                         ` (3 subsequent siblings)
  11 siblings, 1 reply; 42+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2021-02-18  0:07 UTC (permalink / raw)
  To: git
  Cc: Junio C Hamano, Jeff King, Johannes Schindelin,
	Carlo Marcelo Arenas Belón,
	Ævar Arnfjörð Bjarmason

Replace a use of pcre2_config(PCRE2_CONFIG_VERSION, ...) which I added
in 95ca1f987ed (grep/pcre2: better support invalid UTF-8 haystacks,
2021-01-24) with the same test done at compile-time.

It might be cuter to do this at runtime since we don't have to do the
"major >= 11 || (major >= 10 && ...)" test. But in the next commit
we'll add another version comparison that absolutely needs to be done
at compile-time, so we're better of being consistent across the board.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
---
 grep.c | 18 ++++--------------
 grep.h |  3 +++
 2 files changed, 7 insertions(+), 14 deletions(-)

diff --git a/grep.c b/grep.c
index 7d262a23d88..e58044474dc 100644
--- a/grep.c
+++ b/grep.c
@@ -400,21 +400,11 @@ static void compile_pcre2_pattern(struct grep_pat *p, const struct grep_opt *opt
 	    !(!opt->ignore_case && (p->fixed || p->is_fixed)))
 		options |= (PCRE2_UTF | PCRE2_MATCH_INVALID_UTF);
 
+#ifdef GIT_PCRE2_VERSION_10_36_OR_HIGHER
 	/* Work around https://bugs.exim.org/show_bug.cgi?id=2642 fixed in 10.36 */
-	if (PCRE2_MATCH_INVALID_UTF && options & (PCRE2_UTF | PCRE2_CASELESS)) {
-		struct strbuf buf;
-		int len;
-		int err;
-
-		if ((len = pcre2_config(PCRE2_CONFIG_VERSION, NULL)) < 0)
-			BUG("pcre2_config(..., NULL) failed: %d", len);
-		strbuf_init(&buf, len + 1);
-		if ((err = pcre2_config(PCRE2_CONFIG_VERSION, buf.buf)) < 0)
-			BUG("pcre2_config(..., buf.buf) failed: %d", err);
-		if (versioncmp(buf.buf, "10.36") < 0)
-			options |= PCRE2_NO_START_OPTIMIZE;
-		strbuf_release(&buf);
-	}
+	if (PCRE2_MATCH_INVALID_UTF && options & (PCRE2_UTF | PCRE2_CASELESS))
+		options |= PCRE2_NO_START_OPTIMIZE;
+#endif
 
 	p->pcre2_pattern = pcre2_compile((PCRE2_SPTR)p->pattern,
 					 p->patternlen, options, &error, &erroffset,
diff --git a/grep.h b/grep.h
index ae89d6254b3..54e52042cb9 100644
--- a/grep.h
+++ b/grep.h
@@ -4,6 +4,9 @@
 #ifdef USE_LIBPCRE2
 #define PCRE2_CODE_UNIT_WIDTH 8
 #include <pcre2.h>
+#if (PCRE2_MAJOR >= 10 && PCRE2_MINOR >= 36) || PCRE2_MAJOR >= 11
+#define GIT_PCRE2_VERSION_10_36_OR_HIGHER
+#endif
 #else
 typedef int pcre2_code;
 typedef int pcre2_match_data;
-- 
2.30.0.284.gd98b1dd5eaa7


^ permalink raw reply related	[flat|nested] 42+ messages in thread

* [PATCH v2 07/10] grep/pcre2: use pcre2_maketables_free() function
  2021-02-04 21:05     ` [PATCH 00/10] grep/pcre2: memory allocation fixes Ævar Arnfjörð Bjarmason
                         ` (7 preceding siblings ...)
  2021-02-18  0:07       ` [PATCH v2 06/10] grep/pcre2: use compile-time PCREv2 version test Ævar Arnfjörð Bjarmason
@ 2021-02-18  0:07       ` Ævar Arnfjörð Bjarmason
  2021-02-18  0:07       ` [PATCH v2 08/10] grep/pcre2: actually make pcre2 use custom allocator Ævar Arnfjörð Bjarmason
                         ` (2 subsequent siblings)
  11 siblings, 0 replies; 42+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2021-02-18  0:07 UTC (permalink / raw)
  To: git
  Cc: Junio C Hamano, Jeff King, Johannes Schindelin,
	Carlo Marcelo Arenas Belón,
	Ævar Arnfjörð Bjarmason

Make use of the pcre2_maketables_free() function to free the memory
allocated by pcre2_maketables().

At first sight it's strange that 10da030ab75 (grep: avoid leak of
chartables in PCRE2, 2019-10-16) which added the free() call here
doesn't make use of the pcre2_free() the author introduced in the
preceding commit in 513f2b0bbd4 (grep: make PCRE2 aware of custom
allocator, 2019-10-16).

The reason is that at the time the function didn't exist. It was first
introduced in PCREv2 version 10.34, released on 2019-11-21.

Let's make use of it behind a macro. I don't think this matters for
anything to do with custom allocators, but it makes our use of PCREv2
more discoverable.

At some distant point in the future we'll be able to drop the version
guard, as nobody will be running a version older than 10.34.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
---
 grep.c | 4 ++++
 grep.h | 3 +++
 2 files changed, 7 insertions(+)

diff --git a/grep.c b/grep.c
index e58044474dc..c63dbff4b24 100644
--- a/grep.c
+++ b/grep.c
@@ -490,7 +490,11 @@ static void free_pcre2_pattern(struct grep_pat *p)
 	pcre2_compile_context_free(p->pcre2_compile_context);
 	pcre2_code_free(p->pcre2_pattern);
 	pcre2_match_data_free(p->pcre2_match_data);
+#ifdef GIT_PCRE2_VERSION_10_34_OR_HIGHER
+	pcre2_maketables_free(pcre2_global_context, p->pcre2_tables);
+#else
 	free((void *)p->pcre2_tables);
+#endif
 }
 #else /* !USE_LIBPCRE2 */
 static void compile_pcre2_pattern(struct grep_pat *p, const struct grep_opt *opt)
diff --git a/grep.h b/grep.h
index 54e52042cb9..64666e92049 100644
--- a/grep.h
+++ b/grep.h
@@ -7,6 +7,9 @@
 #if (PCRE2_MAJOR >= 10 && PCRE2_MINOR >= 36) || PCRE2_MAJOR >= 11
 #define GIT_PCRE2_VERSION_10_36_OR_HIGHER
 #endif
+#if (PCRE2_MAJOR >= 10 && PCRE2_MINOR >= 34) || PCRE2_MAJOR >= 11
+#define GIT_PCRE2_VERSION_10_34_OR_HIGHER
+#endif
 #else
 typedef int pcre2_code;
 typedef int pcre2_match_data;
-- 
2.30.0.284.gd98b1dd5eaa7


^ permalink raw reply related	[flat|nested] 42+ messages in thread

* [PATCH v2 08/10] grep/pcre2: actually make pcre2 use custom allocator
  2021-02-04 21:05     ` [PATCH 00/10] grep/pcre2: memory allocation fixes Ævar Arnfjörð Bjarmason
                         ` (8 preceding siblings ...)
  2021-02-18  0:07       ` [PATCH v2 07/10] grep/pcre2: use pcre2_maketables_free() function Ævar Arnfjörð Bjarmason
@ 2021-02-18  0:07       ` Ævar Arnfjörð Bjarmason
  2021-03-04  0:24         ` Junio C Hamano
  2021-02-18  0:07       ` [PATCH v2 09/10] grep/pcre2: move back to thread-only PCREv2 structures Ævar Arnfjörð Bjarmason
  2021-02-18  0:07       ` [PATCH v2 10/10] grep/pcre2: move definitions of pcre2_{malloc,free} Ævar Arnfjörð Bjarmason
  11 siblings, 1 reply; 42+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2021-02-18  0:07 UTC (permalink / raw)
  To: git
  Cc: Junio C Hamano, Jeff King, Johannes Schindelin,
	Carlo Marcelo Arenas Belón,
	Ævar Arnfjörð Bjarmason

Continue work started in 513f2b0bbd4 (grep: make PCRE2 aware of custom
allocator, 2019-10-16) and make PCREv2 use our pcre2_{malloc,free}().
functions for allocation. We'll now use it for all PCREv2 allocations.

The reason 513f2b0bbd4 worked as a bugfix for the USE_NED_ALLOCATOR
issue is because it targeted the allocation freed via free(), as
opposed to by a pcre2_*free() function. I.e. the pcre2_maketables()
and pcre2_maketables_free() pair.

For most of the rest we continued allocating with stock malloc()
inside PCREv2 itself, but didn't segfault because we'd use its
corresponding free().

In a preceding commit of mine I changed the free() to
pcre2_maketables_free() on versions of PCREv2 10.34 and newer. So as
far as fixing the segfault goes we could revert 513f2b0bbd4. But then
we wouldn't use the desired allocator, let's just use it instead.

Before this patch we'd on e.g.:

    grep --threads=1 -iP æ.*var.*xyz

Only use pcre2_{malloc,free}() for 2 malloc() calls and 2
corresponding free() calls. Now it's 12 calls to each. This can be
observed with the GREP_PCRE2_DEBUG_MALLOC debug mode.

Reading the history of how this bug got introduced it wasn't present
in Johannes's original patch[1] to fix the issue.

My reading of that thread is that the approach the follow-up patches
to Johannes's original pursued were based on misunderstanding of how
the PCREv2 API works. In particular this part of [2]:

    "most of the time (like when using UTF-8) the chartable (and
    therefore the global context) is not needed (even when using
    alternate allocators)"

That's simply not how PCREv2 memory allocation works. It's easy to see
how the misunderstanding came about. It's because (as noted above) the
issue was noticed because of our use of free() in our own grep.c for
freeing the memory allocated by pcre2_maketables().

Thus the misunderstanding that PCREv2's compile context is something
only needed for pcre2_maketables(), and e.g. an aborted earlier
attempt[3] to only set it up when we ourselves called
pcre2_maketables().

That's not what PCREv2's compile context is. To quote PCREv2's
documentation:

    "This context just contains pointers to (and data for) external
    memory management functions that are called from several places in
    the PCRE2 library."

Thus the failed attempts to go down the route of only creating the
general context in cases where we ourselves call pcre2_maketables(),
before finally settling on the approach 513f2b0bbd4 took of always
creating it, but then mostly not using it.

Instead we should always create it, and then pass the general context
to those functions that accept it, so that they'll consistently use
our preferred memory allocation functions.

1. https://lore.kernel.org/git/3397e6797f872aedd18c6d795f4976e1c579514b.1565005867.git.gitgitgadget@gmail.com/
2. https://lore.kernel.org/git/CAPUEsphMh_ZqcH3M7PXC9jHTfEdQN3mhTAK2JDkdvKBp53YBoA@mail.gmail.com/
3. https://lore.kernel.org/git/20190806085014.47776-3-carenas@gmail.com/

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
---
 grep.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/grep.c b/grep.c
index c63dbff4b24..0116ff5f09f 100644
--- a/grep.c
+++ b/grep.c
@@ -390,7 +390,7 @@ static void compile_pcre2_pattern(struct grep_pat *p, const struct grep_opt *opt
 			if (!pcre2_global_context)
 				BUG("pcre2_global_context uninitialized");
 			p->pcre2_tables = pcre2_maketables(pcre2_global_context);
-			p->pcre2_compile_context = pcre2_compile_context_create(NULL);
+			p->pcre2_compile_context = pcre2_compile_context_create(pcre2_global_context);
 			pcre2_set_character_tables(p->pcre2_compile_context,
 							p->pcre2_tables);
 		}
@@ -411,7 +411,7 @@ static void compile_pcre2_pattern(struct grep_pat *p, const struct grep_opt *opt
 					 p->pcre2_compile_context);
 
 	if (p->pcre2_pattern) {
-		p->pcre2_match_data = pcre2_match_data_create_from_pattern(p->pcre2_pattern, NULL);
+		p->pcre2_match_data = pcre2_match_data_create_from_pattern(p->pcre2_pattern, pcre2_global_context);
 		if (!p->pcre2_match_data)
 			die("Couldn't allocate PCRE2 match data");
 	} else {
-- 
2.30.0.284.gd98b1dd5eaa7


^ permalink raw reply related	[flat|nested] 42+ messages in thread

* [PATCH v2 09/10] grep/pcre2: move back to thread-only PCREv2 structures
  2021-02-04 21:05     ` [PATCH 00/10] grep/pcre2: memory allocation fixes Ævar Arnfjörð Bjarmason
                         ` (9 preceding siblings ...)
  2021-02-18  0:07       ` [PATCH v2 08/10] grep/pcre2: actually make pcre2 use custom allocator Ævar Arnfjörð Bjarmason
@ 2021-02-18  0:07       ` Ævar Arnfjörð Bjarmason
  2021-03-04  0:27         ` Junio C Hamano
  2021-02-18  0:07       ` [PATCH v2 10/10] grep/pcre2: move definitions of pcre2_{malloc,free} Ævar Arnfjörð Bjarmason
  11 siblings, 1 reply; 42+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2021-02-18  0:07 UTC (permalink / raw)
  To: git
  Cc: Junio C Hamano, Jeff King, Johannes Schindelin,
	Carlo Marcelo Arenas Belón,
	Ævar Arnfjörð Bjarmason

Change the setup of the "pcre2_general_context" to happen per-thread
in compile_pcre2_pattern() instead of in grep_init().

This change brings it in line with how the rest of the pcre2_* members
in the grep_pat structure are set up.

As noted in the preceding commit the approach 513f2b0bbd4 (grep: make
PCRE2 aware of custom allocator, 2019-10-16) took to allocate the
pcre2_general_context seems to have been initially based on a
misunderstanding of how PCREv2 memory allocation works.

The approach of creating a global context in grep_init() is just added
complexity for almost zero gain. On my system it's 24 bytes saved
per-thread. For comparison PCREv2 will then go on to allocate at least
a kilobyte for its own thread-local state.

As noted in 6d423dd542f (grep: don't redundantly compile throwaway
patterns under threading, 2017-05-25) the grep code is intentionally
not trying to micro-optimize allocations by e.g. sharing some PCREv2
structures globally, while making others thread-local.

So let's remove this special case and make all of them thread-local
again for simplicity. With this change we could move the
pcre2_{malloc,free} functions around to live closer to their current
use. I'm not doing that here to keep this change small, that cleanup
will be done in a follow-up commit.

See also the discussion in 94da9193a6 (grep: add support for PCRE v2,
2017-06-01) about thread safety, and Johannes's comments[1] to the
effect that we should be doing what this patch is doing.

1. https://lore.kernel.org/git/nycvar.QRO.7.76.6.1908052120302.46@tvgsbejvaqbjf.bet/

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
---
 builtin/grep.c |  1 -
 grep.c         | 41 +++++++++++++++--------------------------
 grep.h         |  3 ++-
 3 files changed, 17 insertions(+), 28 deletions(-)

diff --git a/builtin/grep.c b/builtin/grep.c
index 55d06c95130..c69fe993406 100644
--- a/builtin/grep.c
+++ b/builtin/grep.c
@@ -1175,6 +1175,5 @@ int cmd_grep(int argc, const char **argv, const char *prefix)
 		run_pager(&opt, prefix);
 	clear_pathspec(&pathspec);
 	free_grep_patterns(&opt);
-	grep_destroy();
 	return !hit;
 }
diff --git a/grep.c b/grep.c
index 0116ff5f09f..2599f329cd6 100644
--- a/grep.c
+++ b/grep.c
@@ -41,7 +41,6 @@ static struct grep_opt grep_defaults = {
 };
 
 #ifdef USE_LIBPCRE2
-static pcre2_general_context *pcre2_global_context;
 #define GREP_PCRE2_DEBUG_MALLOC 0
 
 static void *pcre2_malloc(PCRE2_SIZE size, MAYBE_UNUSED void *memory_data)
@@ -163,20 +162,9 @@ int grep_config(const char *var, const char *value, void *cb)
  * Initialize one instance of grep_opt and copy the
  * default values from the template we read the configuration
  * information in an earlier call to git_config(grep_config).
- *
- * If using PCRE, make sure that the library is configured
- * to use the same allocator as Git (e.g. nedmalloc on Windows).
- *
- * Any allocated memory needs to be released in grep_destroy().
  */
 void grep_init(struct grep_opt *opt, struct repository *repo, const char *prefix)
 {
-#if defined(USE_LIBPCRE2)
-	if (!pcre2_global_context)
-		pcre2_global_context = pcre2_general_context_create(
-					pcre2_malloc, pcre2_free, NULL);
-#endif
-
 	*opt = grep_defaults;
 
 	opt->repo = repo;
@@ -186,13 +174,6 @@ void grep_init(struct grep_opt *opt, struct repository *repo, const char *prefix
 	opt->header_tail = &opt->header_list;
 }
 
-void grep_destroy(void)
-{
-#ifdef USE_LIBPCRE2
-	pcre2_general_context_free(pcre2_global_context);
-#endif
-}
-
 static void grep_set_pattern_type_option(enum grep_pattern_type pattern_type, struct grep_opt *opt)
 {
 	/*
@@ -384,13 +365,20 @@ static void compile_pcre2_pattern(struct grep_pat *p, const struct grep_opt *opt
 	int patinforet;
 	size_t jitsizearg;
 
-	/* pcre2_global_context is initialized in grep_init */
+	/*
+	 * Call pcre2_general_context_create() before calling any
+	 * other pcre2_*(). It sets up our malloc()/free() functions
+	 * with which everything else is allocated.
+	 */
+	p->pcre2_general_context = pcre2_general_context_create(
+		pcre2_malloc, pcre2_free, NULL);
+	if (!p->pcre2_general_context)
+		die("Couldn't allocate PCRE2 general context");
+
 	if (opt->ignore_case) {
 		if (!opt->ignore_locale && has_non_ascii(p->pattern)) {
-			if (!pcre2_global_context)
-				BUG("pcre2_global_context uninitialized");
-			p->pcre2_tables = pcre2_maketables(pcre2_global_context);
-			p->pcre2_compile_context = pcre2_compile_context_create(pcre2_global_context);
+			p->pcre2_tables = pcre2_maketables(p->pcre2_general_context);
+			p->pcre2_compile_context = pcre2_compile_context_create(p->pcre2_general_context);
 			pcre2_set_character_tables(p->pcre2_compile_context,
 							p->pcre2_tables);
 		}
@@ -411,7 +399,7 @@ static void compile_pcre2_pattern(struct grep_pat *p, const struct grep_opt *opt
 					 p->pcre2_compile_context);
 
 	if (p->pcre2_pattern) {
-		p->pcre2_match_data = pcre2_match_data_create_from_pattern(p->pcre2_pattern, pcre2_global_context);
+		p->pcre2_match_data = pcre2_match_data_create_from_pattern(p->pcre2_pattern, p->pcre2_general_context);
 		if (!p->pcre2_match_data)
 			die("Couldn't allocate PCRE2 match data");
 	} else {
@@ -491,10 +479,11 @@ static void free_pcre2_pattern(struct grep_pat *p)
 	pcre2_code_free(p->pcre2_pattern);
 	pcre2_match_data_free(p->pcre2_match_data);
 #ifdef GIT_PCRE2_VERSION_10_34_OR_HIGHER
-	pcre2_maketables_free(pcre2_global_context, p->pcre2_tables);
+	pcre2_maketables_free(p->pcre2_general_context, p->pcre2_tables);
 #else
 	free((void *)p->pcre2_tables);
 #endif
+	pcre2_general_context_free(p->pcre2_general_context);
 }
 #else /* !USE_LIBPCRE2 */
 static void compile_pcre2_pattern(struct grep_pat *p, const struct grep_opt *opt)
diff --git a/grep.h b/grep.h
index 64666e92049..72f82b1e302 100644
--- a/grep.h
+++ b/grep.h
@@ -14,6 +14,7 @@
 typedef int pcre2_code;
 typedef int pcre2_match_data;
 typedef int pcre2_compile_context;
+typedef int pcre2_general_context;
 #endif
 #ifndef PCRE2_MATCH_INVALID_UTF
 /* PCRE2_MATCH_* dummy also with !USE_LIBPCRE2, for test-pcre2-config.c */
@@ -75,6 +76,7 @@ struct grep_pat {
 	pcre2_code *pcre2_pattern;
 	pcre2_match_data *pcre2_match_data;
 	pcre2_compile_context *pcre2_compile_context;
+	pcre2_general_context *pcre2_general_context;
 	const uint8_t *pcre2_tables;
 	uint32_t pcre2_jit_on;
 	unsigned fixed:1;
@@ -167,7 +169,6 @@ struct grep_opt {
 
 int grep_config(const char *var, const char *value, void *);
 void grep_init(struct grep_opt *, struct repository *repo, const char *prefix);
-void grep_destroy(void);
 void grep_commit_pattern_type(enum grep_pattern_type, struct grep_opt *opt);
 
 void append_grep_pat(struct grep_opt *opt, const char *pat, size_t patlen, const char *origin, int no, enum grep_pat_token t);
-- 
2.30.0.284.gd98b1dd5eaa7


^ permalink raw reply related	[flat|nested] 42+ messages in thread

* [PATCH v2 10/10] grep/pcre2: move definitions of pcre2_{malloc,free}
  2021-02-04 21:05     ` [PATCH 00/10] grep/pcre2: memory allocation fixes Ævar Arnfjörð Bjarmason
                         ` (10 preceding siblings ...)
  2021-02-18  0:07       ` [PATCH v2 09/10] grep/pcre2: move back to thread-only PCREv2 structures Ævar Arnfjörð Bjarmason
@ 2021-02-18  0:07       ` Ævar Arnfjörð Bjarmason
  2021-03-04  0:28         ` Junio C Hamano
  11 siblings, 1 reply; 42+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2021-02-18  0:07 UTC (permalink / raw)
  To: git
  Cc: Junio C Hamano, Jeff King, Johannes Schindelin,
	Carlo Marcelo Arenas Belón,
	Ævar Arnfjörð Bjarmason

Move the definitions of the pcre2_{malloc,free} functions above the
compile_pcre2_pattern() function they're used in.

Before the preceding commit they used to be needed earlier, but now we
can move them to be adjacent to the other PCREv2 functions.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
---
 grep.c | 46 ++++++++++++++++++++++------------------------
 1 file changed, 22 insertions(+), 24 deletions(-)

diff --git a/grep.c b/grep.c
index 2599f329cd6..636ac48bf07 100644
--- a/grep.c
+++ b/grep.c
@@ -40,30 +40,6 @@ static struct grep_opt grep_defaults = {
 	.output = std_output,
 };
 
-#ifdef USE_LIBPCRE2
-#define GREP_PCRE2_DEBUG_MALLOC 0
-
-static void *pcre2_malloc(PCRE2_SIZE size, MAYBE_UNUSED void *memory_data)
-{
-	void *pointer = malloc(size);
-#if GREP_PCRE2_DEBUG_MALLOC
-	static int count = 1;
-	fprintf(stderr, "PCRE2:%p -> #%02d: alloc(%lu)\n", pointer, count++, size);
-#endif
-	return pointer;
-}
-
-static void pcre2_free(void *pointer, MAYBE_UNUSED void *memory_data)
-{
-#if GREP_PCRE2_DEBUG_MALLOC
-	static int count = 1;
-	if (pointer)
-		fprintf(stderr, "PCRE2:%p -> #%02d: free()\n", pointer, count++);
-#endif
-	free(pointer);
-}
-#endif
-
 static const char *color_grep_slots[] = {
 	[GREP_COLOR_CONTEXT]	    = "context",
 	[GREP_COLOR_FILENAME]	    = "filename",
@@ -355,6 +331,28 @@ static int is_fixed(const char *s, size_t len)
 }
 
 #ifdef USE_LIBPCRE2
+#define GREP_PCRE2_DEBUG_MALLOC 0
+
+static void *pcre2_malloc(PCRE2_SIZE size, MAYBE_UNUSED void *memory_data)
+{
+	void *pointer = malloc(size);
+#if GREP_PCRE2_DEBUG_MALLOC
+	static int count = 1;
+	fprintf(stderr, "PCRE2:%p -> #%02d: alloc(%lu)\n", pointer, count++, size);
+#endif
+	return pointer;
+}
+
+static void pcre2_free(void *pointer, MAYBE_UNUSED void *memory_data)
+{
+#if GREP_PCRE2_DEBUG_MALLOC
+	static int count = 1;
+	if (pointer)
+		fprintf(stderr, "PCRE2:%p -> #%02d: free()\n", pointer, count++);
+#endif
+	free(pointer);
+}
+
 static void compile_pcre2_pattern(struct grep_pat *p, const struct grep_opt *opt)
 {
 	int error;
-- 
2.30.0.284.gd98b1dd5eaa7


^ permalink raw reply related	[flat|nested] 42+ messages in thread

* Re: [PATCH v2 06/10] grep/pcre2: use compile-time PCREv2 version test
  2021-02-18  0:07       ` [PATCH v2 06/10] grep/pcre2: use compile-time PCREv2 version test Ævar Arnfjörð Bjarmason
@ 2021-03-04  0:14         ` Junio C Hamano
  0 siblings, 0 replies; 42+ messages in thread
From: Junio C Hamano @ 2021-03-04  0:14 UTC (permalink / raw)
  To: Ævar Arnfjörð Bjarmason
  Cc: git, Jeff King, Johannes Schindelin, Carlo Marcelo Arenas Belón

Ævar Arnfjörð Bjarmason  <avarab@gmail.com> writes:

> +#if (PCRE2_MAJOR >= 10 && PCRE2_MINOR >= 36) || PCRE2_MAJOR >= 11
> +#define GIT_PCRE2_VERSION_10_36_OR_HIGHER
> +#endif

This is not wrong per-se but it does look misleading not to spell it
as

#if (PCRE2_MAJOR == 10 && PCRE2_MINOR >= 36) || PCRE2_MAJOR >= 11

which would convey the intention much clearly.

Hint to notice the difference: imagine if it were "9.37 and later in
9.X series, 10.36 and later in 10.X series and anything after 11 are
OK".

In other words, the minor version is always tied to a particular
major version and "major >= X && minor >= Y" is often a bug, even
though in this case it happens to be OK only because 10 and 11 are
consecutive.


^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: [PATCH 07/10] grep/pcre2: use pcre2_maketables_free() function
  2021-02-04 21:05     ` [PATCH 07/10] grep/pcre2: use pcre2_maketables_free() function Ævar Arnfjörð Bjarmason
  2021-02-10 10:43       ` Johannes Schindelin
@ 2021-03-04  0:16       ` Junio C Hamano
  1 sibling, 0 replies; 42+ messages in thread
From: Junio C Hamano @ 2021-03-04  0:16 UTC (permalink / raw)
  To: Ævar Arnfjörð Bjarmason
  Cc: git, Jeff King, Johannes Schindelin, Carlo Marcelo Arenas Belón

Ævar Arnfjörð Bjarmason  <avarab@gmail.com> writes:

> Make use of the pcre2_maketables_free() function to free the memory
> allocated by pcre2_maketables(). At first sight it's strange that
> 10da030ab75 (grep: avoid leak of chartables in PCRE2, 2019-10-16)
> which added the free() call here doesn't make use of the pcre2_free()
> the author introduced in the preceding commit in 513f2b0bbd4 (grep:
> make PCRE2 aware of custom allocator, 2019-10-16).
>
> The reason is that at the time the function didn't exist. It was first
> introduced in PCREv2 version 10.34, released on 2019-11-21.
>
> Let's make use of it behind a macro. I don't think this matters for
> anything to do with custom allocators, but it makes our use of PCREv2
> more discoverable. At some distant point in the future we'll be able
> to drop the version guard, as nobody will be running a version older
> than 10.34.

OK.  The same comment about the macro that happens to be OK only
because 10 and 11 are consecutive applies here.

>
> Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
> ---
>  grep.c | 4 ++++
>  grep.h | 3 +++
>  2 files changed, 7 insertions(+)
>
> diff --git a/grep.c b/grep.c
> index e58044474d..c63dbff4b2 100644
> --- a/grep.c
> +++ b/grep.c
> @@ -490,7 +490,11 @@ static void free_pcre2_pattern(struct grep_pat *p)
>  	pcre2_compile_context_free(p->pcre2_compile_context);
>  	pcre2_code_free(p->pcre2_pattern);
>  	pcre2_match_data_free(p->pcre2_match_data);
> +#ifdef GIT_PCRE2_VERSION_10_34_OR_HIGHER
> +	pcre2_maketables_free(pcre2_global_context, p->pcre2_tables);
> +#else
>  	free((void *)p->pcre2_tables);
> +#endif
>  }
>  #else /* !USE_LIBPCRE2 */
>  static void compile_pcre2_pattern(struct grep_pat *p, const struct grep_opt *opt)
> diff --git a/grep.h b/grep.h
> index 54e52042cb..64666e9204 100644
> --- a/grep.h
> +++ b/grep.h
> @@ -7,6 +7,9 @@
>  #if (PCRE2_MAJOR >= 10 && PCRE2_MINOR >= 36) || PCRE2_MAJOR >= 11
>  #define GIT_PCRE2_VERSION_10_36_OR_HIGHER
>  #endif
> +#if (PCRE2_MAJOR >= 10 && PCRE2_MINOR >= 34) || PCRE2_MAJOR >= 11
> +#define GIT_PCRE2_VERSION_10_34_OR_HIGHER
> +#endif
>  #else
>  typedef int pcre2_code;
>  typedef int pcre2_match_data;

^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: [PATCH v2 08/10] grep/pcre2: actually make pcre2 use custom allocator
  2021-02-18  0:07       ` [PATCH v2 08/10] grep/pcre2: actually make pcre2 use custom allocator Ævar Arnfjörð Bjarmason
@ 2021-03-04  0:24         ` Junio C Hamano
  0 siblings, 0 replies; 42+ messages in thread
From: Junio C Hamano @ 2021-03-04  0:24 UTC (permalink / raw)
  To: Ævar Arnfjörð Bjarmason
  Cc: git, Jeff King, Johannes Schindelin, Carlo Marcelo Arenas Belón

Ævar Arnfjörð Bjarmason  <avarab@gmail.com> writes:

> Continue work started in 513f2b0bbd4 (grep: make PCRE2 aware of custom
> allocator, 2019-10-16) and make PCREv2 use our pcre2_{malloc,free}().
> functions for allocation. We'll now use it for all PCREv2 allocations.
>
> The reason 513f2b0bbd4 worked as a bugfix for the USE_NED_ALLOCATOR
> issue is because it targeted the allocation freed via free(), as
> opposed to by a pcre2_*free() function. I.e. the pcre2_maketables()
> and pcre2_maketables_free() pair.
>
> For most of the rest we continued allocating with stock malloc()
> inside PCREv2 itself, but didn't segfault because we'd use its
> corresponding free().
>
> In a preceding commit of mine I changed the free() to
> pcre2_maketables_free() on versions of PCREv2 10.34 and newer. So as
> far as fixing the segfault goes ...

Wait, wait.  So, because of the previous step, we would start
segfaulting and we need to fix that breakage, which is the reason
why this commit exists?

If so, ...

> we could revert 513f2b0bbd4. But then
> we wouldn't use the desired allocator, let's just use it instead.

... I agree with the conclusion that both the previous step and this
step are needed and better than a reversion of 513f2b0b (grep: make
PCRE2 aware of custom allocator, 2019-10-16) and the previou step.

But even then, it feels somewhat backwards.  Shouldn't this step
come first, so that we would be using a matching alloc/free pair,
and then do the previous step?

> Instead we should always create it, and then pass the general context
> to those functions that accept it, so that they'll consistently use
> our preferred memory allocation functions.

Thanks.

^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: [PATCH v2 09/10] grep/pcre2: move back to thread-only PCREv2 structures
  2021-02-18  0:07       ` [PATCH v2 09/10] grep/pcre2: move back to thread-only PCREv2 structures Ævar Arnfjörð Bjarmason
@ 2021-03-04  0:27         ` Junio C Hamano
  0 siblings, 0 replies; 42+ messages in thread
From: Junio C Hamano @ 2021-03-04  0:27 UTC (permalink / raw)
  To: Ævar Arnfjörð Bjarmason
  Cc: git, Jeff King, Johannes Schindelin, Carlo Marcelo Arenas Belón

Ævar Arnfjörð Bjarmason  <avarab@gmail.com> writes:

> So let's remove this special case and make all of them thread-local
> again for simplicity. With this change we could move the
> pcre2_{malloc,free} functions around to live closer to their current
> use. I'm not doing that here to keep this change small, that cleanup
> will be done in a follow-up commit.

Nice.


^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: [PATCH v2 10/10] grep/pcre2: move definitions of pcre2_{malloc,free}
  2021-02-18  0:07       ` [PATCH v2 10/10] grep/pcre2: move definitions of pcre2_{malloc,free} Ævar Arnfjörð Bjarmason
@ 2021-03-04  0:28         ` Junio C Hamano
  0 siblings, 0 replies; 42+ messages in thread
From: Junio C Hamano @ 2021-03-04  0:28 UTC (permalink / raw)
  To: Ævar Arnfjörð Bjarmason
  Cc: git, Jeff King, Johannes Schindelin, Carlo Marcelo Arenas Belón

Ævar Arnfjörð Bjarmason  <avarab@gmail.com> writes:

> Move the definitions of the pcre2_{malloc,free} functions above the
> compile_pcre2_pattern() function they're used in.
>
> Before the preceding commit they used to be needed earlier, but now we
> can move them to be adjacent to the other PCREv2 functions.

Yes, much nicer.


^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: [PATCH v2 00/10] grep/pcre2: memory allocation fixes
  2021-02-18  0:07       ` [PATCH v2 " Ævar Arnfjörð Bjarmason
@ 2021-03-04  0:34         ` Junio C Hamano
  0 siblings, 0 replies; 42+ messages in thread
From: Junio C Hamano @ 2021-03-04  0:34 UTC (permalink / raw)
  To: Ævar Arnfjörð Bjarmason
  Cc: git, Jeff King, Johannes Schindelin, Carlo Marcelo Arenas Belón

Ævar Arnfjörð Bjarmason  <avarab@gmail.com> writes:

> Now based on "master", when I sent v1[1] it dependen on other
> in-flight topics of mine.

This has been in "Needs review" state for too long, so I looked at
them myself.  Aside from a few minor issues, they all made sense to
me.

Thanks.



^ permalink raw reply	[flat|nested] 42+ messages in thread

end of thread, other threads:[~2021-03-04  1:09 UTC | newest]

Thread overview: 42+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2019-10-16 12:06 [PATCH 0/3] Revive 'pcre2-chartables-leakfix' Johannes Schindelin via GitGitGadget
2019-10-16 12:06 ` [PATCH 1/3] grep: make PCRE1 aware of custom allocator Carlo Marcelo Arenas Belón via GitGitGadget
2019-10-16 12:06 ` [PATCH 2/3] grep: make PCRE2 " Carlo Marcelo Arenas Belón via GitGitGadget
2019-10-16 12:06 ` [PATCH 3/3] grep: avoid leak of chartables in PCRE2 Johannes Schindelin via GitGitGadget
2019-10-16 12:10 ` [PATCH v2 0/3] Revive 'pcre2-chartables-leakfix' Johannes Schindelin via GitGitGadget
2019-10-16 12:10   ` [PATCH v2 1/3] grep: make PCRE1 aware of custom allocator Carlo Marcelo Arenas Belón via GitGitGadget
2019-10-16 12:10   ` [PATCH v2 2/3] grep: make PCRE2 " Carlo Marcelo Arenas Belón via GitGitGadget
2019-10-18  1:38     ` Junio C Hamano
2021-02-04 21:05     ` [PATCH 00/10] grep/pcre2: memory allocation fixes Ævar Arnfjörð Bjarmason
2021-02-10 21:34       ` Junio C Hamano
2021-02-18  0:07       ` [PATCH v2 " Ævar Arnfjörð Bjarmason
2021-03-04  0:34         ` Junio C Hamano
2021-02-18  0:07       ` [PATCH v2 01/10] grep/pcre2: drop needless assignment + assert() on opt->pcre2 Ævar Arnfjörð Bjarmason
2021-02-18  0:07       ` [PATCH v2 02/10] grep/pcre2: drop needless assignment to NULL Ævar Arnfjörð Bjarmason
2021-02-18  0:07       ` [PATCH v2 03/10] grep/pcre2: correct reference to grep_init() in comment Ævar Arnfjörð Bjarmason
2021-02-18  0:07       ` [PATCH v2 04/10] grep/pcre2: prepare to add debugging to pcre2_malloc() Ævar Arnfjörð Bjarmason
2021-02-18  0:07       ` [PATCH v2 05/10] grep/pcre2: add GREP_PCRE2_DEBUG_MALLOC debug mode Ævar Arnfjörð Bjarmason
2021-02-18  0:07       ` [PATCH v2 06/10] grep/pcre2: use compile-time PCREv2 version test Ævar Arnfjörð Bjarmason
2021-03-04  0:14         ` Junio C Hamano
2021-02-18  0:07       ` [PATCH v2 07/10] grep/pcre2: use pcre2_maketables_free() function Ævar Arnfjörð Bjarmason
2021-02-18  0:07       ` [PATCH v2 08/10] grep/pcre2: actually make pcre2 use custom allocator Ævar Arnfjörð Bjarmason
2021-03-04  0:24         ` Junio C Hamano
2021-02-18  0:07       ` [PATCH v2 09/10] grep/pcre2: move back to thread-only PCREv2 structures Ævar Arnfjörð Bjarmason
2021-03-04  0:27         ` Junio C Hamano
2021-02-18  0:07       ` [PATCH v2 10/10] grep/pcre2: move definitions of pcre2_{malloc,free} Ævar Arnfjörð Bjarmason
2021-03-04  0:28         ` Junio C Hamano
2021-02-04 21:05     ` [PATCH 01/10] grep/pcre2: drop needless assignment + assert() on opt->pcre2 Ævar Arnfjörð Bjarmason
2021-02-04 21:05     ` [PATCH 02/10] grep/pcre2: drop needless assignment to NULL Ævar Arnfjörð Bjarmason
2021-02-04 21:05     ` [PATCH 03/10] grep/pcre2: correct reference to grep_init() in comment Ævar Arnfjörð Bjarmason
2021-02-04 21:05     ` [PATCH 04/10] grep/pcre2: prepare to add debugging to pcre2_malloc() Ævar Arnfjörð Bjarmason
2021-02-04 21:05     ` [PATCH 05/10] grep/pcre2: add GREP_PCRE2_DEBUG_MALLOC debug mode Ævar Arnfjörð Bjarmason
2021-02-10 10:38       ` Johannes Schindelin
2021-02-04 21:05     ` [PATCH 06/10] grep/pcre2: use compile-time PCREv2 version test Ævar Arnfjörð Bjarmason
2021-02-04 21:05     ` [PATCH 07/10] grep/pcre2: use pcre2_maketables_free() function Ævar Arnfjörð Bjarmason
2021-02-10 10:43       ` Johannes Schindelin
2021-03-04  0:16       ` Junio C Hamano
2021-02-04 21:05     ` [PATCH 08/10] grep/pcre2: actually make pcre2 use custom allocator Ævar Arnfjörð Bjarmason
2021-02-10 12:38       ` Johannes Schindelin
2021-02-04 21:05     ` [PATCH 09/10] grep/pcre2: move back to thread-only PCREv2 structures Ævar Arnfjörð Bjarmason
2021-02-04 21:05     ` [PATCH 10/10] grep/pcre2: move definitions of pcre2_{malloc,free} Ævar Arnfjörð Bjarmason
2021-02-10 12:40       ` Johannes Schindelin
2019-10-16 12:10   ` [PATCH v2 3/3] grep: avoid leak of chartables in PCRE2 Carlo Marcelo Arenas Belón via GitGitGadget

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).