All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH v2 00/29] Easy to review grep & pre-PCRE changes
@ 2017-05-13 23:14 Ævar Arnfjörð Bjarmason
  2017-05-13 23:14 ` [PATCH v2 01/29] Makefile & configure: reword inaccurate comment about PCRE Ævar Arnfjörð Bjarmason
                   ` (29 more replies)
  0 siblings, 30 replies; 42+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2017-05-13 23:14 UTC (permalink / raw)
  To: git
  Cc: Junio C Hamano, Jeff King, Jeffrey Walton,
	Michał Kiedrowicz, J Smith, Victor Leschuk,
	Nguyễn Thái Ngọc Duy, Fredrik Kuivinen,
	Brandon Williams, Stefan Beller, Johannes Schindelin,
	Ævar Arnfjörð Bjarmason

Easy to review? 29 patches? Are you kidding me?!

As noted in v1 (<20170511091829.5634-1-avarab@gmail.com>;
https://public-inbox.org/git/20170511091829.5634-1-avarab@gmail.com/)
these are all doc, test, refactoring etc. changes needed by the
subsequent "PCRE v2, PCRE v1 JIT, log -P & fixes" series.

Thanks a lot for the review everyone. This fixes all the issues
raised. Changes noted below, with names prefixed by the person who
raised the issue.

Ævar Arnfjörð Bjarmason (29):
  Makefile & configure: reword inaccurate comment about PCRE
  grep & rev-list doc: stop promising libpcre for --perl-regexp
  test-lib: rename the LIBPCRE prerequisite to PCRE

No changes.

  log: add exhaustive tests for pattern style options & config

Johannes: Now doesn't create a "(1|2)" tag, so should work on Windows
& beyond (wasn't needed, just created as a side-effect of test_commit)

Junio: Added comments for tricky basic/extended/perl

Junio: Moved all the 'test_have_prereq PCRE' test / test_cmp code
together, not apart as before.

  grep: add a test asserting that --perl-regexp dies when !PCRE
  grep: add a test for backreferences in PCRE patterns

No changes.

  grep: change non-ASCII -i test to stop using --debug

Brandon: Removed stray leftover unused --debug

  grep: add tests for --threads=N and grep.threads

Brandon: Amended commit message to clarify that this test doesn't need
a NO_PTHREADS prerequisite, and we actually get coverage out of
testing with --threads=N when not with threads, or at least it doesn't
harm anything.

  grep: amend submodule recursion test for regex engine testing

Junio: Now "foo" -> "(1|2)" as the commit message claims, not ->
"(1|2)d".

  grep: add tests for grep pattern types being passed to submodules

No changes.

  grep: add a test helper function for less verbose -f \0 tests

Junio: "nul_match() {" -> "nul_match () {" & quote "$status" & don't
quote non-variable strings.

  grep: prepare for testing binary regexes containing rx metacharacters

No changes.

  grep: add tests to fix blind spots with \0 patterns

Junio: Also fixed quoted variable strings here as above.

  perf: add a GIT_PERF_MAKE_COMMAND for when *_MAKE_OPTS won't do
  perf: emit progress output when unpacking & building

No changes.

  perf: add a performance comparison test of grep -G, -E and -P

All my multibyte performance tests were done with the string
'm(ú|u)ult.b(æ|y)te' which didn't match anything in the kernel, now
done with 'm(ú|u)lt.b(æ|y)te' instead.

I re-ran all the performance tests mentioned in the commit messages
where applicable.

  perf: add a performance comparison of fixed-string grep

One test_cmp was run twice due to rebasing from the pcre1/pcre2 days
of this series. Fixed.

  grep: catch a missing enum in switch statement

Stefan: Removed the comment about die(..BUG) & put the relevant detail
in the commit message instead.

  grep: remove redundant regflags assignment under PCRE
  grep: remove redundant `regflags &= ~REG_EXTENDED` assignments

No changes.

  grep: factor test for \0 in grep patterns into a function

Brandon: Fix comment syntax creating the function, and move it to the
correct place now instead of later in the "move is_fixed()" commit.

  grep: change the internal PCRE macro names to be PCRE1
  grep: change internal *pcre* variable & function names to be *pcre1*

No changes.

  grep: move is_fixed() earlier to avoid forward declaration

Brandon: Now just moves is_fixed() instead of is_fixed() & has_null()

  test-lib: add a PTHREADS prerequisite
  pack-objects & index-pack: add test for --threads warning
  pack-objects: fix buggy warning about threads

No changes.

  grep: given --threads with NO_PTHREADS=YesPlease, warn

Use Git standard comment syntax for TRANSLATORS comment.

  grep: assert that threading is enabled when calling grep_{lock,unlock}

 Documentation/git-grep.txt         |   7 +-
 Documentation/rev-list-options.txt |   8 +-
 Makefile                           |  14 ++-
 builtin/grep.c                     |  23 +++-
 builtin/pack-objects.c             |   4 +-
 configure.ac                       |  12 ++-
 grep.c                             | 108 ++++++++++---------
 grep.h                             |  10 +-
 t/README                           |   8 +-
 t/perf/README                      |  19 +++-
 t/perf/p7820-grep-engines.sh       |  35 ++++++
 t/perf/p7821-grep-engines-fixed.sh |  26 +++++
 t/perf/run                         |  13 ++-
 t/t4202-log.sh                     |  96 ++++++++++++++++-
 t/t5300-pack-object.sh             |  33 ++++++
 t/t7008-grep-binary.sh             | 135 +++++++++++++++++------
 t/t7810-grep.sh                    |  81 +++++++++++---
 t/t7812-grep-icase-non-ascii.sh    |  29 ++---
 t/t7813-grep-icase-iso.sh          |   2 +-
 t/t7814-grep-recurse-submodules.sh | 215 +++++++++++++++++++++++--------------
 t/test-lib.sh                      |   3 +-
 21 files changed, 646 insertions(+), 235 deletions(-)
 create mode 100755 t/perf/p7820-grep-engines.sh
 create mode 100755 t/perf/p7821-grep-engines-fixed.sh

-- 
2.11.0


^ permalink raw reply	[flat|nested] 42+ messages in thread

* [PATCH v2 01/29] Makefile & configure: reword inaccurate comment about PCRE
  2017-05-13 23:14 [PATCH v2 00/29] Easy to review grep & pre-PCRE changes Ævar Arnfjörð Bjarmason
@ 2017-05-13 23:14 ` Ævar Arnfjörð Bjarmason
  2017-05-13 23:14 ` [PATCH v2 02/29] grep & rev-list doc: stop promising libpcre for --perl-regexp Ævar Arnfjörð Bjarmason
                   ` (28 subsequent siblings)
  29 siblings, 0 replies; 42+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2017-05-13 23:14 UTC (permalink / raw)
  To: git
  Cc: Junio C Hamano, Jeff King, Jeffrey Walton,
	Michał Kiedrowicz, J Smith, Victor Leschuk,
	Nguyễn Thái Ngọc Duy, Fredrik Kuivinen,
	Brandon Williams, Stefan Beller, Johannes Schindelin,
	Ævar Arnfjörð Bjarmason

Reword an outdated & inaccurate comment which suggests that only
git-grep can use PCRE.

This comment was added back when PCRE support was initially added in
commit 63e7e9d8b6 ("git-grep: Learn PCRE", 2011-05-09), and was true
at the time.

It hasn't been telling the full truth since git-log learned to use
PCRE with --grep in commit 727b6fc3ed ("log --grep: accept
--basic-regexp and --perl-regexp", 2012-10-03), and more importantly
is likely to get more inaccurate over time as more use is made of PCRE
in other areas.

Reword it to be more future-proof, and to more clearly explain that
this enables user-initiated runtime behavior.

Copy/pasting this so much in configure.ac is lame, these Makefile-like
flags aren't even used by autoconf, just the corresponding
--with[out]-* options. But copy/pasting the comments that make sense
for the Makefile to configure.ac where they make less sense is the
pattern everything else follows in that file. I'm not going to war
against that as part of this change, just following the existing
pattern.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
---
 Makefile     |  6 ++++--
 configure.ac | 12 ++++++++----
 2 files changed, 12 insertions(+), 6 deletions(-)

diff --git a/Makefile b/Makefile
index e35542e631..eedadb8056 100644
--- a/Makefile
+++ b/Makefile
@@ -24,8 +24,10 @@ all::
 # Define NO_OPENSSL environment variable if you do not have OpenSSL.
 # This also implies BLK_SHA1.
 #
-# Define USE_LIBPCRE if you have and want to use libpcre. git-grep will be
-# able to use Perl-compatible regular expressions.
+# Define USE_LIBPCRE if you have and want to use libpcre. Various
+# commands such as log and grep offer runtime options to use
+# Perl-compatible regular expressions instead of standard or extended
+# POSIX regular expressions.
 #
 # Define LIBPCREDIR=/foo/bar if your libpcre header and library files are in
 # /foo/bar/include and /foo/bar/lib directories.
diff --git a/configure.ac b/configure.ac
index 128165529f..deeb968daa 100644
--- a/configure.ac
+++ b/configure.ac
@@ -250,8 +250,10 @@ AS_HELP_STRING([--with-openssl],[use OpenSSL library (default is YES)])
 AS_HELP_STRING([],              [ARG can be prefix for openssl library and headers]),
 GIT_PARSE_WITH([openssl]))
 
-# Define USE_LIBPCRE if you have and want to use libpcre. git-grep will be
-# able to use Perl-compatible regular expressions.
+# Define USE_LIBPCRE if you have and want to use libpcre. Various
+# commands such as log and grep offer runtime options to use
+# Perl-compatible regular expressions instead of standard or extended
+# POSIX regular expressions.
 #
 # Define LIBPCREDIR=/foo/bar if your libpcre header and library files are in
 # /foo/bar/include and /foo/bar/lib directories.
@@ -499,8 +501,10 @@ GIT_CONF_SUBST([NEEDS_SSL_WITH_CRYPTO])
 GIT_CONF_SUBST([NO_OPENSSL])
 
 #
-# Define USE_LIBPCRE if you have and want to use libpcre. git-grep will be
-# able to use Perl-compatible regular expressions.
+# Define USE_LIBPCRE if you have and want to use libpcre. Various
+# commands such as log and grep offer runtime options to use
+# Perl-compatible regular expressions instead of standard or extended
+# POSIX regular expressions.
 #
 
 if test -n "$USE_LIBPCRE"; then
-- 
2.11.0


^ permalink raw reply related	[flat|nested] 42+ messages in thread

* [PATCH v2 02/29] grep & rev-list doc: stop promising libpcre for --perl-regexp
  2017-05-13 23:14 [PATCH v2 00/29] Easy to review grep & pre-PCRE changes Ævar Arnfjörð Bjarmason
  2017-05-13 23:14 ` [PATCH v2 01/29] Makefile & configure: reword inaccurate comment about PCRE Ævar Arnfjörð Bjarmason
@ 2017-05-13 23:14 ` Ævar Arnfjörð Bjarmason
  2017-05-13 23:14 ` [PATCH v2 03/29] test-lib: rename the LIBPCRE prerequisite to PCRE Ævar Arnfjörð Bjarmason
                   ` (27 subsequent siblings)
  29 siblings, 0 replies; 42+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2017-05-13 23:14 UTC (permalink / raw)
  To: git
  Cc: Junio C Hamano, Jeff King, Jeffrey Walton,
	Michał Kiedrowicz, J Smith, Victor Leschuk,
	Nguyễn Thái Ngọc Duy, Fredrik Kuivinen,
	Brandon Williams, Stefan Beller, Johannes Schindelin,
	Ævar Arnfjörð Bjarmason

Stop promising in our grep & rev-list options documentation that we're
always going to be using libpcre when given the --perl-regexp option.

Instead talk about using "Perl-compatible regular expressions" and
using these types of patterns using "a compile-time dependency".

Saying "libpcre" means that we're talking about libpcre.so, which is
always going to be v1. This change is part of an ongoing saga to add
support for libpcre2, which comes with PCRE v2.

In the future we might use some completely unrelated library to
provide perl-compatible regular expression support. By wording the
documentation differently and not promising any specific version of
PCRE or even PCRE at all we have more wiggle room to change the
implementation.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
---
 Documentation/git-grep.txt         | 7 +++++--
 Documentation/rev-list-options.txt | 8 ++++++--
 2 files changed, 11 insertions(+), 4 deletions(-)

diff --git a/Documentation/git-grep.txt b/Documentation/git-grep.txt
index 71f32f3508..5033483db4 100644
--- a/Documentation/git-grep.txt
+++ b/Documentation/git-grep.txt
@@ -161,8 +161,11 @@ OPTIONS
 
 -P::
 --perl-regexp::
-	Use Perl-compatible regexp for patterns. Requires libpcre to be
-	compiled in.
+	Use Perl-compatible regular expressions for patterns.
++
+Support for these types of regular expressions is an optional
+compile-time dependency. If Git wasn't compiled with support for them
+providing this option will cause it to die.
 
 -F::
 --fixed-strings::
diff --git a/Documentation/rev-list-options.txt b/Documentation/rev-list-options.txt
index a02f7324c0..a46f70c2b1 100644
--- a/Documentation/rev-list-options.txt
+++ b/Documentation/rev-list-options.txt
@@ -92,8 +92,12 @@ endif::git-rev-list[]
 	pattern as a regular expression).
 
 --perl-regexp::
-	Consider the limiting patterns to be Perl-compatible regular expressions.
-	Requires libpcre to be compiled in.
+	Consider the limiting patterns to be Perl-compatible regular
+	expressions.
++
+Support for these types of regular expressions is an optional
+compile-time dependency. If Git wasn't compiled with support for them
+providing this option will cause it to die.
 
 --remove-empty::
 	Stop when a given path disappears from the tree.
-- 
2.11.0


^ permalink raw reply related	[flat|nested] 42+ messages in thread

* [PATCH v2 03/29] test-lib: rename the LIBPCRE prerequisite to PCRE
  2017-05-13 23:14 [PATCH v2 00/29] Easy to review grep & pre-PCRE changes Ævar Arnfjörð Bjarmason
  2017-05-13 23:14 ` [PATCH v2 01/29] Makefile & configure: reword inaccurate comment about PCRE Ævar Arnfjörð Bjarmason
  2017-05-13 23:14 ` [PATCH v2 02/29] grep & rev-list doc: stop promising libpcre for --perl-regexp Ævar Arnfjörð Bjarmason
@ 2017-05-13 23:14 ` Ævar Arnfjörð Bjarmason
  2017-05-13 23:14 ` [PATCH v2 04/29] log: add exhaustive tests for pattern style options & config Ævar Arnfjörð Bjarmason
                   ` (26 subsequent siblings)
  29 siblings, 0 replies; 42+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2017-05-13 23:14 UTC (permalink / raw)
  To: git
  Cc: Junio C Hamano, Jeff King, Jeffrey Walton,
	Michał Kiedrowicz, J Smith, Victor Leschuk,
	Nguyễn Thái Ngọc Duy, Fredrik Kuivinen,
	Brandon Williams, Stefan Beller, Johannes Schindelin,
	Ævar Arnfjörð Bjarmason

[-- Warning: decoded text below may be mangled, UTF-8 assumed --]
[-- Attachment #1: Type: text/plain; charset=UTF-8, Size: 7203 bytes --]

Rename the LIBPCRE prerequisite to PCRE. This is for preparation for
libpcre2 support, where having just "LIBPCRE" would be confusing as it
implies v1 of the library.

None of these tests are incompatible between versions 1 & 2 of
libpcre, it's less confusing to give them a more general name to make
it clear that they work on both library versions.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
---
 t/README                        |  4 ++--
 t/t7810-grep.sh                 | 28 ++++++++++++++--------------
 t/t7812-grep-icase-non-ascii.sh |  4 ++--
 t/t7813-grep-icase-iso.sh       |  2 +-
 t/test-lib.sh                   |  2 +-
 5 files changed, 20 insertions(+), 20 deletions(-)

diff --git a/t/README b/t/README
index ab386c3681..a90cb62583 100644
--- a/t/README
+++ b/t/README
@@ -803,9 +803,9 @@ use these, and "test_set_prereq" for how to define your own.
    Test is not run by root user, and an attempt to write to an
    unwritable file is expected to fail correctly.
 
- - LIBPCRE
+ - PCRE
 
-   Git was compiled with USE_LIBPCRE=YesPlease. Wrap any tests
+   Git was compiled with support for PCRE. Wrap any tests
    that use git-grep --perl-regexp or git-grep -P in these.
 
  - CASE_INSENSITIVE_FS
diff --git a/t/t7810-grep.sh b/t/t7810-grep.sh
index cee42097b0..c84c4d99f9 100755
--- a/t/t7810-grep.sh
+++ b/t/t7810-grep.sh
@@ -275,7 +275,7 @@ do
 		test_cmp expected actual
 	'
 
-	test_expect_success LIBPCRE "grep $L with grep.patterntype=perl" '
+	test_expect_success PCRE "grep $L with grep.patterntype=perl" '
 		echo "${HC}ab:a+b*c" >expected &&
 		git -c grep.patterntype=perl grep "a\x{2b}b\x{2a}c" $H ab >actual &&
 		test_cmp expected actual
@@ -1053,12 +1053,12 @@ hello.c:int main(int argc, const char **argv)
 hello.c:	printf("Hello world.\n");
 EOF
 
-test_expect_success LIBPCRE 'grep --perl-regexp pattern' '
+test_expect_success PCRE 'grep --perl-regexp pattern' '
 	git grep --perl-regexp "\p{Ps}.*?\p{Pe}" hello.c >actual &&
 	test_cmp expected actual
 '
 
-test_expect_success LIBPCRE 'grep -P pattern' '
+test_expect_success PCRE 'grep -P pattern' '
 	git grep -P "\p{Ps}.*?\p{Pe}" hello.c >actual &&
 	test_cmp expected actual
 '
@@ -1070,13 +1070,13 @@ test_expect_success 'grep pattern with grep.extendedRegexp=true' '
 	test_cmp empty actual
 '
 
-test_expect_success LIBPCRE 'grep -P pattern with grep.extendedRegexp=true' '
+test_expect_success PCRE 'grep -P pattern with grep.extendedRegexp=true' '
 	git -c grep.extendedregexp=true \
 		grep -P "\p{Ps}.*?\p{Pe}" hello.c >actual &&
 	test_cmp expected actual
 '
 
-test_expect_success LIBPCRE 'grep -P -v pattern' '
+test_expect_success PCRE 'grep -P -v pattern' '
 	{
 		echo "ab:a+b*c"
 		echo "ab:a+bc"
@@ -1085,7 +1085,7 @@ test_expect_success LIBPCRE 'grep -P -v pattern' '
 	test_cmp expected actual
 '
 
-test_expect_success LIBPCRE 'grep -P -i pattern' '
+test_expect_success PCRE 'grep -P -i pattern' '
 	cat >expected <<-EOF &&
 	hello.c:	printf("Hello world.\n");
 	EOF
@@ -1093,7 +1093,7 @@ test_expect_success LIBPCRE 'grep -P -i pattern' '
 	test_cmp expected actual
 '
 
-test_expect_success LIBPCRE 'grep -P -w pattern' '
+test_expect_success PCRE 'grep -P -w pattern' '
 	{
 		echo "hello_world:Hello world"
 		echo "hello_world:HeLLo world"
@@ -1118,11 +1118,11 @@ test_expect_success 'grep invalidpattern properly dies with grep.patternType=ext
 	test_must_fail git -c grep.patterntype=extended grep "a["
 '
 
-test_expect_success LIBPCRE 'grep -P invalidpattern properly dies ' '
+test_expect_success PCRE 'grep -P invalidpattern properly dies ' '
 	test_must_fail git grep -P "a["
 '
 
-test_expect_success LIBPCRE 'grep invalidpattern properly dies with grep.patternType=perl' '
+test_expect_success PCRE 'grep invalidpattern properly dies with grep.patternType=perl' '
 	test_must_fail git -c grep.patterntype=perl grep "a["
 '
 
@@ -1191,13 +1191,13 @@ test_expect_success 'grep pattern with grep.patternType=fixed, =basic, =perl, =e
 	test_cmp expected actual
 '
 
-test_expect_success LIBPCRE 'grep -G -F -E -P pattern' '
+test_expect_success PCRE 'grep -G -F -E -P pattern' '
 	echo "d0:0" >expected &&
 	git grep -G -F -E -P "[\d]" d0 >actual &&
 	test_cmp expected actual
 '
 
-test_expect_success LIBPCRE 'grep pattern with grep.patternType=fixed, =basic, =extended, =perl' '
+test_expect_success PCRE 'grep pattern with grep.patternType=fixed, =basic, =extended, =perl' '
 	echo "d0:0" >expected &&
 	git \
 		-c grep.patterntype=fixed \
@@ -1208,7 +1208,7 @@ test_expect_success LIBPCRE 'grep pattern with grep.patternType=fixed, =basic, =
 	test_cmp expected actual
 '
 
-test_expect_success LIBPCRE 'grep -P pattern with grep.patternType=fixed' '
+test_expect_success PCRE 'grep -P pattern with grep.patternType=fixed' '
 	echo "ab:a+b*c" >expected &&
 	git \
 		-c grep.patterntype=fixed \
@@ -1343,12 +1343,12 @@ space: line with leading space2
 space: line with leading space3
 EOF
 
-test_expect_success LIBPCRE 'grep -E "^ "' '
+test_expect_success PCRE 'grep -E "^ "' '
 	git grep -E "^ " space >actual &&
 	test_cmp expected actual
 '
 
-test_expect_success LIBPCRE 'grep -P "^ "' '
+test_expect_success PCRE 'grep -P "^ "' '
 	git grep -P "^ " space >actual &&
 	test_cmp expected actual
 '
diff --git a/t/t7812-grep-icase-non-ascii.sh b/t/t7812-grep-icase-non-ascii.sh
index 169fd8d706..04a61cb8e0 100755
--- a/t/t7812-grep-icase-non-ascii.sh
+++ b/t/t7812-grep-icase-non-ascii.sh
@@ -20,13 +20,13 @@ test_expect_success REGEX_LOCALE 'grep literal string, no -F' '
 	git grep -i "TILRAUN: HALLÓ HEIMUR!"
 '
 
-test_expect_success GETTEXT_LOCALE,LIBPCRE 'grep pcre utf-8 icase' '
+test_expect_success GETTEXT_LOCALE,PCRE 'grep pcre utf-8 icase' '
 	git grep --perl-regexp    "TILRAUN: H.lló Heimur!" &&
 	git grep --perl-regexp -i "TILRAUN: H.lló Heimur!" &&
 	git grep --perl-regexp -i "TILRAUN: H.LLÓ HEIMUR!"
 '
 
-test_expect_success GETTEXT_LOCALE,LIBPCRE 'grep pcre utf-8 string with "+"' '
+test_expect_success GETTEXT_LOCALE,PCRE 'grep pcre utf-8 string with "+"' '
 	test_write_lines "TILRAUN: Hallóó Heimur!" >file2 &&
 	git add file2 &&
 	git grep -l --perl-regexp "TILRAUN: H.lló+ Heimur!" >actual &&
diff --git a/t/t7813-grep-icase-iso.sh b/t/t7813-grep-icase-iso.sh
index efef7fb81f..701e08a8e5 100755
--- a/t/t7813-grep-icase-iso.sh
+++ b/t/t7813-grep-icase-iso.sh
@@ -11,7 +11,7 @@ test_expect_success GETTEXT_ISO_LOCALE 'setup' '
 	export LC_ALL
 '
 
-test_expect_success GETTEXT_ISO_LOCALE,LIBPCRE 'grep pcre string' '
+test_expect_success GETTEXT_ISO_LOCALE,PCRE 'grep pcre string' '
 	git grep --perl-regexp -i "TILRAUN: H.lló Heimur!" &&
 	git grep --perl-regexp -i "TILRAUN: H.LLÓ HEIMUR!"
 '
diff --git a/t/test-lib.sh b/t/test-lib.sh
index 13b5696822..6abf1d1918 100644
--- a/t/test-lib.sh
+++ b/t/test-lib.sh
@@ -1010,7 +1010,7 @@ esac
 ( COLUMNS=1 && test $COLUMNS = 1 ) && test_set_prereq COLUMNS_CAN_BE_1
 test -z "$NO_PERL" && test_set_prereq PERL
 test -z "$NO_PYTHON" && test_set_prereq PYTHON
-test -n "$USE_LIBPCRE" && test_set_prereq LIBPCRE
+test -n "$USE_LIBPCRE" && test_set_prereq PCRE
 test -z "$NO_GETTEXT" && test_set_prereq GETTEXT
 
 # Can we rely on git's output in the C locale?
-- 
2.11.0


^ permalink raw reply related	[flat|nested] 42+ messages in thread

* [PATCH v2 04/29] log: add exhaustive tests for pattern style options & config
  2017-05-13 23:14 [PATCH v2 00/29] Easy to review grep & pre-PCRE changes Ævar Arnfjörð Bjarmason
                   ` (2 preceding siblings ...)
  2017-05-13 23:14 ` [PATCH v2 03/29] test-lib: rename the LIBPCRE prerequisite to PCRE Ævar Arnfjörð Bjarmason
@ 2017-05-13 23:14 ` Ævar Arnfjörð Bjarmason
  2017-05-15  4:57   ` Junio C Hamano
  2017-05-13 23:14 ` [PATCH v2 05/29] grep: add a test asserting that --perl-regexp dies when !PCRE Ævar Arnfjörð Bjarmason
                   ` (25 subsequent siblings)
  29 siblings, 1 reply; 42+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2017-05-13 23:14 UTC (permalink / raw)
  To: git
  Cc: Junio C Hamano, Jeff King, Jeffrey Walton,
	Michał Kiedrowicz, J Smith, Victor Leschuk,
	Nguyễn Thái Ngọc Duy, Fredrik Kuivinen,
	Brandon Williams, Stefan Beller, Johannes Schindelin,
	Ævar Arnfjörð Bjarmason

Add exhaustive tests for how the different grep.patternType options &
the corresponding command-line options affect git-log.

Before this change it was possible to patch revision.c so that the
--basic-regexp option was synonymous with --extended-regexp, and
--perl-regexp wasn't recognized at all, and still have 100% of the
test suite pass.

This was because the first test being modified here, added in commit
34a4ae55b2 ("log --grep: use the same helper to set -E/-F options as
"git grep"", 2012-10-03), didn't actually check whether we'd enabled
extended regular expressions as distinct from re-toggling non-fixed
string support.

Fix that by changing the pattern to a pattern that'll only match if
--extended-regexp option is provided, but won't match under the
default --basic-regexp option.

Other potential regressions were possible since there were no tests
for the rest of the combinations of grep.patternType configuration
toggles & corresponding git-log command-line options. Add exhaustive
tests for those.

The patterns being passed to fixed/basic/extended/PCRE are carefully
crafted to return the wrong thing if the grep engine were to pick any
other matching method than the one it's told to use.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
---
 t/t4202-log.sh | 94 +++++++++++++++++++++++++++++++++++++++++++++++++++++++++-
 1 file changed, 93 insertions(+), 1 deletion(-)

diff --git a/t/t4202-log.sh b/t/t4202-log.sh
index f577990716..e522a2fcd5 100755
--- a/t/t4202-log.sh
+++ b/t/t4202-log.sh
@@ -262,7 +262,28 @@ test_expect_success 'log --grep -i' '
 
 test_expect_success 'log -F -E --grep=<ere> uses ere' '
 	echo second >expect &&
-	git log -1 --pretty="tformat:%s" -F -E --grep=s.c.nd >actual &&
+	# basic would need \(s\) to do the same
+	git log -1 --pretty="tformat:%s" -F -E --grep="(s).c.nd" >actual &&
+	test_cmp expect actual
+'
+
+test_expect_success PCRE 'log -F -E --perl-regexp --grep=<pcre> uses PCRE' '
+	test_when_finished "rm -rf num_commits" &&
+	git init num_commits &&
+	(
+		cd num_commits &&
+		test_commit 1d &&
+		test_commit 2e
+	) &&
+	echo 2e >expect &&
+	# In PCRE \d in [\d] is like saying "0-9", and matches the 2
+	# in 2e...
+	git -C num_commits log -1 --pretty="tformat:%s" -F -E --perl-regexp --grep="[\d]" >actual &&
+	test_cmp expect actual &&
+	echo 1d >expect &&
+	# ...in POSIX basic & extended it is the same as [d],
+	# i.e. "d", which matches 1d, but not and does not match 2e.
+	git -C num_commits log -1 --pretty="tformat:%s" -F -E --grep="[\d]" >actual &&
 	test_cmp expect actual
 '
 
@@ -280,6 +301,77 @@ test_expect_success 'log with grep.patternType configuration and command line' '
 	test_cmp expect actual
 '
 
+test_expect_success 'log with various grep.patternType configurations & command-lines' '
+	git init pattern-type &&
+	(
+		cd pattern-type &&
+		test_commit 1 file A &&
+
+		# The tagname is overridden here because creating a
+		# tag called "(1|2)" as test_commit would otherwise
+		# implicitly do would fail on e.g. MINGW.
+		test_commit "(1|2)" file B 2 &&
+
+		echo "(1|2)" >expect.fixed &&
+		cp expect.fixed expect.basic &&
+		cp expect.fixed expect.extended &&
+		cp expect.fixed expect.perl &&
+
+		# A strcmp-like match with fixed.
+		git -c grep.patternType=fixed log --pretty=tformat:%s \
+			--grep="(1|2)" >actual.fixed &&
+
+		# POSIX basic matches (, | and ) literally.
+		git -c grep.patternType=basic log --pretty=tformat:%s \
+			--grep="(.|.)" >actual.basic &&
+
+		# POSIX extended needs to have | escaped to match it
+		# literally, whereas under basic this is the same as
+		# (|2), i.e. it would also match "1". This test checks
+		# for extended by asserting that it is not matching
+		# what basic would match.
+		git -c grep.patternType=extended log --pretty=tformat:%s \
+			--grep="\|2" >actual.extended &&
+		if test_have_prereq PCRE
+		then
+			# Only PCRE would match [\d]\| with only
+			# "(1|2)" due to [\d]. POSIX basic would match
+			# both it and "1", and POSIX extended would
+			# match neither.
+			git -c grep.patternType=perl log --pretty=tformat:%s \
+				--grep="[\d]\|" >actual.perl &&
+			test_cmp expect.perl actual.perl
+		fi &&
+		test_cmp expect.fixed actual.fixed &&
+		test_cmp expect.basic actual.basic &&
+		test_cmp expect.extended actual.extended &&
+
+		git log --pretty=tformat:%s -F \
+			--grep="(1|2)" >actual.fixed.short-arg &&
+		git log --pretty=tformat:%s -E \
+			--grep="\|2" >actual.extended.short-arg &&
+		test_cmp expect.fixed actual.fixed.short-arg &&
+		test_cmp expect.extended actual.extended.short-arg &&
+
+		git log --pretty=tformat:%s --fixed-strings \
+			--grep="(1|2)" >actual.fixed.long-arg &&
+		git log --pretty=tformat:%s --basic-regexp \
+			--grep="(.|.)" >actual.basic.long-arg &&
+		git log --pretty=tformat:%s --extended-regexp \
+			--grep="\|2" >actual.extended.long-arg &&
+		if test_have_prereq PCRE
+		then
+			git log --pretty=tformat:%s --perl-regexp \
+				--grep="[\d]\|" >actual.perl.long-arg &&
+			test_cmp expect.perl actual.perl.long-arg
+
+		fi &&
+		test_cmp expect.fixed actual.fixed.long-arg &&
+		test_cmp expect.basic actual.basic.long-arg &&
+		test_cmp expect.extended actual.extended.long-arg
+	)
+'
+
 cat > expect <<EOF
 * Second
 * sixth
-- 
2.11.0


^ permalink raw reply related	[flat|nested] 42+ messages in thread

* [PATCH v2 05/29] grep: add a test asserting that --perl-regexp dies when !PCRE
  2017-05-13 23:14 [PATCH v2 00/29] Easy to review grep & pre-PCRE changes Ævar Arnfjörð Bjarmason
                   ` (3 preceding siblings ...)
  2017-05-13 23:14 ` [PATCH v2 04/29] log: add exhaustive tests for pattern style options & config Ævar Arnfjörð Bjarmason
@ 2017-05-13 23:14 ` Ævar Arnfjörð Bjarmason
  2017-05-13 23:14 ` [PATCH v2 06/29] grep: add a test for backreferences in PCRE patterns Ævar Arnfjörð Bjarmason
                   ` (24 subsequent siblings)
  29 siblings, 0 replies; 42+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2017-05-13 23:14 UTC (permalink / raw)
  To: git
  Cc: Junio C Hamano, Jeff King, Jeffrey Walton,
	Michał Kiedrowicz, J Smith, Victor Leschuk,
	Nguyễn Thái Ngọc Duy, Fredrik Kuivinen,
	Brandon Williams, Stefan Beller, Johannes Schindelin,
	Ævar Arnfjörð Bjarmason

Add a test asserting that when --perl-regexp (and -P for grep) is
given to git-grep & git-log that we die with an error.

In developing the PCRE v2 series I introduced a regression where -P
would (through control-flow fall-through) become synonymous with basic
POSIX matching. I.e. 'git grep -P '[\d]' would match "d" instead of
digits.

The entire test suite would still pass with this serious regression,
since everything that tested for --perl-regexp would be guarded by the
PCRE prerequisite, fix that blind-spot by adding tests under !PCRE
asserting that git must die when given --perl-regexp or -P.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
---
 t/t4202-log.sh  |  4 +++-
 t/t7810-grep.sh | 12 ++++++++++++
 2 files changed, 15 insertions(+), 1 deletion(-)

diff --git a/t/t4202-log.sh b/t/t4202-log.sh
index e522a2fcd5..9680dfe400 100755
--- a/t/t4202-log.sh
+++ b/t/t4202-log.sh
@@ -364,7 +364,9 @@ test_expect_success 'log with various grep.patternType configurations & command-
 			git log --pretty=tformat:%s --perl-regexp \
 				--grep="[\d]\|" >actual.perl.long-arg &&
 			test_cmp expect.perl actual.perl.long-arg
-
+		else
+			test_must_fail git log --perl-regexp \
+				--grep="[\d]\|"
 		fi &&
 		test_cmp expect.fixed actual.fixed.long-arg &&
 		test_cmp expect.basic actual.basic.long-arg &&
diff --git a/t/t7810-grep.sh b/t/t7810-grep.sh
index c84c4d99f9..8d69113695 100755
--- a/t/t7810-grep.sh
+++ b/t/t7810-grep.sh
@@ -281,6 +281,10 @@ do
 		test_cmp expected actual
 	'
 
+	test_expect_success !PCRE "grep $L with grep.patterntype=perl errors without PCRE" '
+		test_must_fail git -c grep.patterntype=perl grep "foo.*bar"
+	'
+
 	test_expect_success "grep $L with grep.patternType=default and grep.extendedRegexp=true" '
 		echo "${HC}ab:abc" >expected &&
 		git \
@@ -1058,11 +1062,19 @@ test_expect_success PCRE 'grep --perl-regexp pattern' '
 	test_cmp expected actual
 '
 
+test_expect_success !PCRE 'grep --perl-regexp pattern errors without PCRE' '
+	test_must_fail git grep --perl-regexp "foo.*bar"
+'
+
 test_expect_success PCRE 'grep -P pattern' '
 	git grep -P "\p{Ps}.*?\p{Pe}" hello.c >actual &&
 	test_cmp expected actual
 '
 
+test_expect_success !PCRE 'grep -P pattern errors without PCRE' '
+	test_must_fail git grep -P "foo.*bar"
+'
+
 test_expect_success 'grep pattern with grep.extendedRegexp=true' '
 	>empty &&
 	test_must_fail git -c grep.extendedregexp=true \
-- 
2.11.0


^ permalink raw reply related	[flat|nested] 42+ messages in thread

* [PATCH v2 06/29] grep: add a test for backreferences in PCRE patterns
  2017-05-13 23:14 [PATCH v2 00/29] Easy to review grep & pre-PCRE changes Ævar Arnfjörð Bjarmason
                   ` (4 preceding siblings ...)
  2017-05-13 23:14 ` [PATCH v2 05/29] grep: add a test asserting that --perl-regexp dies when !PCRE Ævar Arnfjörð Bjarmason
@ 2017-05-13 23:14 ` Ævar Arnfjörð Bjarmason
  2017-05-13 23:14 ` [PATCH v2 07/29] grep: change non-ASCII -i test to stop using --debug Ævar Arnfjörð Bjarmason
                   ` (23 subsequent siblings)
  29 siblings, 0 replies; 42+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2017-05-13 23:14 UTC (permalink / raw)
  To: git
  Cc: Junio C Hamano, Jeff King, Jeffrey Walton,
	Michał Kiedrowicz, J Smith, Victor Leschuk,
	Nguyễn Thái Ngọc Duy, Fredrik Kuivinen,
	Brandon Williams, Stefan Beller, Johannes Schindelin,
	Ævar Arnfjörð Bjarmason

Add a test for backreferences such as (.)\1 in PCRE patterns. This
test ensures that the PCRE_NO_AUTO_CAPTURE option isn't turned
on. Before this change turning it on would break these sort of
patterns, but wouldn't break any tests.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
---
 t/t7810-grep.sh | 7 +++++++
 1 file changed, 7 insertions(+)

diff --git a/t/t7810-grep.sh b/t/t7810-grep.sh
index 8d69113695..daa906b9b0 100755
--- a/t/t7810-grep.sh
+++ b/t/t7810-grep.sh
@@ -1114,6 +1114,13 @@ test_expect_success PCRE 'grep -P -w pattern' '
 	test_cmp expected actual
 '
 
+test_expect_success PCRE 'grep -P backreferences work (the PCRE NO_AUTO_CAPTURE flag is not set)' '
+	git grep -P -h "(?P<one>.)(?P=one)" hello_world >actual &&
+	test_cmp hello_world actual &&
+	git grep -P -h "(.)\1" hello_world >actual &&
+	test_cmp hello_world actual
+'
+
 test_expect_success 'grep -G invalidpattern properly dies ' '
 	test_must_fail git grep -G "a["
 '
-- 
2.11.0


^ permalink raw reply related	[flat|nested] 42+ messages in thread

* [PATCH v2 07/29] grep: change non-ASCII -i test to stop using --debug
  2017-05-13 23:14 [PATCH v2 00/29] Easy to review grep & pre-PCRE changes Ævar Arnfjörð Bjarmason
                   ` (5 preceding siblings ...)
  2017-05-13 23:14 ` [PATCH v2 06/29] grep: add a test for backreferences in PCRE patterns Ævar Arnfjörð Bjarmason
@ 2017-05-13 23:14 ` Ævar Arnfjörð Bjarmason
  2017-05-13 23:14 ` [PATCH v2 08/29] grep: add tests for --threads=N and grep.threads Ævar Arnfjörð Bjarmason
                   ` (22 subsequent siblings)
  29 siblings, 0 replies; 42+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2017-05-13 23:14 UTC (permalink / raw)
  To: git
  Cc: Junio C Hamano, Jeff King, Jeffrey Walton,
	Michał Kiedrowicz, J Smith, Victor Leschuk,
	Nguyễn Thái Ngọc Duy, Fredrik Kuivinen,
	Brandon Williams, Stefan Beller, Johannes Schindelin,
	Ævar Arnfjörð Bjarmason

Change a non-ASCII case-insensitive test case to stop using --debug,
and instead simply test for the expected results.

The test coverage remains the same with this change, but the test
won't break due to internal refactoring.

This test was added in commit 793dc676e0 ("grep/icase: avoid kwsset
when -F is specified", 2016-06-25). It was asserting that the regex
must be compiled with compile_fixed_regexp(), instead test for the
expected results, allowing the underlying implementation to change.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
---
 t/t7812-grep-icase-non-ascii.sh | 25 +++++--------------------
 1 file changed, 5 insertions(+), 20 deletions(-)

diff --git a/t/t7812-grep-icase-non-ascii.sh b/t/t7812-grep-icase-non-ascii.sh
index 04a61cb8e0..0059a1f837 100755
--- a/t/t7812-grep-icase-non-ascii.sh
+++ b/t/t7812-grep-icase-non-ascii.sh
@@ -36,29 +36,14 @@ test_expect_success GETTEXT_LOCALE,PCRE 'grep pcre utf-8 string with "+"' '
 '
 
 test_expect_success REGEX_LOCALE 'grep literal string, with -F' '
-	git grep --debug -i -F "TILRAUN: Halló Heimur!"  2>&1 >/dev/null |
-		 grep fixed >debug1 &&
-	test_write_lines "fixed TILRAUN: Halló Heimur!" >expect1 &&
-	test_cmp expect1 debug1 &&
-
-	git grep --debug -i -F "TILRAUN: HALLÓ HEIMUR!"  2>&1 >/dev/null |
-		 grep fixed >debug2 &&
-	test_write_lines "fixed TILRAUN: HALLÓ HEIMUR!" >expect2 &&
-	test_cmp expect2 debug2
+	git grep -i -F "TILRAUN: Halló Heimur!" &&
+	git grep -i -F "TILRAUN: HALLÓ HEIMUR!"
 '
 
 test_expect_success REGEX_LOCALE 'grep string with regex, with -F' '
-	test_write_lines "^*TILR^AUN:.* \\Halló \$He[]imur!\$" >file &&
-
-	git grep --debug -i -F "^*TILR^AUN:.* \\Halló \$He[]imur!\$" 2>&1 >/dev/null |
-		 grep fixed >debug1 &&
-	test_write_lines "fixed \\^*TILR^AUN:\\.\\* \\\\Halló \$He\\[]imur!\\\$" >expect1 &&
-	test_cmp expect1 debug1 &&
-
-	git grep --debug -i -F "^*TILR^AUN:.* \\HALLÓ \$HE[]IMUR!\$"  2>&1 >/dev/null |
-		 grep fixed >debug2 &&
-	test_write_lines "fixed \\^*TILR^AUN:\\.\\* \\\\HALLÓ \$HE\\[]IMUR!\\\$" >expect2 &&
-	test_cmp expect2 debug2
+	test_write_lines "TILRAUN: Halló Heimur [abc]!" >file3 &&
+	git add file3 &&
+	git grep -i -F "TILRAUN: Halló Heimur [abc]!" file3
 '
 
 test_expect_success REGEX_LOCALE 'pickaxe -i on non-ascii' '
-- 
2.11.0


^ permalink raw reply related	[flat|nested] 42+ messages in thread

* [PATCH v2 08/29] grep: add tests for --threads=N and grep.threads
  2017-05-13 23:14 [PATCH v2 00/29] Easy to review grep & pre-PCRE changes Ævar Arnfjörð Bjarmason
                   ` (6 preceding siblings ...)
  2017-05-13 23:14 ` [PATCH v2 07/29] grep: change non-ASCII -i test to stop using --debug Ævar Arnfjörð Bjarmason
@ 2017-05-13 23:14 ` Ævar Arnfjörð Bjarmason
  2017-05-13 23:14 ` [PATCH v2 09/29] grep: amend submodule recursion test for regex engine testing Ævar Arnfjörð Bjarmason
                   ` (21 subsequent siblings)
  29 siblings, 0 replies; 42+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2017-05-13 23:14 UTC (permalink / raw)
  To: git
  Cc: Junio C Hamano, Jeff King, Jeffrey Walton,
	Michał Kiedrowicz, J Smith, Victor Leschuk,
	Nguyễn Thái Ngọc Duy, Fredrik Kuivinen,
	Brandon Williams, Stefan Beller, Johannes Schindelin,
	Ævar Arnfjörð Bjarmason

Add tests for --threads=N being supplied on the command-line, or when
grep.threads=N being supplied in the configuration.

When the threading support was made run-time configurable in commit
89f09dd34e ("grep: add --threads=<num> option and grep.threads
configuration", 2015-12-15) no tests were added for it.

In developing a change to the grep code I was able to make
'--threads=1 <pat>` segfault, while the test suite still passed. This
change fixes that blind spot in the tests.

In addition to asserting that asking for N threads shouldn't segfault,
test that the grep output given any N is the same.

The choice to test only 1..10 as opposed to 1..8 or 1..16 or whatever
is arbitrary. Testing 1..1024 works locally for me (but gets
noticeably slower as more threads are spawned). Given the structure of
the code there's no reason to test an arbitrary number of threads,
only 0, 1 and >=2 are special modes of operation.

A later patch introduces a PTHREADS test prerequisite which is true
under NO_PTHREADS=UnfortunatelyYes, but even under NO_PTHREADS it's
fine to test --threads=N, we'll just ignore it and not use
threading. So these tests also make sense under that mode to assert
that --threads=N without pthreads still returns expected results.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
---
 t/t7810-grep.sh | 16 ++++++++++++++++
 1 file changed, 16 insertions(+)

diff --git a/t/t7810-grep.sh b/t/t7810-grep.sh
index daa906b9b0..561709ef6a 100755
--- a/t/t7810-grep.sh
+++ b/t/t7810-grep.sh
@@ -775,6 +775,22 @@ test_expect_success 'grep -W with userdiff' '
 	test_cmp expected actual
 '
 
+for threads in $(test_seq 0 10)
+do
+	test_expect_success "grep --threads=$threads & -c grep.threads=$threads" "
+		git grep --threads=$threads . >actual.$threads &&
+		if test $threads -ge 1
+		then
+			test_cmp actual.\$(($threads - 1)) actual.$threads
+		fi &&
+		git -c grep.threads=$threads grep . >actual.$threads &&
+		if test $threads -ge 1
+		then
+			test_cmp actual.\$(($threads - 1)) actual.$threads
+		fi
+	"
+done
+
 test_expect_success 'grep from a subdirectory to search wider area (1)' '
 	mkdir -p s &&
 	(
-- 
2.11.0


^ permalink raw reply related	[flat|nested] 42+ messages in thread

* [PATCH v2 09/29] grep: amend submodule recursion test for regex engine testing
  2017-05-13 23:14 [PATCH v2 00/29] Easy to review grep & pre-PCRE changes Ævar Arnfjörð Bjarmason
                   ` (7 preceding siblings ...)
  2017-05-13 23:14 ` [PATCH v2 08/29] grep: add tests for --threads=N and grep.threads Ævar Arnfjörð Bjarmason
@ 2017-05-13 23:14 ` Ævar Arnfjörð Bjarmason
  2017-05-13 23:14 ` [PATCH v2 10/29] grep: add tests for grep pattern types being passed to submodules Ævar Arnfjörð Bjarmason
                   ` (20 subsequent siblings)
  29 siblings, 0 replies; 42+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2017-05-13 23:14 UTC (permalink / raw)
  To: git
  Cc: Junio C Hamano, Jeff King, Jeffrey Walton,
	Michał Kiedrowicz, J Smith, Victor Leschuk,
	Nguyễn Thái Ngọc Duy, Fredrik Kuivinen,
	Brandon Williams, Stefan Beller, Johannes Schindelin,
	Ævar Arnfjörð Bjarmason

Amend the submodule recursion test to prepare it for subsequent tests
of whether it passes along the grep.patternType to the submodule
greps.

This is the result of searching & replacing:

    foobar -> (1|2)d(3|4)
    foo    -> (1|2)
    bar    -> (3|4)

Currently there's no tests for whether e.g. -P or -E is correctly
passed along, tests for that will be added in a follow-up change, but
first add content to the tests which will match differently under
different regex engines.

Reuse the pattern established in an earlier commit of mine in this
series ("log: add exhaustive tests for pattern style options &
config", 2017-04-07). The pattern "(.|.)[\d]" will match this content
differently under fixed/basic/extended & perl.

This test code was originally added in commit 0281e487fd ("grep:
optionally recurse into submodules", 2016-12-16).

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
---
 t/t7814-grep-recurse-submodules.sh | 166 ++++++++++++++++++-------------------
 1 file changed, 83 insertions(+), 83 deletions(-)

diff --git a/t/t7814-grep-recurse-submodules.sh b/t/t7814-grep-recurse-submodules.sh
index 5b6eb3a65e..1472855e1d 100755
--- a/t/t7814-grep-recurse-submodules.sh
+++ b/t/t7814-grep-recurse-submodules.sh
@@ -9,13 +9,13 @@ submodules.
 . ./test-lib.sh
 
 test_expect_success 'setup directory structure and submodule' '
-	echo "foobar" >a &&
+	echo "(1|2)d(3|4)" >a &&
 	mkdir b &&
-	echo "bar" >b/b &&
+	echo "(3|4)" >b/b &&
 	git add a b &&
 	git commit -m "add a and b" &&
 	git init submodule &&
-	echo "foobar" >submodule/a &&
+	echo "(1|2)d(3|4)" >submodule/a &&
 	git -C submodule add a &&
 	git -C submodule commit -m "add a" &&
 	git submodule add ./submodule &&
@@ -24,18 +24,18 @@ test_expect_success 'setup directory structure and submodule' '
 
 test_expect_success 'grep correctly finds patterns in a submodule' '
 	cat >expect <<-\EOF &&
-	a:foobar
-	b/b:bar
-	submodule/a:foobar
+	a:(1|2)d(3|4)
+	b/b:(3|4)
+	submodule/a:(1|2)d(3|4)
 	EOF
 
-	git grep -e "bar" --recurse-submodules >actual &&
+	git grep -e "(3|4)" --recurse-submodules >actual &&
 	test_cmp expect actual
 '
 
 test_expect_success 'grep and basic pathspecs' '
 	cat >expect <<-\EOF &&
-	submodule/a:foobar
+	submodule/a:(1|2)d(3|4)
 	EOF
 
 	git grep -e. --recurse-submodules -- submodule >actual &&
@@ -44,7 +44,7 @@ test_expect_success 'grep and basic pathspecs' '
 
 test_expect_success 'grep and nested submodules' '
 	git init submodule/sub &&
-	echo "foobar" >submodule/sub/a &&
+	echo "(1|2)d(3|4)" >submodule/sub/a &&
 	git -C submodule/sub add a &&
 	git -C submodule/sub commit -m "add a" &&
 	git -C submodule submodule add ./sub &&
@@ -54,117 +54,117 @@ test_expect_success 'grep and nested submodules' '
 	git commit -m "updated submodule" &&
 
 	cat >expect <<-\EOF &&
-	a:foobar
-	b/b:bar
-	submodule/a:foobar
-	submodule/sub/a:foobar
+	a:(1|2)d(3|4)
+	b/b:(3|4)
+	submodule/a:(1|2)d(3|4)
+	submodule/sub/a:(1|2)d(3|4)
 	EOF
 
-	git grep -e "bar" --recurse-submodules >actual &&
+	git grep -e "(3|4)" --recurse-submodules >actual &&
 	test_cmp expect actual
 '
 
 test_expect_success 'grep and multiple patterns' '
 	cat >expect <<-\EOF &&
-	a:foobar
-	submodule/a:foobar
-	submodule/sub/a:foobar
+	a:(1|2)d(3|4)
+	submodule/a:(1|2)d(3|4)
+	submodule/sub/a:(1|2)d(3|4)
 	EOF
 
-	git grep -e "bar" --and -e "foo" --recurse-submodules >actual &&
+	git grep -e "(3|4)" --and -e "(1|2)" --recurse-submodules >actual &&
 	test_cmp expect actual
 '
 
 test_expect_success 'grep and multiple patterns' '
 	cat >expect <<-\EOF &&
-	b/b:bar
+	b/b:(3|4)
 	EOF
 
-	git grep -e "bar" --and --not -e "foo" --recurse-submodules >actual &&
+	git grep -e "(3|4)" --and --not -e "(1|2)" --recurse-submodules >actual &&
 	test_cmp expect actual
 '
 
 test_expect_success 'basic grep tree' '
 	cat >expect <<-\EOF &&
-	HEAD:a:foobar
-	HEAD:b/b:bar
-	HEAD:submodule/a:foobar
-	HEAD:submodule/sub/a:foobar
+	HEAD:a:(1|2)d(3|4)
+	HEAD:b/b:(3|4)
+	HEAD:submodule/a:(1|2)d(3|4)
+	HEAD:submodule/sub/a:(1|2)d(3|4)
 	EOF
 
-	git grep -e "bar" --recurse-submodules HEAD >actual &&
+	git grep -e "(3|4)" --recurse-submodules HEAD >actual &&
 	test_cmp expect actual
 '
 
 test_expect_success 'grep tree HEAD^' '
 	cat >expect <<-\EOF &&
-	HEAD^:a:foobar
-	HEAD^:b/b:bar
-	HEAD^:submodule/a:foobar
+	HEAD^:a:(1|2)d(3|4)
+	HEAD^:b/b:(3|4)
+	HEAD^:submodule/a:(1|2)d(3|4)
 	EOF
 
-	git grep -e "bar" --recurse-submodules HEAD^ >actual &&
+	git grep -e "(3|4)" --recurse-submodules HEAD^ >actual &&
 	test_cmp expect actual
 '
 
 test_expect_success 'grep tree HEAD^^' '
 	cat >expect <<-\EOF &&
-	HEAD^^:a:foobar
-	HEAD^^:b/b:bar
+	HEAD^^:a:(1|2)d(3|4)
+	HEAD^^:b/b:(3|4)
 	EOF
 
-	git grep -e "bar" --recurse-submodules HEAD^^ >actual &&
+	git grep -e "(3|4)" --recurse-submodules HEAD^^ >actual &&
 	test_cmp expect actual
 '
 
 test_expect_success 'grep tree and pathspecs' '
 	cat >expect <<-\EOF &&
-	HEAD:submodule/a:foobar
-	HEAD:submodule/sub/a:foobar
+	HEAD:submodule/a:(1|2)d(3|4)
+	HEAD:submodule/sub/a:(1|2)d(3|4)
 	EOF
 
-	git grep -e "bar" --recurse-submodules HEAD -- submodule >actual &&
+	git grep -e "(3|4)" --recurse-submodules HEAD -- submodule >actual &&
 	test_cmp expect actual
 '
 
 test_expect_success 'grep tree and pathspecs' '
 	cat >expect <<-\EOF &&
-	HEAD:submodule/a:foobar
-	HEAD:submodule/sub/a:foobar
+	HEAD:submodule/a:(1|2)d(3|4)
+	HEAD:submodule/sub/a:(1|2)d(3|4)
 	EOF
 
-	git grep -e "bar" --recurse-submodules HEAD -- "submodule*a" >actual &&
+	git grep -e "(3|4)" --recurse-submodules HEAD -- "submodule*a" >actual &&
 	test_cmp expect actual
 '
 
 test_expect_success 'grep tree and more pathspecs' '
 	cat >expect <<-\EOF &&
-	HEAD:submodule/a:foobar
+	HEAD:submodule/a:(1|2)d(3|4)
 	EOF
 
-	git grep -e "bar" --recurse-submodules HEAD -- "submodul?/a" >actual &&
+	git grep -e "(3|4)" --recurse-submodules HEAD -- "submodul?/a" >actual &&
 	test_cmp expect actual
 '
 
 test_expect_success 'grep tree and more pathspecs' '
 	cat >expect <<-\EOF &&
-	HEAD:submodule/sub/a:foobar
+	HEAD:submodule/sub/a:(1|2)d(3|4)
 	EOF
 
-	git grep -e "bar" --recurse-submodules HEAD -- "submodul*/sub/a" >actual &&
+	git grep -e "(3|4)" --recurse-submodules HEAD -- "submodul*/sub/a" >actual &&
 	test_cmp expect actual
 '
 
 test_expect_success !MINGW 'grep recurse submodule colon in name' '
 	git init parent &&
 	test_when_finished "rm -rf parent" &&
-	echo "foobar" >"parent/fi:le" &&
+	echo "(1|2)d(3|4)" >"parent/fi:le" &&
 	git -C parent add "fi:le" &&
 	git -C parent commit -m "add fi:le" &&
 
 	git init "su:b" &&
 	test_when_finished "rm -rf su:b" &&
-	echo "foobar" >"su:b/fi:le" &&
+	echo "(1|2)d(3|4)" >"su:b/fi:le" &&
 	git -C "su:b" add "fi:le" &&
 	git -C "su:b" commit -m "add fi:le" &&
 
@@ -172,30 +172,30 @@ test_expect_success !MINGW 'grep recurse submodule colon in name' '
 	git -C parent commit -m "add submodule" &&
 
 	cat >expect <<-\EOF &&
-	fi:le:foobar
-	su:b/fi:le:foobar
+	fi:le:(1|2)d(3|4)
+	su:b/fi:le:(1|2)d(3|4)
 	EOF
-	git -C parent grep -e "foobar" --recurse-submodules >actual &&
+	git -C parent grep -e "(1|2)d(3|4)" --recurse-submodules >actual &&
 	test_cmp expect actual &&
 
 	cat >expect <<-\EOF &&
-	HEAD:fi:le:foobar
-	HEAD:su:b/fi:le:foobar
+	HEAD:fi:le:(1|2)d(3|4)
+	HEAD:su:b/fi:le:(1|2)d(3|4)
 	EOF
-	git -C parent grep -e "foobar" --recurse-submodules HEAD >actual &&
+	git -C parent grep -e "(1|2)d(3|4)" --recurse-submodules HEAD >actual &&
 	test_cmp expect actual
 '
 
 test_expect_success 'grep history with moved submoules' '
 	git init parent &&
 	test_when_finished "rm -rf parent" &&
-	echo "foobar" >parent/file &&
+	echo "(1|2)d(3|4)" >parent/file &&
 	git -C parent add file &&
 	git -C parent commit -m "add file" &&
 
 	git init sub &&
 	test_when_finished "rm -rf sub" &&
-	echo "foobar" >sub/file &&
+	echo "(1|2)d(3|4)" >sub/file &&
 	git -C sub add file &&
 	git -C sub commit -m "add file" &&
 
@@ -203,82 +203,82 @@ test_expect_success 'grep history with moved submoules' '
 	git -C parent commit -m "add submodule" &&
 
 	cat >expect <<-\EOF &&
-	dir/sub/file:foobar
-	file:foobar
+	dir/sub/file:(1|2)d(3|4)
+	file:(1|2)d(3|4)
 	EOF
-	git -C parent grep -e "foobar" --recurse-submodules >actual &&
+	git -C parent grep -e "(1|2)d(3|4)" --recurse-submodules >actual &&
 	test_cmp expect actual &&
 
 	git -C parent mv dir/sub sub-moved &&
 	git -C parent commit -m "moved submodule" &&
 
 	cat >expect <<-\EOF &&
-	file:foobar
-	sub-moved/file:foobar
+	file:(1|2)d(3|4)
+	sub-moved/file:(1|2)d(3|4)
 	EOF
-	git -C parent grep -e "foobar" --recurse-submodules >actual &&
+	git -C parent grep -e "(1|2)d(3|4)" --recurse-submodules >actual &&
 	test_cmp expect actual &&
 
 	cat >expect <<-\EOF &&
-	HEAD^:dir/sub/file:foobar
-	HEAD^:file:foobar
+	HEAD^:dir/sub/file:(1|2)d(3|4)
+	HEAD^:file:(1|2)d(3|4)
 	EOF
-	git -C parent grep -e "foobar" --recurse-submodules HEAD^ >actual &&
+	git -C parent grep -e "(1|2)d(3|4)" --recurse-submodules HEAD^ >actual &&
 	test_cmp expect actual
 '
 
 test_expect_success 'grep using relative path' '
 	test_when_finished "rm -rf parent sub" &&
 	git init sub &&
-	echo "foobar" >sub/file &&
+	echo "(1|2)d(3|4)" >sub/file &&
 	git -C sub add file &&
 	git -C sub commit -m "add file" &&
 
 	git init parent &&
-	echo "foobar" >parent/file &&
+	echo "(1|2)d(3|4)" >parent/file &&
 	git -C parent add file &&
 	mkdir parent/src &&
-	echo "foobar" >parent/src/file2 &&
+	echo "(1|2)d(3|4)" >parent/src/file2 &&
 	git -C parent add src/file2 &&
 	git -C parent submodule add ../sub &&
 	git -C parent commit -m "add files and submodule" &&
 
 	# From top works
 	cat >expect <<-\EOF &&
-	file:foobar
-	src/file2:foobar
-	sub/file:foobar
+	file:(1|2)d(3|4)
+	src/file2:(1|2)d(3|4)
+	sub/file:(1|2)d(3|4)
 	EOF
-	git -C parent grep --recurse-submodules -e "foobar" >actual &&
+	git -C parent grep --recurse-submodules -e "(1|2)d(3|4)" >actual &&
 	test_cmp expect actual &&
 
 	# Relative path to top
 	cat >expect <<-\EOF &&
-	../file:foobar
-	file2:foobar
-	../sub/file:foobar
+	../file:(1|2)d(3|4)
+	file2:(1|2)d(3|4)
+	../sub/file:(1|2)d(3|4)
 	EOF
-	git -C parent/src grep --recurse-submodules -e "foobar" -- .. >actual &&
+	git -C parent/src grep --recurse-submodules -e "(1|2)d(3|4)" -- .. >actual &&
 	test_cmp expect actual &&
 
 	# Relative path to submodule
 	cat >expect <<-\EOF &&
-	../sub/file:foobar
+	../sub/file:(1|2)d(3|4)
 	EOF
-	git -C parent/src grep --recurse-submodules -e "foobar" -- ../sub >actual &&
+	git -C parent/src grep --recurse-submodules -e "(1|2)d(3|4)" -- ../sub >actual &&
 	test_cmp expect actual
 '
 
 test_expect_success 'grep from a subdir' '
 	test_when_finished "rm -rf parent sub" &&
 	git init sub &&
-	echo "foobar" >sub/file &&
+	echo "(1|2)d(3|4)" >sub/file &&
 	git -C sub add file &&
 	git -C sub commit -m "add file" &&
 
 	git init parent &&
 	mkdir parent/src &&
-	echo "foobar" >parent/src/file &&
+	echo "(1|2)d(3|4)" >parent/src/file &&
 	git -C parent add src/file &&
 	git -C parent submodule add ../sub src/sub &&
 	git -C parent submodule add ../sub sub &&
@@ -286,19 +286,19 @@ test_expect_success 'grep from a subdir' '
 
 	# Verify grep from root works
 	cat >expect <<-\EOF &&
-	src/file:foobar
-	src/sub/file:foobar
-	sub/file:foobar
+	src/file:(1|2)d(3|4)
+	src/sub/file:(1|2)d(3|4)
+	sub/file:(1|2)d(3|4)
 	EOF
-	git -C parent grep --recurse-submodules -e "foobar" >actual &&
+	git -C parent grep --recurse-submodules -e "(1|2)d(3|4)" >actual &&
 	test_cmp expect actual &&
 
 	# Verify grep from a subdir works
 	cat >expect <<-\EOF &&
-	file:foobar
-	sub/file:foobar
+	file:(1|2)d(3|4)
+	sub/file:(1|2)d(3|4)
 	EOF
-	git -C parent/src grep --recurse-submodules -e "foobar" >actual &&
+	git -C parent/src grep --recurse-submodules -e "(1|2)d(3|4)" >actual &&
 	test_cmp expect actual
 '
 
-- 
2.11.0


^ permalink raw reply related	[flat|nested] 42+ messages in thread

* [PATCH v2 10/29] grep: add tests for grep pattern types being passed to submodules
  2017-05-13 23:14 [PATCH v2 00/29] Easy to review grep & pre-PCRE changes Ævar Arnfjörð Bjarmason
                   ` (8 preceding siblings ...)
  2017-05-13 23:14 ` [PATCH v2 09/29] grep: amend submodule recursion test for regex engine testing Ævar Arnfjörð Bjarmason
@ 2017-05-13 23:14 ` Ævar Arnfjörð Bjarmason
  2017-05-13 23:14 ` [PATCH v2 11/29] grep: add a test helper function for less verbose -f \0 tests Ævar Arnfjörð Bjarmason
                   ` (19 subsequent siblings)
  29 siblings, 0 replies; 42+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2017-05-13 23:14 UTC (permalink / raw)
  To: git
  Cc: Junio C Hamano, Jeff King, Jeffrey Walton,
	Michał Kiedrowicz, J Smith, Victor Leschuk,
	Nguyễn Thái Ngọc Duy, Fredrik Kuivinen,
	Brandon Williams, Stefan Beller, Johannes Schindelin,
	Ævar Arnfjörð Bjarmason

Add testing for grep pattern types being correctly passed to
submodules. The pattern "(.|.)[\d]" matches differently under
fixed (not at all), and then matches different lines under
basic/extended & perl regular expressions, so this change asserts that
the pattern type is passed along correctly.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
---
 t/t7814-grep-recurse-submodules.sh | 49 ++++++++++++++++++++++++++++++++++++++
 1 file changed, 49 insertions(+)

diff --git a/t/t7814-grep-recurse-submodules.sh b/t/t7814-grep-recurse-submodules.sh
index 1472855e1d..3a58197f47 100755
--- a/t/t7814-grep-recurse-submodules.sh
+++ b/t/t7814-grep-recurse-submodules.sh
@@ -313,4 +313,53 @@ test_incompatible_with_recurse_submodules ()
 test_incompatible_with_recurse_submodules --untracked
 test_incompatible_with_recurse_submodules --no-index
 
+test_expect_success 'grep --recurse-submodules should pass the pattern type along' '
+	# Fixed
+	test_must_fail git grep -F --recurse-submodules -e "(.|.)[\d]" &&
+	test_must_fail git -c grep.patternType=fixed grep --recurse-submodules -e "(.|.)[\d]" &&
+
+	# Basic
+	git grep -G --recurse-submodules -e "(.|.)[\d]" >actual &&
+	cat >expect <<-\EOF &&
+	a:(1|2)d(3|4)
+	submodule/a:(1|2)d(3|4)
+	submodule/sub/a:(1|2)d(3|4)
+	EOF
+	test_cmp expect actual &&
+	git -c grep.patternType=basic grep --recurse-submodules -e "(.|.)[\d]" >actual &&
+	test_cmp expect actual &&
+
+	# Extended
+	git grep -E --recurse-submodules -e "(.|.)[\d]" >actual &&
+	cat >expect <<-\EOF &&
+	.gitmodules:[submodule "submodule"]
+	.gitmodules:	path = submodule
+	.gitmodules:	url = ./submodule
+	a:(1|2)d(3|4)
+	submodule/.gitmodules:[submodule "sub"]
+	submodule/a:(1|2)d(3|4)
+	submodule/sub/a:(1|2)d(3|4)
+	EOF
+	test_cmp expect actual &&
+	git -c grep.patternType=extended grep --recurse-submodules -e "(.|.)[\d]" >actual &&
+	test_cmp expect actual &&
+	git -c grep.extendedRegexp=true grep --recurse-submodules -e "(.|.)[\d]" >actual &&
+	test_cmp expect actual &&
+
+	# Perl
+	if test_have_prereq PCRE
+	then
+		git grep -P --recurse-submodules -e "(.|.)[\d]" >actual &&
+		cat >expect <<-\EOF &&
+		a:(1|2)d(3|4)
+		b/b:(3|4)
+		submodule/a:(1|2)d(3|4)
+		submodule/sub/a:(1|2)d(3|4)
+		EOF
+		test_cmp expect actual &&
+		git -c grep.patternType=perl grep --recurse-submodules -e "(.|.)[\d]" >actual &&
+		test_cmp expect actual
+	fi
+'
+
 test_done
-- 
2.11.0


^ permalink raw reply related	[flat|nested] 42+ messages in thread

* [PATCH v2 11/29] grep: add a test helper function for less verbose -f \0 tests
  2017-05-13 23:14 [PATCH v2 00/29] Easy to review grep & pre-PCRE changes Ævar Arnfjörð Bjarmason
                   ` (9 preceding siblings ...)
  2017-05-13 23:14 ` [PATCH v2 10/29] grep: add tests for grep pattern types being passed to submodules Ævar Arnfjörð Bjarmason
@ 2017-05-13 23:14 ` Ævar Arnfjörð Bjarmason
  2017-05-13 23:14 ` [PATCH v2 12/29] grep: prepare for testing binary regexes containing rx metacharacters Ævar Arnfjörð Bjarmason
                   ` (18 subsequent siblings)
  29 siblings, 0 replies; 42+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2017-05-13 23:14 UTC (permalink / raw)
  To: git
  Cc: Junio C Hamano, Jeff King, Jeffrey Walton,
	Michał Kiedrowicz, J Smith, Victor Leschuk,
	Nguyễn Thái Ngọc Duy, Fredrik Kuivinen,
	Brandon Williams, Stefan Beller, Johannes Schindelin,
	Ævar Arnfjörð Bjarmason

Add a helper function to make the tests which check for patterns with
\0 in them more succinct. Right now this isn't a big win, but
subsequent commits will add a lot more of these tests.

The helper is based on the match() function in t3070-wildmatch.sh.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
---
 t/t7008-grep-binary.sh | 58 +++++++++++++++++++++++++-------------------------
 1 file changed, 29 insertions(+), 29 deletions(-)

diff --git a/t/t7008-grep-binary.sh b/t/t7008-grep-binary.sh
index 9c9c378119..70e7868829 100755
--- a/t/t7008-grep-binary.sh
+++ b/t/t7008-grep-binary.sh
@@ -4,6 +4,29 @@ test_description='git grep in binary files'
 
 . ./test-lib.sh
 
+nul_match () {
+	status=$1
+	flags=$2
+	pattern=$3
+	pattern_human=$(echo "$pattern" | sed 's/Q/<NUL>/g')
+
+	if test "$status" = 1
+	then
+		test_expect_success "git grep -f f $flags '$pattern_human' a" "
+			printf '$pattern' | q_to_nul >f &&
+			git grep -f f $flags a
+		"
+	elif test "$status" = 0
+	then
+		test_expect_success "git grep -f f $flags '$pattern_human' a" "
+			printf '$pattern' | q_to_nul >f &&
+			test_must_fail git grep -f f $flags a
+		"
+	else
+		test_expect_success "PANIC: Test framework error. Unknown status $status" 'false'
+	fi
+}
+
 test_expect_success 'setup' "
 	echo 'binaryQfile' | q_to_nul >a &&
 	git add a &&
@@ -69,35 +92,12 @@ test_expect_failure 'git grep .fi a' '
 	git grep .fi a
 '
 
-test_expect_success 'git grep -F y<NUL>f a' "
-	printf 'yQf' | q_to_nul >f &&
-	git grep -f f -F a
-"
-
-test_expect_success 'git grep -F y<NUL>x a' "
-	printf 'yQx' | q_to_nul >f &&
-	test_must_fail git grep -f f -F a
-"
-
-test_expect_success 'git grep -Fi Y<NUL>f a' "
-	printf 'YQf' | q_to_nul >f &&
-	git grep -f f -Fi a
-"
-
-test_expect_success 'git grep -Fi Y<NUL>x a' "
-	printf 'YQx' | q_to_nul >f &&
-	test_must_fail git grep -f f -Fi a
-"
-
-test_expect_success 'git grep y<NUL>f a' "
-	printf 'yQf' | q_to_nul >f &&
-	git grep -f f a
-"
-
-test_expect_success 'git grep y<NUL>x a' "
-	printf 'yQx' | q_to_nul >f &&
-	test_must_fail git grep -f f a
-"
+nul_match 1 '-F' 'yQf'
+nul_match 0 '-F' 'yQx'
+nul_match 1 '-Fi' 'YQf'
+nul_match 0 '-Fi' 'YQx'
+nul_match 1 '' 'yQf'
+nul_match 0 '' 'yQx'
 
 test_expect_success 'grep respects binary diff attribute' '
 	echo text >t &&
-- 
2.11.0


^ permalink raw reply related	[flat|nested] 42+ messages in thread

* [PATCH v2 12/29] grep: prepare for testing binary regexes containing rx metacharacters
  2017-05-13 23:14 [PATCH v2 00/29] Easy to review grep & pre-PCRE changes Ævar Arnfjörð Bjarmason
                   ` (10 preceding siblings ...)
  2017-05-13 23:14 ` [PATCH v2 11/29] grep: add a test helper function for less verbose -f \0 tests Ævar Arnfjörð Bjarmason
@ 2017-05-13 23:14 ` Ævar Arnfjörð Bjarmason
  2017-05-13 23:14 ` [PATCH v2 13/29] grep: add tests to fix blind spots with \0 patterns Ævar Arnfjörð Bjarmason
                   ` (17 subsequent siblings)
  29 siblings, 0 replies; 42+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2017-05-13 23:14 UTC (permalink / raw)
  To: git
  Cc: Junio C Hamano, Jeff King, Jeffrey Walton,
	Michał Kiedrowicz, J Smith, Victor Leschuk,
	Nguyễn Thái Ngọc Duy, Fredrik Kuivinen,
	Brandon Williams, Stefan Beller, Johannes Schindelin,
	Ævar Arnfjörð Bjarmason

Add setup code needed for testing regexes that contain both binary
data and regex metacharacters.

The POSIX regcomp() function inherently can't support that, because it
takes a \0-delimited char *, but other regex engines APIs like PCRE v2
take a pattern/length pair, and are thus able to handle \0s in
patterns as well as any other character.

When kwset was imported in commit 9eceddeec6 ("Use kwset in grep",
2011-08-21) this limitation was fixed, but at the expense of
introducing the undocumented limitation that any pattern containing \0
implicitly becomes a fixed match (equivalent to -F having been
provided).

That's not something we'd like to keep in the future. The inability to
match patterns containing \0 is a leaky implementation detail.

So add tests as a first step towards changing that. In order to test
that \0-patterns can properly match as regexes the test string needs
to have some regex metacharacters in it.

There were other blind spots in the tests. The code around kwset
specially handles case-insensitive & non-ASCII data, but there were no
tests for this.

Fix all of that by amending the text being matched to contain both
regex metacharacters & non-ASCII data.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
---
 t/t7008-grep-binary.sh | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/t/t7008-grep-binary.sh b/t/t7008-grep-binary.sh
index 70e7868829..e7754c3946 100755
--- a/t/t7008-grep-binary.sh
+++ b/t/t7008-grep-binary.sh
@@ -28,7 +28,7 @@ nul_match () {
 }
 
 test_expect_success 'setup' "
-	echo 'binaryQfile' | q_to_nul >a &&
+	echo 'binaryQfileQm[*]cQ*æQð' | q_to_nul >a &&
 	git add a &&
 	git commit -m.
 "
@@ -162,7 +162,7 @@ test_expect_success 'grep does not honor textconv' '
 '
 
 test_expect_success 'grep --textconv honors textconv' '
-	echo "a:binaryQfile" >expect &&
+	echo "a:binaryQfileQm[*]cQ*æQð" >expect &&
 	git grep --textconv Qfile >actual &&
 	test_cmp expect actual
 '
@@ -172,7 +172,7 @@ test_expect_success 'grep --no-textconv does not honor textconv' '
 '
 
 test_expect_success 'grep --textconv blob honors textconv' '
-	echo "HEAD:a:binaryQfile" >expect &&
+	echo "HEAD:a:binaryQfileQm[*]cQ*æQð" >expect &&
 	git grep --textconv Qfile HEAD:a >actual &&
 	test_cmp expect actual
 '
-- 
2.11.0


^ permalink raw reply related	[flat|nested] 42+ messages in thread

* [PATCH v2 13/29] grep: add tests to fix blind spots with \0 patterns
  2017-05-13 23:14 [PATCH v2 00/29] Easy to review grep & pre-PCRE changes Ævar Arnfjörð Bjarmason
                   ` (11 preceding siblings ...)
  2017-05-13 23:14 ` [PATCH v2 12/29] grep: prepare for testing binary regexes containing rx metacharacters Ævar Arnfjörð Bjarmason
@ 2017-05-13 23:14 ` Ævar Arnfjörð Bjarmason
  2017-05-13 23:14 ` [PATCH v2 14/29] perf: add a GIT_PERF_MAKE_COMMAND for when *_MAKE_OPTS won't do Ævar Arnfjörð Bjarmason
                   ` (16 subsequent siblings)
  29 siblings, 0 replies; 42+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2017-05-13 23:14 UTC (permalink / raw)
  To: git
  Cc: Junio C Hamano, Jeff King, Jeffrey Walton,
	Michał Kiedrowicz, J Smith, Victor Leschuk,
	Nguyễn Thái Ngọc Duy, Fredrik Kuivinen,
	Brandon Williams, Stefan Beller, Johannes Schindelin,
	Ævar Arnfjörð Bjarmason

Address a big blind spot in the tests for patterns containing \0. The
is_fixed() function considers any string that contains \0 fixed, even
if it contains regular expression metacharacters, those patterns are
currently matched with kwset.

Before this change removing that memchr(s, 0, len) check from
is_fixed() wouldn't change the result of any of the tests, since
regcomp() will happily match the part before the \0.

Furthermore, the kwset path is dependent on whether the the -i flag is
on, and whether the pattern has any non-ASCII characters, but none of
this was tested for.

See the a previous commit in this series ("grep: add tests to fix
blind spots with \0 patterns", 2017-04-21) for further details &
rationale.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
---
 t/t7008-grep-binary.sh | 71 ++++++++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 71 insertions(+)

diff --git a/t/t7008-grep-binary.sh b/t/t7008-grep-binary.sh
index e7754c3946..079395ba54 100755
--- a/t/t7008-grep-binary.sh
+++ b/t/t7008-grep-binary.sh
@@ -22,6 +22,18 @@ nul_match () {
 			printf '$pattern' | q_to_nul >f &&
 			test_must_fail git grep -f f $flags a
 		"
+	elif test "$status" = T1
+	then
+		test_expect_failure "git grep -f f $flags '$pattern_human' a" "
+			printf '$pattern' | q_to_nul >f &&
+			git grep -f f $flags a
+		"
+	elif test "$status" = T0
+	then
+		test_expect_failure "git grep -f f $flags '$pattern_human' a" "
+			printf '$pattern' | q_to_nul >f &&
+			test_must_fail git grep -f f $flags a
+		"
 	else
 		test_expect_success "PANIC: Test framework error. Unknown status $status" 'false'
 	fi
@@ -98,6 +110,65 @@ nul_match 1 '-Fi' 'YQf'
 nul_match 0 '-Fi' 'YQx'
 nul_match 1 '' 'yQf'
 nul_match 0 '' 'yQx'
+nul_match 1 '' 'æQð'
+nul_match 1 '-F' 'eQm[*]c'
+nul_match 1 '-Fi' 'EQM[*]C'
+
+# Regex patterns that would match but shouldn't with -F
+nul_match 0 '-F' 'yQ[f]'
+nul_match 0 '-F' '[y]Qf'
+nul_match 0 '-Fi' 'YQ[F]'
+nul_match 0 '-Fi' '[Y]QF'
+nul_match 0 '-F' 'æQ[ð]'
+nul_match 0 '-F' '[æ]Qð'
+nul_match 0 '-Fi' 'ÆQ[Ð]'
+nul_match 0 '-Fi' '[Æ]QÐ'
+
+# kwset is disabled on -i & non-ASCII. No way to match non-ASCII \0
+# patterns case-insensitively.
+nul_match T1 '-i' 'ÆQÐ'
+
+# \0 implicitly disables regexes. This is an undocumented internal
+# limitation.
+nul_match T1 '' 'yQ[f]'
+nul_match T1 '' '[y]Qf'
+nul_match T1 '-i' 'YQ[F]'
+nul_match T1 '-i' '[Y]Qf'
+nul_match T1 '' 'æQ[ð]'
+nul_match T1 '' '[æ]Qð'
+nul_match T1 '-i' 'ÆQ[Ð]'
+
+# ... because of \0 implicitly disabling regexes regexes that
+# should/shouldn't match don't do the right thing.
+nul_match T1 '' 'eQm.*cQ'
+nul_match T1 '-i' 'EQM.*cQ'
+nul_match T0 '' 'eQm[*]c'
+nul_match T0 '-i' 'EQM[*]C'
+
+# Due to the REG_STARTEND extension when kwset() is disabled on -i &
+# non-ASCII the string will be matched in its entirety, but the
+# pattern will be cut off at the first \0.
+nul_match 0 '-i' 'NOMATCHQð'
+nul_match T0 '-i' '[Æ]QNOMATCH'
+nul_match T0 '-i' '[æ]QNOMATCH'
+# Matches, but for the wrong reasons, just stops at [æ]
+nul_match 1 '-i' '[Æ]Qð'
+nul_match 1 '-i' '[æ]Qð'
+
+# Ensure that the matcher doesn't regress to something that stops at
+# \0
+nul_match 0 '-F' 'yQ[f]'
+nul_match 0 '-Fi' 'YQ[F]'
+nul_match 0 '' 'yQNOMATCH'
+nul_match 0 '' 'QNOMATCH'
+nul_match 0 '-i' 'YQNOMATCH'
+nul_match 0 '-i' 'QNOMATCH'
+nul_match 0 '-F' 'æQ[ð]'
+nul_match 0 '-Fi' 'ÆQ[Ð]'
+nul_match 0 '' 'yQNÓMATCH'
+nul_match 0 '' 'QNÓMATCH'
+nul_match 0 '-i' 'YQNÓMATCH'
+nul_match 0 '-i' 'QNÓMATCH'
 
 test_expect_success 'grep respects binary diff attribute' '
 	echo text >t &&
-- 
2.11.0


^ permalink raw reply related	[flat|nested] 42+ messages in thread

* [PATCH v2 14/29] perf: add a GIT_PERF_MAKE_COMMAND for when *_MAKE_OPTS won't do
  2017-05-13 23:14 [PATCH v2 00/29] Easy to review grep & pre-PCRE changes Ævar Arnfjörð Bjarmason
                   ` (12 preceding siblings ...)
  2017-05-13 23:14 ` [PATCH v2 13/29] grep: add tests to fix blind spots with \0 patterns Ævar Arnfjörð Bjarmason
@ 2017-05-13 23:14 ` Ævar Arnfjörð Bjarmason
  2017-05-13 23:14 ` [PATCH v2 15/29] perf: emit progress output when unpacking & building Ævar Arnfjörð Bjarmason
                   ` (15 subsequent siblings)
  29 siblings, 0 replies; 42+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2017-05-13 23:14 UTC (permalink / raw)
  To: git
  Cc: Junio C Hamano, Jeff King, Jeffrey Walton,
	Michał Kiedrowicz, J Smith, Victor Leschuk,
	Nguyễn Thái Ngọc Duy, Fredrik Kuivinen,
	Brandon Williams, Stefan Beller, Johannes Schindelin,
	Ævar Arnfjörð Bjarmason

Add a git GIT_PERF_MAKE_COMMAND variable to compliment the existing
GIT_PERF_MAKE_OPTS facility. This allows specifying an arbitrary shell
command to execute instead of 'make'.

This is useful e.g. in cases where the name, semantics or defaults of
a Makefile flag have changed over time. It can even be used to change
the contents of the tree, useful for monkeypatching ancient versions
of git to get them to build.

This opens Pandora's box in some ways, it's now possible to
"jailbreak" the perf environment and e.g. modify the source tree via
this arbitrary instead of just issuing a custom "make" command, such a
command has to be re-entrant in the sense that subsequent perf runs
will re-use the possibly modified tree.

It would be pointless to try to mitigate or work around that caveat in
a tool purely aimed at Git developers, so this change makes no attempt
to do so.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
---
 Makefile      |  3 +++
 t/perf/README | 19 +++++++++++++++++--
 t/perf/run    | 11 +++++++++--
 3 files changed, 29 insertions(+), 4 deletions(-)

diff --git a/Makefile b/Makefile
index eedadb8056..d1587452f1 100644
--- a/Makefile
+++ b/Makefile
@@ -2272,6 +2272,9 @@ endif
 ifdef GIT_PERF_MAKE_OPTS
 	@echo GIT_PERF_MAKE_OPTS=\''$(subst ','\'',$(subst ','\'',$(GIT_PERF_MAKE_OPTS)))'\' >>$@+
 endif
+ifdef GIT_PERF_MAKE_COMMAND
+	@echo GIT_PERF_MAKE_COMMAND=\''$(subst ','\'',$(subst ','\'',$(GIT_PERF_MAKE_COMMAND)))'\' >>$@+
+endif
 ifdef GIT_INTEROP_MAKE_OPTS
 	@echo GIT_INTEROP_MAKE_OPTS=\''$(subst ','\'',$(subst ','\'',$(GIT_INTEROP_MAKE_OPTS)))'\' >>$@+
 endif
diff --git a/t/perf/README b/t/perf/README
index 49ea4349be..b3d95042a8 100644
--- a/t/perf/README
+++ b/t/perf/README
@@ -60,8 +60,23 @@ You can set the following variables (also in your config.mak):
 
     GIT_PERF_MAKE_OPTS
 	Options to use when automatically building a git tree for
-	performance testing.  E.g., -j6 would be useful.
-
+	performance testing. E.g., -j6 would be useful. Passed
+	directly to make as "make $GIT_PERF_MAKE_OPTS".
+
+    GIT_PERF_MAKE_COMMAND
+	An arbitrary command that'll be run in place of the make
+	command, if set the GIT_PERF_MAKE_OPTS variable is
+	ignored. Useful in cases where source tree changes might
+	require issuing a different make command to different
+	revisions.
+
+	This can be (ab)used to monkeypatch or otherwise change the
+	tree about to be built. Note that the build directory can be
+	re-used for subsequent runs so the make command might get
+	executed multiple times on the same tree, but don't count on
+	any of that, that's an implementation detail that might change
+	in the future.
+ 
     GIT_PERF_REPO
     GIT_PERF_LARGE_REPO
 	Repositories to copy for the performance tests.  The normal
diff --git a/t/perf/run b/t/perf/run
index c788d713ae..b61024a830 100755
--- a/t/perf/run
+++ b/t/perf/run
@@ -37,8 +37,15 @@ build_git_rev () {
 			cp "../../$config" "build/$rev/"
 		fi
 	done
-	(cd build/$rev && make $GIT_PERF_MAKE_OPTS) ||
-	die "failed to build revision '$mydir'"
+	(
+		cd build/$rev &&
+		if test -n "$GIT_PERF_MAKE_COMMAND"
+		then
+			sh -c "$GIT_PERF_MAKE_COMMAND"
+		else
+			make $GIT_PERF_MAKE_OPTS
+		fi
+	) || die "failed to build revision '$mydir'"
 }
 
 run_dirs_helper () {
-- 
2.11.0


^ permalink raw reply related	[flat|nested] 42+ messages in thread

* [PATCH v2 15/29] perf: emit progress output when unpacking & building
  2017-05-13 23:14 [PATCH v2 00/29] Easy to review grep & pre-PCRE changes Ævar Arnfjörð Bjarmason
                   ` (13 preceding siblings ...)
  2017-05-13 23:14 ` [PATCH v2 14/29] perf: add a GIT_PERF_MAKE_COMMAND for when *_MAKE_OPTS won't do Ævar Arnfjörð Bjarmason
@ 2017-05-13 23:14 ` Ævar Arnfjörð Bjarmason
  2017-05-13 23:14 ` [PATCH v2 16/29] perf: add a performance comparison test of grep -G, -E and -P Ævar Arnfjörð Bjarmason
                   ` (14 subsequent siblings)
  29 siblings, 0 replies; 42+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2017-05-13 23:14 UTC (permalink / raw)
  To: git
  Cc: Junio C Hamano, Jeff King, Jeffrey Walton,
	Michał Kiedrowicz, J Smith, Victor Leschuk,
	Nguyễn Thái Ngọc Duy, Fredrik Kuivinen,
	Brandon Williams, Stefan Beller, Johannes Schindelin,
	Ævar Arnfjörð Bjarmason

Amend the t/perf/run output so that in addition to the "Running N
tests" heading currently being emitted, it also emits "Unpacking $rev"
and "Building $rev" when setting up the build/$rev directory & when
building it, respectively.

This makes it easier to see what's going on and what revision is being
tested as the output scrolls by.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
---
 t/perf/run | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/t/perf/run b/t/perf/run
index b61024a830..beb4acc0e4 100755
--- a/t/perf/run
+++ b/t/perf/run
@@ -24,6 +24,7 @@ run_one_dir () {
 
 unpack_git_rev () {
 	rev=$1
+	echo "=== Unpacking $rev in build/$rev ==="
 	mkdir -p build/$rev
 	(cd "$(git rev-parse --show-cdup)" && git archive --format=tar $rev) |
 	(cd build/$rev && tar x)
@@ -37,6 +38,7 @@ build_git_rev () {
 			cp "../../$config" "build/$rev/"
 		fi
 	done
+	echo "=== Building $rev ==="
 	(
 		cd build/$rev &&
 		if test -n "$GIT_PERF_MAKE_COMMAND"
-- 
2.11.0


^ permalink raw reply related	[flat|nested] 42+ messages in thread

* [PATCH v2 16/29] perf: add a performance comparison test of grep -G, -E and -P
  2017-05-13 23:14 [PATCH v2 00/29] Easy to review grep & pre-PCRE changes Ævar Arnfjörð Bjarmason
                   ` (14 preceding siblings ...)
  2017-05-13 23:14 ` [PATCH v2 15/29] perf: emit progress output when unpacking & building Ævar Arnfjörð Bjarmason
@ 2017-05-13 23:14 ` Ævar Arnfjörð Bjarmason
  2017-05-13 23:14 ` [PATCH v2 17/29] perf: add a performance comparison of fixed-string grep Ævar Arnfjörð Bjarmason
                   ` (13 subsequent siblings)
  29 siblings, 0 replies; 42+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2017-05-13 23:14 UTC (permalink / raw)
  To: git
  Cc: Junio C Hamano, Jeff King, Jeffrey Walton,
	Michał Kiedrowicz, J Smith, Victor Leschuk,
	Nguyễn Thái Ngọc Duy, Fredrik Kuivinen,
	Brandon Williams, Stefan Beller, Johannes Schindelin,
	Ævar Arnfjörð Bjarmason

Add a very basic performance comparison test comparing the POSIX
basic, extended and perl engines.

In theory the "basic" and "extended" engines should be implemented
using the same underlying code with a slightly different pattern
parser, but some implementations may not do this. Jump through some
slight hoops to test both, which is worthwhile since "basic" is the
default.

Running this on an i7 3.4GHz Linux 4.9.0-2 Debian testing against a
checkout of linux.git & latest upstream PCRE, both PCRE and git
compiled with -O3 using gcc 7.1.1:

    $ GIT_PERF_REPEAT_COUNT=10 GIT_PERF_LARGE_REPO=~/g/linux ./run p7820-grep-engines.sh
    [...]
    -------------------------------------------------------------
    7820.1: basic grep how.to                     0.28(1.27+0.41)
    7820.2: extended grep how.to                  0.29(1.23+0.44)
    7820.3: perl grep how.to                      0.27(1.14+0.47)
    7820.5: basic grep ^how to                    0.28(1.18+0.46)
    7820.6: extended grep ^how to                 0.28(1.25+0.40)
    7820.7: perl grep ^how to                     0.48(2.76+0.40)
    7820.9: basic grep [how] to                   0.42(2.20+0.42)
    7820.10: extended grep [how] to               0.41(2.12+0.50)
    7820.11: perl grep [how] to                   0.48(2.59+0.42)
    7820.13: basic grep \(e.t[^ ]*\|v.ry\) rare   0.55(3.26+0.44)
    7820.14: extended grep (e.t[^ ]*|v.ry) rare   0.54(3.34+0.38)
    7820.15: perl grep (e.t[^ ]*|v.ry) rare       0.88(5.82+0.35)
    7820.17: basic grep m\(ú\|u\)lt.b\(æ\|y\)te   0.29(1.26+0.48)
    7820.18: extended grep m(ú|u)lt.b(æ|y)te      0.29(1.36+0.39)
    7820.19: perl grep m(ú|u)lt.b(æ|y)te          0.32(1.61+0.43)

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
---
 t/perf/p7820-grep-engines.sh | 35 +++++++++++++++++++++++++++++++++++
 1 file changed, 35 insertions(+)
 create mode 100755 t/perf/p7820-grep-engines.sh

diff --git a/t/perf/p7820-grep-engines.sh b/t/perf/p7820-grep-engines.sh
new file mode 100755
index 0000000000..eb439900f3
--- /dev/null
+++ b/t/perf/p7820-grep-engines.sh
@@ -0,0 +1,35 @@
+#!/bin/sh
+
+test_description="Comparison of git-grep's regex engines"
+
+. ./perf-lib.sh
+
+test_perf_large_repo
+test_checkout_worktree
+
+for pattern in \
+	'how.to' \
+	'^how to' \
+	'[how] to' \
+	'\(e.t[^ ]*\|v.ry\) rare' \
+	'm\(ú\|u\)lt.b\(æ\|y\)te'
+do
+	for engine in basic extended perl
+	do
+		if test $engine != "basic"
+		then
+			# Poor man's basic -> extended converter.
+			pattern=$(echo $pattern | sed 's/\\//g')
+		fi
+		test_perf "$engine grep $pattern" "
+			git -c grep.patternType=$engine grep -- '$pattern' >'out.$engine' || :
+		"
+	done
+
+	test_expect_success "assert that all engines found the same for $pattern" "
+		test_cmp 'out.basic' 'out.extended' &&
+		test_cmp 'out.basic' 'out.perl'
+	"
+done
+
+test_done
-- 
2.11.0


^ permalink raw reply related	[flat|nested] 42+ messages in thread

* [PATCH v2 17/29] perf: add a performance comparison of fixed-string grep
  2017-05-13 23:14 [PATCH v2 00/29] Easy to review grep & pre-PCRE changes Ævar Arnfjörð Bjarmason
                   ` (15 preceding siblings ...)
  2017-05-13 23:14 ` [PATCH v2 16/29] perf: add a performance comparison test of grep -G, -E and -P Ævar Arnfjörð Bjarmason
@ 2017-05-13 23:14 ` Ævar Arnfjörð Bjarmason
  2017-05-13 23:14 ` [PATCH v2 18/29] grep: catch a missing enum in switch statement Ævar Arnfjörð Bjarmason
                   ` (12 subsequent siblings)
  29 siblings, 0 replies; 42+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2017-05-13 23:14 UTC (permalink / raw)
  To: git
  Cc: Junio C Hamano, Jeff King, Jeffrey Walton,
	Michał Kiedrowicz, J Smith, Victor Leschuk,
	Nguyễn Thái Ngọc Duy, Fredrik Kuivinen,
	Brandon Williams, Stefan Beller, Johannes Schindelin,
	Ævar Arnfjörð Bjarmason

Add a performance comparison test which compares both case-sensitive &
case-insensitive fixed-string grep, as well as non-ASCII
case-sensitive & case-insensitive grep.

Currently only the "-i æ" performance test doesn't go through the same
kwset.[ch] codepath, see the "Even when -F..." comment in grep.c.

    $ GIT_PERF_REPEAT_COUNT=10 GIT_PERF_LARGE_REPO=~/g/linux ./run p7821-grep-engines-fixed.sh
    ----------------------------------------------
    7821.1: fixed grep int         1.75(1.39+0.34)
    7821.2: basic grep int         1.68(1.38+0.28)
    7821.3: extended grep int      1.75(1.41+0.29)
    7821.4: perl grep int          1.73(1.40+0.30)
    7821.6: fixed grep -i int      1.94(1.54+0.35)
    7821.7: basic grep -i int      1.92(1.57+0.32)
    7821.8: extended grep -i int   1.90(1.54+0.30)
    7821.9: perl grep -i int       1.91(1.53+0.36)
    7821.11: fixed grep æ          1.35(1.14+0.18)
    7821.12: basic grep æ          1.34(1.16+0.16)
    7821.13: extended grep æ       1.33(1.15+0.17)
    7821.14: perl grep æ           1.35(1.12+0.20)
    7821.16: fixed grep -i æ       0.72(0.49+0.22)
    7821.17: basic grep -i æ       0.74(0.49+0.21)
    7821.18: extended grep -i æ    0.72(0.48+0.22)
    7821.19: perl grep -i æ        0.71(0.44+0.23)

I'm planning to make that not be the case, this performance test gives
a baseline for comparing performance before & after any such change.

See commit ("perf: add a performance comparison test of grep -G, -E
and -P", 2017-04-19) for details on the machine the above test run was
executed on.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
---
 t/perf/p7821-grep-engines-fixed.sh | 26 ++++++++++++++++++++++++++
 1 file changed, 26 insertions(+)
 create mode 100755 t/perf/p7821-grep-engines-fixed.sh

diff --git a/t/perf/p7821-grep-engines-fixed.sh b/t/perf/p7821-grep-engines-fixed.sh
new file mode 100755
index 0000000000..d771cccfdf
--- /dev/null
+++ b/t/perf/p7821-grep-engines-fixed.sh
@@ -0,0 +1,26 @@
+#!/bin/sh
+
+test_description="Comparison of fixed string grep under git-grep's regex engines"
+
+. ./perf-lib.sh
+
+test_perf_large_repo
+test_checkout_worktree
+
+for args in 'int' '-i int' 'æ' '-i æ'
+do
+	for engine in fixed basic extended perl
+	do
+		test_perf "$engine grep $args" "
+			git -c grep.patternType=$engine grep $args >'out.$engine.$args' || :
+		"
+	done
+
+	test_expect_success "assert that all engines found the same for $args" "
+		test_cmp 'out.fixed.$args' 'out.basic.$args' &&
+		test_cmp 'out.fixed.$args' 'out.extended.$args' &&
+		test_cmp 'out.fixed.$args' 'out.perl.$args'
+	"
+done
+
+test_done
-- 
2.11.0


^ permalink raw reply related	[flat|nested] 42+ messages in thread

* [PATCH v2 18/29] grep: catch a missing enum in switch statement
  2017-05-13 23:14 [PATCH v2 00/29] Easy to review grep & pre-PCRE changes Ævar Arnfjörð Bjarmason
                   ` (16 preceding siblings ...)
  2017-05-13 23:14 ` [PATCH v2 17/29] perf: add a performance comparison of fixed-string grep Ævar Arnfjörð Bjarmason
@ 2017-05-13 23:14 ` Ævar Arnfjörð Bjarmason
  2017-05-15  5:50   ` Junio C Hamano
  2017-05-13 23:14 ` [PATCH v2 19/29] grep: remove redundant regflags assignment under PCRE Ævar Arnfjörð Bjarmason
                   ` (11 subsequent siblings)
  29 siblings, 1 reply; 42+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2017-05-13 23:14 UTC (permalink / raw)
  To: git
  Cc: Junio C Hamano, Jeff King, Jeffrey Walton,
	Michał Kiedrowicz, J Smith, Victor Leschuk,
	Nguyễn Thái Ngọc Duy, Fredrik Kuivinen,
	Brandon Williams, Stefan Beller, Johannes Schindelin,
	Ævar Arnfjörð Bjarmason

Add a die(...) to a default case for the switch statement selecting
between grep pattern types under --recurse-submodules.

Normally this would be caught by -Wswitch, but the grep_pattern_type
type is converted to int by going through parse_options(). Changing
the argument type passed to compile_submodule_options() won't work,
the value will just get coerced. The -Wswitch-default warning will
warn about it, but that produces a lot of noise across the codebase,
this potential issue would be drowned in that noise.

Thus catching this at runtime is the least worst option. This won't
ever trigger in practice, but if a new pattern type were to be added
this catches an otherwise silent bug during development.

See commit 0281e487fd ("grep: optionally recurse into submodules",
2016-12-16) for the initial addition of this code.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
---
 builtin/grep.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/builtin/grep.c b/builtin/grep.c
index 3ffb5b4e81..a191e2976b 100644
--- a/builtin/grep.c
+++ b/builtin/grep.c
@@ -495,6 +495,8 @@ static void compile_submodule_options(const struct grep_opt *opt,
 		break;
 	case GREP_PATTERN_TYPE_UNSPECIFIED:
 		break;
+	default:
+		die("BUG: Added a new grep pattern type without updating switch statement");
 	}
 
 	for (pattern = opt->pattern_list; pattern != NULL;
-- 
2.11.0


^ permalink raw reply related	[flat|nested] 42+ messages in thread

* [PATCH v2 19/29] grep: remove redundant regflags assignment under PCRE
  2017-05-13 23:14 [PATCH v2 00/29] Easy to review grep & pre-PCRE changes Ævar Arnfjörð Bjarmason
                   ` (17 preceding siblings ...)
  2017-05-13 23:14 ` [PATCH v2 18/29] grep: catch a missing enum in switch statement Ævar Arnfjörð Bjarmason
@ 2017-05-13 23:14 ` Ævar Arnfjörð Bjarmason
  2017-05-13 23:15 ` [PATCH v2 20/29] grep: remove redundant `regflags &= ~REG_EXTENDED` assignments Ævar Arnfjörð Bjarmason
                   ` (10 subsequent siblings)
  29 siblings, 0 replies; 42+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2017-05-13 23:14 UTC (permalink / raw)
  To: git
  Cc: Junio C Hamano, Jeff King, Jeffrey Walton,
	Michał Kiedrowicz, J Smith, Victor Leschuk,
	Nguyễn Thái Ngọc Duy, Fredrik Kuivinen,
	Brandon Williams, Stefan Beller, Johannes Schindelin,
	Ævar Arnfjörð Bjarmason

Remove a redundant assignment to the "regflags" variable. This
variable is only used for POSIX regular expression matching, not when
the PCRE library is used.

This redundant assignment was added as a result of copy/paste
programming in commit 84befcd0a4 ("grep: add a grep.patternType
configuration setting", 2012-08-03). That commit modified already
working code in commit cca2c172e0 ("git-grep: do not die upon -F/-P
when grep.extendedRegexp is set.", 2011-05-09) which didn't assign to
regflags when under PCRE.

Revert back to that behavior, more to reduce "wait this is used under
PCRE how?" confusion when reading the code, than to to save ourselves
trivial CPU cycles by removing one assignment.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
---
 grep.c | 1 -
 1 file changed, 1 deletion(-)

diff --git a/grep.c b/grep.c
index 47cee45067..59ae7809f2 100644
--- a/grep.c
+++ b/grep.c
@@ -197,7 +197,6 @@ static void grep_set_pattern_type_option(enum grep_pattern_type pattern_type, st
 	case GREP_PATTERN_TYPE_PCRE:
 		opt->fixed = 0;
 		opt->pcre = 1;
-		opt->regflags &= ~REG_EXTENDED;
 		break;
 	}
 }
-- 
2.11.0


^ permalink raw reply related	[flat|nested] 42+ messages in thread

* [PATCH v2 20/29] grep: remove redundant `regflags &= ~REG_EXTENDED` assignments
  2017-05-13 23:14 [PATCH v2 00/29] Easy to review grep & pre-PCRE changes Ævar Arnfjörð Bjarmason
                   ` (18 preceding siblings ...)
  2017-05-13 23:14 ` [PATCH v2 19/29] grep: remove redundant regflags assignment under PCRE Ævar Arnfjörð Bjarmason
@ 2017-05-13 23:15 ` Ævar Arnfjörð Bjarmason
  2017-05-15  6:14   ` Junio C Hamano
  2017-05-13 23:15 ` [PATCH v2 21/29] grep: factor test for \0 in grep patterns into a function Ævar Arnfjörð Bjarmason
                   ` (9 subsequent siblings)
  29 siblings, 1 reply; 42+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2017-05-13 23:15 UTC (permalink / raw)
  To: git
  Cc: Junio C Hamano, Jeff King, Jeffrey Walton,
	Michał Kiedrowicz, J Smith, Victor Leschuk,
	Nguyễn Thái Ngọc Duy, Fredrik Kuivinen,
	Brandon Williams, Stefan Beller, Johannes Schindelin,
	Ævar Arnfjörð Bjarmason

Remove redundant assignments to the "regflags" variable. There are no
code paths that have previously set the regflags to anything, and
certainly not to `|= REG_EXTENDED`.

This code gave the impression that it had to reset its environment,
but it doesn't. This dates back to the initial introduction of
git-grep in commit 5010cb5fcc ("built-in "git grep"", 2006-04-30).

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
---
 grep.c | 5 +----
 1 file changed, 1 insertion(+), 4 deletions(-)

diff --git a/grep.c b/grep.c
index 59ae7809f2..bf6c2494fd 100644
--- a/grep.c
+++ b/grep.c
@@ -179,7 +179,6 @@ static void grep_set_pattern_type_option(enum grep_pattern_type pattern_type, st
 	case GREP_PATTERN_TYPE_BRE:
 		opt->fixed = 0;
 		opt->pcre = 0;
-		opt->regflags &= ~REG_EXTENDED;
 		break;
 
 	case GREP_PATTERN_TYPE_ERE:
@@ -191,7 +190,6 @@ static void grep_set_pattern_type_option(enum grep_pattern_type pattern_type, st
 	case GREP_PATTERN_TYPE_FIXED:
 		opt->fixed = 1;
 		opt->pcre = 0;
-		opt->regflags &= ~REG_EXTENDED;
 		break;
 
 	case GREP_PATTERN_TYPE_PCRE:
@@ -414,10 +412,9 @@ static void compile_fixed_regexp(struct grep_pat *p, struct grep_opt *opt)
 {
 	struct strbuf sb = STRBUF_INIT;
 	int err;
-	int regflags;
+	int regflags = opt->regflags;
 
 	basic_regex_quote_buf(&sb, p->pattern);
-	regflags = opt->regflags & ~REG_EXTENDED;
 	if (opt->ignore_case)
 		regflags |= REG_ICASE;
 	err = regcomp(&p->regexp, sb.buf, regflags);
-- 
2.11.0


^ permalink raw reply related	[flat|nested] 42+ messages in thread

* [PATCH v2 21/29] grep: factor test for \0 in grep patterns into a function
  2017-05-13 23:14 [PATCH v2 00/29] Easy to review grep & pre-PCRE changes Ævar Arnfjörð Bjarmason
                   ` (19 preceding siblings ...)
  2017-05-13 23:15 ` [PATCH v2 20/29] grep: remove redundant `regflags &= ~REG_EXTENDED` assignments Ævar Arnfjörð Bjarmason
@ 2017-05-13 23:15 ` Ævar Arnfjörð Bjarmason
  2017-05-15  6:24   ` Junio C Hamano
  2017-05-13 23:15 ` [PATCH v2 22/29] grep: change the internal PCRE macro names to be PCRE1 Ævar Arnfjörð Bjarmason
                   ` (8 subsequent siblings)
  29 siblings, 1 reply; 42+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2017-05-13 23:15 UTC (permalink / raw)
  To: git
  Cc: Junio C Hamano, Jeff King, Jeffrey Walton,
	Michał Kiedrowicz, J Smith, Victor Leschuk,
	Nguyễn Thái Ngọc Duy, Fredrik Kuivinen,
	Brandon Williams, Stefan Beller, Johannes Schindelin,
	Ævar Arnfjörð Bjarmason

Factor the test for \0 in grep patterns into a function. Since commit
9eceddeec6 ("Use kwset in grep", 2011-08-21) any pattern containing a
\0 is considered fixed as regcomp() can't handle it.

This limitation was never documented, and other some regular
expression engines are capable of compiling a pattern containing a
\0. Factoring this out makes a subsequent change which does that
smaller.

See a previous commit in this series ("grep: add tests to fix blind
spots with \0 patterns", 2017-04-21) for further details & rationale.

While I'm at it make the comment conform to the style guide, i.e. add
an opening "/*\n".

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
---
 grep.c | 20 +++++++++++++-------
 1 file changed, 13 insertions(+), 7 deletions(-)

diff --git a/grep.c b/grep.c
index bf6c2494fd..79eb681c6e 100644
--- a/grep.c
+++ b/grep.c
@@ -321,6 +321,18 @@ static NORETURN void compile_regexp_failed(const struct grep_pat *p,
 	die("%s'%s': %s", where, p->pattern, error);
 }
 
+static int has_null(const char *s, size_t len)
+{
+	/*
+	 * regcomp cannot accept patterns with NULs so when using it
+	 * we consider any pattern containing a NUL fixed.
+	 */
+	if (memchr(s, 0, len))
+		return 1;
+
+	return 0;
+}
+
 #ifdef USE_LIBPCRE
 static void compile_pcre_regexp(struct grep_pat *p, const struct grep_opt *opt)
 {
@@ -394,12 +406,6 @@ static int is_fixed(const char *s, size_t len)
 {
 	size_t i;
 
-	/* regcomp cannot accept patterns with NULs so we
-	 * consider any pattern containing a NUL fixed.
-	 */
-	if (memchr(s, 0, len))
-		return 1;
-
 	for (i = 0; i < len; i++) {
 		if (is_regex_special(s[i]))
 			return 0;
@@ -451,7 +457,7 @@ static void compile_regexp(struct grep_pat *p, struct grep_opt *opt)
 	 * simple string match using kws.  p->fixed tells us if we
 	 * want to use kws.
 	 */
-	if (opt->fixed || is_fixed(p->pattern, p->patternlen))
+	if (opt->fixed || has_null(p->pattern, p->patternlen) || is_fixed(p->pattern, p->patternlen))
 		p->fixed = !icase || ascii_only;
 	else
 		p->fixed = 0;
-- 
2.11.0


^ permalink raw reply related	[flat|nested] 42+ messages in thread

* [PATCH v2 22/29] grep: change the internal PCRE macro names to be PCRE1
  2017-05-13 23:14 [PATCH v2 00/29] Easy to review grep & pre-PCRE changes Ævar Arnfjörð Bjarmason
                   ` (20 preceding siblings ...)
  2017-05-13 23:15 ` [PATCH v2 21/29] grep: factor test for \0 in grep patterns into a function Ævar Arnfjörð Bjarmason
@ 2017-05-13 23:15 ` Ævar Arnfjörð Bjarmason
  2017-05-13 23:15 ` [PATCH v2 23/29] grep: change internal *pcre* variable & function names to be *pcre1* Ævar Arnfjörð Bjarmason
                   ` (7 subsequent siblings)
  29 siblings, 0 replies; 42+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2017-05-13 23:15 UTC (permalink / raw)
  To: git
  Cc: Junio C Hamano, Jeff King, Jeffrey Walton,
	Michał Kiedrowicz, J Smith, Victor Leschuk,
	Nguyễn Thái Ngọc Duy, Fredrik Kuivinen,
	Brandon Williams, Stefan Beller, Johannes Schindelin,
	Ævar Arnfjörð Bjarmason

Change the internal USE_LIBPCRE define, & build options flag to use a
naming convention ending in PCRE1, without changing the long-standing
USE_LIBPCRE Makefile flag which enables this code.

This is for preparation for libpcre2 support where having things like
USE_LIBPCRE and USE_LIBPCRE2 in any more places than we absolutely
need to for backwards compatibility with old Makefile arguments would
be confusing.

In some ways it would be better to change everything that now uses
USE_LIBPCRE to use USE_LIBPCRE1, and to make specifying
USE_LIBPCRE (or --with-pcre) an error. This would impose a one-time
burden on packagers of git to s/USE_LIBPCRE/USE_LIBPCRE1/ in their
build scripts.

However I'd like to leave the door open to making
USE_LIBPCRE=YesPlease eventually mean USE_LIBPCRE2=YesPlease,
i.e. once PCRE v2 is ubiquitous enough that it makes sense to make it
the default.

This code and the USE_LIBPCRE Makefile argument was added in commit
63e7e9d8b6 ("git-grep: Learn PCRE", 2011-05-09). At the time there was
no indication that the PCRE project would release an entirely new &
incompatible API around 3 years later.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
---
 Makefile      | 4 ++--
 grep.c        | 6 +++---
 grep.h        | 2 +-
 t/test-lib.sh | 2 +-
 4 files changed, 7 insertions(+), 7 deletions(-)

diff --git a/Makefile b/Makefile
index d1587452f1..374fbc7e58 100644
--- a/Makefile
+++ b/Makefile
@@ -1088,7 +1088,7 @@ ifdef NO_LIBGEN_H
 endif
 
 ifdef USE_LIBPCRE
-	BASIC_CFLAGS += -DUSE_LIBPCRE
+	BASIC_CFLAGS += -DUSE_LIBPCRE1
 	ifdef LIBPCREDIR
 		BASIC_CFLAGS += -I$(LIBPCREDIR)/include
 		EXTLIBS += -L$(LIBPCREDIR)/$(lib) $(CC_LD_DYNPATH)$(LIBPCREDIR)/$(lib)
@@ -2240,7 +2240,7 @@ GIT-BUILD-OPTIONS: FORCE
 	@echo TAR=\''$(subst ','\'',$(subst ','\'',$(TAR)))'\' >>$@+
 	@echo NO_CURL=\''$(subst ','\'',$(subst ','\'',$(NO_CURL)))'\' >>$@+
 	@echo NO_EXPAT=\''$(subst ','\'',$(subst ','\'',$(NO_EXPAT)))'\' >>$@+
-	@echo USE_LIBPCRE=\''$(subst ','\'',$(subst ','\'',$(USE_LIBPCRE)))'\' >>$@+
+	@echo USE_LIBPCRE1=\''$(subst ','\'',$(subst ','\'',$(USE_LIBPCRE)))'\' >>$@+
 	@echo NO_PERL=\''$(subst ','\'',$(subst ','\'',$(NO_PERL)))'\' >>$@+
 	@echo NO_PYTHON=\''$(subst ','\'',$(subst ','\'',$(NO_PYTHON)))'\' >>$@+
 	@echo NO_UNIX_SOCKETS=\''$(subst ','\'',$(subst ','\'',$(NO_UNIX_SOCKETS)))'\' >>$@+
diff --git a/grep.c b/grep.c
index 79eb681c6e..854f2404be 100644
--- a/grep.c
+++ b/grep.c
@@ -333,7 +333,7 @@ static int has_null(const char *s, size_t len)
 	return 0;
 }
 
-#ifdef USE_LIBPCRE
+#ifdef USE_LIBPCRE1
 static void compile_pcre_regexp(struct grep_pat *p, const struct grep_opt *opt)
 {
 	const char *error;
@@ -385,7 +385,7 @@ static void free_pcre_regexp(struct grep_pat *p)
 	pcre_free(p->pcre_extra_info);
 	pcre_free((void *)p->pcre_tables);
 }
-#else /* !USE_LIBPCRE */
+#else /* !USE_LIBPCRE1 */
 static void compile_pcre_regexp(struct grep_pat *p, const struct grep_opt *opt)
 {
 	die("cannot use Perl-compatible regexes when not compiled with USE_LIBPCRE");
@@ -400,7 +400,7 @@ static int pcrematch(struct grep_pat *p, const char *line, const char *eol,
 static void free_pcre_regexp(struct grep_pat *p)
 {
 }
-#endif /* !USE_LIBPCRE */
+#endif /* !USE_LIBPCRE1 */
 
 static int is_fixed(const char *s, size_t len)
 {
diff --git a/grep.h b/grep.h
index 267534ca24..073b0e4c92 100644
--- a/grep.h
+++ b/grep.h
@@ -1,7 +1,7 @@
 #ifndef GREP_H
 #define GREP_H
 #include "color.h"
-#ifdef USE_LIBPCRE
+#ifdef USE_LIBPCRE1
 #include <pcre.h>
 #else
 typedef int pcre;
diff --git a/t/test-lib.sh b/t/test-lib.sh
index 6abf1d1918..e5cfbcc36b 100644
--- a/t/test-lib.sh
+++ b/t/test-lib.sh
@@ -1010,7 +1010,7 @@ esac
 ( COLUMNS=1 && test $COLUMNS = 1 ) && test_set_prereq COLUMNS_CAN_BE_1
 test -z "$NO_PERL" && test_set_prereq PERL
 test -z "$NO_PYTHON" && test_set_prereq PYTHON
-test -n "$USE_LIBPCRE" && test_set_prereq PCRE
+test -n "$USE_LIBPCRE1" && test_set_prereq PCRE
 test -z "$NO_GETTEXT" && test_set_prereq GETTEXT
 
 # Can we rely on git's output in the C locale?
-- 
2.11.0


^ permalink raw reply related	[flat|nested] 42+ messages in thread

* [PATCH v2 23/29] grep: change internal *pcre* variable & function names to be *pcre1*
  2017-05-13 23:14 [PATCH v2 00/29] Easy to review grep & pre-PCRE changes Ævar Arnfjörð Bjarmason
                   ` (21 preceding siblings ...)
  2017-05-13 23:15 ` [PATCH v2 22/29] grep: change the internal PCRE macro names to be PCRE1 Ævar Arnfjörð Bjarmason
@ 2017-05-13 23:15 ` Ævar Arnfjörð Bjarmason
  2017-05-13 23:15 ` [PATCH v2 24/29] grep: move is_fixed() earlier to avoid forward declaration Ævar Arnfjörð Bjarmason
                   ` (6 subsequent siblings)
  29 siblings, 0 replies; 42+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2017-05-13 23:15 UTC (permalink / raw)
  To: git
  Cc: Junio C Hamano, Jeff King, Jeffrey Walton,
	Michał Kiedrowicz, J Smith, Victor Leschuk,
	Nguyễn Thái Ngọc Duy, Fredrik Kuivinen,
	Brandon Williams, Stefan Beller, Johannes Schindelin,
	Ævar Arnfjörð Bjarmason

Change the internal PCRE variable & function names to have a "1"
suffix. This is for preparation for libpcre2 support, where having
non-versioned names would be confusing.

An earlier change in this series ("grep: change the internal PCRE
macro names to be PCRE1", 2017-04-07) elaborates on the motivations
behind this change.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
---
 grep.c | 52 ++++++++++++++++++++++++++--------------------------
 grep.h |  8 ++++----
 2 files changed, 30 insertions(+), 30 deletions(-)

diff --git a/grep.c b/grep.c
index 854f2404be..07512346b1 100644
--- a/grep.c
+++ b/grep.c
@@ -178,23 +178,23 @@ static void grep_set_pattern_type_option(enum grep_pattern_type pattern_type, st
 
 	case GREP_PATTERN_TYPE_BRE:
 		opt->fixed = 0;
-		opt->pcre = 0;
+		opt->pcre1 = 0;
 		break;
 
 	case GREP_PATTERN_TYPE_ERE:
 		opt->fixed = 0;
-		opt->pcre = 0;
+		opt->pcre1 = 0;
 		opt->regflags |= REG_EXTENDED;
 		break;
 
 	case GREP_PATTERN_TYPE_FIXED:
 		opt->fixed = 1;
-		opt->pcre = 0;
+		opt->pcre1 = 0;
 		break;
 
 	case GREP_PATTERN_TYPE_PCRE:
 		opt->fixed = 0;
-		opt->pcre = 1;
+		opt->pcre1 = 1;
 		break;
 	}
 }
@@ -334,7 +334,7 @@ static int has_null(const char *s, size_t len)
 }
 
 #ifdef USE_LIBPCRE1
-static void compile_pcre_regexp(struct grep_pat *p, const struct grep_opt *opt)
+static void compile_pcre1_regexp(struct grep_pat *p, const struct grep_opt *opt)
 {
 	const char *error;
 	int erroffset;
@@ -342,23 +342,23 @@ static void compile_pcre_regexp(struct grep_pat *p, const struct grep_opt *opt)
 
 	if (opt->ignore_case) {
 		if (has_non_ascii(p->pattern))
-			p->pcre_tables = pcre_maketables();
+			p->pcre1_tables = pcre_maketables();
 		options |= PCRE_CASELESS;
 	}
 	if (is_utf8_locale() && has_non_ascii(p->pattern))
 		options |= PCRE_UTF8;
 
-	p->pcre_regexp = pcre_compile(p->pattern, options, &error, &erroffset,
-				      p->pcre_tables);
-	if (!p->pcre_regexp)
+	p->pcre1_regexp = pcre_compile(p->pattern, options, &error, &erroffset,
+				      p->pcre1_tables);
+	if (!p->pcre1_regexp)
 		compile_regexp_failed(p, error);
 
-	p->pcre_extra_info = pcre_study(p->pcre_regexp, 0, &error);
-	if (!p->pcre_extra_info && error)
+	p->pcre1_extra_info = pcre_study(p->pcre1_regexp, 0, &error);
+	if (!p->pcre1_extra_info && error)
 		die("%s", error);
 }
 
-static int pcrematch(struct grep_pat *p, const char *line, const char *eol,
+static int pcre1match(struct grep_pat *p, const char *line, const char *eol,
 		regmatch_t *match, int eflags)
 {
 	int ovector[30], ret, flags = 0;
@@ -366,7 +366,7 @@ static int pcrematch(struct grep_pat *p, const char *line, const char *eol,
 	if (eflags & REG_NOTBOL)
 		flags |= PCRE_NOTBOL;
 
-	ret = pcre_exec(p->pcre_regexp, p->pcre_extra_info, line, eol - line,
+	ret = pcre_exec(p->pcre1_regexp, p->pcre1_extra_info, line, eol - line,
 			0, flags, ovector, ARRAY_SIZE(ovector));
 	if (ret < 0 && ret != PCRE_ERROR_NOMATCH)
 		die("pcre_exec failed with error code %d", ret);
@@ -379,25 +379,25 @@ static int pcrematch(struct grep_pat *p, const char *line, const char *eol,
 	return ret;
 }
 
-static void free_pcre_regexp(struct grep_pat *p)
+static void free_pcre1_regexp(struct grep_pat *p)
 {
-	pcre_free(p->pcre_regexp);
-	pcre_free(p->pcre_extra_info);
-	pcre_free((void *)p->pcre_tables);
+	pcre_free(p->pcre1_regexp);
+	pcre_free(p->pcre1_extra_info);
+	pcre_free((void *)p->pcre1_tables);
 }
 #else /* !USE_LIBPCRE1 */
-static void compile_pcre_regexp(struct grep_pat *p, const struct grep_opt *opt)
+static void compile_pcre1_regexp(struct grep_pat *p, const struct grep_opt *opt)
 {
 	die("cannot use Perl-compatible regexes when not compiled with USE_LIBPCRE");
 }
 
-static int pcrematch(struct grep_pat *p, const char *line, const char *eol,
+static int pcre1match(struct grep_pat *p, const char *line, const char *eol,
 		regmatch_t *match, int eflags)
 {
 	return 1;
 }
 
-static void free_pcre_regexp(struct grep_pat *p)
+static void free_pcre1_regexp(struct grep_pat *p)
 {
 }
 #endif /* !USE_LIBPCRE1 */
@@ -477,8 +477,8 @@ static void compile_regexp(struct grep_pat *p, struct grep_opt *opt)
 		return;
 	}
 
-	if (opt->pcre) {
-		compile_pcre_regexp(p, opt);
+	if (opt->pcre1) {
+		compile_pcre1_regexp(p, opt);
 		return;
 	}
 
@@ -834,8 +834,8 @@ void free_grep_patterns(struct grep_opt *opt)
 		case GREP_PATTERN_BODY:
 			if (p->kws)
 				kwsfree(p->kws);
-			else if (p->pcre_regexp)
-				free_pcre_regexp(p);
+			else if (p->pcre1_regexp)
+				free_pcre1_regexp(p);
 			else
 				regfree(&p->regexp);
 			free(p->pattern);
@@ -914,8 +914,8 @@ static int patmatch(struct grep_pat *p, char *line, char *eol,
 
 	if (p->fixed)
 		hit = !fixmatch(p, line, eol, match);
-	else if (p->pcre_regexp)
-		hit = !pcrematch(p, line, eol, match, eflags);
+	else if (p->pcre1_regexp)
+		hit = !pcre1match(p, line, eol, match, eflags);
 	else
 		hit = !regexec_buf(&p->regexp, line, eol - line, 1, match,
 				   eflags);
diff --git a/grep.h b/grep.h
index 073b0e4c92..38ac82b638 100644
--- a/grep.h
+++ b/grep.h
@@ -46,9 +46,9 @@ struct grep_pat {
 	size_t patternlen;
 	enum grep_header_field field;
 	regex_t regexp;
-	pcre *pcre_regexp;
-	pcre_extra *pcre_extra_info;
-	const unsigned char *pcre_tables;
+	pcre *pcre1_regexp;
+	pcre_extra *pcre1_extra_info;
+	const unsigned char *pcre1_tables;
 	kwset_t kws;
 	unsigned fixed:1;
 	unsigned ignore_case:1;
@@ -111,7 +111,7 @@ struct grep_opt {
 	int allow_textconv;
 	int extended;
 	int use_reflog_filter;
-	int pcre;
+	int pcre1;
 	int relative;
 	int pathname;
 	int null_following_name;
-- 
2.11.0


^ permalink raw reply related	[flat|nested] 42+ messages in thread

* [PATCH v2 24/29] grep: move is_fixed() earlier to avoid forward declaration
  2017-05-13 23:14 [PATCH v2 00/29] Easy to review grep & pre-PCRE changes Ævar Arnfjörð Bjarmason
                   ` (22 preceding siblings ...)
  2017-05-13 23:15 ` [PATCH v2 23/29] grep: change internal *pcre* variable & function names to be *pcre1* Ævar Arnfjörð Bjarmason
@ 2017-05-13 23:15 ` Ævar Arnfjörð Bjarmason
  2017-05-13 23:15 ` [PATCH v2 25/29] test-lib: add a PTHREADS prerequisite Ævar Arnfjörð Bjarmason
                   ` (5 subsequent siblings)
  29 siblings, 0 replies; 42+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2017-05-13 23:15 UTC (permalink / raw)
  To: git
  Cc: Junio C Hamano, Jeff King, Jeffrey Walton,
	Michał Kiedrowicz, J Smith, Victor Leschuk,
	Nguyễn Thái Ngọc Duy, Fredrik Kuivinen,
	Brandon Williams, Stefan Beller, Johannes Schindelin,
	Ævar Arnfjörð Bjarmason

Move the is_fixed() function which are currently only used in
compile_regexp() earlier so it can be used in the PCRE family of
functions in a later change.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
---
 grep.c | 24 ++++++++++++------------
 1 file changed, 12 insertions(+), 12 deletions(-)

diff --git a/grep.c b/grep.c
index 07512346b1..1157529115 100644
--- a/grep.c
+++ b/grep.c
@@ -321,6 +321,18 @@ static NORETURN void compile_regexp_failed(const struct grep_pat *p,
 	die("%s'%s': %s", where, p->pattern, error);
 }
 
+static int is_fixed(const char *s, size_t len)
+{
+	size_t i;
+
+	for (i = 0; i < len; i++) {
+		if (is_regex_special(s[i]))
+			return 0;
+	}
+
+	return 1;
+}
+
 static int has_null(const char *s, size_t len)
 {
 	/*
@@ -402,18 +414,6 @@ static void free_pcre1_regexp(struct grep_pat *p)
 }
 #endif /* !USE_LIBPCRE1 */
 
-static int is_fixed(const char *s, size_t len)
-{
-	size_t i;
-
-	for (i = 0; i < len; i++) {
-		if (is_regex_special(s[i]))
-			return 0;
-	}
-
-	return 1;
-}
-
 static void compile_fixed_regexp(struct grep_pat *p, struct grep_opt *opt)
 {
 	struct strbuf sb = STRBUF_INIT;
-- 
2.11.0


^ permalink raw reply related	[flat|nested] 42+ messages in thread

* [PATCH v2 25/29] test-lib: add a PTHREADS prerequisite
  2017-05-13 23:14 [PATCH v2 00/29] Easy to review grep & pre-PCRE changes Ævar Arnfjörð Bjarmason
                   ` (23 preceding siblings ...)
  2017-05-13 23:15 ` [PATCH v2 24/29] grep: move is_fixed() earlier to avoid forward declaration Ævar Arnfjörð Bjarmason
@ 2017-05-13 23:15 ` Ævar Arnfjörð Bjarmason
  2017-05-13 23:15 ` [PATCH v2 26/29] pack-objects & index-pack: add test for --threads warning Ævar Arnfjörð Bjarmason
                   ` (4 subsequent siblings)
  29 siblings, 0 replies; 42+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2017-05-13 23:15 UTC (permalink / raw)
  To: git
  Cc: Junio C Hamano, Jeff King, Jeffrey Walton,
	Michał Kiedrowicz, J Smith, Victor Leschuk,
	Nguyễn Thái Ngọc Duy, Fredrik Kuivinen,
	Brandon Williams, Stefan Beller, Johannes Schindelin,
	Ævar Arnfjörð Bjarmason

Add a PTHREADS prerequisite which is false when git is compiled with
NO_PTHREADS=YesPlease.

There's lots of custom code that runs when threading isn't available,
but before this prerequisite there was no way to test it.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
---
 Makefile      | 1 +
 t/README      | 4 ++++
 t/test-lib.sh | 1 +
 3 files changed, 6 insertions(+)

diff --git a/Makefile b/Makefile
index 374fbc7e58..a79274e5e6 100644
--- a/Makefile
+++ b/Makefile
@@ -2242,6 +2242,7 @@ GIT-BUILD-OPTIONS: FORCE
 	@echo NO_EXPAT=\''$(subst ','\'',$(subst ','\'',$(NO_EXPAT)))'\' >>$@+
 	@echo USE_LIBPCRE1=\''$(subst ','\'',$(subst ','\'',$(USE_LIBPCRE)))'\' >>$@+
 	@echo NO_PERL=\''$(subst ','\'',$(subst ','\'',$(NO_PERL)))'\' >>$@+
+	@echo NO_PTHREADS=\''$(subst ','\'',$(subst ','\'',$(NO_PTHREADS)))'\' >>$@+
 	@echo NO_PYTHON=\''$(subst ','\'',$(subst ','\'',$(NO_PYTHON)))'\' >>$@+
 	@echo NO_UNIX_SOCKETS=\''$(subst ','\'',$(subst ','\'',$(NO_UNIX_SOCKETS)))'\' >>$@+
 	@echo PAGER_ENV=\''$(subst ','\'',$(subst ','\'',$(PAGER_ENV)))'\' >>$@+
diff --git a/t/README b/t/README
index a90cb62583..2f95860369 100644
--- a/t/README
+++ b/t/README
@@ -817,6 +817,10 @@ use these, and "test_set_prereq" for how to define your own.
    Test is run on a filesystem which converts decomposed utf-8 (nfd)
    to precomposed utf-8 (nfc).
 
+ - PTHREADS
+
+   Git wasn't compiled with NO_PTHREADS=YesPlease.
+
 Tips for Writing Tests
 ----------------------
 
diff --git a/t/test-lib.sh b/t/test-lib.sh
index e5cfbcc36b..ab92c0ebaa 100644
--- a/t/test-lib.sh
+++ b/t/test-lib.sh
@@ -1009,6 +1009,7 @@ esac
 
 ( COLUMNS=1 && test $COLUMNS = 1 ) && test_set_prereq COLUMNS_CAN_BE_1
 test -z "$NO_PERL" && test_set_prereq PERL
+test -z "$NO_PTHREADS" && test_set_prereq PTHREADS
 test -z "$NO_PYTHON" && test_set_prereq PYTHON
 test -n "$USE_LIBPCRE1" && test_set_prereq PCRE
 test -z "$NO_GETTEXT" && test_set_prereq GETTEXT
-- 
2.11.0


^ permalink raw reply related	[flat|nested] 42+ messages in thread

* [PATCH v2 26/29] pack-objects & index-pack: add test for --threads warning
  2017-05-13 23:14 [PATCH v2 00/29] Easy to review grep & pre-PCRE changes Ævar Arnfjörð Bjarmason
                   ` (24 preceding siblings ...)
  2017-05-13 23:15 ` [PATCH v2 25/29] test-lib: add a PTHREADS prerequisite Ævar Arnfjörð Bjarmason
@ 2017-05-13 23:15 ` Ævar Arnfjörð Bjarmason
  2017-05-13 23:15 ` [PATCH v2 27/29] pack-objects: fix buggy warning about threads Ævar Arnfjörð Bjarmason
                   ` (3 subsequent siblings)
  29 siblings, 0 replies; 42+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2017-05-13 23:15 UTC (permalink / raw)
  To: git
  Cc: Junio C Hamano, Jeff King, Jeffrey Walton,
	Michał Kiedrowicz, J Smith, Victor Leschuk,
	Nguyễn Thái Ngọc Duy, Fredrik Kuivinen,
	Brandon Williams, Stefan Beller, Johannes Schindelin,
	Ævar Arnfjörð Bjarmason

Add a test for the warning that's emitted when --threads or
pack.threads is provided under NO_PTHREADS=YesPlease. This uses the
new PTHREADS prerequisite.

The assertion for C_LOCALE_OUTPUT in the latter test is currently
redundant, since unlike index-pack the pack-objects warnings aren't
i18n'd. However they might be changed to be i18n'd in the future, and
there's no harm in future-proofing the test.

There's an existing bug in the implementation of pack-objects which
this test currently tests for as-is. Details about the bug & the fix
are included in a follow-up change.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
---
 t/t5300-pack-object.sh | 34 ++++++++++++++++++++++++++++++++++
 1 file changed, 34 insertions(+)

diff --git a/t/t5300-pack-object.sh b/t/t5300-pack-object.sh
index 43a672c345..1629fa80b0 100755
--- a/t/t5300-pack-object.sh
+++ b/t/t5300-pack-object.sh
@@ -421,6 +421,40 @@ test_expect_success 'index-pack <pack> works in non-repo' '
 	test_path_is_file foo.idx
 '
 
+test_expect_success !PTHREADS,C_LOCALE_OUTPUT 'index-pack --threads=N or pack.threads=N warns when no pthreads' '
+	test_must_fail git index-pack --threads=2 2>err &&
+	grep ^warning: err >warnings &&
+	test_line_count = 1 warnings &&
+	grep -F "no threads support, ignoring --threads=2" err &&
+	test_must_fail git -c pack.threads=2 index-pack 2>err &&
+	grep ^warning: err >warnings &&
+	test_line_count = 1 warnings &&
+	grep -F "no threads support, ignoring pack.threads" err &&
+	test_must_fail git -c pack.threads=2 index-pack --threads=4 2>err &&
+	grep ^warning: err >warnings &&
+	test_line_count = 2 warnings &&
+	grep -F "no threads support, ignoring --threads=4" err &&
+	grep -F "no threads support, ignoring pack.threads" err
+'
+
+test_expect_success !PTHREADS,C_LOCALE_OUTPUT 'pack-objects --threads=N or pack.threads=N warns when no pthreads' '
+	git pack-objects --threads=2 --stdout --all </dev/null >/dev/null 2>err &&
+	grep ^warning: err >warnings &&
+	test_line_count = 1 warnings &&
+	grep -F "no threads support, ignoring --threads" err &&
+	git -c pack.threads=2 pack-objects --stdout --all </dev/null >/dev/null 2>err &&
+	cat err &&
+	grep ^warning: err >warnings &&
+	test_line_count = 2 warnings &&
+	grep -F "no threads support, ignoring --threads" err &&
+	grep -F "no threads support, ignoring pack.threads" err &&
+	git -c pack.threads=2 pack-objects --threads=4 --stdout --all </dev/null >/dev/null 2>err &&
+	grep ^warning: err >warnings &&
+	test_line_count = 2 warnings &&
+	grep -F "no threads support, ignoring --threads" err &&
+	grep -F "no threads support, ignoring pack.threads" err
+'
+
 #
 # WARNING!
 #
-- 
2.11.0


^ permalink raw reply related	[flat|nested] 42+ messages in thread

* [PATCH v2 27/29] pack-objects: fix buggy warning about threads
  2017-05-13 23:14 [PATCH v2 00/29] Easy to review grep & pre-PCRE changes Ævar Arnfjörð Bjarmason
                   ` (25 preceding siblings ...)
  2017-05-13 23:15 ` [PATCH v2 26/29] pack-objects & index-pack: add test for --threads warning Ævar Arnfjörð Bjarmason
@ 2017-05-13 23:15 ` Ævar Arnfjörð Bjarmason
  2017-05-15  8:59   ` Junio C Hamano
  2017-05-13 23:15 ` [PATCH v2 28/29] grep: given --threads with NO_PTHREADS=YesPlease, warn Ævar Arnfjörð Bjarmason
                   ` (2 subsequent siblings)
  29 siblings, 1 reply; 42+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2017-05-13 23:15 UTC (permalink / raw)
  To: git
  Cc: Junio C Hamano, Jeff King, Jeffrey Walton,
	Michał Kiedrowicz, J Smith, Victor Leschuk,
	Nguyễn Thái Ngọc Duy, Fredrik Kuivinen,
	Brandon Williams, Stefan Beller, Johannes Schindelin,
	Ævar Arnfjörð Bjarmason

Fix a buggy warning about threads under NO_PTHREADS=YesPlease. Due to
re-using the delta_search_threads variable for both the state of the
"pack.threads" config & the --threads option, setting "pack.threads"
but not supplying --threads would trigger the warning for both
"pack.threads" & --threads.

Solve this bug by resetting the delta_search_threads variable in
git_pack_config(), it might then be set by --threads again and be
subsequently warned about, as the test I'm changing here asserts.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
---
 builtin/pack-objects.c | 4 +++-
 t/t5300-pack-object.sh | 3 +--
 2 files changed, 4 insertions(+), 3 deletions(-)

diff --git a/builtin/pack-objects.c b/builtin/pack-objects.c
index 0fe35d1b5a..f1baf05dfe 100644
--- a/builtin/pack-objects.c
+++ b/builtin/pack-objects.c
@@ -2472,8 +2472,10 @@ static int git_pack_config(const char *k, const char *v, void *cb)
 			die("invalid number of threads specified (%d)",
 			    delta_search_threads);
 #ifdef NO_PTHREADS
-		if (delta_search_threads != 1)
+		if (delta_search_threads != 1) {
 			warning("no threads support, ignoring %s", k);
+			delta_search_threads = 0;
+		}
 #endif
 		return 0;
 	}
diff --git a/t/t5300-pack-object.sh b/t/t5300-pack-object.sh
index 1629fa80b0..0ec25c4966 100755
--- a/t/t5300-pack-object.sh
+++ b/t/t5300-pack-object.sh
@@ -445,8 +445,7 @@ test_expect_success !PTHREADS,C_LOCALE_OUTPUT 'pack-objects --threads=N or pack.
 	git -c pack.threads=2 pack-objects --stdout --all </dev/null >/dev/null 2>err &&
 	cat err &&
 	grep ^warning: err >warnings &&
-	test_line_count = 2 warnings &&
-	grep -F "no threads support, ignoring --threads" err &&
+	test_line_count = 1 warnings &&
 	grep -F "no threads support, ignoring pack.threads" err &&
 	git -c pack.threads=2 pack-objects --threads=4 --stdout --all </dev/null >/dev/null 2>err &&
 	grep ^warning: err >warnings &&
-- 
2.11.0


^ permalink raw reply related	[flat|nested] 42+ messages in thread

* [PATCH v2 28/29] grep: given --threads with NO_PTHREADS=YesPlease, warn
  2017-05-13 23:14 [PATCH v2 00/29] Easy to review grep & pre-PCRE changes Ævar Arnfjörð Bjarmason
                   ` (26 preceding siblings ...)
  2017-05-13 23:15 ` [PATCH v2 27/29] pack-objects: fix buggy warning about threads Ævar Arnfjörð Bjarmason
@ 2017-05-13 23:15 ` Ævar Arnfjörð Bjarmason
  2017-05-13 23:15 ` [PATCH v2 29/29] grep: assert that threading is enabled when calling grep_{lock,unlock} Ævar Arnfjörð Bjarmason
  2017-05-15  9:09 ` [PATCH v2 00/29] Easy to review grep & pre-PCRE changes Junio C Hamano
  29 siblings, 0 replies; 42+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2017-05-13 23:15 UTC (permalink / raw)
  To: git
  Cc: Junio C Hamano, Jeff King, Jeffrey Walton,
	Michał Kiedrowicz, J Smith, Victor Leschuk,
	Nguyễn Thái Ngọc Duy, Fredrik Kuivinen,
	Brandon Williams, Stefan Beller, Johannes Schindelin,
	Ævar Arnfjörð Bjarmason

Add a warning about missing thread support when grep.threads or
--threads is set to a non 0 (default) or 1 (no parallelism) value
under NO_PTHREADS=YesPlease.

This is for consistency with the index-pack & pack-objects commands,
which also take a --threads option & are configurable via
pack.threads, and have long warned about the same under
NO_PTHREADS=YesPlease.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
---
 builtin/grep.c  | 13 +++++++++++++
 t/t7810-grep.sh | 18 ++++++++++++++++++
 2 files changed, 31 insertions(+)

diff --git a/builtin/grep.c b/builtin/grep.c
index a191e2976b..3c721b75a5 100644
--- a/builtin/grep.c
+++ b/builtin/grep.c
@@ -289,6 +289,17 @@ static int grep_cmd_config(const char *var, const char *value, void *cb)
 		if (num_threads < 0)
 			die(_("invalid number of threads specified (%d) for %s"),
 			    num_threads, var);
+#ifdef NO_PTHREADS
+		else if (num_threads && num_threads != 1) {
+			/*
+			 * TRANSLATORS: %s is the configuration
+			 * variable for tweaking threads, currently
+			 * grep.threads
+			 */
+			warning(_("no threads support, ignoring %s"), var);
+			num_threads = 0;
+		}
+#endif
 	}
 
 	return st;
@@ -1229,6 +1240,8 @@ int cmd_grep(int argc, const char **argv, const char *prefix)
 	else if (num_threads < 0)
 		die(_("invalid number of threads specified (%d)"), num_threads);
 #else
+	if (num_threads)
+		warning(_("no threads support, ignoring --threads"));
 	num_threads = 0;
 #endif
 
diff --git a/t/t7810-grep.sh b/t/t7810-grep.sh
index 561709ef6a..f106387820 100755
--- a/t/t7810-grep.sh
+++ b/t/t7810-grep.sh
@@ -791,6 +791,24 @@ do
 	"
 done
 
+test_expect_success !PTHREADS,C_LOCALE_OUTPUT 'grep --threads=N or pack.threads=N warns when no pthreads' '
+	git grep --threads=2 Hello hello_world 2>err &&
+	grep ^warning: err >warnings &&
+	test_line_count = 1 warnings &&
+	grep -F "no threads support, ignoring --threads" err &&
+	git -c grep.threads=2 grep Hello hello_world 2>err &&
+	grep ^warning: err >warnings &&
+	test_line_count = 1 warnings &&
+	grep -F "no threads support, ignoring grep.threads" err &&
+	git -c grep.threads=2 grep --threads=4 Hello hello_world 2>err &&
+	grep ^warning: err >warnings &&
+	test_line_count = 2 warnings &&
+	grep -F "no threads support, ignoring --threads" err &&
+	grep -F "no threads support, ignoring grep.threads" err &&
+	git -c grep.threads=0 grep --threads=0 Hello hello_world 2>err &&
+	test_line_count = 0 err
+'
+
 test_expect_success 'grep from a subdirectory to search wider area (1)' '
 	mkdir -p s &&
 	(
-- 
2.11.0


^ permalink raw reply related	[flat|nested] 42+ messages in thread

* [PATCH v2 29/29] grep: assert that threading is enabled when calling grep_{lock,unlock}
  2017-05-13 23:14 [PATCH v2 00/29] Easy to review grep & pre-PCRE changes Ævar Arnfjörð Bjarmason
                   ` (27 preceding siblings ...)
  2017-05-13 23:15 ` [PATCH v2 28/29] grep: given --threads with NO_PTHREADS=YesPlease, warn Ævar Arnfjörð Bjarmason
@ 2017-05-13 23:15 ` Ævar Arnfjörð Bjarmason
  2017-05-15  9:09 ` [PATCH v2 00/29] Easy to review grep & pre-PCRE changes Junio C Hamano
  29 siblings, 0 replies; 42+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2017-05-13 23:15 UTC (permalink / raw)
  To: git
  Cc: Junio C Hamano, Jeff King, Jeffrey Walton,
	Michał Kiedrowicz, J Smith, Victor Leschuk,
	Nguyễn Thái Ngọc Duy, Fredrik Kuivinen,
	Brandon Williams, Stefan Beller, Johannes Schindelin,
	Ævar Arnfjörð Bjarmason

Change the grep_{lock,unlock} functions to assert that num_threads is
true, instead of only locking & unlocking the pthread mutex lock when
it is.

These functions are never called when num_threads isn't true, this
logic has gone through multiple iterations since the initial
introduction of grep threading in commit 5b594f457a ("Threaded grep",
2010-01-25), but ever since then they'd only be called if num_threads
was true, so this check made the code confusing to read.

Replace the check with an assertion, so that it's clear to the reader
that this code path is never taken unless we're spawning threads.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
---
 builtin/grep.c | 8 ++++----
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/builtin/grep.c b/builtin/grep.c
index 3c721b75a5..b1095362fb 100644
--- a/builtin/grep.c
+++ b/builtin/grep.c
@@ -73,14 +73,14 @@ static pthread_mutex_t grep_mutex;
 
 static inline void grep_lock(void)
 {
-	if (num_threads)
-		pthread_mutex_lock(&grep_mutex);
+	assert(num_threads);
+	pthread_mutex_lock(&grep_mutex);
 }
 
 static inline void grep_unlock(void)
 {
-	if (num_threads)
-		pthread_mutex_unlock(&grep_mutex);
+	assert(num_threads);
+	pthread_mutex_unlock(&grep_mutex);
 }
 
 /* Signalled when a new work_item is added to todo. */
-- 
2.11.0


^ permalink raw reply related	[flat|nested] 42+ messages in thread

* Re: [PATCH v2 04/29] log: add exhaustive tests for pattern style options & config
  2017-05-13 23:14 ` [PATCH v2 04/29] log: add exhaustive tests for pattern style options & config Ævar Arnfjörð Bjarmason
@ 2017-05-15  4:57   ` Junio C Hamano
  2017-05-15 17:38     ` Ævar Arnfjörð Bjarmason
  0 siblings, 1 reply; 42+ messages in thread
From: Junio C Hamano @ 2017-05-15  4:57 UTC (permalink / raw)
  To: Ævar Arnfjörð Bjarmason
  Cc: git, Jeff King, Jeffrey Walton, Michał Kiedrowicz, J Smith,
	Victor Leschuk, Nguyễn Thái Ngọc Duy,
	Fredrik Kuivinen, Brandon Williams, Stefan Beller,
	Johannes Schindelin

Ævar Arnfjörð Bjarmason  <avarab@gmail.com> writes:

> +	echo 2e >expect &&
> +	# In PCRE \d in [\d] is like saying "0-9", and matches the 2
> +	# in 2e...
> +	git -C num_commits log -1 --pretty="tformat:%s" -F -E --perl-regexp --grep="[\d]" >actual &&
> +	test_cmp expect actual &&
> +	echo 1d >expect &&
> +	# ...in POSIX basic & extended it is the same as [d],
> +	# i.e. "d", which matches 1d, but not and does not match 2e.

s/not and//; I think.

> +	git -C num_commits log -1 --pretty="tformat:%s" -F -E --grep="[\d]" >actual &&
>  	test_cmp expect actual
>  '
>  
> @@ -280,6 +301,77 @@ test_expect_success 'log with grep.patternType configuration and command line' '
>  	test_cmp expect actual
>  '
>  
> +test_expect_success 'log with various grep.patternType configurations & command-lines' '
> +	git init pattern-type &&
> +	(
> +		cd pattern-type &&
> +		test_commit 1 file A &&
> +
> +		# The tagname is overridden here because creating a
> +		# tag called "(1|2)" as test_commit would otherwise
> +		# implicitly do would fail on e.g. MINGW.

Thanks.

> +		# POSIX extended needs to have | escaped to match it
> +		# literally, whereas under basic this is the same as
> +		# (|2), i.e. it would also match "1". This test checks
> +		# for extended by asserting that it is not matching
> +		# what basic would match.
> +		git -c grep.patternType=extended log --pretty=tformat:%s \
> +			--grep="\|2" >actual.extended &&

Makes sense.

> +		if test_have_prereq PCRE
> +		then
> +			# Only PCRE would match [\d]\| with only
> +			# "(1|2)" due to [\d]. POSIX basic would match
> +			# both it and "1", and POSIX extended would
> +			# match neither.

OK.  BRE would match because the other side of "\|" is empty to
match anything?

> +			git -c grep.patternType=perl log --pretty=tformat:%s \
> +				--grep="[\d]\|" >actual.perl &&
> +			test_cmp expect.perl actual.perl
> +		fi &&


^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: [PATCH v2 18/29] grep: catch a missing enum in switch statement
  2017-05-13 23:14 ` [PATCH v2 18/29] grep: catch a missing enum in switch statement Ævar Arnfjörð Bjarmason
@ 2017-05-15  5:50   ` Junio C Hamano
  2017-05-15 17:39     ` Ævar Arnfjörð Bjarmason
  0 siblings, 1 reply; 42+ messages in thread
From: Junio C Hamano @ 2017-05-15  5:50 UTC (permalink / raw)
  To: Ævar Arnfjörð Bjarmason
  Cc: git, Jeff King, Jeffrey Walton, Michał Kiedrowicz, J Smith,
	Victor Leschuk, Nguyễn Thái Ngọc Duy,
	Fredrik Kuivinen, Brandon Williams, Stefan Beller,
	Johannes Schindelin

Ævar Arnfjörð Bjarmason  <avarab@gmail.com> writes:

> Add a die(...) to a default case for the switch statement selecting
> between grep pattern types under --recurse-submodules.
>
> Normally this would be caught by -Wswitch, but the grep_pattern_type
> type is converted to int by going through parse_options(). Changing
> the argument type passed to compile_submodule_options() won't work,
> the value will just get coerced. The -Wswitch-default warning will
> warn about it, but that produces a lot of noise across the codebase,
> this potential issue would be drowned in that noise.
>
> Thus catching this at runtime is the least worst option. This won't

least bad, I guess?

> ever trigger in practice, but if a new pattern type were to be added
> this catches an otherwise silent bug during development.

Good future-proofing.  When this and Peff's "BUG" thing both
graduates, we can turn this into a BUG, I think.

Thanks.

>
> See commit 0281e487fd ("grep: optionally recurse into submodules",
> 2016-12-16) for the initial addition of this code.
>
> Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
> ---
>  builtin/grep.c | 2 ++
>  1 file changed, 2 insertions(+)
>
> diff --git a/builtin/grep.c b/builtin/grep.c
> index 3ffb5b4e81..a191e2976b 100644
> --- a/builtin/grep.c
> +++ b/builtin/grep.c
> @@ -495,6 +495,8 @@ static void compile_submodule_options(const struct grep_opt *opt,
>  		break;
>  	case GREP_PATTERN_TYPE_UNSPECIFIED:
>  		break;
> +	default:
> +		die("BUG: Added a new grep pattern type without updating switch statement");
>  	}
>  
>  	for (pattern = opt->pattern_list; pattern != NULL;

^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: [PATCH v2 20/29] grep: remove redundant `regflags &= ~REG_EXTENDED` assignments
  2017-05-13 23:15 ` [PATCH v2 20/29] grep: remove redundant `regflags &= ~REG_EXTENDED` assignments Ævar Arnfjörð Bjarmason
@ 2017-05-15  6:14   ` Junio C Hamano
  2017-05-15 17:41     ` Ævar Arnfjörð Bjarmason
  0 siblings, 1 reply; 42+ messages in thread
From: Junio C Hamano @ 2017-05-15  6:14 UTC (permalink / raw)
  To: Ævar Arnfjörð Bjarmason
  Cc: git, Jeff King, Jeffrey Walton, Michał Kiedrowicz, J Smith,
	Victor Leschuk, Nguyễn Thái Ngọc Duy,
	Fredrik Kuivinen, Brandon Williams, Stefan Beller,
	Johannes Schindelin

Ævar Arnfjörð Bjarmason  <avarab@gmail.com> writes:

> Remove redundant assignments to the "regflags" variable. There are no
> code paths that have previously set the regflags to anything, and
> certainly not to `|= REG_EXTENDED`.
>
> This code gave the impression that it had to reset its environment,
> but it doesn't. This dates back to the initial introduction of
> git-grep in commit 5010cb5fcc ("built-in "git grep"", 2006-04-30).

Back in 5010cb5fcc, we did do "opt.regflags &= ~REG_EXTENDED" upon
seeing "-G" on the command line and flipped the bit on upon seeing
"-E", but I think that was perfectly sensible and it would have been
a bug if we didn't.  They were part of the command line parsing that
could have seen "-E" on the command line earlier.

If we want to find a commit to assign blame to, I think it is more
correct to say that this came from the same one as 19/29 fixes.

When cca2c172 ("git-grep: do not die upon -F/-P when
grep.extendedRegexp is set.", 2011-05-09) switched the command line
parsing to "read into a 'tentatively this is what we saw the last'
variable and then finally commit just once", we didn't touch
opt.regflags for PCRE and FIXED, but we still had to flip regflags
between BRE and ERE, because parsing of grep.extendedregexp
configuration variable directly touched opt.regflags back then,
which was done by b22520a3 ("grep: allow -E and -n to be turned on
by default via configuration", 2011-03-30).  When 84befcd0 ("grep:
add a grep.patternType configuration setting", 2012-08-03)
introduced extended_regexp_option field, we stopped flipping
regflags while reading the configuration, and that was when we
should have noticed and stopped dropping REG_EXTENDED bit in the
"now we can commit what type to use" helper function.

In any case, I think this change is safe to do in the current
codebase.  I wonder if this and 19/29 should be a single patch,
though, as the unnecessary bit-flipping all are blamed to the same
origin.

> Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
> ---
>  grep.c | 5 +----
>  1 file changed, 1 insertion(+), 4 deletions(-)
>
> diff --git a/grep.c b/grep.c
> index 59ae7809f2..bf6c2494fd 100644
> --- a/grep.c
> +++ b/grep.c
> @@ -179,7 +179,6 @@ static void grep_set_pattern_type_option(enum grep_pattern_type pattern_type, st
>  	case GREP_PATTERN_TYPE_BRE:
>  		opt->fixed = 0;
>  		opt->pcre = 0;
> -		opt->regflags &= ~REG_EXTENDED;
>  		break;
>  
>  	case GREP_PATTERN_TYPE_ERE:
> @@ -191,7 +190,6 @@ static void grep_set_pattern_type_option(enum grep_pattern_type pattern_type, st
>  	case GREP_PATTERN_TYPE_FIXED:
>  		opt->fixed = 1;
>  		opt->pcre = 0;
> -		opt->regflags &= ~REG_EXTENDED;
>  		break;
>  
>  	case GREP_PATTERN_TYPE_PCRE:
> @@ -414,10 +412,9 @@ static void compile_fixed_regexp(struct grep_pat *p, struct grep_opt *opt)
>  {
>  	struct strbuf sb = STRBUF_INIT;
>  	int err;
> -	int regflags;
> +	int regflags = opt->regflags;
>  
>  	basic_regex_quote_buf(&sb, p->pattern);
> -	regflags = opt->regflags & ~REG_EXTENDED;
>  	if (opt->ignore_case)
>  		regflags |= REG_ICASE;
>  	err = regcomp(&p->regexp, sb.buf, regflags);

^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: [PATCH v2 21/29] grep: factor test for \0 in grep patterns into a function
  2017-05-13 23:15 ` [PATCH v2 21/29] grep: factor test for \0 in grep patterns into a function Ævar Arnfjörð Bjarmason
@ 2017-05-15  6:24   ` Junio C Hamano
  2017-05-15 18:07     ` Ævar Arnfjörð Bjarmason
  0 siblings, 1 reply; 42+ messages in thread
From: Junio C Hamano @ 2017-05-15  6:24 UTC (permalink / raw)
  To: Ævar Arnfjörð Bjarmason
  Cc: git, Jeff King, Jeffrey Walton, Michał Kiedrowicz, J Smith,
	Victor Leschuk, Nguyễn Thái Ngọc Duy,
	Fredrik Kuivinen, Brandon Williams, Stefan Beller,
	Johannes Schindelin

Ævar Arnfjörð Bjarmason  <avarab@gmail.com> writes:

> Factor the test for \0 in grep patterns into a function. Since commit
> 9eceddeec6 ("Use kwset in grep", 2011-08-21) any pattern containing a
> \0 is considered fixed as regcomp() can't handle it.
>
> This limitation was never documented, and other some regular
> expression engines are capable of compiling a pattern containing a
> \0. Factoring this out makes a subsequent change which does that
> smaller.

The latter half of the first sentence of this paragraph does not
parse well, around "and other some".  "never documented" makes
readers expect "... so let's document it", but I think you are
driving in a different direction, something like...

        Since some other regular expression engines are capable of a
        pattern to match a string with NUL in it, it would be nice
        if we can use such an engine and match against NUL; as this
        "patterns with NUL always use --fixed match" is not documented,
	let's hope that nobody depend on that current behaviour.
        This step does not yet change the behaviour yet, but it
        makes it easier to do so in later steps.

perhaps?  I tend to agree that it is OK to break users who depended
on that "patterns with NULL match with --fixed" (partly because it
is so unintuitive and not useful behaviour, and partly because it is
easy for them to explicitly say -F), but let's make sure we clearly
say that we will be knowingly breaking them.

Thanks.

^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: [PATCH v2 27/29] pack-objects: fix buggy warning about threads
  2017-05-13 23:15 ` [PATCH v2 27/29] pack-objects: fix buggy warning about threads Ævar Arnfjörð Bjarmason
@ 2017-05-15  8:59   ` Junio C Hamano
  2017-05-15 17:16     ` Ævar Arnfjörð Bjarmason
  0 siblings, 1 reply; 42+ messages in thread
From: Junio C Hamano @ 2017-05-15  8:59 UTC (permalink / raw)
  To: Ævar Arnfjörð Bjarmason
  Cc: git, Jeff King, Jeffrey Walton, Michał Kiedrowicz, J Smith,
	Victor Leschuk, Nguyễn Thái Ngọc Duy,
	Fredrik Kuivinen, Brandon Williams, Stefan Beller,
	Johannes Schindelin

Ævar Arnfjörð Bjarmason  <avarab@gmail.com> writes:

> Fix a buggy warning about threads under NO_PTHREADS=YesPlease. Due to
> re-using the delta_search_threads variable for both the state of the
> "pack.threads" config & the --threads option, setting "pack.threads"
> but not supplying --threads would trigger the warning for both
> "pack.threads" & --threads.
>
> Solve this bug by resetting the delta_search_threads variable in
> git_pack_config(), it might then be set by --threads again and be
> subsequently warned about, as the test I'm changing here asserts.
>
> Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
> ---
>  builtin/pack-objects.c | 4 +++-
>  t/t5300-pack-object.sh | 3 +--
>  2 files changed, 4 insertions(+), 3 deletions(-)
>
> diff --git a/builtin/pack-objects.c b/builtin/pack-objects.c
> index 0fe35d1b5a..f1baf05dfe 100644
> --- a/builtin/pack-objects.c
> +++ b/builtin/pack-objects.c
> @@ -2472,8 +2472,10 @@ static int git_pack_config(const char *k, const char *v, void *cb)
>  			die("invalid number of threads specified (%d)",
>  			    delta_search_threads);
>  #ifdef NO_PTHREADS
> -		if (delta_search_threads != 1)
> +		if (delta_search_threads != 1) {
>  			warning("no threads support, ignoring %s", k);
> +			delta_search_threads = 0;
> +		}
>  #endif
>  		return 0;
>  	}
> diff --git a/t/t5300-pack-object.sh b/t/t5300-pack-object.sh
> index 1629fa80b0..0ec25c4966 100755
> --- a/t/t5300-pack-object.sh
> +++ b/t/t5300-pack-object.sh
> @@ -445,8 +445,7 @@ test_expect_success !PTHREADS,C_LOCALE_OUTPUT 'pack-objects --threads=N or pack.
>  	git -c pack.threads=2 pack-objects --stdout --all </dev/null >/dev/null 2>err &&
>  	cat err &&
>  	grep ^warning: err >warnings &&
> -	test_line_count = 2 warnings &&
> -	grep -F "no threads support, ignoring --threads" err &&
> +	test_line_count = 1 warnings &&
>  	grep -F "no threads support, ignoring pack.threads" err &&
>  	git -c pack.threads=2 pack-objects --threads=4 --stdout --all </dev/null >/dev/null 2>err &&
>  	grep ^warning: err >warnings &&

Commenting on both 26 and 27.

The usual way we document a known breakage to be fixed is to write a
test that checks for the desired outcome with test_expect_failure,
and when a patch corrects the behaviour we just flip it to expect
success instead.  On the other hand, when we document a behaviour
that is accepted/acceptable we would have a test that checks for the
then-accepted behaviour with test_expect_success, and a patch that
improves the behaviour would update the expectation.

This follows the second pattern, even though the log message for the
patches claim this is about an existing bug and its fix.

Now, I am not saying (at least not yet) that 26 & 27 violates the
established practice and they must be changed to expect seeing only
one line of output in warnings with test_expect_failure in patch 26
which is flipped to test_expect_success in patch 27.  Yes, it does
not follow the usual pattern, but it gives a good food-for-thought.

Perhaps our usual pattern may be suboptimal in illustrating in what
way the current behaviour is not desirable, and the way these two
patches document the current breakage and then documents the fixed
behaviour may be a better example to follow?  With our usual way, it
is not apparent until you actually run the tests with the current
code what the questionable current behaviour is, but with the way
patch 26 is written, we can tell that two warnings are given,
instead of one.

I dunno.  What do others think?

^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: [PATCH v2 00/29] Easy to review grep & pre-PCRE changes
  2017-05-13 23:14 [PATCH v2 00/29] Easy to review grep & pre-PCRE changes Ævar Arnfjörð Bjarmason
                   ` (28 preceding siblings ...)
  2017-05-13 23:15 ` [PATCH v2 29/29] grep: assert that threading is enabled when calling grep_{lock,unlock} Ævar Arnfjörð Bjarmason
@ 2017-05-15  9:09 ` Junio C Hamano
  29 siblings, 0 replies; 42+ messages in thread
From: Junio C Hamano @ 2017-05-15  9:09 UTC (permalink / raw)
  To: Ævar Arnfjörð Bjarmason
  Cc: git, Jeff King, Jeffrey Walton, Michał Kiedrowicz, J Smith,
	Victor Leschuk, Nguyễn Thái Ngọc Duy,
	Fredrik Kuivinen, Brandon Williams, Stefan Beller,
	Johannes Schindelin

Except for a few nits and huh?s, I found this round quite a pleasant
read in general.  Thanks.

^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: [PATCH v2 27/29] pack-objects: fix buggy warning about threads
  2017-05-15  8:59   ` Junio C Hamano
@ 2017-05-15 17:16     ` Ævar Arnfjörð Bjarmason
  0 siblings, 0 replies; 42+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2017-05-15 17:16 UTC (permalink / raw)
  To: Junio C Hamano
  Cc: Git Mailing List, Jeff King, Jeffrey Walton,
	Michał Kiedrowicz, J Smith, Victor Leschuk,
	Nguyễn Thái Ngọc Duy, Fredrik Kuivinen,
	Brandon Williams, Stefan Beller, Johannes Schindelin

On Mon, May 15, 2017 at 10:59 AM, Junio C Hamano <gitster@pobox.com> wrote:
> Ævar Arnfjörð Bjarmason  <avarab@gmail.com> writes:
>
>> Fix a buggy warning about threads under NO_PTHREADS=YesPlease. Due to
>> re-using the delta_search_threads variable for both the state of the
>> "pack.threads" config & the --threads option, setting "pack.threads"
>> but not supplying --threads would trigger the warning for both
>> "pack.threads" & --threads.
>>
>> Solve this bug by resetting the delta_search_threads variable in
>> git_pack_config(), it might then be set by --threads again and be
>> subsequently warned about, as the test I'm changing here asserts.
>>
>> Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
>> ---
>>  builtin/pack-objects.c | 4 +++-
>>  t/t5300-pack-object.sh | 3 +--
>>  2 files changed, 4 insertions(+), 3 deletions(-)
>>
>> diff --git a/builtin/pack-objects.c b/builtin/pack-objects.c
>> index 0fe35d1b5a..f1baf05dfe 100644
>> --- a/builtin/pack-objects.c
>> +++ b/builtin/pack-objects.c
>> @@ -2472,8 +2472,10 @@ static int git_pack_config(const char *k, const char *v, void *cb)
>>                       die("invalid number of threads specified (%d)",
>>                           delta_search_threads);
>>  #ifdef NO_PTHREADS
>> -             if (delta_search_threads != 1)
>> +             if (delta_search_threads != 1) {
>>                       warning("no threads support, ignoring %s", k);
>> +                     delta_search_threads = 0;
>> +             }
>>  #endif
>>               return 0;
>>       }
>> diff --git a/t/t5300-pack-object.sh b/t/t5300-pack-object.sh
>> index 1629fa80b0..0ec25c4966 100755
>> --- a/t/t5300-pack-object.sh
>> +++ b/t/t5300-pack-object.sh
>> @@ -445,8 +445,7 @@ test_expect_success !PTHREADS,C_LOCALE_OUTPUT 'pack-objects --threads=N or pack.
>>       git -c pack.threads=2 pack-objects --stdout --all </dev/null >/dev/null 2>err &&
>>       cat err &&
>>       grep ^warning: err >warnings &&
>> -     test_line_count = 2 warnings &&
>> -     grep -F "no threads support, ignoring --threads" err &&
>> +     test_line_count = 1 warnings &&
>>       grep -F "no threads support, ignoring pack.threads" err &&
>>       git -c pack.threads=2 pack-objects --threads=4 --stdout --all </dev/null >/dev/null 2>err &&
>>       grep ^warning: err >warnings &&
>
> Commenting on both 26 and 27.
>
> The usual way we document a known breakage to be fixed is to write a
> test that checks for the desired outcome with test_expect_failure,
> and when a patch corrects the behaviour we just flip it to expect
> success instead.  On the other hand, when we document a behaviour
> that is accepted/acceptable we would have a test that checks for the
> then-accepted behaviour with test_expect_success, and a patch that
> improves the behaviour would update the expectation.
>
> This follows the second pattern, even though the log message for the
> patches claim this is about an existing bug and its fix.
>
> Now, I am not saying (at least not yet) that 26 & 27 violates the
> established practice and they must be changed to expect seeing only
> one line of output in warnings with test_expect_failure in patch 26
> which is flipped to test_expect_success in patch 27.  Yes, it does
> not follow the usual pattern, but it gives a good food-for-thought.
>
> Perhaps our usual pattern may be suboptimal in illustrating in what
> way the current behaviour is not desirable, and the way these two
> patches document the current breakage and then documents the fixed
> behaviour may be a better example to follow?  With our usual way, it
> is not apparent until you actually run the tests with the current
> code what the questionable current behaviour is, but with the way
> patch 26 is written, we can tell that two warnings are given,
> instead of one.
>
> I dunno.  What do others think?

I think it makes sense to make use of test_expect_failure here. I'll
do that in v3.

^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: [PATCH v2 04/29] log: add exhaustive tests for pattern style options & config
  2017-05-15  4:57   ` Junio C Hamano
@ 2017-05-15 17:38     ` Ævar Arnfjörð Bjarmason
  2017-05-16  0:50       ` Junio C Hamano
  0 siblings, 1 reply; 42+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2017-05-15 17:38 UTC (permalink / raw)
  To: Junio C Hamano
  Cc: Git Mailing List, Jeff King, Jeffrey Walton,
	Michał Kiedrowicz, J Smith, Victor Leschuk,
	Nguyễn Thái Ngọc Duy, Fredrik Kuivinen,
	Brandon Williams, Stefan Beller, Johannes Schindelin

On Mon, May 15, 2017 at 6:57 AM, Junio C Hamano <gitster@pobox.com> wrote:
> Ævar Arnfjörð Bjarmason  <avarab@gmail.com> writes:
>
>> +     echo 2e >expect &&
>> +     # In PCRE \d in [\d] is like saying "0-9", and matches the 2
>> +     # in 2e...
>> +     git -C num_commits log -1 --pretty="tformat:%s" -F -E --perl-regexp --grep="[\d]" >actual &&
>> +     test_cmp expect actual &&
>> +     echo 1d >expect &&
>> +     # ...in POSIX basic & extended it is the same as [d],
>> +     # i.e. "d", which matches 1d, but not and does not match 2e.
>
> s/not and//; I think.

Will fix.

>> +     git -C num_commits log -1 --pretty="tformat:%s" -F -E --grep="[\d]" >actual &&
>>       test_cmp expect actual
>>  '
>>
>> @@ -280,6 +301,77 @@ test_expect_success 'log with grep.patternType configuration and command line' '
>>       test_cmp expect actual
>>  '
>>
>> +test_expect_success 'log with various grep.patternType configurations & command-lines' '
>> +     git init pattern-type &&
>> +     (
>> +             cd pattern-type &&
>> +             test_commit 1 file A &&
>> +
>> +             # The tagname is overridden here because creating a
>> +             # tag called "(1|2)" as test_commit would otherwise
>> +             # implicitly do would fail on e.g. MINGW.
>
> Thanks.
>
>> +             # POSIX extended needs to have | escaped to match it
>> +             # literally, whereas under basic this is the same as
>> +             # (|2), i.e. it would also match "1". This test checks
>> +             # for extended by asserting that it is not matching
>> +             # what basic would match.
>> +             git -c grep.patternType=extended log --pretty=tformat:%s \
>> +                     --grep="\|2" >actual.extended &&
>
> Makes sense.
>
>> +             if test_have_prereq PCRE
>> +             then
>> +                     # Only PCRE would match [\d]\| with only
>> +                     # "(1|2)" due to [\d]. POSIX basic would match
>> +                     # both it and "1", and POSIX extended would
>> +                     # match neither.
>
> OK.  BRE would match because the other side of "\|" is empty to
> match anything?

Yes. I'll clarify this. It's not just a POSIX basic feature. The same
is true of extended and perl. E.g.:

    git grep [-E|-P] 'foo|bar'

Both match the same as:

    git grep [-E|-P] '(foo|bar)'

>> +                     git -c grep.patternType=perl log --pretty=tformat:%s \
>> +                             --grep="[\d]\|" >actual.perl &&
>> +                     test_cmp expect.perl actual.perl
>> +             fi &&
>

^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: [PATCH v2 18/29] grep: catch a missing enum in switch statement
  2017-05-15  5:50   ` Junio C Hamano
@ 2017-05-15 17:39     ` Ævar Arnfjörð Bjarmason
  0 siblings, 0 replies; 42+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2017-05-15 17:39 UTC (permalink / raw)
  To: Junio C Hamano
  Cc: Git Mailing List, Jeff King, Jeffrey Walton,
	Michał Kiedrowicz, J Smith, Victor Leschuk,
	Nguyễn Thái Ngọc Duy, Fredrik Kuivinen,
	Brandon Williams, Stefan Beller, Johannes Schindelin

On Mon, May 15, 2017 at 7:50 AM, Junio C Hamano <gitster@pobox.com> wrote:
> Ævar Arnfjörð Bjarmason  <avarab@gmail.com> writes:
>
>> Add a die(...) to a default case for the switch statement selecting
>> between grep pattern types under --recurse-submodules.
>>
>> Normally this would be caught by -Wswitch, but the grep_pattern_type
>> type is converted to int by going through parse_options(). Changing
>> the argument type passed to compile_submodule_options() won't work,
>> the value will just get coerced. The -Wswitch-default warning will
>> warn about it, but that produces a lot of noise across the codebase,
>> this potential issue would be drowned in that noise.
>>
>> Thus catching this at runtime is the least worst option. This won't
>
> least bad, I guess?

Will fix.

>> ever trigger in practice, but if a new pattern type were to be added
>> this catches an otherwise silent bug during development.
>
> Good future-proofing.  When this and Peff's "BUG" thing both
> graduates, we can turn this into a BUG, I think.

Yup, this and a bunch of other stuff presumably. The BUG feature is nice.

> Thanks.
>
>>
>> See commit 0281e487fd ("grep: optionally recurse into submodules",
>> 2016-12-16) for the initial addition of this code.
>>
>> Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
>> ---
>>  builtin/grep.c | 2 ++
>>  1 file changed, 2 insertions(+)
>>
>> diff --git a/builtin/grep.c b/builtin/grep.c
>> index 3ffb5b4e81..a191e2976b 100644
>> --- a/builtin/grep.c
>> +++ b/builtin/grep.c
>> @@ -495,6 +495,8 @@ static void compile_submodule_options(const struct grep_opt *opt,
>>               break;
>>       case GREP_PATTERN_TYPE_UNSPECIFIED:
>>               break;
>> +     default:
>> +             die("BUG: Added a new grep pattern type without updating switch statement");
>>       }
>>
>>       for (pattern = opt->pattern_list; pattern != NULL;

^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: [PATCH v2 20/29] grep: remove redundant `regflags &= ~REG_EXTENDED` assignments
  2017-05-15  6:14   ` Junio C Hamano
@ 2017-05-15 17:41     ` Ævar Arnfjörð Bjarmason
  0 siblings, 0 replies; 42+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2017-05-15 17:41 UTC (permalink / raw)
  To: Junio C Hamano
  Cc: Git Mailing List, Jeff King, Jeffrey Walton,
	Michał Kiedrowicz, J Smith, Victor Leschuk,
	Nguyễn Thái Ngọc Duy, Fredrik Kuivinen,
	Brandon Williams, Stefan Beller, Johannes Schindelin

On Mon, May 15, 2017 at 8:14 AM, Junio C Hamano <gitster@pobox.com> wrote:
> Ævar Arnfjörð Bjarmason  <avarab@gmail.com> writes:
>
>> Remove redundant assignments to the "regflags" variable. There are no
>> code paths that have previously set the regflags to anything, and
>> certainly not to `|= REG_EXTENDED`.
>>
>> This code gave the impression that it had to reset its environment,
>> but it doesn't. This dates back to the initial introduction of
>> git-grep in commit 5010cb5fcc ("built-in "git grep"", 2006-04-30).
>
> Back in 5010cb5fcc, we did do "opt.regflags &= ~REG_EXTENDED" upon
> seeing "-G" on the command line and flipped the bit on upon seeing
> "-E", but I think that was perfectly sensible and it would have been
> a bug if we didn't.  They were part of the command line parsing that
> could have seen "-E" on the command line earlier.
>
> If we want to find a commit to assign blame to, I think it is more
> correct to say that this came from the same one as 19/29 fixes.
>
> When cca2c172 ("git-grep: do not die upon -F/-P when
> grep.extendedRegexp is set.", 2011-05-09) switched the command line
> parsing to "read into a 'tentatively this is what we saw the last'
> variable and then finally commit just once", we didn't touch
> opt.regflags for PCRE and FIXED, but we still had to flip regflags
> between BRE and ERE, because parsing of grep.extendedregexp
> configuration variable directly touched opt.regflags back then,
> which was done by b22520a3 ("grep: allow -E and -n to be turned on
> by default via configuration", 2011-03-30).  When 84befcd0 ("grep:
> add a grep.patternType configuration setting", 2012-08-03)
> introduced extended_regexp_option field, we stopped flipping
> regflags while reading the configuration, and that was when we
> should have noticed and stopped dropping REG_EXTENDED bit in the
> "now we can commit what type to use" helper function.

Thanks for the history. I'll amend the commit message to note this.

> In any case, I think this change is safe to do in the current
> codebase.  I wonder if this and 19/29 should be a single patch,
> though, as the unnecessary bit-flipping all are blamed to the same
> origin.

Squashing these two makes sense. I'll do that.

>> Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
>> ---
>>  grep.c | 5 +----
>>  1 file changed, 1 insertion(+), 4 deletions(-)
>>
>> diff --git a/grep.c b/grep.c
>> index 59ae7809f2..bf6c2494fd 100644
>> --- a/grep.c
>> +++ b/grep.c
>> @@ -179,7 +179,6 @@ static void grep_set_pattern_type_option(enum grep_pattern_type pattern_type, st
>>       case GREP_PATTERN_TYPE_BRE:
>>               opt->fixed = 0;
>>               opt->pcre = 0;
>> -             opt->regflags &= ~REG_EXTENDED;
>>               break;
>>
>>       case GREP_PATTERN_TYPE_ERE:
>> @@ -191,7 +190,6 @@ static void grep_set_pattern_type_option(enum grep_pattern_type pattern_type, st
>>       case GREP_PATTERN_TYPE_FIXED:
>>               opt->fixed = 1;
>>               opt->pcre = 0;
>> -             opt->regflags &= ~REG_EXTENDED;
>>               break;
>>
>>       case GREP_PATTERN_TYPE_PCRE:
>> @@ -414,10 +412,9 @@ static void compile_fixed_regexp(struct grep_pat *p, struct grep_opt *opt)
>>  {
>>       struct strbuf sb = STRBUF_INIT;
>>       int err;
>> -     int regflags;
>> +     int regflags = opt->regflags;
>>
>>       basic_regex_quote_buf(&sb, p->pattern);
>> -     regflags = opt->regflags & ~REG_EXTENDED;
>>       if (opt->ignore_case)
>>               regflags |= REG_ICASE;
>>       err = regcomp(&p->regexp, sb.buf, regflags);

^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: [PATCH v2 21/29] grep: factor test for \0 in grep patterns into a function
  2017-05-15  6:24   ` Junio C Hamano
@ 2017-05-15 18:07     ` Ævar Arnfjörð Bjarmason
  0 siblings, 0 replies; 42+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2017-05-15 18:07 UTC (permalink / raw)
  To: Junio C Hamano
  Cc: Git Mailing List, Jeff King, Jeffrey Walton,
	Michał Kiedrowicz, J Smith, Victor Leschuk,
	Nguyễn Thái Ngọc Duy, Fredrik Kuivinen,
	Brandon Williams, Stefan Beller, Johannes Schindelin

On Mon, May 15, 2017 at 8:24 AM, Junio C Hamano <gitster@pobox.com> wrote:
> Ævar Arnfjörð Bjarmason  <avarab@gmail.com> writes:
>
>> Factor the test for \0 in grep patterns into a function. Since commit
>> 9eceddeec6 ("Use kwset in grep", 2011-08-21) any pattern containing a
>> \0 is considered fixed as regcomp() can't handle it.
>>
>> This limitation was never documented, and other some regular
>> expression engines are capable of compiling a pattern containing a
>> \0. Factoring this out makes a subsequent change which does that
>> smaller.
>
> The latter half of the first sentence of this paragraph does not
> parse well, around "and other some".  "never documented" makes
> readers expect "... so let's document it", but I think you are
> driving in a different direction, something like...

I'll try to reword it.

>         Since some other regular expression engines are capable of a
>         pattern to match a string with NUL in it, it would be nice
>         if we can use such an engine and match against NUL; as this
>         "patterns with NUL always use --fixed match" is not documented,
>         let's hope that nobody depend on that current behaviour.
>         This step does not yet change the behaviour yet, but it
>         makes it easier to do so in later steps.
>
> perhaps?  I tend to agree that it is OK to break users who depended
> on that "patterns with NULL match with --fixed" (partly because it
> is so unintuitive and not useful behaviour, and partly because it is
> easy for them to explicitly say -F), but let's make sure we clearly
> say that we will be knowingly breaking them.

I think there's a few different things going on here, which I'll
address accordingly.

 * Yes, if someone is relying on the undocumented behavior of a \0
pattern implying -F if and when I change that their stuff will break.
I'm going to actually just drop this whole discussion from this
commit, we can have an explanation for that when and if I actually
change it in a later series.

* I'm making things more confusing by saying "engines" here, but it's
important to note that git has never in its docs promised to use the C
library implementation of POSIX regexes, it's just advertised that it
uses POSIX regular *expressions*, which are a different thing.
Implementation != syntax.

E.g. GNU grep uses POSIX regex syntax, but its engine does not have
the same limitation as ours:

    $ (command cd /tmp && printf "æ\0ð\n" >a && printf "æ\n" >> a;
printf "æ\0[öð]" >f; grep -a -f f a; echo $?)
    æð
    0

I went down this particular rabbit hole because I was genuinely
surprised that this didn't work with git-grep, until I realized that
of course the implementation detail of using C strings for this under
the hood was bleeding through.

^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: [PATCH v2 04/29] log: add exhaustive tests for pattern style options & config
  2017-05-15 17:38     ` Ævar Arnfjörð Bjarmason
@ 2017-05-16  0:50       ` Junio C Hamano
  0 siblings, 0 replies; 42+ messages in thread
From: Junio C Hamano @ 2017-05-16  0:50 UTC (permalink / raw)
  To: Ævar Arnfjörð Bjarmason
  Cc: Git Mailing List, Jeff King, Jeffrey Walton,
	Michał Kiedrowicz, J Smith, Victor Leschuk,
	Nguyễn Thái Ngọc Duy, Fredrik Kuivinen,
	Brandon Williams, Stefan Beller, Johannes Schindelin

Ævar Arnfjörð Bjarmason <avarab@gmail.com> writes:

> On Mon, May 15, 2017 at 6:57 AM, Junio C Hamano <gitster@pobox.com> wrote:
>> Ævar Arnfjörð Bjarmason  <avarab@gmail.com> writes:
>>> +             if test_have_prereq PCRE
>>> +             then
>>> +                     # Only PCRE would match [\d]\| with only
>>> +                     # "(1|2)" due to [\d]. POSIX basic would match
>>> +                     # both it and "1", and POSIX extended would
>>> +                     # match neither.
>>
>> OK.  BRE would match because the other side of "\|" is empty to
>> match anything?
>
> Yes. I'll clarify this. It's not just a POSIX basic feature. The same
> is true of extended and perl. E.g.:

Yes, but "\|" won't be taken as alternative in ERE or PCRE, and that
is why "[\d]\|" would match everything with BRE (as opposed to
others---PCRE matches "(1|2)" not because "\|" is an alternative but
because the pattern looks for a digit followed by a literal vert-bar,
and ERE does not match any because there is no 'd' followed by a
literal vert-bar).

I was mostly reacting to "BRE would match both it and '1'", which
singled out "1" as if "1" is special and gives a false impression
that it wouldn't have matched if it were "7".

>
>     git grep [-E|-P] 'foo|bar'
>
> Both match the same as:
>
>     git grep [-E|-P] '(foo|bar)'
>
>>> +                     git -c grep.patternType=perl log --pretty=tformat:%s \
>>> +                             --grep="[\d]\|" >actual.perl &&
>>> +                     test_cmp expect.perl actual.perl
>>> +             fi &&
>>

^ permalink raw reply	[flat|nested] 42+ messages in thread

end of thread, other threads:[~2017-05-16  0:50 UTC | newest]

Thread overview: 42+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2017-05-13 23:14 [PATCH v2 00/29] Easy to review grep & pre-PCRE changes Ævar Arnfjörð Bjarmason
2017-05-13 23:14 ` [PATCH v2 01/29] Makefile & configure: reword inaccurate comment about PCRE Ævar Arnfjörð Bjarmason
2017-05-13 23:14 ` [PATCH v2 02/29] grep & rev-list doc: stop promising libpcre for --perl-regexp Ævar Arnfjörð Bjarmason
2017-05-13 23:14 ` [PATCH v2 03/29] test-lib: rename the LIBPCRE prerequisite to PCRE Ævar Arnfjörð Bjarmason
2017-05-13 23:14 ` [PATCH v2 04/29] log: add exhaustive tests for pattern style options & config Ævar Arnfjörð Bjarmason
2017-05-15  4:57   ` Junio C Hamano
2017-05-15 17:38     ` Ævar Arnfjörð Bjarmason
2017-05-16  0:50       ` Junio C Hamano
2017-05-13 23:14 ` [PATCH v2 05/29] grep: add a test asserting that --perl-regexp dies when !PCRE Ævar Arnfjörð Bjarmason
2017-05-13 23:14 ` [PATCH v2 06/29] grep: add a test for backreferences in PCRE patterns Ævar Arnfjörð Bjarmason
2017-05-13 23:14 ` [PATCH v2 07/29] grep: change non-ASCII -i test to stop using --debug Ævar Arnfjörð Bjarmason
2017-05-13 23:14 ` [PATCH v2 08/29] grep: add tests for --threads=N and grep.threads Ævar Arnfjörð Bjarmason
2017-05-13 23:14 ` [PATCH v2 09/29] grep: amend submodule recursion test for regex engine testing Ævar Arnfjörð Bjarmason
2017-05-13 23:14 ` [PATCH v2 10/29] grep: add tests for grep pattern types being passed to submodules Ævar Arnfjörð Bjarmason
2017-05-13 23:14 ` [PATCH v2 11/29] grep: add a test helper function for less verbose -f \0 tests Ævar Arnfjörð Bjarmason
2017-05-13 23:14 ` [PATCH v2 12/29] grep: prepare for testing binary regexes containing rx metacharacters Ævar Arnfjörð Bjarmason
2017-05-13 23:14 ` [PATCH v2 13/29] grep: add tests to fix blind spots with \0 patterns Ævar Arnfjörð Bjarmason
2017-05-13 23:14 ` [PATCH v2 14/29] perf: add a GIT_PERF_MAKE_COMMAND for when *_MAKE_OPTS won't do Ævar Arnfjörð Bjarmason
2017-05-13 23:14 ` [PATCH v2 15/29] perf: emit progress output when unpacking & building Ævar Arnfjörð Bjarmason
2017-05-13 23:14 ` [PATCH v2 16/29] perf: add a performance comparison test of grep -G, -E and -P Ævar Arnfjörð Bjarmason
2017-05-13 23:14 ` [PATCH v2 17/29] perf: add a performance comparison of fixed-string grep Ævar Arnfjörð Bjarmason
2017-05-13 23:14 ` [PATCH v2 18/29] grep: catch a missing enum in switch statement Ævar Arnfjörð Bjarmason
2017-05-15  5:50   ` Junio C Hamano
2017-05-15 17:39     ` Ævar Arnfjörð Bjarmason
2017-05-13 23:14 ` [PATCH v2 19/29] grep: remove redundant regflags assignment under PCRE Ævar Arnfjörð Bjarmason
2017-05-13 23:15 ` [PATCH v2 20/29] grep: remove redundant `regflags &= ~REG_EXTENDED` assignments Ævar Arnfjörð Bjarmason
2017-05-15  6:14   ` Junio C Hamano
2017-05-15 17:41     ` Ævar Arnfjörð Bjarmason
2017-05-13 23:15 ` [PATCH v2 21/29] grep: factor test for \0 in grep patterns into a function Ævar Arnfjörð Bjarmason
2017-05-15  6:24   ` Junio C Hamano
2017-05-15 18:07     ` Ævar Arnfjörð Bjarmason
2017-05-13 23:15 ` [PATCH v2 22/29] grep: change the internal PCRE macro names to be PCRE1 Ævar Arnfjörð Bjarmason
2017-05-13 23:15 ` [PATCH v2 23/29] grep: change internal *pcre* variable & function names to be *pcre1* Ævar Arnfjörð Bjarmason
2017-05-13 23:15 ` [PATCH v2 24/29] grep: move is_fixed() earlier to avoid forward declaration Ævar Arnfjörð Bjarmason
2017-05-13 23:15 ` [PATCH v2 25/29] test-lib: add a PTHREADS prerequisite Ævar Arnfjörð Bjarmason
2017-05-13 23:15 ` [PATCH v2 26/29] pack-objects & index-pack: add test for --threads warning Ævar Arnfjörð Bjarmason
2017-05-13 23:15 ` [PATCH v2 27/29] pack-objects: fix buggy warning about threads Ævar Arnfjörð Bjarmason
2017-05-15  8:59   ` Junio C Hamano
2017-05-15 17:16     ` Ævar Arnfjörð Bjarmason
2017-05-13 23:15 ` [PATCH v2 28/29] grep: given --threads with NO_PTHREADS=YesPlease, warn Ævar Arnfjörð Bjarmason
2017-05-13 23:15 ` [PATCH v2 29/29] grep: assert that threading is enabled when calling grep_{lock,unlock} Ævar Arnfjörð Bjarmason
2017-05-15  9:09 ` [PATCH v2 00/29] Easy to review grep & pre-PCRE changes Junio C Hamano

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.