All of lore.kernel.org
 help / color / mirror / Atom feed
From: "Ævar Arnfjörð Bjarmason" <avarab@gmail.com>
To: Junio C Hamano <gitster@pobox.com>
Cc: "Git Mailing List" <git@vger.kernel.org>,
	"Jeff King" <peff@peff.net>,
	"Jeffrey Walton" <noloader@gmail.com>,
	"Michał Kiedrowicz" <michal.kiedrowicz@gmail.com>,
	"J Smith" <dark.panda@gmail.com>,
	"Victor Leschuk" <vleschuk@gmail.com>,
	"Nguyễn Thái Ngọc Duy" <pclouds@gmail.com>,
	"Fredrik Kuivinen" <frekui@gmail.com>,
	"Brandon Williams" <bmwill@google.com>
Subject: Re: [PATCH v3 05/30] log: make --regexp-ignore-case work with --perl-regexp
Date: Sun, 21 May 2017 08:58:24 +0200	[thread overview]
Message-ID: <CACBZZX6Hp4Q4TOj_X1fbdCA4twoXF5JemZ5ZbEn7wmkA=1KO2g@mail.gmail.com> (raw)
In-Reply-To: <xmqqlgprqe9j.fsf@gitster.mtv.corp.google.com>

On Sun, May 21, 2017 at 1:50 AM, Junio C Hamano <gitster@pobox.com> wrote:
> Ævar Arnfjörð Bjarmason  <avarab@gmail.com> writes:
>
>> Make the --regexp-ignore-case option work with --perl-regexp. This
>> never worked, and there was no test for this. Fix the bug and add a
>> test.
>>
>> When PCRE support was added in commit 63e7e9d8b6 ("git-grep: Learn
>> PCRE", 2011-05-09) compile_pcre_regexp() would only check
>> opt->ignore_case, but when the --perl-regexp option was added in
>> commit 727b6fc3ed ("log --grep: accept --basic-regexp and
>> --perl-regexp", 2012-10-03) the code didn't set the opt->ignore_case.
>>
>> Change the test suite to test for -i and --invert-regexp with
>> basic/extended/perl patterns in addition to fixed, which was the only
>> patternType that was tested for before in combination with those
>> options.
>>
>> Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
>> ---
>>  revision.c     |  1 +
>>  t/t4202-log.sh | 60 +++++++++++++++++++++++++++++++++++++++++++++++++++++-----
>>  2 files changed, 56 insertions(+), 5 deletions(-)
>>
>> diff --git a/revision.c b/revision.c
>> index 8a8c1789c7..4883cdd2d0 100644
>> --- a/revision.c
>> +++ b/revision.c
>> @@ -1991,6 +1991,7 @@ static int handle_revision_opt(struct rev_info *revs, int argc, const char **arg
>>       } else if (!strcmp(arg, "--extended-regexp") || !strcmp(arg, "-E")) {
>>               revs->grep_filter.pattern_type_option = GREP_PATTERN_TYPE_ERE;
>>       } else if (!strcmp(arg, "--regexp-ignore-case") || !strcmp(arg, "-i")) {
>> +             revs->grep_filter.ignore_case = 1;
>>               revs->grep_filter.regflags |= REG_ICASE;
>>               DIFF_OPT_SET(&revs->diffopt, PICKAXE_IGNORE_CASE);
>>       } else if (!strcmp(arg, "--fixed-strings") || !strcmp(arg, "-F")) {
>
> Looks good.
>
> I however wonder if it is a better approach in the longer term to
> treat the .ignore_case field just like .extended_regexp_option
> field, i.e. not committing immediately to .regflags but commit it
> after config and command line parsing is done, just like we make the
> "BRE? ERE?" decision in grep_commit_pattern_type().

I started hacking up a patch to fix the root cause of this, i.e. the
users of the grep API should only set `.ignore_case = 1` and not care
about setting regflags, but it was more than a trivial change, so I
didn't include it in this series:

diff --git a/builtin/grep.c b/builtin/grep.c
index 3ffb5b4e81..be28c37265 100644
--- a/builtin/grep.c
+++ b/builtin/grep.c
@@ -1151,8 +1151,6 @@ int cmd_grep(int argc, const char **argv, const
char *prefix)

        if (!opt.pattern_list)
                die(_("no pattern given."));
-       if (!opt.fixed && opt.ignore_case)
-               opt.regflags |= REG_ICASE;

        compile_grep_patterns(&opt);

diff --git a/grep.c b/grep.c
index 47cee45067..7b13ee1043 100644
--- a/grep.c
+++ b/grep.c
@@ -435,12 +435,11 @@ static void compile_fixed_regexp(struct grep_pat
*p, struct grep_opt *opt)

 static void compile_regexp(struct grep_pat *p, struct grep_opt *opt)
 {
-       int icase, ascii_only;
+       int ascii_only;
        int err;

        p->word_regexp = opt->word_regexp;
        p->ignore_case = opt->ignore_case;
-       icase          = opt->regflags & REG_ICASE || p->ignore_case;
        ascii_only     = !has_non_ascii(p->pattern);

        /*
@@ -456,12 +455,12 @@ static void compile_regexp(struct grep_pat *p,
struct grep_opt *opt)
         * want to use kws.
         */
        if (opt->fixed || is_fixed(p->pattern, p->patternlen))
-               p->fixed = !icase || ascii_only;
+               p->fixed = !p->ignore_case || ascii_only;
        else
                p->fixed = 0;

        if (p->fixed) {
-               p->kws = kwsalloc(icase ? tolower_trans_tbl : NULL);
+               p->kws = kwsalloc(p->ignore_case ? tolower_trans_tbl : NULL);
                kwsincr(p->kws, p->pattern, p->patternlen);
                kwsprep(p->kws);
                return;
@@ -480,6 +479,8 @@ static void compile_regexp(struct grep_pat *p,
struct grep_opt *opt)
                return;
        }

+       if (p->ignore_case)
+               opt->regflags |= REG_ICASE;
        err = regcomp(&p->regexp, p->pattern, opt->regflags);
        if (err) {
                char errbuf[1024];
diff --git a/revision.c b/revision.c
index 4883cdd2d0..30c23a1098 100644
--- a/revision.c
+++ b/revision.c
@@ -1992,7 +1992,6 @@ static int handle_revision_opt(struct rev_info
*revs, int argc, const char **arg
                revs->grep_filter.pattern_type_option = GREP_PATTERN_TYPE_ERE;
        } else if (!strcmp(arg, "--regexp-ignore-case") || !strcmp(arg, "-i")) {
                revs->grep_filter.ignore_case = 1;
-               revs->grep_filter.regflags |= REG_ICASE;
                DIFF_OPT_SET(&revs->diffopt, PICKAXE_IGNORE_CASE);
        } else if (!strcmp(arg, "--fixed-strings") || !strcmp(arg, "-F")) {
                revs->grep_filter.pattern_type_option = GREP_PATTERN_TYPE_FIXED;

But an even better solution is to get rid of passing the regflags
field in grep_opt entirely, this conflicts with some of my later
patches:

diff --git a/builtin/grep.c b/builtin/grep.c
index 3ffb5b4e81..be28c37265 100644
--- a/builtin/grep.c
+++ b/builtin/grep.c
@@ -1151,8 +1151,6 @@ int cmd_grep(int argc, const char **argv, const
char *prefix)

        if (!opt.pattern_list)
                die(_("no pattern given."));
-       if (!opt.fixed && opt.ignore_case)
-               opt.regflags |= REG_ICASE;

        compile_grep_patterns(&opt);
diff --git a/grep.c b/grep.c
index 47cee45067..1bde7037ba 100644
--- a/grep.c
+++ b/grep.c
@@ -34,7 +34,6 @@ void init_grep_defaults(void)
        memset(opt, 0, sizeof(*opt));
        opt->relative = 1;
        opt->pathname = 1;
-       opt->regflags = REG_NEWLINE;
        opt->max_depth = -1;
        opt->pattern_type_option = GREP_PATTERN_TYPE_UNSPECIFIED;
        opt->extended_regexp_option = 0;
@@ -156,7 +155,6 @@ void grep_init(struct grep_opt *opt, const char *prefix)
        opt->linenum = def->linenum;
        opt->max_depth = def->max_depth;
        opt->pathname = def->pathname;
-       opt->regflags = def->regflags;
        opt->relative = def->relative;
        opt->output = def->output;

@@ -179,25 +177,25 @@ static void grep_set_pattern_type_option(enum
grep_pattern_type pattern_type, st
        case GREP_PATTERN_TYPE_BRE:
                opt->fixed = 0;
                opt->pcre = 0;
-               opt->regflags &= ~REG_EXTENDED;
+               opt->extended = 0;
                break;
         case GREP_PATTERN_TYPE_ERE:
                opt->fixed = 0;
                opt->pcre = 0;
-               opt->regflags |= REG_EXTENDED;
+               opt->extended = 1;
                break;

        case GREP_PATTERN_TYPE_FIXED:
                opt->fixed = 1;
                opt->pcre = 0;
-               opt->regflags &= ~REG_EXTENDED;
+               opt->extended = 0;
                break;

        case GREP_PATTERN_TYPE_PCRE:
                opt->fixed = 0;
                opt->pcre = 1;
-               opt->regflags &= ~REG_EXTENDED;
+               opt->extended = 0;
                break;
        }
 }
@@ -415,10 +413,9 @@ static void compile_fixed_regexp(struct grep_pat
*p, struct grep_opt *opt)
 {
        struct strbuf sb = STRBUF_INIT;
        int err;
-       int regflags;
+       int regflags = REG_NEWLINE;

        basic_regex_quote_buf(&sb, p->pattern);
-       regflags = opt->regflags & ~REG_EXTENDED;
        if (opt->ignore_case)
                regflags |= REG_ICASE;
        err = regcomp(&p->regexp, sb.buf, regflags);
@@ -435,12 +432,12 @@ static void compile_fixed_regexp(struct grep_pat
*p, struct grep_opt *opt)

 static void compile_regexp(struct grep_pat *p, struct grep_opt *opt)
 {
-       int icase, ascii_only;
+       int ascii_only;
        int err;
+       int regflags = REG_NEWLINE;

        p->word_regexp = opt->word_regexp;
        p->ignore_case = opt->ignore_case;
-       icase          = opt->regflags & REG_ICASE || p->ignore_case;
        ascii_only     = !has_non_ascii(p->pattern);

        /*
@@ -456,12 +453,12 @@ static void compile_regexp(struct grep_pat *p,
struct grep_opt *opt)
         * want to use kws.
         */
        if (opt->fixed || is_fixed(p->pattern, p->patternlen))
-               p->fixed = !icase || ascii_only;
+               p->fixed = !p->ignore_case || ascii_only;
        else
                p->fixed = 0;

        if (p->fixed) {
-               p->kws = kwsalloc(icase ? tolower_trans_tbl : NULL);
+               p->kws = kwsalloc(p->ignore_case ? tolower_trans_tbl : NULL);
                kwsincr(p->kws, p->pattern, p->patternlen);
                kwsprep(p->kws);
                return;
@@ -480,7 +477,11 @@ static void compile_regexp(struct grep_pat *p,
struct grep_opt *opt)
                return;
        }

-       err = regcomp(&p->regexp, p->pattern, opt->regflags);
+       if (p->ignore_case)
+               regflags |= REG_ICASE;
+       if (opt->extended)
+               regflags |= REG_EXTENDED;
+       err = regcomp(&p->regexp, p->pattern, regflags);
        if (err) {
                char errbuf[1024];
                regerror(err, &p->regexp, errbuf, 1024);
diff --git a/grep.h b/grep.h
index 267534ca24..d9d603deb1 100644
--- a/grep.h
+++ b/grep.h
@@ -129,7 +129,6 @@ struct grep_opt {
        char color_match_selected[COLOR_MAXLEN];
        char color_selected[COLOR_MAXLEN];
        char color_sep[COLOR_MAXLEN];
-       int regflags;
        unsigned pre_context;
        unsigned post_context;
        unsigned last_shown;
diff --git a/revision.c b/revision.c
index 4883cdd2d0..67240d38af 100644
--- a/revision.c
+++ b/revision.c
@@ -1362,7 +1362,6 @@ void init_revisions(struct rev_info *revs, const
char *prefix)
        init_grep_defaults();
        grep_init(&revs->grep_filter, prefix);
        revs->grep_filter.status_only = 1;
-       revs->grep_filter.regflags = REG_NEWLINE;

        diff_setup(&revs->diffopt);
        if (prefix && !revs->diffopt.prefix) {
@@ -1992,7 +1991,6 @@ static int handle_revision_opt(struct rev_info
*revs, int argc, const char **arg
                revs->grep_filter.pattern_type_option = GREP_PATTERN_TYPE_ERE;
        } else if (!strcmp(arg, "--regexp-ignore-case") || !strcmp(arg, "-i")) {
                revs->grep_filter.ignore_case = 1;
-               revs->grep_filter.regflags |= REG_ICASE;
                DIFF_OPT_SET(&revs->diffopt, PICKAXE_IGNORE_CASE);
        } else if (!strcmp(arg, "--fixed-strings") || !strcmp(arg, "-F")) {
                revs->grep_filter.pattern_type_option = GREP_PATTERN_TYPE_FIXED;

But as all this code cleanup isn't needed for fixing this bug, and I'd
really like to get this series merged into next/master ASAP so I can
start submitting the grep/pcre patches that are actually interesting,
let's leave this orthogonal code cleanup for now.

  reply	other threads:[~2017-05-21  6:58 UTC|newest]

Thread overview: 77+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2017-05-20 21:42 [PATCH v3 00/30] Easy to review grep & pre-PCRE changes Ævar Arnfjörð Bjarmason
2017-05-20 21:42 ` [PATCH v3 01/30] Makefile & configure: reword inaccurate comment about PCRE Ævar Arnfjörð Bjarmason
2017-05-20 21:42 ` [PATCH v3 02/30] grep & rev-list doc: stop promising libpcre for --perl-regexp Ævar Arnfjörð Bjarmason
2017-05-20 21:42 ` [PATCH v3 03/30] test-lib: rename the LIBPCRE prerequisite to PCRE Ævar Arnfjörð Bjarmason
2017-05-20 21:42 ` [PATCH v3 04/30] log: add exhaustive tests for pattern style options & config Ævar Arnfjörð Bjarmason
2017-05-20 21:42 ` [PATCH v3 05/30] log: make --regexp-ignore-case work with --perl-regexp Ævar Arnfjörð Bjarmason
2017-05-20 23:50   ` Junio C Hamano
2017-05-21  6:58     ` Ævar Arnfjörð Bjarmason [this message]
2017-05-22  0:17       ` Junio C Hamano
2017-06-28 21:58       ` [PATCH 0/5] grep: remove redundant code & reflags from API Ævar Arnfjörð Bjarmason
2017-06-29 22:22         ` [PATCH v2 0/6] " Ævar Arnfjörð Bjarmason
2017-06-29 22:22         ` [PATCH v2 1/6] grep: remove redundant double assignment to 0 Ævar Arnfjörð Bjarmason
2017-06-29 22:22         ` [PATCH v2 2/6] grep: adjust a redundant grep pattern type assignment Ævar Arnfjörð Bjarmason
2017-06-29 22:22         ` [PATCH v2 3/6] grep: remove redundant "fixed" field re-assignment to 0 Ævar Arnfjörð Bjarmason
2017-06-29 22:22         ` [PATCH v2 4/6] grep: remove redundant and verbose re-assignments " Ævar Arnfjörð Bjarmason
2017-06-29 22:22         ` [PATCH v2 5/6] grep: remove regflags from the public grep_opt API Ævar Arnfjörð Bjarmason
2017-06-30 17:20           ` Junio C Hamano
2017-06-29 22:22         ` [PATCH v2 6/6] grep: remove redundant REG_NEWLINE when compiling fixed regex Ævar Arnfjörð Bjarmason
2017-06-28 21:58       ` [PATCH 1/5] grep: remove redundant double assignment to 0 Ævar Arnfjörð Bjarmason
2017-06-28 21:58       ` [PATCH 2/5] grep: remove redundant grep pattern type assignment Ævar Arnfjörð Bjarmason
2017-06-29 17:03         ` Stefan Beller
2017-06-29 17:50           ` Ævar Arnfjörð Bjarmason
2017-06-28 21:58       ` [PATCH 3/5] grep: remove redundant "fixed" field re-assignment to 0 Ævar Arnfjörð Bjarmason
2017-06-29 17:10         ` Stefan Beller
2017-06-28 21:58       ` [PATCH 4/5] grep: remove redundant and verbose re-assignments " Ævar Arnfjörð Bjarmason
2017-06-28 21:58       ` [PATCH 5/5] grep: remove regflags from the public grep_opt API Ævar Arnfjörð Bjarmason
2017-06-29 17:43         ` Stefan Beller
2017-06-29 18:16           ` Ævar Arnfjörð Bjarmason
2017-05-20 21:42 ` [PATCH v3 06/30] grep: add a test asserting that --perl-regexp dies when !PCRE Ævar Arnfjörð Bjarmason
2017-05-20 21:42 ` [PATCH v3 07/30] grep: add a test for backreferences in PCRE patterns Ævar Arnfjörð Bjarmason
2017-05-20 21:42 ` [PATCH v3 08/30] grep: change non-ASCII -i test to stop using --debug Ævar Arnfjörð Bjarmason
2017-05-20 21:42 ` [PATCH v3 09/30] grep: add tests for --threads=N and grep.threads Ævar Arnfjörð Bjarmason
2017-05-20 21:42 ` [PATCH v3 10/30] grep: amend submodule recursion test for regex engine testing Ævar Arnfjörð Bjarmason
2017-05-20 21:42 ` [PATCH v3 11/30] grep: add tests for grep pattern types being passed to submodules Ævar Arnfjörð Bjarmason
2017-05-20 21:42 ` [PATCH v3 12/30] grep: add a test helper function for less verbose -f \0 tests Ævar Arnfjörð Bjarmason
2017-05-20 21:42 ` [PATCH v3 13/30] grep: prepare for testing binary regexes containing rx metacharacters Ævar Arnfjörð Bjarmason
2017-05-20 21:42 ` [PATCH v3 14/30] grep: add tests to fix blind spots with \0 patterns Ævar Arnfjörð Bjarmason
2017-05-20 21:42 ` [PATCH v3 15/30] perf: add a GIT_PERF_MAKE_COMMAND for when *_MAKE_OPTS won't do Ævar Arnfjörð Bjarmason
2017-05-20 23:50   ` Junio C Hamano
2017-05-21  6:23     ` Ævar Arnfjörð Bjarmason
2017-05-20 21:42 ` [PATCH v3 16/30] perf: emit progress output when unpacking & building Ævar Arnfjörð Bjarmason
2017-05-20 21:42 ` [PATCH v3 17/30] perf: add a comparison test of grep regex engines Ævar Arnfjörð Bjarmason
2017-05-20 21:42 ` [PATCH v3 18/30] perf: add a comparison test of grep regex engines with -F Ævar Arnfjörð Bjarmason
2017-05-20 21:42 ` [PATCH v3 19/30] perf: add a comparison test of log --grep regex engines Ævar Arnfjörð Bjarmason
2017-05-20 21:42 ` [PATCH v3 20/30] grep: catch a missing enum in switch statement Ævar Arnfjörð Bjarmason
2017-05-20 21:42 ` [PATCH v3 21/30] grep: remove redundant regflags assignments Ævar Arnfjörð Bjarmason
2017-05-20 21:42 ` [PATCH v3 22/30] grep: factor test for \0 in grep patterns into a function Ævar Arnfjörð Bjarmason
2017-05-23 21:17   ` Brandon Williams
2017-05-20 21:42 ` [PATCH v3 23/30] grep: change the internal PCRE macro names to be PCRE1 Ævar Arnfjörð Bjarmason
2017-05-20 21:42 ` [PATCH v3 24/30] grep: change internal *pcre* variable & function names to be *pcre1* Ævar Arnfjörð Bjarmason
2017-05-20 21:42 ` [PATCH v3 25/30] grep: move is_fixed() earlier to avoid forward declaration Ævar Arnfjörð Bjarmason
2017-05-20 21:42 ` [PATCH v3 26/30] test-lib: add a PTHREADS prerequisite Ævar Arnfjörð Bjarmason
2017-05-20 21:42 ` [PATCH v3 27/30] pack-objects & index-pack: add test for --threads warning Ævar Arnfjörð Bjarmason
2017-05-20 21:42 ` [PATCH v3 28/30] pack-objects: fix buggy warning about threads Ævar Arnfjörð Bjarmason
2017-05-20 21:42 ` [PATCH v3 29/30] grep: given --threads with NO_PTHREADS=YesPlease, warn Ævar Arnfjörð Bjarmason
2017-05-20 21:42 ` [PATCH v3 30/30] grep: assert that threading is enabled when calling grep_{lock,unlock} Ævar Arnfjörð Bjarmason
2017-05-20 23:50 ` [PATCH v3 00/30] Easy to review grep & pre-PCRE changes Junio C Hamano
2017-05-23 19:24   ` [PATCH v2 0/7] PCRE v2, PCRE v1 JIT, log -P & fixes Ævar Arnfjörð Bjarmason
2017-05-23 19:24   ` [PATCH v2 1/7] grep: don't redundantly compile throwaway patterns under threading Ævar Arnfjörð Bjarmason
2017-05-24  4:42     ` Junio C Hamano
2017-05-25 10:33       ` Ævar Arnfjörð Bjarmason
2017-05-26  0:58         ` Junio C Hamano
2017-05-26  8:06           ` Ævar Arnfjörð Bjarmason
2017-05-26  9:49             ` Junio C Hamano
2017-05-23 19:24   ` [PATCH v2 2/7] grep: skip pthreads overhead when using one thread Ævar Arnfjörð Bjarmason
2017-05-24  4:45     ` Junio C Hamano
2017-05-23 19:24   ` [PATCH v2 3/7] log: add -P as a synonym for --perl-regexp Ævar Arnfjörð Bjarmason
2017-05-23 19:24   ` [PATCH v2 4/7] grep: add support for the PCRE v1 JIT API Ævar Arnfjörð Bjarmason
2017-05-24  5:17     ` Junio C Hamano
2017-05-24  7:37       ` Ævar Arnfjörð Bjarmason
2017-05-23 19:24   ` [PATCH v2 5/7] grep: un-break building with PCRE < 8.32 Ævar Arnfjörð Bjarmason
2017-05-24  6:00     ` Junio C Hamano
2017-05-24  6:38       ` Junio C Hamano
2017-05-23 19:24   ` [PATCH v2 6/7] grep: un-break building with PCRE < 8.20 Ævar Arnfjörð Bjarmason
2017-05-23 19:24   ` [PATCH v2 7/7] grep: add support for PCRE v2 Ævar Arnfjörð Bjarmason
2017-05-24  6:23     ` Junio C Hamano
2017-05-25  9:49       ` Ævar Arnfjörð Bjarmason

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to='CACBZZX6Hp4Q4TOj_X1fbdCA4twoXF5JemZ5ZbEn7wmkA=1KO2g@mail.gmail.com' \
    --to=avarab@gmail.com \
    --cc=bmwill@google.com \
    --cc=dark.panda@gmail.com \
    --cc=frekui@gmail.com \
    --cc=git@vger.kernel.org \
    --cc=gitster@pobox.com \
    --cc=michal.kiedrowicz@gmail.com \
    --cc=noloader@gmail.com \
    --cc=pclouds@gmail.com \
    --cc=peff@peff.net \
    --cc=vleschuk@gmail.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.