All of lore.kernel.org
 help / color / mirror / Atom feed
From: "Ævar Arnfjörð Bjarmason" <avarab@gmail.com>
To: Carlo Arenas <carenas@gmail.com>
Cc: git@vger.kernel.org, Junio C Hamano <gitster@pobox.com>,
	Beat Bolli <dev+git@drbeat.li>,
	Johannes Schindelin <Johannes.Schindelin@gmx.de>
Subject: Re: [PATCH v2 4/8] grep: consistently use "p->fixed" in compile_regexp()
Date: Mon, 29 Jul 2019 11:13:12 +0200	[thread overview]
Message-ID: <87zhkx5hlj.fsf@evledraar.gmail.com> (raw)
In-Reply-To: <871ry96wiq.fsf@evledraar.gmail.com>


On Mon, Jul 29 2019, Ævar Arnfjörð Bjarmason wrote:

> On Mon, Jul 29 2019, Carlo Arenas wrote:
>
>> On Fri, Jul 26, 2019 at 8:09 AM Ævar Arnfjörð Bjarmason
>> <avarab@gmail.com> wrote:
>>>
>>> It's less confusing to use that variable consistently that switch back
>>> & forth between the two.
>>>
>>> Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
>>> ---
>>>  grep.c | 2 +-
>>>  1 file changed, 1 insertion(+), 1 deletion(-)
>>>
>>> diff --git a/grep.c b/grep.c
>>> index 9c2b259771..b94e998680 100644
>>> --- a/grep.c
>>> +++ b/grep.c
>>> @@ -616,7 +616,7 @@ static void compile_regexp(struct grep_pat *p, struct grep_opt *opt)
>>>                 die(_("given pattern contains NULL byte (via -f <file>). This is only supported with -P under PCRE v2"));
>>>
>>>         pat_is_fixed = is_fixed(p->pattern, p->patternlen);
>>> -       if (opt->fixed || pat_is_fixed) {
>>> +       if (p->fixed || pat_is_fixed) {
>>
>> at the end of this series we have:
>>
>>   if (p->fixed || p->is_fixed)
>>
>> which doesn't make sense; at least with opt->fixed it was clear that
>> what was meant is that grep was passed -P
>
> I assume you mean "was passed -F...".
>
>> maybe is_fixed shouldn't exist and fixed when applied to the pattern
>> means we had determined it was a fixed
>> pattern and overridden the user selection of engine.
>
> They're two flags because p->fixed is "--fixed-strings", and p->is_fixed
> is "there's no metachars here". So the former case needs escaping, as
> the code just below might do (the two aren't mutually exclusive).
>
> I don't get how you think we can always fold them into one flag, but
> maybe I'm missing something...
>
>> that at least will give us a logical way to fix the pattern reported
>> in [1] and that currently requires the user to know
>> git's grep internals and know he can skip the "is_fixed" optimization
>> by doing something like :
>>
>>   $ git grep 'foo[ ]bar'
>>
>> [1] https://public-inbox.org/git/20190728235427.41425-1-carenas@gmail.com/
>
> As I noted in a reply there this seems like a way to fix a bug in "next"
> with a config knob. Yes we should fix the bug, but we've had the kwset
> code in git for years without needing this distinction, so after we work
> out the bugs I don't see why we'd need this.
>
> The reason we ignore the user's choice here is because you might
> e.g. set grep.patternType=extended in your config, and you'd still want
> grepping for a fixed "foo" to be fast.

...and more generally, for any future sanity of implementation and
maintenance I think we should only make the promise that we support
certain syntax & semantics, not that the -F, -G, -E, -P options are
guaranteed to dispatch to a given codepath.

Internally we should be free to switch between those, so e.g. if a
pattern is fixed and you configure "basic" regexp, but we know your C
library is faster for those matches with REG_EXTENDED we should just
pass that regardless of -G or -E.

Of course that means we *must* expose the same semantics (to some
reasonable extent), which means I have a lot of bugs in "next" to
address.

I'm just saying that the presence of those bugs means we should be
inclined to fix them / back out certain changes, not work around them
with user-servicable knobs.

  reply	other threads:[~2019-07-29  9:13 UTC|newest]

Thread overview: 70+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2019-07-21 19:40 [PATCH] grep: use custom JIT stack with pcre2 Carlo Marcelo Arenas Belón
2019-07-24 15:14 ` [PATCH 0/3] grep: PCRE JIT fixes Ævar Arnfjörð Bjarmason
2019-07-24 16:18   ` Junio C Hamano
2019-07-24 20:03     ` Ævar Arnfjörð Bjarmason
2019-07-26 15:08   ` [PATCH v2 0/8] grep: PCRE JIT fixes + ab/no-kwset fix Ævar Arnfjörð Bjarmason
2019-07-26 20:27     ` Junio C Hamano
2019-07-29  9:20       ` Ævar Arnfjörð Bjarmason
2019-07-29 16:12         ` Junio C Hamano
2019-07-26 15:08   ` [PATCH v2 1/8] grep: remove overly paranoid BUG(...) code Ævar Arnfjörð Bjarmason
2019-07-26 15:08   ` [PATCH v2 2/8] grep: stop "using" a custom JIT stack with PCRE v2 Ævar Arnfjörð Bjarmason
2019-07-29  0:33     ` Carlo Arenas
2019-07-26 15:08   ` [PATCH v2 3/8] grep: stop using a custom JIT stack with PCRE v1 Ævar Arnfjörð Bjarmason
2019-07-29  1:26     ` Carlo Arenas
2019-07-26 15:08   ` [PATCH v2 4/8] grep: consistently use "p->fixed" in compile_regexp() Ævar Arnfjörð Bjarmason
2019-07-29  1:48     ` Carlo Arenas
2019-07-29  9:05       ` Ævar Arnfjörð Bjarmason
2019-07-29  9:13         ` Ævar Arnfjörð Bjarmason [this message]
2019-07-29 16:23           ` Junio C Hamano
2019-07-26 15:08   ` [PATCH v2 5/8] grep: create a "is_fixed" member in "grep_pat" Ævar Arnfjörð Bjarmason
2019-07-26 15:08   ` [PATCH v2 6/8] grep: stess test PCRE v2 on invalid UTF-8 data Ævar Arnfjörð Bjarmason
2019-07-26 20:34     ` Junio C Hamano
2019-07-26 21:55       ` Ævar Arnfjörð Bjarmason
2019-07-29  3:06     ` Carlo Arenas
2019-11-26 21:50     ` [PATCH] t7812: add missing redirects Andreas Schwab
2019-11-26 22:27       ` Johannes Schindelin
2019-11-26 23:11         ` Andreas Schwab
2019-11-27 11:58           ` Jeff King
2019-11-30  0:46       ` [PATCH] t7812: expect failure for grep -i with invalid UTF-8 data Todd Zullinger
2019-11-30  8:00         ` Andreas Schwab
2019-12-01 16:33           ` Junio C Hamano
2019-12-01 17:09             ` Andreas Schwab
2019-12-01 18:32             ` Todd Zullinger
2019-12-02  6:13               ` Junio C Hamano
2019-07-26 15:08   ` [PATCH v2 7/8] grep: do not enter PCRE2_UTF mode on fixed matching Ævar Arnfjörð Bjarmason
2019-07-26 20:36     ` Junio C Hamano
2019-07-26 15:08   ` [PATCH v2 8/8] grep: optimistically use PCRE2_MATCH_INVALID_UTF Ævar Arnfjörð Bjarmason
2019-07-26 21:07     ` Junio C Hamano
2019-07-26 21:53       ` Ævar Arnfjörð Bjarmason
2019-07-26 21:57         ` Ævar Arnfjörð Bjarmason
2021-01-24  2:12     ` [PATCH v3 0/4] grep: better support invalid UTF-8 haystacks Ævar Arnfjörð Bjarmason
2021-01-24 11:48       ` [PATCH v4 0/2] " Ævar Arnfjörð Bjarmason
2021-01-24 17:28         ` [PATCH v5 " Ævar Arnfjörð Bjarmason
2021-01-24 17:28         ` [PATCH v5 1/2] grep/pcre2 tests: don't rely on invalid UTF-8 data test Ævar Arnfjörð Bjarmason
2021-01-24 17:28         ` [PATCH v5 2/2] grep/pcre2: better support invalid UTF-8 haystacks Ævar Arnfjörð Bjarmason
2021-01-24 11:48       ` [PATCH v4 1/2] grep/pcre2 tests: don't rely on invalid UTF-8 data test Ævar Arnfjörð Bjarmason
2021-01-24 11:48       ` [PATCH v4 2/2] grep/pcre2: better support invalid UTF-8 haystacks Ævar Arnfjörð Bjarmason
2021-01-24 13:53         ` Ramsay Jones
2021-01-24 14:24           ` Ramsay Jones
2021-01-24 14:49             ` Ævar Arnfjörð Bjarmason
2021-01-24 16:10               ` Ramsay Jones
2021-01-24 17:29                 ` Ævar Arnfjörð Bjarmason
2021-01-24  2:12     ` [PATCH v3 1/4] grep/pcre2 tests: don't rely on invalid UTF-8 data test Ævar Arnfjörð Bjarmason
2021-01-24  2:12     ` [PATCH v3 2/4] grep/pcre2: simplify boolean spaghetti Ævar Arnfjörð Bjarmason
2021-01-24  5:33       ` Junio C Hamano
2021-01-24 10:45         ` Johannes Sixt
2021-01-24  2:12     ` [PATCH v3 3/4] grep/pcre2: further " Ævar Arnfjörð Bjarmason
2021-01-24  2:12     ` [PATCH v3 4/4] grep/pcre2: better support invalid UTF-8 haystacks Ævar Arnfjörð Bjarmason
2019-07-24 15:14 ` [PATCH 1/3] grep: remove overly paranoid BUG(...) code Ævar Arnfjörð Bjarmason
2019-07-24 15:14 ` [PATCH 2/3] grep: stop "using" a custom JIT stack with PCRE v2 Ævar Arnfjörð Bjarmason
2019-07-24 16:24   ` Junio C Hamano
2019-07-24 20:06     ` Ævar Arnfjörð Bjarmason
2019-07-25  5:11       ` Carlo Arenas
2019-07-24 15:14 ` [PATCH 3/3] grep: stop using a custom JIT stack with PCRE v1 Ævar Arnfjörð Bjarmason
2019-07-26 13:15   ` Carlo Arenas
2019-07-26 13:50     ` Ævar Arnfjörð Bjarmason
2019-07-26 14:12       ` Carlo Arenas
2019-07-26 14:43         ` Ævar Arnfjörð Bjarmason
2019-07-26 20:26           ` [RFC PATCH 0/2] PCRE1 cleanup Carlo Marcelo Arenas Belón
2019-07-26 20:26             ` [RFC PATCH 1/2] grep: make sure NO_LIBPCRE1_JIT disable JIT in PCRE1 Carlo Marcelo Arenas Belón
2019-07-26 20:26             ` [RFC PATCH 2/2] grep: refactor and simplify PCRE1 support Carlo Marcelo Arenas Belón

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=87zhkx5hlj.fsf@evledraar.gmail.com \
    --to=avarab@gmail.com \
    --cc=Johannes.Schindelin@gmx.de \
    --cc=carenas@gmail.com \
    --cc=dev+git@drbeat.li \
    --cc=git@vger.kernel.org \
    --cc=gitster@pobox.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.