* git grep -P is multiline for negative lookahead/behind
From: Michael Giuffrida @ 2016-08-01  9:09 UTC (permalink / raw)
  To: git

Negative lookahead/lookbehind with `git grep -P` takes the lines
surrounding an otherwise positive match into account. This differs from
`grep -P` behavior. It does not reproduce with *positive*
lookahead/lookbehind.

Example:
$ echo -e 'Bar\nBar Baz\nBat' > test.txt && git add test.txt
$ git grep -P 'Bar(?!\sB)' test.txt  # Find 'Bar' not followed by '\sB'
(no output)

With regular grep, line 1 matches as expected:

$ grep -P 'Bar(?!\sB)' test.txt
Bar

Because \s includes \n, the negative lookahead unexpectedly finds the
B on line 2, and therefore line 1 fails to match the full expression.
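
The difference described above can be reproduced outside git. The sketch below uses Python's `re` module as a stand-in for PCRE (an assumption made for illustration; it is not what `git grep` or `grep` runs) and contrasts matching each line in isolation with matching the whole buffer at once.

```python
import re

# Same three lines as test.txt in the example above.
text = "Bar\nBar Baz\nBat\n"
pattern = re.compile(r"Bar(?!\sB)")  # 'Bar' not followed by whitespace + 'B'

# Per-line matching: the lookahead cannot see past the end of the line.
per_line = [line for line in text.splitlines() if pattern.search(line)]

# Whole-buffer matching: \s matches the newline, so the lookahead on
# line 1 sees the 'B' that starts line 2 and the match is rejected.
whole_buffer = pattern.search(text)

print(per_line)      # ['Bar']
print(whole_buffer)  # None
```

Per-line matching gives the `grep -P` result, while whole-buffer matching gives the empty result seen with `git grep -P`.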

Positive lookahead works like `grep -P`, in that it isn't unexpectedly
multiline:

$ git grep -P 'Bar(?=\sB)' test.txt  # Find 'Bar' followed by '\sB'
test.txt:Bar Baz
$ grep -P 'Bar(?=\sB)' test.txt
Bar Baz
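
One hypothetical model that is consistent with both observations is a two-stage search: a whole-buffer check that decides whether the file can contain any match, followed by ordinary per-line matching. The sketch below is an assumption for illustration only; the function name `two_stage_grep` is made up, Python's `re` stands in for PCRE, and nothing in this thread confirms that `git grep` works this way.

```python
import re


def two_stage_grep(pattern, text):
    """Hypothetical model: whole-buffer prefilter, then per-line matching."""
    regex = re.compile(pattern)
    # Stage 1 (assumed): give up early if the pattern matches nowhere
    # in the buffer when it is searched as one multi-line string.
    if regex.search(text) is None:
        return []
    # Stage 2: report only the lines that match when searched in isolation.
    return [line for line in text.splitlines() if regex.search(line)]


text = "Bar\nBar Baz\nBat\n"
negative = two_stage_grep(r"Bar(?!\sB)", text)
positive = two_stage_grep(r"Bar(?=\sB)", text)

print(negative)  # []
print(positive)  # ['Bar Baz']
```

Under this model the negative lookahead fails in stage 1, because every `Bar` in the buffer is followed by whitespace and a `B`, so no line is ever examined. The positive lookahead passes stage 1 (via the newline after line 1) and stage 2 then reports only line 2.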

Is this expected behavior, and if so, why/where is this documented?

git: git version 2.8.0.rc3.226.g39d4020, also 2.9.2.517.gf8f7adc, both
with libpcre
grep: grep (GNU grep) 2.16


* Re: git grep -P is multiline for negative lookahead/behind
From: Junio C Hamano @ 2016-08-01 21:35 UTC (permalink / raw)
  To: Michael Giuffrida; +Cc: git

Michael Giuffrida <michaelpg@chromium.org> writes:

> Is this expected behavior, and if so, why/where is this documented?

I do not think "git grep" was designed to do multi-line anything,
with or without lookahead.  If you imagine that the implementation
attempts its matches line-by-line, does that explain the observed
symptom?


* Re: git grep -P is multiline for negative lookahead/behind
From: Michael Giuffrida @ 2016-08-04 18:54 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Michael Giuffrida, git

On Mon, Aug 1, 2016 at 2:35 PM, Junio C Hamano <gitster@pobox.com> wrote:
> Michael Giuffrida <michaelpg@chromium.org> writes:
>
>> Is this expected behavior, and if so, why/where is this documented?
>
> I do not think "git grep" was designed to do multi-line anything,
> with or without lookahead.  If you imagine that the implementation
> attempts its matches line-by-line, does that explain the observed
> symptom?

No. If it worked line-by-line, it would produce more results. It is
not producing the expected matches because it *is* considering the
previous line in negative lookbehind, when I don't want or expect it
to. Thus it throws out results that should match.


* Re: git grep -P is multiline for negative lookahead/behind
From: Junio C Hamano @ 2016-08-04 22:09 UTC (permalink / raw)
  To: Michael Giuffrida; +Cc: Git Mailing List

On Thu, Aug 4, 2016 at 11:54 AM, Michael Giuffrida
<michaelpg@chromium.org> wrote:
> On Mon, Aug 1, 2016 at 2:35 PM, Junio C Hamano <gitster@pobox.com> wrote:
>> I do not think "git grep" was designed to do multi-line anything,
>> with or without lookahead.  If you imagine that the implementation
>> attempts its matches line-by-line, does that explain the observed
>> symptom?
>
> No. If it worked line-by-line, it would produce more results. It is
> not producing the expected matches because it *is* considering the
> previous line in negative lookbehind, when I don't want or expect it
> to. Thus it throws out results that should match.

If that is the case I do not know what is going on; perhaps
somebody more familiar with the pcre codepath can help.

Sorry.
