All of lore.kernel.org
 help / color / mirror / Atom feed
From: Taylor Blau <me@ttaylorr.com>
To: Jeff King <peff@peff.net>
Cc: "Ævar Arnfjörð Bjarmason" <avarab@gmail.com>,
	"Taylor Blau" <me@ttaylorr.com>,
	git@vger.kernel.org, gitster@pobox.com
Subject: Re: [PATCH 2/2] builtin/grep.c: teach '-o', '--only-matching' to 'git-grep'
Date: Wed, 9 May 2018 19:00:10 -0700	[thread overview]
Message-ID: <20180510020010.GA5375@syl.local> (raw)
In-Reply-To: <20180508172517.GA934@sigill.intra.peff.net>

On Tue, May 08, 2018 at 01:25:17PM -0400, Jeff King wrote:
> On Sat, May 05, 2018 at 08:49:43AM +0200, Ævar Arnfjörð Bjarmason wrote:
>
> > > +test_expect_success 'grep --only-matching --heading' '
> > > +	git grep --only-matching --heading --line-number --column mmap file >actual &&
> > > +	test_cmp expected actual
> > > +'
> > > +
> > >  cat >expected <<EOF
> > >  <BOLD;GREEN>hello.c<RESET>
> > >  4:int main(int argc, const <BLACK;BYELLOW>char<RESET> **argv)
> >
> > We should test this a lot more, I think a good way to do that would be
> > to extend this series by first importing GNU grep's -o tests, see
> > http://git.savannah.gnu.org/cgit/grep.git/tree/tests/foad1 they are
> > license-compatible. Then change the grep_test() function to call git
> > grep instead.
>
> I'm trying to figure out what GNU grep's tests are actually checking
> that we don't have. I see:
>
>  - they check that "-i" returns the actual found string in its original
>    case. This seems like a subset of finding a non-trivial regex. I.e.,
>    "foo.*" should find "foobar". We probably should have a test like
>    that.
>
>  - they test multiple hits on the same line, which seems like an
>    important and easy-to-screw-up case; we do that, too.

Agree.

>  - they test everything other thing with and without "-i" because those
>    are apparently separate code paths in GNU grep, but I don't think
>    that applies to us.
>
>  - they test each case with "-b", but we don't have that (we do test
>    with "--column", which is good)

Right, I think that we can ignore these groups.

>  - they test with "-n", which we do here (we don't test without, but
>    that seems like an unlikely bug, knowing how it is implemented)

Fair, let's leave that as is.

>  - they test with -H, but that is already the default for git-grep

Good, we can ignore this one.

>  - they test with context (-C3) for us. It looks like GNU grep omits
>    context lines with "-o", but we show a bunch of blank lines. This is
>    I guess a bug (though it seems kind of an odd combination to specify
>    in the first place)

I'm torn on what to do here. Currently, with -C3, I get a bunch of lines
like:

  <file>-

Which I think is technically _right_, but I agree that it is certainly
an odd combination of things to give to 'git grep'. I think that we
could either:

  1. Continue outputting blank lines,
  2. Ignore '-C<n>' with '-o', or
  3. die() when given this combination.

I think that (1) is the most "correct" (for some definition), so I'm
inclined to adopt that approach. I suppose that (2) is closer to what
GNU grep offers, and that is the point of this series, so perhaps it
makes sense to pick that instead.

I'll go with (2) for now, but I would appreciate others' thoughts before
sending a subsequent re-roll of this series.

> So I think it probably makes sense to just pick through the list I just
> wrote and write our own tests than to try to import GNU grep's specific
> tests (and there's a ton of other unrelated tests in that file that may
> or may not even run with git-grep).

I agree. Since some of these cases are already covered, and some don't
have analogues, I think that it is most sensible to go through the above
and add _those_, instead of porting the whole test suite over from GNU.

> > It should also be tested with the various grep.patternType options to
> > make sure it works with basic, extended, perl, fixed etc.
>
> Hmm, this code is all external to the actual matching. So unless one of
> those is totally screwing up the regmatch_t return, I'm not sure that's
> accomplishing much (and if one of them is, it's totally broken for
> coloring, "-c", and maybe other features).
>
> We've usually taken a pretty white-box approach to our testing, covering
> things that seem likely to go wrong given the general problem space and
> our implementation. But maybe I'm missing something in particular that
> you think might be tricky.
>
> -Peff

Thanks,
Taylor

  reply	other threads:[~2018-05-10  2:00 UTC|newest]

Thread overview: 14+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2018-05-05  4:03 [PATCH 0/2] builtin/grep.c: teach '-o', '--only-matching' Taylor Blau
2018-05-05  4:03 ` [PATCH 1/2] grep.c: extract show_line_header() Taylor Blau
2018-05-05  7:30   ` Eric Sunshine
2018-05-08  0:24     ` Taylor Blau
2018-05-05  4:03 ` [PATCH 2/2] builtin/grep.c: teach '-o', '--only-matching' to 'git-grep' Taylor Blau
2018-05-05  6:49   ` Ævar Arnfjörð Bjarmason
2018-05-08 17:25     ` Jeff King
2018-05-10  2:00       ` Taylor Blau [this message]
2018-05-10  6:40         ` Jeff King
2018-05-05  7:36   ` Eric Sunshine
2018-05-08  0:27     ` Taylor Blau
2018-05-12  3:21 ` [PATCH v2 0/2] builtin/grep.c: teach '-o', '--only-matching' Taylor Blau
2018-05-12  3:21   ` [PATCH v2 1/2] grep.c: extract show_line_header() Taylor Blau
2018-05-12  3:21   ` [PATCH v2 2/2] builtin/grep.c: teach '-o', '--only-matching' to 'git-grep' Taylor Blau

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20180510020010.GA5375@syl.local \
    --to=me@ttaylorr.com \
    --cc=avarab@gmail.com \
    --cc=git@vger.kernel.org \
    --cc=gitster@pobox.com \
    --cc=peff@peff.net \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.