All of lore.kernel.org
 help / color / mirror / Atom feed
* git-grep: option parsing conflicts with prefix-dash searches
@ 2010-02-05 23:09 Jan Engelhardt
  2010-02-05 23:17 ` Jan Engelhardt
  2010-02-05 23:31 ` Junio C Hamano
  0 siblings, 2 replies; 12+ messages in thread
From: Jan Engelhardt @ 2010-02-05 23:09 UTC (permalink / raw)
  To: git

Greetings.


Just about now I wanted to grep for accesses of a particular struct 
member. Needless to say that it was not a very amusing experience.
I would expect that (1) probably fails:

(1)	$ git grep '->cnt' net/ipv4/netfilter/
	error: unknown switch `>'

So far so good, seems reasonable and matches what I would expect from 
most other userspace tools. So let's add -- to terminate the option 
list:

(2)	$ git grep -- '->cnt' net/ipv4/netfilter/
	fatal: bad flag '->cnt' used after filename

*bzzt*. This does not match typical behavior. Let alone that "--"
is not a filename.

What works is (3).

(3)	$ git grep -- -- '->cnt' net/ipv4/netfilter/

But it almost looks like Morse code. Or Perl. Imagine I were to
approxmiately search for all options in iptables's in-code help texts:

	git grep -- -- -- .

I think that overloading "--" was a bad choice. The option parser has
many more awkward behavior, such as not allowing to bundle most
options (`git log -z -p` vs. `git log -zp`) yet forcing it on other
options (`git log -Spattern` vs `git log -S pattern`). The use of
historic counts (cf. `git log -30` and `tail -30`) compared to a more
modern `tail -n30`) also prohibits using many standard parsers
(most notably getopt(3)), as they would recognize that as -3 -0.

As I said, it's a mess. And I know not whether any code can convince
the "but we need to watch compatibility"-sayers, because this would
definitely be a flag change.

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: git-grep: option parsing conflicts with prefix-dash searches
  2010-02-05 23:09 git-grep: option parsing conflicts with prefix-dash searches Jan Engelhardt
@ 2010-02-05 23:17 ` Jan Engelhardt
  2010-02-05 23:27   ` Santi Béjar
  2010-02-05 23:34   ` Junio C Hamano
  2010-02-05 23:31 ` Junio C Hamano
  1 sibling, 2 replies; 12+ messages in thread
From: Jan Engelhardt @ 2010-02-05 23:17 UTC (permalink / raw)
  To: git

On Saturday 2010-02-06 00:09, Jan Engelhardt wrote:

>What works is (3).
>
>(3)	$ git grep -- -- '->cnt' net/ipv4/netfilter/

No, I spoke too soon. This command will search for --, not ->cnt.
So git cannot search for patterns starting with a dash at all,
as I see it. This is getting fun..

>As I said, it's a mess. And I know not whether any code can convince
>the "but we need to watch compatibility"-sayers, because this would
>definitely be a flag change.

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: git-grep: option parsing conflicts with prefix-dash searches
  2010-02-05 23:17 ` Jan Engelhardt
@ 2010-02-05 23:27   ` Santi Béjar
  2010-02-05 23:34   ` Junio C Hamano
  1 sibling, 0 replies; 12+ messages in thread
From: Santi Béjar @ 2010-02-05 23:27 UTC (permalink / raw)
  To: Jan Engelhardt; +Cc: git

On Sat, Feb 6, 2010 at 12:17 AM, Jan Engelhardt <jengelh@medozas.de> wrote:
> On Saturday 2010-02-06 00:09, Jan Engelhardt wrote:
>
>>What works is (3).
>>
>>(3)    $ git grep -- -- '->cnt' net/ipv4/netfilter/
>
> No, I spoke too soon. This command will search for --, not ->cnt.
> So git cannot search for patterns starting with a dash at all,
> as I see it. This is getting fun..

You should use -e:

     -e
           The next parameter is the pattern. This option has to be
used for patterns starting with - and should be used in scripts
passing user input to grep.

The working command is:

$ git grep -e '->cnt' net/ipv4/netfilter/

Although there is not such pattern in net/ipv4/netfilter, maybe you
wanted '->counters' :-)

HTH,
Santi

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: git-grep: option parsing conflicts with prefix-dash searches
  2010-02-05 23:09 git-grep: option parsing conflicts with prefix-dash searches Jan Engelhardt
  2010-02-05 23:17 ` Jan Engelhardt
@ 2010-02-05 23:31 ` Junio C Hamano
  2010-02-06  3:51   ` Jeff King
  1 sibling, 1 reply; 12+ messages in thread
From: Junio C Hamano @ 2010-02-05 23:31 UTC (permalink / raw)
  To: Jan Engelhardt; +Cc: git

Jan Engelhardt <jengelh@medozas.de> writes:

> Just about now I wanted to grep for accesses of a particular struct 
> member. Needless to say that it was not a very amusing experience.
> I would expect that (1) probably fails:
>
> (1)	$ git grep '->cnt' net/ipv4/netfilter/
> 	error: unknown switch `>'
>
> So far so good, seems reasonable and matches what I would expect from 
> most other userspace tools. So let's add -- to terminate the option 
> list:

Also you can say "grep -e '->cnt'".  Not just "git grep" but regular grep
understands this, too.

> (2)	$ git grep -- '->cnt' net/ipv4/netfilter/
> 	fatal: bad flag '->cnt' used after filename
>
> *bzzt*.

This indeed is bzzt, especially if you had a file called "./->cnt" in the
work tree.  That would mean that you cannot tell the command to look for a
pattern in the work tree.

But because you are not giving anything before "--", that "git grep" is
not looking for anything.  Indeed, (2) is a user error.  If you try this:

        $ git grep a -- '->cnt' net/ipv4/netfilter/

does do what the command line specifies:  Look for a pattern "a" in files
whose names match given pathspecs ('->cnt' or 'net/ipv4/netfilter/').

> What works is (3).
>
> (3)	$ git grep -- -- '->cnt' net/ipv4/netfilter/

Huh?  Now I am lost.  Weren't you looking for a pattern "->cnt"?

And if this command looks for and finds the string '->cnt' in files whose
path match net/ipv4/netfilter/ pathspec, I would say it _is_ a bug.

The command line looks for "--" (the first one) as a pattern, and
interprets the second "--" as your attempt to tell git that '->cnt' is not
an option but is a pathspec.  So it looks for a pattern "--" in files
whose names match given pathspecs( again '->cnt' or 'net/ipv4/netfilter/').

> But it almost looks like Morse code.

Indeed.  But did (3) really work?  I tried it myself in a copy of the
kernel repository, and it found lines that contain '--' in files whose
names match net/ipv4/netfilter/ pathspec, as my copy of the kernel source
does not have a file '->cnt' at all.

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: git-grep: option parsing conflicts with prefix-dash searches
  2010-02-05 23:17 ` Jan Engelhardt
  2010-02-05 23:27   ` Santi Béjar
@ 2010-02-05 23:34   ` Junio C Hamano
  1 sibling, 0 replies; 12+ messages in thread
From: Junio C Hamano @ 2010-02-05 23:34 UTC (permalink / raw)
  To: Jan Engelhardt; +Cc: git

Jan Engelhardt <jengelh@medozas.de> writes:

> On Saturday 2010-02-06 00:09, Jan Engelhardt wrote:
>
>>What works is (3).
>>
>>(3)	$ git grep -- -- '->cnt' net/ipv4/netfilter/
>
> No, I spoke too soon. This command will search for --, not ->cnt.
> So git cannot search for patterns starting with a dash at all,
> as I see it. This is getting fun..
>
>>As I said, it's a mess. And I know not whether any code can convince
>>the "but we need to watch compatibility"-sayers, because this would
>>definitely be a flag change.

I guess our mails crossed.  You don't have to worry about "flag change",
as I don't think there is any change necessary.

You just need to learn:

 (1) "-e" can come before a string to tell git: "this might look like
     an option to you but it isn't; it is what I am looking for"; and

 (2) Everything that follows "--" are pathspecs, not revs nor options.

And all is well.

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: git-grep: option parsing conflicts with prefix-dash searches
  2010-02-05 23:31 ` Junio C Hamano
@ 2010-02-06  3:51   ` Jeff King
  2010-02-06  4:53     ` Junio C Hamano
  0 siblings, 1 reply; 12+ messages in thread
From: Jeff King @ 2010-02-06  3:51 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Jan Engelhardt, git

On Fri, Feb 05, 2010 at 03:31:06PM -0800, Junio C Hamano wrote:

> Jan Engelhardt <jengelh@medozas.de> writes:
> 
> > Just about now I wanted to grep for accesses of a particular struct 
> > member. Needless to say that it was not a very amusing experience.
> > I would expect that (1) probably fails:
> >
> > (1)	$ git grep '->cnt' net/ipv4/netfilter/
> > 	error: unknown switch `>'
> >
> > So far so good, seems reasonable and matches what I would expect from 
> > most other userspace tools. So let's add -- to terminate the option 
> > list:
> 
> Also you can say "grep -e '->cnt'".  Not just "git grep" but regular grep
> understands this, too.

GNU grep understands "grep -- '->cnt'", so I find myself typing it a
lot. Even though "--" is used for revision and pathname separation, I
don't think there is a conflict in also using it to separate options
from patterns for the case that there is no "-e" at all. In other words:

  git grep -- foo

is not ambiguous. That "--" could not possibly be separating revisions
from pathnames because we have not yet seen any pattern, which is bogus.
In a case like:

  git grep -e pattern -- foo

I think it is clear that the "--" is separating pathnames, because we
already have a pathname. This matches standard grep, which will treat
the first non-option as a pattern only if we have no "-e" pattern.

So I think we can do just do this:

diff --git a/builtin-grep.c b/builtin-grep.c
index 0ef849c..46ffc1d 100644
--- a/builtin-grep.c
+++ b/builtin-grep.c
@@ -889,6 +889,16 @@ int cmd_grep(int argc, const char **argv, const char *prefix)
 		/* die the same way as if we did it at the beginning */
 		setup_git_directory();
 
+	/*
+	 * skip a -- separator; we know it cannot be
+	 * separating revisions from pathnames if
+	 * we haven't even had any patterns yet
+	 */
+	if (argc > 0 && !opt.pattern_list && !strcmp(argv[0], "--")) {
+		argv++;
+		argc--;
+	}
+
 	/* First unrecognized non-option token */
 	if (argc > 0 && !opt.pattern_list) {
 		append_grep_pattern(&opt, argv[0], "command line", 0,

and make everybody happy. But I admit I haven't thought about it more
than 5 minutes or so, so perhaps there is a case I am missing that will
be ambiguous or confusing. The worst I could come up with is the
double-double-dash case:

  git grep -- pattern revision -- pathname

It is perhaps not as pretty as

  git grep -e pattern revision -- pathname

but I don't think it is ambiguous.

> > (2)	$ git grep -- '->cnt' net/ipv4/netfilter/
> > 	fatal: bad flag '->cnt' used after filename
> >
> > *bzzt*.
> 
> This indeed is bzzt, especially if you had a file called "./->cnt" in the
> work tree.  That would mean that you cannot tell the command to look for a
> pattern in the work tree.
> 
> But because you are not giving anything before "--", that "git grep" is
> not looking for anything.  Indeed, (2) is a user error.  If you try this:

That is not quite true. The way "git grep" is implemented now, it is
actually grepping for "--". parse_options stops at the "--", leaving it
in argv, and then we assume whatever is left by parse_options is a
pattern (since we saw no "-e"). But of course it is looking in the
'->cnt' pathspec (or revision!), which is bogus (and gets you "bad flag used
after filename").

But as you noted with:

> > What works is (3).
> >
> > (3)	$ git grep -- -- '->cnt' net/ipv4/netfilter/

...disambiguating the pathspec with the extra "--" gets it past option
parsing and looking for "--".

So actually my patch above is breaking somebody who truly wanted to grep
for "--" by doing

  git grep --

but that is sufficiently insane that I'm not too worried about it.

-Peff

^ permalink raw reply related	[flat|nested] 12+ messages in thread

* Re: git-grep: option parsing conflicts with prefix-dash searches
  2010-02-06  3:51   ` Jeff King
@ 2010-02-06  4:53     ` Junio C Hamano
  2010-02-06  8:17       ` Miles Bader
  2010-02-06 11:58       ` Jeff King
  0 siblings, 2 replies; 12+ messages in thread
From: Junio C Hamano @ 2010-02-06  4:53 UTC (permalink / raw)
  To: Jeff King; +Cc: Jan Engelhardt, git

Jeff King <peff@peff.net> writes:

> The worst I could come up with is the
> double-double-dash case:
>
>   git grep -- pattern revision -- pathname
>
> It is perhaps not as pretty as
>
>   git grep -e pattern revision -- pathname
>
> but I don't think it is ambiguous.

I don't think if "ambiguous or not" is what we are after to begin with.

I have known GNU extended grep implementations long enough but never saw
that "--" used to quote a pattern.  Is it worth supporting to begin with?

> So actually my patch above is breaking somebody who truly wanted to grep
> for "--" by doing
>
>   git grep --
>
> but that is sufficiently insane that I'm not too worried about it.

I would say "git grep -- pattern" is sufficiently insane enough that
I'm not worried about it at all.  Interpreting "git grep --" as a request
to look for double-dash feels million times saner than that, actually.

Unless somebody comes up with example of that pattern's wide use.  Point
me to some well known open source software's source trees that use "--"
for such a purpose in one of its shell script or Makefile.

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: git-grep: option parsing conflicts with prefix-dash searches
  2010-02-06  4:53     ` Junio C Hamano
@ 2010-02-06  8:17       ` Miles Bader
  2010-02-06 11:58       ` Jeff King
  1 sibling, 0 replies; 12+ messages in thread
From: Miles Bader @ 2010-02-06  8:17 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Jeff King, Jan Engelhardt, git

Junio C Hamano <gitster@pobox.com> writes:
> I have known GNU extended grep implementations long enough but never saw
> that "--" used to quote a pattern.  Is it worth supporting to begin with?

Well, it is a natural consequence of the way command-line parsing works
in typical GNU progs; "--" doesn't mean "files follow", it means "no
more options"....

-Miles

-- 
I'd rather be consing.

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: git-grep: option parsing conflicts with prefix-dash searches
  2010-02-06  4:53     ` Junio C Hamano
  2010-02-06  8:17       ` Miles Bader
@ 2010-02-06 11:58       ` Jeff King
  2010-02-06 17:39         ` Junio C Hamano
  1 sibling, 1 reply; 12+ messages in thread
From: Jeff King @ 2010-02-06 11:58 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Jan Engelhardt, git

On Fri, Feb 05, 2010 at 08:53:36PM -0800, Junio C Hamano wrote:

> >   git grep -- pattern revision -- pathname
> [...]
> I don't think if "ambiguous or not" is what we are after to begin with.
> 
> I have known GNU extended grep implementations long enough but never saw
> that "--" used to quote a pattern.  Is it worth supporting to begin with?

I think so. It was the first thing the original poster in this thread
tried. It is also something I have tried (and still do, then grumblingly
retype "-e pattern"). And it certainly makes sense from a user
perspective; it is the same end-of-options signal that most other
programs take.

So I think it is a convenient interface improvement, nothing more. If it
were somehow onerous to support, I would say that no, it is not worth
it. But it really is only a few lines of code, and I do not think the
behavior change is hurting any real-world cases (which is what I was
trying to show earlier).

I suspect you are not familiar with it because you are enough of an
old-timer to have worked with the many non-GNU greps that require "-e"
to specify a funny pattern and so got used to that habit.

> I would say "git grep -- pattern" is sufficiently insane enough that
> I'm not worried about it at all.  Interpreting "git grep --" as a request
> to look for double-dash feels million times saner than that, actually.

I don't think "grep --" is sane at all, since it is broken under GNU
grep. And because "--" is a special token in option parsing, I would
expect it to need "git grep -e --".

> Unless somebody comes up with example of that pattern's wide use.  Point
> me to some well known open source software's source trees that use "--"
> for such a purpose in one of its shell script or Makefile.

OK. Try:

  http://www.google.com/codesearch?hl=en&sa=N&q=grep.*%5Cs--%5Cs++lang:shell&ct=rr&cs_r=lang:shell

Some are false positives, but it looks like libtool's generated
configure scripts use it (which is in literally hundreds of projects),
openssh's fixpaths script, ffmpeg's configure script, even a use in a
plan9 script.

And that's just the first page of results. So I think I am not the only
one.

-Peff

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: git-grep: option parsing conflicts with prefix-dash searches
  2010-02-06 11:58       ` Jeff King
@ 2010-02-06 17:39         ` Junio C Hamano
  2010-02-07  4:44           ` Jeff King
  0 siblings, 1 reply; 12+ messages in thread
From: Junio C Hamano @ 2010-02-06 17:39 UTC (permalink / raw)
  To: Jeff King; +Cc: Junio C Hamano, Jan Engelhardt, git

Jeff King <peff@peff.net> writes:

> On Fri, Feb 05, 2010 at 08:53:36PM -0800, Junio C Hamano wrote:
>
>> >   git grep -- pattern revision -- pathname
>> [...]
>> I don't think if "ambiguous or not" is what we are after to begin with.
>> 
>> I have known GNU extended grep implementations long enough but never saw
>> that "--" used to quote a pattern.  Is it worth supporting to begin with?
>
> I think so. It was the first thing the original poster in this thread
> tried. It is also something I have tried (and still do, then grumblingly
> retype "-e pattern"). And it certainly makes sense from a user
> perspective; it is the same end-of-options signal that most other
> programs take.

Ok, then let's take that (perhaps before 1.7.0 perhaps after).

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: git-grep: option parsing conflicts with prefix-dash searches
  2010-02-06 17:39         ` Junio C Hamano
@ 2010-02-07  4:44           ` Jeff King
  2010-02-08  1:06             ` Junio C Hamano
  0 siblings, 1 reply; 12+ messages in thread
From: Jeff King @ 2010-02-07  4:44 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Jan Engelhardt, git

On Sat, Feb 06, 2010 at 09:39:32AM -0800, Junio C Hamano wrote:

> > I think so. It was the first thing the original poster in this thread
> > tried. It is also something I have tried (and still do, then grumblingly
> > retype "-e pattern"). And it certainly makes sense from a user
> > perspective; it is the same end-of-options signal that most other
> > programs take.
> 
> Ok, then let's take that (perhaps before 1.7.0 perhaps after).

Here it is with a commit message and some tests. While it is a minor
change, we are pretty late in the release cycle, so perhaps it is best
to leave it post-1.7.0 just to be on the safe side.

-- >8 --
Subject: [PATCH] accept "git grep -- pattern"

Currently the only way to "quote" a grep pattern that might
begin with a dash is to use "git grep -e pattern". This
works just fine, and is also the way right way to do it on
many traditional grep implemenations.

Some people prefer to use "git grep -- pattern", however, as
"--" is the usual "end of options" marker, and at least GNU
grep and Solaris 10 grep support this. This patch makes that
syntax work.

There is a slight behavior change, in that "git grep -- $X"
used to be interpreted as "grep for -- in $X". However, that
usage is questionable. "--" is usually the end-of-options
marker, so "git grep" was unlike many other greps in
treating it as a literal pattern (e.g., both GNU grep and
Solaris 10 grep will treat "grep --" as missing a pattern).

Signed-off-by: Jeff King <peff@peff.net>
---
 builtin-grep.c  |   10 ++++++++++
 t/t7002-grep.sh |   33 +++++++++++++++++++++++++++++++++
 2 files changed, 43 insertions(+), 0 deletions(-)

diff --git a/builtin-grep.c b/builtin-grep.c
index 26d4deb..63d4b95 100644
--- a/builtin-grep.c
+++ b/builtin-grep.c
@@ -861,6 +861,16 @@ int cmd_grep(int argc, const char **argv, const char *prefix)
 			     PARSE_OPT_STOP_AT_NON_OPTION |
 			     PARSE_OPT_NO_INTERNAL_HELP);
 
+	/*
+	 * skip a -- separator; we know it cannot be
+	 * separating revisions from pathnames if
+	 * we haven't even had any patterns yet
+	 */
+	if (argc > 0 && !opt.pattern_list && !strcmp(argv[0], "--")) {
+		argv++;
+		argc--;
+	}
+
 	/* First unrecognized non-option token */
 	if (argc > 0 && !opt.pattern_list) {
 		append_grep_pattern(&opt, argv[0], "command line", 0,
diff --git a/t/t7002-grep.sh b/t/t7002-grep.sh
index 7144f81..0b583cb 100755
--- a/t/t7002-grep.sh
+++ b/t/t7002-grep.sh
@@ -434,4 +434,37 @@ test_expect_success 'grep -Fi' '
 	test_cmp expected actual
 '
 
+test_expect_success 'setup double-dash tests' '
+cat >double-dash <<EOF &&
+--
+->
+other
+EOF
+git add double-dash
+'
+
+cat >expected <<EOF
+double-dash:->
+EOF
+test_expect_success 'grep -- pattern' '
+	git grep -- "->" >actual &&
+	test_cmp expected actual
+'
+test_expect_success 'grep -- pattern -- pathspec' '
+	git grep -- "->" -- double-dash >actual &&
+	test_cmp expected actual
+'
+test_expect_success 'grep -e pattern -- path' '
+	git grep -e "->" -- double-dash >actual &&
+	test_cmp expected actual
+'
+
+cat >expected <<EOF
+double-dash:--
+EOF
+test_expect_success 'grep -e -- -- path' '
+	git grep -e -- -- double-dash >actual &&
+	test_cmp expected actual
+'
+
 test_done
-- 
1.7.0.rc1.58.g769126

^ permalink raw reply related	[flat|nested] 12+ messages in thread

* Re: git-grep: option parsing conflicts with prefix-dash searches
  2010-02-07  4:44           ` Jeff King
@ 2010-02-08  1:06             ` Junio C Hamano
  0 siblings, 0 replies; 12+ messages in thread
From: Junio C Hamano @ 2010-02-08  1:06 UTC (permalink / raw)
  To: Jeff King; +Cc: Junio C Hamano, Jan Engelhardt, git

Jeff King <peff@peff.net> writes:

> Here it is with a commit message and some tests. While it is a minor
> change, we are pretty late in the release cycle, so perhaps it is best
> to leave it post-1.7.0 just to be on the safe side.

Your explanation that "-- marks the end of options" makes perfect sense
and it is not inconsistent with what we do either.  "grep" is an oddball
that among its non-option arguments the first one _can_ be a non-path
(namely, the single pattern you are looking for), but with your patch,
that logic is nicely expressed, too.

I am tempted to push it into 1.7.0 but I'd keep it in 'next' for now.

Thanks.

^ permalink raw reply	[flat|nested] 12+ messages in thread

end of thread, other threads:[~2010-02-08  1:07 UTC | newest]

Thread overview: 12+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2010-02-05 23:09 git-grep: option parsing conflicts with prefix-dash searches Jan Engelhardt
2010-02-05 23:17 ` Jan Engelhardt
2010-02-05 23:27   ` Santi Béjar
2010-02-05 23:34   ` Junio C Hamano
2010-02-05 23:31 ` Junio C Hamano
2010-02-06  3:51   ` Jeff King
2010-02-06  4:53     ` Junio C Hamano
2010-02-06  8:17       ` Miles Bader
2010-02-06 11:58       ` Jeff King
2010-02-06 17:39         ` Junio C Hamano
2010-02-07  4:44           ` Jeff King
2010-02-08  1:06             ` Junio C Hamano

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.