* [PATCH] git-grep: allow patterns starting with -
@ 2006-06-25 15:38 Matthias Lederhofer
  2006-06-25 15:47 ` Timo Hirvonen
  0 siblings, 1 reply; 32+ messages in thread
From: Matthias Lederhofer @ 2006-06-25 15:38 UTC (permalink / raw)
  To: git

Signed-off-by: Matthias Lederhofer <matled@gmx.net>
---
I did not find another way to use patterns starting with -, if it is
possible without the patch please tell me and ignore the patch :)
example:
% git grep -- --bla HEAD HEAD~1 -- --foo
HEAD:--foo/bla:test --bla foo

 builtin-grep.c |    7 ++++++-
 1 files changed, 6 insertions(+), 1 deletions(-)

diff --git a/builtin-grep.c b/builtin-grep.c
index 2e7986c..d0677cc 100644
--- a/builtin-grep.c
+++ b/builtin-grep.c
@@ -817,8 +817,13 @@ int cmd_grep(int argc, const char **argv
 			}
 			usage(builtin_grep_usage);
 		}
-		if (!strcmp("--", arg))
+		if (!strcmp("--", arg)) {
+			if (!opt.pattern_list && argc > 0) {
+				argc--; argv++;
+				add_pattern(&opt, *argv, "command line", 0);
+			}
 			break;
+		}
 		if (*arg == '-')
 			usage(builtin_grep_usage);
 
-- 
1.4.1.rc1.g29f4a-dirty


* Re: [PATCH] git-grep: allow patterns starting with -
  2006-06-25 15:38 [PATCH] git-grep: allow patterns starting with - Matthias Lederhofer
@ 2006-06-25 15:47 ` Timo Hirvonen
  2006-06-25 16:07   ` [PATCH] correct documentation for git grep Matthias Lederhofer
  2006-06-25 16:18   ` [PATCH] git-grep: allow patterns starting with - Matthias Lederhofer
  0 siblings, 2 replies; 32+ messages in thread
From: Timo Hirvonen @ 2006-06-25 15:47 UTC (permalink / raw)
  To: Matthias Lederhofer; +Cc: git

Matthias Lederhofer <matled@gmx.net> wrote:

> Signed-off-by: Matthias Lederhofer <matled@gmx.net>
> ---
> I did not find another way to use patterns starting with -, if it is
> possible without the patch please tell me and ignore the patch :)
> example:
> % git grep -- --bla HEAD HEAD~1 -- --foo
> HEAD:--foo/bla:test --bla foo

git grep -e --bla

It's not very well documented.
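
A minimal sketch in C of why `-e` helps here: the word after `-e` is
taken as a pattern no matter what it looks like, so a leading `-` is
never mistaken for an option. The names `pattern_args` and
`collect_patterns` are made up for this illustration, and the loop is
much simpler than the real option parsing in builtin-grep.c.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define MAX_PATTERNS 16

/* Hypothetical container for this sketch; not git's struct grep_opt. */
struct pattern_args {
	const char *patterns[MAX_PATTERNS];
	int nr;
};

/*
 * Collect patterns from argv[1..argc-1].  Returns the index of the
 * first argument that was not consumed (a tree or a path), or -1 on a
 * usage error.
 */
static int collect_patterns(int argc, const char **argv,
			    struct pattern_args *out)
{
	int i = 1;

	assert(out != NULL);
	out->nr = 0;
	while (i < argc) {
		const char *arg = argv[i];

		if (!strcmp(arg, "--")) {
			if (!out->nr)
				return -1;	/* sketch: pattern must come first */
			return i + 1;		/* the rest are paths */
		}
		if (!strcmp(arg, "-e")) {
			/* The word after -e is always a pattern, even "--bla". */
			if (i + 1 >= argc || out->nr >= MAX_PATTERNS)
				return -1;
			out->patterns[out->nr++] = argv[i + 1];
			i += 2;
			continue;
		}
		if (*arg == '-')
			return -1;		/* unknown option in this sketch */
		break;				/* first non-option word */
	}
	if (!out->nr) {
		/* No -e given: the first remaining word is the pattern. */
		if (i >= argc)
			return -1;
		out->patterns[out->nr++] = argv[i++];
	}
	return i;
}
```

With the arguments `grep -e --bla HEAD` the function stores `--bla` as
the only pattern and returns the index of `HEAD`; with `grep --bla` it
reports a usage error, which is the situation the original mail ran
into.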

-- 
http://onion.dynserv.net/~timo/


* [PATCH] correct documentation for git grep
  2006-06-25 15:47 ` Timo Hirvonen
@ 2006-06-25 16:07   ` Matthias Lederhofer
  2006-06-25 23:10     ` Johannes Schindelin
  2006-06-25 16:18   ` [PATCH] git-grep: allow patterns starting with - Matthias Lederhofer
  1 sibling, 1 reply; 32+ messages in thread
From: Matthias Lederhofer @ 2006-06-25 16:07 UTC (permalink / raw)
  To: Timo Hirvonen; +Cc: git

---
> git grep -e --bla
> 
> It's not very well documented.
Let's change that!

 Documentation/git-grep.txt |    7 ++++++-
 1 files changed, 6 insertions(+), 1 deletions(-)

diff --git a/Documentation/git-grep.txt b/Documentation/git-grep.txt
index 7b810df..62a8e7f 100644
--- a/Documentation/git-grep.txt
+++ b/Documentation/git-grep.txt
@@ -16,7 +16,7 @@ SYNOPSIS
 	   [-n] [-l | --files-with-matches] [-L | --files-without-match]
 	   [-c | --count]
 	   [-A <post-context>] [-B <pre-context>] [-C <context>]
-	   [-f <file>] [-e <pattern>]
+	   [-f <file>] [-e] <pattern>
 	   [<tree>...]
 	   [--] [<path>...]
 
@@ -71,6 +71,11 @@ OPTIONS
 -f <file>::
 	Read patterns from <file>, one per line.
 
+-e::
+	The next parameter is the pattern. This option has to be
+	used for patterns starting with - and should be used in
+	scripts passing user input to grep.
+
 `<tree>...`::
 	Search blobs in the trees for specified patterns.
 
-- 
1.4.1.rc1.gc594


* Re: [PATCH] git-grep: allow patterns starting with -
  2006-06-25 15:47 ` Timo Hirvonen
  2006-06-25 16:07   ` [PATCH] correct documentation for git grep Matthias Lederhofer
@ 2006-06-25 16:18   ` Matthias Lederhofer
  1 sibling, 0 replies; 32+ messages in thread
From: Matthias Lederhofer @ 2006-06-25 16:18 UTC (permalink / raw)
  To: Timo Hirvonen; +Cc: git

> Matthias Lederhofer <matled@gmx.net> wrote:
> 
> > Signed-off-by: Matthias Lederhofer <matled@gmx.net>
> > ---
> > I did not find another way to use patterns starting with -, if it is
> > possible without the patch please tell me and ignore the patch :)
> > example:
> > % git grep -- --bla HEAD HEAD~1 -- --foo
> > HEAD:--foo/bla:test --bla foo
> 
> git grep -e --bla
Perhaps the original patch could be applied anyway, for consistency
with GNU grep? :)
But it's really not important to me and well, having -- twice in the
command line is a bit strange too.


* Re: [PATCH] correct documentation for git grep
  2006-06-25 16:07   ` [PATCH] correct documentation for git grep Matthias Lederhofer
@ 2006-06-25 23:10     ` Johannes Schindelin
  2006-06-25 23:39       ` Matthias Lederhofer
  2006-06-26  0:02       ` [PATCH] git-grep: --and to combine patterns with and instead of or Matthias Lederhofer
  0 siblings, 2 replies; 32+ messages in thread
From: Johannes Schindelin @ 2006-06-25 23:10 UTC (permalink / raw)
  To: Matthias Lederhofer; +Cc: Timo Hirvonen, git

Hi,

On Sun, 25 Jun 2006, Matthias Lederhofer wrote:

> +-e::
> +	The next parameter is the pattern. This option has to be
> +	used for patterns starting with - and should be used in
> +	scripts passing user input to grep.

... and by far the most common use is to pass more than one pattern.
Also, the usage is "[-e] <pattern> [-e <pattern>...]".

Ciao,
Dscho


* [PATCH] correct documentation for git grep
  2006-06-25 23:10     ` Johannes Schindelin
@ 2006-06-25 23:39       ` Matthias Lederhofer
  2006-06-26  0:06         ` Matthias Lederhofer
  2006-06-26  0:02       ` [PATCH] git-grep: --and to combine patterns with and instead of or Matthias Lederhofer
  1 sibling, 1 reply; 32+ messages in thread
From: Matthias Lederhofer @ 2006-06-25 23:39 UTC (permalink / raw)
  To: Johannes Schindelin; +Cc: git

Signed-off-by: Matthias Lederhofer <matled@gmx.net>
---
> ... and by far the most common use is to pass more than one pattern.
> Also, the usage is "[-e] <pattern> [-e <pattern>...]".
Ok, so I changed the patch :)

 Documentation/git-grep.txt |    8 +++++++-
 1 files changed, 7 insertions(+), 1 deletions(-)

diff --git a/Documentation/git-grep.txt b/Documentation/git-grep.txt
index 7b810df..3dd1bdd 100644
--- a/Documentation/git-grep.txt
+++ b/Documentation/git-grep.txt
@@ -16,7 +16,7 @@ SYNOPSIS
 	   [-n] [-l | --files-with-matches] [-L | --files-without-match]
 	   [-c | --count]
 	   [-A <post-context>] [-B <pre-context>] [-C <context>]
-	   [-f <file>] [-e <pattern>]
+	   [-f <file>] [-e] <pattern> [-e <pattern> [..]]
 	   [<tree>...]
 	   [--] [<path>...]
 
@@ -71,6 +71,12 @@ OPTIONS
 -f <file>::
 	Read patterns from <file>, one per line.
 
+-e::
+	The next parameter is a pattern. This option has to be
+	used for patterns starting with - and should be used in
+	scripts passing user input to grep. You can specify multiple
+	patterns which will be combined by or.
+
 `<tree>...`::
 	Search blobs in the trees for specified patterns.
 
-- 
1.4.1.rc1.g72a4-dirty


* [PATCH] git-grep: --and to combine patterns with and instead of or
  2006-06-25 23:10     ` Johannes Schindelin
  2006-06-25 23:39       ` Matthias Lederhofer
@ 2006-06-26  0:02       ` Matthias Lederhofer
  2006-06-29 22:20         ` Thomas Glanzmann
  1 sibling, 1 reply; 32+ messages in thread
From: Matthias Lederhofer @ 2006-06-26  0:02 UTC (permalink / raw)
  To: git

Signed-off-by: Matthias Lederhofer <matled@gmx.net>
---
> ... and by far the most common use is to pass more than one pattern.
> Also, the usage is "[-e] <pattern> [-e <pattern>...]".

Here is a patch to allow combination of patterns with 'and' instead of
'or'. This makes it easier to search for combinations of words in a line
without using grep multiple times combined by pipes. So it is still
possible to use -A/-B/-C (something I miss in normal grep). --and
cannot be passed down, so we have to use the built-in version if it is
set.
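
The per-line decision can be sketched in isolation as follows. This is
an illustration only, assuming POSIX <regex.h>; `line_matches` and its
`all_must_match` flag are names invented here, and unlike the patch it
compiles each pattern on every call, which a real implementation would
avoid.

```c
#include <assert.h>
#include <regex.h>
#include <stddef.h>

/*
 * Return 1 if `line` matches, 0 if it does not, -1 if a pattern fails
 * to compile.  With all_must_match set, every pattern has to hit (the
 * proposed --and); otherwise one hit is enough (the default 'or').
 */
static int line_matches(const char *line, const char **patterns, int nr,
			int all_must_match)
{
	int i, hit = 0;

	for (i = 0; i < nr; i++) {
		regex_t re;

		if (regcomp(&re, patterns[i], REG_NOSUB))
			return -1;
		hit = !regexec(&re, line, 0, NULL, 0);
		regfree(&re);

		if (all_must_match && !hit)
			break;		/* one miss decides an 'and' */
		if (!all_must_match && hit)
			break;		/* one hit decides an 'or' */
	}
	return hit;
}
```

Because the whole decision is made for one line at a time, context
options such as -A/-B/-C can still be applied around the matching
line, which is not possible when several greps are chained by pipes.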

 Documentation/git-grep.txt |    5 ++++-
 builtin-grep.c             |   17 +++++++++++++----
 2 files changed, 17 insertions(+), 5 deletions(-)

diff --git a/Documentation/git-grep.txt b/Documentation/git-grep.txt
index ebfe51b..df9d705 100644
--- a/Documentation/git-grep.txt
+++ b/Documentation/git-grep.txt
@@ -16,7 +16,7 @@ SYNOPSIS
 	   [-n] [-l | --files-with-matches] [-L | --files-without-match]
 	   [-c | --count]
 	   [-A <post-context>] [-B <pre-context>] [-C <context>]
-	   [-f <file>] [-e] <pattern> [-e <pattern> [..]]
+	   [-f <file>] [-e] <pattern> [-e <pattern> [..]] [--and]
 	   [<tree>...]
 	   [--] [<path>...]
 
@@ -77,6 +77,9 @@ OPTIONS
 	scripts passing user input to grep. You can specify multiple
 	patterns which will be combined by 'or'.
 
+--and::
+	Combine multiple patterns by 'and' instead of 'or'.
+
 `<tree>...`::
 	Search blobs in the trees for specified patterns.
 
diff --git a/builtin-grep.c b/builtin-grep.c
index d0677cc..a2a034a 100644
--- a/builtin-grep.c
+++ b/builtin-grep.c
@@ -96,6 +96,7 @@ struct grep_opt {
 	regex_t regexp;
 	unsigned linenum:1;
 	unsigned invert:1;
+	unsigned and:1;
 	unsigned name_only:1;
 	unsigned unmatch_name_only:1;
 	unsigned count:1;
@@ -268,7 +269,11 @@ static int grep_buffer(struct grep_opt *
 				    word_char(bol[pmatch[0].rm_eo]))
 					hit = 0;
 			}
-			if (hit)
+			if (opt->and && !hit) {
+				hit = 0;
+				break;
+			}
+			if (!opt->and && hit)
 				break;
 		}
 		/* "grep -v -e foo -e bla" should list lines
@@ -553,10 +558,10 @@ static int grep_cache(struct grep_opt *o
 #ifdef __unix__
 	/*
 	 * Use the external "grep" command for the case where
-	 * we grep through the checked-out files. It tends to
-	 * be a lot more optimized
+	 * we grep through the checked-out files and do not use
+	 * non-standard options. It tends to be a lot more optimized.
 	 */
-	if (!cached) {
+	if (!cached && !opt->and) {
 		hit = external_grep(opt, paths, cached);
 		if (hit >= 0)
 			return hit;
@@ -690,6 +695,10 @@ int cmd_grep(int argc, const char **argv
 			opt.binary = GREP_BINARY_TEXT;
 			continue;
 		}
+		if (!strcmp("--and", arg)) {
+			opt.and = 1;
+			continue;
+		}
 		if (!strcmp("-i", arg) ||
 		    !strcmp("--ignore-case", arg)) {
 			opt.regflags |= REG_ICASE;
-- 
1.4.1.rc1.g72a4-dirty


* [PATCH] correct documentation for git grep
  2006-06-25 23:39       ` Matthias Lederhofer
@ 2006-06-26  0:06         ` Matthias Lederhofer
  2006-06-26  6:59           ` Johannes Schindelin
  0 siblings, 1 reply; 32+ messages in thread
From: Matthias Lederhofer @ 2006-06-26  0:06 UTC (permalink / raw)
  To: git

Signed-off-by: Matthias Lederhofer <matled@gmx.net>
---
The 'or', meaning logical or, should be marked as such in the text. I
did that in the patch with --and too, so if this is accepted the
documentation should be consistent. Sorry for the noise.

 Documentation/git-grep.txt |    8 +++++++-
 1 files changed, 7 insertions(+), 1 deletions(-)

diff --git a/Documentation/git-grep.txt b/Documentation/git-grep.txt
index 7b810df..ebfe51b 100644
--- a/Documentation/git-grep.txt
+++ b/Documentation/git-grep.txt
@@ -16,7 +16,7 @@ SYNOPSIS
 	   [-n] [-l | --files-with-matches] [-L | --files-without-match]
 	   [-c | --count]
 	   [-A <post-context>] [-B <pre-context>] [-C <context>]
-	   [-f <file>] [-e <pattern>]
+	   [-f <file>] [-e] <pattern> [-e <pattern> [..]]
 	   [<tree>...]
 	   [--] [<path>...]
 
@@ -71,6 +71,12 @@ OPTIONS
 -f <file>::
 	Read patterns from <file>, one per line.
 
+-e::
+	The next parameter is a pattern. This option has to be
+	used for patterns starting with - and should be used in
+	scripts passing user input to grep. You can specify multiple
+	patterns which will be combined by 'or'.
+
 `<tree>...`::
 	Search blobs in the trees for specified patterns.
 
-- 
1.4.1.rc1.g72a4-dirty


* Re: [PATCH] correct documentation for git grep
  2006-06-26  0:06         ` Matthias Lederhofer
@ 2006-06-26  6:59           ` Johannes Schindelin
  0 siblings, 0 replies; 32+ messages in thread
From: Johannes Schindelin @ 2006-06-26  6:59 UTC (permalink / raw)
  To: Matthias Lederhofer; +Cc: git

Hi,

On Mon, 26 Jun 2006, Matthias Lederhofer wrote:

> -	   [-f <file>] [-e <pattern>]
> +	   [-f <file>] [-e] <pattern> [-e <pattern> [..]]
>  	   [<tree>...]
>  	   [--] [<path>...]

Minor nit: as you can see from the two latter lines, "<bla>..." is the 
standard notation, whereas "<bla> [..]" is not.

Ciao,
Dscho


* Re: [PATCH] git-grep: --and to combine patterns with and instead of or
  2006-06-26  0:02       ` [PATCH] git-grep: --and to combine patterns with and instead of or Matthias Lederhofer
@ 2006-06-29 22:20         ` Thomas Glanzmann
  2006-06-29 22:44           ` Junio C Hamano
  0 siblings, 1 reply; 32+ messages in thread
From: Thomas Glanzmann @ 2006-06-29 22:20 UTC (permalink / raw)
  To: git; +Cc: Matthias Lederhofer

Hello,

> *AND* more than one pattern. (something I miss in normal grep)

so do I, is it possible to use git-grep outside of git for files that
are not in a repository? Or are there any grep implementations available
which bring this feature to me?

        Thomas


* Re: [PATCH] git-grep: --and to combine patterns with and instead of or
  2006-06-29 22:20         ` Thomas Glanzmann
@ 2006-06-29 22:44           ` Junio C Hamano
  2006-06-30  2:25             ` Matthias Lederhofer
  0 siblings, 1 reply; 32+ messages in thread
From: Junio C Hamano @ 2006-06-29 22:44 UTC (permalink / raw)
  To: Thomas Glanzmann; +Cc: git, Matthias Lederhofer

Thomas Glanzmann <sithglan@stud.uni-erlangen.de> writes:

> Hello,
>
>> *AND* more than one pattern. (something I miss in normal grep)
>
> so do I.

So do I.

I am wondering if we would rather want to do something like the
expressions the `find` command lets you build.  In other words:

	git grep --extended-expression '(' 'foo' -o 'bar' ')' -a 'frotz'

might be what we would eventually want.  And I have this nagging
suspicion that if we allow to say something like this

	git grep --and -e a -e b

right now, it would make it more cumbersome (read: backward
compatibility wart) to support both styles later.

I could be talked into

	git grep -e a -a -e b

but that would already be building that expression engine, so...


* Re: [PATCH] git-grep: --and to combine patterns with and instead of or
  2006-06-29 22:44           ` Junio C Hamano
@ 2006-06-30  2:25             ` Matthias Lederhofer
  2006-06-30  4:13               ` Junio C Hamano
  0 siblings, 1 reply; 32+ messages in thread
From: Matthias Lederhofer @ 2006-06-30  2:25 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git

So here is my proposal how extended expressions would work:

extended expressions have three operators:
    AND, OR (binary), NOT (unary)
extended expressions do not need an extra option. They will be usable
by adding operators between the expressions; a default operator is
used if no operator is specified. The default operator is by default
OR because currently multiple patterns are combined by OR.

OR and AND have precedence, so there are two possibilities, I'd take
the first one.
1. OR, AND:
    This will make it easier to read because OR can be skipped:
      pat1 pat2 AND pat3 pat4
    = pat1 OR pat2 AND pat3 OR pat4
    = (pat1 OR pat2) AND (pat3 OR pat4)
2. AND, OR:
    This is a bit more logical if you think of AND as * and OR as +.
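
The two readings give different results for the same four patterns. A
small C sketch makes the difference concrete; p1..p4 stand for "this
pattern matched the line", and the function names are invented for
this illustration.

```c
#include <assert.h>

/* Reading 1: OR binds tighter, so "p1 p2 AND p3 p4" groups like this. */
static int or_binds_tighter(int p1, int p2, int p3, int p4)
{
	return (p1 || p2) && (p3 || p4);
}

/* Reading 2: AND binds tighter, as * does relative to + in arithmetic. */
static int and_binds_tighter(int p1, int p2, int p3, int p4)
{
	return p1 || (p2 && p3) || p4;
}

/* Count the truth assignments on which the two readings disagree. */
static int count_disagreements(void)
{
	int bits, n = 0;

	for (bits = 0; bits < 16; bits++) {
		int p1 = bits & 1;
		int p2 = (bits >> 1) & 1;
		int p3 = (bits >> 2) & 1;
		int p4 = (bits >> 3) & 1;

		if (or_binds_tighter(p1, p2, p3, p4) !=
		    and_binds_tighter(p1, p2, p3, p4))
			n++;
	}
	return n;
}
```

For example, a line that matches only pat1 is rejected under reading 1
but accepted under reading 2, so the choice of precedence is visible
to users as soon as an explicit AND appears on the command line.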

Parentheses may be used to explicitly override the default precedence.

With this setup we can add an option -FOO (I don't know what to call
it, it is the --and from the patch) which changes the default operator
and the precedence.  With -FOO you'd get AND as default operator and
precedence AND, OR.  Without this option it was easy to write the
formula in a conjunctive form (conjunction of disjunctions), now it is
easy to write a disjunctive form (disjunction of conjunctions):
  pat1 pat2 OR pat3 pat4
= pat1 AND pat2 OR pat3 AND pat4
= (pat1 AND pat2) OR (pat3 AND pat4)

With all this as the plan for extended expressions we may also
introduce -FOO now with exactly the behaviour of --and in my patch,
because currently no explicit operators and parentheses are allowed, so
only the default operator may be used and -FOO would change the default
operator.

A short example:
(pat1 AND pat2 AND pat3) OR pat4
could be written as
-FOO pat1 pat2 pat3 OR pat4
which is imho quite readable.

So the next problem is naming the options. We would need
 - AND: between patterns
 - OR:  between patterns
 - NOT: before a pattern
 - FOO: change default operator and precedence
Unfortunately -o, -a, -n are taken and I think the options should be
unique even though they are only allowed at certain positions of the
argument list. I'll think about it a bit, perhaps someone else has a
good idea. FOO should not be named --and imo but I don't have any idea
for a good name atm.


* Re: [PATCH] git-grep: --and to combine patterns with and instead of or
  2006-06-30  2:25             ` Matthias Lederhofer
@ 2006-06-30  4:13               ` Junio C Hamano
  2006-06-30  7:46                 ` Matthias Lederhofer
  0 siblings, 1 reply; 32+ messages in thread
From: Junio C Hamano @ 2006-06-30  4:13 UTC (permalink / raw)
  To: Matthias Lederhofer; +Cc: git

Matthias Lederhofer <matled@gmx.net> writes:

> OR and AND have precedence, so there are two possibilities, I'd take
> the first one.
> 1. OR, AND:
>     This will make it easier to read because OR can be skipped:
>       pat1 pat2 AND pat3 pat4
>     = pat1 OR pat2 AND pat3 OR pat4
>     = (pat1 OR pat2) AND (pat3 OR pat4)
> 2. AND, OR:
>     This is a bit more logical if you think of AND as * and OR as +.

> ... FOO should not be named --and imo but I don't have any idea
> for a good name atm.

I personally feel FOO should not even exist.  An option that
covers the entire expression to make operator precedence in it
sounds quite evil.  

I would say make --and bind tighter than --or and use
parentheses as needed.  Making --or optional sounds fine as that
would make the default "multiple -e" case similar to what GNU
grep does without any --and nor --or.
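
A rough sketch of that grammar in C, with --and binding tighter than
an optional --or, plus --not and parentheses. To stay short it
evaluates the expression against one line while parsing, and it treats
patterns as fixed substrings via strstr() instead of regexps; the
`cursor` struct and all function names are invented for this
illustration and are not taken from any patch.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Parser state for this sketch: a token cursor plus the line under test. */
struct cursor {
	const char **tok;	/* NULL-terminated token list */
	const char *line;
	int error;
};

static int parse_or(struct cursor *c);

/* Consume the next token if it equals `word`. */
static int accept_tok(struct cursor *c, const char *word)
{
	if (*c->tok && !strcmp(*c->tok, word)) {
		c->tok++;
		return 1;
	}
	return 0;
}

/* atom := "(" or ")" | "-e" <pattern> */
static int parse_atom(struct cursor *c)
{
	if (accept_tok(c, "(")) {
		int v = parse_or(c);
		if (!accept_tok(c, ")"))
			c->error = 1;
		return v;
	}
	if (accept_tok(c, "-e") && *c->tok)
		return strstr(c->line, *c->tok++) != NULL;
	c->error = 1;
	return 0;
}

/* not := "--not" not | atom */
static int parse_not(struct cursor *c)
{
	if (accept_tok(c, "--not"))
		return !parse_not(c);
	return parse_atom(c);
}

/* and := not { "--and" not } */
static int parse_and(struct cursor *c)
{
	int v = parse_not(c);

	while (!c->error && accept_tok(c, "--and")) {
		int rhs = parse_not(c);	/* always parse, then combine */
		v = v && rhs;
	}
	return v;
}

/* or := and { ["--or"] and }; adjacent terms are OR'ed together */
static int parse_or(struct cursor *c)
{
	int v = parse_and(c);

	while (!c->error && *c->tok && strcmp(*c->tok, ")")) {
		int rhs;

		accept_tok(c, "--or");	/* the operator is optional */
		rhs = parse_and(c);
		v = v || rhs;
	}
	return v;
}

/* Returns 1 on match, 0 on no match, -1 on a malformed expression. */
static int eval_expr(const char **tokens, const char *line)
{
	struct cursor c = { tokens, line, 0 };
	int v = parse_or(&c);

	if (c.error || *c.tok)
		return -1;
	return v;
}
```

With the tokens `-e foo -e bar --and -e baz`, a line containing only
"foo" matches, because the expression groups as foo OR (bar AND baz).
Plain `-e a -e b` without any operator keeps behaving as an 'or'.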


* Re: [PATCH] git-grep: --and to combine patterns with and instead of or
  2006-06-30  4:13               ` Junio C Hamano
@ 2006-06-30  7:46                 ` Matthias Lederhofer
  2006-06-30  7:56                   ` Junio C Hamano
  0 siblings, 1 reply; 32+ messages in thread
From: Matthias Lederhofer @ 2006-06-30  7:46 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git

(Junio: please reply to this one, forgot the Cc in the first one :/)

Junio C Hamano wrote:
> Matthias Lederhofer <matled@gmx.net> writes:
> 
> > OR and AND have precedence, so there are two possibilities, I'd take
> > the first one.
> > 1. OR, AND:
> >     This will make it easier to read because OR can be skipped:
> >       pat1 pat2 AND pat3 pat4
> >     = pat1 OR pat2 AND pat3 OR pat4
> >     = (pat1 OR pat2) AND (pat3 OR pat4)
> > 2. AND, OR:
> >     This is a bit more logical if you think of AND as * and OR as +.
> 
> > ... FOO should not be named --and imo but I don't have any idea
> > for a good name atm.
> 
> I personally feel FOO should not even exist.  An option that
> covers the entire expression to make operator precedence in it
> sounds quite evil.  
> 
> I would say make --and bind tighter than --or and use parentheses as
> needed.
Ok, perhaps changing operator precedence is a bit much. What do you
think of that then:
Operator precedence AND, OR. The FOO option changes the default
operator to AND. This also seems quite natural if you think of
AND as * and OR as +:
A B + C D = A * B + C * D = (A * B) + (C * D)

A few examples to give an impression of what the command line could
look like:
A OR B OR (C AND D)    => A B C AND D
(A OR B OR C) AND D    => (A B C) AND D
A AND B AND (C OR D)   => -FOO A B (C OR D)
(A AND B AND C) OR D   => -FOO A B C OR D

Perhaps we even could use options which are similar to * and +, for
example:
 - -* and -+ (-* should not be expanded often but is annoying anyway)
 - -. and -+
 - -t and -p (A -t B is A times B, A -p B is A plus B)

> Making --or optional sounds fine as that
> would make the default "multiple -e" case similar to what GNU
> grep does without any --and nor --or.
That's exactly what I was thinking about: make extended expressions
compatible with the current grep options. This will confuse fewer
people and there is no need for an extra option to activate this.


* Re: [PATCH] git-grep: --and to combine patterns with and instead of or
  2006-06-30  7:46                 ` Matthias Lederhofer
@ 2006-06-30  7:56                   ` Junio C Hamano
  2006-06-30 10:08                     ` [PATCH] git-grep: boolean expression on pattern matching Junio C Hamano
  2006-06-30 10:57                     ` [PATCH] git-grep: --and to combine patterns with and instead of or Matthias Lederhofer
  0 siblings, 2 replies; 32+ messages in thread
From: Junio C Hamano @ 2006-06-30  7:56 UTC (permalink / raw)
  To: Matthias Lederhofer; +Cc: git

Matthias Lederhofer <matled@gmx.net> writes:

> (Junio: please reply to this one, forgot the Cc in the first one :/)

Huh?

>> I personally feel FOO should not even exist.  An option that
>> covers the entire expression to make operator precedence in it
>> sounds quite evil.  
>> 
>> I would say make --and bind tighter than --or and use parentheses as
>> needed.
> Ok, perhaps changing operator precedence is a bit much. What do you
> think of that then:
> ...

I see you are trying hard to think of a way to justify your
original prefix "--and" (or --FOO) implementation, but I simply
do not see much point in that.  I doubt changing the default
operator from --or to --and is less confusing than changing the
precedence for the users, so you would hear the same "I
personally feel FOO should not even exist" objection from me.


* [PATCH] git-grep: boolean expression on pattern matching.
  2006-06-30  7:56                   ` Junio C Hamano
@ 2006-06-30 10:08                     ` Junio C Hamano
  2006-06-30 10:24                       ` Jakub Narebski
  2006-06-30 15:11                       ` Matthias Lederhofer
  2006-06-30 10:57                     ` [PATCH] git-grep: --and to combine patterns with and instead of or Matthias Lederhofer
  1 sibling, 2 replies; 32+ messages in thread
From: Junio C Hamano @ 2006-06-30 10:08 UTC (permalink / raw)
  To: Matthias Lederhofer; +Cc: git

This extends the behaviour of git-grep when multiple -e options
are given.  So far, we allowed multiple -e to behave just like
regular grep with multiple -e, i.e. the patterns are OR'ed
together.

With this change, you can also have multiple patterns AND'ed
together, or form boolean expressions, like this (the
parentheses are quoted from the shell in this example):

	$ git grep -e _PATTERN --and \( -e atom -e token \)

Signed-off-by: Junio C Hamano <junkio@cox.net>
---

 * OR'ing together, admittedly, can be done easily by saying
   something like -e 'atom\|token', so being able to say --and
   as you argued is of more practical importance, and doing
   boolean expression like this might be too much frill.

   Only very lightly tested; it is obviously not slated for
   1.4.1.

 builtin-grep.c |  378 ++++++++++++++++++++++++++++++++++++++++++++++++--------
 1 files changed, 327 insertions(+), 51 deletions(-)

diff --git a/builtin-grep.c b/builtin-grep.c
index 2e7986c..70b1fd2 100644
--- a/builtin-grep.c
+++ b/builtin-grep.c
@@ -82,17 +82,47 @@ static int pathspec_matches(const char *
 	return 0;
 }
 
+enum grep_pat_token {
+	GREP_PATTERN,
+	GREP_AND,
+	GREP_OPEN_PAREN,
+	GREP_CLOSE_PAREN,
+	GREP_NOT,
+	GREP_OR,
+};
+
 struct grep_pat {
 	struct grep_pat *next;
 	const char *origin;
 	int no;
+	enum grep_pat_token token;
 	const char *pattern;
 	regex_t regexp;
 };
 
+enum grep_expr_node {
+	GREP_NODE_ATOM,
+	GREP_NODE_NOT,
+	GREP_NODE_AND,
+	GREP_NODE_OR,
+};
+
+struct grep_expr {
+	enum grep_expr_node node;
+	union {
+		struct grep_pat *atom;
+		struct grep_expr *unary;
+		struct {
+			struct grep_expr *left;
+			struct grep_expr *right;
+		} binary;
+	} u;
+};
+
 struct grep_opt {
 	struct grep_pat *pattern_list;
 	struct grep_pat **pattern_tail;
+	struct grep_expr *pattern_expression;
 	regex_t regexp;
 	unsigned linenum:1;
 	unsigned invert:1;
@@ -105,43 +135,224 @@ #define GREP_BINARY_DEFAULT	0
 #define GREP_BINARY_NOMATCH	1
 #define GREP_BINARY_TEXT	2
 	unsigned binary:2;
+	unsigned extended:1;
 	int regflags;
 	unsigned pre_context;
 	unsigned post_context;
 };
 
 static void add_pattern(struct grep_opt *opt, const char *pat,
-			const char *origin, int no)
+			const char *origin, int no, enum grep_pat_token t)
 {
 	struct grep_pat *p = xcalloc(1, sizeof(*p));
 	p->pattern = pat;
 	p->origin = origin;
 	p->no = no;
+	p->token = t;
 	*opt->pattern_tail = p;
 	opt->pattern_tail = &p->next;
 	p->next = NULL;
 }
 
+static void compile_regexp(struct grep_pat *p, struct grep_opt *opt)
+{
+	int err = regcomp(&p->regexp, p->pattern, opt->regflags);
+	if (err) {
+		char errbuf[1024];
+		char where[1024];
+		if (p->no)
+			sprintf(where, "In '%s' at %d, ",
+				p->origin, p->no);
+		else if (p->origin)
+			sprintf(where, "%s, ", p->origin);
+		else
+			where[0] = 0;
+		regerror(err, &p->regexp, errbuf, 1024);
+		regfree(&p->regexp);
+		die("%s'%s': %s", where, p->pattern, errbuf);
+	}
+}
+
+#if DEBUG
+static inline void indent(int in)
+{
+	int i;
+	for (i = 0; i < in; i++) putchar(' ');
+}
+
+static void dump_pattern_exp(struct grep_expr *x, int in)
+{
+	switch (x->node) {
+	case GREP_NODE_ATOM:
+		indent(in);
+		puts(x->u.atom->pattern);
+		break;
+	case GREP_NODE_NOT:
+		indent(in);
+		puts("--not");
+		dump_pattern_exp(x->u.unary, in+1);
+		break;
+	case GREP_NODE_AND:
+		dump_pattern_exp(x->u.binary.left, in+1);
+		indent(in);
+		puts("--and");
+		dump_pattern_exp(x->u.binary.right, in+1);
+		break;
+	case GREP_NODE_OR:
+		dump_pattern_exp(x->u.binary.left, in+1);
+		indent(in);
+		puts("--or");
+		dump_pattern_exp(x->u.binary.right, in+1);
+		break;
+	}
+}
+
+static void looking_at(const char *msg, struct grep_pat **list)
+{
+	struct grep_pat *p = *list;
+	fprintf(stderr, "%s: looking at ", msg);
+	if (!p)
+		fprintf(stderr, "empty\n");
+	else
+		fprintf(stderr, "<%s>\n", p->pattern);
+}
+#else
+#define looking_at(a,b) do {} while(0)
+#endif
+
+static struct grep_expr *compile_pattern_expr(struct grep_pat **);
+static struct grep_expr *compile_pattern_atom(struct grep_pat **list)
+{
+	struct grep_pat *p;
+	struct grep_expr *x;
+
+	looking_at("atom", list);
+
+	p = *list;
+	switch (p->token) {
+	case GREP_PATTERN: /* atom */
+		x = xcalloc(1, sizeof (struct grep_expr));
+		x->node = GREP_NODE_ATOM;
+		x->u.atom = p;
+		*list = p->next;
+		return x;
+	case GREP_OPEN_PAREN:
+		*list = p->next;
+		x = compile_pattern_expr(list);
+		if (!x)
+			return NULL;
+		if (!*list || (*list)->token != GREP_CLOSE_PAREN)
+			die("unmatched parenthesis");
+		*list = (*list)->next;
+		return x;
+	default:
+		return NULL;
+	}
+}
+
+static struct grep_expr *compile_pattern_not(struct grep_pat **list)
+{
+	struct grep_pat *p;
+	struct grep_expr *x;
+
+	looking_at("not", list);
+
+	p = *list;
+	switch (p->token) {
+	case GREP_NOT:
+		if (!p->next)
+			die("--not not followed by pattern expression");
+		*list = p->next;
+		x = xcalloc(1, sizeof (struct grep_expr));
+		x->node = GREP_NODE_NOT;
+		x->u.unary = compile_pattern_not(list);
+		if (!x->u.unary)
+			die("--not followed by non pattern expression");
+		return x;
+	default:
+		return compile_pattern_atom(list);
+	}
+}
+
+static struct grep_expr *compile_pattern_and(struct grep_pat **list)
+{
+	struct grep_pat *p;
+	struct grep_expr *x, *y, *z;
+
+	looking_at("and", list);
+
+	x = compile_pattern_not(list);
+	p = *list;
+	if (p && p->token == GREP_AND) {
+		if (!p->next)
+			die("--and not followed by pattern expression");
+		*list = p->next;
+		y = compile_pattern_and(list);
+		if (!y)
+			die("--and not followed by pattern expression");
+		z = xcalloc(1, sizeof (struct grep_expr));
+		z->node = GREP_NODE_AND;
+		z->u.binary.left = x;
+		z->u.binary.right = y;
+		return z;
+	}
+	return x;
+}
+
+static struct grep_expr *compile_pattern_or(struct grep_pat **list)
+{
+	struct grep_pat *p;
+	struct grep_expr *x, *y, *z;
+
+	looking_at("or", list);
+
+	x = compile_pattern_and(list);
+	p = *list;
+	if (x && p && p->token != GREP_CLOSE_PAREN) {
+		y = compile_pattern_or(list);
+		if (!y)
+			die("not a pattern expression %s", p->pattern);
+		z = xcalloc(1, sizeof (struct grep_expr));
+		z->node = GREP_NODE_OR;
+		z->u.binary.left = x;
+		z->u.binary.right = y;
+		return z;
+	}
+	return x;
+}
+
+static struct grep_expr *compile_pattern_expr(struct grep_pat **list)
+{
+	looking_at("expr", list);
+
+	return compile_pattern_or(list);
+}
+
 static void compile_patterns(struct grep_opt *opt)
 {
 	struct grep_pat *p;
+
+	/* First compile regexps */
 	for (p = opt->pattern_list; p; p = p->next) {
-		int err = regcomp(&p->regexp, p->pattern, opt->regflags);
-		if (err) {
-			char errbuf[1024];
-			char where[1024];
-			if (p->no)
-				sprintf(where, "In '%s' at %d, ",
-					p->origin, p->no);
-			else if (p->origin)
-				sprintf(where, "%s, ", p->origin);
-			else
-				where[0] = 0;
-			regerror(err, &p->regexp, errbuf, 1024);
-			regfree(&p->regexp);
-			die("%s'%s': %s", where, p->pattern, errbuf);
-		}
+		if (p->token == GREP_PATTERN)
+			compile_regexp(p, opt);
+		else
+			opt->extended = 1;
 	}
+
+	if (!opt->extended)
+		return;
+
+	/* Then bundle them up in an expression.
+	 * A classic recursive descent parser would do.
+	 */
+	p = opt->pattern_list;
+	opt->pattern_expression = compile_pattern_expr(&p);
+#if DEBUG
+	dump_pattern_exp(opt->pattern_expression, 0);
+#endif
+	if (p)
+		die("incomplete pattern expression: %s", p->pattern);
 }
 
 static char *end_of_line(char *cp, unsigned long *left)
@@ -196,6 +407,79 @@ static int fixmatch(const char *pattern,
 	}
 }
 
+static int match_one_pattern(struct grep_opt *opt, struct grep_pat *p, char *bol, char *eol)
+{
+	int hit = 0;
+	regmatch_t pmatch[10];
+
+	if (!opt->fixed) {
+		regex_t *exp = &p->regexp;
+		hit = !regexec(exp, bol, ARRAY_SIZE(pmatch),
+			       pmatch, 0);
+	}
+	else {
+		hit = !fixmatch(p->pattern, bol, pmatch);
+	}
+
+	if (hit && opt->word_regexp) {
+		/* Match beginning must be either
+		 * beginning of the line, or at word
+		 * boundary (i.e. the last char must
+		 * not be alnum or underscore).
+		 */
+		if ((pmatch[0].rm_so < 0) ||
+		    (eol - bol) <= pmatch[0].rm_so ||
+		    (pmatch[0].rm_eo < 0) ||
+		    (eol - bol) < pmatch[0].rm_eo)
+			die("regexp returned nonsense");
+		if (pmatch[0].rm_so != 0 &&
+		    word_char(bol[pmatch[0].rm_so-1]))
+			hit = 0;
+		if (pmatch[0].rm_eo != (eol-bol) &&
+		    word_char(bol[pmatch[0].rm_eo]))
+			hit = 0;
+	}
+	return hit;
+}
+
+static int match_expr_eval(struct grep_opt *opt,
+			   struct grep_expr *x,
+			   char *bol, char *eol)
+{
+	switch (x->node) {
+	case GREP_NODE_ATOM:
+		return match_one_pattern(opt, x->u.atom, bol, eol);
+		break;
+	case GREP_NODE_NOT:
+		return !match_expr_eval(opt, x->u.unary, bol, eol);
+	case GREP_NODE_AND:
+		return (match_expr_eval(opt, x->u.binary.left, bol, eol) &&
+			match_expr_eval(opt, x->u.binary.right, bol, eol));
+	case GREP_NODE_OR:
+		return (match_expr_eval(opt, x->u.binary.left, bol, eol) ||
+			match_expr_eval(opt, x->u.binary.right, bol, eol));
+	}
+	die("Unexpected node type (internal error) %d\n", x->node);
+}
+
+static int match_expr(struct grep_opt *opt, char *bol, char *eol)
+{
+	struct grep_expr *x = opt->pattern_expression;
+	return match_expr_eval(opt, x, bol, eol);
+}
+
+static int match_line(struct grep_opt *opt, char *bol, char *eol)
+{
+	struct grep_pat *p;
+	if (opt->extended)
+		return match_expr(opt, bol, eol);
+	for (p = opt->pattern_list; p; p = p->next) {
+		if (match_one_pattern(opt, p, bol, eol))
+			return 1;
+	}
+	return 0;
+}
+
 static int grep_buffer(struct grep_opt *opt, const char *name,
 		       char *buf, unsigned long size)
 {
@@ -231,46 +515,15 @@ static int grep_buffer(struct grep_opt *
 		hunk_mark = "--\n";
 
 	while (left) {
-		regmatch_t pmatch[10];
 		char *eol, ch;
 		int hit = 0;
-		struct grep_pat *p;
 
 		eol = end_of_line(bol, &left);
 		ch = *eol;
 		*eol = 0;
 
-		for (p = opt->pattern_list; p; p = p->next) {
-			if (!opt->fixed) {
-				regex_t *exp = &p->regexp;
-				hit = !regexec(exp, bol, ARRAY_SIZE(pmatch),
-					       pmatch, 0);
-			}
-			else {
-				hit = !fixmatch(p->pattern, bol, pmatch);
-			}
+		hit = match_line(opt, bol, eol);
 
-			if (hit && opt->word_regexp) {
-				/* Match beginning must be either
-				 * beginning of the line, or at word
-				 * boundary (i.e. the last char must
-				 * not be alnum or underscore).
-				 */
-				if ((pmatch[0].rm_so < 0) ||
-				    (eol - bol) <= pmatch[0].rm_so ||
-				    (pmatch[0].rm_eo < 0) ||
-				    (eol - bol) < pmatch[0].rm_eo)
-					die("regexp returned nonsense");
-				if (pmatch[0].rm_so != 0 &&
-				    word_char(bol[pmatch[0].rm_so-1]))
-					hit = 0;
-				if (pmatch[0].rm_eo != (eol-bol) &&
-				    word_char(bol[pmatch[0].rm_eo]))
-					hit = 0;
-			}
-			if (hit)
-				break;
-		}
 		/* "grep -v -e foo -e bla" should list lines
 		 * that do not have either, so inversion should
 		 * be done outside.
@@ -452,6 +705,8 @@ static int external_grep(struct grep_opt
 	char *argptr = randarg;
 	struct grep_pat *p;
 
+	if (opt->extended)
+		return -1;
 	len = nr = 0;
 	push_arg("grep");
 	if (opt->fixed)
@@ -801,16 +1056,36 @@ int cmd_grep(int argc, const char **argv
 				/* ignore empty line like grep does */
 				if (!buf[0])
 					continue;
-				add_pattern(&opt, strdup(buf), argv[1], ++lno);
+				add_pattern(&opt, strdup(buf), argv[1], ++lno,
+					    GREP_PATTERN);
 			}
 			fclose(patterns);
 			argv++;
 			argc--;
 			continue;
 		}
+		if (!strcmp("--not", arg)) {
+			add_pattern(&opt, arg, "command line", 0, GREP_NOT);
+			continue;
+		}
+		if (!strcmp("--and", arg)) {
+			add_pattern(&opt, arg, "command line", 0, GREP_AND);
+			continue;
+		}
+		if (!strcmp("--or", arg))
+			continue; /* no-op */
+		if (!strcmp("(", arg)) {
+			add_pattern(&opt, arg, "command line", 0, GREP_OPEN_PAREN);
+			continue;
+		}
+		if (!strcmp(")", arg)) {
+			add_pattern(&opt, arg, "command line", 0, GREP_CLOSE_PAREN);
+			continue;
+		}
 		if (!strcmp("-e", arg)) {
 			if (1 < argc) {
-				add_pattern(&opt, argv[1], "-e option", 0);
+				add_pattern(&opt, argv[1], "-e option", 0,
+					    GREP_PATTERN);
 				argv++;
 				argc--;
 				continue;
@@ -824,7 +1099,8 @@ int cmd_grep(int argc, const char **argv
 
 		/* First unrecognized non-option token */
 		if (!opt.pattern_list) {
-			add_pattern(&opt, arg, "command line", 0);
+			add_pattern(&opt, arg, "command line", 0,
+				    GREP_PATTERN);
 			break;
 		}
 		else {
-- 
1.4.1.rc2.gfff62

^ permalink raw reply related	[flat|nested] 32+ messages in thread
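
The evaluation step of the patch above (match_expr_eval) can be tried in isolation. What follows is only a minimal sketch, not the code from the patch: the names node_kind, struct expr and eval_expr are made up for illustration, atoms are fixed strings matched with strstr() instead of compiled regexps, and the tree for the example from the commit message is built by hand rather than by a parser.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical node kinds, loosely modelled on GREP_NODE_* in the patch. */
enum node_kind { NODE_ATOM, NODE_NOT, NODE_AND, NODE_OR };

struct expr {
	enum node_kind kind;
	const char *atom;         /* NODE_ATOM: fixed string to look for */
	const struct expr *left;  /* NODE_NOT uses only this operand */
	const struct expr *right;
};

/* Return 1 if the line satisfies the expression, 0 otherwise. */
static int eval_expr(const struct expr *x, const char *line)
{
	switch (x->kind) {
	case NODE_ATOM:
		return strstr(line, x->atom) != NULL;
	case NODE_NOT:
		return !eval_expr(x->left, line);
	case NODE_AND:
		return eval_expr(x->left, line) && eval_expr(x->right, line);
	case NODE_OR:
		return eval_expr(x->left, line) || eval_expr(x->right, line);
	}
	assert(!"unexpected node kind");
	return 0;
}

/* Hand-built tree for: -e _PATTERN --and \( -e atom -e token \) */
static const struct expr e_pattern = { NODE_ATOM, "_PATTERN", NULL, NULL };
static const struct expr e_atom = { NODE_ATOM, "atom", NULL, NULL };
static const struct expr e_token = { NODE_ATOM, "token", NULL, NULL };
static const struct expr e_or = { NODE_OR, NULL, &e_atom, &e_token };
static const struct expr e_and = { NODE_AND, NULL, &e_pattern, &e_or };
```

A line is a hit for e_and only if it contains "_PATTERN" and at least one of "atom" or "token"; && and || short-circuit, so the right operand is not evaluated once the left one decides the result.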

* Re: [PATCH] git-grep: boolean expression on pattern matching.
  2006-06-30 10:08                     ` [PATCH] git-grep: boolean expression on pattern matching Junio C Hamano
@ 2006-06-30 10:24                       ` Jakub Narebski
  2006-06-30 10:29                         ` Junio C Hamano
  2006-06-30 15:11                       ` Matthias Lederhofer
  1 sibling, 1 reply; 32+ messages in thread
From: Jakub Narebski @ 2006-06-30 10:24 UTC (permalink / raw)
  To: git

Junio C Hamano wrote:

> This extends the behaviour of git-grep when multiple -e options
> are given.  So far, we allowed multiple -e to behave just like
> regular grep with multiple -e, i.e. the patterns are OR'ed
> together.
> 
> With this change, you can also have multiple patterns AND'ed
> together, or form boolean expressions, like this (the
> parentheses are quoted from the shell in this example):
> 
>       $ git grep -e _PATTERN --and \( -e atom -e token \)

And where is documentation update?
 
-- 
Jakub Narebski
Warsaw, Poland
ShadeHawk on #git

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [PATCH] git-grep: boolean expression on pattern matching.
  2006-06-30 10:24                       ` Jakub Narebski
@ 2006-06-30 10:29                         ` Junio C Hamano
  0 siblings, 0 replies; 32+ messages in thread
From: Junio C Hamano @ 2006-06-30 10:29 UTC (permalink / raw)
  To: Jakub Narebski; +Cc: git

Jakub Narebski <jnareb@gmail.com> writes:

>> 
>>       $ git grep -e _PATTERN --and \( -e atom -e token \)
>
> And where is documentation update?

Heh, real men do not do documentation ;-).

I am going to bed now, and am hoping a kind soul would be
sending out a patch while I am sleeping.

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [PATCH] git-grep: --and to combine patterns with and instead of or
  2006-06-30  7:56                   ` Junio C Hamano
  2006-06-30 10:08                     ` [PATCH] git-grep: boolean expression on pattern matching Junio C Hamano
@ 2006-06-30 10:57                     ` Matthias Lederhofer
  2006-06-30 15:57                       ` Junio C Hamano
  1 sibling, 1 reply; 32+ messages in thread
From: Matthias Lederhofer @ 2006-06-30 10:57 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git

Junio C Hamano wrote:
> I see you are trying hard to think of a way to justify your
> original prefix "--and" (or --FOO) implementation, but I simply
> do not see much point in that.  I doubt changing the default
> operator from --or to --and is less confusing than changing the
> precedence for the users, so you would hear the same "I
> personally feel FOO should not even exist" objection from me.

It just happens to make more sense to me and I don't see a reason not to
add this. If no one else is interested in this I'll just stop arguing :)
Here again an overview of the arguments if anyone is interested:
- Less to type for common searches using only AND (or more ANDs than
  ORs).
- Easy to implement (both with and without extended expressions).
- AND/* is the normal implicit operator in other contexts than grep
  (math).
- The high precedence operator (AND) should be implicit rather than
  the low precedence one (OR) (so this is only fulfilled when the
  option is used).

^ permalink raw reply	[flat|nested] 32+ messages in thread
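
The precedence being argued about (explicit --and binding tighter than the implicit OR between adjacent patterns) can be sketched as a small recursive descent parser. This is an illustration under assumptions, not the parser from the patch: struct parser, parse_or and line_matches are hypothetical names, patterns are fixed strings, every pattern must be introduced by -e, the expression is evaluated against one line while parsing instead of building a tree, and syntax errors are reported with assert() only.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical parser state: NULL-terminated token list plus the line under test. */
struct parser {
	const char *const *tok;
	const char *line;
};

static int parse_or(struct parser *p);

/* True if the current token exists and equals s. */
static int at(const struct parser *p, const char *s)
{
	return *p->tok && !strcmp(*p->tok, s);
}

/* unary := "--not" unary | "(" or ")" | "-e" PATTERN */
static int parse_unary(struct parser *p)
{
	int v;

	if (at(p, "--not")) {
		p->tok++;
		return !parse_unary(p);
	}
	if (at(p, "(")) {
		p->tok++;
		v = parse_or(p);
		assert(at(p, ")"));
		p->tok++;
		return v;
	}
	assert(at(p, "-e") && p->tok[1]);
	v = strstr(p->line, p->tok[1]) != NULL;
	p->tok += 2;
	return v;
}

/* and := unary ( "--and" unary )* */
static int parse_and(struct parser *p)
{
	int v = parse_unary(p);

	while (at(p, "--and")) {
		int rhs;

		p->tok++;
		rhs = parse_unary(p); /* always parse, so the tokens are consumed */
		v = v && rhs;
	}
	return v;
}

/* or := and ( [ "--or" ] and )*   ("--or" is optional between terms) */
static int parse_or(struct parser *p)
{
	int v = parse_and(p);

	while (*p->tok && !at(p, ")")) {
		int rhs;

		if (at(p, "--or"))
			p->tok++;
		rhs = parse_and(p);
		v = v || rhs;
	}
	return v;
}

/* Evaluate a whole token list against one line. */
static int line_matches(const char *const *tokens, const char *line)
{
	struct parser p = { tokens, line };
	int v = parse_or(&p);

	assert(!*p.tok); /* every token must have been consumed */
	return v;
}
```

With these rules "-e A -e B --and -e C" reads as A or (B and C), so parentheses are needed to get (A or B) and C.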

* Re: [PATCH] git-grep: boolean expression on pattern matching.
  2006-06-30 10:08                     ` [PATCH] git-grep: boolean expression on pattern matching Junio C Hamano
  2006-06-30 10:24                       ` Jakub Narebski
@ 2006-06-30 15:11                       ` Matthias Lederhofer
  1 sibling, 0 replies; 32+ messages in thread
From: Matthias Lederhofer @ 2006-06-30 15:11 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git

> This extends the behaviour of git-grep when multiple -e options
> are given.  So far, we allowed multiple -e to behave just like
> regular grep with multiple -e, i.e. the patterns are OR'ed
> together.
> 
> With this change, you can also have multiple patterns AND'ed
> together, or form boolean expressions, like this (the
> parentheses are quoted from the shell in this example):
> 
> 	$ git grep -e _PATTERN --and \( -e atom -e token \)
This looks really nice. So far, in a few trivial tests, it did not fail :)

I noticed an unrelated bug. The context separators ("--") are missing
between matches in different files:

$ git-grep -e foobar -A 1 (this uses external grep)
Documentation/git-diff-tree.txt:I.e. "foo" does not pick up `foobar.h`.  "foo" does match `foo/bar.h`
Documentation/git-diff-tree.txt-so it can be used to name subdirectories.
--
git-send-email.perl:#$initial_reply_to = ''; #<20050203173208.GA23964@foobar.com>';
git-send-email.perl-
--
[..]

$ git-grep -e foobar -A 1 master (this is internal grep)
master:Documentation/git-diff-tree.txt:I.e. "foo" does not pick up `foobar.h`.  "foo" does match `foo/bar.h`
master:Documentation/git-diff-tree.txt-so it can be used to name subdirectories.
master:git-send-email.perl:#$initial_reply_to = ''; #<20050203173208.GA23964@foobar.com>';
master:git-send-email.perl-
[..]

I think this cannot be fixed in the loop in builtin-grep.c:grep_cache
because after the last hit there should be no separator but it is not
known if a grep_sha1/grep_file will match and produce output. So I
think there has to be a variable passed down which tells those
functions to print the separator before any other output.

^ permalink raw reply	[flat|nested] 32+ messages in thread
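
The fix suggested in the message above (pass a variable down so that the per-file function prints the separator itself, before any other output) could look roughly like this. It is a sketch under assumptions, not code from builtin-grep.c: struct out_state, emit and show_file_hits are hypothetical names, output is collected in a buffer instead of being written to stdout, and separators between hunks within one file are left out.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical output state shared by all per-file grep calls. */
struct out_state {
	char buf[512];      /* collected output, stands in for stdout */
	size_t len;
	int need_separator; /* set once any file has produced output */
};

/* Append a string to the collected output. */
static void emit(struct out_state *st, const char *s)
{
	size_t n = strlen(s);

	assert(st->len + n < sizeof(st->buf));
	memcpy(st->buf + st->len, s, n + 1);
	st->len += n;
}

/*
 * Print the hits of one file.  The separator is written lazily: only
 * when this file really has output and an earlier file has already
 * printed something.  That way no "--" follows the last file.
 */
static void show_file_hits(struct out_state *st, const char *name,
			   const char *const *hits, size_t nr)
{
	size_t i;

	if (!nr)
		return;
	if (st->need_separator)
		emit(st, "--\n");
	for (i = 0; i < nr; i++) {
		emit(st, name);
		emit(st, ":");
		emit(st, hits[i]);
		emit(st, "\n");
	}
	st->need_separator = 1;
}
```

The caller's loop does not need to know whether a later file will match; a file without hits neither prints a separator nor changes the flag.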

* Re: [PATCH] git-grep: --and to combine patterns with and instead of or
  2006-06-30 10:57                     ` [PATCH] git-grep: --and to combine patterns with and instead of or Matthias Lederhofer
@ 2006-06-30 15:57                       ` Junio C Hamano
  2006-06-30 17:04                         ` Matthias Lederhofer
  2006-07-03  7:54                         ` Andreas Ericsson
  0 siblings, 2 replies; 32+ messages in thread
From: Junio C Hamano @ 2006-06-30 15:57 UTC (permalink / raw)
  To: Matthias Lederhofer; +Cc: git

Matthias Lederhofer <matled@gmx.net> writes:

> Junio C Hamano wrote:
>> I see you are trying hard to think of a way to justify your
>> original prefix "--and" (or --FOO) implementation, but I simply
>> do not see much point in that.  I doubt changing the default
>> operator from --or to --and is less confusing than changing the
>> precedence for the users, so you would hear the same "I
>> personally feel FOO should not even exist" objection from me.
>
> It just happens to make more sense to me and I don't see a reason not to
> add this. If no one else is interested in this I'll just stop arguing :)
> Here again an overview of the arguments if anyone is interested:
> - Less to type for common searches using only AND (or more ANDs than
>   ORs).
> - Easy to implement (both with and without extended expressions).
> - AND/* is the normal implicit operator in other contexts than grep
>   (math).
> - The high precedence operator (AND) should be implicit rather than
>   the low precedence one (OR) (so this is only fulfilled when the
>   option is used).

Side note.  It would be interesting to have a slightly different
form of --and called --near.  You would use it like this:

	git grep -C -e AND --near -e OR

to find lines that have AND on them, and within the context
distance there is a line that has OR on it.  The lines that are
hit with such a query are still the ones that have AND on them
(in other words, a line that has OR is used to further filter
out the results so it will be prefixed with '-', not ':', unless
that line happens to also have AND on it).

With your syntax perhaps this is spelled as "--near -C -e AND -e
OR".

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [PATCH] git-grep: --and to combine patterns with and instead of or
  2006-06-30 15:57                       ` Junio C Hamano
@ 2006-06-30 17:04                         ` Matthias Lederhofer
  2006-06-30 17:18                           ` Junio C Hamano
  2006-07-03  7:54                         ` Andreas Ericsson
  1 sibling, 1 reply; 32+ messages in thread
From: Matthias Lederhofer @ 2006-06-30 17:04 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git

> Side note.  It would be interesting to have a slightly different
> form of --and called --near.  You would use it like this:
> 
> 	git grep -C -e AND --near -e OR
> 
> to find lines that have AND on them, and within the context
> distance there is a line that has OR on it.  The lines that are
> hit with such a query are still the ones that have AND on them
> (in other words, a line that has OR is used to further filter
> out the results so it will be prefixed with '-', not ':', unless
> that line happens to also have AND on it).
Nice idea. I don't know about its practical importance, but it
sounds quite handy.  A few questions about this (some or all of those
features may make it quite complex):
1. Should the context of near be the same as -[ABC] or perhaps
   --near=N / --near=N:M (default could be the same as specified by
   -[ABC]).
2. Should it be possible to specify another boolean expression after
   --near? e.g. --near ( -e foo --or ( -e bar --and -e baz )) to match
   if the context contains foo or 'bar and baz'.
3. Is --near just another subexpression? e.g. search for foo with
   either A or B in the context:
   -e foo --and ( --near A --or --near B )
   This does not make sense without 1 and 2.

With some or all of those features, quite mighty and complex
expressions can be built:
-e A --and --near=3:-1 ( -e B --and --near=0:0 ( -e foo --and -e bar ) )
This could mean: find lines containing A and have B in any of the 3
lines before A (without the line containing A). Additionally foo and
bar have to be found on the same line before A.

I'm really not asking for this, just telling about some ideas that
come to my mind for --near.

> With your syntax perhaps this is spelled as "--near -C -e AND -e
> OR".
Huh? What do you mean by "my syntax"? The only thing different is the
option to change the default operator to 'and'.

With the new extended expressions it would be really nice if git-grep
could also be used outside a git repository :)

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [PATCH] git-grep: --and to combine patterns with and instead of or
  2006-06-30 17:04                         ` Matthias Lederhofer
@ 2006-06-30 17:18                           ` Junio C Hamano
  2006-06-30 17:33                             ` Jakub Narebski
  0 siblings, 1 reply; 32+ messages in thread
From: Junio C Hamano @ 2006-06-30 17:18 UTC (permalink / raw)
  To: Matthias Lederhofer; +Cc: git

Matthias Lederhofer <matled@gmx.net> writes:

> 1. Should the context of near be the same as -[ABC] or perhaps
>    --near=N / --near=N:M (default could be the same as specified by
>    -[ABC]).

As an end-user, I do not care either way.

> 2. Should it be possible to specify another boolean expression after
>    --near? e.g. --near ( -e foo --or ( -e bar --and -e baz )) to match
>    if the context contains foo or 'bar and baz'.

I would say why not.

> 3. Is --near just another subexpression? e.g. search for foo with
>    either A or B in the context:
>    -e foo --and ( --near A --or --near B )
>    This does not make sense without 1 and 2.

Ah, interesting.  I was thinking --near to be weaker form of --and,
but you made it to be a unary predicate (like --not).  That
would be neater.

> With some or all of those features, quite mighty and complex
> expressions can be built:
> -e A --and --near=3:-1 ( -e B --and --near=0:0 ( -e foo --and -e bar ) )
> This could mean: find lines containing A and have B in any of the 3
> lines before A (without the line containing A). Additionally foo and
> bar have to be found on the same line before A.

Having said that, I suspect the above made-up example may not be
so useful in practice.  I think a more realistic usage is "I
want to find lines that contain `made-up' and `realistic' but
the paragraph might have been filled by the editor and they may
be found on separate nearby lines.  Instead of saying `-e
made-up --and -e realistic', I would say `-e made-up --near -e
realistic' to find what I want".  That would find the first two
lines of this paragraph, among others.

> With the new extended expressions it would be really nice if git-grep
> could also be used outside a git repository :)

I am not sure about `outside' but it might be useful to extend
the working tree walker and glob filter used there to match what
ls-files uses so that it can do untracked files as well.

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [PATCH] git-grep: --and to combine patterns with and instead of or
  2006-06-30 17:18                           ` Junio C Hamano
@ 2006-06-30 17:33                             ` Jakub Narebski
  2006-06-30 17:49                               ` Matthias Lederhofer
  0 siblings, 1 reply; 32+ messages in thread
From: Jakub Narebski @ 2006-06-30 17:33 UTC (permalink / raw)
  To: git

Junio C Hamano wrote:

> Matthias Lederhofer <matled@gmx.net> writes:

>> 3. Is --near just another subexpression? e.g. search for foo with
>>    either A or B in the context:
>>    -e foo --and ( --near A --or --near B )
>>    This does not make sense without 1 and 2.
> 
> Ah, interesting.  I was thinking --near to be weaker form of --and,
> but you made it to be a unary predicate (like --not).  That
> would be neater.

I think --near _has_ to be a non-symmetric binary operator, i.e. the first
argument specifies the line to be found, and the second argument has to be in
the context of the first line if it is found.

So the above expression would be written as:

  -e foo --near \( A --or B \)


BTW. we can make -e equivalent to --or, and empty (default) operator to
--and, but of course you have to delimit expression from files, i.e. either

  git grep A B C D -- files

or

  git grep -e \( A B C D \) files

which would be equivalent to

  git grep A --and B --and C --and D files

-- 
Jakub Narebski
Warsaw, Poland
ShadeHawk on #git

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [PATCH] git-grep: --and to combine patterns with and instead of or
  2006-06-30 17:33                             ` Jakub Narebski
@ 2006-06-30 17:49                               ` Matthias Lederhofer
  2006-06-30 17:58                                 ` Junio C Hamano
  2006-06-30 18:03                                 ` Jakub Narebski
  0 siblings, 2 replies; 32+ messages in thread
From: Matthias Lederhofer @ 2006-06-30 17:49 UTC (permalink / raw)
  To: Jakub Narebski; +Cc: git

Jakub Narebski wrote:
> I think --near _has_ to be non-symmetric binary operator, i.e. first
> argument specifies line to be found, second argument has to be in context
> for first line if it is found.
> 
> So the above expression would be written as:
> 
>   -e foo --near \( A --or B \)
Why is that?
-e foo --and --near \( -e A --or -e B \)
would mean lines containing foo and either A or B in the context and
-e foo --or  --near \( -e A --or -e B \)
would mean lines containing foo or having A or B in the context.

> BTW. we can make -e equivalent to --or, and empty (default) operator to
> --and, but of course you have to delimit expression from files, i.e. either
> 
>   git grep A B C D -- files
This is incompatible with the current implementation.
'git grep A B C D -- files' means A is the pattern, B, C, D are
revisions and files is the pathspec.

> or
> 
>   git grep -e \( A B C D \) files
> 
> which would be equivalent to
> 
>   git grep A --and B --and C --and D files
I think this could probably be used.  But I think having two different
implicit operators depending on the context is too confusing.

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [PATCH] git-grep: --and to combine patterns with and instead of or
  2006-06-30 17:49                               ` Matthias Lederhofer
@ 2006-06-30 17:58                                 ` Junio C Hamano
  2006-06-30 18:20                                   ` Matthias Lederhofer
  2006-06-30 18:03                                 ` Jakub Narebski
  1 sibling, 1 reply; 32+ messages in thread
From: Junio C Hamano @ 2006-06-30 17:58 UTC (permalink / raw)
  To: Matthias Lederhofer; +Cc: git, Jakub Narebski

Matthias Lederhofer <matled@gmx.net> writes:

> -e foo --or  --near \( -e A --or -e B \)
> would mean lines containing foo or having A or B in the context.

How would that "--near" be useful?  You will see A or B either way.

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [PATCH] git-grep: --and to combine patterns with and instead of or
  2006-06-30 17:49                               ` Matthias Lederhofer
  2006-06-30 17:58                                 ` Junio C Hamano
@ 2006-06-30 18:03                                 ` Jakub Narebski
  2006-06-30 18:16                                   ` Junio C Hamano
  1 sibling, 1 reply; 32+ messages in thread
From: Jakub Narebski @ 2006-06-30 18:03 UTC (permalink / raw)
  To: git

Matthias Lederhofer wrote:

> Jakub Narebski wrote:
>> I think --near _has_ to be non-symmetric binary operator, i.e. first
>> argument specifies line to be found, second argument has to be in context
>> for first line if it is found.
>> 
>> So the above expression would be written as:
>> 
>>   -e foo --near \( A --or B \)
> Why is that?
>   -e foo --and --near \( -e A --or -e B \)
> would mean lines containing foo and either A or B in the context and
>   -e foo --or  --near \( -e A --or -e B \)
> would mean lines containing foo or having A or B in the context.

Because --near needs an expression to check the context for (the context
belongs to the match found by the lhs expression). So

  -e foo --near \( -e A --or -e B \)

means lines containing foo and either A or B in the context _for "foo"_.

--and --near could be shorthand for --and-near, and --or --near for
--or-near... except that the second one doesn't make much sense:

What is the difference between
  -e foo --or --near \( -e A --or -e B \)
and
  -e foo --or \( -e A --or -e B \)

-- 
Jakub Narebski
Warsaw, Poland
ShadeHawk on #git

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [PATCH] git-grep: --and to combine patterns with and instead of or
  2006-06-30 18:03                                 ` Jakub Narebski
@ 2006-06-30 18:16                                   ` Junio C Hamano
  2006-06-30 19:11                                     ` Jakub Narebski
  0 siblings, 1 reply; 32+ messages in thread
From: Junio C Hamano @ 2006-06-30 18:16 UTC (permalink / raw)
  To: jnareb; +Cc: git

Jakub Narebski <jnareb@gmail.com> writes:

> Because --near needs an expression it check context for (context is for
> found match of lhs expression). So
>
>   -e foo --near \( -e A --or -e B \)
>
> means lines containing foo and either A or B in the context _for "foo"_.

The syntax and semantics of --near I suggested (and you are
following) and what Matthias discusses are different and I think
that is why you two are talking past each other.

What I originally suggested is that you can (syntactically)
replace --near with --and.  That is, the LHS is the match and
RHS is "the LHS must match, but in addition RHS must match but
unlike --and RHS does not have to be exactly on the same line
but it is OK if it is a line somewhere nearby".

The --near Matthias talks about is syntactically not like --and
but more like --not.  It takes a condition for a line after
that, and loosens it to cover nearby lines.  So "-e A"
means "the line must have A on it" but "--near -e A" means "the
line must be nearby a line that satisfies `-e A'".

Matthias's "--near EXP" is spelled as "-e '' --near EXP" (the
first one is always true) with our syntax, in other words.

I do not think either of these semantics is invalid; they are
just different.  The version by Matthias is more general and
more expressive.

^ permalink raw reply	[flat|nested] 32+ messages in thread
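
The two readings of --near described in the message above can be put side by side in a few lines. This is a sketch under assumptions, not an implementation anybody posted: near_match and lhs_near_rhs are hypothetical names, patterns are fixed strings, the window is given as non-negative line counts before and after, and the line itself always counts as part of its own context (so the negative offsets floated earlier in the thread, such as --near=3:-1, are not modelled).

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Unary form: line i satisfies "--near -e pat" if any line in the
 * window [i - before, i + after] contains pat.
 */
static int near_match(const char *const *lines, size_t nr, size_t i,
		      size_t before, size_t after, const char *pat)
{
	size_t lo, hi, j;

	assert(i < nr);
	lo = i > before ? i - before : 0;          /* clamp at first line */
	hi = i + after < nr ? i + after : nr - 1;  /* clamp at last line */
	for (j = lo; j <= hi; j++)
		if (strstr(lines[j], pat))
			return 1;
	return 0;
}

/*
 * Binary form "-e lhs --near -e rhs": lhs must be on line i itself,
 * rhs only has to be somewhere within ctx lines of it.
 */
static int lhs_near_rhs(const char *const *lines, size_t nr, size_t i,
			size_t ctx, const char *lhs, const char *rhs)
{
	return strstr(lines[i], lhs) != NULL &&
	       near_match(lines, nr, i, ctx, ctx, rhs);
}
```

The binary form is expressed here through the unary one combined with an ordinary AND, which matches the observation that the unary reading is the more general of the two.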

* Re: [PATCH] git-grep: --and to combine patterns with and instead of or
  2006-06-30 17:58                                 ` Junio C Hamano
@ 2006-06-30 18:20                                   ` Matthias Lederhofer
  0 siblings, 0 replies; 32+ messages in thread
From: Matthias Lederhofer @ 2006-06-30 18:20 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git, Jakub Narebski

Junio C Hamano wrote:
> Matthias Lederhofer <matled@gmx.net> writes:
> 
> > -e foo --or  --near \( -e A --or -e B \)
> > would mean lines containing foo or having A or B in the context.
> 
> How would that "--near" be useful?  You will see A or B either way.
Ok, this example was quite bad.

If --near is binary
-e foo --and ( --near=3:0 -e A --or --near=0:3 -e B )
could not be done anymore, could it (without repeating the first
pattern)? (Find foo with A in the 3 lines before or B in the 3 lines
after the line.)
Without different contexts for multiple --near it probably does not
matter if --near is binary or unary.

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [PATCH] git-grep: --and to combine patterns with and instead of or
  2006-06-30 18:16                                   ` Junio C Hamano
@ 2006-06-30 19:11                                     ` Jakub Narebski
  2006-06-30 20:26                                       ` Junio C Hamano
  0 siblings, 1 reply; 32+ messages in thread
From: Jakub Narebski @ 2006-06-30 19:11 UTC (permalink / raw)
  To: git

Junio C Hamano wrote:

> The --near Matthias talks about is syntactically not like --and
> but more like --not.  It takes a condition for a line after
> that, and loosens it to cover nearby lines.  So "-e A"
> means "the line must have A on it" but "--near -e A" means "the
> line must be nearby a line that satisfies `-e A'".
> 
> Matthias's "--near EXP" is spelled as "-e '' --near EXP" (the
> first one is always true) with our syntax, in other words.
> 
> I do not think either of these semantics is invalid; they are
> just different.  The version by Matthias is more general and
> more expressive.

It also uses the fact that grep searches for _lines_, a fact I had forgotten
about. But if we cannot search for multiline regexps using git-grep,
Matthias's version is truly more expressive, especially with the context-limiting
extension.

-- 
Jakub Narebski
Warsaw, Poland
ShadeHawk on #git

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [PATCH] git-grep: --and to combine patterns with and instead of or
  2006-06-30 19:11                                     ` Jakub Narebski
@ 2006-06-30 20:26                                       ` Junio C Hamano
  0 siblings, 0 replies; 32+ messages in thread
From: Junio C Hamano @ 2006-06-30 20:26 UTC (permalink / raw)
  To: jnareb; +Cc: git

Jakub Narebski <jnareb@gmail.com> writes:

> Matthias's version is truly more expressive, especially with the context-limiting
> extension.

That's orthogonal.  I do not think there is any reason you
cannot make the version whose --near is similar to --and to
understand different ranges for each "neighbor search"
expression using --near=M:N syntax.

Now stop talking and code it up, please ;-).

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [PATCH] git-grep: --and to combine patterns with and instead of or
  2006-06-30 15:57                       ` Junio C Hamano
  2006-06-30 17:04                         ` Matthias Lederhofer
@ 2006-07-03  7:54                         ` Andreas Ericsson
  1 sibling, 0 replies; 32+ messages in thread
From: Andreas Ericsson @ 2006-07-03  7:54 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Matthias Lederhofer, git

Junio C Hamano wrote:
> Matthias Lederhofer <matled@gmx.net> writes:
> 
> 
>>Junio C Hamano wrote:
>>
>>>I see you are trying hard to think of a way to justify your
>>>original prefix "--and" (or --FOO) implementation, but I simply
>>>do not see much point in that.  I doubt changing the default
>>>operator from --or to --and is less confusing than changing the
>>>precedence for the users, so you would hear the same "I
>>>personally feel FOO should not even exist" objection from me.
>>
>>It just happens to make more sense to me and I don't see a reason not to
>>add this. If no one else is interested in this I'll just stop arguing :)
>>Here again an overview of the arguments if anyone is interested:
>>- Less to type for common searches using only AND (or more ANDs than
>>  ORs).
>>- Easy to implement (both with and without extended expressions).
>>- AND/* is the normal implicit operator in other contexts than grep
>>  (math).
>>- The high precedence operator (AND) should be implicit rather than
>>  the low precedence one (OR) (so this is only fulfilled when the
>>  option is used).
> 
> 
> Side note.  It would be interesting to have a slightly different
> form of --and called --near.  You would use it like this:
> 
> 	git grep -C -e AND --near -e OR
> 
> to find lines that have AND on them, and within the context
> distance there is a line that has OR on it.  The lines that are
> hit with such a query are still the ones that have AND on them
> (in other words, a line that has OR is used to further filter
> out the results so it will be prefixed with '-', not ':', unless
> that line happens to also have AND on it).
> 

It would also be neat to have --inside main or some such, to make it 
only check for things inside whatever's printed on the diff --git line.

-- 
Andreas Ericsson                   andreas.ericsson@op5.se
OP5 AB                             www.op5.se
Tel: +46 8-230225                  Fax: +46 8-230231

^ permalink raw reply	[flat|nested] 32+ messages in thread

end of thread

Thread overview: 32+ messages
2006-06-25 15:38 [PATCH] git-grep: allow patterns starting with - Matthias Lederhofer
2006-06-25 15:47 ` Timo Hirvonen
2006-06-25 16:07   ` [PATCH] correct documentation for git grep Matthias Lederhofer
2006-06-25 23:10     ` Johannes Schindelin
2006-06-25 23:39       ` Matthias Lederhofer
2006-06-26  0:06         ` Matthias Lederhofer
2006-06-26  6:59           ` Johannes Schindelin
2006-06-26  0:02       ` [PATCH] git-grep: --and to combine patterns with and instead of or Matthias Lederhofer
2006-06-29 22:20         ` Thomas Glanzmann
2006-06-29 22:44           ` Junio C Hamano
2006-06-30  2:25             ` Matthias Lederhofer
2006-06-30  4:13               ` Junio C Hamano
2006-06-30  7:46                 ` Matthias Lederhofer
2006-06-30  7:56                   ` Junio C Hamano
2006-06-30 10:08                     ` [PATCH] git-grep: boolean expression on pattern matching Junio C Hamano
2006-06-30 10:24                       ` Jakub Narebski
2006-06-30 10:29                         ` Junio C Hamano
2006-06-30 15:11                       ` Matthias Lederhofer
2006-06-30 10:57                     ` [PATCH] git-grep: --and to combine patterns with and instead of or Matthias Lederhofer
2006-06-30 15:57                       ` Junio C Hamano
2006-06-30 17:04                         ` Matthias Lederhofer
2006-06-30 17:18                           ` Junio C Hamano
2006-06-30 17:33                             ` Jakub Narebski
2006-06-30 17:49                               ` Matthias Lederhofer
2006-06-30 17:58                                 ` Junio C Hamano
2006-06-30 18:20                                   ` Matthias Lederhofer
2006-06-30 18:03                                 ` Jakub Narebski
2006-06-30 18:16                                   ` Junio C Hamano
2006-06-30 19:11                                     ` Jakub Narebski
2006-06-30 20:26                                       ` Junio C Hamano
2006-07-03  7:54                         ` Andreas Ericsson
2006-06-25 16:18   ` [PATCH] git-grep: allow patterns starting with - Matthias Lederhofer
