All of lore.kernel.org
 help / color / mirror / Atom feed
From: "Ævar Arnfjörð Bjarmason" <avarab@gmail.com>
To: Junio C Hamano <gitster@pobox.com>
Cc: Jeff King <peff@peff.net>, Julia Lawall <julia.lawall@lip6.fr>,
	Git Mailing List <git@vger.kernel.org>
Subject: Re: git log -S or -G
Date: Tue, 09 Oct 2018 14:45:06 +0200	[thread overview]
Message-ID: <87lg77cmr1.fsf@evledraar.gmail.com> (raw)
In-Reply-To: <xmqqwoqrr8y2.fsf@gitster-ct.c.googlers.com>


On Tue, Oct 09 2018, Junio C Hamano wrote:

> Jeff King <peff@peff.net> writes:
>
>> I think that is the best we could do for "-S", though, which is
>> inherently about counting hits.
>>
>> For "-G", we are literally grepping the diff. It does not seem
>> unreasonable to add the ability to grep only "-" or "+" lines, and the
>> interface for that should be pretty straightforward (a tri-state flag to
>> look in remove, added, or both lines).
>
> Yeah, here is a lunchtime hack that hasn't even been compile tested.
>
>  diff.c             |  4 ++++
>  diff.h             |  2 ++
>  diffcore-pickaxe.c | 22 ++++++++++++++++++++--
>  3 files changed, 26 insertions(+), 2 deletions(-)
>
> diff --git a/diff.c b/diff.c
> index f0c7557b40..d1f2780844 100644
> --- a/diff.c
> +++ b/diff.c
> @@ -5068,6 +5068,10 @@ int diff_opt_parse(struct diff_options *options,
>  	}
>  	else if (!strcmp(arg, "--pickaxe-all"))
>  		options->pickaxe_opts |= DIFF_PICKAXE_ALL;
> +	else if (!strcmp(arg, "--pickaxe-ignore-add"))
> +		options->pickaxe_opts |= DIFF_PICKAXE_IGNORE_ADD;
> +	else if (!strcmp(arg, "--pickaxe-ignore-del"))
> +		options->pickaxe_opts |= DIFF_PICKAXE_IGNORE_DEL;
>  	else if (!strcmp(arg, "--pickaxe-regex"))
>  		options->pickaxe_opts |= DIFF_PICKAXE_REGEX;
>  	else if ((argcount = short_opt('O', av, &optarg))) {
> diff --git a/diff.h b/diff.h
> index a30cc35ec3..147c47ace7 100644
> --- a/diff.h
> +++ b/diff.h
> @@ -358,6 +358,8 @@ int git_config_rename(const char *var, const char *value);
>  				 DIFF_PICKAXE_KIND_OBJFIND)
>
>  #define DIFF_PICKAXE_IGNORE_CASE	32
> +#define DIFF_PICKAXE_IGNORE_ADD	64
> +#define DIFF_PICKAXE_IGNORE_DEL 128
>
>  void diffcore_std(struct diff_options *);
>  void diffcore_fix_diff_index(struct diff_options *);
> diff --git a/diffcore-pickaxe.c b/diffcore-pickaxe.c
> index 800a899c86..826dde6bd4 100644
> --- a/diffcore-pickaxe.c
> +++ b/diffcore-pickaxe.c
> @@ -16,6 +16,7 @@ typedef int (*pickaxe_fn)(mmfile_t *one, mmfile_t *two,
>
>  struct diffgrep_cb {
>  	regex_t *regexp;
> +	struct diff_options *diff_options;
>  	int hit;
>  };
>
> @@ -23,9 +24,14 @@ static void diffgrep_consume(void *priv, char *line, unsigned long len)
>  {
>  	struct diffgrep_cb *data = priv;
>  	regmatch_t regmatch;
> +	unsigned pickaxe_opts = data->diff_options->pickaxe_opts;
>
>  	if (line[0] != '+' && line[0] != '-')
>  		return;
> +	if ((pickaxe_opts & DIFF_PICKAXE_IGNORE_ADD) &&	line[0] == '+')
> +		return;
> +	if ((pickaxe_opts & DIFF_PICKAXE_IGNORE_DEL) &&	line[0] == '-')
> +		return;
>  	if (data->hit)
>  		/*

Looks good, but I wonder if a more general version of this couldn't be
that instead of returning early if the line doesn't start with +/-
above, we have an option to skip that early return.

Then you can simply specify a regex that starts by matching a + or - at
the start of the line, but you also get the poweruser tool of matching
lines around those lines, as tweaked by the -U option. I.e. this:

diff --git a/diffcore-pickaxe.c b/diffcore-pickaxe.c
index 800a899c86..90625a110c 100644
--- a/diffcore-pickaxe.c
+++ b/diffcore-pickaxe.c
@@ -24,15 +24,13 @@ static void diffgrep_consume(void *priv, char *line, unsigned long len)
        struct diffgrep_cb *data = priv;
        regmatch_t regmatch;

-       if (line[0] != '+' && line[0] != '-')
-               return;
        if (data->hit)
                /*
                 * NEEDSWORK: we should have a way to terminate the
                 * caller early.
                 */
                return;
-       data->hit = !regexec_buf(data->regexp, line + 1, len - 1, 1,
+       data->hit = !regexec_buf(data->regexp, line, len, 1,
                                 &regmatch, 0);
 }

That patch obviously breaks existing -G, so it would need to be
optional, but allows me e.g. on git.git to do:

    ~/g/git/git --exec-path=/home/avar/g/git log -G'^ .*marc\.info' -p -U2 -- README.md

To find a change whose first line of context is a line mentioning
marc.info, and then I can use -G'^\+<rx>' to find added lines matching
<rx> etc.

Then the --pickaxe-ignore-add and --pickaxe-ignore-del options in your
patch could just be implemented in terms of that feature, i.e. by
implicitly adding a "^-" or "^\+" to the beginning of the regex,
respectively, and implicitly turning on a new --pickaxe-raw-lines or
whatever we'd call it.

>  		 * NEEDSWORK: we should have a way to terminate the
> @@ -45,13 +51,20 @@ static int diff_grep(mmfile_t *one, mmfile_t *two,
>  	xpparam_t xpp;
>  	xdemitconf_t xecfg;
>
> -	if (!one)
> +	if (!one) {
> +		if (o->pickaxe_opts & DIFF_PICKAXE_IGNORE_ADD)
> +			return 0;
>  		return !regexec_buf(regexp, two->ptr, two->size,
>  				    1, &regmatch, 0);
> -	if (!two)
> +	}
> +	if (!two) {
> +		if (o->pickaxe_opts & DIFF_PICKAXE_IGNORE_DEL)
> +			return 0;
>  		return !regexec_buf(regexp, one->ptr, one->size,
>  				    1, &regmatch, 0);
> +	}
>
> +	ecbdata.diff_options = o;
>  	/*
>  	 * We have both sides; need to run textual diff and see if
>  	 * the pattern appears on added/deleted lines.
> @@ -113,6 +126,11 @@ static int has_changes(mmfile_t *one, mmfile_t *two,
>  {
>  	unsigned int one_contains = one ? contains(one, regexp, kws) : 0;
>  	unsigned int two_contains = two ? contains(two, regexp, kws) : 0;
> +
> +	if (o->pickaxe_opts & DIFF_PICKAXE_IGNORE_ADD)
> +		return one_contains > two_contains;
> +	if (o->pickaxe_opts & DIFF_PICKAXE_IGNORE_DEL)
> +		return one_contains < two_contains;
>  	return one_contains != two_contains;
>  }
>

  reply	other threads:[~2018-10-09 12:45 UTC|newest]

Thread overview: 11+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2018-10-06 15:14 git log -S or -G Julia Lawall
2018-10-06 16:22 ` Ævar Arnfjörð Bjarmason
2018-10-07  0:48   ` Junio C Hamano
2018-10-07  5:21     ` Julia Lawall
2018-10-08 23:09       ` Junio C Hamano
2018-10-09  3:21         ` Jeff King
2018-10-09  3:58           ` Jacob Keller
2018-10-09  6:39             ` Julia Lawall
2018-10-09  5:21           ` Junio C Hamano
2018-10-09 12:45             ` Ævar Arnfjörð Bjarmason [this message]
2018-10-09 13:51               ` Julia Lawall

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=87lg77cmr1.fsf@evledraar.gmail.com \
    --to=avarab@gmail.com \
    --cc=git@vger.kernel.org \
    --cc=gitster@pobox.com \
    --cc=julia.lawall@lip6.fr \
    --cc=peff@peff.net \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.