git.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH 0/9] saner xdiff hunk callbacks
@ 2018-11-02  6:31 Jeff King
  2018-11-02  6:35 ` [PATCH 1/9] xdiff: provide a separate emit callback for hunks Jeff King
                   ` (9 more replies)
  0 siblings, 10 replies; 14+ messages in thread
From: Jeff King @ 2018-11-02  6:31 UTC (permalink / raw)
  To: git

As part of my -Wunused-parameter hunt, I noticed that parse_hunk_header()
can read out-of-bounds in the array it is given. But it turns out not to
be a big problem because we only ever use it to parse lines that xdiff
has just generated.

This is fixable, and I'll show my patch to do so at the end of this
email. But it seems rather silly that we have xdiff generate a string
just so that we can parse it from within the same process. So instead I
improved the xdiff interface to pass the actual integers out, made use
of it as appropriate, and then in the final patch we can just drop the
buggy function entirely.

So that's this series, which I think is the better fix:

  [1/9]: xdiff: provide a separate emit callback for hunks
  [2/9]: xdiff-interface: provide a separate consume callback for hunks
  [3/9]: diff: avoid generating unused hunk header lines
  [4/9]: diff: discard hunk headers for patch-ids earlier
  [5/9]: diff: use hunk callback for word-diff
  [6/9]: combine-diff: use an xdiff hunk callback
  [7/9]: diff: convert --check to use a hunk callback
  [8/9]: range-diff: use a hunk callback
  [9/9]: xdiff-interface: drop parse_hunk_header()

 builtin/merge-tree.c |  3 +-
 builtin/rerere.c     |  3 +-
 combine-diff.c       | 67 ++++++++++++++++++++------------------
 diff.c               | 48 ++++++++++++++--------------
 diffcore-pickaxe.c   |  3 +-
 range-diff.c         | 10 +++++-
 xdiff-interface.c    | 76 ++++++++++++++++++--------------------------
 xdiff-interface.h    | 19 ++++++++---
 xdiff/xdiff.h        |  6 +++-
 xdiff/xutils.c       | 21 +++++++++---
 10 files changed, 141 insertions(+), 115 deletions(-)

For reference/comparison, here's the minimal fix patch.

-- >8 --
Subject: [PATCH DO NOT APPLY] xdiff-interface: refactor parse_hunk_header

The parse_hunk_header() function takes its input as a ptr/len pair, but
does not respect the "len" half, and instead treats it like a
NUL-terminated string. This works out in practice because we only ever
parse strings generated by xdiff. These do omit the trailing NUL, but
since they're always valid, we never parse past the newline.

Still, it would be easy to misuse it, so let's fix it.

While we're doing so, we can improve a few other things:

  - by using helpers like skip_prefix_mem(), we can do the whole parse
    within a conditional, avoiding the need to jump around (backwards,
    even!) to the error block with a goto. The result is hopefully much
    more readable.

  - the check for the final "@@" would trigger an error return code if
    it failed, but wouldn't actually emit an error message (like the
    rest of the parse errors do)

  - we used to blindly assume the caller checked that the string starts
    with "@@ -"; we now check it ourselves and let the caller know what
    we found

  - the input line does not need to be modified, and can be marked
    "const"

Signed-off-by: Jeff King <peff@peff.net>
---
The diff is pretty horrid, but hopefully the post-image is pleasant to
read.

 combine-diff.c    | 13 +++++----
 diff.c            |  5 ++--
 xdiff-interface.c | 68 +++++++++++++++++++++++++----------------------
 xdiff-interface.h |  6 ++---
 4 files changed, 50 insertions(+), 42 deletions(-)

diff --git a/combine-diff.c b/combine-diff.c
index 10155e0ec8..0318f39103 100644
--- a/combine-diff.c
+++ b/combine-diff.c
@@ -348,11 +348,14 @@ struct combine_diff_state {
 static void consume_line(void *state_, char *line, unsigned long len)
 {
 	struct combine_diff_state *state = state_;
-	if (5 < len && !memcmp("@@ -", line, 4)) {
-		if (parse_hunk_header(line, len,
-				      &state->ob, &state->on,
-				      &state->nb, &state->nn))
-			return;
+	switch (maybe_parse_hunk_header(line, len,
+					&state->ob, &state->on,
+					&state->nb, &state->nn)) {
+	case 0:
+		break; /* not a hunk header */
+	case -1:
+		return; /* parse error */
+	default:
 		state->lno = state->nb;
 		if (state->nn == 0) {
 			/* @@ -X,Y +N,0 @@ removed Y lines
diff --git a/diff.c b/diff.c
index 8647db3d30..af6c01c4d0 100644
--- a/diff.c
+++ b/diff.c
@@ -1921,8 +1921,9 @@ static void fn_out_diff_words_aux(void *priv, char *line, unsigned long len)
 	struct diff_options *opt = diff_words->opt;
 	const char *line_prefix;
 
-	if (line[0] != '@' || parse_hunk_header(line, len,
-			&minus_first, &minus_len, &plus_first, &plus_len))
+	if (maybe_parse_hunk_header(line, len,
+				    &minus_first, &minus_len,
+				    &plus_first, &plus_len) <= 0)
 		return;
 
 	assert(opt);
diff --git a/xdiff-interface.c b/xdiff-interface.c
index e7af95db86..f574ee49cd 100644
--- a/xdiff-interface.c
+++ b/xdiff-interface.c
@@ -14,49 +14,53 @@ struct xdiff_emit_state {
 	struct strbuf remainder;
 };
 
-static int parse_num(char **cp_p, int *num_p)
+static int parse_num(const char **cp_p, size_t *len, int *num_p)
 {
-	char *cp = *cp_p;
+	const char *cp = *cp_p;
+	const char *end = cp + *len;
 	int num = 0;
 
-	while ('0' <= *cp && *cp <= '9')
+	while (cp < end && '0' <= *cp && *cp <= '9')
 		num = num * 10 + *cp++ - '0';
 	if (!(cp - *cp_p))
-		return -1;
+		return 0;
+	*len -= cp - *cp_p;
 	*cp_p = cp;
 	*num_p = num;
-	return 0;
+	return 1;
 }
 
-int parse_hunk_header(char *line, int len,
-		      int *ob, int *on,
-		      int *nb, int *nn)
+static int parse_hunk_len(const char **cp_p, size_t *len, int *out)
 {
-	char *cp;
-	cp = line + 4;
-	if (parse_num(&cp, ob)) {
-	bad_line:
-		return error("malformed diff output: %s", line);
+	/* hunk lengths are optional; if omitted, default to 1 */
+	if (skip_prefix_mem(*cp_p, *len, ",", cp_p, len)) {
+		if (!parse_num(cp_p, len, out))
+			return 0;
+	} else {
+		*out = 1;
 	}
-	if (*cp == ',') {
-		cp++;
-		if (parse_num(&cp, on))
-			goto bad_line;
-	}
-	else
-		*on = 1;
-	if (*cp++ != ' ' || *cp++ != '+')
-		goto bad_line;
-	if (parse_num(&cp, nb))
-		goto bad_line;
-	if (*cp == ',') {
-		cp++;
-		if (parse_num(&cp, nn))
-			goto bad_line;
-	}
-	else
-		*nn = 1;
-	return -!!memcmp(cp, " @@", 3);
+
+	return 1;
+}
+
+int maybe_parse_hunk_header(const char *line, size_t len,
+			    int *ob, int *on,
+			    int *nb, int *nn)
+{
+	const char *cp;
+
+	if (!skip_prefix_mem(line, len, "@@ -", &cp, &len))
+		return 0;
+
+	if (!parse_num(&cp, &len, ob) ||
+	    !parse_hunk_len(&cp, &len, on) ||
+	    !skip_prefix_mem(cp, len, " +", &cp, &len) ||
+	    !parse_num(&cp, &len, nb) ||
+	    !parse_hunk_len(&cp, &len, nn) ||
+	    !skip_prefix_mem(cp, len, " @@", &cp, &len))
+		return error("malformed diff output: %s", line);
+
+	return 1;
 }
 
 static void consume_one(void *priv_, char *s, unsigned long size)
diff --git a/xdiff-interface.h b/xdiff-interface.h
index 135fc05d72..cddd56bae6 100644
--- a/xdiff-interface.h
+++ b/xdiff-interface.h
@@ -17,9 +17,9 @@ int xdi_diff(mmfile_t *mf1, mmfile_t *mf2, xpparam_t const *xpp, xdemitconf_t co
 int xdi_diff_outf(mmfile_t *mf1, mmfile_t *mf2,
 		  xdiff_emit_consume_fn fn, void *consume_callback_data,
 		  xpparam_t const *xpp, xdemitconf_t const *xecfg);
-int parse_hunk_header(char *line, int len,
-		      int *ob, int *on,
-		      int *nb, int *nn);
+int maybe_parse_hunk_header(const char *line, size_t len,
+			    int *ob, int *on,
+			    int *nb, int *nn);
 int read_mmfile(mmfile_t *ptr, const char *filename);
 void read_mmblob(mmfile_t *ptr, const struct object_id *oid);
 int buffer_is_binary(const char *ptr, unsigned long size);
-- 
2.19.1.1336.g081079ac04


^ permalink raw reply related	[flat|nested] 14+ messages in thread

* [PATCH 1/9] xdiff: provide a separate emit callback for hunks
  2018-11-02  6:31 [PATCH 0/9] saner xdiff hunk callbacks Jeff King
@ 2018-11-02  6:35 ` Jeff King
  2018-11-02  6:35 ` [PATCH 2/9] xdiff-interface: provide a separate consume " Jeff King
                   ` (8 subsequent siblings)
  9 siblings, 0 replies; 14+ messages in thread
From: Jeff King @ 2018-11-02  6:35 UTC (permalink / raw)
  To: git

The xdiff library always emits hunk header lines to our callbacks as
formatted strings like "@@ -a,b +c,d @@\n". This is convenient if we're
going to output a diff, but less so if we actually need to compute using
those numbers, which requires re-parsing the line.

In preparation for moving away from this, let's teach xdiff a new
callback function which gets the broken-out hunk information. To help
callers that don't want to use this new callback, if it's NULL we'll
continue to format the hunk header into a string.

Note that this function renames the "outf" callback to "out_line", as
well. This isn't strictly necessary, but helps in two ways:

  1. Now that there are two callbacks, it's nice to use more descriptive
     names.

  2. Many callers did not zero the emit_callback_data struct, and needed
     to be modified to set ecb.out_hunk to NULL. By changing the name of
     the existing struct member, that guarantees that any new callers
     from in-flight topics will break the build and be examined
     manually.

Signed-off-by: Jeff King <peff@peff.net>
---
 builtin/merge-tree.c |  3 ++-
 builtin/rerere.c     |  3 ++-
 xdiff-interface.c    |  2 +-
 xdiff/xdiff.h        |  6 +++++-
 xdiff/xutils.c       | 21 +++++++++++++++++----
 5 files changed, 27 insertions(+), 8 deletions(-)

diff --git a/builtin/merge-tree.c b/builtin/merge-tree.c
index 8fc108d305..70f6fc9167 100644
--- a/builtin/merge-tree.c
+++ b/builtin/merge-tree.c
@@ -110,7 +110,8 @@ static void show_diff(struct merge_list *entry)
 	xpp.flags = 0;
 	memset(&xecfg, 0, sizeof(xecfg));
 	xecfg.ctxlen = 3;
-	ecb.outf = show_outf;
+	ecb.out_hunk = NULL;
+	ecb.out_line = show_outf;
 	ecb.priv = NULL;
 
 	src.ptr = origin(entry, &size);
diff --git a/builtin/rerere.c b/builtin/rerere.c
index e89ccbc524..d78eeaed32 100644
--- a/builtin/rerere.c
+++ b/builtin/rerere.c
@@ -41,7 +41,8 @@ static int diff_two(const char *file1, const char *label1,
 	xpp.flags = 0;
 	memset(&xecfg, 0, sizeof(xecfg));
 	xecfg.ctxlen = 3;
-	ecb.outf = outf;
+	ecb.out_hunk = NULL;
+	ecb.out_line = outf;
 	ret = xdi_diff(&minus, &plus, &xpp, &xecfg, &ecb);
 
 	free(minus.ptr);
diff --git a/xdiff-interface.c b/xdiff-interface.c
index e7af95db86..25c185aff4 100644
--- a/xdiff-interface.c
+++ b/xdiff-interface.c
@@ -152,7 +152,7 @@ int xdi_diff_outf(mmfile_t *mf1, mmfile_t *mf2,
 	state.consume = fn;
 	state.consume_callback_data = consume_callback_data;
 	memset(&ecb, 0, sizeof(ecb));
-	ecb.outf = xdiff_outf;
+	ecb.out_line = xdiff_outf;
 	ecb.priv = &state;
 	strbuf_init(&state.remainder, 0);
 	ret = xdi_diff(mf1, mf2, xpp, xecfg, &ecb);
diff --git a/xdiff/xdiff.h b/xdiff/xdiff.h
index 2356da5f78..b158369020 100644
--- a/xdiff/xdiff.h
+++ b/xdiff/xdiff.h
@@ -86,7 +86,11 @@ typedef struct s_xpparam {
 
 typedef struct s_xdemitcb {
 	void *priv;
-	int (*outf)(void *, mmbuffer_t *, int);
+	int (*out_hunk)(void *,
+			long old_begin, long old_nr,
+			long new_begin, long new_nr,
+			const char *func, long funclen);
+	int (*out_line)(void *, mmbuffer_t *, int);
 } xdemitcb_t;
 
 typedef long (*find_func_t)(const char *line, long line_len, char *buffer, long buffer_size, void *priv);
diff --git a/xdiff/xutils.c b/xdiff/xutils.c
index 88e5995535..963e1c58b9 100644
--- a/xdiff/xutils.c
+++ b/xdiff/xutils.c
@@ -54,7 +54,7 @@ int xdl_emit_diffrec(char const *rec, long size, char const *pre, long psize,
 		mb[2].size = strlen(mb[2].ptr);
 		i++;
 	}
-	if (ecb->outf(ecb->priv, mb, i) < 0) {
+	if (ecb->out_line(ecb->priv, mb, i) < 0) {
 
 		return -1;
 	}
@@ -344,8 +344,9 @@ int xdl_num_out(char *out, long val) {
 	return str - out;
 }
 
-int xdl_emit_hunk_hdr(long s1, long c1, long s2, long c2,
-		      const char *func, long funclen, xdemitcb_t *ecb) {
+static int xdl_format_hunk_hdr(long s1, long c1, long s2, long c2,
+			       const char *func, long funclen,
+			       xdemitcb_t *ecb) {
 	int nb = 0;
 	mmbuffer_t mb;
 	char buf[128];
@@ -387,9 +388,21 @@ int xdl_emit_hunk_hdr(long s1, long c1, long s2, long c2,
 
 	mb.ptr = buf;
 	mb.size = nb;
-	if (ecb->outf(ecb->priv, &mb, 1) < 0)
+	if (ecb->out_line(ecb->priv, &mb, 1) < 0)
 		return -1;
+	return 0;
+}
 
+int xdl_emit_hunk_hdr(long s1, long c1, long s2, long c2,
+		      const char *func, long funclen,
+		      xdemitcb_t *ecb) {
+	if (!ecb->out_hunk)
+		return xdl_format_hunk_hdr(s1, c1, s2, c2, func, funclen, ecb);
+	if (ecb->out_hunk(ecb->priv,
+			  c1 ? s1 : s1 - 1, c1,
+			  c2 ? s2 : s2 - 1, c2,
+			  func, funclen) < 0)
+		return -1;
 	return 0;
 }
 
-- 
2.19.1.1336.g081079ac04


^ permalink raw reply related	[flat|nested] 14+ messages in thread

* [PATCH 2/9] xdiff-interface: provide a separate consume callback for hunks
  2018-11-02  6:31 [PATCH 0/9] saner xdiff hunk callbacks Jeff King
  2018-11-02  6:35 ` [PATCH 1/9] xdiff: provide a separate emit callback for hunks Jeff King
@ 2018-11-02  6:35 ` Jeff King
  2018-11-02  6:36 ` [PATCH 3/9] diff: avoid generating unused hunk header lines Jeff King
                   ` (7 subsequent siblings)
  9 siblings, 0 replies; 14+ messages in thread
From: Jeff King @ 2018-11-02  6:35 UTC (permalink / raw)
  To: git

The previous commit taught xdiff to optionally provide the hunk header
data to a specialized callback. But most users of xdiff actually use our
more convenient xdi_diff_outf() helper, which ensures that our callbacks
are always fed whole lines.

Let's plumb the special hunk-callback through this interface, too. It
will follow the same rule as xdiff when the hunk callback is NULL (i.e.,
continue to pass a stringified hunk header to the line callback). Since
we add NULL to each caller, there should be no behavior change yet.

Signed-off-by: Jeff King <peff@peff.net>
---
 combine-diff.c     |  4 ++--
 diff.c             | 20 ++++++++++----------
 diffcore-pickaxe.c |  2 +-
 range-diff.c       |  2 +-
 xdiff-interface.c  | 30 ++++++++++++++++++++++++++----
 xdiff-interface.h  | 10 ++++++++--
 6 files changed, 48 insertions(+), 20 deletions(-)

diff --git a/combine-diff.c b/combine-diff.c
index 10155e0ec8..3c479cfc3e 100644
--- a/combine-diff.c
+++ b/combine-diff.c
@@ -421,8 +421,8 @@ static void combine_diff(struct repository *r,
 	state.num_parent = num_parent;
 	state.n = n;
 
-	if (xdi_diff_outf(&parent_file, result_file, consume_line, &state,
-			  &xpp, &xecfg))
+	if (xdi_diff_outf(&parent_file, result_file, NULL, consume_line,
+			  &state, &xpp, &xecfg))
 		die("unable to generate combined diff for %s",
 		    oid_to_hex(parent));
 	free(parent_file.ptr);
diff --git a/diff.c b/diff.c
index 8647db3d30..07be5879e5 100644
--- a/diff.c
+++ b/diff.c
@@ -2074,8 +2074,8 @@ static void diff_words_show(struct diff_words_data *diff_words)
 	xpp.flags = 0;
 	/* as only the hunk header will be parsed, we need a 0-context */
 	xecfg.ctxlen = 0;
-	if (xdi_diff_outf(&minus, &plus, fn_out_diff_words_aux, diff_words,
-			  &xpp, &xecfg))
+	if (xdi_diff_outf(&minus, &plus, NULL, fn_out_diff_words_aux,
+			  diff_words, &xpp, &xecfg))
 		die("unable to generate word diff");
 	free(minus.ptr);
 	free(plus.ptr);
@@ -3526,8 +3526,8 @@ static void builtin_diff(const char *name_a,
 			xecfg.ctxlen = strtoul(v, NULL, 10);
 		if (o->word_diff)
 			init_diff_words_data(&ecbdata, o, one, two);
-		if (xdi_diff_outf(&mf1, &mf2, fn_out_consume, &ecbdata,
-				  &xpp, &xecfg))
+		if (xdi_diff_outf(&mf1, &mf2, NULL, fn_out_consume,
+				  &ecbdata, &xpp, &xecfg))
 			die("unable to generate diff for %s", one->path);
 		if (o->word_diff)
 			free_diff_words_data(&ecbdata);
@@ -3637,8 +3637,8 @@ static void builtin_diffstat(const char *name_a, const char *name_b,
 		xpp.anchors_nr = o->anchors_nr;
 		xecfg.ctxlen = o->context;
 		xecfg.interhunkctxlen = o->interhunkcontext;
-		if (xdi_diff_outf(&mf1, &mf2, diffstat_consume, diffstat,
-				  &xpp, &xecfg))
+		if (xdi_diff_outf(&mf1, &mf2, NULL, diffstat_consume,
+				  diffstat, &xpp, &xecfg))
 			die("unable to generate diffstat for %s", one->path);
 	}
 
@@ -3686,8 +3686,8 @@ static void builtin_checkdiff(const char *name_a, const char *name_b,
 		memset(&xecfg, 0, sizeof(xecfg));
 		xecfg.ctxlen = 1; /* at least one context line */
 		xpp.flags = 0;
-		if (xdi_diff_outf(&mf1, &mf2, checkdiff_consume, &data,
-				  &xpp, &xecfg))
+		if (xdi_diff_outf(&mf1, &mf2, NULL, checkdiff_consume,
+				  &data, &xpp, &xecfg))
 			die("unable to generate checkdiff for %s", one->path);
 
 		if (data.ws_rule & WS_BLANK_AT_EOF) {
@@ -5771,8 +5771,8 @@ static int diff_get_patch_id(struct diff_options *options, struct object_id *oid
 		xpp.flags = 0;
 		xecfg.ctxlen = 3;
 		xecfg.flags = 0;
-		if (xdi_diff_outf(&mf1, &mf2, patch_id_consume, &data,
-				  &xpp, &xecfg))
+		if (xdi_diff_outf(&mf1, &mf2, NULL, patch_id_consume,
+				  &data, &xpp, &xecfg))
 			return error("unable to generate patch-id diff for %s",
 				     p->one->path);
 	}
diff --git a/diffcore-pickaxe.c b/diffcore-pickaxe.c
index d2361e06a1..58df35670a 100644
--- a/diffcore-pickaxe.c
+++ b/diffcore-pickaxe.c
@@ -62,7 +62,7 @@ static int diff_grep(mmfile_t *one, mmfile_t *two,
 	ecbdata.hit = 0;
 	xecfg.ctxlen = o->context;
 	xecfg.interhunkctxlen = o->interhunkcontext;
-	if (xdi_diff_outf(one, two, diffgrep_consume, &ecbdata, &xpp, &xecfg))
+	if (xdi_diff_outf(one, two, NULL, diffgrep_consume, &ecbdata, &xpp, &xecfg))
 		return 0;
 	return ecbdata.hit;
 }
diff --git a/range-diff.c b/range-diff.c
index bd8083f2d1..245fc17fee 100644
--- a/range-diff.c
+++ b/range-diff.c
@@ -210,7 +210,7 @@ static int diffsize(const char *a, const char *b)
 	mf2.size = strlen(b);
 
 	cfg.ctxlen = 3;
-	if (!xdi_diff_outf(&mf1, &mf2, diffsize_consume, &count, &pp, &cfg))
+	if (!xdi_diff_outf(&mf1, &mf2, NULL, diffsize_consume, &count, &pp, &cfg))
 		return count;
 
 	error(_("failed to generate diff"));
diff --git a/xdiff-interface.c b/xdiff-interface.c
index 25c185aff4..16d37ce6be 100644
--- a/xdiff-interface.c
+++ b/xdiff-interface.c
@@ -9,7 +9,8 @@
 #include "xdiff/xutils.h"
 
 struct xdiff_emit_state {
-	xdiff_emit_consume_fn consume;
+	xdiff_emit_hunk_fn hunk_fn;
+	xdiff_emit_line_fn line_fn;
 	void *consume_callback_data;
 	struct strbuf remainder;
 };
@@ -59,6 +60,22 @@ int parse_hunk_header(char *line, int len,
 	return -!!memcmp(cp, " @@", 3);
 }
 
+static int xdiff_out_hunk(void *priv_,
+			  long old_begin, long old_nr,
+			  long new_begin, long new_nr,
+			  const char *func, long funclen)
+{
+	struct xdiff_emit_state *priv = priv_;
+
+	if (priv->remainder.len)
+		BUG("xdiff emitted hunk in the middle of a line");
+
+	priv->hunk_fn(priv->consume_callback_data,
+		      old_begin, old_nr, new_begin, new_nr,
+		      func, funclen);
+	return 0;
+}
+
 static void consume_one(void *priv_, char *s, unsigned long size)
 {
 	struct xdiff_emit_state *priv = priv_;
@@ -67,7 +84,7 @@ static void consume_one(void *priv_, char *s, unsigned long size)
 		unsigned long this_size;
 		ep = memchr(s, '\n', size);
 		this_size = (ep == NULL) ? size : (ep - s + 1);
-		priv->consume(priv->consume_callback_data, s, this_size);
+		priv->line_fn(priv->consume_callback_data, s, this_size);
 		size -= this_size;
 		s += this_size;
 	}
@@ -141,7 +158,9 @@ int xdi_diff(mmfile_t *mf1, mmfile_t *mf2, xpparam_t const *xpp, xdemitconf_t co
 }
 
 int xdi_diff_outf(mmfile_t *mf1, mmfile_t *mf2,
-		  xdiff_emit_consume_fn fn, void *consume_callback_data,
+		  xdiff_emit_hunk_fn hunk_fn,
+		  xdiff_emit_line_fn line_fn,
+		  void *consume_callback_data,
 		  xpparam_t const *xpp, xdemitconf_t const *xecfg)
 {
 	int ret;
@@ -149,9 +168,12 @@ int xdi_diff_outf(mmfile_t *mf1, mmfile_t *mf2,
 	xdemitcb_t ecb;
 
 	memset(&state, 0, sizeof(state));
-	state.consume = fn;
+	state.hunk_fn = hunk_fn;
+	state.line_fn = line_fn;
 	state.consume_callback_data = consume_callback_data;
 	memset(&ecb, 0, sizeof(ecb));
+	if (hunk_fn)
+		ecb.out_hunk = xdiff_out_hunk;
 	ecb.out_line = xdiff_outf;
 	ecb.priv = &state;
 	strbuf_init(&state.remainder, 0);
diff --git a/xdiff-interface.h b/xdiff-interface.h
index 135fc05d72..2dbe2feb19 100644
--- a/xdiff-interface.h
+++ b/xdiff-interface.h
@@ -11,11 +11,17 @@
  */
 #define MAX_XDIFF_SIZE (1024UL * 1024 * 1023)
 
-typedef void (*xdiff_emit_consume_fn)(void *, char *, unsigned long);
+typedef void (*xdiff_emit_line_fn)(void *, char *, unsigned long);
+typedef void (*xdiff_emit_hunk_fn)(void *data,
+				   long old_begin, long old_nr,
+				   long new_begin, long new_nr,
+				   const char *func, long funclen);
 
 int xdi_diff(mmfile_t *mf1, mmfile_t *mf2, xpparam_t const *xpp, xdemitconf_t const *xecfg, xdemitcb_t *ecb);
 int xdi_diff_outf(mmfile_t *mf1, mmfile_t *mf2,
-		  xdiff_emit_consume_fn fn, void *consume_callback_data,
+		  xdiff_emit_hunk_fn hunk_fn,
+		  xdiff_emit_line_fn line_fn,
+		  void *consume_callback_data,
 		  xpparam_t const *xpp, xdemitconf_t const *xecfg);
 int parse_hunk_header(char *line, int len,
 		      int *ob, int *on,
-- 
2.19.1.1336.g081079ac04


^ permalink raw reply related	[flat|nested] 14+ messages in thread

* [PATCH 3/9] diff: avoid generating unused hunk header lines
  2018-11-02  6:31 [PATCH 0/9] saner xdiff hunk callbacks Jeff King
  2018-11-02  6:35 ` [PATCH 1/9] xdiff: provide a separate emit callback for hunks Jeff King
  2018-11-02  6:35 ` [PATCH 2/9] xdiff-interface: provide a separate consume " Jeff King
@ 2018-11-02  6:36 ` Jeff King
  2018-11-02 19:50   ` Stefan Beller
  2018-11-02  6:36 ` [PATCH 4/9] diff: discard hunk headers for patch-ids earlier Jeff King
                   ` (6 subsequent siblings)
  9 siblings, 1 reply; 14+ messages in thread
From: Jeff King @ 2018-11-02  6:36 UTC (permalink / raw)
  To: git

Some callers of xdi_diff_outf() do not look at the generated hunk header
lines at all. By plugging in a no-op hunk callback, this tells xdiff not
to even bother formatting them.

This patch introduces a stock no-op callback and uses it with a few
callers whose line callbacks explicitly ignore hunk headers (because
they look only for +/- lines).

Signed-off-by: Jeff King <peff@peff.net>
---
 diff.c             | 4 ++--
 diffcore-pickaxe.c | 3 ++-
 xdiff-interface.c  | 6 ++++++
 xdiff-interface.h  | 6 ++++++
 4 files changed, 16 insertions(+), 3 deletions(-)

diff --git a/diff.c b/diff.c
index 07be5879e5..d84356e007 100644
--- a/diff.c
+++ b/diff.c
@@ -3637,8 +3637,8 @@ static void builtin_diffstat(const char *name_a, const char *name_b,
 		xpp.anchors_nr = o->anchors_nr;
 		xecfg.ctxlen = o->context;
 		xecfg.interhunkctxlen = o->interhunkcontext;
-		if (xdi_diff_outf(&mf1, &mf2, NULL, diffstat_consume,
-				  diffstat, &xpp, &xecfg))
+		if (xdi_diff_outf(&mf1, &mf2, discard_hunk_line,
+				  diffstat_consume, diffstat, &xpp, &xecfg))
 			die("unable to generate diffstat for %s", one->path);
 	}
 
diff --git a/diffcore-pickaxe.c b/diffcore-pickaxe.c
index 58df35670a..69fc55ea1e 100644
--- a/diffcore-pickaxe.c
+++ b/diffcore-pickaxe.c
@@ -62,7 +62,8 @@ static int diff_grep(mmfile_t *one, mmfile_t *two,
 	ecbdata.hit = 0;
 	xecfg.ctxlen = o->context;
 	xecfg.interhunkctxlen = o->interhunkcontext;
-	if (xdi_diff_outf(one, two, NULL, diffgrep_consume, &ecbdata, &xpp, &xecfg))
+	if (xdi_diff_outf(one, two, discard_hunk_line, diffgrep_consume,
+			  &ecbdata, &xpp, &xecfg))
 		return 0;
 	return ecbdata.hit;
 }
diff --git a/xdiff-interface.c b/xdiff-interface.c
index 16d37ce6be..2622981d97 100644
--- a/xdiff-interface.c
+++ b/xdiff-interface.c
@@ -157,6 +157,12 @@ int xdi_diff(mmfile_t *mf1, mmfile_t *mf2, xpparam_t const *xpp, xdemitconf_t co
 	return xdl_diff(&a, &b, xpp, xecfg, xecb);
 }
 
+void discard_hunk_line(void *priv,
+		       long ob, long on, long nb, long nn,
+		       const char *func, long funclen)
+{
+}
+
 int xdi_diff_outf(mmfile_t *mf1, mmfile_t *mf2,
 		  xdiff_emit_hunk_fn hunk_fn,
 		  xdiff_emit_line_fn line_fn,
diff --git a/xdiff-interface.h b/xdiff-interface.h
index 2dbe2feb19..7b0ccbdd1d 100644
--- a/xdiff-interface.h
+++ b/xdiff-interface.h
@@ -35,6 +35,12 @@ extern void xdiff_clear_find_func(xdemitconf_t *xecfg);
 extern int git_xmerge_config(const char *var, const char *value, void *cb);
 extern int git_xmerge_style;
 
+/*
+ * Can be used as a no-op hunk_fn for xdi_diff_outf(), since a NULL
+ * one just sends the hunk line to the line_fn callback).
+ */
+void discard_hunk_line(void *, long, long, long, long, const char *, long);
+
 /*
  * Compare the strings l1 with l2 which are of size s1 and s2 respectively.
  * Returns 1 if the strings are deemed equal, 0 otherwise.
-- 
2.19.1.1336.g081079ac04


^ permalink raw reply related	[flat|nested] 14+ messages in thread

* [PATCH 4/9] diff: discard hunk headers for patch-ids earlier
  2018-11-02  6:31 [PATCH 0/9] saner xdiff hunk callbacks Jeff King
                   ` (2 preceding siblings ...)
  2018-11-02  6:36 ` [PATCH 3/9] diff: avoid generating unused hunk header lines Jeff King
@ 2018-11-02  6:36 ` Jeff King
  2018-11-02  6:37 ` [PATCH 5/9] diff: use hunk callback for word-diff Jeff King
                   ` (5 subsequent siblings)
  9 siblings, 0 replies; 14+ messages in thread
From: Jeff King @ 2018-11-02  6:36 UTC (permalink / raw)
  To: git

We do not include hunk header lines when computing patch-ids, since
the line numbers would create false negatives. Rather than detect and
skip them in our line callback, we can simply tell xdiff to avoid
generating them.

This is similar to the previous commit, but split out because it
actually requires modifying the matching line callback.

Signed-off-by: Jeff King <peff@peff.net>
---
 diff.c | 8 ++------
 1 file changed, 2 insertions(+), 6 deletions(-)

diff --git a/diff.c b/diff.c
index d84356e007..8a83ce6058 100644
--- a/diff.c
+++ b/diff.c
@@ -5666,10 +5666,6 @@ static void patch_id_consume(void *priv, char *line, unsigned long len)
 	struct patch_id_t *data = priv;
 	int new_len;
 
-	/* Ignore line numbers when computing the SHA1 of the patch */
-	if (starts_with(line, "@@ -"))
-		return;
-
 	new_len = remove_space(line, len);
 
 	git_SHA1_Update(data->ctx, line, new_len);
@@ -5771,8 +5767,8 @@ static int diff_get_patch_id(struct diff_options *options, struct object_id *oid
 		xpp.flags = 0;
 		xecfg.ctxlen = 3;
 		xecfg.flags = 0;
-		if (xdi_diff_outf(&mf1, &mf2, NULL, patch_id_consume,
-				  &data, &xpp, &xecfg))
+		if (xdi_diff_outf(&mf1, &mf2, discard_hunk_line,
+				  patch_id_consume, &data, &xpp, &xecfg))
 			return error("unable to generate patch-id diff for %s",
 				     p->one->path);
 	}
-- 
2.19.1.1336.g081079ac04


^ permalink raw reply related	[flat|nested] 14+ messages in thread

* [PATCH 5/9] diff: use hunk callback for word-diff
  2018-11-02  6:31 [PATCH 0/9] saner xdiff hunk callbacks Jeff King
                   ` (3 preceding siblings ...)
  2018-11-02  6:36 ` [PATCH 4/9] diff: discard hunk headers for patch-ids earlier Jeff King
@ 2018-11-02  6:37 ` Jeff King
  2018-11-02  6:38 ` [PATCH 6/9] combine-diff: use an xdiff hunk callback Jeff King
                   ` (4 subsequent siblings)
  9 siblings, 0 replies; 14+ messages in thread
From: Jeff King @ 2018-11-02  6:37 UTC (permalink / raw)
  To: git

Our word-diff does not look at the -/+ lines generated by xdiff at all
(because they are not real lines to show the user, but just the
tokenized words split into lines). Instead we use the line numbers from
the hunk headers to index our own data structure.

As a result, our xdi_diff_outf() callback throws away all lines except
hunk headers. We can instead use a hunk callback, which has two
benefits:

  1. We don't have to re-parse the generated hunk header line, but can
     use the passed parameters directly.

  2. By setting our line callback to NULL, we can tell xdiff-interface
     that it does not even need to bother generating the other lines,
     saving a small amount of work.

Signed-off-by: Jeff King <peff@peff.net>
---
 diff.c            | 12 +++++-------
 xdiff-interface.c |  3 +++
 2 files changed, 8 insertions(+), 7 deletions(-)

diff --git a/diff.c b/diff.c
index 8a83ce6058..37035110dc 100644
--- a/diff.c
+++ b/diff.c
@@ -1912,19 +1912,17 @@ static int color_words_output_graph_prefix(struct diff_words_data *diff_words)
 	}
 }
 
-static void fn_out_diff_words_aux(void *priv, char *line, unsigned long len)
+static void fn_out_diff_words_aux(void *priv,
+				  long minus_first, long minus_len,
+				  long plus_first, long plus_len,
+				  const char *func, long funclen)
 {
 	struct diff_words_data *diff_words = priv;
 	struct diff_words_style *style = diff_words->style;
-	int minus_first, minus_len, plus_first, plus_len;
 	const char *minus_begin, *minus_end, *plus_begin, *plus_end;
 	struct diff_options *opt = diff_words->opt;
 	const char *line_prefix;
 
-	if (line[0] != '@' || parse_hunk_header(line, len,
-			&minus_first, &minus_len, &plus_first, &plus_len))
-		return;
-
 	assert(opt);
 	line_prefix = diff_line_prefix(opt);
 
@@ -2074,7 +2072,7 @@ static void diff_words_show(struct diff_words_data *diff_words)
 	xpp.flags = 0;
 	/* as only the hunk header will be parsed, we need a 0-context */
 	xecfg.ctxlen = 0;
-	if (xdi_diff_outf(&minus, &plus, NULL, fn_out_diff_words_aux,
+	if (xdi_diff_outf(&minus, &plus, fn_out_diff_words_aux, NULL,
 			  diff_words, &xpp, &xecfg))
 		die("unable to generate word diff");
 	free(minus.ptr);
diff --git a/xdiff-interface.c b/xdiff-interface.c
index 2622981d97..f2a3d9a577 100644
--- a/xdiff-interface.c
+++ b/xdiff-interface.c
@@ -95,6 +95,9 @@ static int xdiff_outf(void *priv_, mmbuffer_t *mb, int nbuf)
 	struct xdiff_emit_state *priv = priv_;
 	int i;
 
+	if (!priv->line_fn)
+		return 0;
+
 	for (i = 0; i < nbuf; i++) {
 		if (mb[i].ptr[mb[i].size-1] != '\n') {
 			/* Incomplete line */
-- 
2.19.1.1336.g081079ac04


^ permalink raw reply related	[flat|nested] 14+ messages in thread

* [PATCH 6/9] combine-diff: use an xdiff hunk callback
  2018-11-02  6:31 [PATCH 0/9] saner xdiff hunk callbacks Jeff King
                   ` (4 preceding siblings ...)
  2018-11-02  6:37 ` [PATCH 5/9] diff: use hunk callback for word-diff Jeff King
@ 2018-11-02  6:38 ` Jeff King
  2018-11-02  6:39 ` [PATCH 7/9] diff: convert --check to use a " Jeff King
                   ` (3 subsequent siblings)
  9 siblings, 0 replies; 14+ messages in thread
From: Jeff King @ 2018-11-02  6:38 UTC (permalink / raw)
  To: git

A combined diff has to line up the hunks for all of the individual
pairwise diffs, and thus needs to know their line numbers and sizes. We
get that now by parsing the hunk header line that xdiff generates.
However, now that xdiff supports a hunk callback, we can just use the
values directly.

Signed-off-by: Jeff King <peff@peff.net>
---
I learned to love --color-moved-ws=allow-indentation-change for this
one.

 combine-diff.c | 67 +++++++++++++++++++++++++++-----------------------
 1 file changed, 36 insertions(+), 31 deletions(-)

diff --git a/combine-diff.c b/combine-diff.c
index 3c479cfc3e..ad7752ea6b 100644
--- a/combine-diff.c
+++ b/combine-diff.c
@@ -345,38 +345,43 @@ struct combine_diff_state {
 	struct sline *lost_bucket;
 };
 
-static void consume_line(void *state_, char *line, unsigned long len)
+static void consume_hunk(void *state_,
+			 long ob, long on,
+			 long nb, long nn,
+			 const char *funcline, long funclen)
 {
 	struct combine_diff_state *state = state_;
-	if (5 < len && !memcmp("@@ -", line, 4)) {
-		if (parse_hunk_header(line, len,
-				      &state->ob, &state->on,
-				      &state->nb, &state->nn))
-			return;
-		state->lno = state->nb;
-		if (state->nn == 0) {
-			/* @@ -X,Y +N,0 @@ removed Y lines
-			 * that would have come *after* line N
-			 * in the result.  Our lost buckets hang
-			 * to the line after the removed lines,
-			 *
-			 * Note that this is correct even when N == 0,
-			 * in which case the hunk removes the first
-			 * line in the file.
-			 */
-			state->lost_bucket = &state->sline[state->nb];
-			if (!state->nb)
-				state->nb = 1;
-		} else {
-			state->lost_bucket = &state->sline[state->nb-1];
-		}
-		if (!state->sline[state->nb-1].p_lno)
-			state->sline[state->nb-1].p_lno =
-				xcalloc(state->num_parent,
-					sizeof(unsigned long));
-		state->sline[state->nb-1].p_lno[state->n] = state->ob;
-		return;
+
+	state->ob = ob;
+	state->on = on;
+	state->nb = nb;
+	state->nn = nn;
+	state->lno = state->nb;
+	if (state->nn == 0) {
+		/* @@ -X,Y +N,0 @@ removed Y lines
+		 * that would have come *after* line N
+		 * in the result.  Our lost buckets hang
+		 * to the line after the removed lines,
+		 *
+		 * Note that this is correct even when N == 0,
+		 * in which case the hunk removes the first
+		 * line in the file.
+		 */
+		state->lost_bucket = &state->sline[state->nb];
+		if (!state->nb)
+			state->nb = 1;
+	} else {
+		state->lost_bucket = &state->sline[state->nb-1];
 	}
+	if (!state->sline[state->nb-1].p_lno)
+		state->sline[state->nb-1].p_lno =
+			xcalloc(state->num_parent, sizeof(unsigned long));
+	state->sline[state->nb-1].p_lno[state->n] = state->ob;
+}
+
+static void consume_line(void *state_, char *line, unsigned long len)
+{
+	struct combine_diff_state *state = state_;
 	if (!state->lost_bucket)
 		return; /* not in any hunk yet */
 	switch (line[0]) {
@@ -421,8 +426,8 @@ static void combine_diff(struct repository *r,
 	state.num_parent = num_parent;
 	state.n = n;
 
-	if (xdi_diff_outf(&parent_file, result_file, NULL, consume_line,
-			  &state, &xpp, &xecfg))
+	if (xdi_diff_outf(&parent_file, result_file, consume_hunk,
+			  consume_line, &state, &xpp, &xecfg))
 		die("unable to generate combined diff for %s",
 		    oid_to_hex(parent));
 	free(parent_file.ptr);
-- 
2.19.1.1336.g081079ac04


^ permalink raw reply related	[flat|nested] 14+ messages in thread

* [PATCH 7/9] diff: convert --check to use a hunk callback
  2018-11-02  6:31 [PATCH 0/9] saner xdiff hunk callbacks Jeff King
                   ` (5 preceding siblings ...)
  2018-11-02  6:38 ` [PATCH 6/9] combine-diff: use an xdiff hunk callback Jeff King
@ 2018-11-02  6:39 ` Jeff King
  2018-11-02  6:39 ` [PATCH 8/9] range-diff: " Jeff King
                   ` (2 subsequent siblings)
  9 siblings, 0 replies; 14+ messages in thread
From: Jeff King @ 2018-11-02  6:39 UTC (permalink / raw)
  To: git

The "diff --check" code needs to know the line number on which each hunk
starts in order to generate its output. We get that now by parsing the
hunk header line generated by xdiff, but it's much simpler to just pass
it directly using a hunk callback.

Signed-off-by: Jeff King <peff@peff.net>
---
 diff.c | 20 ++++++++++++--------
 1 file changed, 12 insertions(+), 8 deletions(-)

diff --git a/diff.c b/diff.c
index 37035110dc..e38d1ecaf4 100644
--- a/diff.c
+++ b/diff.c
@@ -3128,6 +3128,15 @@ static int is_conflict_marker(const char *line, int marker_size, unsigned long l
 	return 1;
 }
 
+static void checkdiff_consume_hunk(void *priv,
+				   long ob, long on, long nb, long nn,
+				   const char *func, long funclen)
+
+{
+	struct checkdiff_t *data = priv;
+	data->lineno = nb - 1;
+}
+
 static void checkdiff_consume(void *priv, char *line, unsigned long len)
 {
 	struct checkdiff_t *data = priv;
@@ -3163,12 +3172,6 @@ static void checkdiff_consume(void *priv, char *line, unsigned long len)
 			      data->o->file, set, reset, ws);
 	} else if (line[0] == ' ') {
 		data->lineno++;
-	} else if (line[0] == '@') {
-		char *plus = strchr(line, '+');
-		if (plus)
-			data->lineno = strtol(plus, NULL, 10) - 1;
-		else
-			die("invalid diff");
 	}
 }
 
@@ -3684,8 +3687,9 @@ static void builtin_checkdiff(const char *name_a, const char *name_b,
 		memset(&xecfg, 0, sizeof(xecfg));
 		xecfg.ctxlen = 1; /* at least one context line */
 		xpp.flags = 0;
-		if (xdi_diff_outf(&mf1, &mf2, NULL, checkdiff_consume,
-				  &data, &xpp, &xecfg))
+		if (xdi_diff_outf(&mf1, &mf2, checkdiff_consume_hunk,
+				  checkdiff_consume, &data,
+				  &xpp, &xecfg))
 			die("unable to generate checkdiff for %s", one->path);
 
 		if (data.ws_rule & WS_BLANK_AT_EOF) {
-- 
2.19.1.1336.g081079ac04


^ permalink raw reply related	[flat|nested] 14+ messages in thread

* [PATCH 8/9] range-diff: use a hunk callback
  2018-11-02  6:31 [PATCH 0/9] saner xdiff hunk callbacks Jeff King
                   ` (6 preceding siblings ...)
  2018-11-02  6:39 ` [PATCH 7/9] diff: convert --check to use a " Jeff King
@ 2018-11-02  6:39 ` Jeff King
  2018-11-02  6:40 ` [PATCH 9/9] xdiff-interface: drop parse_hunk_header() Jeff King
  2018-11-02 11:48 ` [PATCH 0/9] saner xdiff hunk callbacks Junio C Hamano
  9 siblings, 0 replies; 14+ messages in thread
From: Jeff King @ 2018-11-02  6:39 UTC (permalink / raw)
  To: git

When we count the lines in a diff, we don't actually care about the
contents of each line. By using a hunk callback, we tell xdiff that it
does not need to even bother generating a hunk header line, saving a
small amount of work.

Arguably we could even ignore the hunk headers completely, since we're
just computing a cost function between patches. But doing it this way
maintains the exact same behavior before and after.

Signed-off-by: Jeff King <peff@peff.net>
---
This one might be going overboard. It can be dropped without affecting
any of the other patches.

 range-diff.c | 10 +++++++++-
 1 file changed, 9 insertions(+), 1 deletion(-)

diff --git a/range-diff.c b/range-diff.c
index 245fc17fee..3958720f00 100644
--- a/range-diff.c
+++ b/range-diff.c
@@ -197,6 +197,12 @@ static void diffsize_consume(void *data, char *line, unsigned long len)
 	(*(int *)data)++;
 }
 
+static void diffsize_hunk(void *data, long ob, long on, long nb, long nn,
+			  const char *funcline, long funclen)
+{
+	diffsize_consume(data, NULL, 0);
+}
+
 static int diffsize(const char *a, const char *b)
 {
 	xpparam_t pp = { 0 };
@@ -210,7 +216,9 @@ static int diffsize(const char *a, const char *b)
 	mf2.size = strlen(b);
 
 	cfg.ctxlen = 3;
-	if (!xdi_diff_outf(&mf1, &mf2, NULL, diffsize_consume, &count, &pp, &cfg))
+	if (!xdi_diff_outf(&mf1, &mf2,
+			   diffsize_hunk, diffsize_consume, &count,
+			   &pp, &cfg))
 		return count;
 
 	error(_("failed to generate diff"));
-- 
2.19.1.1336.g081079ac04


^ permalink raw reply related	[flat|nested] 14+ messages in thread

* [PATCH 9/9] xdiff-interface: drop parse_hunk_header()
  2018-11-02  6:31 [PATCH 0/9] saner xdiff hunk callbacks Jeff King
                   ` (7 preceding siblings ...)
  2018-11-02  6:39 ` [PATCH 8/9] range-diff: " Jeff King
@ 2018-11-02  6:40 ` Jeff King
  2018-11-02 11:48 ` [PATCH 0/9] saner xdiff hunk callbacks Junio C Hamano
  9 siblings, 0 replies; 14+ messages in thread
From: Jeff King @ 2018-11-02  6:40 UTC (permalink / raw)
  To: git

This function was used only for parsing the hunk headers generated by
xdiff. Now that we can use hunk callbacks to get that information
directly, it has outlived its usefulness.

Note to anyone who wants to resurrect it: the "len" parameter was
totally unused, meaning that the function could read past the end of the
"line" array. In practice this never happened, because we only used it
to parse xdiff's generated header lines. But it would be dangerous to
use it for other cases without fixing this defect.

Signed-off-by: Jeff King <peff@peff.net>
---
 xdiff-interface.c | 45 ---------------------------------------------
 xdiff-interface.h |  3 ---
 2 files changed, 48 deletions(-)

diff --git a/xdiff-interface.c b/xdiff-interface.c
index f2a3d9a577..80f060d278 100644
--- a/xdiff-interface.c
+++ b/xdiff-interface.c
@@ -15,51 +15,6 @@ struct xdiff_emit_state {
 	struct strbuf remainder;
 };
 
-static int parse_num(char **cp_p, int *num_p)
-{
-	char *cp = *cp_p;
-	int num = 0;
-
-	while ('0' <= *cp && *cp <= '9')
-		num = num * 10 + *cp++ - '0';
-	if (!(cp - *cp_p))
-		return -1;
-	*cp_p = cp;
-	*num_p = num;
-	return 0;
-}
-
-int parse_hunk_header(char *line, int len,
-		      int *ob, int *on,
-		      int *nb, int *nn)
-{
-	char *cp;
-	cp = line + 4;
-	if (parse_num(&cp, ob)) {
-	bad_line:
-		return error("malformed diff output: %s", line);
-	}
-	if (*cp == ',') {
-		cp++;
-		if (parse_num(&cp, on))
-			goto bad_line;
-	}
-	else
-		*on = 1;
-	if (*cp++ != ' ' || *cp++ != '+')
-		goto bad_line;
-	if (parse_num(&cp, nb))
-		goto bad_line;
-	if (*cp == ',') {
-		cp++;
-		if (parse_num(&cp, nn))
-			goto bad_line;
-	}
-	else
-		*nn = 1;
-	return -!!memcmp(cp, " @@", 3);
-}
-
 static int xdiff_out_hunk(void *priv_,
 			  long old_begin, long old_nr,
 			  long new_begin, long new_nr,
diff --git a/xdiff-interface.h b/xdiff-interface.h
index 7b0ccbdd1d..971aa84b57 100644
--- a/xdiff-interface.h
+++ b/xdiff-interface.h
@@ -23,9 +23,6 @@ int xdi_diff_outf(mmfile_t *mf1, mmfile_t *mf2,
 		  xdiff_emit_line_fn line_fn,
 		  void *consume_callback_data,
 		  xpparam_t const *xpp, xdemitconf_t const *xecfg);
-int parse_hunk_header(char *line, int len,
-		      int *ob, int *on,
-		      int *nb, int *nn);
 int read_mmfile(mmfile_t *ptr, const char *filename);
 void read_mmblob(mmfile_t *ptr, const struct object_id *oid);
 int buffer_is_binary(const char *ptr, unsigned long size);
-- 
2.19.1.1336.g081079ac04

^ permalink raw reply related	[flat|nested] 14+ messages in thread

* Re: [PATCH 0/9] saner xdiff hunk callbacks
  2018-11-02  6:31 [PATCH 0/9] saner xdiff hunk callbacks Jeff King
                   ` (8 preceding siblings ...)
  2018-11-02  6:40 ` [PATCH 9/9] xdiff-interface: drop parse_hunk_header() Jeff King
@ 2018-11-02 11:48 ` Junio C Hamano
  9 siblings, 0 replies; 14+ messages in thread
From: Junio C Hamano @ 2018-11-02 11:48 UTC (permalink / raw)
  To: Jeff King; +Cc: git

Jeff King <peff@peff.net> writes:

> ... it seems rather silly that we have xdiff generate a string
> just so that we can parse it from within the same process. So instead I
> improved the xdiff interface to pass the actual integers out, made use
> of it as appropriate, and then in the final patch we can just drop the
> buggy function entirely.

Acked-by: me.

Thanks.  This is exactly what I wanted to do for a very long time,
ever since I did the original xdiff-interface.c

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH 3/9] diff: avoid generating unused hunk header lines
  2018-11-02  6:36 ` [PATCH 3/9] diff: avoid generating unused hunk header lines Jeff King
@ 2018-11-02 19:50   ` Stefan Beller
  2018-11-02 20:15     ` Jeff King
  0 siblings, 1 reply; 14+ messages in thread
From: Stefan Beller @ 2018-11-02 19:50 UTC (permalink / raw)
  To: Jeff King; +Cc: git

> +/*
> + * Can be used as a no-op hunk_fn for xdi_diff_outf(), since a NULL
> + * one just sends the hunk line to the line_fn callback).
> + */
> +void discard_hunk_line(void *, long, long, long, long, const char *, long);

Recently we had the discussion on style and naming things.
On the one hand I don't know what these 4 different longs do,
so I'd wish for some descriptive variable names in here.
On the other hand the docs explain clearly why I don't need
to care (a no-op ignores all of the parameters, no need
to take care of their order)

So to revive that discussion, I would strongly prefer
to have *some* names there, for the sake of a
simply described coding style without many exceptions
(especially those exceptions that rely on judgement).

Today I read [1], which describes the analog in the
mechanical world: To evolve and have more impact
you need tighter requirements on some parts. And
I would roughly translate that to our use case as
not having to worry about style (it's ironic I even type
out this email... if we could just run clang format or
some other tightly controlling formatter/linter, I'd be
much happier as our focus should be elsewhere,
such as UX or performance).

Apart from that, I read the whole series, and found
it a pleasant read.

Thanks,
Stefan

[1] https://www.nybooks.com/articles/2018/10/25/precision-accuracy-perfectionism/

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH 3/9] diff: avoid generating unused hunk header lines
  2018-11-02 19:50   ` Stefan Beller
@ 2018-11-02 20:15     ` Jeff King
  2018-11-03  0:32       ` Junio C Hamano
  0 siblings, 1 reply; 14+ messages in thread
From: Jeff King @ 2018-11-02 20:15 UTC (permalink / raw)
  To: Stefan Beller; +Cc: Junio C Hamano, git

On Fri, Nov 02, 2018 at 12:50:05PM -0700, Stefan Beller wrote:

> > +/*
> > + * Can be used as a no-op hunk_fn for xdi_diff_outf(), since a NULL
> > + * one just sends the hunk line to the line_fn callback).
> > + */
> > +void discard_hunk_line(void *, long, long, long, long, const char *, long);
> 
> Recently we had the discussion on style and naming things.
> On the one hand I don't know what these 4 different longs do,
> so I'd wish for some descriptive variable names in here.
> On the other hand the docs explain clearly why I don't need
> to care (a no-op ignores all of the parameters, no need
> to take care of their order)

Right, I actually did have the same thought while writing it. And ended
up following that line of reasoning (since it's just a placeholder, it
doesn't matter).  But I'm not opposed to putting in the names.

> So to revive that discussion, I would strongly prefer
> to have *some* names there, for the sake of a
> simply described coding style without many exceptions
> (especially those exceptions that rely on judgement).

Fair enough.

Squashable patch is below; it goes on 34c829621e (diff: avoid generating
unused hunk header lines, 2018-11-02).

Junio, let me know if you'd prefer a re-send of the series, but it
doesn't look necessary otherwise (so far).

> Apart from that, I read the whole series, and found
> it a pleasant read.

Thanks!

diff --git a/xdiff-interface.h b/xdiff-interface.h
index 7b0ccbdd1d..8af245eed9 100644
--- a/xdiff-interface.h
+++ b/xdiff-interface.h
@@ -39,7 +39,9 @@ extern int git_xmerge_style;
  * Can be used as a no-op hunk_fn for xdi_diff_outf(), since a NULL
  * one just sends the hunk line to the line_fn callback).
  */
-void discard_hunk_line(void *, long, long, long, long, const char *, long);
+void discard_hunk_line(void *priv,
+		       long ob, long on, long nb, long nn,
+		       const char *func, long funclen);
 
 /*
  * Compare the strings l1 with l2 which are of size s1 and s2 respectively.

-Peff

^ permalink raw reply related	[flat|nested] 14+ messages in thread

* Re: [PATCH 3/9] diff: avoid generating unused hunk header lines
  2018-11-02 20:15     ` Jeff King
@ 2018-11-03  0:32       ` Junio C Hamano
  0 siblings, 0 replies; 14+ messages in thread
From: Junio C Hamano @ 2018-11-03  0:32 UTC (permalink / raw)
  To: Jeff King; +Cc: Stefan Beller, git

Jeff King <peff@peff.net> writes:

>> to have *some* names there, for the sake of a
>> simply described coding style without many exceptions
>> (especially those exceptions that rely on judgement).
>
> Fair enough.

For the record, there aren't "many" exceptions to the rule that was
suggested earlier: if you refer to parameters by name in the comment
to explain the interface, name them and use these names [*1*].

> Squashable patch is below; it goes on 34c829621e (diff: avoid generating
> unused hunk header lines, 2018-11-02).
>
> Junio, let me know if you'd prefer a re-send of the series, but it
> doesn't look necessary otherwise (so far).

Giving them proper names would serve as a readily usable sample for
those who write new hunk-line callbacks that do not ignore them, so
what you wrote is good.  Let me squash it in.

Thanks.  If this were to add bunch of "unused$n" names to the
parameters to this no-op function, only to "have *some* names
there", I would have said that it is not even worth the time to
discuss it, though.

> diff --git a/xdiff-interface.h b/xdiff-interface.h
> index 7b0ccbdd1d..8af245eed9 100644
> --- a/xdiff-interface.h
> +++ b/xdiff-interface.h
> @@ -39,7 +39,9 @@ extern int git_xmerge_style;
>   * Can be used as a no-op hunk_fn for xdi_diff_outf(), since a NULL
>   * one just sends the hunk line to the line_fn callback).
>   */
> -void discard_hunk_line(void *, long, long, long, long, const char *, long);
> +void discard_hunk_line(void *priv,
> +		       long ob, long on, long nb, long nn,
> +		       const char *func, long funclen);
>  
>  /*
>   * Compare the strings l1 with l2 which are of size s1 and s2 respectively.
>
> -Peff

[Footnote]

*1* You may say that the "if you refer to parameters by name" part
relies on judgment, but I think it is a good thing---it makes poor
judgment stand out.

When describing a function that takes two parameters of the same
type, for example, you could explain the interface as "take two ints
and return true if first int is smaller than the second int" in
order to leave the parameters unnamed, but if you gave such a
description, it is so obvious that you are _avoiding_ names.  Not
using names when you do not have to is good, but nobody with half an
intelligence would think it is good to give a bad description just
to avoid using names.

^ permalink raw reply	[flat|nested] 14+ messages in thread

end of thread, other threads:[~2018-11-03  0:32 UTC | newest]

Thread overview: 14+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2018-11-02  6:31 [PATCH 0/9] saner xdiff hunk callbacks Jeff King
2018-11-02  6:35 ` [PATCH 1/9] xdiff: provide a separate emit callback for hunks Jeff King
2018-11-02  6:35 ` [PATCH 2/9] xdiff-interface: provide a separate consume " Jeff King
2018-11-02  6:36 ` [PATCH 3/9] diff: avoid generating unused hunk header lines Jeff King
2018-11-02 19:50   ` Stefan Beller
2018-11-02 20:15     ` Jeff King
2018-11-03  0:32       ` Junio C Hamano
2018-11-02  6:36 ` [PATCH 4/9] diff: discard hunk headers for patch-ids earlier Jeff King
2018-11-02  6:37 ` [PATCH 5/9] diff: use hunk callback for word-diff Jeff King
2018-11-02  6:38 ` [PATCH 6/9] combine-diff: use an xdiff hunk callback Jeff King
2018-11-02  6:39 ` [PATCH 7/9] diff: convert --check to use a " Jeff King
2018-11-02  6:39 ` [PATCH 8/9] range-diff: " Jeff King
2018-11-02  6:40 ` [PATCH 9/9] xdiff-interface: drop parse_hunk_header() Jeff King
2018-11-02 11:48 ` [PATCH 0/9] saner xdiff hunk callbacks Junio C Hamano

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).