All of lore.kernel.org
 help / color / mirror / Atom feed
From: "Ævar Arnfjörð Bjarmason" <avarab@gmail.com>
To: git@vger.kernel.org
Cc: "Junio C Hamano" <gitster@pobox.com>, "Jeff King" <peff@peff.net>,
	"Johannes Schindelin" <johannes.schindelin@gmx.de>,
	"Carlo Marcelo Arenas Belón" <carenas@gmail.com>,
	"Ævar Arnfjörð Bjarmason" <avarab@gmail.com>
Subject: [PATCH v3 18/22] xdiff-interface: allow early return from xdiff_emit_line_fn
Date: Mon, 12 Apr 2021 19:15:25 +0200	[thread overview]
Message-ID: <patch-18.22-76d667f152f-20210412T170457Z-avarab@gmail.com> (raw)
In-Reply-To: <cover-00.22-00000000000-20210412T170457Z-avarab@gmail.com>

Finish the change started in the preceding commit and allow an early
return from "xdiff_emit_line_fn" callbacks, this will allows
diffcore-pickaxe.c to save itself redundant work.

Our xdiff interface also had the limitation of not being able to abort
early since the beginning, see d9ea73e0564 (combine-diff: refactor
built-in xdiff interface., 2006-04-05). Although at that time
"xdiff_emit_line_fn" was called "xdiff_emit_consume_fn", and
"xdiff_emit_hunk_fn" didn't exist yet.

There was some work in this area of xdiff-interface.[ch] recently with
3b40a090fd4 (diff: avoid generating unused hunk header lines,
2018-11-02) and 7c61e25fbf1 (diff: use hunk callback for word-diff,
2018-11-02).

In combination those two changes allow us to not do any work on the
hunks and diff at all, but didn't change the status quo with regards
to consumers that e.g. want the diff lines, but might want to abort
early.

Whereas now we can abort e.g. on the first "-line" of a 1000 line diff
if that's all we needed.

This interface is rather scary as noted in the comment to
xdiff-interface.h being added here, as noted there a future change
could add more exit codes, and hack xdl_emit_diff() and friends to
ignore or skip things more selectively as a result.

I did not see an inherent reason for why xdl_emit_{diffrec,record}()
could not be changed to ferry the "xdiff_emit_line_fn" error code
upwards instead of returning -1 on all "ret < 0".

But doing so would require corresponding changes in xdl_emit_diff(),
xdl_diff(). I didn't see any issue with narrowly doing that to
accomplish what I needed here, but it would leave xdiff's own return
values in an inconsistent state.

Instead I've left it at returning a more conventional (for git's own
codebase) 1 for an early return, and translating it (or rather, all
non-zero) to -1 for xdiff's consumption.

The reason for most of the "stop" complexity in xdiff_outf() is
because we want to be able to abort early, but do so in a way that
doesn't skip the appropriate strbuf_reset() invocations.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
---
 xdiff-interface.c | 18 ++++++++++++++----
 xdiff-interface.h | 17 +++++++++++++++++
 2 files changed, 31 insertions(+), 4 deletions(-)

diff --git a/xdiff-interface.c b/xdiff-interface.c
index 5d8c8c67dc2..50c0ef759dd 100644
--- a/xdiff-interface.c
+++ b/xdiff-interface.c
@@ -37,9 +37,12 @@ static int consume_one(void *priv_, char *s, unsigned long size)
 	char *ep;
 	while (size) {
 		unsigned long this_size;
+		int ret;
 		ep = memchr(s, '\n', size);
 		this_size = (ep == NULL) ? size : (ep - s + 1);
-		priv->line_fn(priv->consume_callback_data, s, this_size);
+		ret = priv->line_fn(priv->consume_callback_data, s, this_size);
+		if (ret)
+			return ret;
 		size -= this_size;
 		s += this_size;
 	}
@@ -50,11 +53,14 @@ static int xdiff_outf(void *priv_, mmbuffer_t *mb, int nbuf)
 {
 	struct xdiff_emit_state *priv = priv_;
 	int i;
+	int stop = 0;
 
 	if (!priv->line_fn)
 		return 0;
 
 	for (i = 0; i < nbuf; i++) {
+		if (stop)
+			return 1;
 		if (mb[i].ptr[mb[i].size-1] != '\n') {
 			/* Incomplete line */
 			strbuf_add(&priv->remainder, mb[i].ptr, mb[i].size);
@@ -63,17 +69,21 @@ static int xdiff_outf(void *priv_, mmbuffer_t *mb, int nbuf)
 
 		/* we have a complete line */
 		if (!priv->remainder.len) {
-			consume_one(priv, mb[i].ptr, mb[i].size);
+			stop = consume_one(priv, mb[i].ptr, mb[i].size);
 			continue;
 		}
 		strbuf_add(&priv->remainder, mb[i].ptr, mb[i].size);
-		consume_one(priv, priv->remainder.buf, priv->remainder.len);
+		stop = consume_one(priv, priv->remainder.buf, priv->remainder.len);
 		strbuf_reset(&priv->remainder);
 	}
+	if (stop)
+		return -1;
 	if (priv->remainder.len) {
-		consume_one(priv, priv->remainder.buf, priv->remainder.len);
+		stop = consume_one(priv, priv->remainder.buf, priv->remainder.len);
 		strbuf_reset(&priv->remainder);
 	}
+	if (stop)
+		return -1;
 	return 0;
 }
 
diff --git a/xdiff-interface.h b/xdiff-interface.h
index 0198f9632f5..7d1724abb64 100644
--- a/xdiff-interface.h
+++ b/xdiff-interface.h
@@ -11,6 +11,23 @@
  */
 #define MAX_XDIFF_SIZE (1024UL * 1024 * 1023)
 
+/**
+ * The `xdiff_emit_line_fn` function can return 1 to abort early, or 0
+ * to continue processing. Note that doing so is an all-or-nothing
+ * affair, as returning 1 will return all the way to the top-level,
+ * e.g. the xdi_diff_outf() call to generate the diff.
+ *
+ * Thus returning 1 means you won't be getting any more diff lines. If
+ * you need something in-between those two options you'll to use
+ * `xdl_emit_hunk_consume_func_t` and implement your own version of
+ * xdl_emit_diff().
+ *
+ * We may extend the interface in the future to understand other more
+ * granular return values. While you should return 1 to exit early,
+ * doing so will currently make your early return indistinguishable
+ * from an error internal to xdiff, xdiff itself will see that
+ * non-zero return and translate it to -1.
+ */
 typedef int (*xdiff_emit_line_fn)(void *, char *, unsigned long);
 typedef void (*xdiff_emit_hunk_fn)(void *data,
 				   long old_begin, long old_nr,
-- 
2.31.1.639.g3d04783866f


  parent reply	other threads:[~2021-04-12 17:16 UTC|newest]

Thread overview: 99+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-02-03  3:27 [PATCH 00/25] grep: PCREv2 fixes, remove kwset.[ch] Ævar Arnfjörð Bjarmason
2021-02-03  3:27 ` [PATCH 01/25] grep/pcre2 tests: reword comments referring to kwset Ævar Arnfjörð Bjarmason
2021-02-03  3:27 ` [PATCH 02/25] grep/pcre2: drop needless assignment + assert() on opt->pcre2 Ævar Arnfjörð Bjarmason
2021-02-03  3:27 ` [PATCH 03/25] grep/pcre2: drop needless assignment to NULL Ævar Arnfjörð Bjarmason
2021-02-03  3:27 ` [PATCH 04/25] grep/pcre2: correct reference to grep_init() in comment Ævar Arnfjörð Bjarmason
2021-02-03  3:27 ` [PATCH 05/25] grep/pcre2: prepare to add debugging to pcre2_malloc() Ævar Arnfjörð Bjarmason
2021-02-03  3:27 ` [PATCH 06/25] grep/pcre2: add GREP_PCRE2_DEBUG_MALLOC debug mode Ævar Arnfjörð Bjarmason
2021-02-03  3:27 ` [PATCH 07/25] grep/pcre2: use compile-time PCREv2 version test Ævar Arnfjörð Bjarmason
2021-02-03  3:27 ` [PATCH 08/25] grep/pcre2: use pcre2_maketables_free() function Ævar Arnfjörð Bjarmason
2021-02-03  3:27 ` [PATCH 09/25] grep/pcre2: actually make pcre2 use custom allocator Ævar Arnfjörð Bjarmason
2021-02-03  3:27 ` [PATCH 10/25] grep/pcre2: move back to thread-only PCREv2 structures Ævar Arnfjörð Bjarmason
2021-02-03  3:27 ` [PATCH 11/25] grep/pcre2: move definitions of pcre2_{malloc,free} Ævar Arnfjörð Bjarmason
2021-02-03  3:27 ` [PATCH 12/25] pickaxe tests: refactor to use test_commit --append Ævar Arnfjörð Bjarmason
2021-02-03  3:27 ` [PATCH 13/25] pickaxe -S: support content with NULs under --pickaxe-regex Ævar Arnfjörð Bjarmason
2021-02-03  3:28 ` [PATCH 14/25] pickaxe -S: remove redundant "sz" check in while-loop Ævar Arnfjörð Bjarmason
2021-02-04 16:16   ` René Scharfe
2021-02-04 17:56     ` Junio C Hamano
2021-02-04 21:13       ` Ævar Arnfjörð Bjarmason
2021-02-03  3:28 ` [PATCH 15/25] pickaxe/style: consolidate declarations and assignments Ævar Arnfjörð Bjarmason
2021-02-03  3:28 ` [PATCH 16/25] pickaxe tests: add test for diffgrep_consume() internals Ævar Arnfjörð Bjarmason
2021-02-03  3:28 ` [PATCH 17/25] pickaxe tests: add test for "log -S" not being a regex Ævar Arnfjörð Bjarmason
2021-02-03  3:28 ` [PATCH 18/25] perf: add performance test for pickaxe Ævar Arnfjörð Bjarmason
2021-02-03  3:28 ` [PATCH 19/25] pickaxe -G: set -U0 for diff generation Ævar Arnfjörð Bjarmason
2021-02-03 14:26   ` Ævar Arnfjörð Bjarmason
2021-02-03 19:42     ` Junio C Hamano
2021-02-03  3:28 ` [PATCH 20/25] grep.h: make patmatch() a public function Ævar Arnfjörð Bjarmason
2021-02-03  3:28 ` [PATCH 21/25] pickaxe: use PCREv2 for -G and -S Ævar Arnfjörð Bjarmason
2021-02-03 20:44   ` Ævar Arnfjörð Bjarmason
2021-02-04 18:11     ` Junio C Hamano
2021-02-04 18:22   ` Junio C Hamano
2021-02-03  3:28 ` [PATCH 22/25] Remove unused kwset.[ch] Ævar Arnfjörð Bjarmason
     [not found]   ` <CAPUEspgBmuTBHVZWY9fRtjbHWBRr0zHravLL1Czepc6jmib4HA@mail.gmail.com>
2021-02-03 14:13     ` Ævar Arnfjörð Bjarmason
     [not found]       ` <CAPUEsphN7QuSVsC1Tr4xE8yQgPTtpF7wL7zbk1crQU3n-5g6JQ@mail.gmail.com>
2021-02-03 16:45         ` Ævar Arnfjörð Bjarmason
2021-02-03  3:28 ` [PATCH 23/25] xdiff-interface: allow early return from xdiff_emit_{line,hunk}_fn Ævar Arnfjörð Bjarmason
2021-02-03  3:28 ` [PATCH 24/25] xdiff-interface: support early exit in xdiff_outf() Ævar Arnfjörð Bjarmason
2021-02-04 18:16   ` Junio C Hamano
2021-02-03  3:28 ` [PATCH 25/25] pickaxe -G: terminate early on matching lines Ævar Arnfjörð Bjarmason
2021-02-03 12:38 ` [PATCH 00/25] grep: PCREv2 fixes, remove kwset.[ch] Ævar Arnfjörð Bjarmason
2021-02-16 11:57 ` [PATCH v2 00/22] pickaxe: test and refactoring for follow-up changes Ævar Arnfjörð Bjarmason
2021-02-16 22:23   ` Junio C Hamano
2021-02-17  1:19     ` Junio C Hamano
2021-04-12 17:15   ` [PATCH v3 00/22] pickaxe: test and refactoring for future PCRE backend Ævar Arnfjörð Bjarmason
2021-04-12 17:15     ` [PATCH v3 01/22] grep/pcre2 tests: reword comments referring to kwset Ævar Arnfjörð Bjarmason
2021-04-12 17:15     ` [PATCH v3 02/22] pickaxe tests: refactor to use test_commit --append --printf Ævar Arnfjörð Bjarmason
2021-04-12 17:15     ` [PATCH v3 03/22] pickaxe tests: add test for diffgrep_consume() internals Ævar Arnfjörð Bjarmason
2021-04-12 17:15     ` [PATCH v3 04/22] pickaxe tests: add test for "log -S" not being a regex Ævar Arnfjörð Bjarmason
2021-04-12 17:15     ` [PATCH v3 05/22] pickaxe tests: test for -G, -S and --find-object incompatibility Ævar Arnfjörð Bjarmason
2021-04-12 17:15     ` [PATCH v3 06/22] pickaxe tests: add missing test for --no-pickaxe-regex being an error Ævar Arnfjörð Bjarmason
2021-04-12 17:15     ` [PATCH v3 07/22] pickaxe: die when -G and --pickaxe-regex are combined Ævar Arnfjörð Bjarmason
2021-04-12 17:15     ` [PATCH v3 08/22] pickaxe: die when --find-object and --pickaxe-all " Ævar Arnfjörð Bjarmason
2021-04-12 17:15     ` [PATCH v3 09/22] diff.h: move pickaxe fields together again Ævar Arnfjörð Bjarmason
2021-04-12 17:15     ` [PATCH v3 10/22] pickaxe/style: consolidate declarations and assignments Ævar Arnfjörð Bjarmason
2021-04-12 17:15     ` [PATCH v3 11/22] perf: add performance test for pickaxe Ævar Arnfjörð Bjarmason
2021-04-12 17:15     ` [PATCH v3 12/22] pickaxe: refactor function selection in diffcore-pickaxe() Ævar Arnfjörð Bjarmason
2021-04-12 17:15     ` [PATCH v3 13/22] pickaxe: assert that we must have a needle under -G or -S Ævar Arnfjörð Bjarmason
2021-04-12 17:15     ` [PATCH v3 14/22] pickaxe -S: support content with NULs under --pickaxe-regex Ævar Arnfjörð Bjarmason
2021-04-12 17:15     ` [PATCH v3 15/22] pickaxe: rename variables in has_changes() for brevity Ævar Arnfjörð Bjarmason
2021-04-12 17:15     ` [PATCH v3 16/22] pickaxe -S: slightly optimize contains() Ævar Arnfjörð Bjarmason
2021-04-12 17:15     ` [PATCH v3 17/22] xdiff-interface: prepare for allowing early return Ævar Arnfjörð Bjarmason
2021-04-12 17:15     ` Ævar Arnfjörð Bjarmason [this message]
2021-04-12 17:15     ` [PATCH v3 19/22] pickaxe -G: terminate early on matching lines Ævar Arnfjörð Bjarmason
2021-04-12 17:15     ` [PATCH v3 20/22] pickaxe -G: don't special-case create/delete Ævar Arnfjörð Bjarmason
2021-04-12 17:15     ` [PATCH v3 21/22] xdiff users: use designated initializers for out_line Ævar Arnfjörð Bjarmason
2021-04-12 17:15     ` [PATCH v3 22/22] xdiff-interface: replace discard_hunk_line() with a flag Ævar Arnfjörð Bjarmason
2021-02-16 11:57 ` [PATCH v2 01/22] grep/pcre2 tests: reword comments referring to kwset Ævar Arnfjörð Bjarmason
2021-02-16 11:57 ` [PATCH v2 02/22] test-lib-functions: document and test test_commit --no-tag Ævar Arnfjörð Bjarmason
2021-03-30 23:14   ` Junio C Hamano
2021-02-16 11:57 ` [PATCH v2 03/22] test-lib-functions: reword "test_commit --append" docs Ævar Arnfjörð Bjarmason
2021-02-16 11:57 ` [PATCH v2 04/22] test-lib functions: add --printf option to test_commit Ævar Arnfjörð Bjarmason
2021-03-30 23:11   ` Junio C Hamano
2021-04-12 13:19     ` Ævar Arnfjörð Bjarmason
2021-02-16 11:57 ` [PATCH v2 05/22] pickaxe tests: refactor to use test_commit --append --printf Ævar Arnfjörð Bjarmason
2021-03-30 23:26   ` Junio C Hamano
2021-02-16 11:57 ` [PATCH v2 06/22] pickaxe tests: add test for diffgrep_consume() internals Ævar Arnfjörð Bjarmason
2021-02-16 11:57 ` [PATCH v2 07/22] pickaxe tests: add test for "log -S" not being a regex Ævar Arnfjörð Bjarmason
2021-02-16 11:57 ` [PATCH v2 08/22] pickaxe tests: test for -G, -S and --find-object incompatibility Ævar Arnfjörð Bjarmason
2021-03-30 23:32   ` Junio C Hamano
2021-02-16 11:57 ` [PATCH v2 09/22] pickaxe: die when -G and --pickaxe-regex are combined Ævar Arnfjörð Bjarmason
2021-03-30 23:36   ` Junio C Hamano
2021-02-16 11:57 ` [PATCH v2 10/22] pickaxe: die when --find-object and --pickaxe-all " Ævar Arnfjörð Bjarmason
2021-02-16 11:57 ` [PATCH v2 11/22] diff.h: move pickaxe fields together again Ævar Arnfjörð Bjarmason
2021-02-16 11:57 ` [PATCH v2 12/22] pickaxe/style: consolidate declarations and assignments Ævar Arnfjörð Bjarmason
2021-02-16 11:57 ` [PATCH v2 13/22] perf: add performance test for pickaxe Ævar Arnfjörð Bjarmason
2021-02-16 11:57 ` [PATCH v2 14/22] pickaxe: refactor function selection in diffcore-pickaxe() Ævar Arnfjörð Bjarmason
2021-03-30 23:45   ` Junio C Hamano
2021-02-16 11:57 ` [PATCH v2 15/22] pickaxe: assert that we must have a needle under -G or -S Ævar Arnfjörð Bjarmason
2021-03-30 23:50   ` Junio C Hamano
2021-02-16 11:57 ` [PATCH v2 16/22] pickaxe -S: support content with NULs under --pickaxe-regex Ævar Arnfjörð Bjarmason
2021-03-30 23:54   ` Junio C Hamano
2021-02-16 11:57 ` [PATCH v2 17/22] pickaxe: rename variables in has_changes() for brevity Ævar Arnfjörð Bjarmason
2021-02-16 11:57 ` [PATCH v2 18/22] pickaxe -S: slightly optimize contains() Ævar Arnfjörð Bjarmason
2021-03-30 23:58   ` Junio C Hamano
2021-02-16 11:57 ` [PATCH v2 19/22] xdiff-interface: allow early return from xdiff_emit_{line,hunk}_fn Ævar Arnfjörð Bjarmason
2021-03-31  0:04   ` Junio C Hamano
2021-02-16 11:57 ` [PATCH v2 20/22] xdiff-interface: support early exit in xdiff_outf() Ævar Arnfjörð Bjarmason
2021-02-16 11:58 ` [PATCH v2 21/22] pickaxe -G: terminate early on matching lines Ævar Arnfjörð Bjarmason
2021-03-31  0:11   ` Junio C Hamano
2021-02-16 11:58 ` [PATCH v2 22/22] pickaxe -G: don't special-case create/delete Ævar Arnfjörð Bjarmason
2021-03-31  0:14   ` Junio C Hamano

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=patch-18.22-76d667f152f-20210412T170457Z-avarab@gmail.com \
    --to=avarab@gmail.com \
    --cc=carenas@gmail.com \
    --cc=git@vger.kernel.org \
    --cc=gitster@pobox.com \
    --cc=johannes.schindelin@gmx.de \
    --cc=peff@peff.net \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.